Version 462

closes #1007
This commit is contained in:
Hydrus Network Developer 2021-11-17 15:22:27 -06:00
parent cbacaac448
commit ca2f5f1612
24 changed files with 596 additions and 125 deletions

View File

@ -8,6 +8,38 @@
<div class="content">
<h3 id="changelog"><a href="#changelog">changelog</a></h3>
<ul>
<li><h3 id="version_462"><a href="#version_462">version 462</a></h3></li>
<ul>
<li>misc:</li>
<li>fixed a recent serious regression that could cause a crash when playing audio in mpv (issue #1007)</li>
<li>the main importer file log now does 'get next/all/count imports with status y' calls significantly faster, particularly on very large lists. these calls happen all the time for different status text changes and general 'which import to try next?' work. all busy downloader situations should see CPU gains to regular and background work</li>
<li>fixed a problem where importing with the min/max file resolution options set would, due to a typo, raise an error when the resolution limit was violated rather than setting the file to 'ignored'</li>
<li>I think I have fixed an issue with subscriptions not wanting to run a query if by random accident that query has an invalid URL selected as the query's 'example url' for various pre-work login and bandwidth tests</li>
<li>hydrus can now capture duration/fps of videos that specify two very close fps, e.g. 60 and 59.99. previously, these would default to the 24 fallback and could cause some weirdness in mpv</li>
<li>replaced the default pixiv artist page api parser with one that fetches the newer url format, matching the tag search. existing users will see no automatic change but will receive the new parser, so if you are a big pixiv user, you might like to switch 'pixiv artist gallery page api' to the 'new urls' parser variant under _network->downloader components->manage url class links_. note that if you do this, your pixiv artist subscriptions will do a mini-resync (which involves some wasted time/bandwidth) as their urls change!</li>
<li>.</li>
<li>network redirect improvements:</li>
<li>gallery page jobs now give their child 'next gallery page' jobs themselves as a referrer</li>
<li>when the gallery downloader gets a 3XX redirect, the file import objects and next gallery pages it makes now get the redirected URL as referral url (previously, it used the original gallery url)</li>
<li>when the post downloader gets a 3XX redirect, the redirected url is now added as a primary source url</li>
<li>when the post downloader gets a 3XX redirect, child import objects and direct file downloads now get the redirect URL as referral url (previously, it used the original post url)</li>
<li>when the raw file downloader gets a 3XX redirect, the redirected url is now added as a primary source url</li>
<li>when the raw file downloader gets a 3XX redirect to a Post URL, it now tries to queue that URL back up in the file log, just like when a gallery fetch comes back with a Post URL. some safety code stops potential loops here as well</li>
<li>.</li>
<li>new services:</li>
<li>a new client now starts with a second local tag service called 'downloader tags'. default tag import options are now initialised in a fresh client to pull all file post tags to this service. this should relieve new users' confusion about setting up default tag import options</li>
<li>similarly, a new client now starts with a like/dislike rating service called 'favourites'. existing users who have no rating services will be asked if they want to get it on update. many users are unaware of the rating system, so this is a chance to play with it</li>
<li>the 'getting started with downloading' and '...with ratings' help has some updated screenshots and talks about the new default services and parsing</li>
<li>.</li>
<li>database fixes:</li>
<li>fixed a very slow database query that was involved with file search using unnamespaced tags when other search predicates had already limited the scope of search</li>
<li>fixed a similar slow query when the 'bad' search predicate was 'namespace:anything', particularly when the namespace is a wildcard</li>
<li>fixed the 'clear orphan tables' database maintenance routine. it had decayed due to bit rot and was deleting new repository processing tracking tables. the routine is now plugged directly into the new database modules system, and any module now can be upgraded to discover surplus service tables. the system has been updated to permit the detection and removal of duplicate tables in the wrong database file, and it also makes a statement if no orphan tables were found</li>
<li>the 'metadata resync' repository maintenance task now removes surplus file rows from the processing tracking tables</li>
<li>the 'metadata resync' repository maintenance task now corrects content type rows in the main processing tracking table</li>
<li>the process of registering updates to a repository is now a little faster, more reliable, repairs basic damage, and keeps more good work when damage needs to be repaired</li>
<li>I _think_ the users who were still getting PTR processing errors related to a database confusion about content/definitions update files should now be fixed after another full metadata resync! please let me know if there are still problems</li>
</ul>
<li><h3 id="version_461"><a href="#version_461">version 461</a></h3></li>
<ul>
<li>misc:</li>

Binary file not shown. (Before: 255 KiB, After: 886 KiB)

View File

@ -15,26 +15,27 @@
<h3 id="start"><a href="#start">let's do it</a></h3>
<p>Open the new page selector with F9 and then hit <i>download->gallery</i>:</p>
<p><img src="downloader_page.png" /></p>
<p class="warning">You can do a test download here of a few files if you want, but don't start downloading loads of stuff until you have read about parsing tags!</p>
<p>The gallery page can download from multiple sources at the same time. Each entry in the list represents a basic combination of two things:</p>
<ul>
<li><b>source</b> - The site you are getting from. Safebooru or Danbooru or Deviant Art or twitter or anywhere else.</li>
<li><b>query text</b> - Something like 'contrapposto' or 'blonde_hair blue_eyes' or an artist name like 'incase'. Whatever is searched on the site to return a list of ordered media.</li>
</ul>
<p>So, when you want to start a new download, you first select the source with the button--by default, it is probably 'Artstation' for you--and then type in a query in the text box and hit enter. The download will soon start and fill in information, and thumbnails should stream in, just like the hard drive importer. The downloader typically works by walking through the search's gallery pages one by one, queueing up the found files for later download. There are several intentional delays built into the system, so do not worry if work seems to halt for a little while--you will get a feel for it with experience.</p>
<p>So, when you want to start a new download, you first select the source with the button and then type in a query in the text box and hit enter. The download will soon start and fill in information, and thumbnails should stream in, just like the hard drive importer. The downloader typically works by walking through the search's gallery pages one by one, queueing up the found files for later download. There are several intentional delays built into the system, so do not worry if work seems to halt for a little while--you will get a feel for hydrus's 'slow persistent growth' style with experience.</p>
<p class="warning">Do a test download now, for fun! Pause its gallery search after a page or two, and then pause the file import queue after a dozen or so files come in.</p>
<p>The thumbnail panel can only show results from one queue at a time, so double-click on an entry to 'highlight' it, which will show its thumbs and also give more detailed info and controls in the 'highlighted query' panel. I encourage you to explore the highlight panel over time, as it can show and do quite a lot. Double-click again to 'clear' it.</p>
<p>It is a good idea to 'test' larger downloads, either by visiting the site itself for that query, or just waiting a bit and reviewing the first files that come in. Just make sure that you <i>are</i> getting what you thought you would, whether that be verifying that the query text is correct or that the site isn't only giving you bloated gifs or other bad quality files. The 'file limit', which stops the gallery search after the set number of files, is also great for limiting fishing expeditions (such as overbroad searches like 'wide_hips', which on the bigger boorus have 100k+ results and return <i>variable</i> quality). If the gallery search runs out of new files before the file limit is hit, the search will naturally stop (and the entry in the list should gain a &#x23f9; 'stop' symbol).</p>
<p><i>Note that some sites only serve 25 or 50 pages of results, despite their indices suggesting hundreds. If you notice that one site always bombs out at, say, 500 results, it may be due to a decision on their end. You can usually test this by visiting the pages hydrus tried in your web browser.</i></p>
<p><b>In general, particularly when starting out, artist searches are best.</b> They are usually fewer than a thousand files and have fairly uniform quality throughout.</p>
<h3 id="parsing_tags"><a href="#parsing_tags">parsing tags</a></h3>
<p>But we don't just want files--most sites offer tags as well. <b>By default, hydrus does not fetch any tags for downloads.</b> As you use the client, you will figure out what sorts of tags you are interested in and shape your parsing rules appropriately, but for now, let's do a test that just gets everything--click <i>tag import options</i>:</p>
<p><img src="tag_import_options.png" /></p>
<p>By default, all 'tag import options' objects defer to the client's defaults. Since we want to change this queue from the current default of 'get nothing' to 'get everything', uncheck the top default checkbox and then click 'get tags' on a tag service, whether that is your 'my tags' or the PTR if you have added it. Hit apply and run a simple query for something, like 'blue_eyes' on one of the boorus. Pause its gallery search after a page or two, and then pause the import queue after a dozen or so files come in--they should be really well tagged!</p>
<p>It is easy to get tens of thousands of tags this way. Different sites offer different kinds and qualities of tags, and the client's downloaders (which were designed by me, the dev, or a user) may parse all or only some of them. Many users like to just get everything on offer, but others only ever want, say, 'creator', 'series', and 'character' tags. If you feel brave, click that 'all tags' button on tag import options, which will take you into hydrus's advanced 'tag filter', which allows you to whitelist or blacklist the incoming list of tags according to whatever your preferences are.</p>
<p class="warning">The file limit and file/tag import options on the upper panel, if changed, will only apply to <b>new</b> queries. If you want to change the options for an existing queue, either do so on its highlight panel or use the 'set options to queries' button.</p>
<p>Tag import options can get complicated. The blacklist button will let you skip downloading files that have certain tags (perhaps you would like to auto-skip all images with 'gore', 'scat', or 'diaper'?), again using the tag filter. The 'additional tags' also allow you to add some personal tags to all files coming in--for instance, you might like to add 'process into favourites' to your 'my tags' for some query you really like so you can find those files again later and process them separately. That little 'cog' icon button can also do some advanced things. I recommend you start by just getting everything (or nothing, if you really would rather tag everything yourself), and then revisiting it once you have some more experience. Once you have played with this a bit, let's fix your preferences as the new default:</p>
<h3 id="default_tio"><a href="#default_tio">default tag import options</a></h3>
<p>Hit <i>network->downloaders->manage default tag import options</i>. Set a new default for 'file posts', and that will be the default (that we originally turned off above) for all gallery download pages (and subscriptions, which you will learn about later). You can have different TIOs for each site, but again, we will leave it simple for now.</p>
<p>But we don't just want files--most sites offer tags as well. By default, hydrus now starts with a local tag service called 'downloader tags' and it will parse (get) all the tags from normal gallery sites and put them in this service. You don't have to do anything, you will get some decent tags. As you use the client, you will figure out which tags you like and where you want them. On the downloader page, click <i>tag import options</i>:</p>
<p><img src="tag_import_options_default.png" /></p>
<p>This is an important dialog, although you will not need to use it much. It governs which tags are parsed and where they go. To keep things easy to manage, a new downloader will refer to the 'default' tag import options for a website, but for now let's set some values just for this downloader:</p>
<p><img src="tag_import_options_specific.png" /></p>
<p>You can see that each tag service on your client has a separate section. If you add the PTR, that will get a new box too. A new client is set to <i>get all tags</i> for the 'downloader tags' service. Things can get much more complicated. Have a play around with the options here as you figure things out. Most of the controls have tooltips or longer explainers in sub-dialogs, so don't be afraid to try things.</p>
<p>It is easy to get tens of thousands of tags by downloading this way. Different sites offer different kinds and qualities of tags, and the client's downloaders (which were designed by me, the dev, or a user) may parse all or only some of them. Many users like to just get everything on offer, but others only ever want, say, 'creator', 'series', and 'character' tags. If you feel brave, click that 'all tags' button, which will take you into hydrus's advanced 'tag filter', which allows you to select which of the incoming list of tags will be added.</p>
<p>The blacklist button will let you skip downloading files that have certain tags (perhaps you would like to auto-skip all images with 'gore', 'scat', or 'diaper'?), again using the tag filter, while the whitelist enables you to only allow files that have at least one of a set of tags. The 'additional tags' adds some fixed personal tags to all files coming in--for instance, you might like to add 'process into favourites' to your 'my tags' for some query you really like so you can find those files again later and process them separately. That little 'cog' icon button can also do some advanced things.</p>
<p>To edit the defaults, hit up <i>network->downloaders->manage default tag import options</i>. You should do this as you get a better idea of your preferences. You can set them for all file posts generally, all watchers, and for specific sites as well.</p>
<p class="warning">The file limit and file/tag import options on the upper panel, if changed, will only apply to <b>new</b> queries. If you want to change the options for an existing queue, either do so on its highlight panel below or use the 'set options to queries' button.</p>
<h3 id="threads"><a href="#threads">watching threads</a></h3>
<p>If you are an imageboard user, try going to a thread you like and drag-and-drop its URL (straight from your web browser's address bar) onto the hydrus client. It should open up a new 'watcher' page and import the thread's files!</p>
<p><img src="watcher_page.png" /></p>

View File

@ -9,7 +9,7 @@
<p><a href="getting_started_downloading.html"><--- Back to downloading</a></p>
<p>The hydrus client supports two kinds of ratings: <i>like/dislike</i> and <i>numerical</i>. Let's start with the simpler one:</p>
<h3 id="like_dislike"><a href="#like_dislike">like/dislike</a></h3>
<p>This can set one of two values to a file. It does not have to represent like or dislike--it can be anything you want. Go to <i>services->manage services->local->like/dislike ratings</i>:</p>
<p>A new client starts with one of these, called 'favourites'. It can set one of two values to a file. It does not have to represent like or dislike--it can be anything you want, like 'send to export folder' or 'explicit/safe' or 'cool babes'. Go to <i>services->manage services->local->like/dislike ratings</i>:</p>
<p><img src="ratings_like.png" /></p>
<p>You can set a variety of colours and shapes.</p>
<h3 id="numerical"><a href="#numerical">numerical</a></h3>

Binary file not shown. (Before: 216 KiB)

Binary file not shown. (After: 25 KiB)

Binary file not shown. (After: 35 KiB)

View File

@ -538,6 +538,8 @@ global_pixmaps = GlobalPixmaps.instance
DEFAULT_LOCAL_TAG_SERVICE_KEY = b'local tags'
DEFAULT_LOCAL_DOWNLOADER_TAG_SERVICE_KEY = b'downloader tags'
LOCAL_FILE_SERVICE_KEY = b'local files'
LOCAL_UPDATE_SERVICE_KEY = b'repository updates'
@ -546,6 +548,8 @@ LOCAL_BOORU_SERVICE_KEY = b'local booru'
LOCAL_NOTES_SERVICE_KEY = b'local notes'
DEFAULT_FAVOURITES_RATING_SERVICE_KEY = b'favourites'
CLIENT_API_SERVICE_KEY = b'client api'
TRASH_SERVICE_KEY = b'trash'

View File

@ -268,8 +268,6 @@ def GetDefaultParsers():
def GetDefaultScriptRows():
from hydrus.core import HydrusData
script_info = []
script_info.append( ( 32, 'iqdb danbooru', 2, HydrusData.GetNow(), '''["https://danbooru.iqdb.org/", 1, 0, [55, 1, [[], "some hash bytes"]], "file", {}, [[29, 1, ["link to danbooru", [27, 6, [[26, 1, [[62, 2, [0, "td", {"class": "image"}, 1, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "a", {}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]]], 0, "href", [51, 1, [3, "", null, null, "example string"]], [55, 1, [[], "parsed information"]]]], [[30, 4, ["", 0, [27, 6, [[26, 1, [[62, 2, [0, "section", {"id": "tag-list"}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "li", {"class": "tag-type-1"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "a", {"class": "search-tag"}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]]], 1, "", [51, 1, [3, "", null, null, "example string"]], [55, 1, [[], "parsed information"]]]], 0, false, "creator"]], [30, 4, ["", 0, [27, 6, [[26, 1, [[62, 2, [0, "section", {"id": "tag-list"}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "li", {"class": "tag-type-3"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "a", {"class": "search-tag"}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]]], 1, "", [51, 1, [3, "", null, null, "example string"]], [55, 1, [[], "parsed information"]]]], 0, false, "series"]], [30, 4, ["", 0, [27, 6, [[26, 1, [[62, 2, [0, "section", {"id": "tag-list"}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "li", {"class": "tag-type-4"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "a", {"class": "search-tag"}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]]], 1, "", [51, 1, [3, "", null, null, "example string"]], [55, 1, [[], "parsed information"]]]], 0, false, "character"]], [30, 4, ["", 0, [27, 6, [[26, 1, [[62, 2, [0, "section", {"id": "tag-list"}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "li", {"class": "tag-type-0"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "a", {"class": "search-tag"}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]]], 1, "", [51, 1, [3, "", null, null, "example string"]], [55, 1, [[], "parsed information"]]]], 0, false, ""]], [30, 4, ["", 0, [27, 6, [[26, 1, [[62, 2, [0, "section", {"id": "post-information"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "li", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]], 1, "", [51, 1, [2, "Rating:*", null, null, "Rating: Safe"]], [55, 1, [[[0, 8]], "Rating: Safe"]]]], 0, false, "rating"]], [30, 4, ["", 7, [27, 6, [[26, 1, [[62, 2, [0, "section", {"id": "post-information"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]], [62, 2, [0, "li", {}, null, null, true, [51, 1, [2, "Source:*", null, null, "Source:"]]]], [62, 2, [0, "a", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]], 0, "href", [51, 1, [3, "", null, null, "example string"]], [55, 1, [[], "parsed information"]]]], 0, false, [8, 0]]]]]], [30, 4, ["no iqdb match found", 8, [27, 6, [[26, 1, [[62, 2, [0, "th", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]], 1, "", [51, 1, [3, "", null, null, "example string"]], [55, 1, [[], "parsed information"]]]], 0, false, [false, [51, 1, [2, "Best match", null, null, "Best 
match"]]]]]]]''' ) )
@ -651,6 +649,18 @@ def SetDefaultDomainManagerData( domain_manager ):
domain_manager.TryToLinkURLClassesAndParsers()
#
from hydrus.client.importing.options import TagImportOptions
service_tag_import_options = TagImportOptions.ServiceTagImportOptions( get_tags = True )
service_keys_to_service_tag_import_options = { CC.DEFAULT_LOCAL_DOWNLOADER_TAG_SERVICE_KEY : service_tag_import_options }
tag_import_options = TagImportOptions.TagImportOptions( service_keys_to_service_tag_import_options = service_keys_to_service_tag_import_options )
domain_manager.SetDefaultFilePostTagImportOptions( tag_import_options )
def SetDefaultFavouriteSearchManagerData( favourite_search_manager ):
from hydrus.client.media import ClientMedia

View File

@ -285,8 +285,8 @@ class ClientOptions( HydrusSerialisable.SerialisableBase ):
from hydrus.client.metadata import ClientTags
self._dictionary[ 'duplicate_action_options' ][ HC.DUPLICATE_BETTER ] = ClientDuplicates.DuplicateActionOptions( [ ( CC.DEFAULT_LOCAL_TAG_SERVICE_KEY, HC.CONTENT_MERGE_ACTION_MOVE, HydrusTags.TagFilter() ) ], [], sync_archive = True, sync_urls_action = HC.CONTENT_MERGE_ACTION_COPY )
self._dictionary[ 'duplicate_action_options' ][ HC.DUPLICATE_SAME_QUALITY ] = ClientDuplicates.DuplicateActionOptions( [ ( CC.DEFAULT_LOCAL_TAG_SERVICE_KEY, HC.CONTENT_MERGE_ACTION_TWO_WAY_MERGE, HydrusTags.TagFilter() ) ], [], sync_archive = True, sync_urls_action = HC.CONTENT_MERGE_ACTION_TWO_WAY_MERGE )
self._dictionary[ 'duplicate_action_options' ][ HC.DUPLICATE_BETTER ] = ClientDuplicates.DuplicateActionOptions( [ ( CC.DEFAULT_LOCAL_TAG_SERVICE_KEY, HC.CONTENT_MERGE_ACTION_MOVE, HydrusTags.TagFilter() ), ( CC.DEFAULT_LOCAL_DOWNLOADER_TAG_SERVICE_KEY, HC.CONTENT_MERGE_ACTION_MOVE, HydrusTags.TagFilter() ) ], [ ( CC.DEFAULT_FAVOURITES_RATING_SERVICE_KEY, HC.CONTENT_MERGE_ACTION_MOVE ) ], sync_archive = True, sync_urls_action = HC.CONTENT_MERGE_ACTION_COPY )
self._dictionary[ 'duplicate_action_options' ][ HC.DUPLICATE_SAME_QUALITY ] = ClientDuplicates.DuplicateActionOptions( [ ( CC.DEFAULT_LOCAL_TAG_SERVICE_KEY, HC.CONTENT_MERGE_ACTION_TWO_WAY_MERGE, HydrusTags.TagFilter() ), ( CC.DEFAULT_LOCAL_DOWNLOADER_TAG_SERVICE_KEY, HC.CONTENT_MERGE_ACTION_TWO_WAY_MERGE, HydrusTags.TagFilter() ) ], [ ( CC.DEFAULT_FAVOURITES_RATING_SERVICE_KEY, HC.CONTENT_MERGE_ACTION_TWO_WAY_MERGE ) ], sync_archive = True, sync_urls_action = HC.CONTENT_MERGE_ACTION_TWO_WAY_MERGE )
self._dictionary[ 'duplicate_action_options' ][ HC.DUPLICATE_ALTERNATE ] = ClientDuplicates.DuplicateActionOptions()
#

View File

@ -2567,34 +2567,35 @@ class DB( HydrusDB.HydrusDB ):
def _ClearOrphanTables( self ):
service_ids = self._STL( self._Execute( 'SELECT service_id FROM services;' ) )
all_table_names = set()
table_prefixes = []
db_names = [ name for ( index, name, path ) in self._Execute( 'PRAGMA database_list;' ) if name not in ( 'mem', 'temp', 'durable_temp' ) ]
table_prefixes.append( 'repository_hash_id_map_' )
table_prefixes.append( 'repository_tag_id_map_' )
table_prefixes.append( 'repository_updates_' )
good_table_names = set()
for service_id in service_ids:
for db_name in db_names:
suffix = str( service_id )
table_names = self._STS( self._Execute( 'SELECT name FROM {}.sqlite_master WHERE type = ?;'.format( db_name ), ( 'table', ) ) )
for table_prefix in table_prefixes:
if db_name != 'main':
good_table_names.add( table_prefix + suffix )
table_names = { '{}.{}'.format( db_name, table_name ) for table_name in table_names }
all_table_names.update( table_names )
existing_table_names = set()
all_surplus_table_names = set()
existing_table_names.update( self._STS( self._Execute( 'SELECT name FROM sqlite_master WHERE type = ?;', ( 'table', ) ) ) )
existing_table_names.update( self._STS( self._Execute( 'SELECT name FROM external_master.sqlite_master WHERE type = ?;', ( 'table', ) ) ) )
for module in self._modules:
surplus_table_names = module.GetSurplusServiceTableNames( all_table_names )
all_surplus_table_names.update( surplus_table_names )
existing_table_names = { name for name in existing_table_names if True in ( name.startswith( table_prefix ) for table_prefix in table_prefixes ) }
surplus_table_names = sorted( existing_table_names.difference( good_table_names ) )
if len( surplus_table_names ) == 0:
HydrusData.ShowText( 'No orphan tables!' )
for table_name in surplus_table_names:
@ -2703,14 +2704,32 @@ class DB( HydrusDB.HydrusDB ):
init_service_info.append( ( CC.TRASH_SERVICE_KEY, HC.LOCAL_FILE_TRASH_DOMAIN, 'trash' ) )
init_service_info.append( ( CC.LOCAL_UPDATE_SERVICE_KEY, HC.LOCAL_FILE_DOMAIN, 'repository updates' ) )
init_service_info.append( ( CC.DEFAULT_LOCAL_TAG_SERVICE_KEY, HC.LOCAL_TAG, 'my tags' ) )
init_service_info.append( ( CC.DEFAULT_LOCAL_DOWNLOADER_TAG_SERVICE_KEY, HC.LOCAL_TAG, 'downloader tags' ) )
init_service_info.append( ( CC.LOCAL_BOORU_SERVICE_KEY, HC.LOCAL_BOORU, 'local booru' ) )
init_service_info.append( ( CC.LOCAL_NOTES_SERVICE_KEY, HC.LOCAL_NOTES, 'local notes' ) )
init_service_info.append( ( CC.DEFAULT_FAVOURITES_RATING_SERVICE_KEY, HC.LOCAL_RATING_LIKE, 'favourites' ) )
init_service_info.append( ( CC.CLIENT_API_SERVICE_KEY, HC.CLIENT_API_SERVICE, 'client api' ) )
for ( service_key, service_type, name ) in init_service_info:
dictionary = ClientServices.GenerateDefaultServiceDictionary( service_type )
if service_key == CC.DEFAULT_FAVOURITES_RATING_SERVICE_KEY:
from hydrus.client.metadata import ClientRatings
dictionary[ 'shape' ] = ClientRatings.STAR
like_colours = {}
like_colours[ ClientRatings.LIKE ] = ( ( 0, 0, 0 ), ( 240, 240, 65 ) )
like_colours[ ClientRatings.DISLIKE ] = ( ( 0, 0, 0 ), ( 200, 80, 120 ) )
like_colours[ ClientRatings.NULL ] = ( ( 0, 0, 0 ), ( 191, 191, 191 ) )
like_colours[ ClientRatings.MIXED ] = ( ( 0, 0, 0 ), ( 95, 95, 95 ) )
dictionary[ 'colours' ] = list( like_colours.items() )
self._AddService( service_key, service_type, name, dictionary )
@ -7572,7 +7591,10 @@ class DB( HydrusDB.HydrusDB ):
if do_hash_table_join:
# temp hashes to mappings to temp tags
queries = [ 'SELECT hash_id FROM {} WHERE EXISTS ( SELECT 1 FROM {} CROSS JOIN {} USING ( tag_id ) WHERE {}.hash_id = {}.hash_id );'.format( hash_ids_table_name, table_name, temp_tag_ids_table_name, table_name, hash_ids_table_name ) for table_name in table_names ]
# old method, does not do EXISTS efficiently, it makes a list instead and checks that
# queries = [ 'SELECT hash_id FROM {} WHERE EXISTS ( SELECT 1 FROM {} CROSS JOIN {} USING ( tag_id ) WHERE {}.hash_id = {}.hash_id );'.format( hash_ids_table_name, table_name, temp_tag_ids_table_name, table_name, hash_ids_table_name ) for table_name in table_names ]
# new method, this seems to actually do the correlated scalar subquery, although it does seem to be sqlite voodoo
queries = [ 'SELECT hash_id FROM {} WHERE EXISTS ( SELECT 1 FROM {} WHERE {}.hash_id = {}.hash_id AND EXISTS ( SELECT 1 FROM {} WHERE {}.tag_id = {}.tag_id ) );'.format( hash_ids_table_name, table_name, table_name, hash_ids_table_name, temp_tag_ids_table_name, table_name, temp_tag_ids_table_name ) for table_name in table_names ]
else:
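For concreteness, a minimal sketch with made-up table names of the statement the new query template builds, and why it is cheaper:

# made-up table names, purely for illustration
hash_ids_table_name = 'mem.temp_hash_ids_1'
table_name = 'current_mappings_5'
temp_tag_ids_table_name = 'mem.temp_tag_ids_1'

query = 'SELECT hash_id FROM {} WHERE EXISTS ( SELECT 1 FROM {} WHERE {}.hash_id = {}.hash_id AND EXISTS ( SELECT 1 FROM {} WHERE {}.tag_id = {}.tag_id ) );'.format( hash_ids_table_name, table_name, table_name, hash_ids_table_name, temp_tag_ids_table_name, table_name, temp_tag_ids_table_name )

# both subqueries stay correlated to the outer hash_id row, so sqlite can stop at the first matching
# mapping rather than building a tag list for every candidate file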
@ -7798,7 +7820,9 @@ class DB( HydrusDB.HydrusDB ):
else:
# temp hashes to mappings to tags to temp namespaces
queries = [ 'SELECT hash_id FROM {} WHERE EXISTS ( SELECT 1 FROM {} CROSS JOIN {} USING ( tag_id ) CROSS JOIN {} USING ( namespace_id ) WHERE {}.hash_id = {}.hash_id );'.format( hash_ids_table_name, mappings_table_name, tags_table_name, temp_namespace_ids_table_name, mappings_table_name, hash_ids_table_name ) for ( mappings_table_name, tags_table_name ) in mapping_and_tag_table_names ]
# this was originally a 'WHERE EXISTS' thing, but doing that on a three way cross join is too complex for that to work well
# let's hope DISTINCT can save time too
queries = [ 'SELECT DISTINCT hash_id FROM {} CROSS JOIN {} USING ( hash_id ) CROSS JOIN {} USING ( tag_id ) CROSS JOIN {} USING ( namespace_id );'.format( hash_ids_table_name, mappings_table_name, tags_table_name, temp_namespace_ids_table_name ) for ( mappings_table_name, tags_table_name ) in mapping_and_tag_table_names ]
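Similarly, the namespace search drops the EXISTS shape because it does not compose well with the three-way join; a sketch with made-up names of what it builds instead:

# made-up table names, purely for illustration
hash_ids_table_name = 'mem.temp_hash_ids_1'
mappings_table_name = 'current_mappings_5'
tags_table_name = 'tags'
temp_namespace_ids_table_name = 'mem.temp_namespace_ids_1'

query = 'SELECT DISTINCT hash_id FROM {} CROSS JOIN {} USING ( hash_id ) CROSS JOIN {} USING ( tag_id ) CROSS JOIN {} USING ( namespace_id );'.format( hash_ids_table_name, mappings_table_name, tags_table_name, temp_namespace_ids_table_name )

# CROSS JOIN fixes the join order, so the already-reduced hash_ids table leads, and DISTINCT
# collapses files that match more than one namespaced tag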
@ -14957,6 +14981,85 @@ class DB( HydrusDB.HydrusDB ):
if version == 461:
try:
num_rating_services = len( self.modules_services.GetServiceIds( HC.RATINGS_SERVICES ) )
if num_rating_services == 0:
def ask_what_to_do_ratings_service():
message = 'New clients now start with a simple like/dislike rating service. You are not new, but you have no rating services--would you like to get this default now and try ratings out?'
from hydrus.client.gui import ClientGUIDialogsQuick
result = ClientGUIDialogsQuick.GetYesNo( None, message, title = 'Get rating service?' )
return result == QW.QDialog.Accepted
add_favourites = self._controller.CallBlockingToQt( None, ask_what_to_do_ratings_service )
if add_favourites:
( service_key, service_type, name ) = ( CC.DEFAULT_FAVOURITES_RATING_SERVICE_KEY, HC.LOCAL_RATING_LIKE, 'favourites' )
dictionary = ClientServices.GenerateDefaultServiceDictionary( service_type )
from hydrus.client.metadata import ClientRatings
dictionary[ 'shape' ] = ClientRatings.STAR
like_colours = {}
like_colours[ ClientRatings.LIKE ] = ( ( 0, 0, 0 ), ( 240, 240, 65 ) )
like_colours[ ClientRatings.DISLIKE ] = ( ( 0, 0, 0 ), ( 200, 80, 120 ) )
like_colours[ ClientRatings.NULL ] = ( ( 0, 0, 0 ), ( 191, 191, 191 ) )
like_colours[ ClientRatings.MIXED ] = ( ( 0, 0, 0 ), ( 95, 95, 95 ) )
dictionary[ 'colours' ] = list( like_colours.items() )
self._AddService( service_key, service_type, name, dictionary )
except Exception as e:
HydrusData.PrintException( e )
message = 'Trying to add a default favourites service failed. Please let hydrus dev know!'
self.pub_initial_message( message )
#
try:
domain_manager = self.modules_serialisable.GetJSONDump( HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_DOMAIN_MANAGER )
domain_manager.Initialise()
#
domain_manager.OverwriteDefaultParsers( ( 'pixiv artist gallery page api parser new urls', ) )
#
self.modules_serialisable.SetJSONDump( domain_manager )
except Exception as e:
HydrusData.PrintException( e )
message = 'Trying to update some downloader objects failed! Please let hydrus dev know!'
self.pub_initial_message( message )
self._controller.frame_splash_status.SetTitleText( 'updated db to v{}'.format( HydrusData.ToHumanInt( version + 1 ) ) )
self._Execute( 'UPDATE version SET version = ?;', ( version + 1, ) )

View File

@ -18,12 +18,19 @@ from hydrus.client.db import ClientDBFilesStorage
from hydrus.client.db import ClientDBModule
from hydrus.client.db import ClientDBServices
REPOSITORY_HASH_ID_MAP_PREFIX = 'repository_hash_id_map_'
REPOSITORY_TAG_ID_MAP_PREFIX = 'repository_tag_id_map_'
REPOSITORY_UPDATES_PREFIX = 'repository_updates_'
REPOSITORY_UNREGISTERED_UPDATES_PREFIX = 'repository_unregistered_updates_'
REPOSITORY_UPDATES_PROCESSED_PREFIX = 'repository_updates_processed_'
def GenerateRepositoryDefinitionTableNames( service_id: int ):
suffix = str( service_id )
hash_id_map_table_name = 'external_master.repository_hash_id_map_{}'.format( suffix )
tag_id_map_table_name = 'external_master.repository_tag_id_map_{}'.format( suffix )
hash_id_map_table_name = 'external_master.{}{}'.format( REPOSITORY_HASH_ID_MAP_PREFIX, suffix )
tag_id_map_table_name = 'external_master.{}{}'.format( REPOSITORY_TAG_ID_MAP_PREFIX, suffix )
return ( hash_id_map_table_name, tag_id_map_table_name )
@ -41,9 +48,9 @@ def GenerateRepositoryTagDefinitionTableName( service_id: int ):
def GenerateRepositoryUpdatesTableNames( service_id: int ):
repository_updates_table_name = 'repository_updates_{}'.format( service_id )
repository_unregistered_updates_table_name = 'repository_unregistered_updates_{}'.format( service_id )
repository_updates_processed_table_name = 'repository_updates_processed_{}'.format( service_id )
repository_updates_table_name = '{}{}'.format( REPOSITORY_UPDATES_PREFIX, service_id )
repository_unregistered_updates_table_name = '{}{}'.format( REPOSITORY_UNREGISTERED_UPDATES_PREFIX, service_id )
repository_updates_processed_table_name = '{}{}'.format( REPOSITORY_UPDATES_PROCESSED_PREFIX, service_id )
return ( repository_updates_table_name, repository_unregistered_updates_table_name, repository_updates_processed_table_name )
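As a quick illustration of the new prefix constants, a hypothetical service_id of 7 would give:

# hypothetical call, just to show the generated per-service names
( updates, unregistered, processed ) = GenerateRepositoryUpdatesTableNames( 7 )

# updates      == 'repository_updates_7'
# unregistered == 'repository_unregistered_updates_7'
# processed    == 'repository_updates_processed_7'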
@ -127,6 +134,17 @@ class ClientDBRepositories( ClientDBModule.ClientDBModule ):
}
def _GetServiceTablePrefixes( self ):
return {
REPOSITORY_HASH_ID_MAP_PREFIX,
REPOSITORY_TAG_ID_MAP_PREFIX,
REPOSITORY_UPDATES_PREFIX,
REPOSITORY_UNREGISTERED_UPDATES_PREFIX,
REPOSITORY_UPDATES_PROCESSED_PREFIX
}
def _GetServiceIdsWeGenerateDynamicTablesFor( self ):
return self.modules_services.GetServiceIds( HC.REPOSITORIES )
@ -150,6 +168,8 @@ class ClientDBRepositories( ClientDBModule.ClientDBModule ):
def _RegisterUpdates( self, service_id, hash_ids = None ):
# it is ok if this guy gets hash ids that are already in the 'processed' table--it'll now resync them and correct if needed
( repository_updates_table_name, repository_unregistered_updates_table_name, repository_updates_processed_table_name ) = GenerateRepositoryUpdatesTableNames( service_id )
if hash_ids is None:
@ -166,38 +186,63 @@ class ClientDBRepositories( ClientDBModule.ClientDBModule ):
if len( hash_ids ) > 0:
self._ClearOutstandingWorkCache( service_id )
service_type = self.modules_services.GetService( service_id ).GetServiceType()
with self._MakeTemporaryIntegerTable( hash_ids, 'hash_id' ) as temp_hash_ids_table_name:
hash_ids_to_mimes = { hash_id : mime for ( hash_id, mime ) in self._Execute( 'SELECT hash_id, mime FROM {} CROSS JOIN files_info USING ( hash_id );'.format( temp_hash_ids_table_name ) ) }
current_rows = set( self._Execute( 'SELECT hash_id, content_type FROM {} CROSS JOIN {} USING ( hash_id );'.format( temp_hash_ids_table_name, repository_updates_processed_table_name ) ) )
correct_rows = set()
for ( hash_id, mime ) in hash_ids_to_mimes.items():
if mime == HC.APPLICATION_HYDRUS_UPDATE_DEFINITIONS:
content_types = ( HC.CONTENT_TYPE_DEFINITIONS, )
else:
content_types = tuple( HC.SERVICE_TYPES_TO_CONTENT_TYPES[ service_type ] )
correct_rows.update( ( ( hash_id, content_type ) for content_type in content_types ) )
deletee_rows = current_rows.difference( correct_rows )
if len( deletee_rows ) > 0:
# these were registered wrong at some point
self._ExecuteMany( 'DELETE FROM {} WHERE hash_id = ? AND content_type = ?;'.format( repository_updates_processed_table_name ), deletee_rows )
insert_rows = correct_rows.difference( current_rows )
if len( insert_rows ) > 0:
processed = False
self._ExecuteMany( 'INSERT OR IGNORE INTO {} ( hash_id, content_type, processed ) VALUES ( ?, ?, ? );'.format( repository_updates_processed_table_name ), ( ( hash_id, content_type, processed ) for ( hash_id, content_type ) in insert_rows ) )
if len( hash_ids_to_mimes ) > 0:
inserts = []
processed = False
for ( hash_id, mime ) in hash_ids_to_mimes.items():
if mime == HC.APPLICATION_HYDRUS_UPDATE_DEFINITIONS:
content_types = ( HC.CONTENT_TYPE_DEFINITIONS, )
else:
content_types = tuple( HC.SERVICE_TYPES_TO_CONTENT_TYPES[ service_type ] )
inserts.extend( ( ( hash_id, content_type, processed ) for content_type in content_types ) )
self._ExecuteMany( 'INSERT OR IGNORE INTO {} ( hash_id, content_type, processed ) VALUES ( ?, ?, ? );'.format( repository_updates_processed_table_name ), inserts )
self._ExecuteMany( 'DELETE FROM {} WHERE hash_id = ?;'.format( repository_unregistered_updates_table_name ), ( ( hash_id, ) for hash_id in hash_ids_to_mimes.keys() ) )
if len( deletee_rows ) + len( insert_rows ) > 0:
content_types_that_changed = { content_type for ( hash_id, content_type ) in deletee_rows.union( insert_rows ) }
for content_type in content_types_that_changed:
self._ClearOutstandingWorkCache( service_id, content_type = content_type )
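The content-type correction above boils down to a set difference between the rows the processed table currently holds and the rows the update files' mimes imply. A stripped-down sketch with hand-written example rows:

# ( hash_id, content_type ) rows, invented for illustration
current_rows = { ( 101, 1 ), ( 101, 2 ), ( 102, 1 ) }   # what is registered right now
correct_rows = { ( 101, 1 ), ( 102, 1 ), ( 102, 2 ) }   # what the mimes say should be registered

deletee_rows = current_rows.difference( correct_rows )  # { ( 101, 2 ) } -- was registered wrong at some point
insert_rows = correct_rows.difference( current_rows )   # { ( 102, 2 ) } -- missing, added as unprocessed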
def _ReprocessRepository( self, service_id, content_types ):
@ -220,31 +265,6 @@ class ClientDBRepositories( ClientDBModule.ClientDBModule ):
self.modules_files_maintenance_queue.AddJobs( update_hash_ids, job_type )
def _UnregisterUpdates( self, service_id, hash_ids = None ):
( repository_updates_table_name, repository_unregistered_updates_table_name, repository_updates_processed_table_name ) = GenerateRepositoryUpdatesTableNames( service_id )
if hash_ids is None:
hash_ids = self._STS( self._Execute( 'SELECT hash_id FROM {};'.format( repository_updates_processed_table_name ) ) )
else:
with self._MakeTemporaryIntegerTable( hash_ids, 'hash_id' ) as temp_hash_ids_table_name:
hash_ids = self._STS( self._Execute( 'SELECT hash_id FROM {} CROSS JOIN {} USING ( hash_id );'.format( temp_hash_ids_table_name, repository_updates_processed_table_name ) ) )
if len( hash_ids ) > 0:
self._ClearOutstandingWorkCache( service_id )
self._ExecuteMany( 'DELETE FROM {} WHERE hash_id = ?;'.format( repository_updates_processed_table_name ), ( ( hash_id, ) for hash_id in hash_ids ) )
self._ExecuteMany( 'INSERT OR IGNORE INTO {} ( hash_id ) VALUES ( ? );'.format( repository_unregistered_updates_table_name ), ( ( hash_id, ) for hash_id in hash_ids ) )
def AssociateRepositoryUpdateHashes( self, service_key: bytes, metadata_slice: HydrusNetwork.Metadata ):
service_id = self.modules_services.GetServiceId( service_key )
@ -576,7 +596,10 @@ class ClientDBRepositories( ClientDBModule.ClientDBModule ):
for service_id in self.modules_services.GetServiceIds( HC.REPOSITORIES ):
self._UnregisterUpdates( service_id, hash_ids )
( repository_updates_table_name, repository_unregistered_updates_table_name, repository_updates_processed_table_name ) = GenerateRepositoryUpdatesTableNames( service_id )
self._ExecuteMany( 'INSERT OR IGNORE INTO {} ( hash_id ) VALUES ( ? );'.format( repository_unregistered_updates_table_name ), ( ( hash_id, ) for hash_id in hash_ids ) )
self._RegisterUpdates( service_id, hash_ids )
@ -687,19 +710,37 @@ class ClientDBRepositories( ClientDBModule.ClientDBModule ):
def SetRepositoryUpdateHashes( self, service_key: bytes, metadata: HydrusNetwork.Metadata ):
# this is a full metadata resync
service_id = self.modules_services.GetServiceId( service_key )
all_future_update_hash_ids = self.modules_hashes_local_cache.GetHashIds( metadata.GetUpdateHashes() )
( repository_updates_table_name, repository_unregistered_updates_table_name, repository_updates_processed_table_name ) = GenerateRepositoryUpdatesTableNames( service_id )
current_update_hash_ids = self._STS( self._Execute( 'SELECT hash_id FROM {};'.format( repository_updates_table_name ) ) )
#
all_future_update_hash_ids = self.modules_hashes_local_cache.GetHashIds( metadata.GetUpdateHashes() )
current_update_hash_ids = self._STS( self._Execute( 'SELECT hash_id FROM {};'.format( repository_updates_table_name ) ) )
deletee_hash_ids = current_update_hash_ids.difference( all_future_update_hash_ids )
self._ExecuteMany( 'DELETE FROM {} WHERE hash_id = ?;'.format( repository_updates_table_name ), ( ( hash_id, ) for hash_id in deletee_hash_ids ) )
self._ExecuteMany( 'DELETE FROM {} WHERE hash_id = ?;'.format( repository_unregistered_updates_table_name ), ( ( hash_id, ) for hash_id in deletee_hash_ids ) )
self._ExecuteMany( 'DELETE FROM {} WHERE hash_id = ?;'.format( repository_updates_processed_table_name ), ( ( hash_id, ) for hash_id in deletee_hash_ids ) )
#
self._Execute( 'DELETE FROM {};'.format( repository_unregistered_updates_table_name ) )
#
good_current_hash_ids = current_update_hash_ids.intersection( all_future_update_hash_ids )
current_processed_table_update_hash_ids = self._STS( self._Execute( 'SELECT hash_id FROM {};'.format( repository_updates_processed_table_name ) ) )
deletee_processed_table_update_hash_ids = current_processed_table_update_hash_ids.difference( good_current_hash_ids )
self._ExecuteMany( 'DELETE FROM {} WHERE hash_id = ?;'.format( repository_updates_processed_table_name ), ( ( hash_id, ) for hash_id in deletee_processed_table_update_hash_ids ) )
#
inserts = []
@ -721,12 +762,10 @@ class ClientDBRepositories( ClientDBModule.ClientDBModule ):
self._ExecuteMany( 'INSERT OR IGNORE INTO {} ( update_index, hash_id ) VALUES ( ?, ? );'.format( repository_updates_table_name ), inserts )
self._ExecuteMany( 'INSERT OR IGNORE INTO {} ( hash_id ) VALUES ( ? );'.format( repository_unregistered_updates_table_name ), ( ( hash_id, ) for ( update_index, hash_id ) in inserts ) )
self._ExecuteMany( 'INSERT OR IGNORE INTO {} ( hash_id ) VALUES ( ? );'.format( repository_unregistered_updates_table_name ), ( ( hash_id, ) for hash_id in all_future_update_hash_ids ) )
self._RegisterUpdates( service_id )
self._ClearOutstandingWorkCache( service_id )
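The full metadata resync above is also mostly set arithmetic: prune update rows the fresh metadata no longer lists, prune processed rows that are not in the surviving set, then push everything through the unregistered table so _RegisterUpdates can rebuild the content-type rows. A tiny sketch with invented hash_ids:

# invented hash_ids, just to trace the pruning
current_update_hash_ids = { 1, 2, 3, 4 }        # rows in the updates table right now
all_future_update_hash_ids = { 2, 3, 4, 5 }     # what the freshly fetched metadata lists

deletee_hash_ids = current_update_hash_ids.difference( all_future_update_hash_ids )         # { 1 }

good_current_hash_ids = current_update_hash_ids.intersection( all_future_update_hash_ids )  # { 2, 3, 4 }

current_processed_table_update_hash_ids = { 1, 2, 3, 9 }  # rows in the processed table, including a bad one
deletee_processed = current_processed_table_update_hash_ids.difference( good_current_hash_ids )  # { 1, 9 }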
def SetUpdateProcessed( self, service_id: int, update_hash: bytes, content_types: typing.Collection[ int ] ):

View File

@ -1035,7 +1035,9 @@ class FrameGUI( ClientGUITopLevelWindows.MainFrameThatResizes ):
def _ClearOrphanFileRecords( self ):
text = 'This will instruct the database to review its file records and delete any orphans. You typically do not ever see these files and they are basically harmless, but they can offset some file counts confusingly. You probably only need to run this if you can\'t process the apparent last handful of duplicate filter pairs or hydrus dev otherwise told you to try it.'
text = 'DO NOT RUN THIS UNLESS YOU KNOW YOU NEED TO'
text += os.linesep * 2
text += 'This will instruct the database to review its file records and delete any orphans. You typically do not ever see these files and they are basically harmless, but they can offset some file counts confusingly. You probably only need to run this if you can\'t process the apparent last handful of duplicate filter pairs or hydrus dev otherwise told you to try it.'
text += os.linesep * 2
text += 'It will create a popup message while it works and inform you of the number of orphan records found.'
@ -1049,7 +1051,9 @@ class FrameGUI( ClientGUITopLevelWindows.MainFrameThatResizes ):
def _ClearOrphanHashedSerialisables( self ):
text = 'This force-runs a routine that regularly removes some spare data from the database. You most likely do not need to run it.'
text = 'DO NOT RUN THIS UNLESS YOU KNOW YOU NEED TO'
text += os.linesep * 2
text += 'This force-runs a routine that regularly removes some spare data from the database. You most likely do not need to run it.'
result = ClientGUIDialogsQuick.GetYesNo( self, text, yes_label = 'do it', no_label = 'forget it' )
@ -1079,7 +1083,9 @@ class FrameGUI( ClientGUITopLevelWindows.MainFrameThatResizes ):
def _ClearOrphanTables( self ):
text = 'This will instruct the database to review its service tables and delete any orphans. This will typically do nothing, but hydrus dev may tell you to run this, just to check. Be sure you have a semi-recent backup before you run this.'
text = 'DO NOT RUN THIS UNLESS YOU KNOW YOU NEED TO'
text += os.linesep * 2
text += 'This will instruct the database to review its service tables and delete any orphans. This will typically do nothing, but hydrus dev may tell you to run this, just to check. Be sure you have a recent backup before you run this--if it deletes something important by accident, you will want to roll back!'
text += os.linesep * 2
text += 'It will create popups if it finds anything to delete.'

View File

@ -297,8 +297,9 @@ class mpvWidget( QW.QWidget ):
current_frame_index = int( round( ( current_timestamp_ms / self._media.GetDuration() ) * num_frames ) )
current_frame_index = min( current_frame_index, num_frames - 1 )
current_frame_index = min( current_frame_index, num_frames - 1 )
current_timestamp_ms = min( current_timestamp_ms, self._media.GetDuration() )
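A rough worked example of why the clamp matters, assuming mpv can report a timestamp a few milliseconds past the stored duration:

# hand-picked example numbers
duration_ms = 4000
num_frames = 240
current_timestamp_ms = 4007  # a timestamp slightly past the end of the file

current_frame_index = int( round( ( current_timestamp_ms / duration_ms ) * num_frames ) )  # 240, one past the last valid index
current_frame_index = min( current_frame_index, num_frames - 1 )                           # clamped back to 239
current_timestamp_ms = min( current_timestamp_ms, duration_ms )                            # the added line clamps the timestamp as well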

View File

@ -1,3 +1,4 @@
import bisect
import collections
import itertools
import os
@ -352,7 +353,7 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
self._CheckTagsVeto( self._tags, tag_import_options )
def DownloadAndImportRawFile( self, file_url: str, file_import_options, network_job_factory, network_job_presentation_context_factory, status_hook, override_bandwidth = False ):
def DownloadAndImportRawFile( self, file_url: str, file_import_options, network_job_factory, network_job_presentation_context_factory, status_hook, override_bandwidth = False, forced_referral_url = None, file_seed_cache = None ):
self.AddPrimaryURLs( ( file_url, ) )
@ -360,7 +361,11 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
try:
if self.file_seed_data != file_url:
if forced_referral_url is not None:
referral_url = forced_referral_url
elif self.file_seed_data != file_url:
referral_url = self.file_seed_data
@ -387,6 +392,50 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
network_job.WaitUntilDone()
actual_fetched_url = network_job.GetActualFetchedURL()
if actual_fetched_url != file_url:
self._AddPrimaryURLs( ( actual_fetched_url, ) )
( actual_url_type, actual_match_name, actual_can_parse, actual_cannot_parse_reason ) = HG.client_controller.network_engine.domain_manager.GetURLParseCapability( actual_fetched_url )
if actual_url_type == HC.URL_TYPE_POST and actual_can_parse:
# we just had a 3XX redirect to a Post URL!
if file_seed_cache is None:
raise Exception( 'The downloader thought it had a raw file url with "{}", but that redirected to the apparent Post URL "{}", but then there was no file log in which to queue that download!'.format( file_url, actual_fetched_url ) )
else:
( original_url_type, original_match_name, original_can_parse, original_cannot_parse_reason ) = HG.client_controller.network_engine.domain_manager.GetURLParseCapability( self.file_seed_data )
if original_url_type == actual_url_type and original_match_name == actual_match_name:
raise Exception( 'The downloader thought it had a raw file url with "{}", but that redirected to the apparent Post URL "{}". As that URL has the same class as this import job\'s original URL, we are stopping here in case this is a looping redirect!'.format( file_url, actual_fetched_url ) )
file_seed = FileSeed( FILE_SEED_TYPE_URL, actual_fetched_url )
file_seed.SetReferralURL( file_url )
file_seeds = [ file_seed ]
file_seed_cache.AddFileSeeds( file_seeds )
status = CC.STATUS_SUCCESSFUL_AND_NEW
note = 'was redirected on file download to a post url, which has been queued in the parent file log'
self.SetStatus( status, note = note )
return
status_hook( 'importing file' )
self.Import( temp_path, file_import_options, status_hook = status_hook )
@ -777,7 +826,14 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
if self.file_seed_type == FILE_SEED_TYPE_URL:
( url_type, match_name, can_parse, cannot_parse_reason ) = HG.client_controller.network_engine.domain_manager.GetURLParseCapability( self.file_seed_data )
try:
( url_type, match_name, can_parse, cannot_parse_reason ) = HG.client_controller.network_engine.domain_manager.GetURLParseCapability( self.file_seed_data )
except HydrusExceptions.URLClassException:
return False
if url_type == HC.URL_TYPE_POST:
@ -1063,6 +1119,8 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
post_url = self.file_seed_data
url_for_child_referral = post_url
( url_to_check, parser ) = HG.client_controller.network_engine.domain_manager.GetURLToFetchAndParser( post_url )
status_hook( 'downloading file page' )
@ -1095,12 +1153,18 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
if actual_fetched_url != url_to_check:
( url_type, match_name, can_parse, cannot_parse_reason ) = HG.client_controller.network_engine.domain_manager.GetURLParseCapability( actual_fetched_url )
# we have redirected, a 3XX response
if url_type == HC.URL_TYPE_POST and can_parse:
( actual_url_type, actual_match_name, actual_can_parse, actual_cannot_parse_reason ) = HG.client_controller.network_engine.domain_manager.GetURLParseCapability( actual_fetched_url )
if actual_url_type == HC.URL_TYPE_POST and actual_can_parse:
self._AddPrimaryURLs( ( actual_fetched_url, ) )
post_url = actual_fetched_url
url_for_child_referral = post_url
( url_to_check, parser ) = HG.client_controller.network_engine.domain_manager.GetURLToFetchAndParser( post_url )
@ -1154,7 +1218,7 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
# multiple child urls generated by a subsidiary page parser
file_seeds = ClientImporting.ConvertAllParseResultsToFileSeeds( all_parse_results, self.file_seed_data, file_import_options )
file_seeds = ClientImporting.ConvertAllParseResultsToFileSeeds( all_parse_results, url_for_child_referral, file_import_options )
for file_seed in file_seeds:
@ -1218,7 +1282,7 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
if should_download_file:
self.DownloadAndImportRawFile( file_url, file_import_options, network_job_factory, network_job_presentation_context_factory, status_hook, override_bandwidth = True )
self.DownloadAndImportRawFile( file_url, file_import_options, network_job_factory, network_job_presentation_context_factory, status_hook, override_bandwidth = True, forced_referral_url = url_for_child_referral, file_seed_cache = file_seed_cache )
elif url_type == HC.URL_TYPE_POST and can_parse:
@ -1254,7 +1318,7 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
duplicate_file_seed.file_seed_data = child_url
duplicate_file_seed.SetReferralURL( self.file_seed_data )
duplicate_file_seed.SetReferralURL( url_for_child_referral )
if self._referral_url is not None:
@ -1295,7 +1359,7 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
file_url = self.file_seed_data
self.DownloadAndImportRawFile( file_url, file_import_options, network_job_factory, network_job_presentation_context_factory, status_hook )
self.DownloadAndImportRawFile( file_url, file_import_options, network_job_factory, network_job_presentation_context_factory, status_hook, file_seed_cache = file_seed_cache )
@ -1675,11 +1739,14 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
self._file_seeds_to_indices = {}
self._statuses_to_indexed_file_seeds = collections.defaultdict( list )
self._file_seed_cache_key = HydrusData.GenerateKey()
self._status_cache = FileSeedCacheStatus()
self._status_dirty = True
self._statuses_to_indexed_file_seeds_dirty = True
self._lock = threading.Lock()
@ -1689,6 +1756,52 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
return len( self._file_seeds )
def _FixFileSeedsStatusPosition( self, file_seeds ):
indices_and_file_seeds_affected = []
for file_seed in file_seeds:
if file_seed in self._file_seeds_to_indices:
indices_and_file_seeds_affected.append( ( self._file_seeds_to_indices[ file_seed ], file_seed ) )
else:
self._SetStatusesToFileSeedsDirty()
return
for row in indices_and_file_seeds_affected:
correct_status = row[1].status
if row in self._statuses_to_indexed_file_seeds[ correct_status ]:
continue
for ( status, indices_and_file_seeds ) in self._statuses_to_indexed_file_seeds.items():
if status == correct_status:
continue
if row in indices_and_file_seeds:
indices_and_file_seeds.remove( row )
bisect.insort( self._statuses_to_indexed_file_seeds[ correct_status ], row )
break
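A self-contained sketch of the bookkeeping pattern used above, with plain strings standing in for FileSeed objects: each status maps to a list of ( file log index, file seed ) rows, and bisect.insort keeps a moved row in file log order:

import bisect
import collections

statuses_to_rows = collections.defaultdict( list )
statuses_to_rows[ 'unknown' ] = [ ( 0, 'seed_a' ), ( 2, 'seed_c' ), ( 5, 'seed_f' ) ]

# 'seed_c' just finished importing, so its row moves lists but keeps its index
row = ( 2, 'seed_c' )
statuses_to_rows[ 'unknown' ].remove( row )
bisect.insort( statuses_to_rows[ 'successful' ], row )

# statuses_to_rows[ 'unknown' ]    == [ ( 0, 'seed_a' ), ( 5, 'seed_f' ) ]
# statuses_to_rows[ 'successful' ] == [ ( 2, 'seed_c' ) ]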
def _GenerateStatus( self ):
fscs = FileSeedCacheStatus()
@ -1709,7 +1822,12 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
else:
return [ file_seed for file_seed in self._file_seeds if file_seed.status == status ]
if self._statuses_to_indexed_file_seeds_dirty:
self._RegenerateStatusesToFileSeeds()
return [ file_seed for ( index, file_seed ) in self._statuses_to_indexed_file_seeds[ status ] ]
@ -1729,12 +1847,33 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
def _GetNextFileSeed( self, status: int ) -> typing.Optional[ FileSeed ]:
for file_seed in self._file_seeds:
# the problem with this is if a file seed recently changed but 'notifyupdated' hasn't had a chance to go yet
# there could be a FS in a list other than the one we are looking at that has the status we want
# _however_, it seems like I do not do any async calls to notifyupdated in the actual FSC, only from notifyupdated to GUI elements, so we _seem_ to be good
if self._statuses_to_indexed_file_seeds_dirty:
self._RegenerateStatusesToFileSeeds()
indexed_file_seeds = self._statuses_to_indexed_file_seeds[ status ]
while len( indexed_file_seeds ) > 0:
row = indexed_file_seeds[ 0 ]
file_seed = row[1]
if file_seed.status == status:
return file_seed
else:
self._FixFileSeedsStatusPosition( ( file_seed, ) )
indexed_file_seeds = self._statuses_to_indexed_file_seeds[ status ]
return None
@ -1764,9 +1903,19 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
statuses_to_counts = collections.Counter()
for file_seed in self._file_seeds:
if self._statuses_to_indexed_file_seeds_dirty:
statuses_to_counts[ file_seed.status ] += 1
self._RegenerateStatusesToFileSeeds()
for ( status, indexed_file_seeds ) in self._statuses_to_indexed_file_seeds.items():
count = len( indexed_file_seeds )
if count > 0:
statuses_to_counts[ status ] = count
return statuses_to_counts
@ -1789,6 +1938,30 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
self._file_seeds_to_indices = { file_seed : index for ( index, file_seed ) in enumerate( self._file_seeds ) }
self._SetStatusesToFileSeedsDirty()
def _RegenerateStatusesToFileSeeds( self ):
self._statuses_to_indexed_file_seeds = collections.defaultdict( list )
for ( file_seed, index ) in self._file_seeds_to_indices.items():
self._statuses_to_indexed_file_seeds[ file_seed.status ].append( ( index, file_seed ) )
for indexed_file_seeds in self._statuses_to_indexed_file_seeds.values():
indexed_file_seeds.sort()
self._statuses_to_indexed_file_seeds_dirty = False
def _SetStatusesToFileSeedsDirty( self ):
self._statuses_to_indexed_file_seeds_dirty = True
def _SetStatusDirty( self ):
@ -2035,7 +2208,14 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
self._file_seeds.append( file_seed )
self._file_seeds_to_indices[ file_seed ] = len( self._file_seeds ) - 1
index = len( self._file_seeds ) - 1
self._file_seeds_to_indices[ file_seed ] = index
if not self._statuses_to_indexed_file_seeds_dirty:
self._statuses_to_indexed_file_seeds[ file_seed.status ].append( ( index, file_seed ) )
self._SetStatusDirty()
@ -2063,6 +2243,8 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
self._file_seeds_to_indices = { file_seed : index for ( index, file_seed ) in enumerate( self._file_seeds ) }
self._SetStatusesToFileSeedsDirty()
self.NotifyFileSeedsUpdated( ( file_seed, ) )
@ -2121,6 +2303,8 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
self._file_seeds = new_file_seeds
self._file_seeds_to_indices = { file_seed : index for ( index, file_seed ) in enumerate( self._file_seeds ) }
self._SetStatusesToFileSeedsDirty()
self._SetStatusDirty()
@ -2142,6 +2326,8 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
self._file_seeds_to_indices = { file_seed : index for ( index, file_seed ) in enumerate( self._file_seeds ) }
self._SetStatusesToFileSeedsDirty()
self.NotifyFileSeedsUpdated( ( file_seed, ) )
@ -2205,7 +2391,16 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
else:
example_seed = self._GetNextFileSeed( CC.STATUS_UNKNOWN )
good_file_seeds = [ file_seed for file_seed in self._file_seeds[-30:] if file_seed.status in CC.SUCCESSFUL_IMPORT_STATES ]
if len( good_file_seeds ) > 0:
example_seed = random.choice( good_file_seeds )
else:
example_seed = self._GetNextFileSeed( CC.STATUS_UNKNOWN )
if example_seed is None:
@ -2241,14 +2436,13 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
else:
for file_seed in self._file_seeds:
if self._statuses_to_indexed_file_seeds_dirty:
if file_seed.status == status:
result += 1
self._RegenerateStatusesToFileSeeds()
return len( self._statuses_to_indexed_file_seeds[ status ] )
return result
@ -2430,6 +2624,8 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
self._file_seeds_to_indices = { file_seed : index for ( index, file_seed ) in enumerate( self._file_seeds ) }
self._SetStatusesToFileSeedsDirty()
self._SetStatusDirty()
@@ -2442,6 +2638,13 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
         with self._lock:
+            if not self._statuses_to_indexed_file_seeds_dirty:
+                self._FixFileSeedsStatusPosition( file_seeds )
+            #
             self._SetStatusDirty()
@@ -2458,6 +2661,8 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
             self._file_seeds_to_indices = { file_seed : index for ( index, file_seed ) in enumerate( self._file_seeds ) }
+            self._SetStatusesToFileSeedsDirty()
             self._SetStatusDirty()
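
Taken together, the FileSeedCache hunks above maintain a status -> sorted ( index, file_seed ) map behind a dirty flag, so 'count with status x' and 'next with status x' no longer walk the whole list. A minimal, self-contained sketch of that caching pattern follows; the class and method names are invented for illustration and are not the real FileSeedCache API.

# illustrative sketch only: a lazily rebuilt status index over an append-mostly list
import collections

class StatusIndexedList:

    def __init__( self ):
        self._items = []                                                    # ( status, payload ) pairs
        self._statuses_to_indexed_items = collections.defaultdict( list )  # status -> [ ( index, payload ) ]
        self._dirty = True

    def _regenerate( self ):
        self._statuses_to_indexed_items = collections.defaultdict( list )
        for ( index, ( status, payload ) ) in enumerate( self._items ):
            self._statuses_to_indexed_items[ status ].append( ( index, payload ) )
        self._dirty = False

    def append( self, status, payload ):
        self._items.append( ( status, payload ) )
        if not self._dirty:
            # keep the index warm instead of throwing it away on every append
            self._statuses_to_indexed_items[ status ].append( ( len( self._items ) - 1, payload ) )

    def get_count( self, status ):
        if self._dirty:
            self._regenerate()
        return len( self._statuses_to_indexed_items[ status ] )

    def get_next( self, status ):
        if self._dirty:
            self._regenerate()
        indexed_items = self._statuses_to_indexed_items[ status ]
        return indexed_items[ 0 ][ 1 ] if len( indexed_items ) > 0 else None

cache = StatusIndexedList()
cache.append( 'unknown', 'https://example.com/post/1' )
cache.append( 'successful', 'https://example.com/post/2' )
print( cache.get_count( 'unknown' ) )  # 1
print( cache.get_next( 'unknown' ) )   # https://example.com/post/1

Bulk edits simply flip the dirty flag and let the next read rebuild the map, which is the same trade the _SetStatusesToFileSeedsDirty calls above make.
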

View File

@@ -322,6 +322,8 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
         gallery_url = self.url
+        url_for_child_referral = gallery_url
         ( url_type, match_name, can_parse, cannot_parse_reason ) = HG.client_controller.network_engine.domain_manager.GetURLParseCapability( gallery_url )
         if url_type not in ( HC.URL_TYPE_GALLERY, HC.URL_TYPE_WATCHABLE ):
@@ -380,6 +382,8 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
             gallery_url = actual_fetched_url
+            url_for_child_referral = gallery_url
             ( url_to_check, parser ) = HG.client_controller.network_engine.domain_manager.GetURLToFetchAndParser( gallery_url )
         else:
@@ -399,7 +403,7 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
             file_seed = ClientImportFileSeeds.FileSeed( ClientImportFileSeeds.FILE_SEED_TYPE_URL, actual_fetched_url )
-            file_seed.SetReferralURL( gallery_url )
+            file_seed.SetReferralURL( url_for_child_referral )
             file_seeds = [ file_seed ]
@@ -426,7 +430,7 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
                 raise HydrusExceptions.VetoException( 'The parser found nothing in the document!' )
-            file_seeds = ClientImporting.ConvertAllParseResultsToFileSeeds( all_parse_results, gallery_url, file_import_options )
+            file_seeds = ClientImporting.ConvertAllParseResultsToFileSeeds( all_parse_results, url_for_child_referral, file_import_options )
             title = ClientParsing.GetTitleFromAllParseResults( all_parse_results )
@@ -560,6 +564,7 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
         for next_gallery_seed in next_gallery_seeds:
             next_gallery_seed.SetRunToken( self._run_token )
+            next_gallery_seed.SetReferralURL( url_for_child_referral )
             next_gallery_seed.SetExternalFilterableTags( self._external_filterable_tags )
             next_gallery_seed.SetExternalAdditionalServiceKeysToTags( self._external_additional_service_keys_to_tags )
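
In the hunks above, url_for_child_referral starts as the requested gallery url and is switched to the redirected url when the fetch actually lands somewhere else, and both the child file seeds and the 'next gallery page' seeds then cite it as their referrer. A tiny sketch of that decision, with a helper name invented for the example:

# illustrative sketch only: pick the referral url for child jobs after a possible 3XX redirect
def get_url_for_child_referral( requested_gallery_url, actual_fetched_url ):
    # if the server redirected us, the page the children actually came from is the redirected url
    if actual_fetched_url is not None and actual_fetched_url != requested_gallery_url:
        return actual_fetched_url
    return requested_gallery_url

print( get_url_for_child_referral( 'https://example.com/gallery?page=1', 'https://example.com/gallery_v2?page=1' ) )
# https://example.com/gallery_v2?page=1
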

View File

@@ -2066,6 +2066,14 @@ class NetworkDomainManager( HydrusSerialisable.SerialisableBase ):
+    def SetDefaultFilePostTagImportOptions( self, tag_import_options ):
+        with self._lock:
+            self._file_post_default_tag_import_options = tag_import_options
+
     def SetDefaultGUGKeyAndName( self, gug_key_and_name ):
         with self._lock:
View File

@@ -81,7 +81,7 @@ options = {}
 # Misc
 NETWORK_VERSION = 20
-SOFTWARE_VERSION = 461
+SOFTWARE_VERSION = 462
 CLIENT_API_VERSION = 22
 SERVER_THUMBNAIL_DIMENSIONS = ( 200, 200 )

View File

@@ -68,6 +68,11 @@ class HydrusDBModule( HydrusDBBase.DBBase ):
         return table_generation_dict
+
+    def _GetServiceTablePrefixes( self ) -> typing.Collection:
+        return set()
+
     def _GetServiceIdsWeGenerateDynamicTablesFor( self ):
         return []
@@ -160,6 +165,24 @@ class HydrusDBModule( HydrusDBBase.DBBase ):
         return list( table_generation_dict.keys() )
+
+    def GetSurplusServiceTableNames( self, all_table_names ) -> set:
+        prefixes = self._GetServiceTablePrefixes()
+        if len( prefixes ) == 0:
+            return set()
+        all_service_table_names = { table_name for table_name in all_table_names if True in ( table_name.startswith( prefix ) or '.{}'.format( prefix ) in table_name for prefix in prefixes ) }
+        good_service_table_names = self.GetExpectedServiceTableNames()
+        surplus_table_names = all_service_table_names.difference( good_service_table_names )
+        return surplus_table_names
+
     def GetTablesAndColumnsThatUseDefinitions( self, content_type: int ) -> typing.List[ typing.Tuple[ str, str ] ]:
         # could also do another one of these for orphan tables that have service id in the name.
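
GetSurplusServiceTableNames above treats a table as a per-service table if it starts with a known prefix, or carries that prefix after a schema qualifier such as 'external_caches.', and reports it as surplus when no current service expects it. A small standalone demonstration with invented table names:

# illustrative sketch only: find orphaned per-service tables by prefix
def get_surplus_service_table_names( all_table_names, prefixes, expected_table_names ):
    if len( prefixes ) == 0:
        return set()
    all_service_table_names = { table_name for table_name in all_table_names if True in ( table_name.startswith( prefix ) or '.{}'.format( prefix ) in table_name for prefix in prefixes ) }
    return all_service_table_names.difference( expected_table_names )

# hypothetical tables: service 7 still exists, service 3 was deleted long ago
all_tables = { 'external_caches.specific_files_3', 'specific_files_7', 'main.unrelated_table' }
print( get_surplus_service_table_names( all_tables, { 'specific_files_' }, { 'specific_files_7' } ) )
# {'external_caches.specific_files_3'}
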

View File

@@ -246,6 +246,23 @@ def ConvertPrettyStringsToUglyNamespaces( pretty_strings ):
 def ConvertResolutionToPrettyString( resolution ):
     if resolution is None:
         return 'no resolution'
+    if not isinstance( resolution, tuple ):
+        try:
+            resolution = tuple( resolution )
+        except:
+            return 'broken resolution'
     if resolution in HC.NICE_RESOLUTIONS:
         return HC.NICE_RESOLUTIONS[ resolution ]

View File

@@ -711,7 +711,24 @@ def ParseFFMPEGFPSPossibleResults( video_line ):
     possible_results.discard( 0 )
-    confident = len( possible_results ) <= 1
+    if len( possible_results ) == 0:
+        confident = False
+    else:
+        # if we have 60 and 59.99, that's fine mate
+        max_fps = max( possible_results )
+        if False not in ( possible_fps >= max_fps * 0.95 for possible_fps in possible_results ):
+            confident = True
+        else:
+            confident = len( possible_results ) <= 1
     return ( possible_results, confident )
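
Run standalone on the 60 vs 59.99 case from the changelog, the tolerance check above behaves like this (a throwaway sketch, not the handler's real entry point):

# illustrative sketch only: two near-identical fps candidates from ffmpeg output
possible_results = { 60.0, 59.99 }

max_fps = max( possible_results )

# every candidate within 5% of the largest one -> treat the parse as confident
confident = False not in ( possible_fps >= max_fps * 0.95 for possible_fps in possible_results )

print( max_fps, confident )  # 60.0 True
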

View File

@@ -762,7 +762,7 @@ class TestClientDB( unittest.TestCase ):
         predicates.append( ClientSearch.Predicate( ClientSearch.PREDICATE_TYPE_SYSTEM_EVERYTHING, min_current_count = 1 ) )
         predicates.append( ClientSearch.Predicate( ClientSearch.PREDICATE_TYPE_SYSTEM_INBOX, min_current_count = 1 ) )
         predicates.append( ClientSearch.Predicate( ClientSearch.PREDICATE_TYPE_SYSTEM_ARCHIVE, min_current_count = 0 ) )
-        predicates.extend( [ ClientSearch.Predicate( predicate_type ) for predicate_type in [ ClientSearch.PREDICATE_TYPE_SYSTEM_NUM_TAGS, ClientSearch.PREDICATE_TYPE_SYSTEM_LIMIT, ClientSearch.PREDICATE_TYPE_SYSTEM_SIZE, ClientSearch.PREDICATE_TYPE_SYSTEM_AGE, ClientSearch.PREDICATE_TYPE_SYSTEM_MODIFIED_TIME, ClientSearch.PREDICATE_TYPE_SYSTEM_KNOWN_URLS, ClientSearch.PREDICATE_TYPE_SYSTEM_HAS_AUDIO, ClientSearch.PREDICATE_TYPE_SYSTEM_HASH, ClientSearch.PREDICATE_TYPE_SYSTEM_DIMENSIONS, ClientSearch.PREDICATE_TYPE_SYSTEM_DURATION, ClientSearch.PREDICATE_TYPE_SYSTEM_NOTES, ClientSearch.PREDICATE_TYPE_SYSTEM_NUM_WORDS, ClientSearch.PREDICATE_TYPE_SYSTEM_MIME, ClientSearch.PREDICATE_TYPE_SYSTEM_SIMILAR_TO, ClientSearch.PREDICATE_TYPE_SYSTEM_FILE_SERVICE, ClientSearch.PREDICATE_TYPE_SYSTEM_TAG_AS_NUMBER, ClientSearch.PREDICATE_TYPE_SYSTEM_FILE_RELATIONSHIPS, ClientSearch.PREDICATE_TYPE_SYSTEM_FILE_VIEWING_STATS ] ] )
+        predicates.extend( [ ClientSearch.Predicate( predicate_type ) for predicate_type in [ ClientSearch.PREDICATE_TYPE_SYSTEM_NUM_TAGS, ClientSearch.PREDICATE_TYPE_SYSTEM_LIMIT, ClientSearch.PREDICATE_TYPE_SYSTEM_SIZE, ClientSearch.PREDICATE_TYPE_SYSTEM_AGE, ClientSearch.PREDICATE_TYPE_SYSTEM_MODIFIED_TIME, ClientSearch.PREDICATE_TYPE_SYSTEM_KNOWN_URLS, ClientSearch.PREDICATE_TYPE_SYSTEM_HAS_AUDIO, ClientSearch.PREDICATE_TYPE_SYSTEM_HASH, ClientSearch.PREDICATE_TYPE_SYSTEM_DIMENSIONS, ClientSearch.PREDICATE_TYPE_SYSTEM_DURATION, ClientSearch.PREDICATE_TYPE_SYSTEM_NOTES, ClientSearch.PREDICATE_TYPE_SYSTEM_NUM_WORDS, ClientSearch.PREDICATE_TYPE_SYSTEM_MIME, ClientSearch.PREDICATE_TYPE_SYSTEM_RATING, ClientSearch.PREDICATE_TYPE_SYSTEM_SIMILAR_TO, ClientSearch.PREDICATE_TYPE_SYSTEM_FILE_SERVICE, ClientSearch.PREDICATE_TYPE_SYSTEM_TAG_AS_NUMBER, ClientSearch.PREDICATE_TYPE_SYSTEM_FILE_RELATIONSHIPS, ClientSearch.PREDICATE_TYPE_SYSTEM_FILE_VIEWING_STATS ] ] )
         self.assertEqual( set( result ), set( predicates ) )
@@ -1842,11 +1842,11 @@ class TestClientDB( unittest.TestCase ):
         TestClientDB._clear_db()
-        result = self._read( 'services', ( HC.LOCAL_FILE_DOMAIN, HC.LOCAL_FILE_TRASH_DOMAIN, HC.COMBINED_LOCAL_FILE, HC.LOCAL_TAG ) )
+        result = self._read( 'services', ( HC.LOCAL_FILE_DOMAIN, HC.LOCAL_FILE_TRASH_DOMAIN, HC.COMBINED_LOCAL_FILE, HC.LOCAL_TAG, HC.LOCAL_RATING_LIKE ) )
         result_service_keys = { service.GetServiceKey() for service in result }
-        self.assertEqual( { CC.TRASH_SERVICE_KEY, CC.LOCAL_FILE_SERVICE_KEY, CC.LOCAL_UPDATE_SERVICE_KEY, CC.COMBINED_LOCAL_FILE_SERVICE_KEY, CC.DEFAULT_LOCAL_TAG_SERVICE_KEY }, result_service_keys )
+        self.assertEqual( { CC.TRASH_SERVICE_KEY, CC.LOCAL_FILE_SERVICE_KEY, CC.LOCAL_UPDATE_SERVICE_KEY, CC.COMBINED_LOCAL_FILE_SERVICE_KEY, CC.DEFAULT_LOCAL_TAG_SERVICE_KEY, CC.DEFAULT_LOCAL_DOWNLOADER_TAG_SERVICE_KEY, CC.DEFAULT_FAVOURITES_RATING_SERVICE_KEY }, result_service_keys )
         #
@@ -1862,7 +1862,7 @@ class TestClientDB( unittest.TestCase ):
         #
-        NUM_DEFAULT_SERVICES = 10
+        NUM_DEFAULT_SERVICES = 12
         services = self._read( 'services' )

Binary file not shown. (After: 2.2 KiB)

Binary file not shown. (Before: 2.7 KiB)