Version 319

This commit is contained in:
Hydrus Network Developer 2018-08-22 16:10:59 -05:00
parent 658fdf672c
commit c45aa2c914
32 changed files with 1289 additions and 360 deletions

View File

@ -8,6 +8,32 @@
<div class="content">
<h3>changelog</h3>
<ul>
<li><h3>version 319</h3></li>
<ul>
<li>started the new convert-query-text-to-gallery-urls object. these objects, which I was thinking of calling 'Searchers', will be called the more specific and practical 'Gallery URL Generators', or GUGs for short</li>
<li>the first version of GUGs is done, and I've written some test ui for advanced users under network->downloader definitions->manage gugs. this ui doesn't save anything yet, but lets you mess around with different values. if we don't think of anything else needed in the next week, I will fix this code for v320 and start filling in defaults</li>
<li>watchers now have a global checking slot, much like the recent change to galleries and subs. it safely throttles dozens of threads so they don't rudely hammer your (or the destination server's) CPU if they all happen to want to go at once (like just after your computer wakes up). the option is similarly under options->downloading, and is global for the moment</li>
<li>moved the new gallery delay/token management code to the better-fit bandwidth manager (it was in domain manager before)</li>
<li>the gallery delay/token code now works per-domain!</li>
<li>moved the gallery delay/token checking code into the network job proper, simplifying a bunch of import-level code and making the text now appear in the network job control. token consumption now occurs after bandwidth (it is now the last hoop to jump through, which reduces the chance of a pileup in unusual situations). I expect to soon add some kind of 'force-go' action to the cog menu</li>
<li>the network engine will now not permit more than three jobs active per domain, and the overall limit has been raised from ten to fifteen</li>
<li>the media right-click menu now supports copying: all of a file's recognised urls; all of a file's urls; all selected files' urls of a specific url class; and all selected files' urls</li>
<li>reworked and harmonised a bunch of url parsing and generation code--all urls should now appear as full unicode across the program, generally without %20-type encoding characters unless explicitly entered by the user. character encoding now all happens on the backend in requests</li>
<li>non-url-class-matched urls now have their query parameters alphabetised as part of the normalisation process</li>
<li>all urls in the db will have their query params alphabetised on update, and any file relationships merged to the new/existing normalised url</li>
<li>the manage urls dialog will now normalise newly added urls (but should also still permit the removal of non-normalised urls)</li>
<li>reworked how gallery hits update file import object caches, particularly for subscriptions</li>
<li>fixed an issue in subscriptions gallery logging where the gallery log would always state it had found the max number of files and typically redundantly generate an 'ignored' stub--it should now say something like 'found 7 files - saw 5 previously seen urls, so assuming we caught up' as originally intended</li>
<li>simplified some gallery->file import object creation</li>
<li>galleries now compact down to 100 entries (was 25)</li>
<li>watchers now gallery-compact after a successful check</li>
<li>watchers now show the 'just added'/'already watching' status for 15s, up from 5s</li>
<li>network report mode now reports three times--once each for job addition, start, and successful completion</li>
<li>fixed an issue with the new 'max width' popup sizing calculation that was sometimes not accounting for new height requirements correctly</li>
<li>fixed an issue with the new url class next page generation code</li>
<li>fixed an issue where TIOs with data regarding since-deleted services were failing to initialise at the ui level</li>
<li>misc status text cleanup</li>
</ul>
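The bullets above describe the new normalisation step that alphabetises query parameters on non-url-class-matched urls. A minimal sketch of that idea using Python 3's urllib.parse (function names here are illustrative; the client's actual helper is AlphabetiseQueryText in ClientNetworkingDomain, and the shipped code is Python 2):

```python
from urllib.parse import urlparse, urlunparse

def alphabetise_query_text(query):
    # sort the raw 'key=value' pairs by their full text
    if query == '':
        return query
    return '&'.join(sorted(query.split('&')))

def normalise_url(url):
    # rebuild the url with its query parameters alphabetised
    p = urlparse(url)
    return urlunparse(p._replace(query=alphabetise_query_text(p.query)))
```

Because the sort is purely textual, any two orderings of the same parameters normalise to one canonical url, which is what lets the db update merge duplicate file relationships.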
<li><h3>version 318</h3></li>
<ul>
<li>downloaders:</li>

View File

@ -7,7 +7,7 @@
<body>
<div class="content">
<p><a href="downloader_parsers.html"><---- Back to Parsers</a></p>
<h3>searches</h3>
<h3>muh gugs</h3>
<p class="right"><a href="downloader_completion.html">Now let's put it all together ----></a></p>
</div>
</body>

View File

@ -17,8 +17,8 @@
<li>This takes a string like 'blue_eyes' to produce a series of thumbnail gallery pages URLs that can be parsed for image page URLs which can ultimately be parsed for file URLs and metadata like tags. Boorus fall into this category.</li>
<li><h3>Thread Watcher</h3></li>
<li>This takes a URL that it will check repeatedly, parsing it for new URLs that it then queues up to be downloaded. It typically stops checking after the 'file velocity' (such as '1 new file per day') drops below a certain level.</li>
<li><h3>Single Page Downloader</h3></li>
<li>This takes a URL one-time and parses it for more URLs. This is a miscellaneous system for certain simple gallery types. The 'page of images' downloader is one of these.</li>
<li><h3>Simple Downloader</h3></li>
<li>This takes a URL one-time and parses it for more URLs. This is a miscellaneous system for certain simple gallery types.</li>
</ul>
<p>The system currently supports HTML and JSON parsing.</p>
<h3>what does a downloader do?</h3>
@ -32,12 +32,12 @@
</ul>
<p>So we have three components:</p>
<ul>
<li><b>Search:</b> faces the user and converts text input into a series of Gallery URLs.</li>
<li><b>URL Class:</b> identifies URLs and informs the client how to deal with them.</li>
<li><b>Parser:</b> converts data from URLs into hydrus-understandable metadata.</li>
<li><a href="downloader_gugs.html"><b>Gallery URL Generator (GUG):</b></a> faces the user and converts text input into Gallery URLs.</li>
<li><a href="downloader_url_classes.html"><b>URL Class:</b></a> identifies URLs and informs the client how to deal with them.</li>
<li><a href="downloader_parsers.html"><b>Parser:</b></a> converts data from URLs into hydrus-understandable metadata.</li>
</ul>
<p>Thread watchers and single page downloaders do not need the 'Search' component, as the input in this case <i>is</i> a URL. You drop an imageboard thread URL on the client and it automatically recognises what it is, launches a thread watcher page for it, and finds the correct parser for the output.</p>
<p class="right"><a href="downloader_url_classes.html">Let's learn about URL Classes ----></a></p>
<p>Thread watchers and simple downloaders do not need the Gallery URL Generator, as the input in this case <i>is</i> a URL. You drop an imageboard thread URL on the client and it automatically recognises what it is, launches a thread watcher page for it, and finds the correct parser for the output.</p>
<p class="right"><a href="downloader_url_classes.html">Let's first learn about URL Classes ----></a></p>
</div>
</body>
</html>

View File

@ -9,7 +9,7 @@
<p><a href="downloader_url_classes.html"><---- Back to URL Classes</a></p>
<p class="warning">This system is still under construction in places! Even when it is done, it will only be for advanced users!</p>
<h3>parsers</h3>
<p>In hydrus, a parser is an object that takes a single block of HTML or JSON data (as returned by a URL) and returns many kinds of hydrus-level metadata.</p>
<p>In hydrus, a parser is an object that takes a single block of HTML or JSON data and returns many kinds of hydrus-level metadata.</p>
<p>Parsers are flexible and potentially quite complicated. You might like to open <i>network->manage parsers</i> and explore the UI as you read these pages. Check out how the default parsers already in the client work, and if you want to write a new one, see if there is something already in there that is similar--it is usually easier to duplicate an existing parser and then alter it than to create a new one from scratch every time.</p>
<p>There are three main components in the parsing system (click to open each component's help page):</p>
<ul>
@ -23,7 +23,7 @@
<li><a href="downloader_parsers_full_example_gallery_page.html">e621 HTML gallery page</a></li>
<li><a href="downloader_parsers_full_example_thread.html">8chan JSON thread API</a></li>
</ul>
<p class="right"><a href="downloader_searches.html">Taken a break? Now let's learn about Searches ----></a></p>
<p class="right"><a href="downloader_gugs.html">Taken a break? Now let's learn about Gallery URL Generators ----></a></p>
</div>
</body>
</html>

View File

@ -10823,6 +10823,68 @@ class DB( HydrusDB.HydrusDB ):
if version == 318:
try:
import urlparse
self._controller.pub( 'splash_set_status_subtext', 'normalising some urls: initialising' )
all_url_ids = self._STL( self._c.execute( 'SELECT url_id FROM urls;' ) )
num_to_do = len( all_url_ids )
for ( i, url_id ) in enumerate( all_url_ids ):
( url, ) = self._c.execute( 'SELECT url FROM urls WHERE url_id = ?;', ( url_id, ) ).fetchone()
p = urlparse.urlparse( url )
scheme = p.scheme
netloc = p.netloc
path = p.path
params = p.params
query = ClientNetworkingDomain.AlphabetiseQueryText( p.query )
fragment = p.fragment
r = urlparse.ParseResult( scheme, netloc, path, params, query, fragment )
normalised_url = r.geturl()
#
if normalised_url != url:
# ok, it changed, so lets remap the files if needed and then delete the old record
hash_ids = self._STL( self._c.execute( 'SELECT hash_id FROM url_map WHERE url_id = ?;', ( url_id, ) ) )
normalised_url_id = self._GetURLId( normalised_url )
self._c.executemany( 'INSERT OR IGNORE INTO url_map ( hash_id, url_id ) VALUES ( ?, ? );', ( ( hash_id, normalised_url_id ) for hash_id in hash_ids ) )
self._c.execute( 'DELETE FROM url_map WHERE url_id = ?;', ( url_id, ) )
self._c.execute( 'DELETE FROM urls WHERE url = ?;', ( url, ) )
if i % 100 == 0:
self._controller.pub( 'splash_set_status_subtext', 'normalising some urls: ' + HydrusData.ConvertValueRangeToPrettyString( i, num_to_do ) )
except Exception as e:
HydrusData.PrintException( e )
message = 'Trying to normalise urls at the db level failed! Please let hydrus dev know!'
self.pub_initial_message( message )
self._controller.pub( 'splash_set_title_text', 'updated db to v' + str( version + 1 ) )
self._c.execute( 'UPDATE version SET version = ?;', ( version + 1, ) )
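The update code above remaps file relationships from an old url record to its normalised form before deleting the old row. A toy reproduction of that remap against an in-memory sqlite database (Python 3, simplified table shapes; column names follow the snippet but the schema here is illustrative):

```python
import sqlite3

con = sqlite3.connect(':memory:')
c = con.cursor()
c.execute('CREATE TABLE urls ( url_id INTEGER PRIMARY KEY, url TEXT UNIQUE );')
c.execute('CREATE TABLE url_map ( hash_id INTEGER, url_id INTEGER, UNIQUE ( hash_id, url_id ) );')

# one old-form url mapped to two files, with the normalised form already present
c.execute('INSERT INTO urls VALUES ( 1, ? );', ('https://example.com/i?tags=x&page=1',))
c.execute('INSERT INTO urls VALUES ( 2, ? );', ('https://example.com/i?page=1&tags=x',))
c.executemany('INSERT INTO url_map VALUES ( ?, ? );', [(10, 1), (11, 1), (10, 2)])

def remap(c, old_url_id, new_url_id):
    # move file relationships to the new url_id, tolerating duplicates,
    # then drop the old mapping rows and the old url record
    hash_ids = [r[0] for r in c.execute('SELECT hash_id FROM url_map WHERE url_id = ?;', (old_url_id,))]
    c.executemany('INSERT OR IGNORE INTO url_map ( hash_id, url_id ) VALUES ( ?, ? );', ((h, new_url_id) for h in hash_ids))
    c.execute('DELETE FROM url_map WHERE url_id = ?;', (old_url_id,))
    c.execute('DELETE FROM urls WHERE url_id = ?;', (old_url_id,))

remap(c, 1, 2)
rows = sorted(c.execute('SELECT hash_id, url_id FROM url_map;'))
```

The INSERT OR IGNORE is what makes the merge safe when a file already carried both the raw and normalised forms of the same url.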

View File

@ -401,6 +401,14 @@ def GetDefaultBoorus():
return boorus
def GetDefaultGUGs():
dir_path = os.path.join( HC.STATIC_DIR, 'default', 'gugs' )
import ClientNetworkingDomain
return GetDefaultObjectsFromPNGs( dir_path, ( ClientNetworkingDomain.GalleryURLGenerator, ) )
def GetDefaultImageboards():
imageboards = []

View File

@ -762,7 +762,7 @@ class GalleryBooru( Gallery ):
tags = tags_to_use
tags_replace = self._search_separator.join( [ urllib.quote( HydrusData.ToByteString( tag ), '' ) for tag in tags ] )
tags_replace = self._search_separator.join( [ tag for tag in tags ] )
return self._search_url.replace( '%tags%', tags_replace ).replace( '%index%', str( url_index ) )
@ -1383,7 +1383,7 @@ class GalleryPixivTag( GalleryPixiv ):
tag = query
gallery_url = 'https://www.pixiv.net/search.php?word=' + urllib.quote( HydrusData.ToByteString( tag ), '' ) + '&s_mode=s_tag_full&order=date_d'
gallery_url = 'https://www.pixiv.net/search.php?word=' + tag + '&s_mode=s_tag_full&order=date_d'
return gallery_url + '&p=' + str( page_index + 1 )

View File

@ -1720,9 +1720,9 @@ class FrameGUI( ClientGUITopLevelWindows.FrameThatResizes ):
submenu = wx.Menu()
ClientGUIMenus.AppendMenuItem( self, submenu, 'UNDER CONSTRUCTION: manage gallery url generators', 'Manage the client\'s GUGs, which convert search terms into URLs.', self._ManageGUGs )
ClientGUIMenus.AppendMenuItem( self, submenu, 'manage url classes', 'Configure which URLs the client can recognise.', self._ManageURLMatches )
ClientGUIMenus.AppendMenuItem( self, submenu, 'manage parsers', 'Manage the client\'s parsers, which convert URL content into hydrus metadata.', self._ManageParsers )
ClientGUIMenus.AppendMenuLabel( submenu, 'UNDER CONSTRUCTION: manage searchers', 'Manage the client\'s searchers, which convert search terms into URLs.' )
ClientGUIMenus.AppendSeparator( submenu )
@ -2322,6 +2322,33 @@ class FrameGUI( ClientGUITopLevelWindows.FrameThatResizes ):
self._controller.CallToThread( THREAD_do_it, self._controller )
def _ManageGUGs( self ):
title = 'manage gallery url generators'
with ClientGUITopLevelWindows.DialogEdit( self, title ) as dlg:
domain_manager = self._controller.network_engine.domain_manager
gugs = domain_manager.GetGUGs()
import ClientNetworkingDomain
gugs.append( ClientNetworkingDomain.GalleryURLGenerator( 'test gug', url_template = 'https://www.gelbooru.com/index.php?page=post&s=list&tags=%tags%&pid=0' ) )
panel = ClientGUIScrolledPanelsEdit.EditGUGsPanel( dlg, gugs )
dlg.SetPanel( panel )
if dlg.ShowModal() == wx.ID_OK:
gugs = panel.GetValue()
domain_manager.SetGUGs( gugs )
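The test gug above carries a url template with a %tags% replacement phrase. A minimal sketch of how such a generator might expand a query (the MiniGUG class is hypothetical; the real GalleryURLGenerator also tracks a gug key, initial search text and example text). Per the changelog, percent-encoding now happens in the network backend, so terms are joined as raw text here:

```python
class MiniGUG:

    def __init__(self, url_template, replacement_phrase='%tags%', search_terms_separator='+'):
        self._url_template = url_template
        self._replacement_phrase = replacement_phrase
        self._search_terms_separator = search_terms_separator

    def generate_gallery_url(self, query_text):
        # split the query on whitespace and substitute into the template;
        # percent-encoding is left to the network layer as of v319
        terms = query_text.split()
        return self._url_template.replace(self._replacement_phrase, self._search_terms_separator.join(terms))

gug = MiniGUG('https://www.gelbooru.com/index.php?page=post&s=list&tags=%tags%&pid=0')
url = gug.generate_gallery_url('blue_eyes 1girl')
```
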
def _ManageImportFolders( self ):
def wx_do_it():

View File

@ -2235,6 +2235,11 @@ class ListBoxTagsCensorship( ListBoxTags ):
self._DataHasChanged()
def GetTags( self ):
return list( self._ordered_terms )
def RemoveTags( self, tags ):
for tag in tags:

View File

@ -39,50 +39,216 @@ import yaml
import HydrusData
import HydrusGlobals as HG
def AddKnownURLsViewCopyMenu( win, menu, media ):
def CopyMediaURLs( medias ):
urls = media.GetLocationsManager().GetURLs()
urls = set()
if len( urls ) > 0:
for media in medias:
urls = list( urls )
media_urls = media.GetLocationsManager().GetURLs()
labels_and_urls = []
unmatched_urls = []
urls.update( media_urls )
for url in urls:
urls = list( urls )
urls.sort()
urls_string = os.linesep.join( urls )
HG.client_controller.pub( 'clipboard', 'text', urls_string )
def CopyMediaURLMatchURLs( medias, url_match ):
urls = set()
for media in medias:
media_urls = media.GetLocationsManager().GetURLs()
for url in media_urls:
if url_match.Matches( url ):
urls.add( url )
urls = list( urls )
urls.sort()
urls_string = os.linesep.join( urls )
HG.client_controller.pub( 'clipboard', 'text', urls_string )
def AddKnownURLsViewCopyMenu( win, menu, focus_media, selected_media = None ):
# figure out which urls this focused file has
focus_urls = focus_media.GetLocationsManager().GetURLs()
focus_matched_labels_and_urls = []
focus_unmatched_urls = []
focus_labels_and_urls = []
if len( focus_urls ) > 0:
for url in focus_urls:
url_match = HG.client_controller.network_engine.domain_manager.GetURLMatch( url )
if url_match is None:
unmatched_urls.append( url )
focus_unmatched_urls.append( url )
else:
label = url_match.GetName() + ': ' + url
labels_and_urls.append( ( label, url ) )
focus_matched_labels_and_urls.append( ( label, url ) )
labels_and_urls.sort()
unmatched_urls.sort()
focus_matched_labels_and_urls.sort()
focus_unmatched_urls.sort()
labels_and_urls.extend( ( ( url, url ) for url in unmatched_urls ) )
focus_labels_and_urls = list( focus_matched_labels_and_urls )
focus_labels_and_urls.extend( ( ( url, url ) for url in focus_unmatched_urls ) )
# figure out which urls these selected files have
selected_media_url_matches_to_copy = set()
can_copy_selected_all_urls = False
if selected_media is not None and len( selected_media ) > 1:
selected_media = ClientMedia.FlattenMedia( selected_media )
SAMPLE_SIZE = 256
if len( selected_media ) > SAMPLE_SIZE:
selected_media_sample = random.sample( selected_media, SAMPLE_SIZE )
else:
selected_media_sample = selected_media
for media in selected_media_sample:
media_urls = media.GetLocationsManager().GetURLs()
for url in media_urls:
url_match = HG.client_controller.network_engine.domain_manager.GetURLMatch( url )
if url_match is None:
can_copy_selected_all_urls = True
else:
selected_media_url_matches_to_copy.add( url_match )
if len( selected_media_url_matches_to_copy ) > 1:
can_copy_selected_all_urls = True
if len( focus_labels_and_urls ) > 0 or len( selected_media_url_matches_to_copy ) > 0 or can_copy_selected_all_urls:
urls_menu = wx.Menu()
urls_visit_menu = wx.Menu()
urls_copy_menu = wx.Menu()
for ( label, url ) in labels_and_urls:
# copy each this file's urls (of a particular type)
if len( focus_labels_and_urls ) > 0:
ClientGUIMenus.AppendMenuItem( win, urls_visit_menu, label, 'Open this url in your web browser.', ClientPaths.LaunchURLInWebBrowser, url )
ClientGUIMenus.AppendMenuItem( win, urls_copy_menu, label, 'Copy this url to your clipboard.', HG.client_controller.pub, 'clipboard', 'text', url )
urls_visit_menu = wx.Menu()
for ( label, url ) in focus_labels_and_urls:
ClientGUIMenus.AppendMenuItem( win, urls_visit_menu, label, 'Open this url in your web browser.', ClientPaths.LaunchURLInWebBrowser, url )
ClientGUIMenus.AppendMenuItem( win, urls_copy_menu, label, 'Copy this url to your clipboard.', HG.client_controller.pub, 'clipboard', 'text', url )
ClientGUIMenus.AppendMenu( urls_menu, urls_visit_menu, 'open' )
ClientGUIMenus.AppendMenu( urls_menu, urls_visit_menu, 'open' )
# copy this file's urls
can_copy_all_recognised_urls = len( focus_matched_labels_and_urls ) > 1
can_copy_all_urls = len( focus_unmatched_urls ) > 0 and len( focus_labels_and_urls ) > 1 # if there are unmatched urls and more than one thing total
if can_copy_all_recognised_urls or can_copy_all_urls:
ClientGUIMenus.AppendSeparator( urls_copy_menu )
if can_copy_all_recognised_urls:
urls = [ url for ( label, url ) in focus_matched_labels_and_urls ]
urls_string = os.linesep.join( urls )
label = 'copy this file\'s ' + HydrusData.ToHumanInt( len( urls ) ) + ' recognised urls to your clipboard'
ClientGUIMenus.AppendMenuItem( win, urls_copy_menu, label, 'Copy these urls to your clipboard.', HG.client_controller.pub, 'clipboard', 'text', urls_string )
if can_copy_all_urls:
urls = [ url for ( label, url ) in focus_labels_and_urls ]
urls_string = os.linesep.join( urls )
label = 'copy this file\'s ' + HydrusData.ToHumanInt( len( urls ) ) + ' urls to your clipboard'
ClientGUIMenus.AppendMenuItem( win, urls_copy_menu, label, 'Copy these urls to your clipboard.', HG.client_controller.pub, 'clipboard', 'text', urls_string )
# copy these files' urls (of a particular type)
can_copy_selected_recognised_urls = len( selected_media_url_matches_to_copy ) > 0
if can_copy_selected_recognised_urls or can_copy_selected_all_urls:
ClientGUIMenus.AppendSeparator( urls_copy_menu )
if can_copy_selected_recognised_urls:
selected_media_url_matches_to_copy = list( selected_media_url_matches_to_copy )
selected_media_url_matches_to_copy.sort( key = lambda url_match: url_match.GetName() )
for url_match in selected_media_url_matches_to_copy:
label = 'copy files\' ' + url_match.GetName() + ' urls'
ClientGUIMenus.AppendMenuItem( win, urls_copy_menu, label, 'Copy this url class for all files.', CopyMediaURLMatchURLs, selected_media, url_match )
if can_copy_selected_all_urls:
label = 'copy all files\' urls'
ClientGUIMenus.AppendMenuItem( win, urls_copy_menu, label, 'Copy urls for all files.', CopyMediaURLs, selected_media )
#
ClientGUIMenus.AppendMenu( urls_menu, urls_copy_menu, 'copy' )
ClientGUIMenus.AppendMenu( menu, urls_menu, 'known urls' )
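The new CopyMediaURLs/CopyMediaURLMatchURLs helpers above collect every selected file's urls, dedupe, sort, and join them for the clipboard. The core collation, minus the wx and clipboard plumbing (FakeMedia and get_urls are illustrative stand-ins for the client's media objects):

```python
import os

def collate_media_urls(medias, url_filter=None):
    # medias: iterable of objects exposing get_urls(); url_filter optionally
    # keeps only urls of one url class, as CopyMediaURLMatchURLs does
    urls = set()
    for media in medias:
        for url in media.get_urls():
            if url_filter is None or url_filter(url):
                urls.add(url)
    return os.linesep.join(sorted(urls))

class FakeMedia:

    def __init__(self, urls):
        self._urls = urls

    def get_urls(self):
        return self._urls

text = collate_media_urls([FakeMedia(['https://b.example/2', 'https://a.example/1']), FakeMedia(['https://a.example/1'])])
```

Collecting into a set first means a url shared by several selected files appears only once in the clipboard text.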
@ -3409,7 +3575,7 @@ class MediaPanelThumbnails( MediaPanel ):
#
AddKnownURLsViewCopyMenu( self, menu, self._focussed_media )
AddKnownURLsViewCopyMenu( self, menu, self._focussed_media, selected_media = self._selected_media )
# share

View File

@ -770,13 +770,17 @@ class PopupMessageManager( wx.Frame ):
max_width = wrap_width * 1.2
best_size = self.GetBestSize()
( best_width, best_height ) = self.GetBestSize()
if best_size[0] < max_width and best_size != self._last_best_size_i_fit_on:
best_width = min( best_width, max_width )
best_size = ( best_width, best_height )
if best_size != self._last_best_size_i_fit_on:
self._last_best_size_i_fit_on = best_size
self.Fit()
self.SetClientSize( best_size )
self.Layout()

View File

@ -1350,6 +1350,316 @@ class EditFrameLocationPanel( ClientGUIScrolledPanels.EditPanel ):
return ( name, remember_size, remember_position, last_size, last_position, default_gravity, default_position, maximised, fullscreen )
class EditGUGPanel( ClientGUIScrolledPanels.EditPanel ):
def __init__( self, parent, gug ):
ClientGUIScrolledPanels.EditPanel.__init__( self, parent )
self._original_gug = gug
self._name = wx.TextCtrl( self )
self._url_template = wx.TextCtrl( self )
min_width = ClientGUICommon.ConvertTextToPixelWidth( self._url_template, 74 )
self._url_template.SetMinClientSize( ( min_width, -1 ) )
self._replacement_phrase = wx.TextCtrl( self )
self._search_terms_separator = wx.TextCtrl( self )
self._initial_search_text = wx.TextCtrl( self )
self._example_search_text = wx.TextCtrl( self )
self._example_url = wx.TextCtrl( self, style = wx.TE_READONLY )
self._matched_url_match = wx.TextCtrl( self, style = wx.TE_READONLY )
#
name = gug.GetName()
( url_template, replacement_phrase, search_terms_separator, example_search_text ) = gug.GetURLTemplateVariables()
initial_search_text = gug.GetInitialSearchText()
self._name.SetValue( name )
self._url_template.SetValue( url_template )
self._replacement_phrase.SetValue( replacement_phrase )
self._search_terms_separator.SetValue( search_terms_separator )
self._initial_search_text.SetValue( initial_search_text )
self._example_search_text.SetValue( example_search_text )
self._UpdateExampleURL()
#
rows = []
rows.append( ( 'name: ', self._name ) )
rows.append( ( 'url template: ', self._url_template) )
rows.append( ( 'replacement phrase: ', self._replacement_phrase ) )
rows.append( ( 'search terms separator: ', self._search_terms_separator ) )
rows.append( ( 'initial search text (to prompt user): ', self._initial_search_text ) )
rows.append( ( 'example search text: ', self._example_search_text ) )
rows.append( ( 'example url: ', self._example_url ) )
rows.append( ( 'matches as a: ', self._matched_url_match ) )
gridbox = ClientGUICommon.WrapInGrid( self, rows )
vbox = wx.BoxSizer( wx.VERTICAL )
vbox.Add( gridbox, CC.FLAGS_EXPAND_SIZER_PERPENDICULAR )
self.SetSizer( vbox )
#
self._url_template.Bind( wx.EVT_TEXT, self.EventUpdate )
self._replacement_phrase.Bind( wx.EVT_TEXT, self.EventUpdate )
self._search_terms_separator.Bind( wx.EVT_TEXT, self.EventUpdate )
self._example_search_text.Bind( wx.EVT_TEXT, self.EventUpdate )
def _GetValue( self ):
gug_key = self._original_gug.GetGUGKey()
name = self._name.GetValue()
url_template = self._url_template.GetValue()
replacement_phrase = self._replacement_phrase.GetValue()
search_terms_separator = self._search_terms_separator.GetValue()
initial_search_text = self._initial_search_text.GetValue()
example_search_text = self._example_search_text.GetValue()
gug = ClientNetworkingDomain.GalleryURLGenerator( name, gug_key = gug_key, url_template = url_template, replacement_phrase = replacement_phrase, search_terms_separator = search_terms_separator, initial_search_text = initial_search_text, example_search_text = example_search_text )
return gug
def _UpdateExampleURL( self ):
gug = self._GetValue()
try:
example_url = gug.GetExampleURL()
self._example_url.SetValue( example_url )
except HydrusExceptions.GUGException as e:
reason = HydrusData.ToUnicode( e )
self._example_url.SetValue( 'Could not generate - ' + reason )
example_url = None
if example_url is None:
self._matched_url_match.SetValue( '' )
else:
url_match = HG.client_controller.network_engine.domain_manager.GetURLMatch( example_url )
if url_match is None:
url_match_text = 'Did not match a known url class.'
else:
url_match_text = 'Matched ' + url_match.GetName() + ' url class.'
self._matched_url_match.SetValue( url_match_text )
def EventUpdate( self, event ):
self._UpdateExampleURL()
def GetValue( self ):
gug = self._GetValue()
try:
gug.GetExampleURL()
except HydrusExceptions.GUGException:
raise HydrusExceptions.VetoException( 'Please ensure your generator can make an example url!' )
return gug
class EditGUGsPanel( ClientGUIScrolledPanels.EditPanel ):
def __init__( self, parent, gugs ):
ClientGUIScrolledPanels.EditPanel.__init__( self, parent )
menu_items = []
page_func = HydrusData.Call( ClientPaths.LaunchPathInWebBrowser, os.path.join( HC.HELP_DIR, 'downloader_gugs.html' ) )
menu_items.append( ( 'normal', 'open the gugs help', 'Open the help page for gallery url generators in your web browser.', page_func ) )
help_button = ClientGUICommon.MenuBitmapButton( self, CC.GlobalBMPs.help, menu_items )
help_hbox = ClientGUICommon.WrapInText( help_button, self, 'help for this panel -->', wx.Colour( 0, 0, 255 ) )
self._list_ctrl_panel = ClientGUIListCtrl.BetterListCtrlPanel( self )
columns = [ ( 'name', 16 ), ( 'example url', -1 ), ( 'gallery url class?', 20 ) ]
self._list_ctrl = ClientGUIListCtrl.BetterListCtrl( self._list_ctrl_panel, 'gugs', 30, 74, columns, self._ConvertDataToListCtrlTuples, delete_key_callback = self._Delete, activation_callback = self._Edit )
self._list_ctrl_panel.SetListCtrl( self._list_ctrl )
self._list_ctrl_panel.AddButton( 'add', self._Add )
self._list_ctrl_panel.AddButton( 'edit', self._Edit, enabled_only_on_selection = True )
self._list_ctrl_panel.AddButton( 'delete', self._Delete, enabled_only_on_selection = True )
self._list_ctrl_panel.AddSeparator()
self._list_ctrl_panel.AddImportExportButtons( ( ClientNetworkingDomain.GalleryURLGenerator, ), self._AddGUG )
self._list_ctrl_panel.AddSeparator()
self._list_ctrl_panel.AddDefaultsButton( ClientDefaults.GetDefaultGUGs, self._AddGUG )
#
self._list_ctrl.AddDatas( gugs )
self._list_ctrl.Sort( 0 )
#
vbox = wx.BoxSizer( wx.VERTICAL )
vbox.Add( help_hbox, CC.FLAGS_BUTTON_SIZER )
vbox.Add( self._list_ctrl_panel, CC.FLAGS_EXPAND_BOTH_WAYS )
self.SetSizer( vbox )
def _Add( self ):
gug = ClientNetworkingDomain.GalleryURLGenerator( 'new gallery url generator' )
with ClientGUITopLevelWindows.DialogEdit( self, 'edit gallery url generator' ) as dlg:
panel = EditGUGPanel( dlg, gug )
dlg.SetPanel( panel )
if dlg.ShowModal() == wx.ID_OK:
gug = panel.GetValue()
self._AddGUG( gug )
self._list_ctrl.Sort()
def _AddGUG( self, gug ):
HydrusSerialisable.SetNonDupeName( gug, self._GetExistingNames() )
gug.RegenerateGUGKey()
self._list_ctrl.AddDatas( ( gug, ) )
def _ConvertDataToListCtrlTuples( self, gug ):
name = gug.GetName()
example_url = gug.GetExampleURL()
url_match = HG.client_controller.network_engine.domain_manager.GetURLMatch( example_url )
if url_match is None:
gallery_url_match = False
pretty_gallery_url_match = ''
else:
gallery_url_match = True
pretty_gallery_url_match = url_match.GetName()
pretty_name = name
pretty_example_url = example_url
display_tuple = ( pretty_name, pretty_example_url, pretty_gallery_url_match )
sort_tuple = ( name, example_url, gallery_url_match )
return ( display_tuple, sort_tuple )
def _Delete( self ):
# This GUG is in NGUG blah, you sure?
with ClientGUIDialogs.DialogYesNo( self, 'Remove all selected?' ) as dlg:
if dlg.ShowModal() == wx.ID_YES:
self._list_ctrl.DeleteSelected()
def _Edit( self ):
for gug in self._list_ctrl.GetData( only_selected = True ):
with ClientGUITopLevelWindows.DialogEdit( self, 'edit gallery url generator' ) as dlg:
panel = EditGUGPanel( dlg, gug )
dlg.SetPanel( panel )
if dlg.ShowModal() == wx.ID_OK:
self._list_ctrl.DeleteDatas( ( gug, ) )
gug = panel.GetValue()
HydrusSerialisable.SetNonDupeName( gug, self._GetExistingNames() )
self._list_ctrl.AddDatas( ( gug, ) )
else:
break
self._list_ctrl.Sort()
def _GetExistingNames( self ):
gugs = self._list_ctrl.GetData()
names = { gug.GetName() for gug in gugs }
return names
def GetValue( self ):
gugs = self._list_ctrl.GetData()
return gugs
class EditMediaViewOptionsPanel( ClientGUIScrolledPanels.EditPanel ):
def __init__( self, parent, info ):

View File

@ -1732,6 +1732,7 @@ class ManageOptionsPanel( ClientGUIScrolledPanels.ManagePanel ):
watchers = ClientGUICommon.StaticBox( self, 'watchers' )
self._watcher_page_wait_period = wx.SpinCtrl( watchers, min = 1, max = 120 )
self._highlight_new_watcher = wx.CheckBox( watchers )
checker_options = self._new_options.GetDefaultWatcherCheckerOptions()
@ -1756,9 +1757,7 @@ class ManageOptionsPanel( ClientGUIScrolledPanels.ManagePanel ):
gallery_page_tt += os.linesep
gallery_page_tt += '- To give servers a break (some gallery pages can be CPU-expensive to generate).'
gallery_page_tt += os.linesep * 2
gallery_page_tt += 'After this fixed wait has occurred, the gallery download job will run like any other network job, except that it will ignore bandwidth limits after thirty seconds to guarantee throughput and to stay synced with the source.'
gallery_page_tt += os.linesep * 2
gallery_page_tt += 'Update: Now that it is much easier to run multiple downloaders simultaneously, these delays are now global across the whole program. There is one page gallery download slot per x seconds and one subscription gallery download slot per y seconds.'
gallery_page_tt += 'These delays/slots are per-domain.'
gallery_page_tt += os.linesep * 2
gallery_page_tt += 'If you do not understand this stuff, you can just leave it alone.'
@ -1777,6 +1776,8 @@ class ManageOptionsPanel( ClientGUIScrolledPanels.ManagePanel ):
self._stop_character.SetValue( self._new_options.GetString( 'stop_character' ) )
self._show_deleted_on_file_seed_short_summary.SetValue( self._new_options.GetBoolean( 'show_deleted_on_file_seed_short_summary' ) )
self._watcher_page_wait_period.SetValue( self._new_options.GetInteger( 'watcher_page_wait_period' ) )
self._watcher_page_wait_period.SetToolTip( gallery_page_tt )
self._highlight_new_watcher.SetValue( self._new_options.GetBoolean( 'highlight_new_watcher' ) )
#
@ -1808,6 +1809,7 @@ class ManageOptionsPanel( ClientGUIScrolledPanels.ManagePanel ):
rows = []
rows.append( ( 'Additional fixed time (in seconds) to wait between watcher checks:', self._watcher_page_wait_period ) )
rows.append( ( 'If new watcher entered and no current highlight, highlight the new watcher:', self._highlight_new_watcher ) )
gridbox = ClientGUICommon.WrapInGrid( watchers, rows )
@ -1849,6 +1851,7 @@ class ManageOptionsPanel( ClientGUIScrolledPanels.ManagePanel ):
self._new_options.SetInteger( 'max_simultaneous_subscriptions', self._max_simultaneous_subscriptions.GetValue() )
self._new_options.SetBoolean( 'process_subs_in_random_order', self._process_subs_in_random_order.GetValue() )
self._new_options.SetInteger( 'watcher_page_wait_period', self._watcher_page_wait_period.GetValue() )
self._new_options.SetBoolean( 'highlight_new_watcher', self._highlight_new_watcher.GetValue() )
self._new_options.SetDefaultWatcherCheckerOptions( self._watcher_checker_options.GetValue() )
@ -6034,34 +6037,42 @@ class ManageURLsPanel( ClientGUIScrolledPanels.ManagePanel ):
def _EnterURL( self, url, only_add = False ):
if url in self._current_urls:
normalised_url = HG.client_controller.network_engine.domain_manager.NormaliseURL( url )
for u in ( url, normalised_url ):
if only_add:
if u in self._current_urls:
return
for index in range( self._urls_listbox.GetCount() ):
existing_url = self._urls_listbox.GetClientData( index )
if existing_url == url:
self._RemoveURL( index )
if only_add:
return
else:
self._urls_listbox.Append( url, url )
self._current_urls.add( url )
if url not in self._original_urls:
for index in range( self._urls_listbox.GetCount() ):
existing_url = self._urls_listbox.GetClientData( index )
if existing_url == u:
self._RemoveURL( index )
return
self._urls_to_add.add( url )
u = normalised_url
if u not in self._current_urls:
self._urls_listbox.Append( u, u )
self._current_urls.add( u )
if u not in self._original_urls:
self._urls_to_add.add( u )
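The reworked _EnterURL above normalises a newly entered url and toggles either form out of the list before adding the normalised one. A condensed sketch of that toggle logic over a plain set, with the normaliser stubbed out (names are illustrative):

```python
def enter_url(current_urls, url, normalise):
    # if either the raw or normalised form is present, entering it again removes it;
    # otherwise the normalised form is added
    normalised_url = normalise(url)
    for u in (url, normalised_url):
        if u in current_urls:
            current_urls.remove(u)
            return current_urls
    current_urls.add(normalised_url)
    return current_urls

# stub normaliser standing in for the domain manager's NormaliseURL
normalise = lambda u: 'https://example.com/i?page=1&tags=x'

urls = set()
enter_url(urls, 'https://example.com/i?tags=x&page=1', normalise)
after_add = set(urls)
enter_url(urls, 'https://example.com/i?tags=x&page=1', normalise)
after_toggle = set(urls)
```
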

View File

@ -1007,7 +1007,9 @@ class FileSeed( HydrusSerialisable.SerialisableBase ):
elif len( all_parse_results ) > 1:
( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total ) = ClientImporting.UpdateFileSeedCacheWithAllParseResults( file_seed_cache, all_parse_results, self.file_seed_data )
file_seeds = ClientImporting.ConvertAllParseResultsToFileSeeds( all_parse_results, self.file_seed_data )
( num_urls_added, num_urls_already_in_file_seed_cache, can_add_more_file_urls, stop_reason ) = ClientImporting.UpdateFileSeedCacheWithFileSeeds( file_seed_cache, file_seeds )
status = CC.STATUS_SUCCESSFUL_AND_NEW
note = 'Found ' + HydrusData.ToHumanInt( num_urls_added ) + ' new URLs.'
@ -1246,6 +1248,8 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
SERIALISABLE_NAME = 'Import File Status Cache'
SERIALISABLE_VERSION = 8
COMPACT_NUMBER = 100
def __init__( self ):
HydrusSerialisable.SerialisableBase.__init__( self )
@ -1615,12 +1619,12 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
with self._lock:
if len( self._file_seeds ) <= 100:
if len( self._file_seeds ) <= self.COMPACT_NUMBER:
return False
for file_seed in self._file_seeds[:-100]:
for file_seed in self._file_seeds[:-self.COMPACT_NUMBER]:
if file_seed.status == CC.STATUS_UNKNOWN:
@ -1641,14 +1645,14 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
with self._lock:
if len( self._file_seeds ) <= 100:
if len( self._file_seeds ) <= self.COMPACT_NUMBER:
return
new_file_seeds = HydrusSerialisable.SerialisableList()
for file_seed in self._file_seeds[:-100]:
for file_seed in self._file_seeds[:-self.COMPACT_NUMBER]:
still_to_do = file_seed.status == CC.STATUS_UNKNOWN
still_relevant = self._GetSourceTimestamp( file_seed ) > compact_before_this_source_time
@ -1659,7 +1663,7 @@ class FileSeedCache( HydrusSerialisable.SerialisableBase ):
new_file_seeds.extend( self._file_seeds[-100:] )
new_file_seeds.extend( self._file_seeds[-self.COMPACT_NUMBER:] )
self._file_seeds = new_file_seeds
self._file_seeds_to_indices = { file_seed : index for ( index, file_seed ) in enumerate( self._file_seeds ) }
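The hunks above replace the hard-coded 100 with a class-level COMPACT_NUMBER constant. A simplified sketch of the compaction rule itself, with (status, created) tuples standing in for the real seed objects (the real code also holds a lock and rebuilds the index dict):

```python
COMPACT_NUMBER = 100

def compact_seeds(seeds, compact_before_this_source_time):
    # keep the newest COMPACT_NUMBER seeds unconditionally; of the older
    # ones, keep only those still unworked or newer than the cutoff
    if len(seeds) <= COMPACT_NUMBER:
        return seeds
    kept = []
    for (status, created) in seeds[:-COMPACT_NUMBER]:
        still_to_do = status == 'unknown'
        still_relevant = created > compact_before_this_source_time
        if still_to_do or still_relevant:
            kept.append((status, created))
    kept.extend(seeds[-COMPACT_NUMBER:])
    return kept
```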

View File

@ -425,29 +425,25 @@ class GalleryImport( HydrusSerialisable.SerialisableBase ):
return
( consumed, next_timestamp ) = HG.client_controller.network_engine.domain_manager.TryToConsumeAGalleryQuery( 'pages' )
if not consumed:
if self._current_page_index == 0:
page_check_status = 'checking first page (next slot ' + HydrusData.TimestampToPrettyTimeDelta( next_timestamp, just_now_threshold = 0 ) + ')'
else:
page_check_status = 'checking next page (next slot ' + HydrusData.TimestampToPrettyTimeDelta( next_timestamp, just_now_threshold = 0 ) + ')'
self._gallery_status = page_check_status
return
self._gallery_status = 'now checking next page'
self._gallery_status = 'checking next page'
if gallery_seed.WorksInNewSystem():
def file_seeds_callable( file_seeds ):
if self._file_limit is None:
max_new_urls_allowed = None
else:
max_new_urls_allowed = self._file_limit - self._num_new_urls_found
return ClientImporting.UpdateFileSeedCacheWithFileSeeds( self._file_seed_cache, file_seeds, max_new_urls_allowed )
def status_hook( text ):
with self._lock:
@ -461,18 +457,9 @@ class GalleryImport( HydrusSerialisable.SerialisableBase ):
return
if self._file_limit is None:
max_new_urls_allowed = None
else:
max_new_urls_allowed = self._file_limit - self._num_new_urls_found
try:
( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total, result_404 ) = gallery_seed.WorkOnURL( self._gallery_seed_log, self._file_seed_cache, status_hook, title_hook, self._NetworkJobFactory, self._GalleryNetworkJobPresentationContextFactory, self._file_import_options, max_new_urls_allowed = max_new_urls_allowed )
( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total, result_404, can_add_more_file_urls, stop_reason ) = gallery_seed.WorkOnURL( 'download page', self._gallery_seed_log, file_seeds_callable, status_hook, title_hook, self._NetworkJobFactory, self._GalleryNetworkJobPresentationContextFactory, self._file_import_options )
self._num_new_urls_found += num_urls_added
self._num_urls_found += num_urls_total
@ -514,6 +501,8 @@ class GalleryImport( HydrusSerialisable.SerialisableBase ):
network_job = ClientNetworkingJobs.NetworkJobDownloader( self._gallery_import_key, method, url, **kwargs )
network_job.SetGalleryToken( 'download page' )
network_job.OverrideBandwidth( 30 )
with self._lock:

View File

@ -76,6 +76,10 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
url = 'https://nostrils-central.cx/index.php?post=s&tag=hyper_nostrils&page=3'
else:
url = HG.client_controller.network_engine.domain_manager.NormaliseURL( url )
HydrusSerialisable.SerialisableBase.__init__( self )
@ -120,11 +124,6 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
self.modified = HydrusData.GetNow()
def Normalise( self ):
self.url = HG.client_controller.network_engine.domain_manager.NormaliseURL( self.url )
def SetReferralURL( self, referral_url ):
self._referral_url = referral_url
@ -162,7 +161,7 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
return False
def WorkOnURL( self, gallery_seed_log, file_seed_cache, status_hook, title_hook, network_job_factory, network_job_presentation_context_factory, file_import_options, max_new_urls_allowed = None, gallery_urls_seen_before = None ):
def WorkOnURL( self, gallery_token_name, gallery_seed_log, file_seeds_callable, status_hook, title_hook, network_job_factory, network_job_presentation_context_factory, file_import_options, gallery_urls_seen_before = None ):
if gallery_urls_seen_before is None:
@ -179,6 +178,8 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
num_urls_already_in_file_seed_cache = 0
num_urls_total = 0
result_404 = False
can_add_more_file_urls = False
stop_reason = ''
try:
@ -209,6 +210,8 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
network_job = network_job_factory( 'GET', url_to_check, referral_url = referral_url )
network_job.SetGalleryToken( gallery_token_name )
network_job.OverrideBandwidth( 30 )
HG.client_controller.network_engine.AddJob( network_job )
@ -239,16 +242,11 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
title_hook( title )
( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total ) = ClientImporting.UpdateFileSeedCacheWithAllParseResults( file_seed_cache, all_parse_results, self.url, max_new_urls_allowed )
file_seeds = ClientImporting.ConvertAllParseResultsToFileSeeds( all_parse_results, self.url )
if max_new_urls_allowed is None:
can_add_more_file_urls = True
else:
can_add_more_file_urls = num_urls_added < max_new_urls_allowed
num_urls_total = len( file_seeds )
( num_urls_added, num_urls_already_in_file_seed_cache, can_add_more_file_urls, stop_reason ) = file_seeds_callable( file_seeds )
status = CC.STATUS_SUCCESSFUL_AND_NEW
@ -261,7 +259,7 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
if not can_add_more_file_urls:
note += ' - hit file limit'
note += ' - ' + stop_reason
# only keep searching if we found any files, otherwise this could be a blank results page with another stub page
@ -402,7 +400,7 @@ class GallerySeed( HydrusSerialisable.SerialisableBase ):
gallery_seed_log.NotifyGallerySeedsUpdated( ( self, ) )
return ( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total, result_404 )
return ( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total, result_404, can_add_more_file_urls, stop_reason )
HydrusSerialisable.SERIALISABLE_TYPES_TO_OBJECT_TYPES[ HydrusSerialisable.SERIALISABLE_TYPE_GALLERY_SEED ] = GallerySeed
@ -413,6 +411,8 @@ class GallerySeedLog( HydrusSerialisable.SerialisableBase ):
SERIALISABLE_NAME = 'Gallery Log'
SERIALISABLE_VERSION = 1
COMPACT_NUMBER = 100
def __init__( self ):
HydrusSerialisable.SerialisableBase.__init__( self )
@ -506,8 +506,6 @@ class GallerySeedLog( HydrusSerialisable.SerialisableBase ):
for gallery_seed in gallery_seeds:
gallery_seed.Normalise()
if gallery_seed in self._gallery_seeds_to_indices:
continue
@ -554,12 +552,12 @@ class GallerySeedLog( HydrusSerialisable.SerialisableBase ):
with self._lock:
if len( self._gallery_seeds ) <= 25:
if len( self._gallery_seeds ) <= self.COMPACT_NUMBER:
return False
for gallery_seed in self._gallery_seeds[:-25]:
for gallery_seed in self._gallery_seeds[:-self.COMPACT_NUMBER]:
if gallery_seed.status == CC.STATUS_UNKNOWN:
@ -580,14 +578,14 @@ class GallerySeedLog( HydrusSerialisable.SerialisableBase ):
with self._lock:
if len( self._gallery_seeds ) <= 25:
if len( self._gallery_seeds ) <= self.COMPACT_NUMBER:
return
new_gallery_seeds = HydrusSerialisable.SerialisableList()
for gallery_seed in self._gallery_seeds[:-25]:
for gallery_seed in self._gallery_seeds[:-self.COMPACT_NUMBER]:
still_to_do = gallery_seed.status == CC.STATUS_UNKNOWN
still_relevant = gallery_seed.created > compact_before_this_source_time
@ -598,7 +596,7 @@ class GallerySeedLog( HydrusSerialisable.SerialisableBase ):
new_gallery_seeds.extend( self._gallery_seeds[-25:] )
new_gallery_seeds.extend( self._gallery_seeds[-self.COMPACT_NUMBER:] )
self._gallery_seeds = new_gallery_seeds
self._gallery_seeds_to_indices = { gallery_seed : index for ( index, gallery_seed ) in enumerate( self._gallery_seeds ) }
@ -737,8 +735,6 @@ class GallerySeedLog( HydrusSerialisable.SerialisableBase ):
search_gallery_seed = GallerySeed( url )
search_gallery_seed.Normalise()
search_url = search_gallery_seed.url
return search_url in ( gallery_seed.url for gallery_seed in self._gallery_seeds )
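The reworked GallerySeed.WorkOnURL above no longer writes into a file seed cache directly; the caller injects a file_seeds_callable, which consumes the parsed seeds and reports ( num_urls_added, num_urls_already_in_file_seed_cache, can_add_more_file_urls, stop_reason ). A toy version of that contract, with a plain set as the cache rather than the real hydrus types:

```python
def make_file_seeds_callable(cache, max_new_urls_allowed=None):
    # returns a callable matching the new WorkOnURL contract: it takes a list
    # of file seeds and reports
    # ( num_added, num_already_in_cache, can_add_more_file_urls, stop_reason )
    def file_seeds_callable(file_seeds):
        num_added = 0
        num_already = 0
        for seed in file_seeds:
            if max_new_urls_allowed is not None and num_added >= max_new_urls_allowed:
                return (num_added, num_already, False, 'hit file limit')
            if seed in cache:
                num_already += 1
            else:
                cache.add(seed)
                num_added += 1
        return (num_added, num_already, True, '')
    return file_seeds_callable
```

Because the policy lives in the callable, gallery pages, watchers, and subscriptions can each stop paging for their own reasons while WorkOnURL stays generic.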

View File

@ -1083,7 +1083,14 @@ class TagImportOptions( HydrusSerialisable.SerialisableBase ):
if len( sub_statements ) > 0:
name = HG.client_controller.services_manager.GetName( service_key )
try:
name = HG.client_controller.services_manager.GetName( service_key )
except HydrusExceptions.DataMissing:
continue
service_statement = name + ':' + os.linesep * 2 + os.linesep.join( sub_statements )

View File

@ -799,7 +799,12 @@ class URLsImport( HydrusSerialisable.SerialisableBase ):
status_hook = lambda s: s
title_hook = lambda s: s
gallery_seed.WorkOnURL( self._gallery_seed_log, self._file_seed_cache, status_hook, title_hook, self._NetworkJobFactory, self._GalleryNetworkJobPresentationContextFactory, self._file_import_options )
def file_seeds_callable( file_seeds ):
return ClientImporting.UpdateFileSeedCacheWithFileSeeds( self._file_seed_cache, file_seeds )
gallery_seed.WorkOnURL( 'download page', self._gallery_seed_log, file_seeds_callable, status_hook, title_hook, self._NetworkJobFactory, self._GalleryNetworkJobPresentationContextFactory, self._file_import_options )
except Exception as e:

View File

@ -156,6 +156,7 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
return network_job_factory
def _NoDelays( self ):
return HydrusData.TimeHasPassed( self._no_work_until )
@ -601,7 +602,6 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
done_first_page = False
seen_some_existing_urls = False
query_text = query.GetQueryText()
file_seed_cache = query.GetFileSeedCache()
@ -672,8 +672,6 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
if gallery_seed.WorksInNewSystem():
num_existing_urls = 0
def status_hook( text ):
job_key.SetVariable( 'popup_text_1', prefix + ': ' + text )
@ -686,12 +684,14 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
gallery_seed_log.AddGallerySeeds( ( gallery_seed, ) )
num_existing_urls_this_stream = 0
stop_reason = 'unknown stop reason'
keep_checking = True
try:
keep_checking = True
while keep_checking and gallery_seed_log.WorkToDo():
p1 = HC.options[ 'pause_subs_sync' ]
@ -720,33 +720,71 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
break
( consumed, next_timestamp ) = HG.client_controller.network_engine.domain_manager.TryToConsumeAGalleryQuery( 'subscriptions' )
if not consumed:
if not done_first_page:
page_check_status = 'checking first page ' + HydrusData.TimestampToPrettyTimeDelta( next_timestamp )
else:
page_check_status = HydrusData.ToHumanInt( total_new_urls_for_this_sync ) + ' new urls found (next slot ' + HydrusData.TimestampToPrettyTimeDelta( next_timestamp, just_now_threshold = 0 ) + ')'
job_key.SetVariable( 'popup_text_1', prefix + ': ' + page_check_status )
time.sleep( 1 )
continue
job_key.SetVariable( 'popup_text_1', prefix + ': found ' + HydrusData.ToHumanInt( total_new_urls_for_this_sync ) + ' new urls, checking next page' )
this_page_file_seed_cache = ClientImportFileSeeds.FileSeedCache()
def file_seeds_callable( file_seeds ):
num_urls_added = 0
num_urls_already_in_file_seed_cache = 0
can_add_more_file_urls = True
for file_seed in file_seeds:
if file_limit_for_this_sync is not None and total_new_urls_for_this_sync + num_urls_added >= file_limit_for_this_sync:
if this_is_initial_sync:
stop_reason = 'hit initial file limit'
else:
self._ShowHitPeriodicFileLimitMessage( query_text )
stop_reason = 'hit periodic file limit'
can_add_more_file_urls = False
break
if file_seed in file_seeds_to_add:
# this catches the occasional overflow when a new file is uploaded while gallery parsing is going on
continue
if file_seed_cache.HasFileSeed( file_seed ):
num_urls_already_in_file_seed_cache += 1
WE_HIT_OLD_GROUND_THRESHOLD = 5
if num_urls_already_in_file_seed_cache > WE_HIT_OLD_GROUND_THRESHOLD:
can_add_more_file_urls = False
stop_reason = 'saw ' + HydrusData.ToHumanInt( WE_HIT_OLD_GROUND_THRESHOLD ) + ' previously seen urls, so assuming we caught up'
break
else:
num_urls_added += 1
file_seeds_to_add.add( file_seed )
file_seeds_to_add_ordered.append( file_seed )
return ( num_urls_added, num_urls_already_in_file_seed_cache, can_add_more_file_urls, stop_reason )
try:
( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total, result_404 ) = gallery_seed.WorkOnURL( gallery_seed_log, this_page_file_seed_cache, status_hook, title_hook, self._GenerateNetworkJobFactory( query ), ClientImporting.GenerateMultiplePopupNetworkJobPresentationContextFactory( job_key ), self._file_import_options, gallery_urls_seen_before = gallery_urls_seen_this_sync )
( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total, result_404, can_add_more_file_urls, stop_reason ) = gallery_seed.WorkOnURL( 'subscription', gallery_seed_log, file_seeds_callable, status_hook, title_hook, self._GenerateNetworkJobFactory( query ), ClientImporting.GenerateMultiplePopupNetworkJobPresentationContextFactory( job_key ), self._file_import_options, gallery_urls_seen_before = gallery_urls_seen_this_sync )
except HydrusExceptions.CancelledException as e:
@ -767,71 +805,11 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
done_first_page = True
if num_urls_already_in_file_seed_cache > 0:
seen_some_existing_urls = True
keep_checking = can_add_more_file_urls
if num_urls_total == 0:
stop_reason = 'finished query'
break
num_existing_urls_this_stream += num_urls_already_in_file_seed_cache
for file_seed in this_page_file_seed_cache.GetFileSeeds():
if file_limit_for_this_sync is not None and total_new_urls_for_this_sync >= file_limit_for_this_sync:
if this_is_initial_sync:
stop_reason = 'hit initial file limit'
else:
if not seen_some_existing_urls: # this tests if we saw some urls in another gallery identifier
self._ShowHitPeriodicFileLimitMessage( query_text )
stop_reason = 'hit periodic file limit'
keep_checking = False
break
if file_seed in file_seeds_to_add:
# this catches the occasional overflow when a new file is uploaded while gallery parsing is going on
continue
if file_seed_cache.HasFileSeed( file_seed ):
num_existing_urls += 1
WE_HIT_OLD_GROUND_THRESHOLD = 5
if num_existing_urls > WE_HIT_OLD_GROUND_THRESHOLD:
keep_checking = False
stop_reason = 'saw ' + HydrusData.ToHumanInt( WE_HIT_OLD_GROUND_THRESHOLD ) + ' previously seen urls, so assuming we caught up'
break
else:
file_seeds_to_add.add( file_seed )
file_seeds_to_add_ordered.append( file_seed )
total_new_urls_for_this_sync += 1
total_new_urls_for_this_sync += num_urls_added
finally:
@ -857,6 +835,8 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
job_key.SetVariable( 'popup_network_job', network_job )
network_job.SetGalleryToken( 'subscription' )
network_job.OverrideBandwidth( 30 )
return network_job
@ -865,7 +845,7 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
gallery.SetNetworkJobFactory( network_job_factory )
page_index = 0
num_existing_urls = 0
num_existing_urls_this_stream = 0
keep_checking = True
while keep_checking:
@ -885,26 +865,6 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
raise HydrusExceptions.CancelledException( 'gallery parsing cancelled, likely by user' )
( consumed, next_timestamp ) = HG.client_controller.network_engine.domain_manager.TryToConsumeAGalleryQuery( 'subscriptions' )
if not consumed:
if not done_first_page:
page_check_status = 'checking first page ' + HydrusData.TimestampToPrettyTimeDelta( next_timestamp )
else:
page_check_status = HydrusData.ToHumanInt( total_new_urls_for_this_sync ) + ' new urls found, checking next page (next slot ' + HydrusData.TimestampToPrettyTimeDelta( next_timestamp, just_now_threshold = 0 ) + ')'
job_key.SetVariable( 'popup_text_1', prefix + ': ' + page_check_status )
time.sleep( 1 )
continue
job_key.SetVariable( 'popup_text_1', prefix + ': found ' + HydrusData.ToHumanInt( total_new_urls_for_this_sync ) + ' new urls, checking next page' )
gallery_url = gallery.GetGalleryPageURL( query_text, page_index )
@ -930,7 +890,7 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
if file_limit_for_this_sync is not None and total_new_urls_for_this_sync >= file_limit_for_this_sync:
if not seen_some_existing_urls and not this_is_initial_sync:
if not this_is_initial_sync:
self._ShowHitPeriodicFileLimitMessage( query_text )
@ -949,11 +909,9 @@ class Subscription( HydrusSerialisable.SerialisableBaseNamed ):
if file_seed_cache.HasFileSeed( file_seed ):
seen_some_existing_urls = True
num_existing_urls_this_stream += 1
num_existing_urls += 1
if num_existing_urls > 5:
if num_existing_urls_this_stream > 5:
keep_checking = False
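The subscription's new file_seeds_callable above stops paging once it re-encounters more than WE_HIT_OLD_GROUND_THRESHOLD already-known URLs, on the assumption that the sync has caught up with the previous one. That stop condition in isolation, with URLs as plain strings:

```python
WE_HIT_OLD_GROUND_THRESHOLD = 5

def should_stop_paging(file_seeds, known_urls):
    # returns ( new_urls, stop_reason ); stop_reason is '' while paging
    # should continue
    num_already_seen = 0
    new_urls = []
    for url in file_seeds:
        if url in known_urls:
            num_already_seen += 1
            if num_already_seen > WE_HIT_OLD_GROUND_THRESHOLD:
                reason = 'saw %d previously seen urls, so assuming we caught up' % WE_HIT_OLD_GROUND_THRESHOLD
                return (new_urls, reason)
        else:
            new_urls.append(url)
    return (new_urls, '')
```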

View File

@ -22,7 +22,7 @@ class MultipleWatcherImport( HydrusSerialisable.SerialisableBase ):
SERIALISABLE_NAME = 'Multiple Watcher'
SERIALISABLE_VERSION = 2
ADDED_TIMESTAMP_DURATION = 5
ADDED_TIMESTAMP_DURATION = 15
def __init__( self, url = None ):
@ -570,6 +570,11 @@ class WatcherImport( HydrusSerialisable.SerialisableBase ):
def _CheckWatchableURL( self ):
def file_seeds_callable( file_seeds ):
return ClientImporting.UpdateFileSeedCacheWithFileSeeds( self._file_seed_cache, file_seeds )
def status_hook( text ):
with self._lock:
@ -590,9 +595,14 @@ class WatcherImport( HydrusSerialisable.SerialisableBase ):
self._gallery_seed_log.AddGallerySeeds( ( gallery_seed, ) )
with self._lock:
self._watcher_status = 'checking'
try:
( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total, result_404 ) = gallery_seed.WorkOnURL( self._gallery_seed_log, self._file_seed_cache, status_hook, title_hook, self._NetworkJobFactory, self._CheckerNetworkJobPresentationContextFactory, self._file_import_options )
( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total, result_404, can_add_more_file_urls, stop_reason ) = gallery_seed.WorkOnURL( 'watcher', self._gallery_seed_log, file_seeds_callable, status_hook, title_hook, self._NetworkJobFactory, self._CheckerNetworkJobPresentationContextFactory, self._file_import_options )
if num_urls_added > 0:
@ -648,6 +658,8 @@ class WatcherImport( HydrusSerialisable.SerialisableBase ):
self._UpdateNextCheckTime()
self._Compact()
if not watcher_status_should_stick:
@ -660,6 +672,15 @@ class WatcherImport( HydrusSerialisable.SerialisableBase ):
def _Compact( self ):
death_period = self._checker_options.GetDeathFileVelocityPeriod()
compact_before_this_time = self._last_check_time - ( death_period * 2 )
self._gallery_seed_log.Compact( compact_before_this_time )
def _DelayWork( self, time_delta, reason ):
self._no_work_until = HydrusData.GetNow() + time_delta

View File

@ -35,6 +35,28 @@ DID_SUBSTANTIAL_FILE_WORK_MINIMUM_SLEEP_TIME = 0.1
REPEATING_JOB_TYPICAL_PERIOD = 30.0
def ConvertAllParseResultsToFileSeeds( all_parse_results, source_url ):
file_seeds = []
for parse_results in all_parse_results:
parsed_urls = ClientParsing.GetURLsFromParseResults( parse_results, ( HC.URL_TYPE_DESIRED, ), only_get_top_priority = True )
for url in parsed_urls:
file_seed = ClientImportFileSeeds.FileSeed( ClientImportFileSeeds.FILE_SEED_TYPE_URL, url )
file_seed.SetReferralURL( source_url )
file_seed.AddParseResults( parse_results )
file_seeds.append( file_seed )
return file_seeds
def GenerateMultiplePopupNetworkJobPresentationContextFactory( job_key ):
def network_job_presentation_context_factory( network_job ):
@ -268,49 +290,41 @@ def THREADDownloadURLs( job_key, urls, title ):
job_key.Finish()
def UpdateFileSeedCacheWithAllParseResults( file_seed_cache, all_parse_results, source_url, max_new_urls_allowed = None ):
def UpdateFileSeedCacheWithFileSeeds( file_seed_cache, file_seeds, max_new_urls_allowed = None ):
new_file_seeds = []
num_urls_added = 0
num_urls_already_in_file_seed_cache = 0
num_urls_total = 0
can_add_more_file_urls = True
stop_reason = ''
for parse_results in all_parse_results:
for file_seed in file_seeds:
parsed_urls = ClientParsing.GetURLsFromParseResults( parse_results, ( HC.URL_TYPE_DESIRED, ), only_get_top_priority = True )
if max_new_urls_allowed is not None and num_urls_added >= max_new_urls_allowed:
can_add_more_file_urls = False
stop_reason = 'hit file limit'
break
for url in parsed_urls:
if file_seed_cache.HasFileSeed( file_seed ):
num_urls_total += 1
num_urls_already_in_file_seed_cache += 1
if max_new_urls_allowed is not None and num_urls_added == max_new_urls_allowed:
continue
else:
file_seed = ClientImportFileSeeds.FileSeed( ClientImportFileSeeds.FILE_SEED_TYPE_URL, url )
num_urls_added += 1
file_seed.SetReferralURL( source_url )
if file_seed_cache.HasFileSeed( file_seed ):
num_urls_already_in_file_seed_cache += 1
else:
num_urls_added += 1
file_seed.AddParseResults( parse_results )
new_file_seeds.append( file_seed )
new_file_seeds.append( file_seed )
file_seed_cache.AddFileSeeds( new_file_seeds )
return ( num_urls_added, num_urls_already_in_file_seed_cache, num_urls_total )
return ( num_urls_added, num_urls_already_in_file_seed_cache, can_add_more_file_urls, stop_reason )
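The old UpdateFileSeedCacheWithAllParseResults is split in two here: ConvertAllParseResultsToFileSeeds builds seeds from parse results, and UpdateFileSeedCacheWithFileSeeds dedupes them against the cache. A toy version of the two-step pipeline, with parse results reduced to URL lists, the cache to a set, and the file-limit logic omitted:

```python
def convert_all_parse_results_to_file_seeds(all_parse_results, source_url):
    # step 1: one seed per parsed URL, each remembering its referral URL
    return [(url, source_url) for parsed_urls in all_parse_results for url in parsed_urls]

def update_file_seed_cache_with_file_seeds(cache, file_seeds):
    # step 2: dedupe the seeds against the cache, counting as we go
    num_added = 0
    num_already = 0
    for (url, referral_url) in file_seeds:
        if url in cache:
            num_already += 1
        else:
            cache.add(url)
            num_added += 1
    return (num_added, num_already)
```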
def WakeRepeatingJob( job ):

View File

@ -41,7 +41,8 @@ job_status_str_lookup[ JOB_STATUS_RUNNING ] = 'running'
class NetworkEngine( object ):
MAX_JOBS = 10 # turn this into an option
MAX_JOBS_PER_DOMAIN = 3 # also turn this into an option
MAX_JOBS = 15 # turn this into an option
def __init__( self, controller, bandwidth_manager, session_manager, domain_manager, login_manager ):
@ -61,6 +62,8 @@ class NetworkEngine( object ):
self._new_work_to_do = threading.Event()
self._active_domains_counter = collections.Counter()
self._jobs_awaiting_validity = []
self._current_validation_process = None
self._jobs_awaiting_bandwidth = []
@ -80,7 +83,7 @@ class NetworkEngine( object ):
if HG.network_report_mode:
HydrusData.ShowText( 'Network Job: ' + job._method + ' ' + job._url )
HydrusData.ShowText( 'Network Job Added: ' + job._method + ' ' + job._url )
with self._lock:
@ -291,8 +294,25 @@ class NetworkEngine( object ):
return True
elif self._active_domains_counter[ job.GetSecondLevelDomain() ] >= self.MAX_JOBS_PER_DOMAIN:
job.SetStatus( u'waiting for a slot on this domain' )
return True
elif not job.TokensOK():
return True
else:
if HG.network_report_mode:
HydrusData.ShowText( 'Network Job Starting: ' + job._method + ' ' + job._url )
self._active_domains_counter[ job.GetSecondLevelDomain() ] += 1
self.controller.CallToThread( job.Start )
self._jobs_running.append( job )
@ -302,7 +322,7 @@ class NetworkEngine( object ):
else:
job.SetStatus( u'waiting for download slot\u2026' )
job.SetStatus( u'waiting for slot\u2026' )
return True
@ -312,6 +332,20 @@ class NetworkEngine( object ):
if job.IsDone():
if HG.network_report_mode:
HydrusData.ShowText( 'Network Job Done: ' + job._method + ' ' + job._url )
second_level_domain = job.GetSecondLevelDomain()
self._active_domains_counter[ second_level_domain ] -= 1
if self._active_domains_counter[ second_level_domain ] == 0:
del self._active_domains_counter[ second_level_domain ]
return False
else:
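The engine now counts active jobs per second-level domain in a collections.Counter, refusing a start slot at MAX_JOBS_PER_DOMAIN and deleting zeroed entries on completion so the counter never grows without bound. That accounting in isolation:

```python
import collections

MAX_JOBS_PER_DOMAIN = 3

class DomainSlotTracker:
    def __init__(self):
        self._active = collections.Counter()

    def try_start(self, domain):
        # refuse a slot if the domain is already at its cap; reading a
        # missing key from a Counter returns 0 without inserting it
        if self._active[domain] >= MAX_JOBS_PER_DOMAIN:
            return False
        self._active[domain] += 1
        return True

    def finish(self, domain):
        # decrement and drop empty entries so the counter stays small
        self._active[domain] -= 1
        if self._active[domain] == 0:
            del self._active[domain]
```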

View File

@ -2,6 +2,8 @@ import collections
import ClientConstants as CC
import ClientNetworkingContexts
import HydrusConstants as HC
import HydrusData
import HydrusGlobals as HG
import HydrusNetworking
import HydrusThreading
import HydrusSerialisable
@ -23,6 +25,10 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
self._lock = threading.Lock()
self._last_pages_gallery_query_timestamps = collections.defaultdict( lambda: 0 )
self._last_subscriptions_gallery_query_timestamps = collections.defaultdict( lambda: 0 )
self._last_watchers_query_timestamps = collections.defaultdict( lambda: 0 )
self._network_contexts_to_bandwidth_trackers = collections.defaultdict( HydrusNetworking.BandwidthTracker )
self._network_contexts_to_bandwidth_rules = collections.defaultdict( HydrusNetworking.BandwidthRules )
@ -61,7 +67,7 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
def _GetSerialisableInfo( self ):
# note this discards ephemeral network contexts, which have page_key-specific identifiers and are temporary, not meant to be hung onto forever, and are generally invisible to the user
# note this discards ephemeral network contexts, which have temporary identifiers that are generally invisible to the user
all_serialisable_trackers = [ ( network_context.GetSerialisableTuple(), tracker.GetSerialisableTuple() ) for ( network_context, tracker ) in self._network_contexts_to_bandwidth_trackers.items() if not network_context.IsEphemeral() ]
all_serialisable_rules = [ ( network_context.GetSerialisableTuple(), rules.GetSerialisableTuple() ) for ( network_context, rules ) in self._network_contexts_to_bandwidth_rules.items() ]
@ -373,6 +379,46 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
def TryToConsumeAGalleryToken( self, second_level_domain, query_type ):
with self._lock:
if query_type == 'download page':
timestamps_dict = self._last_pages_gallery_query_timestamps
delay = HG.client_controller.new_options.GetInteger( 'gallery_page_wait_period_pages' )
elif query_type == 'subscription':
timestamps_dict = self._last_subscriptions_gallery_query_timestamps
delay = HG.client_controller.new_options.GetInteger( 'gallery_page_wait_period_subscriptions' )
elif query_type == 'watcher':
timestamps_dict = self._last_watchers_query_timestamps
delay = HG.client_controller.new_options.GetInteger( 'watcher_page_wait_period' )
next_timestamp = timestamps_dict[ second_level_domain ] + delay
if HydrusData.TimeHasPassed( next_timestamp ):
timestamps_dict[ second_level_domain ] = HydrusData.GetNow()
return ( True, 0 )
else:
return ( False, next_timestamp )
raise NotImplementedError( 'Unknown query type' )
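TryToConsumeAGalleryToken keys its last-query timestamps per second-level domain via a defaultdict of zeros, so one busy domain no longer delays the others. A sketch of the per-domain slot with an injected clock standing in for HydrusData.GetNow/TimeHasPassed:

```python
import collections

def make_token_consumer(delay, now_fn):
    # per-domain gallery slot: one query per `delay` seconds per domain;
    # returns ( True, 0 ) when a slot is free, else ( False, next_timestamp )
    last_timestamps = collections.defaultdict(lambda: 0)
    def try_to_consume(second_level_domain):
        next_timestamp = last_timestamps[second_level_domain] + delay
        if now_fn() >= next_timestamp:
            last_timestamps[second_level_domain] = now_fn()
            return (True, 0)
        return (False, next_timestamp)
    return try_to_consume
```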
def TryToStartRequest( self, network_contexts ):
# this wraps the can-start check and the request report in one transaction, so a race condition cannot sneak e.g. 5 requests through a 1 rq/s rule

View File

@ -14,19 +14,9 @@ import time
import urllib
import urlparse
def CombineGETURLWithParameters( url, params_dict ):
def AlphabetiseQueryText( query_text ):
def make_safe( text ):
# convert unicode to raw bytes
# quote that to be url-safe, ignoring the default '/' 'safe' character
return urllib.quote( HydrusData.ToByteString( text ), '' )
request_string = '&'.join( ( make_safe( key ) + '=' + make_safe( value ) for ( key, value ) in params_dict.items() ) )
return url + '?' + request_string
return ConvertQueryDictToText( ConvertQueryTextToDict( query_text ) )
def ConvertDomainIntoAllApplicableDomains( domain ):
@ -103,6 +93,25 @@ def ConvertHTTPToHTTPS( url ):
raise Exception( 'Given a url that did not have a scheme!' )
def ConvertQueryDictToText( query_dict ):
# we now do everything with requests, which does all the unicode -> ascii -> %20 business naturally, phew
# so let's just stick with unicode, which we still want to call explicitly to coerce integers and so on that'll slip in here and there
param_pairs = list( query_dict.items() )
param_pairs.sort()
query_text = u'&'.join( ( unicode( key ) + u'=' + unicode( value ) for ( key, value ) in param_pairs ) )
return query_text
def ConvertQueryTextToDict( query_text ):
query_dict = dict( urlparse.parse_qsl( query_text ) )
return query_dict
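AlphabetiseQueryText round-trips the query string through a dict and back with the parameters sorted, so e.g. 'tag=x&page=3' and 'page=3&tag=x' normalise identically. The hunk is Python 2 (urlparse, unicode); the same idea in Python 3's urllib.parse looks like:

```python
from urllib.parse import parse_qsl

def alphabetise_query_text(query_text):
    # parse to (key, value) pairs, collapse duplicates via dict() as the
    # original does, then sort and re-join
    query_dict = dict(parse_qsl(query_text))
    return '&'.join(str(k) + '=' + str(v) for (k, v) in sorted(query_dict.items()))
```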
def ConvertURLMatchesIntoAPIPairs( url_matches ):
pairs = []
@ -150,6 +159,12 @@ def ConvertURLIntoDomain( url ):
return domain
def ConvertURLIntoSecondLevelDomain( url ):
domain = ConvertURLIntoDomain( url )
return ConvertDomainIntoSecondLevelDomain( domain )
def DomainEqualsAnotherForgivingWWW( test_domain, wwwable_domain ):
# domain is either the same or starts with www. or www2. or something
@ -263,6 +278,7 @@ class NetworkDomainManager( HydrusSerialisable.SerialisableBase ):
self.engine = None
self._gugs = HydrusSerialisable.SerialisableList()
self._url_matches = HydrusSerialisable.SerialisableList()
self._parsers = HydrusSerialisable.SerialisableList()
self._network_contexts_to_custom_header_dicts = collections.defaultdict( dict )
@ -283,9 +299,6 @@ class NetworkDomainManager( HydrusSerialisable.SerialisableBase ):
self._parser_keys_to_parsers = {}
self._last_pages_gallery_query_timestamp = 0
self._last_subscriptions_gallery_query_timestamp = 0
self._dirty = False
self._lock = threading.Lock()
@ -748,6 +761,14 @@ class NetworkDomainManager( HydrusSerialisable.SerialisableBase ):
def GetGUGs( self ):
with self._lock:
return list( self._gugs )
def GetHeaders( self, network_contexts ):
with self._lock:
@ -919,10 +940,23 @@ class NetworkDomainManager( HydrusSerialisable.SerialisableBase ):
if url_match is None:
return url
p = urlparse.urlparse( url )
scheme = p.scheme
netloc = p.netloc
path = p.path
params = p.params
query = AlphabetiseQueryText( p.query )
fragment = p.fragment
r = urlparse.ParseResult( scheme, netloc, path, params, query, fragment )
normalised_url = r.geturl()
else:
normalised_url = url_match.Normalise( url )
normalised_url = url_match.Normalise( url )
return normalised_url
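For a URL with no matching URL class, NormaliseURL now still alphabetises the query component, rebuilding the rest of the URL unchanged. A Python 3 rendering of that fallback path (the hunk itself uses the Python 2 urlparse module):

```python
from urllib.parse import ParseResult, parse_qsl, urlparse

def normalise_unmatched_url(url):
    # alphabetise only the query; scheme, netloc, path, params and fragment
    # pass through untouched
    p = urlparse(url)
    query = '&'.join(k + '=' + v for (k, v) in sorted(dict(parse_qsl(p.query)).items()))
    return ParseResult(p.scheme, p.netloc, p.path, p.params, query, p.fragment).geturl()
```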
@ -936,6 +970,11 @@ class NetworkDomainManager( HydrusSerialisable.SerialisableBase ):
default_parsers = ClientDefaults.GetDefaultParsers()
for parser in default_parsers:
parser.RegenerateParserKey()
existing_parsers = list( self._parsers )
new_parsers = [ parser for parser in existing_parsers if parser.GetName() not in parser_names ]
@ -953,6 +992,11 @@ class NetworkDomainManager( HydrusSerialisable.SerialisableBase ):
default_url_matches = ClientDefaults.GetDefaultURLMatches()
for url_match in default_url_matches:
url_match.RegenMatchKey()
existing_url_matches = list( self._url_matches )
new_url_matches = [ url_match for url_match in existing_url_matches if url_match.GetName() not in url_match_names ]
@ -994,6 +1038,16 @@ class NetworkDomainManager( HydrusSerialisable.SerialisableBase ):
def SetGUGs( self, gugs ):
with self._lock:
#check ngugs maybe
self._gugs = gugs
def SetHeaderValidation( self, network_context, key, approved ):
with self._lock:
@ -1156,49 +1210,6 @@ class NetworkDomainManager( HydrusSerialisable.SerialisableBase ):
def TryToConsumeAGalleryQuery( self, query_type ):
with self._lock:
if query_type == 'pages':
delay = HG.client_controller.new_options.GetInteger( 'gallery_page_wait_period_pages' )
next_timestamp = self._last_pages_gallery_query_timestamp + delay
if HydrusData.TimeHasPassed( next_timestamp ):
self._last_pages_gallery_query_timestamp = HydrusData.GetNow()
return ( True, 0 )
else:
return ( False, next_timestamp )
elif query_type == 'subscriptions':
delay = HG.client_controller.new_options.GetInteger( 'gallery_page_wait_period_subscriptions' )
next_timestamp = self._last_subscriptions_gallery_query_timestamp + delay
if HydrusData.TimeHasPassed( next_timestamp ):
self._last_subscriptions_gallery_query_timestamp = HydrusData.GetNow()
return ( True, 0 )
else:
return ( False, next_timestamp )
raise NotImplementedError( 'Unknown query type' )
def URLCanReferToMultipleFiles( self, url ):
with self._lock:
@ -1436,6 +1447,183 @@ class DomainValidationPopupProcess( object ):
GALLERY_INDEX_TYPE_PATH_COMPONENT = 0
GALLERY_INDEX_TYPE_PARAMETER = 1
class GalleryURLGenerator( HydrusSerialisable.SerialisableBaseNamed ):
SERIALISABLE_TYPE = HydrusSerialisable.SERIALISABLE_TYPE_GALLERY_URL_GENERATOR
SERIALISABLE_NAME = 'Gallery URL Generator'
SERIALISABLE_VERSION = 1
def __init__( self, name, gug_key = None, url_template = None, replacement_phrase = None, search_terms_separator = None, initial_search_text = None, example_search_text = None ):
if gug_key is None:
gug_key = HydrusData.GenerateKey()
if url_template is None:
url_template = 'https://example.com/search?q=%tags%&index=0'
if replacement_phrase is None:
replacement_phrase = '%tags%'
if search_terms_separator is None:
search_terms_separator = '+'
if initial_search_text is None:
initial_search_text = 'search tags'
if example_search_text is None:
example_search_text = 'blue_eyes blonde_hair'
HydrusSerialisable.SerialisableBaseNamed.__init__( self, name )
self._gallery_url_generator_key = gug_key
self._url_template = url_template
self._replacement_phrase = replacement_phrase
self._search_terms_separator = search_terms_separator
self._initial_search_text = initial_search_text
self._example_search_text = example_search_text
def _GetSerialisableInfo( self ):
serialisable_gallery_url_generator_key = self._gallery_url_generator_key.encode( 'hex' )
return ( serialisable_gallery_url_generator_key, self._url_template, self._replacement_phrase, self._search_terms_separator, self._initial_search_text, self._example_search_text )
def _InitialiseFromSerialisableInfo( self, serialisable_info ):
( serialisable_gallery_url_generator_key, self._url_template, self._replacement_phrase, self._search_terms_separator, self._initial_search_text, self._example_search_text ) = serialisable_info
self._gallery_url_generator_key = serialisable_gallery_url_generator_key.decode( 'hex' )
def GenerateGalleryURL( self, search_terms ):
if self._replacement_phrase == '':
raise HydrusExceptions.GUGException( 'No replacement phrase!' )
if self._replacement_phrase not in self._url_template:
raise HydrusExceptions.GUGException( 'Replacement phrase not in URL template!' )
try:
search_phrase = self._search_terms_separator.join( search_terms )
gallery_url = self._url_template.replace( self._replacement_phrase, search_phrase )
except Exception as e:
raise HydrusExceptions.GUGException( unicode( e ) )
return gallery_url
def GetExampleURL( self ):
return self.GenerateGalleryURL( self._example_search_text.split( ' ' ) )
def GetGUGKey( self ):
return self._gallery_url_generator_key
def GetInitialSearchText( self ):
return self._initial_search_text
def GetURLTemplateVariables( self ):
return ( self._url_template, self._replacement_phrase, self._search_terms_separator, self._example_search_text )
def RegenerateGUGKey( self ):
self._gallery_url_generator_key = HydrusData.GenerateKey()
HydrusSerialisable.SERIALISABLE_TYPES_TO_OBJECT_TYPES[ HydrusSerialisable.SERIALISABLE_TYPE_GALLERY_URL_GENERATOR ] = GalleryURLGenerator
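The substitution `GenerateGalleryURL` performs can be sketched as a free function (hypothetical name; the real class also carries a GUG key, initial search text, and serialisation plumbing):

```python
# Sketch of GUG-style URL generation: join the search terms with the
# separator, then substitute the result into the URL template wherever
# the replacement phrase appears.
def generate_gallery_url(url_template, replacement_phrase, separator, search_terms):
    if replacement_phrase == '':
        raise ValueError('No replacement phrase!')
    if replacement_phrase not in url_template:
        raise ValueError('Replacement phrase not in URL template!')
    search_phrase = separator.join(search_terms)
    return url_template.replace(replacement_phrase, search_phrase)
```

Using the class's own defaults, `generate_gallery_url('https://example.com/search?q=%tags%&index=0', '%tags%', '+', ['blue_eyes', 'blonde_hair'])` produces `'https://example.com/search?q=blue_eyes+blonde_hair&index=0'`, which is exactly what `GetExampleURL` renders from the example search text.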
class NestedGalleryURLGenerator( HydrusSerialisable.SerialisableBaseNamed ):
SERIALISABLE_TYPE = HydrusSerialisable.SERIALISABLE_TYPE_NESTED_GALLERY_URL_GENERATOR
SERIALISABLE_NAME = 'Nested Gallery URL Generator'
SERIALISABLE_VERSION = 1
def __init__( self, name, initial_search_text = None, gug_keys = None ):
if initial_search_text is None:
initial_search_text = 'search tags'
if gug_keys is None:
gug_keys = []
HydrusSerialisable.SerialisableBaseNamed.__init__( self, name )
self._initial_search_text = initial_search_text
self._gug_keys = gug_keys
def _GetSerialisableInfo( self ):
serialisable_gug_keys = [ gug_key.encode( 'hex' ) for gug_key in self._gug_keys ]
return ( self._initial_search_text, serialisable_gug_keys )
def _InitialiseFromSerialisableInfo( self, serialisable_info ):
( self._initial_search_text, serialisable_gug_keys ) = serialisable_info
self._gug_keys = [ gug_key.decode( 'hex' ) for gug_key in serialisable_gug_keys ]
def GenerateGalleryURLs( self, search_terms ):
gallery_urls = []
for gug_key in self._gug_keys:
gug = HG.client_controller.network_engine.domain_manager.GetGUG( gug_key )
if gug is not None:
gallery_urls.append( gug.GenerateGalleryURL( search_terms ) )
return gallery_urls
def GetInitialSearchText( self ):
return self._initial_search_text
HydrusSerialisable.SERIALISABLE_TYPES_TO_OBJECT_TYPES[ HydrusSerialisable.SERIALISABLE_TYPE_NESTED_GALLERY_URL_GENERATOR ] = NestedGalleryURLGenerator
class URLMatch( HydrusSerialisable.SerialisableBaseNamed ):
SERIALISABLE_TYPE = HydrusSerialisable.SERIALISABLE_TYPE_URL_MATCH
@@ -1553,19 +1741,11 @@ class URLMatch( HydrusSerialisable.SerialisableBaseNamed ):
def _ClipQuery( self, query ):
valid_parameters = []
query_dict = ConvertQueryTextToDict( query )
for ( key, value ) in urlparse.parse_qsl( query ):
if key in self._parameters:
valid_parameters.append( ( key, value ) )
valid_parameters = { key : value for ( key, value ) in query_dict.items() if key in self._parameters }
valid_parameters.sort()
query = '&'.join( ( key + '=' + value for ( key, value ) in valid_parameters ) )
query = ConvertQueryDictToText( valid_parameters )
return query
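`ConvertQueryTextToDict`/`ConvertQueryDictToText` are Hydrus helpers; the assumed round-trip behaviour of this clip-and-normalise step can be sketched with the stdlib (`clip_and_alphabetise_query` is an illustrative name, not Hydrus API):

```python
from urllib.parse import parse_qsl

# Sketch of the query normalisation above: parse the query string into a
# dict, drop parameters the URL class does not recognise, and re-emit the
# survivors in alphabetical key order so equivalent URLs compare equal.
def clip_and_alphabetise_query(query, allowed_keys):
    query_dict = dict(parse_qsl(query))
    valid_parameters = {key: value for (key, value) in query_dict.items() if key in allowed_keys}
    return '&'.join(key + '=' + value for (key, value) in sorted(valid_parameters.items()))
```

For example, `clip_and_alphabetise_query('s=date&q=cats&session=abc', {'q', 's'})` drops the unrecognised `session` parameter and returns `'q=cats&s=date'`.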
@@ -1746,7 +1926,7 @@ class URLMatch( HydrusSerialisable.SerialisableBaseNamed ):
raise HydrusExceptions.URLMatchException( 'Could not generate next gallery page--index component was not an integer!' )
path_components[ page_index_path_component_index ] = page_index + self._gallery_index_delta
path_components[ page_index_path_component_index ] = str( page_index + self._gallery_index_delta )
path = '/' + '/'.join( path_components )
@@ -1754,7 +1934,7 @@ class URLMatch( HydrusSerialisable.SerialisableBaseNamed ):
page_index_name = self._gallery_index_identifier
query_dict = { key : value_list[0] for ( key, value_list ) in urlparse.parse_qs( query ).items() }
query_dict = ConvertQueryTextToDict( query )
if page_index_name not in query_dict:
@@ -1774,7 +1954,7 @@ class URLMatch( HydrusSerialisable.SerialisableBaseNamed ):
query_dict[ page_index_name ] = page_index + self._gallery_index_delta
query = urllib.urlencode( query_dict )
query = ConvertQueryDictToText( query_dict )
else:
@@ -1850,7 +2030,7 @@ class URLMatch( HydrusSerialisable.SerialisableBaseNamed ):
netloc = p.netloc
path = p.path
query = p.query
query = AlphabetiseQueryText( p.query )
r = urlparse.ParseResult( scheme, netloc, path, params, query, fragment )
@@ -1922,9 +2102,7 @@ class URLMatch( HydrusSerialisable.SerialisableBaseNamed ):
url_parameters_list = urlparse.parse_qsl( p.query )
url_parameters = dict( url_parameters_list )
url_parameters = ConvertQueryTextToDict( p.query )
if len( url_parameters ) < len( self._parameters ):
@@ -1962,4 +2140,3 @@ class URLMatch( HydrusSerialisable.SerialisableBaseNamed ):
HydrusSerialisable.SERIALISABLE_TYPES_TO_OBJECT_TYPES[ HydrusSerialisable.SERIALISABLE_TYPE_URL_MATCH ] = URLMatch

@@ -90,6 +90,10 @@ class NetworkJob( object ):
self._method = method
self._url = url
self._domain = ClientNetworkingDomain.ConvertURLIntoDomain( self._url )
self._second_level_domain = ClientNetworkingDomain.ConvertURLIntoSecondLevelDomain( self._url )
self._body = body
self._referral_url = referral_url
self._temp_path = temp_path
@@ -119,6 +123,9 @@ class NetworkJob( object ):
self._is_done = False
self._is_cancelled = False
self._gallery_token_name = None
self._gallery_token_consumed = False
self._bandwidth_manual_override = False
self._bandwidth_manual_override_delayed_timestamp = None
@@ -162,8 +169,7 @@ class NetworkJob( object ):
network_contexts.append( ClientNetworkingContexts.GLOBAL_NETWORK_CONTEXT )
domain = ClientNetworkingDomain.ConvertURLIntoDomain( self._url )
domains = ClientNetworkingDomain.ConvertDomainIntoAllApplicableDomains( domain )
domains = ClientNetworkingDomain.ConvertDomainIntoAllApplicableDomains( self._domain )
network_contexts.extend( ( ClientNetworkingContexts.NetworkContext( CC.NETWORK_CONTEXT_DOMAIN, domain ) for domain in domains ) )
@@ -175,11 +181,8 @@ class NetworkJob( object ):
# we always store cookies in the larger session (even if the cookie itself refers to a subdomain in the session object)
# but we can login to a specific subdomain
domain = ClientNetworkingDomain.ConvertURLIntoDomain( self._url )
domains = ClientNetworkingDomain.ConvertDomainIntoAllApplicableDomains( domain )
session_network_context = ClientNetworkingContexts.NetworkContext( CC.NETWORK_CONTEXT_DOMAIN, domains[-1] )
login_network_context = ClientNetworkingContexts.NetworkContext( CC.NETWORK_CONTEXT_DOMAIN, domain )
session_network_context = ClientNetworkingContexts.NetworkContext( CC.NETWORK_CONTEXT_DOMAIN, self._second_level_domain )
login_network_context = ClientNetworkingContexts.NetworkContext( CC.NETWORK_CONTEXT_DOMAIN, self._domain )
return ( session_network_context, login_network_context )
@@ -588,6 +591,14 @@ class NetworkJob( object ):
def GetDomain( self ):
with self._lock:
return self._domain
def GetErrorException( self ):
with self._lock:
@@ -612,6 +623,14 @@ class NetworkJob( object ):
def GetSecondLevelDomain( self ):
with self._lock:
return self._second_level_domain
def GetStatus( self ):
with self._lock:
@@ -752,6 +771,14 @@ class NetworkJob( object ):
def SetGalleryToken( self, token_name ):
with self._lock:
self._gallery_token_name = token_name
def SetStatus( self, text ):
with self._lock:
@@ -928,6 +955,32 @@ class NetworkJob( object ):
def TokensOK( self ):
with self._lock:
if self._gallery_token_name is not None and not self._gallery_token_consumed:
( consumed, next_timestamp ) = HG.client_controller.network_engine.bandwidth_manager.TryToConsumeAGalleryToken( self._second_level_domain, self._gallery_token_name )
if consumed:
self._gallery_token_consumed = True
else:
self._status_text = 'waiting for a ' + self._gallery_token_name + ' slot: next ' + HydrusData.TimestampToPrettyTimeDelta( next_timestamp, just_now_threshold = 1 )
self._Sleep( 1 )
return False
return True
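`TryToConsumeAGalleryToken` lives in the bandwidth manager; its assumed per-domain behaviour can be sketched as follows (`GalleryTokenManager` and the explicit `now` parameter are illustrative names, not the real API):

```python
import time

# Sketch of the per-domain gallery-token gate behind TokensOK: each
# (second-level domain, token name) pair has its own wait period, so one
# site's gallery fetches cannot starve another site's.
class GalleryTokenManager:

    def __init__(self, wait_period):
        self._wait_period = wait_period
        self._last_timestamps = {}  # (domain, token_name) -> timestamp

    def try_to_consume(self, second_level_domain, token_name, now=None):
        if now is None:
            now = time.time()
        key = (second_level_domain, token_name)
        next_timestamp = self._last_timestamps.get(key, 0) + self._wait_period
        if now >= next_timestamp:
            self._last_timestamps[key] = now
            return (True, 0)
        return (False, next_timestamp)
```

Because the key includes the second-level domain, consuming a 'gallery page' token for `example.com` leaves the same token for any other domain immediately available, which is the "works per-domain" behaviour this commit introduces.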
def WaitUntilDone( self ):
while True:

@@ -187,6 +187,7 @@ class ClientOptions( HydrusSerialisable.SerialisableBase ):
self._dictionary[ 'integers' ][ 'gallery_page_wait_period_pages' ] = 15
self._dictionary[ 'integers' ][ 'gallery_page_wait_period_subscriptions' ] = 5
self._dictionary[ 'integers' ][ 'watcher_page_wait_period' ] = 5
self._dictionary[ 'integers' ][ 'popup_message_character_width' ] = 56

@@ -2382,7 +2382,7 @@ class ParseRootFileLookup( HydrusSerialisable.SerialisableBaseNamed ):
raise Exception( 'Cannot have a file as an argument on a GET query!' )
full_request_url = ClientNetworkingDomain.CombineGETURLWithParameters( self._url, request_args )
full_request_url = self._url + '?' + ClientNetworkingDomain.ConvertQueryDictToText( request_args )
job_key.SetVariable( 'script_status', 'fetching ' + full_request_url )

@@ -17,11 +17,12 @@ LZ4_OK = False
try:
import lz4
import lz4.block
LZ4_OK = True
except: # ImportError wasn't enough here as Linux went up the shoot with a __version__ doesn't exist bs
except Exception as e: # ImportError wasn't enough here as Linux went up the shoot with a __version__ doesn't exist bs
pass

@@ -49,7 +49,7 @@ options = {}
# Misc
NETWORK_VERSION = 18
SOFTWARE_VERSION = 318
SOFTWARE_VERSION = 319
UNSCALED_THUMBNAIL_DIMENSIONS = ( 200, 200 )

@@ -386,7 +386,7 @@ def TimestampToPrettyTimeDelta( timestamp, just_now_string = 'now', just_now_thr
time_delta = abs( timestamp - GetNow() )
if time_delta < just_now_threshold:
if time_delta <= just_now_threshold:
return just_now_string
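The `<` to `<=` change makes the 'just now' threshold inclusive: a delta exactly equal to the threshold now reads as "now" instead of a time delta. A heavily simplified sketch of just that boundary (the real function also handles years/months/etc.):

```python
# Sketch of the inclusive 'just now' boundary: a time delta exactly equal
# to the threshold returns the just-now string rather than a delta string.
def pretty_time_delta(time_delta, just_now_string='now', just_now_threshold=3):
    if time_delta <= just_now_threshold:
        return just_now_string
    return '%d seconds' % time_delta
```

With a threshold of 3, a delta of exactly 3 seconds now renders as "now", matching the `just_now_threshold = 1` calls the gallery-token status text makes.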

@@ -27,6 +27,7 @@ class ParseException( HydrusException ): pass
class StringConvertException( ParseException ): pass
class StringMatchException( ParseException ): pass
class URLMatchException( ParseException ): pass
class GUGException( ParseException ): pass
class NetworkException( HydrusException ): pass

@@ -5,6 +5,7 @@ LZ4_OK = False
try:
import lz4
import lz4.block
LZ4_OK = True
@@ -82,6 +83,8 @@ SERIALISABLE_TYPE_SERVICE_TAG_IMPORT_OPTIONS = 65
SERIALISABLE_TYPE_GALLERY_SEED = 66
SERIALISABLE_TYPE_GALLERY_SEED_LOG = 67
SERIALISABLE_TYPE_GALLERY_IMPORT = 68
SERIALISABLE_TYPE_GALLERY_URL_GENERATOR = 69
SERIALISABLE_TYPE_NESTED_GALLERY_URL_GENERATOR = 70
SERIALISABLE_TYPES_TO_OBJECT_TYPES = {}