Version 425

closes #780
This commit is contained in:
Hydrus Network Developer 2021-01-13 15:48:58 -06:00
parent 44e825e2a1
commit 069f77e194
18 changed files with 1618 additions and 483 deletions

View File

@ -8,6 +8,41 @@
<div class="content">
<h3>changelog</h3>
<ul>
<li><h3>version 425</h3></li>
<ul>
<li>optimisations:</li>
<li>I fixed the new tag cache's slow tag autocomplete when in 'all known files' domain (which is usually in the manage tags dialog). what was taking about 2.5 seconds in 424 should now take about 58ms!!! for technical details, I was foolishly performing the pre-search exact match lookup (where exactly what you type appears before the full results fetch) on the new quick-text search tables, but it turns out this is unoptimised and was wasting a ton of CPU once the table got big. sorry for the trouble here--this was driving me nuts IRL. I have now fleshed out my dev machine's test client with many more millions of tag mappings so I can test these scales better in future before they go live</li>
<li>internal autocomplete count fetches for single tags now have less overhead, which should add up for various rapid small checks across the program, mostly for tag processing, where the client frequently consults current counts on single tags for pre-processing analysis</li>
<li>autocomplete count fetch requests for zero tags (lol) are also dealt with more efficiently</li>
<li>thanks to the new tag definition cache, the 'num tags' service info cache is now updated and regenerated more efficiently. this speeds up all tag processing a couple percent</li>
<li>tag update now quickly filters out redundant data before the main processing job. it is now significantly faster to process tag mappings that already exist--e.g. when a downloaded file pends tags that already exist, or repo processing gives you tags you already have, or you are filling in content gaps in reprocessing</li>
<li>tag processing is now more efficient when checking against membership in the display cache, which greatly speeds up processing on services with many siblings and parents. thank you to the users who have contributed profiles and other feedback regarding slower processing speeds since the display cache was added</li>
<li>various tag filtering and display membership tests are now shunted to the top of the mappings update routine, reducing much other overhead, especially when the mappings being added are redundant</li>
<li>.</li>
<li>tag logic fixes:</li>
<li>I explored the 'ghost tag' issue, where sometimes committing a pending tag still leaves a pending record. this has been happening in the new display system when two pending tags that imply the same tag through siblings or parents are committed at the same time. I fixed a previous instance of this, but more remained. I replicated the problem through a unit test, rewrote several update loops to remain in sync when needed, and have fixed potential ghost tag instances in the specific and 'all known files' domains, for 'add', 'pend', 'delete', and 'rescind pend' actions</li>
<li>also tested and fixed are possible instances where both a tag and its implication tag are pend-committed at the same time, not just two that imply a shared other</li>
<li>furthermore, in a complex counting issue, storage autocomplete count updates are no longer deferred when updating mappings--they are 'interleaved' into mappings updates so counts are always synchronised to tables. this unfortunately adds some processing overhead back in, but as a number of newer cache calculations rely on autocomplete numbers, this change improves counting and pre-processing logic</li>
<li>fixed a 'commit pending to current' counting bug in the new autocomplete update routine for 'all known files' domain</li>
<li>while display tag logic is working increasingly ok and fast, most clients will have some miscounts and ghost tags here and there. I have yet to write efficient correction maintenance routines for particular files or tags, but this is planned and will come. at the moment, you just have the nuclear 'regen' maintenance calls, which are no good for little problems</li>
<li>.</li>
<li>network object breakup:</li>
<li>the network session and bandwidth managers, which store your cookies and bandwidth history for all the different network contexts, are no longer monolithic objects. on updates to individual network contexts (which happens all the time during network activity), only the particular updated session or bandwidth tracker now needs to be saved to the database. this reduces CPU and UI lag on heavy clients. basically the same thing as the subscriptions breakup last year, but all behind the scenes</li>
<li>your existing managers will be converted on update. all existing login and bandwidth log data should be preserved</li>
<li>sessions will now keep delayed cookie changes that occurred in the final network request before client exit</li>
<li>we won't go too crazy yet, but session and bandwidth data is now synced to the database every 5 minutes, instead of 10, so if the client crashes, you only lose 5 mins of login/bandwidth data</li>
<li>some session clearing logic is improved</li>
<li>the bandwidth manager no longer considers future bandwidth in tests. if your computer clock goes haywire and your client records bandwidth in the future, it shouldn't bosh you _so much_ now</li>
<li>.</li>
<li>the rest:</li>
<li>the 'system:number of tags' query now has greatly improved cancelability, even on gigantic result domains</li>
<li>fixed a bad example in the client api help that mislabeled 'request_new_permissions' as 'request_access_permissions' (issue #780)</li>
<li>the 'check and repair db' boot routine now runs _after_ version checks, so if you accidentally install a version behind, you now get the 'weird version m8' warning before the db goes bananas about missing tables or similar</li>
<li>added some methods and optimised some access in Hydrus Tag Archives</li>
<li>if you delete all the rules from a default bandwidth ruleset, it no longer disappears momentarily in the edit UI</li>
<li>updated the python mpv bindings to 0.5.2 on windows, although the underlying dll is the same. this seems to fix at least one set of dll load problems. also updated is macOS, but not Linux (yet), because it broke there, hooray</li>
<li>updated cloudscraper to 1.2.52 for all platforms</li>
</ul>
<li><h3>version 424</h3></li>
<ul>
<li>new tag caches:</li>

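The 'future bandwidth' changelog point above boils down to clamping usage sums to records stamped no later than the current time, so a haywire system clock cannot inflate bandwidth totals. A minimal stdlib sketch of that idea (hypothetical names, not hydrus's actual BandwidthTracker API):

```python
import time

class SimpleBandwidthTracker:
    """Records ( timestamp, num_bytes ) pairs and sums recent usage.

    Hypothetical stand-in for a bandwidth tracker: records stamped in
    the 'future' are excluded from usage tests, so a bad clock cannot
    inflate the totals that bandwidth rules are checked against.
    """

    def __init__( self ):
        self._records = []  # list of ( timestamp, num_bytes )

    def report_data_used( self, num_bytes, timestamp = None ):
        if timestamp is None:
            timestamp = time.time()
        self._records.append( ( timestamp, num_bytes ) )

    def get_usage( self, time_delta ):
        now = time.time()
        # only count records inside [ now - time_delta, now ]; anything
        # stamped after 'now' is ignored rather than counted against rules
        return sum(
            num_bytes
            for ( timestamp, num_bytes ) in self._records
            if now - time_delta <= timestamp <= now
        )
```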
View File

@ -137,7 +137,7 @@
<li>
<p>Example request:</p>
<ul>
<li><p>/request_access_permissions?name=my%20import%20script&basic_permissions=[0,1]</p></li>
<li><p>/request_new_permissions?name=my%20import%20script&basic_permissions=[0,1]</p></li>
</ul>
</li>
<li><p>Response description: Some JSON with your access key, which is 64 characters of hex. This will not be valid until the user approves the request in the client ui.</p></li>

View File

@ -871,11 +871,15 @@ class Controller( HydrusController.HydrusController ):
bandwidth_manager = ClientNetworkingBandwidth.NetworkBandwidthManager()
tracker_containers = self.Read( 'serialisable_named', HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER_TRACKER_CONTAINER )
bandwidth_manager.SetTrackerContainers( tracker_containers )
ClientDefaults.SetDefaultBandwidthManagerRules( bandwidth_manager )
bandwidth_manager._dirty = True
bandwidth_manager.SetDirty()
self.SafeShowCriticalMessage( 'Problem loading object', 'Your bandwidth manager was missing on boot! I have recreated a new empty one with default rules. Please check that your hard drive and client are ok and let the hydrus dev know the details if there is a mystery.' )
self.SafeShowCriticalMessage( 'Problem loading object', 'Your bandwidth manager was missing on boot! I have recreated a new one. It may have your bandwidth record, but some/all may be missing. Your rules have been reset to default. Please check that your hard drive and client are ok and let the hydrus dev know the details if there is a mystery.' )
session_manager = self.Read( 'serialisable', HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER )
@ -884,9 +888,13 @@ class Controller( HydrusController.HydrusController ):
session_manager = ClientNetworkingSessions.NetworkSessionManager()
session_manager._dirty = True
session_containers = self.Read( 'serialisable_named', HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER_SESSION_CONTAINER )
self.SafeShowCriticalMessage( 'Problem loading object', 'Your session manager was missing on boot! I have recreated a new empty one. Please check that your hard drive and client are ok and let the hydrus dev know the details if there is a mystery.' )
session_manager.SetSessionContainers( session_containers )
session_manager.SetDirty()
self.SafeShowCriticalMessage( 'Problem loading object', 'Your session manager was missing on boot! I have recreated a new one. It may have your sessions, or some/all may be missing. Please check that your hard drive and client are ok and let the hydrus dev know the details if there is a mystery.' )
domain_manager = self.Read( 'serialisable', HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_DOMAIN_MANAGER )
@ -1097,7 +1105,7 @@ class Controller( HydrusController.HydrusController ):
job.WakeOnPubSub( 'important_dirt_to_clean' )
self._daemon_jobs[ 'save_dirty_objects_important' ] = job
job = self.CallRepeating( 0.0, 600.0, self.SaveDirtyObjectsInfrequent )
job = self.CallRepeating( 0.0, 300.0, self.SaveDirtyObjectsInfrequent )
self._daemon_jobs[ 'save_dirty_objects_infrequent' ] = job
job = self.CallRepeating( 5.0, 3600.0, self.SynchroniseAccounts )
@ -1471,7 +1479,7 @@ class Controller( HydrusController.HydrusController ):
self.column_list_manager.SetClean()
if self.network_engine.bandwidth_manager.IsDirty():
if self.network_engine.bandwidth_manager.IsDirty() or self.network_engine.bandwidth_manager.HasDirtyTrackerContainers():
self.frame_splash_status.SetSubtext( 'bandwidth manager' )
@ -1480,7 +1488,7 @@ class Controller( HydrusController.HydrusController ):
self.network_engine.bandwidth_manager.SetClean()
if self.network_engine.session_manager.IsDirty():
if self.network_engine.session_manager.IsDirty() or self.network_engine.session_manager.HasDirtySessionContainers():
self.frame_splash_status.SetSubtext( 'session manager' )
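The save routine above only pays the cost of serialising a manager when it reports unsaved changes, now including the per-container dirty test added in this commit. A sketch of that pattern with hypothetical names ('managers', 'save_callable', 'HasDirtyContainers' are illustrative, not hydrus's exact API):

```python
def save_dirty_objects( managers, save_callable ):
    """Save only the managers that report unsaved changes.

    Each manager is checked with IsDirty() and, where the manager has
    been broken into containers, an optional 'has dirty containers'
    test, before the (comparatively expensive) save is performed.
    """
    saved = []
    for manager in managers:
        is_dirty = manager.IsDirty()
        # container-based managers also expose a per-container dirty test;
        # fall back to False for managers that do not have one
        has_dirty_containers = getattr( manager, 'HasDirtyContainers', lambda: False )()
        if is_dirty or has_dirty_containers:
            save_callable( manager )
            manager.SetClean()
            saved.append( manager )
    return saved
```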

File diff suppressed because it is too large

View File

@ -1879,7 +1879,7 @@ class HydrusResourceClientAPIRestrictedManageCookiesSetCookies( HydrusResourceCl
HG.client_controller.pub( 'message', job_key )
HG.client_controller.network_engine.session_manager.SetDirty()
HG.client_controller.network_engine.session_manager.SetSessionDirty( network_context )
response_context = HydrusServerResources.ResponseContext( 200 )

View File

@ -1540,7 +1540,7 @@ class FrameGUI( ClientGUITopLevelWindows.MainFrameThatResizes ):
paths = []
frame = ClientGUITopLevelWindowsPanels.FrameThatTakesScrollablePanel( self, 'importing files' )
frame = ClientGUITopLevelWindowsPanels.FrameThatTakesScrollablePanel( self, 'review files to import' )
panel = ClientGUIScrolledPanelsReview.ReviewLocalFileImports( frame, paths )

View File

@ -1434,6 +1434,8 @@ class ReviewNetworkSessionPanel( ClientGUIScrolledPanels.ReviewPanel ):
self._SetCookie( name, value, domain, path, expires )
self._session_manager.SetSessionDirty( self._network_context )
self._Update()
@ -1486,6 +1488,8 @@ class ReviewNetworkSessionPanel( ClientGUIScrolledPanels.ReviewPanel ):
self._session.cookies.clear( domain, path, name )
self._session_manager.SetSessionDirty( self._network_context )
self._Update()
@ -1513,6 +1517,8 @@ class ReviewNetworkSessionPanel( ClientGUIScrolledPanels.ReviewPanel ):
self._SetCookie( name, value, domain, path, expires )
self._session_manager.SetSessionDirty( self._network_context )
else:
break

View File

@ -1,5 +1,6 @@
import collections
import threading
import typing
from hydrus.core import HydrusConstants as HC
from hydrus.core import HydrusData
@ -10,6 +11,48 @@ from hydrus.core import HydrusSerialisable
from hydrus.client import ClientConstants as CC
from hydrus.client.networking import ClientNetworkingContexts
class NetworkBandwidthManagerTrackerContainer( HydrusSerialisable.SerialisableBaseNamed ):
SERIALISABLE_TYPE = HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER_TRACKER_CONTAINER
SERIALISABLE_NAME = 'Bandwidth Manager Tracker Container'
SERIALISABLE_VERSION = 1
def __init__( self, name, network_context = None, bandwidth_tracker = None ):
if network_context is None:
network_context = ClientNetworkingContexts.GLOBAL_NETWORK_CONTEXT
if bandwidth_tracker is None:
bandwidth_tracker = HydrusNetworking.BandwidthTracker()
HydrusSerialisable.SerialisableBaseNamed.__init__( self, name )
self.network_context = network_context
self.bandwidth_tracker = bandwidth_tracker
def _GetSerialisableInfo( self ):
serialisable_network_context = self.network_context.GetSerialisableTuple()
serialisable_bandwidth_tracker = self.bandwidth_tracker.GetSerialisableTuple()
return ( serialisable_network_context, serialisable_bandwidth_tracker )
def _InitialiseFromSerialisableInfo( self, serialisable_info ):
( serialisable_network_context, serialisable_bandwidth_tracker ) = serialisable_info
self.network_context = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_network_context )
self.bandwidth_tracker = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_bandwidth_tracker )
HydrusSerialisable.SERIALISABLE_TYPES_TO_OBJECT_TYPES[ HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER_TRACKER_CONTAINER ] = NetworkBandwidthManagerTrackerContainer
class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
SERIALISABLE_TYPE = HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER
@ -30,7 +73,13 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
self._last_subscriptions_gallery_query_timestamps = collections.defaultdict( lambda: 0 )
self._last_watchers_query_timestamps = collections.defaultdict( lambda: 0 )
self._network_contexts_to_bandwidth_trackers = collections.defaultdict( HydrusNetworking.BandwidthTracker )
self._tracker_container_names_to_tracker_containers = {}
self._network_contexts_to_tracker_containers = {}
self._tracker_container_names = set()
self._dirty_tracker_container_names = set()
self._deletee_tracker_container_names = set()
self._network_contexts_to_bandwidth_rules = collections.defaultdict( HydrusNetworking.BandwidthRules )
for context_type in [ CC.NETWORK_CONTEXT_GLOBAL, CC.NETWORK_CONTEXT_HYDRUS, CC.NETWORK_CONTEXT_DOMAIN, CC.NETWORK_CONTEXT_DOWNLOADER_PAGE, CC.NETWORK_CONTEXT_SUBSCRIPTION, CC.NETWORK_CONTEXT_WATCHER_PAGE ]:
@ -45,7 +94,7 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
bandwidth_rules = self._GetRules( network_context )
bandwidth_tracker = self._network_contexts_to_bandwidth_trackers[ network_context ]
bandwidth_tracker = self._GetTracker( network_context )
if not bandwidth_rules.CanStartRequest( bandwidth_tracker ):
@ -68,24 +117,51 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
def _GetSerialisableInfo( self ):
# note this discards ephemeral network contexts, which have temporary identifiers that are generally invisible to the user
all_serialisable_trackers = [ ( network_context.GetSerialisableTuple(), tracker.GetSerialisableTuple() ) for ( network_context, tracker ) in list(self._network_contexts_to_bandwidth_trackers.items()) if not network_context.IsEphemeral() ]
all_tracker_container_names = sorted( self._tracker_container_names )
all_serialisable_rules = [ ( network_context.GetSerialisableTuple(), rules.GetSerialisableTuple() ) for ( network_context, rules ) in list(self._network_contexts_to_bandwidth_rules.items()) ]
return ( all_serialisable_trackers, all_serialisable_rules )
return ( all_tracker_container_names, all_serialisable_rules )
def _GetTracker( self, network_context: ClientNetworkingContexts.NetworkContext, making_it_dirty = False ):
if network_context not in self._network_contexts_to_tracker_containers:
bandwidth_tracker = HydrusNetworking.BandwidthTracker()
tracker_container_name = HydrusData.GenerateKey().hex()
tracker_container = NetworkBandwidthManagerTrackerContainer( tracker_container_name, network_context = network_context, bandwidth_tracker = bandwidth_tracker )
self._tracker_container_names_to_tracker_containers[ tracker_container_name ] = tracker_container
self._network_contexts_to_tracker_containers[ network_context ] = tracker_container
# note this discards ephemeral network contexts, which have temporary identifiers that are generally invisible to the user
if not network_context.IsEphemeral():
self._tracker_container_names.add( tracker_container_name )
self._dirty_tracker_container_names.add( tracker_container_name )
self._SetDirty()
tracker_container = self._network_contexts_to_tracker_containers[ network_context ]
if making_it_dirty and not network_context.IsEphemeral():
self._dirty_tracker_container_names.add( tracker_container.GetName() )
return tracker_container.bandwidth_tracker
def _InitialiseFromSerialisableInfo( self, serialisable_info ):
( all_serialisable_trackers, all_serialisable_rules ) = serialisable_info
( all_tracker_container_names, all_serialisable_rules ) = serialisable_info
for ( serialisable_network_context, serialisable_tracker ) in all_serialisable_trackers:
network_context = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_network_context )
tracker = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_tracker )
self._network_contexts_to_bandwidth_trackers[ network_context ] = tracker
self._tracker_container_names = set( all_tracker_container_names )
for ( serialisable_network_context, serialisable_rules ) in all_serialisable_rules:
@ -105,10 +181,10 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
for network_context in network_contexts:
self._network_contexts_to_bandwidth_trackers[ network_context ].ReportRequestUsed()
bandwidth_tracker = self._GetTracker( network_context, making_it_dirty = True )
bandwidth_tracker.ReportRequestUsed()
self._SetDirty()
def _SetDirty( self ):
@ -162,7 +238,7 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
bandwidth_rules = self._GetRules( network_context )
bandwidth_tracker = self._network_contexts_to_bandwidth_trackers[ network_context ]
bandwidth_tracker = self._GetTracker( network_context )
if not bandwidth_rules.CanContinueDownload( bandwidth_tracker ):
@ -182,7 +258,7 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
bandwidth_rules = self._GetRules( network_context )
bandwidth_tracker = self._network_contexts_to_bandwidth_trackers[ network_context ]
bandwidth_tracker = self._GetTracker( network_context )
if not bandwidth_rules.CanDoWork( bandwidth_tracker, expected_requests = expected_requests, expected_bytes = expected_bytes, threshold = threshold ):
@ -228,22 +304,56 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
for network_context in network_contexts:
if network_context in self._network_contexts_to_bandwidth_trackers:
if network_context in self._network_contexts_to_tracker_containers:
del self._network_contexts_to_bandwidth_trackers[ network_context ]
tracker_container = self._network_contexts_to_tracker_containers[ network_context ]
if network_context == ClientNetworkingContexts.GLOBAL_NETWORK_CONTEXT:
del self._network_contexts_to_tracker_containers[ network_context ]
tracker_container_name = tracker_container.GetName()
if tracker_container_name in self._tracker_container_names_to_tracker_containers:
# just to reset it, so we have a 0 global context at all times
self._network_contexts_to_bandwidth_trackers[ ClientNetworkingContexts.GLOBAL_NETWORK_CONTEXT ] = HydrusNetworking.BandwidthTracker()
del self._tracker_container_names_to_tracker_containers[ tracker_container_name ]
self._tracker_container_names.discard( tracker_container_name )
self._deletee_tracker_container_names.add( tracker_container_name )
if network_context == ClientNetworkingContexts.GLOBAL_NETWORK_CONTEXT:
# just to reset it and have it in the system, so we have a 0 global context at all times
self._GetTracker( ClientNetworkingContexts.GLOBAL_NETWORK_CONTEXT )
self._SetDirty()
def GetBandwidthStringsAndGaugeTuples( self, network_context ):
with self._lock:
bandwidth_rules = self._GetRules( network_context )
bandwidth_tracker = self._GetTracker( network_context )
return bandwidth_rules.GetBandwidthStringsAndGaugeTuples( bandwidth_tracker )
def GetCurrentMonthSummary( self, network_context ):
with self._lock:
bandwidth_tracker = self._GetTracker( network_context )
return bandwidth_tracker.GetCurrentMonthSummary()
def GetDefaultRules( self ):
with self._lock:
@ -262,25 +372,19 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
def GetCurrentMonthSummary( self, network_context ):
def GetDeleteeTrackerNames( self ):
with self._lock:
bandwidth_tracker = self._network_contexts_to_bandwidth_trackers[ network_context ]
return bandwidth_tracker.GetCurrentMonthSummary()
return set( self._deletee_tracker_container_names )
def GetBandwidthStringsAndGaugeTuples( self, network_context ):
def GetDirtyTrackerContainers( self ):
with self._lock:
bandwidth_rules = self._GetRules( network_context )
bandwidth_tracker = self._network_contexts_to_bandwidth_trackers[ network_context ]
return bandwidth_rules.GetBandwidthStringsAndGaugeTuples( bandwidth_tracker )
return [ self._tracker_container_names_to_tracker_containers[ tracker_container_name ] for tracker_container_name in self._dirty_tracker_container_names ]
@ -290,13 +394,17 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
result = set()
for ( network_context, bandwidth_tracker ) in list(self._network_contexts_to_bandwidth_trackers.items()):
for tracker_container in self._network_contexts_to_tracker_containers.values():
network_context = tracker_container.network_context
if network_context.IsDefault() or network_context.IsEphemeral():
continue
bandwidth_tracker = tracker_container.bandwidth_tracker
if network_context != ClientNetworkingContexts.GLOBAL_NETWORK_CONTEXT and history_time_delta_threshold is not None:
if bandwidth_tracker.GetUsage( HC.BANDWIDTH_TYPE_REQUESTS, history_time_delta_threshold ) == 0:
@ -324,9 +432,9 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
with self._lock:
if network_context in self._network_contexts_to_bandwidth_trackers:
if network_context in self._network_contexts_to_tracker_containers:
return self._network_contexts_to_bandwidth_trackers[ network_context ]
return self._GetTracker( network_context )
else:
@ -345,7 +453,7 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
bandwidth_rules = self._GetRules( network_context )
bandwidth_tracker = self._network_contexts_to_bandwidth_trackers[ network_context ]
bandwidth_tracker = self._GetTracker( network_context )
estimates.append( ( bandwidth_rules.GetWaitingEstimate( bandwidth_tracker ), network_context ) )
@ -363,6 +471,14 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
def HasDirtyTrackerContainers( self ):
with self._lock:
return len( self._dirty_tracker_container_names ) > 0 or len( self._deletee_tracker_container_names ) > 0
def HasRules( self, network_context ):
with self._lock:
@ -385,10 +501,10 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
for network_context in network_contexts:
self._network_contexts_to_bandwidth_trackers[ network_context ].ReportDataUsed( num_bytes )
bandwidth_tracker = self._GetTracker( network_context, making_it_dirty = True )
bandwidth_tracker.ReportDataUsed( num_bytes )
self._SetDirty()
@ -405,6 +521,16 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
with self._lock:
self._dirty = False
self._dirty_tracker_container_names = set()
self._deletee_tracker_container_names = set()
def SetDirty( self ):
with self._lock:
self._SetDirty()
@ -412,7 +538,7 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
with self._lock:
if len( bandwidth_rules.GetRules() ) == 0:
if len( bandwidth_rules.GetRules() ) == 0 and not network_context.IsDefault():
if network_context in self._network_contexts_to_bandwidth_rules:
@ -428,6 +554,38 @@ class NetworkBandwidthManager( HydrusSerialisable.SerialisableBase ):
def SetTrackerContainers( self, tracker_containers: typing.Collection[ NetworkBandwidthManagerTrackerContainer ], set_all_trackers_dirty = False ):
with self._lock:
self._tracker_container_names_to_tracker_containers = {}
self._network_contexts_to_tracker_containers = {}
self._tracker_container_names = set()
self._dirty_tracker_container_names = set()
self._deletee_tracker_container_names = set()
for tracker_container in tracker_containers:
tracker_container_name = tracker_container.GetName()
network_context = tracker_container.network_context
self._tracker_container_names_to_tracker_containers[ tracker_container_name ] = tracker_container
self._network_contexts_to_tracker_containers[ network_context ] = tracker_container
if not network_context.IsEphemeral():
self._tracker_container_names.add( tracker_container_name )
if set_all_trackers_dirty:
self._dirty_tracker_container_names.add( tracker_container_name )
def TryToConsumeAGalleryToken( self, second_level_domain, query_type ):
with self._lock:

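The _GetTracker changes above create a named container on demand per network context, flag it dirty so only changed containers need saving, and keep ephemeral contexts out of the persisted name list. A simplified sketch of that bookkeeping (dict-based stand-ins for the real container objects; names are illustrative):

```python
import os

class TrackerContainerIndex:
    """Sketch of the container bookkeeping in NetworkBandwidthManager.

    Containers are created lazily per network context with random
    64-char hex names. Non-ephemeral containers are registered for
    persistence and marked dirty on creation or mutation, so the save
    routine only has to write the containers that actually changed.
    """

    def __init__( self ):
        self._contexts_to_containers = {}
        self._container_names = set()        # names that get persisted
        self._dirty_container_names = set()  # names that need saving

    def get_tracker( self, context, ephemeral = False, making_it_dirty = False ):
        if context not in self._contexts_to_containers:
            name = os.urandom( 32 ).hex()  # mirrors GenerateKey().hex()
            container = { 'name': name, 'tracker': [] }
            self._contexts_to_containers[ context ] = container
            if not ephemeral:
                self._container_names.add( name )
                self._dirty_container_names.add( name )
        container = self._contexts_to_containers[ context ]
        if making_it_dirty and not ephemeral:
            self._dirty_container_names.add( container[ 'name' ] )
        return container[ 'tracker' ]
```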
View File

@ -0,0 +1,96 @@
import collections
import threading
from hydrus.core import HydrusConstants as HC
from hydrus.core import HydrusData
from hydrus.core import HydrusGlobals as HG
from hydrus.core import HydrusNetworking
from hydrus.core import HydrusSerialisable
from hydrus.client import ClientConstants as CC
from hydrus.client.networking import ClientNetworkingBandwidth
class NetworkBandwidthManagerLegacy( HydrusSerialisable.SerialisableBase ):
SERIALISABLE_TYPE = HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER_LEGACY
SERIALISABLE_NAME = 'Legacy Bandwidth Manager'
SERIALISABLE_VERSION = 1
def __init__( self ):
HydrusSerialisable.SerialisableBase.__init__( self )
self._network_contexts_to_bandwidth_trackers = collections.defaultdict( HydrusNetworking.BandwidthTracker )
self._network_contexts_to_bandwidth_rules = collections.defaultdict( HydrusNetworking.BandwidthRules )
def _GetSerialisableInfo( self ):
# note this discards ephemeral network contexts, which have temporary identifiers that are generally invisible to the user
all_serialisable_trackers = [ ( network_context.GetSerialisableTuple(), tracker.GetSerialisableTuple() ) for ( network_context, tracker ) in list(self._network_contexts_to_bandwidth_trackers.items()) if not network_context.IsEphemeral() ]
all_serialisable_rules = [ ( network_context.GetSerialisableTuple(), rules.GetSerialisableTuple() ) for ( network_context, rules ) in list(self._network_contexts_to_bandwidth_rules.items()) ]
return ( all_serialisable_trackers, all_serialisable_rules )
def _InitialiseFromSerialisableInfo( self, serialisable_info ):
( all_serialisable_trackers, all_serialisable_rules ) = serialisable_info
for ( serialisable_network_context, serialisable_tracker ) in all_serialisable_trackers:
network_context = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_network_context )
tracker = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_tracker )
self._network_contexts_to_bandwidth_trackers[ network_context ] = tracker
for ( serialisable_network_context, serialisable_rules ) in all_serialisable_rules:
network_context = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_network_context )
rules = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_rules )
if network_context.context_type == CC.NETWORK_CONTEXT_DOWNLOADER: # no longer use this
continue
self._network_contexts_to_bandwidth_rules[ network_context ] = rules
def GetData( self ):
return ( self._network_contexts_to_bandwidth_trackers, self._network_contexts_to_bandwidth_rules )
HydrusSerialisable.SERIALISABLE_TYPES_TO_OBJECT_TYPES[ HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER_LEGACY ] = NetworkBandwidthManagerLegacy
def ConvertLegacyToNewBandwidth( legacy_bandwidth_manager: NetworkBandwidthManagerLegacy ):
tracker_containers = []
( network_contexts_to_bandwidth_trackers, network_contexts_to_bandwidth_rules ) = legacy_bandwidth_manager.GetData()
for ( network_context, bandwidth_tracker ) in network_contexts_to_bandwidth_trackers.items():
tracker_container_name = HydrusData.GenerateKey().hex()
tracker_container = ClientNetworkingBandwidth.NetworkBandwidthManagerTrackerContainer( tracker_container_name, network_context = network_context, bandwidth_tracker = bandwidth_tracker )
tracker_containers.append( tracker_container )
bandwidth_manager = ClientNetworkingBandwidth.NetworkBandwidthManager()
for ( network_context, bandwidth_rules ) in network_contexts_to_bandwidth_rules.items():
bandwidth_manager.SetRules( network_context, bandwidth_rules )
bandwidth_manager.SetTrackerContainers( tracker_containers, set_all_trackers_dirty = True )
bandwidth_manager.SetDirty()
return bandwidth_manager
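The conversion above splits the legacy monolithic { network_context: tracker } mapping into one named container per tracker, so each can be saved and deleted individually. A minimal sketch of that split, using dicts in place of the real container class:

```python
import os

def convert_legacy_trackers( contexts_to_trackers ):
    """Sketch of ConvertLegacyToNewBandwidth's container split.

    Each tracker in the monolithic mapping becomes its own named
    container with a fresh random hex name (the real code uses
    HydrusData.GenerateKey().hex() the same way).
    """
    containers = []
    for ( context, tracker ) in contexts_to_trackers.items():
        name = os.urandom( 32 ).hex()
        containers.append( { 'name': name, 'network_context': context, 'tracker': tracker } )
    return containers
```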

View File

@ -635,7 +635,7 @@ class NetworkJob( object ):
ClientNetworkingDomain.AddCookieToSession( session, name, value, domain, path, expires, secure = secure, rest = rest )
self.engine.session_manager.SetDirty()
self.engine.session_manager.SetSessionDirty( snc )
except Exception as e:
@ -1326,6 +1326,13 @@ class NetworkJob( object ):
finally:
with self._lock:
snc = self._session_network_context
self.engine.session_manager.SetSessionDirty( snc )
if response is not None:
# if full data was not read, the response will hang around in connection pool longer than we want

View File

@ -1,6 +1,7 @@
import pickle
import requests
import threading
import typing
from hydrus.core import HydrusData
from hydrus.core import HydrusSerialisable
@ -21,6 +22,66 @@ except:
SOCKS_PROXY_OK = False
class NetworkSessionManagerSessionContainer( HydrusSerialisable.SerialisableBaseNamed ):
SERIALISABLE_TYPE = HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER_SESSION_CONTAINER
SERIALISABLE_NAME = 'Session Manager Session Container'
SERIALISABLE_VERSION = 1
def __init__( self, name, network_context = None, session = None ):
if network_context is None:
network_context = ClientNetworkingContexts.GLOBAL_NETWORK_CONTEXT
HydrusSerialisable.SerialisableBaseNamed.__init__( self, name )
self.network_context = network_context
self.session = session
def _InitialiseEmptySession( self ):
self.session = requests.Session()
if self.network_context.context_type == CC.NETWORK_CONTEXT_HYDRUS:
self.session.verify = False
def _GetSerialisableInfo( self ):
serialisable_network_context = self.network_context.GetSerialisableTuple()
pickled_session_hex = pickle.dumps( self.session ).hex()
return ( serialisable_network_context, pickled_session_hex )
def _InitialiseFromSerialisableInfo( self, serialisable_info ):
( serialisable_network_context, pickled_session_hex ) = serialisable_info
self.network_context = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_network_context )
try:
self.session = pickle.loads( bytes.fromhex( pickled_session_hex ) )
except:
# a new version of requests messed this up lad, so reset
self._InitialiseEmptySession()
self.session.cookies.clear_session_cookies()
HydrusSerialisable.SERIALISABLE_TYPES_TO_OBJECT_TYPES[ HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER_SESSION_CONTAINER ] = NetworkSessionManagerSessionContainer
class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
SERIALISABLE_TYPE = HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER
@ -36,18 +97,23 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
self.engine = None
self._dirty = False
self._dirty_session_container_names = set()
self._deletee_session_container_names = set()
self._lock = threading.Lock()
self._network_contexts_to_sessions = {}
self._session_container_names = set()
self._session_container_names_to_session_containers = {}
self._network_contexts_to_session_containers = {}
self._network_contexts_to_session_timeouts = {}
self._proxies_dict = {}
self._Reinitialise()
self._ReinitialiseProxies()
HG.client_controller.sub( self, 'Reinitialise', 'notify_new_options' )
HG.client_controller.sub( self, 'ReinitialiseProxies', 'notify_new_options' )
def _CleanSessionCookies( self, network_context, session ):
@@ -67,23 +133,9 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
session.cookies.clear_expired_cookies()
def _GenerateSession( self, network_context ):
session = requests.Session()
if network_context.context_type == CC.NETWORK_CONTEXT_HYDRUS:
session.verify = False
return session
def _GetSerialisableInfo( self ):
serialisable_network_contexts_to_sessions = [ ( network_context.GetSerialisableTuple(), pickle.dumps( session ).hex() ) for ( network_context, session ) in list(self._network_contexts_to_sessions.items()) ]
return serialisable_network_contexts_to_sessions
return sorted( self._session_container_names )
def _GetSessionNetworkContext( self, network_context ):
@@ -101,30 +153,32 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
def _InitialiseFromSerialisableInfo( self, serialisable_info ):
serialisable_network_contexts_to_sessions = serialisable_info
for ( serialisable_network_context, pickled_session_hex ) in serialisable_network_contexts_to_sessions:
network_context = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_network_context )
try:
session = pickle.loads( bytes.fromhex( pickled_session_hex ) )
except:
# a newer version of requests pickles sessions in an incompatible format, so skip this one
continue
session.cookies.clear_session_cookies()
self._network_contexts_to_sessions[ network_context ] = session
self._session_container_names = set( serialisable_info )
def _Reinitialise( self ):
def _InitialiseSessionContainer( self, network_context ):
session = requests.Session()
if network_context.context_type == CC.NETWORK_CONTEXT_HYDRUS:
session.verify = False
session_container_name = HydrusData.GenerateKey().hex()
session_container = NetworkSessionManagerSessionContainer( session_container_name, network_context = network_context, session = session )
self._session_container_names_to_session_containers[ session_container_name ] = session_container
self._network_contexts_to_session_containers[ network_context ] = session_container
self._session_container_names.add( session_container_name )
self._dirty_session_container_names.add( session_container_name )
self._SetDirty()
def _ReinitialiseProxies( self ):
self._proxies_dict = {}
@@ -159,20 +213,53 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
network_context = self._GetSessionNetworkContext( network_context )
if network_context in self._network_contexts_to_sessions:
if network_context in self._network_contexts_to_session_timeouts:
del self._network_contexts_to_sessions[ network_context ]
del self._network_contexts_to_session_timeouts[ network_context ]
if network_context in self._network_contexts_to_session_containers:
session_container = self._network_contexts_to_session_containers[ network_context ]
del self._network_contexts_to_session_containers[ network_context ]
session_container_name = session_container.GetName()
if session_container_name in self._session_container_names_to_session_containers:
del self._session_container_names_to_session_containers[ session_container_name ]
self._session_container_names.discard( session_container_name )
self._deletee_session_container_names.add( session_container_name )
self._SetDirty()
def GetDeleteeSessionNames( self ):
with self._lock:
return set( self._deletee_session_container_names )
def GetDirtySessionContainers( self ):
with self._lock:
return [ self._session_container_names_to_session_containers[ session_container_name ] for session_container_name in self._dirty_session_container_names ]
def GetNetworkContexts( self ):
with self._lock:
return list(self._network_contexts_to_sessions.keys())
return list( self._network_contexts_to_session_containers.keys() )
@@ -182,12 +269,12 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
network_context = self._GetSessionNetworkContext( network_context )
if network_context not in self._network_contexts_to_sessions:
if network_context not in self._network_contexts_to_session_containers:
self._network_contexts_to_sessions[ network_context ] = self._GenerateSession( network_context )
self._InitialiseSessionContainer( network_context )
session = self._network_contexts_to_sessions[ network_context ]
session = self._network_contexts_to_session_containers[ network_context ].session
if session.proxies != self._proxies_dict:
@@ -208,10 +295,6 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
session.verify = False
#
self._SetDirty()
return session
@@ -223,6 +306,14 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
return self.GetSession( network_context )
def HasDirtySessionContainers( self ):
with self._lock:
return len( self._dirty_session_container_names ) > 0 or len( self._deletee_session_container_names ) > 0
def IsDirty( self ):
with self._lock:
@@ -231,11 +322,11 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
def Reinitialise( self ):
def ReinitialiseProxies( self ):
with self._lock:
self._Reinitialise()
self._ReinitialiseProxies()
@@ -244,6 +335,8 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
with self._lock:
self._dirty = False
self._dirty_session_container_names = set()
self._deletee_session_container_names = set()
@@ -251,7 +344,48 @@ class NetworkSessionManager( HydrusSerialisable.SerialisableBase ):
with self._lock:
self._dirty = True
self._SetDirty()
def SetSessionContainers( self, session_containers: typing.Collection[ NetworkSessionManagerSessionContainer ], set_all_sessions_dirty = False ):
with self._lock:
self._session_container_names_to_session_containers = {}
self._network_contexts_to_session_containers = {}
self._session_container_names = set()
self._dirty_session_container_names = set()
self._deletee_session_container_names = set()
for session_container in session_containers:
session_container_name = session_container.GetName()
self._session_container_names_to_session_containers[ session_container_name ] = session_container
self._network_contexts_to_session_containers[ session_container.network_context ] = session_container
self._session_container_names.add( session_container_name )
if set_all_sessions_dirty:
self._dirty_session_container_names.add( session_container_name )
def SetSessionDirty( self, network_context: ClientNetworkingContexts.NetworkContext ):
with self._lock:
network_context = self._GetSessionNetworkContext( network_context )
if network_context in self._network_contexts_to_session_containers:
self._dirty_session_container_names.add( self._network_contexts_to_session_containers[ network_context ].GetName() )


@@ -0,0 +1,84 @@
import pickle
from hydrus.core import HydrusData
from hydrus.core import HydrusSerialisable
from hydrus.client.networking import ClientNetworkingSessions
class NetworkSessionManagerLegacy( HydrusSerialisable.SerialisableBase ):
SERIALISABLE_TYPE = HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER_LEGACY
SERIALISABLE_NAME = 'Legacy Session Manager'
SERIALISABLE_VERSION = 1
SESSION_TIMEOUT = 60 * 60
def __init__( self ):
HydrusSerialisable.SerialisableBase.__init__( self )
self._network_contexts_to_sessions = {}
def _GetSerialisableInfo( self ):
serialisable_network_contexts_to_sessions = [ ( network_context.GetSerialisableTuple(), pickle.dumps( session ).hex() ) for ( network_context, session ) in list(self._network_contexts_to_sessions.items()) ]
return serialisable_network_contexts_to_sessions
def _InitialiseFromSerialisableInfo( self, serialisable_info ):
serialisable_network_contexts_to_sessions = serialisable_info
for ( serialisable_network_context, pickled_session_hex ) in serialisable_network_contexts_to_sessions:
network_context = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_network_context )
try:
session = pickle.loads( bytes.fromhex( pickled_session_hex ) )
except:
# a newer version of requests pickles sessions in an incompatible format, so skip this one
continue
session.cookies.clear_session_cookies()
self._network_contexts_to_sessions[ network_context ] = session
def GetData( self ):
return self._network_contexts_to_sessions
HydrusSerialisable.SERIALISABLE_TYPES_TO_OBJECT_TYPES[ HydrusSerialisable.SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER_LEGACY ] = NetworkSessionManagerLegacy
def ConvertLegacyToNewSessions( legacy_session_manager: NetworkSessionManagerLegacy ):
session_containers = []
network_contexts_to_sessions = legacy_session_manager.GetData()
for ( network_context, session ) in network_contexts_to_sessions.items():
session_container_name = HydrusData.GenerateKey().hex()
session_container = ClientNetworkingSessions.NetworkSessionManagerSessionContainer( session_container_name, network_context = network_context, session = session )
session_containers.append( session_container )
session_manager = ClientNetworkingSessions.NetworkSessionManager()
session_manager.SetSessionContainers( session_containers, set_all_sessions_dirty = True )
session_manager.SetDirty()
return session_manager


@@ -70,7 +70,7 @@ options = {}
# Misc
NETWORK_VERSION = 19
SOFTWARE_VERSION = 424
SOFTWARE_VERSION = 425
CLIENT_API_VERSION = 15
SERVER_THUMBNAIL_DIMENSIONS = ( 200, 200 )


@@ -221,8 +221,6 @@ class HydrusDB( object ):
self._InitDB()
self._RepairDB()
( version, ) = self._c.execute( 'SELECT version FROM version;' ).fetchone()
if version > HC.SOFTWARE_VERSION:
@@ -240,6 +238,8 @@
raise Exception( 'Your current database version of hydrus ' + str( version ) + ' is too old for this software version ' + str( HC.SOFTWARE_VERSION ) + ' to update. Please try updating with version ' + str( version + 45 ) + ' or earlier first.' )
self._RepairDB()
while version < HC.SOFTWARE_VERSION:
time.sleep( self.UPDATE_WAIT )


@@ -624,9 +624,12 @@ class BandwidthTracker( HydrusSerialisable.SerialisableBase ):
search_time_delta = time_delta + window
since = HydrusData.GetNow() - search_time_delta
now = HydrusData.GetNow()
since = now - search_time_delta
return sum( ( value for ( timestamp, value ) in list(counter.items()) if timestamp >= since ) )
# we test 'now' as upper bound because a lad once had a motherboard reset and lost his clock time, ending up with a lump of data recorded several decades in the future
# I'm pretty sure those future records ended up in the per-second counter, so all his short-duration tests were failing
return sum( ( value for ( timestamp, value ) in counter.items() if since <= timestamp <= now ) )
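The clamped sum above can be illustrated with fixed numbers (the counter contents here are hypothetical):

```python
# hypothetical usage counter: timestamp -> bytes recorded in that second
now = 1_000_000
search_time_delta = 60
since = now - search_time_delta

counter = {
    now - 10 : 100,            # inside the window: counted
    now - 90 : 500,            # older than the window: ignored
    now + 86400 * 365 : 999,   # bogus future record from a bad clock: ignored by the 'now' upper bound
}

# same shape as the return statement above: only since <= timestamp <= now survives
usage = sum( value for ( timestamp, value ) in counter.items() if since <= timestamp <= now )
```

Without the upper bound, the future record would inflate every short-window bandwidth test until it aged out, which it never would.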


@@ -62,8 +62,8 @@ SERIALISABLE_TYPE_SHORTCUT = 41
SERIALISABLE_TYPE_APPLICATION_COMMAND = 42
SERIALISABLE_TYPE_DUPLICATE_ACTION_OPTIONS = 43
SERIALISABLE_TYPE_TAG_FILTER = 44
SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER = 45
SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER = 46
SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER_LEGACY = 45
SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER_LEGACY = 46
SERIALISABLE_TYPE_NETWORK_CONTEXT = 47
SERIALISABLE_TYPE_NETWORK_LOGIN_MANAGER = 48
SERIALISABLE_TYPE_MEDIA_SORT = 49
@@ -111,6 +111,10 @@ SERIALISABLE_TYPE_SUBSCRIPTION_CONTAINER = 90
SERIALISABLE_TYPE_COLUMN_LIST_STATUS = 91
SERIALISABLE_TYPE_COLUMN_LIST_MANAGER = 92
SERIALISABLE_TYPE_NUMBER_TEST = 93
SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER = 94
SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER = 95
SERIALISABLE_TYPE_NETWORK_SESSION_MANAGER_SESSION_CONTAINER = 96
SERIALISABLE_TYPE_NETWORK_BANDWIDTH_MANAGER_TRACKER_CONTAINER = 97
SERIALISABLE_TYPES_TO_OBJECT_TYPES = {}
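The registry dict above maps each numeric type id to its class so deserialisation can route a serialised tuple back to the right constructor. A minimal sketch of that pattern (the class name and the no-argument constructor here are illustrative, not the real hydrus definitions):

```python
# type-id -> class registry, populated as each serialisable class is defined
SERIALISABLE_TYPES_TO_OBJECT_TYPES = {}

class SerialisableBase:
    
    SERIALISABLE_TYPE = None
    

class ExampleContainer( SerialisableBase ):
    
    SERIALISABLE_TYPE = 96
    

SERIALISABLE_TYPES_TO_OBJECT_TYPES[ ExampleContainer.SERIALISABLE_TYPE ] = ExampleContainer

def create_from_type_id( serialisable_type ):
    
    # look the class up by its registered id and instantiate an empty object
    # ready to be filled in from the serialised info
    return SERIALISABLE_TYPES_TO_OBJECT_TYPES[ serialisable_type ]()
```

This is why the ids above must never be reused: old serialised data carries the id, and the registry is the only map back to a class.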


@@ -63,6 +63,34 @@ hash_str_to_type_lookup[ 'sha1' ] = HASH_TYPE_SHA1
hash_str_to_type_lookup[ 'sha256' ] = HASH_TYPE_SHA256
hash_str_to_type_lookup[ 'sha512' ] = HASH_TYPE_SHA512
def ReadLargeIdQueryInSeparateChunks( cursor, select_statement, chunk_size ):
table_name = 'tempbigread' + os.urandom( 32 ).hex()
cursor.execute( 'CREATE TEMPORARY TABLE ' + table_name + ' ( job_id INTEGER PRIMARY KEY AUTOINCREMENT, temp_id INTEGER );' )
cursor.execute( 'INSERT INTO ' + table_name + ' ( temp_id ) ' + select_statement ) # given statement should end in semicolon, so we are good
num_to_do = cursor.rowcount
if num_to_do is None or num_to_do == -1:
num_to_do = 0
i = 0
while i < num_to_do:
# job_id is 1-based thanks to AUTOINCREMENT, so fetch rows i + 1 to i + chunk_size
chunk = [ temp_id for ( temp_id, ) in cursor.execute( 'SELECT temp_id FROM ' + table_name + ' WHERE job_id BETWEEN ? AND ?;', ( i + 1, i + chunk_size ) ) ]
yield chunk
i += chunk_size
cursor.execute( 'DROP TABLE ' + table_name + ';' )
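A self-contained sketch of this chunked-read idea against a toy table. The helper is re-sketched inline so the snippet runs on its own, counting the spooled rows with `COUNT( * )` rather than relying on `cursor.rowcount`, and paging over the 1-based `job_id` key:

```python
import os
import sqlite3

def read_in_chunks( cursor, select_statement, chunk_size ):
    
    # spool the ids into a temp table with a contiguous 1-based key, then page through it
    table_name = 'tempbigread' + os.urandom( 16 ).hex()
    
    cursor.execute( 'CREATE TEMPORARY TABLE ' + table_name + ' ( job_id INTEGER PRIMARY KEY AUTOINCREMENT, temp_id INTEGER );' )
    cursor.execute( 'INSERT INTO ' + table_name + ' ( temp_id ) ' + select_statement )
    
    ( num_to_do, ) = cursor.execute( 'SELECT COUNT( * ) FROM ' + table_name + ';' ).fetchone()
    
    i = 0
    
    while i < num_to_do:
        
        # job_id is 1-based, so this fetches rows i + 1 to i + chunk_size
        yield [ temp_id for ( temp_id, ) in cursor.execute( 'SELECT temp_id FROM ' + table_name + ' WHERE job_id BETWEEN ? AND ?;', ( i + 1, i + chunk_size ) ) ]
        
        i += chunk_size
        
    
    cursor.execute( 'DROP TABLE ' + table_name + ';' )
    

db = sqlite3.connect( ':memory:' )
c = db.cursor()

c.execute( 'CREATE TABLE hashes ( hash_id INTEGER PRIMARY KEY );' )
c.executemany( 'INSERT INTO hashes ( hash_id ) VALUES ( ? );', [ ( n, ) for n in range( 1, 11 ) ] )

# ten ids come back as chunks of sizes 4, 4, and 2
chunks = list( read_in_chunks( c, 'SELECT hash_id FROM hashes;', 4 ) )
```

Spooling into a temp table means the big source query runs exactly once, and each small `BETWEEN` read finishes quickly, so other work on the connection is never starved by one monster cursor.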
class HydrusTagArchive( object ):
def __init__( self, path ):
@@ -108,13 +136,23 @@ class HydrusTagArchive( object ):
self._c = self._db.cursor()
def _GetHashes( self, tag_id ):
result = { hash for ( hash, ) in self._c.execute( 'SELECT hash FROM mappings NATURAL JOIN hashes WHERE tag_id = ?;', ( tag_id, ) ) }
return result
def _GetHashId( self, hash, read_only = False ):
result = self._c.execute( 'SELECT hash_id FROM hashes WHERE hash = ?;', ( sqlite3.Binary( hash ), ) ).fetchone()
if result is None:
if read_only: raise Exception()
if read_only:
raise Exception()
self._c.execute( 'INSERT INTO hashes ( hash ) VALUES ( ? );', ( sqlite3.Binary( hash ), ) )
@@ -128,7 +166,32 @@
return hash_id
def _GetTagId( self, tag ):
def _GetTags( self, hash_id ):
result = { tag for ( tag, ) in self._c.execute( 'SELECT tag FROM mappings NATURAL JOIN tags WHERE hash_id = ?;', ( hash_id, ) ) }
return result
def _GetTagId( self, tag, read_only = False ):
result = self._c.execute( 'SELECT tag_id FROM tags WHERE tag = ?;', ( tag, ) ).fetchone()
if result is None:
if read_only:
raise Exception()
self._c.execute( 'INSERT INTO tags ( tag ) VALUES ( ? );', ( tag, ) )
tag_id = self._c.lastrowid
else:
( tag_id, ) = result
if ':' in tag:
@@ -142,19 +205,6 @@ class HydrusTagArchive( object ):
result = self._c.execute( 'SELECT tag_id FROM tags WHERE tag = ?;', ( tag, ) ).fetchone()
if result is None:
self._c.execute( 'INSERT INTO tags ( tag ) VALUES ( ? );', ( tag, ) )
tag_id = self._c.lastrowid
else:
( tag_id, ) = result
return tag_id
@@ -220,6 +270,20 @@ class HydrusTagArchive( object ):
self._c.execute( 'DELETE FROM namespaces;' )
def GetHashes( self, tag ):
try:
tag_id = self._GetTagId( tag, read_only = True )
except:
return set()
return self._GetHashes( tag_id )
def GetHashType( self ):
result = self._c.execute( 'SELECT hash_type FROM hash_type;' ).fetchone()
@@ -290,9 +354,7 @@ class HydrusTagArchive( object ):
return set()
result = { tag for ( tag, ) in self._c.execute( 'SELECT DISTINCT tag FROM mappings NATURAL JOIN tags WHERE hash_id = ?;', ( hash_id, ) ) }
return result
return self._GetTags( hash_id )
def HasHash( self, hash ):
@@ -318,22 +380,44 @@ class HydrusTagArchive( object ):
def IterateHashes( self ):
for ( hash, ) in self._c.execute( 'SELECT hash FROM hashes;' ): yield hash
for ( hash, ) in self._c.execute( 'SELECT hash FROM hashes;' ):
yield hash
def IterateMappings( self ):
hash_ids = [ hash_id for ( hash_id, ) in self._c.execute( 'SELECT hash_id FROM hashes;' ) ]
for hash_id in hash_ids:
( hash, ) = self._c.execute( 'SELECT hash FROM hashes WHERE hash_id = ?;', ( hash_id, ) ).fetchone()
tags = self.GetTags( hash )
if len( tags ) > 0:
yield ( hash, tags )
for group_of_hash_ids in ReadLargeIdQueryInSeparateChunks( self._c, 'SELECT hash_id FROM hashes;', 256 ):
for hash_id in group_of_hash_ids:
tags = self._GetTags( hash_id )
if len( tags ) > 0:
( hash, ) = self._c.execute( 'SELECT hash FROM hashes WHERE hash_id = ?;', ( hash_id, ) ).fetchone()
yield ( hash, tags )
def IterateMappingsTagFirst( self ):
for group_of_tag_ids in ReadLargeIdQueryInSeparateChunks( self._c, 'SELECT tag_id FROM tags;', 256 ):
for tag_id in group_of_tag_ids:
hashes = self._GetHashes( tag_id )
if len( hashes ) > 0:
( tag, ) = self._c.execute( 'SELECT tag FROM tags WHERE tag_id = ?;', ( tag_id, ) ).fetchone()
yield ( tag, hashes )
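The new tag-first iterator pairs each tag with its full hash set. A compact, self-contained sketch of the same query shape against a toy schema (the table contents here are made up):

```python
import sqlite3

db = sqlite3.connect( ':memory:' )
c = db.cursor()

# same three-table shape as the tag archive: hashes, tags, and a mapping table
c.execute( 'CREATE TABLE hashes ( hash_id INTEGER PRIMARY KEY, hash BLOB );' )
c.execute( 'CREATE TABLE tags ( tag_id INTEGER PRIMARY KEY, tag TEXT );' )
c.execute( 'CREATE TABLE mappings ( hash_id INTEGER, tag_id INTEGER, PRIMARY KEY ( hash_id, tag_id ) );' )

c.executemany( 'INSERT INTO hashes VALUES ( ?, ? );', [ ( 1, b'\x01' ), ( 2, b'\x02' ) ] )
c.executemany( 'INSERT INTO tags VALUES ( ?, ? );', [ ( 1, 'samus aran' ), ( 2, 'blue eyes' ) ] )
c.executemany( 'INSERT INTO mappings VALUES ( ?, ? );', [ ( 1, 1 ), ( 1, 2 ), ( 2, 1 ) ] )

def iterate_mappings_tag_first( c ):
    
    # pull the tag ids up front so the inner queries can reuse the same cursor
    tag_ids = [ tag_id for ( tag_id, ) in c.execute( 'SELECT tag_id FROM tags;' ) ]
    
    for tag_id in tag_ids:
        
        hashes = { hash for ( hash, ) in c.execute( 'SELECT hash FROM mappings NATURAL JOIN hashes WHERE tag_id = ?;', ( tag_id, ) ) }
        
        if len( hashes ) > 0:
            
            # only look the tag text up once we know it actually maps to something
            ( tag, ) = c.execute( 'SELECT tag FROM tags WHERE tag_id = ?;', ( tag_id, ) ).fetchone()
            
            yield ( tag, hashes )
            

tag_to_hashes = dict( iterate_mappings_tag_first( c ) )
```

As in `_GetTagId` above, fetching the human-readable value only after the emptiness check saves a lookup per unmapped row.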


@@ -634,6 +634,23 @@ class TestClientDBTags( unittest.TestCase ):
def _test_ac( self, search_text, tag_service_key, file_service_key, expected_storage_tags_to_counts, expected_display_tags_to_counts ):
tag_search_context = ClientSearch.TagSearchContext( tag_service_key )
preds = self._read( 'autocomplete_predicates', ClientTags.TAG_DISPLAY_STORAGE, tag_search_context, file_service_key, search_text = search_text )
tags_to_counts = { pred.GetValue() : pred.GetAllCounts() for pred in preds }
self.assertEqual( expected_storage_tags_to_counts, tags_to_counts )
preds = self._read( 'autocomplete_predicates', ClientTags.TAG_DISPLAY_ACTUAL, tag_search_context, file_service_key, search_text = search_text )
tags_to_counts = { pred.GetValue() : pred.GetAllCounts() for pred in preds }
self.assertEqual( expected_display_tags_to_counts, tags_to_counts )
def test_display_pairs_lookup_web_parents( self ):
self._clear_db()
@@ -807,6 +824,199 @@ class TestClientDBTags( unittest.TestCase ):
} ) )
def test_display_pending_to_current_bug_both_bad( self ):
# rescinding pending (i.e. setting current on pending) for two tags that imply the same display tag at once can leave ghost pending counts if the processing is not interleaved
# so let's test that, both for combined and specific domains!
self._clear_db()
# add samus
bad_samus_tag_1 = 'samus_aran_(character)'
bad_samus_tag_2 = 'samus aran'
good_samus_tag = 'character:samus aran'
service_keys_to_content_updates = {}
content_updates = []
content_updates.append( HydrusData.ContentUpdate( HC.CONTENT_TYPE_TAG_SIBLINGS, HC.CONTENT_UPDATE_ADD, ( bad_samus_tag_1, good_samus_tag ) ) )
content_updates.append( HydrusData.ContentUpdate( HC.CONTENT_TYPE_TAG_SIBLINGS, HC.CONTENT_UPDATE_ADD, ( bad_samus_tag_2, good_samus_tag ) ) )
service_keys_to_content_updates[ self._public_service_key ] = content_updates
self._write( 'content_updates', service_keys_to_content_updates )
self._sync_display()
# import a file
path = os.path.join( HC.STATIC_DIR, 'testing', 'muh_jpg.jpg' )
file_import_job = ClientImportFileSeeds.FileImportJob( path )
file_import_job.GenerateHashAndStatus()
file_import_job.GenerateInfo()
self._write( 'import_file', file_import_job )
muh_jpg_hash = file_import_job.GetHash()
# pend samus to it in one action
service_keys_to_content_updates = {}
content_updates = []
content_updates.extend( ( HydrusData.ContentUpdate( HC.CONTENT_TYPE_MAPPINGS, HC.CONTENT_UPDATE_PEND, ( tag, ( muh_jpg_hash, ) ) ) for tag in ( bad_samus_tag_1, bad_samus_tag_2 ) ) )
service_keys_to_content_updates[ self._public_service_key ] = content_updates
self._write( 'content_updates', service_keys_to_content_updates )
# let's check those tags on the file's media result, which uses specific domain to populate tag data
( media_result, ) = self._read( 'media_results', ( muh_jpg_hash, ) )
tags_manager = media_result.GetTagsManager()
self.assertEqual( tags_manager.GetPending( self._public_service_key, ClientTags.TAG_DISPLAY_STORAGE ), { bad_samus_tag_1, bad_samus_tag_2 } )
self.assertEqual( tags_manager.GetPending( self._public_service_key, ClientTags.TAG_DISPLAY_ACTUAL ), { good_samus_tag } )
# and a/c results, both specific and combined
self._test_ac( 'samu*', self._public_service_key, CC.LOCAL_FILE_SERVICE_KEY, { bad_samus_tag_1 : ( 0, None, 1, None ), bad_samus_tag_2 : ( 0, None, 1, None ) }, { good_samus_tag : ( 0, None, 1, None ) } )
self._test_ac( 'samu*', self._public_service_key, CC.COMBINED_FILE_SERVICE_KEY, { bad_samus_tag_1 : ( 0, None, 1, None ), bad_samus_tag_2 : ( 0, None, 1, None ) }, { good_samus_tag : ( 0, None, 1, None ) } )
# now we'll currentify the tags in one action
service_keys_to_content_updates = {}
content_updates = []
content_updates.extend( ( HydrusData.ContentUpdate( HC.CONTENT_TYPE_MAPPINGS, HC.CONTENT_UPDATE_ADD, ( tag, ( muh_jpg_hash, ) ) ) for tag in ( bad_samus_tag_1, bad_samus_tag_2 ) ) )
service_keys_to_content_updates[ self._public_service_key ] = content_updates
self._write( 'content_updates', service_keys_to_content_updates )
# and magically our tags should now be both current, no ghost pending
( media_result, ) = self._read( 'media_results', ( muh_jpg_hash, ) )
tags_manager = media_result.GetTagsManager()
self.assertEqual( tags_manager.GetPending( self._public_service_key, ClientTags.TAG_DISPLAY_STORAGE ), set() )
self.assertEqual( tags_manager.GetCurrent( self._public_service_key, ClientTags.TAG_DISPLAY_STORAGE ), { bad_samus_tag_1, bad_samus_tag_2 } )
self.assertEqual( tags_manager.GetPending( self._public_service_key, ClientTags.TAG_DISPLAY_ACTUAL ), set() )
self.assertEqual( tags_manager.GetCurrent( self._public_service_key, ClientTags.TAG_DISPLAY_ACTUAL ), { good_samus_tag } )
# and a/c results, both specific and combined
self._test_ac( 'samu*', self._public_service_key, CC.LOCAL_FILE_SERVICE_KEY, { bad_samus_tag_1 : ( 1, None, 0, None ), bad_samus_tag_2 : ( 1, None, 0, None ) }, { good_samus_tag : ( 1, None, 0, None ) } )
self._test_ac( 'samu*', self._public_service_key, CC.COMBINED_FILE_SERVICE_KEY, { bad_samus_tag_1 : ( 1, None, 0, None ), bad_samus_tag_2 : ( 1, None, 0, None ) }, { good_samus_tag : ( 1, None, 0, None ) } )
def test_display_pending_to_current_bug_bad_and_ideal( self ):
# like the test above, this will test 'a' and 'b' being committed at the same time, but when 'a->b', rather than 'a->c' and 'b->c'
self._clear_db()
# add samus
bad_samus_tag_1 = 'samus_aran_(character)'
bad_samus_tag_2 = 'samus aran'
good_samus_tag = 'character:samus aran'
service_keys_to_content_updates = {}
content_updates = []
content_updates.append( HydrusData.ContentUpdate( HC.CONTENT_TYPE_TAG_SIBLINGS, HC.CONTENT_UPDATE_ADD, ( bad_samus_tag_1, good_samus_tag ) ) )
content_updates.append( HydrusData.ContentUpdate( HC.CONTENT_TYPE_TAG_SIBLINGS, HC.CONTENT_UPDATE_ADD, ( bad_samus_tag_2, good_samus_tag ) ) )
service_keys_to_content_updates[ self._public_service_key ] = content_updates
self._write( 'content_updates', service_keys_to_content_updates )
self._sync_display()
# import a file
path = os.path.join( HC.STATIC_DIR, 'testing', 'muh_jpg.jpg' )
file_import_job = ClientImportFileSeeds.FileImportJob( path )
file_import_job.GenerateHashAndStatus()
file_import_job.GenerateInfo()
self._write( 'import_file', file_import_job )
muh_jpg_hash = file_import_job.GetHash()
# pend samus to it in one action
service_keys_to_content_updates = {}
content_updates = []
content_updates.extend( ( HydrusData.ContentUpdate( HC.CONTENT_TYPE_MAPPINGS, HC.CONTENT_UPDATE_PEND, ( tag, ( muh_jpg_hash, ) ) ) for tag in ( bad_samus_tag_1, good_samus_tag ) ) )
service_keys_to_content_updates[ self._public_service_key ] = content_updates
self._write( 'content_updates', service_keys_to_content_updates )
# let's check those tags on the file's media result, which uses specific domain to populate tag data
( media_result, ) = self._read( 'media_results', ( muh_jpg_hash, ) )
tags_manager = media_result.GetTagsManager()
self.assertEqual( tags_manager.GetPending( self._public_service_key, ClientTags.TAG_DISPLAY_STORAGE ), { bad_samus_tag_1, good_samus_tag } )
self.assertEqual( tags_manager.GetPending( self._public_service_key, ClientTags.TAG_DISPLAY_ACTUAL ), { good_samus_tag } )
# and a/c results, both specific and combined
self._test_ac( 'samu*', self._public_service_key, CC.LOCAL_FILE_SERVICE_KEY, { bad_samus_tag_1 : ( 0, None, 1, None ), good_samus_tag : ( 0, None, 1, None ) }, { good_samus_tag : ( 0, None, 1, None ) } )
self._test_ac( 'samu*', self._public_service_key, CC.COMBINED_FILE_SERVICE_KEY, { bad_samus_tag_1 : ( 0, None, 1, None ), good_samus_tag : ( 0, None, 1, None ) }, { good_samus_tag : ( 0, None, 1, None ) } )
# now we'll currentify the tags in one action
service_keys_to_content_updates = {}
content_updates = []
content_updates.extend( ( HydrusData.ContentUpdate( HC.CONTENT_TYPE_MAPPINGS, HC.CONTENT_UPDATE_ADD, ( tag, ( muh_jpg_hash, ) ) ) for tag in ( bad_samus_tag_1, good_samus_tag ) ) )
service_keys_to_content_updates[ self._public_service_key ] = content_updates
self._write( 'content_updates', service_keys_to_content_updates )
# and magically our tags should now be both current, no ghost pending
( media_result, ) = self._read( 'media_results', ( muh_jpg_hash, ) )
tags_manager = media_result.GetTagsManager()
self.assertEqual( tags_manager.GetPending( self._public_service_key, ClientTags.TAG_DISPLAY_STORAGE ), set() )
self.assertEqual( tags_manager.GetCurrent( self._public_service_key, ClientTags.TAG_DISPLAY_STORAGE ), { bad_samus_tag_1, good_samus_tag } )
self.assertEqual( tags_manager.GetPending( self._public_service_key, ClientTags.TAG_DISPLAY_ACTUAL ), set() )
self.assertEqual( tags_manager.GetCurrent( self._public_service_key, ClientTags.TAG_DISPLAY_ACTUAL ), { good_samus_tag } )
# and a/c results, both specific and combined
self._test_ac( 'samu*', self._public_service_key, CC.LOCAL_FILE_SERVICE_KEY, { bad_samus_tag_1 : ( 1, None, 0, None ), good_samus_tag : ( 1, None, 0, None ) }, { good_samus_tag : ( 1, None, 0, None ) } )
self._test_ac( 'samu*', self._public_service_key, CC.COMBINED_FILE_SERVICE_KEY, { bad_samus_tag_1 : ( 1, None, 0, None ), good_samus_tag : ( 1, None, 0, None ) }, { good_samus_tag : ( 1, None, 0, None ) } )
def test_parents_pairs_lookup( self ):
self._clear_db()