Since d41f0a54b0, enabling a stream will
no longer perform a refresh seek, to fix issues with switching video
tracks. Additionally, 62e9a0c5f6 also
prevents refresh seek on track switching when it happens right after a
seek.
Unfortunately, when external audio files are loaded, preventing refresh
seeks can cause A-V sync issues. Since external tracks have separate demuxer
instances from the internal tracks, the demuxer instances from both
internal and external tracks are explicitly sought to the same pts when
seeking is performed. When enabling an external audio track for the first
time, the existing logic prevents refresh seek because no packets have ever
been read, so the track ends up not being sought. This means that switching
the track in the middle of playback results in a huge A-V desync.
To make it worse, unlike one demuxer instance with multiple tracks,
tracks from multiple demuxer instances are not synced in any meaningful
way beyond the separate explicit seeking. The only thing which keeps these
tracks synced is the generic A-V sync logic, which is unfit for such large
desync above. This means the audio is at the start position after the track
is switched, behind the video, so the audio is not considered ready, and
audio is continuously filtered sequentially until it is synced to the video.
While this is happening, the audio filtering exhausts all CPU resources.
There is also a case with cache enabled: if a seek causes some new data to
be cached, the demuxer is sought to the end of joined cache range.
If track is switched after this, the existing logic prevents a refresh seek,
so the demuxer is at a wrong position after the track is switched.
Fix this by allowing refresh seek for non-video streams, even when no packets
have been read yet or immediately after a seek.