Commit Graph

4 Commits

Author SHA1 Message Date
rcombs aa0a9ce2ec sd_ass: allow get_text to return more than 500 bytes 2024-04-27 01:19:56 +02:00
Avi Halachmi (:avih) 41650203c3 sub: sub-filter-regex and jsre: support ass-to-plaintext
Using --sub-filter-regex-plain (default:no)

The ass-to-plaintext functionality already existed at sd_ass.c, but
it's internal and uses a private buffer type, so a trivial utility
wrapper was added with standard char*/bstr interface.

The plaintext can be multi-line, and the multi-line regexp flag is now
always set, but only affects plaintext (the ASS source is one line).
2021-08-05 21:32:22 +03:00
Avi Halachmi (:avih) ab689a33a8 sub: add filter text utils, use from filter-regex (no-op)
Add two stand-alone function to help with the text-extraction task
which ass filters need. Makes it easier to add new filters without
cargo-culting this functionality.

Currently, on malformed event (which shouldn't happen), a warning is
printed when a filter tries to extract the text, so if few filters
are enabled, we'll get multiple warnings (like before) - not critical.

The regex filter now uses these utils, the SDH filter not yet.
2021-08-05 21:32:22 +03:00
wm4 a4eb8f75c0 sub: add an option to filter subtitles by regex
Works as ad-filter. I had some more plans, for example replacing
matching text with different text, but for now it's dropping matches
only. There's a big warning in the manpage that I might change
semantics. For example, I might turn it into a primitive sed.

In a sane world, you'd probably write a simple script that processes
downloaded subtitles before giving them to mpv, and avoid all this
complexity. But we don't live in a sane world, and the sooner you learn
this, the happier you will be. (But I also want to run this on muxed
subtitles.)

This is pretty straightforward. We use POSIX regexes, which are readily
available without additional pain or dependencies. This also means it's
(apparently) not available on win32 (MinGW). The regex list is because I
hate big monolithic regexes, and this makes it slightly better.

Very superficially tested.
2020-02-16 02:07:24 +01:00