player: add track-list/N/image sub-property

This exposes whether a video track is detected as an image. This is
useful for profile conditions, property expansion and lavfi-complex, and
is more accurate than any detection Lua scripts can perform, since they
can't distinguish images from videos that have no container-fps, no
audio and duration 1 (which is the duration set by the mf demuxer with
the default --mf-fps=1).

The lavf demuxer image check is moved to where the number of frames is
available for comparison, and is modified to check the number of frames
and duration instead of the video codec. This doesn't misdetect videos
in a codec commonly used for images (e.g. mjpeg) as images, and can
detect images in a codec commonly used for videos (e.g. 1-frame gifs).

pix files are also now detected as images, while before they weren't
since the condition was checking if the AVInputFormat name ends with
_pipe, and alias_pix doesn't.
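
To make the old failure mode concrete, here is a minimal sketch of a
suffix test in plain C. The helpers ends_with and old_suffix_check are
hypothetical stand-ins written for this illustration, not mpv functions;
only the demuxer names come from libavformat.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: true if s ends with suffix.
bool ends_with(const char *s, const char *suffix)
{
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + n - m, suffix) == 0;
}

// Hypothetical stand-in for the old condition: only demuxers named
// "<name>_pipe" were assumed to be image formats.
bool old_suffix_check(const char *avif_name)
{
    return ends_with(avif_name, "_pipe");
}
```

With this check, "png_pipe" matches but "alias_pix" does not, which is
why pix files were missed. "image2pipe" does not match either, which is
consistent with it having needed its own explicit entry.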

Both nb_frames and codec_info_nb_frames are checked because nb_frames is
0 for some video codecs (hevc, av1, vc1, mpeg1video, vp9 if forcing
--demuxer=lavf), and codec_info_nb_frames is 1 for others (mpeg, mpeg4,
wmv3).

The duration is checked as well because for some uncommon codecs and
containers found in FFmpeg's FATE suite, libavformat returns nb_frames =
0 and codec_info_nb_frames = 1. For some of them it even returns
duration = 0, so they are blacklisted in order to never be considered
images.

The extra codecs that would have to be blacklisted without checking the
duration are AV_CODEC_ID_4XM, AV_CODEC_ID_BINKVIDEO,
AV_CODEC_ID_DSICINVIDEO, AV_CODEC_ID_ESCAPE130, AV_CODEC_ID_MMVIDEO,
AV_CODEC_ID_NUV, AV_CODEC_ID_RL2, AV_CODEC_ID_SMACKVIDEO and
AV_CODEC_ID_XAN_WC3, while the containers are film-cpk, ivf and ogg.

The lower limit for duration is 10 because that's the duration of
1-frame gifs.

Streams with codec_info_nb_frames 0 are not considered images because
vp9 and av1 have nb_frames = 0 and codec_info_nb_frames = 0, and we
can't rely on just the duration to detect them because they could be
livestreams without an initial duration. Even if we could, libavformat
returns huge negative durations like -9223372036854775808 for these
codecs.
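
The frame-count and duration checks described above can be sketched in
isolation as follows. This is a simplified illustration, not the code in
the commit: struct stream_info and looks_like_image are hypothetical
names that stand in for the AVStream fields read by the real check, and
the codec and format blacklists are left out.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical, simplified stand-in for the three AVStream fields the
// check reads; the real code uses libavformat's AVStream directly.
struct stream_info {
    int64_t nb_frames;        // frame count if known, 0 otherwise
    int codec_info_nb_frames; // frames demuxed while probing the stream
    int64_t duration;         // in stream time base units
};

// Frame-count and duration part of the heuristic only.
bool looks_like_image(const struct stream_info *st)
{
    if (st->nb_frames > 1)
        return false; // the container reports several frames
    if (st->codec_info_nb_frames != 1)
        return false; // 0 (e.g. vp9, av1) or more than 1
    if (st->duration > 10)
        return false; // longer than a 1-frame gif
    return true;
}
```

With illustrative values, a stream with nb_frames = 1,
codec_info_nb_frames = 1 and duration = 10 passes, while a stream with
nb_frames = 0, codec_info_nb_frames = 0 and duration =
-9223372036854775808 is rejected by the codec_info_nb_frames test even
though its negative duration would pass the duration test.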

Some more images in the FATE suite that are really frames cut from a
video in an uncommon codec and container, like cine/bayer_gbrg8.cine,
could be detected by allowing codec_info_nb_frames = 0, but then any
present and future video codec with nb_frames = 0 and
codec_info_nb_frames = 0 would need to be added to the blacklist. Some
even have duration > 10, so to detect these images the duration check
would have to be removed, and all the previously mentioned extra codecs
and containers would have to be added to the blacklists, which
means that images that use them (if they exist anywhere) will never be
detected. These FATE images aren't detected as such by mediainfo either,
nor can a Lua script reliably detect them as images since they
have container-fps and duration > 0 and != 1, and you probably will
never see files like them anywhere else.

For attached pictures the lavf demuxer always sets image to true, which
is necessary because they have duration > 10. There is a minor change in
behavior: audio with attached pictures now has mf-fps as container-fps
instead of unavailable, but this makes it consistent with external cover
art, which was already being assigned mf-fps.

When the lavf demuxer fails, the mf one guesses if the file is an image
by its extension, so sh->image is set to true when the mf demuxer
succeeds and there's only one file.

Even if you add a video's file type to --mf-type and open it with the mf
protocol, only the first frame is used, so setting image to true is
still accurate.

When converting an image to the extensions listed in demux/demux_mf.c,
tga and pam files are currently the only ones detected by the mf demuxer
rather than lavf. Actually they are detected with the image2 format, but
it is blacklisted; see d0fee0ac33.

The mkv demuxer just sets image to true for any attached picture.

The timeline demuxer just copies the value of image from source to
destination. This sets image to true for attached pictures, standalone
images and images added with !new_stream in EDL playlists, but it is
imperfect since you could concatenate multiple images in an EDL playlist
(which should be done with the mf demuxer anyway). This is good enough,
since the comment of the modified function already says it is
"Imperfect and arbitrary".

Guido Cella 2021-10-02 16:31:24 +02:00 committed by Dudemanguy
parent 38b55c862f
commit 0862664ac9
10 changed files with 53 additions and 14 deletions

@ -43,6 +43,7 @@ Interface changes
- add a `--watch-later-options` option to allow configuring which
options quit-watch-later saves
- make `current-window-scale` writeable and use it in the default input.conf
- add ``track-list/N/image`` sub-property
--- mpv 0.33.0 ---
- add `--d3d11-exclusive-fs` flag to enable D3D11 exclusive fullscreen mode
when the player enters fullscreen.

@ -2838,11 +2838,13 @@ Property list
``track-list/N/lang``
Track language as identified by the file. Not always available.
``track-list/N/albumart``
``track-list/N/image``
``yes``/true if this is a video track that consists of a single
picture, ``no``/false or unavailable otherwise. This is used for video
tracks that are really images embedded in audio files and for external
cover art.
picture, ``no``/false or unavailable otherwise.
``track-list/N/albumart``
``yes``/true if this is an image embedded in an audio file or external
cover art, ``no``/false or unavailable otherwise.
``track-list/N/default``
``yes``/true if the track has the default flag set in the file,
@ -2936,6 +2938,7 @@ Property list
"src-id" MPV_FORMAT_INT64
"title" MPV_FORMAT_STRING
"lang" MPV_FORMAT_STRING
"image" MPV_FORMAT_FLAG
"albumart" MPV_FORMAT_FLAG
"default" MPV_FORMAT_FLAG
"forced" MPV_FORMAT_FLAG

@ -138,7 +138,6 @@ struct format_hack {
bool use_stream_ids : 1; // has a meaningful native stream IDs (export it)
bool fully_read : 1; // set demuxer.fully_read flag
bool detect_charset : 1; // format is a small text file, possibly not UTF8
bool image_format : 1; // expected to contain exactly 1 frame
// Do not confuse player's position estimation (position is into external
// segment, with e.g. HLS, player knows about the playlist main file only).
bool clear_filepos : 1;
@ -205,8 +204,6 @@ static const struct format_hack format_hacks[] = {
BLACKLIST("bin"),
// Useless, does not work with custom streams.
BLACKLIST("image2"),
// Image demuxers ("<name>_pipe" is detected explicitly)
{"image2pipe", .image_format = true},
{0}
};
@ -528,11 +525,6 @@ static int lavf_check_file(demuxer_t *demuxer, enum demux_check check)
return -1;
}
if (bstr_endswith0(bstr0(priv->avif->name), "_pipe")) {
MP_VERBOSE(demuxer, "Assuming this is an image format.\n");
priv->format_hack.image_format = true;
}
if (lavfdopts->hacks)
priv->avif_flags = priv->avif->flags | priv->format_hack.if_flags;
@ -655,6 +647,35 @@ static int dict_get_decimal(AVDictionary *dict, const char *entry, int def)
return def;
}
// Detect if a stream is an image from the number of frames and duration.
// Unlike checking only the codec, this doesn't detect videos with codecs
// commonly used for images (e.g. mjpeg) as images, and can detect images in
// codecs commonly used for videos. But for some video codecs and containers,
// libavformat always returns 0/1 numbers of frames and 0 duration, so they
// have to be hardcoded to never be considered images.
static const int blacklisted_video_codecs[] =
{AV_CODEC_ID_AMV, AV_CODEC_ID_FLIC, AV_CODEC_ID_VMDVIDEO, 0};
static const char *const blacklisted_formats[] = {"lavfi", "m4v", "vc1", NULL};
static bool is_image(AVStream *st, int codec_id, const char *avifname)
{
if (st->nb_frames > 1 || st->codec_info_nb_frames != 1 || st->duration > 10)
return false;
for (int i = 0; blacklisted_video_codecs[i]; i++) {
if (codec_id == blacklisted_video_codecs[i])
return false;
}
for (int i = 0; blacklisted_formats[i]; i++) {
if (strcmp(avifname, blacklisted_formats[i]) == 0)
return false;
}
return true;
}
static void handle_new_stream(demuxer_t *demuxer, int i)
{
lavf_priv_t *priv = demuxer->priv;
@ -714,8 +735,12 @@ static void handle_new_stream(demuxer_t *demuxer, int i)
sh->codec->disp_h = codec->height;
if (st->avg_frame_rate.num)
sh->codec->fps = av_q2d(st->avg_frame_rate);
if (priv->format_hack.image_format)
if (sh->attached_picture ||
is_image(st, codec->codec_id, priv->avif->name)) {
MP_VERBOSE(demuxer, "Assuming this is an image format.\n");
sh->image = true;
sh->codec->fps = priv->mf_fps;
}
sh->codec->par_w = st->sample_aspect_ratio.num;
sh->codec->par_h = st->sample_aspect_ratio.den;

@ -381,8 +381,12 @@ static int demux_open_mf(demuxer_t *demuxer, enum demux_check check)
// create a new video stream header
struct sh_stream *sh = demux_alloc_sh_stream(STREAM_VIDEO);
struct mp_codec_params *c = sh->codec;
if (mf->nr_of_files == 1) {
MP_VERBOSE(demuxer, "Assuming this is an image format.\n");
sh->image = true;
}
struct mp_codec_params *c = sh->codec;
c->codec = codec;
c->disp_w = 0;
c->disp_h = 0;

@ -1300,6 +1300,7 @@ static void add_coverart(struct demuxer *demuxer)
sh->attached_picture->pts = 0;
talloc_steal(sh, sh->attached_picture);
sh->attached_picture->keyframe = true;
sh->image = true;
}
sh->title = att->name;
demux_add_sh_stream(demuxer, sh);

@ -525,6 +525,7 @@ static void apply_meta(struct sh_stream *dst, struct sh_stream *src)
dst->missing_timestamps = src->missing_timestamps;
if (src->attached_picture)
dst->attached_picture = src->attached_picture;
dst->image = src->image;
}
// This is mostly for EDL user-defined metadata.

@ -48,6 +48,7 @@ struct sh_stream {
bool dependent_track; // container dependent track flag
bool visual_impaired_track; // container flag
bool hearing_impaired_track;// container flag
bool image; // video stream is an image
bool still_image; // video stream contains still images
int hls_bitrate;

@ -1949,6 +1949,7 @@ static int get_track_entry(int item, int action, void *arg, void *ctx)
.unavailable = !track->lang},
{"audio-channels", SUB_PROP_INT(track_channels(track)),
.unavailable = track_channels(track) <= 0},
{"image", SUB_PROP_FLAG(track->image)},
{"albumart", SUB_PROP_FLAG(track->attached_picture)},
{"default", SUB_PROP_FLAG(track->default_track)},
{"forced", SUB_PROP_FLAG(track->forced_track)},

@ -136,6 +136,7 @@ struct track {
char *title;
bool default_track, forced_track, dependent_track;
bool visual_impaired_track, hearing_impaired_track;
bool image;
bool attached_picture;
char *lang;

@ -421,6 +421,7 @@ static struct track *add_stream_track(struct MPContext *mpctx,
.dependent_track = stream->dependent_track,
.visual_impaired_track = stream->visual_impaired_track,
.hearing_impaired_track = stream->hearing_impaired_track,
.image = stream->image,
.attached_picture = stream->attached_picture != NULL,
.lang = stream->lang,
.demuxer = demuxer,