Just avoid some code duplication. Also, gl_video_set_options() having a
queue size output parameter is weird at best. While I don't appreciate
that this commit suddenly requires gl_video.c to deal with vo.c directly
in a special case, it's simply the best place to put this function.
See manpage additions. This is mainly useful for vo_opengl_cb, but can
also be applied to vo_opengl.
On a side note, gl_hwdec_load_api() should stop using a name string, and
instead always use the IDs. This should be cleaned up another time.
draw_image_timed is renamed to draw_frame. struct frame_timing is
renamed to vo_frame. flip_page_timed is merged into draw_frame (the
additional parameters are part of struct vo_frame). draw_frame also
deprecates VOCTRL_REDRAW_FRAME, and replaces it with a method that
works for both VOs which can cache the current frame, and VOs which
need to redraw it anyway.
This is preparation to making the interpolation and (work in progress)
display sync code saner.
Lots of other refactoring, and also some simplifications.
This should make interpolation work much better in general, although
there still might be some side effects for unusual framerates (e.g. 35 Hz
or 48 Hz). Most of the common framerates (24 Hz, 30 Hz, 60 Hz) are tested
and working fine.
The new code doesn't have support for oversample yet, so it's been
removed (and will most likely be reimplemented in a cleaner way if
there's enough demand). I would recommend using something like robidoux
or mitchell instead of oversample, though - they're much
smoother for the common cases.
Now the VO can request a number of future frames with the last parameter
of vo_set_queue_params(). This will be helpful to fix the interpolation
code.
Note that the first frame (after playback start or seeking) will usually
not have any future frames (to make seeking fast). Near the end of the
file, the number of future frames will become lower as well.
An attempt to get rid of the weird mix of callbacks that take either
struct vo or MPGLContext as parameter. This is not perfect, and the
API will probably change a bit until all other code is ported to it.
The main question is how to separate struct vo completely from the
windowing code, which actually needs vo for very little.
In the end, the legacy callbacks will be dropped.
Instead of having separate backends, make the use of GLES a flag. This
reduces the number of backends and the resulting annoyances.
Also, nobody cares about using GLES, so there's no backward
compatibility either.
This will essentially make screenshot-tag-colorspace also affect the
"screenshot window" command, where possible.
Unfortunately, it's completely incompatible with icc-profile, due to API
limitations of ffmpeg (we can only give it an enum of well-known
primaries, rather than an actual ICC profile or primaries).
I think this used to be quite important, because the ancient VfW support
in MPlayer used to output flipped frames. This code has been dead in mpv
for quite some time (because VfW decoders were removed, and the --flip
option was dropped too), so get rid of it.
Currently, the wayland backend needs extra work to avoid drawing more
often than the wayland frame callback allows. (This is not ideal, but
will be fixed at a later time.)
Unify this with the start_frame callback added for cocoa. Some details
change for the better. For example, if a frame is dropped, and a redraw
is done afterwards, the actually correct frame is redrawn, instead of
whatever was in the textures from before the dropped frame.
With --idle --force-window, or when started from the bundle, the cocoa
code dropped the first frame. This resulted in a black frame on start
sometimes.
The reason was that the live resizing/redrawing code was invoked, which
simply set skip_swap_buffer, blocking the buffer swap for whatever was
going to be rendered next. Normally this is done so that the following
works:
1. vo_opengl draws a frame, releases the GL lock
2. live resizing kicks in, redraw the frame
3. vo_opengl wants to call SwapBuffers, which would present a stale buffer
already overwritten by the live resizing code
This is solved by setting skip_swap_buffer in 2., and querying it in 3.
Fix this by resetting the skip_swap_buffer at a known good point: when
vo_opengl starts drawing a new frame.
The start_frame function returns bool, so that it can be merged with
is_active in a following commit.
This could help in cases where the DWM (Windows desktop compositor) adds
another layer of buffering, which could mess up the SwapBuffers timing.
Signed-off-by: wm4 <wm4@nowhere>
Rarely used and essentially useless. The only VO for which this was
implemented correctly and for which this did anything was vo_xv, but you
shouldn't use vo_xv anyway (plus it supports BT.601 only, plus a vendor
specific extension for BT.709, whose presence this function essentially
reported - use xvinfo instead).
Instead of somehow looking for the substring "_swap_control" and trying
several arbitrary function names, do it cleanly. The old approach has
the problem that it's not very exact, and may even load a pointer to a
function which doesn't exist. (Some GL implementations like Mesa return
function pointers even for functions which don't exist, and calling them
crashes.)
I couldn't find any evidence that glXSwapInterval, wglSwapIntervalSGI,
or wglSwapInterval actually exist, so don't include them. They were
carried over from MPlayer times.
To make diagnostics easier, print a warning in verbose mode if the
function could not be loaded.
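
As a rough illustration of the "clean" approach (not the actual loader code;
the GLX extension and entry point below are just one common case, and a real
check should match whole extension tokens rather than use strstr):

    #include <string.h>
    #include <GL/glx.h>

    /* Check for the exact extension before loading the exact entry point,
     * instead of substring-matching "_swap_control" and probing arbitrary
     * function names. */
    typedef void (*swap_interval_fn)(Display *dpy, GLXDrawable draw, int interval);

    static swap_interval_fn load_swap_control(Display *dpy, int screen)
    {
        const char *exts = glXQueryExtensionsString(dpy, screen);
        if (!exts || !strstr(exts, "GLX_EXT_swap_control"))
            return NULL;    /* not advertised - warn about it in verbose mode */
        return (swap_interval_fn)
            glXGetProcAddressARB((const GLubyte *)"glXSwapIntervalEXT");
    }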
When not receiving frame callbacks, we should not draw anything to avoid
blocking the OpenGL renderer. We do this by extending the gl context API
with a new optional function 'is_active', which indicates whether the
OpenGL renderers should draw or not.
This fixes issue #249.
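
A minimal sketch of the shape of this hook; the struct and field names below
are illustrative, not the actual gl context API:

    #include <stdbool.h>

    struct MPGLContext;     /* opaque here */

    struct backend_fns {
        /* optional: NULL means the renderer may always draw */
        bool (*is_active)(struct MPGLContext *ctx);
    };

    static bool may_draw(const struct backend_fns *fns, struct MPGLContext *ctx)
    {
        return !fns->is_active || fns->is_active(ctx);
    }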
Unlike other VOs, this rendered OSD even while no VO was created
(because the renderer lives as long as the API user wants). Change this,
and refactor the code so that the OSD object is accessible only while
the VO is created.
(There is a short time where the OSD can still be accessed even after VO
destruction - this is not a race condition, though it's inelegant and
unfortunately unavoidable.)
There's literally no reason why these functions have to be inline (they
might be performance critical, but then the function call overhead isn't
going to matter at all).
Uninline them and move them to mp_image.c. Drop the header file and fix
all uses of it.
This replaces the old smoothmotion code by a more flexible tscale
option, which essentially allows any scaler to be used for interpolating
frames. (The actual "smoothmotion" scaler which behaves identically to the
old code does not currently exist, but it will be re-added in a later commit)
The only odd thing is that larger filters require a larger queue size
offset, which is currently set dynamically as it introduces some issues
when pausing or framestepping. Filters with a lower radius are not
affected as much, so this is identical to the old smoothmotion if the
smoothmotion interpolator is used.
The basic idea is to use dynamically generated shaders instead of a
single monolithic file + a ton of ifdefs. Instead of having to setup
every aspect of it separately (like compiling shaders, setting uniforms,
performing the actual rendering steps, the GLSL parts), we generate the
GLSL on the fly, and perform the rendering at the same time. The GLSL
is regenerated every frame, but the actual compiled OpenGL-level shaders
are cached, which makes it fast again. Almost all logic can be in a
single place.
The new code is significantly more flexible, which allows us to improve
the code clarity, performance and add more features easily.
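
To illustrate the caching idea only (the names and layout below are made up,
not the actual gl_video.c shader cache; the caller is assumed to ensure the
cache has room):

    #include <string.h>
    #include <GL/gl.h>

    GLuint compile_and_link(const char *glsl);   /* hypothetical helper */

    struct sc_entry { char *glsl; GLuint program; };

    /* The GLSL source is regenerated every frame, but the compiled program is
     * reused whenever the generated source text is unchanged. */
    static GLuint get_program(struct sc_entry *cache, int *num, const char *glsl)
    {
        for (int i = 0; i < *num; i++) {
            if (strcmp(cache[i].glsl, glsl) == 0)
                return cache[i].program;         /* hit: skip recompilation */
        }
        GLuint prog = compile_and_link(glsl);    /* miss: compile once, cache it */
        cache[*num] = (struct sc_entry){ strdup(glsl), prog };
        *num += 1;
        return prog;
    }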
This commit is incomplete. It drops almost all previous code, and
readds only the most important things (some of them actually buggy).
The next commit will complete it - it's separate to preserve authorship
information.
mpv would attempt to load ICC profiles several times during VO init
even if no window is displayed. This potentially causes it to load
a profile for a different screen than it is going to be displayed
on, thereby invalidating the profile cache and rebuilding the LUT
every single time.
It would not unload a previously loaded profile when the video
window is moved to a display without an installed profile.
Fix these issues and tweak the log messages a little.
This automatically sets the gamma option depending on lighting conditions
measured from the computer's ambient light sensor.
sRGB – arguably the “sibling” to BT.709 for still images – has a reference
viewing environment defined in its specification (IEC 61966-2-1:1999, see
http://www.color.org/chardata/rgb/srgb.xalter). According to this data, the
assumed ambient illuminance is 64 lux. This is the illuminance where the gamma
that results from ICC color management is correct.
On the other hand, BT.1886 formalizes the gamma level for dim environments
to be 2.40, and Apple resources (WWDC12: 2012 Session 523: Best practices for
color management) define the BT.1886 dim at 16 lux.
So the logic we apply is:
* >= 64 lux -> 1.961 gamma
* <= 16 lux -> 2.400 gamma
* 16 lux < x < 64 lux -> logarithmic rescale of lux to gamma. The human
perception of illuminance roughly follows a logarithmic scale of lux [1].
[1]: https://msdn.microsoft.com/en-us/library/windows/desktop/dd319008%28v=vs.85%29.aspx
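
A minimal sketch of that mapping, assuming a plain lux reading as input:

    #include <math.h>

    /* Clamp outside the 16..64 lux range and rescale logarithmically in
     * between, since perceived brightness roughly follows log(lux). */
    static double gamma_for_lux(double lux)
    {
        const double lo_lux = 16.0,    hi_lux = 64.0;
        const double lo_gamma = 2.400, hi_gamma = 1.961;
        if (lux <= lo_lux)
            return lo_gamma;
        if (lux >= hi_lux)
            return hi_gamma;
        double t = (log(lux) - log(lo_lux)) / (log(hi_lux) - log(lo_lux));
        return lo_gamma + t * (hi_gamma - lo_gamma);
    }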
gl_common.c contained the function loader (which is big) and additional
utility functions (not so big, but will grow when moving more out of
gl_video.c). Just split them. There are no changes other than some
modifications to comments.
Merge update_icc_profile() into get_and_update_icc_profile() - there's
no reason anymore to keep them separate. The former is only called by
the latter, and the separation of responsibilities between them is
blurry at best.
Query the ICC profile only if the corresponding feature is actually
enabled. Additionally, change the error behavior of this code. Make
loading failure non-fatal, and distinguish between runtime error and
unimplemented functionality.
Fix a memory leak in gl_lcms.c (although the changes in vo_opengl.c
already take care of this, it's just logical and cleaner).
At the time screenshot support was added, images weren't refcounted yet,
so screenshots required specialized implementations in the VOs. But now
we can handle these things much simpler. Also see commit 5bb24980.
If there are VOs in the future which can't do this (e.g. they need to
write to the image passed to vo_driver->draw_image), this still could be
disabled on a per-VO basis etc., so we lose no potential performance
advantages.
Use different VOCTRLs for "window" and normal screenshot modes. The
normal one will probably be removed, and replaced by generic code in
vo.c, and this commit is preparation for this. (Doing it the other way
around would be slightly simpler, but I haven't decided yet about the
second one, and touching every VO is needed anyway in order to remove
the unneeded crap. E.g. has_osd has been unused for a long time.)
vo.c queried the VO at initialization whether it wants to be updated on
every display frame, or every video frame. If the smoothmotion option
was changed at runtime, the rendering mode in vo.c wasn't updated.
Just let vo_opengl set the mode directly. Abuse the existing
vo_set_flip_queue_offset() function for this.
Also add a comment suggesting the use of --display-fps to the manpage,
which doesn't have anything to do with the rest of this commit, but is
important to make smoothmotion run well.
SmoothMotion is a way to time and blend frames made popular by MadVR. Its
intended behaviour is to remove stuttering caused by mismatches between the
display refresh rate and the video fps, while preserving the video's original
artistic qualities (no soap opera effect). It's supposed to make 24fps video
playback on 60hz monitors as close as possible to a 24hz monitor.
Instead of drawing a frame once its pts has passed the vsync time, we
redraw at the display refresh rate, and if we detect the vsync is between two
frames we interpolate them (depending on their position relative to the vsync).
We actually interpolate as few frames as possible to avoid a blur effect as
much as possible. For example, if we were to play back a 1fps video on a 60hz
monitor, we would blend on at most 1 vsync for each frame (while the other 59
vsyncs would be rendered as is).
Frame interpolation is always done before scaling and in linear light when
possible (an ICC profile is used, or :srgb is used).
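
As a simplified illustration of the mixing decision (not the actual renderer
code; it ignores things like the scaler radius and linear-light conversion):

    /* Returns how much of the next frame to mix in for a given vsync time:
     * 0.0 shows the current frame as-is, values in between blend the two. */
    static double blend_factor(double vsync_time, double pts_cur, double pts_next)
    {
        if (pts_next <= pts_cur || vsync_time <= pts_cur)
            return 0.0;                 /* vsync not between the two frames */
        if (vsync_time >= pts_next)
            return 1.0;                 /* the next frame is already due */
        return (vsync_time - pts_cur) / (pts_next - pts_cur);
    }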
Before this commit, each hw backend had their own specific struct types
for context, and some, like VDA, had none at all. Add a context struct
(mp_hwdec_ctx) that provides a somewhat generic way to pass the hwdec
context around. Some things get slightly better, some slightly more
verbose.
mp_hwdec_info is still around; it's still needed, but is reduced to its
role of handling delayed loading of the hwdec backend.
And remove all uses of the VFCAP_CSP_SUPPORTED* constants. This is
supposed to reduce conversions if many filters are used (with many
incompatible pixel formats), and also for preferring the VO's natively
supported pixel formats (as opposed to conversion).
This is worthless by now. Not only do the main VOs not use software
conversion, but also the way vf_lavfi and libavfilter work mostly breaks
the way the old MPlayer mechanism worked. Other important filters like
vf_vapoursynth do not support "proper" format negotiation either.
Part of this was already removed with the vf_scale cleanup from today.
While I'm touching every single VO, also fix the query_format argument
(it's not a FourCC anymore).
Don't load all the legacy functions (including ancient extensions).
Slightly simplify function loader and context creation, now that legacy
GL doesn't need to be handled. Remove the code for drawing OSD in legacy
mode.
Remove all the header hacks, which were meant for ancient OpenGL headers
which didn't even support things like OpenGL 1.3. Instead, adjust the
GLX check to make sure we get both OpenGL 3x and 2.1 symbols. For win32
and OSX, we assume that the user has the latest headers anyway. For
wayland, we hope that things somehow go right.
vo_opengl was crashing since f811348d because it passed NULL for the
events parameter to vo_control. Normally the parameter should not be
NULL, so add a hack to account for this. In particular, we should
handle the events that are returned. For the call in preinit() we
skip this, but it most likely has no meaning anyway, because in this
stage no window is visible yet.
This affects OSX, where memory profiles are updated e.g. on fullscreen
switches. The profile most likely doesn't change, but the LUT will
be generated and reloaded anyway.
Somewhat of a regression from commit f811348.
Fixes #1439.
Previously we just forced loading a profile from file, but that has poor
integration for querying the OS / display server for an ICC profile, and
generating profiles on the fly (which we might use in the future for creating
preset 3dluts).
Also changed the previous icc-profile-auto code to use this mechanism, and
moved gl_lcms to be an opaque type with state instead of just providing pure
functions.
This makes vo_opengl_cb respond to controls like "gamma" and
"brightness". The commit includes an awkward refactor for vo_opengl to
make it easier for vo_opengl_cb.
One problem is a logical race condition. The set of supported controls
depends on the pixelformat, which in turn is set by reconfig(). But the
actual reconfig() call (on the renderer) happens asynchronously on the
renderer thread. At the time it happens, the player most likely already
tried to set some controls for command line options (see init_vo() in
video.c). So setting these command line options will fail most of the
time, though it could randomly succeed. This can't be fixed directly,
because the player can't wait on the renderer thread, because the
renderer thread might already wait on the player.
GL_ARB_debug_output provides a logging callback, which can be used to
diagnose problems etc. in case the driver supports it. It's enabled only
if the vo_opengl "debug" suboption is set.
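
Roughly, the hookup looks like this (a sketch; mpv loads the entry point
through its GL function loader and routes the messages into its own log
rather than stderr):

    #define GL_GLEXT_PROTOTYPES
    #include <GL/gl.h>
    #include <GL/glext.h>
    #include <stdio.h>

    static void debug_cb(GLenum source, GLenum type, GLuint id, GLenum severity,
                         GLsizei length, const GLchar *message,
                         const void *userParam)
    {
        fprintf(stderr, "GL: %.*s\n", (int)length, message);
    }

    static void enable_gl_debug(void)
    {
        /* only valid if the driver advertises GL_ARB_debug_output */
        glDebugMessageCallbackARB(debug_cb, NULL);
    }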
Involve detection of software renderers in the probing properly. Other
VOs could handle probing also more gracefully, and e.g. produce less
noise if an API is unavailable. (Although other than the OpenGL VOs,
only vo_wayland will.)
Now the "sw" suboption for vo_opengl[_old] is strictly speaking not
needed anymore. Doing "--vo=opengl" disables the probing logic, and will
always force it, if possible.
Includes some simplifications as well.
Because of the icc-profile-auto option (which was added at a later
point), supporting this would probably be slightly messy: the ICC
profile can spontaneously update, and then it would overwrite the
previously set options.
Don't make icc-profile-auto fatal if unsupported. The "auto" indicates
that it will use whatever it finds, even if it's nothing.
Also add a warning; before this commit, it just refused to initialize
without explanation.
As a mostly unrelated cosmetic change, remove redundant parameters which
had no point anymore.
Probably fixes #1359 (or rather works it around by disallowing it).
Obscure feature, and I've never heard of anyone using it.
The anaglyph effects can be reproduced with vf_stereo3d. The only thing
that can't be reproduced with it is "quadbuffer", which requires special
and expensive hardware.
This adds API to libmpv that lets host applications use the mpv opengl
renderer. This is a more flexible (and possibly more portable) option to
foreign window embedding (via --wid).
This assumes that methods like context sharing and multithreaded OpenGL
rendering are infeasible, and that a way is needed to integrate it with
an application that uses a single thread to render everything.
Add an example that does this with QtQuick/qml. The example is
relatively lazy, but still shows how relatively simple the integration
is. The FBO indirection could probably be avoided, but would require
more work (and would probably lead to worse QtQuick integration, because
it would have to ignore transformations like rotation).
Because this makes mpv directly use the host application's OpenGL
context, there is no platform specific code involved in mpv, except
for hw decoding interop.
main.qml is derived from some Qt example.
The following things are still missing:
- a way to do better video timing
- expose GL renderer options, allow changing them at runtime
- support for color equalizer controls
- support for screenshots
This wasn't done before because there was no advantage in "abstracting"
it. This changed, and putting this into its own files is better than
messing it into gl_common.c/h.
Same as with the previous commits.
In theory, vdpau/x11 GL interop doesn't assume GLX. It could use EGL as
well. But since it's always GLX in practice, we're fine with this.
Remove the gl_hwdec.mpgl field - it's unused now.
Basically, don't access the vo field.
There's also no reason anymore to access MPGLContext. We still need to
access loaded GL functions though, so add a field for that to gl_hwdec.
Untested.
Always set the viewport on entry. The way the viewport is tracked is a
bit complicated in my opinion, and in fact it doesn't even reduce the
number of GL calls. Setting it on entry is actually redundant if video
covers the screen fully, because the handle_pass() unconditionally sets
it anyway, but avoiding it would complicate the cases where gl->Clear() is
actually needed.
Add a fbo argument to gl_video_render_frame(). This allows you to render
into a FBO rather than the default framebuffer. It will be useful for
providing an API to render on an external GL context. (If that will
actually be added.)
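
The entry sequence then looks roughly like this (simplified; struct GL stands
in for mpv's table of loaded GL functions, and the exact calls here are an
illustration, not the real gl_video.c code):

    #include <GL/gl.h>
    #include <GL/glext.h>

    struct GL {
        void (*BindFramebuffer)(GLenum target, GLuint fbo);
        void (*Viewport)(GLint x, GLint y, GLsizei w, GLsizei h);
        void (*Clear)(GLbitfield mask);
    };

    static void render_entry(struct GL *gl, GLuint fbo, int vp_w, int vp_h)
    {
        gl->BindFramebuffer(GL_FRAMEBUFFER, fbo);  /* 0 = default framebuffer */
        gl->Viewport(0, 0, vp_w, vp_h);            /* always set on entry */
        gl->Clear(GL_COLOR_BUFFER_BIT);
        /* ... video rendering passes and OSD follow ... */
    }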
Always create the context in mpgl_init(), instead of doing it when
mpgl_config_window() is called the first time. This is a small step
towards cleaning up the GL backend interface, and adding other things
like perhaps GLES support, or a callback-driven backend for libmpv.
This silences the warning:
video/out/gl_video.c:1091:51: runtime error: division by zero
when running with clang -fsanitize=undefined. Division by zero is legal
according to IEEE, but I guess clang doesn't care about the standard. While
triggering this warning isn't actually avoided in all cases, it's
avoided in the common case and also makes people shut up about it.
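
The kind of guard involved looks like this (illustrative only; the actual
expression in gl_video.c differs):

    /* Skip the division in the degenerate case instead of relying on IEEE
     * inf/NaN semantics (which are fine, but trip -fsanitize=undefined). */
    static float safe_ratio(float num, float den)
    {
        return den != 0.0f ? num / den : 0.0f;
    }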
Add a generic mechanism to the VO to relay "extra" events from VO to
player. Use it to notify the core of window resizes, which in turn will
be used to mark all affected properties ("window-scale" in this case) as
changed.
(I refrained from hacking this as internal command into input_ctx, or to
poll the state change, etc. - but in the end, maybe it would be best to
actually pass the client API context directly to the places where events
can happen.)
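
A hedged sketch of the relay idea (the names below are assumptions, not the
actual vo.c API): the VO thread ORs in an event bit, and the player thread
drains the bits and marks the affected properties as changed.

    #include <stdatomic.h>

    #define EVENT_RESIZE (1 << 0)

    void mark_property_changed(const char *name);   /* hypothetical player helper */

    static void vo_send_event(atomic_int *pending, int event)
    {
        atomic_fetch_or(pending, event);            /* VO thread side */
    }

    static void player_handle_events(atomic_int *pending)
    {
        int events = atomic_exchange(pending, 0);   /* player thread side */
        if (events & EVENT_RESIZE)
            mark_property_changed("window-scale");
    }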
After removing synchronous libdispatch calls, this looks like it doesn't
deadlock anymore. I also experimented with pthread_mutex_trylock like wm4
suggested, but it leads to some annoying black flickering. I will fall back to
that only if some new deadlocks are discovered.
Unfortunately using dispatch_sync for synchronization turned out to be really
bad for us. It caused a wide array of race conditions, deadlocks, etc.
Moving to a very simple mutex. It's not clear to me how to do liveresizing
with this, for now it just flickers, which is unacceptable (maybe I'll draw
black instead).
This should fix all the threading cocoa bugs. Reopen if it's not the case!
Fixes #751. Fixes #1129.
bstr.c doesn't really deserve its own directory, and compat had just
a few files, most of which may as well be in osdep. There isn't really
any justification for these extra directories, so get rid of them.
The compat/libav.h was empty - just delete it. We changed our approach
to API compatibility, and will likely not need it anymore.
This uses glXGetVideoSyncSGI() to check how many vsyncs happened since
the last flip_page() call. It allows checking a pattern of vsync
increments of at most 2 elements. For example, to check ~24 fps playback
on a ~60 Hz monitor, this can be used:
--vo=opengl:check-pattern=[3-2]:waitvsync
Whether the reported results are accurate or just plain wrong may depend
on the driver and if the waitvsync sub-option is used. There are no
guarantees.
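
The check itself boils down to something like this (a simplified sketch
assuming GLX_SGI_video_sync is available; resync handling is omitted):

    #define GLX_GLXEXT_PROTOTYPES
    #include <GL/glx.h>

    /* After each flip, read the SGI vsync counter and verify the increment
     * matches the next element of a 2-element pattern (e.g. [3-2] for ~24 fps
     * on a ~60 Hz display). Returns 1 on match, 0 on mismatch, -1 on error. */
    static int check_vsync_pattern(unsigned int *last_count,
                                   const int pattern[2], int *pattern_pos)
    {
        unsigned int count = 0;
        if (glXGetVideoSyncSGI(&count) != 0)
            return -1;
        int diff = (int)(count - *last_count);
        *last_count = count;
        int ok = diff == pattern[*pattern_pos];
        *pattern_pos = (*pattern_pos + 1) % 2;
        return ok;
    }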
This option is undocumented, and may be removed again in the near or
distant future.
I'm not sure about the merit, though it does print nice numbers if debug
output is enabled.
Basically, this tries to achieve similar results as the glFinish()
business, but again it entirely depends on the drivers whether this
does anything meaningful, or whether it's actively harmful.
It seems that at least on nvidia systems with compositing disabled, we
can get it to block deterministically on the actual vsync event, which
should improve framedropping.
The VO is run inside its own thread. It also does most of video timing.
The playloop hands the image data and a realtime timestamp to the VO,
and the VO does the rest.
In particular, this allows the playloop to do other things, instead of
blocking for video redraw. But if anything accesses the VO during video
timing, it will block.
This also fixes vo_sdl.c event handling; but that is only a side-effect,
since reimplementing the broken way would require more effort.
Also drop --softsleep. In theory, this option helps if the kernel's
sleeping mechanism is too inaccurate for video timing. In practice, I
haven't ever encountered a situation where it helps, and it just burns
CPU cycles. On the other hand it's probably actively harmful, because
it prevents the libavcodec decoder threads from doing real work.
Side note:
Originally, I intended that multiple frames can be queued to the VO. But
this is not done, due to problems with OSD and other certain features.
OSD in particular is simply designed in a way that it can be neither
timed nor copied, so you do have to render it into the video frame
before you can draw the next frame. (Subtitles have no such restriction.
sd_lavc was even updated to fix this.) It seems the right solution to
queuing multiple VO frames is rendering on VO-backed framebuffers, like
vo_vdpau.c does. This requires VO driver support, and is out of scope
of this commit.
As a consequence, the VO has a queue size of 1. The existing video queue
is just needed to compute frame duration, and will be moved out in the
next commit.
OSD used to be not thread-safe at all, so a trick was used to get it
redrawn. This mostly reverts commit 6a2a8880, because OSD not being
thread-safe was the non-trivial part of it.
Mostly untested, because this code path is used on OSX only, and I don't
have OSX.
Let the VOs draw the OSD on their own, instead of making OSD drawing a
separate VO driver call. Further, let it be the VO's responsibility to
request subtitles with the correct PTS. We also basically allow the VO
to request OSD/subtitles at any time.
OSX changes untested.
This turned out much more complicated than I thought. It's not just a
matter of adjusting the texture coordinates, but you also have to
consider separated scaling and panscan clipping, which make everything
complicated.
This actually still doesn't clip 100% correctly, but the bug is only
visible when rotating (or flipping with --vf=flip), and using something
like --video-pan-x/y at the same time.
This commit adds support for automatic selection of color profiles based on
the display where mpv is initialized, and automatically changes the color
profile when display is changed or the profile itself is changed from
System Preferences.
@UliZappe was responsible for the testing and implementation of a lot of this
commit, including the original implementation of `cocoa_get_icc_profile_path`
(See #594).
Fixes #594
Reduce most dependencies on struct mp_csp_details, which was a bad first
attempt at dealing with colorspace stuff. Instead, consistently use
mp_image_params.
Code which retrieves colorspace matrices from csputils.c still uses this
type, though.
Always pass around mp_log contexts in the option parser code. This of
course affects all users of this API as well.
In stream.c, pass a mp_null_log, because we can't do it properly yet.
This will be fixed later.
Since m_option.h and options.h are extremely often included, a lot of
files have to be changed.
Moving path.c/h to options/ is a bit questionable, but since this is
mainly about access to config files (which are also handled in
options/), it's probably ok.
The --flip option flipped the image upside-down, by trying to use VO
support, or if not available, by inserting a video filter. I'm not sure
why it existed. Maybe it was important in ancient times when VfW based
decoders output an image this way (but even then, flipping an image is a
free operation by negating the stride).
One nice thing about this is that it provided a possible path for
implementing video orientation, which is a feature we should probably
support eventually. The important part is that it would be for free for
VOs that support it, and would work even with hardware decoding.
But for now get rid of it. It's useless, trivial, stands in the way, and
supporting video orientation would require solving other problems first.
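
For reference, the "free" flip by stride negation mentioned above is just
pointer arithmetic, roughly:

    #include <stddef.h>
    #include <stdint.h>

    /* No pixels are copied: point at the last row and walk upwards. */
    static void flip_plane(uint8_t **ptr, int *stride, int height)
    {
        *ptr += (ptrdiff_t)(*stride) * (height - 1);
        *stride = -*stride;
    }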
This allows vo_opengl to use GL_TEXTURE_RECTANGLE textures, either by
enabling it with the 'rectangle-textures' sub-option, or by having a
hwdec backend force it. By default it's off.
The _only_ reason we're adding this is because VDA can export rectangle
textures only.
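
The practical difference for the renderer is mainly in the sampling
coordinates, roughly (illustrative, not mpv code):

    /* Rectangle textures are sampled with pixel coordinates instead of
     * normalized [0,1] coordinates, so the renderer must know the target. */
    static void max_tex_coords(int use_rectangle, int tex_w, int tex_h,
                               float *out_s, float *out_t)
    {
        *out_s = use_rectangle ? (float)tex_w : 1.0f;
        *out_t = use_rectangle ? (float)tex_h : 1.0f;
    }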
This means most code accessing this struct must now include hwdec.h
instead of dec_video.h. I just put it into dec_video.h at first because
I thought a separate file would be a waste, but it's more proper to do
it this way, as there are too many files which include dec_video.h only
to get the mp_hwdec_info definition.
This initialized only the load_api and load_api_ctx fields, and left the
other fields as they were. This failed with vf_vavpp, which assumed all
fields are initialized.
Most hardware decoding APIs provide some OpenGL interop. This allows
using vo_opengl, without having to read the video data back from GPU.
This requires adding a backend for each hardware decoding API. (Each
backend is an entry in gl_hwdec_vaglx[].) The backends expose video data
as a set of OpenGL textures.
Add infrastructure to support this. The next commit will add support for
VA-API.
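
The backend interface is roughly of this shape (a hedged sketch; the actual
gl_hwdec driver struct may differ in names and details):

    #include <GL/gl.h>

    struct gl_hwdec;        /* per-backend state, opaque here */
    struct mp_image;        /* decoded (hardware) frame */

    struct gl_hwdec_driver {
        int  (*create)(struct gl_hwdec *hw);          /* set up API/GL interop */
        int  (*map_image)(struct gl_hwdec *hw, struct mp_image *hw_image,
                          GLuint *out_textures);      /* expose as GL textures */
        void (*destroy)(struct gl_hwdec *hw);
    };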
Keep track of the default values directly, instead of creating a new
instance of the option struct just to get the defaults.
Also get rid of the special handling of m_obj_desc.init_options.
Instead, handle it purely by the option parser. Originally, I wanted to
handle --vo=opengl-hq and --vo=direct3d_shaders with this (by making
them aliases to the real VOs with a different preset), but since
--vo=opengl-hq=help prints the wrong values (as a consequence of the
simplification), I'm not doing that, and instead use something
different.
Improves display of images and video with alpha channel, especially if
the transparent regions contain (supposed to be invisible) garbage
color values.
Until now, video output levels (obscure feature, like using TV screens
that require RGB output in limited range, similar to YUV) still required
handling of VOCTRL_SET_YUV_COLORSPACE. Simplify this, and use the new
mp_image_params code. This gets rid of some code. VOCTRL_SET_YUV_COLORSPACE
is not needed at all anymore in VOs that use the reconfig callback. The
result of VOCTRL_GET_YUV_COLORSPACE is now only used for the
colormatrix related properties (basically, for display on OSD). For
other VOs, VOCTRL_SET_YUV_COLORSPACE will be sent only once after config
instead of twice.
Change how m_config is initialized. Make it more uniform; now all
m_config structs are initialized in exactly the same way. Make sure
there's only a single m_option[] array defining the options, and keep
around the pointer to the optstruct default value, and the optstruct
size as well. This will allow reconstructing the option default values
in the following commit.
In particular, stop pretending that the handling of some special options
(like --profile, --v, and some others) is in any way elegant, and make
them explicit hacks. This is really more readable and easier to
understand than what was before, and simplifies the code.