failure to do so was causing crashes on x86_64 when ctors used SSE.
this was first observed when ctors called variadic functions, because
of the SSE prologue code inserted into every variadic function.
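a minimal sketch of the kind of ctor that could trigger the crash
described above, for illustration only. the names sum_doubles,
init_sum and get_ctor_result are hypothetical and not part of the
tree. the assumption is that the compiler's prologue for a variadic
function on x86_64 may spill the SSE argument registers with aligned
stores, so entering such a function from a ctor on a misaligned stack
could fault.

```c
#include <stdarg.h>

/* hypothetical helper: sums n doubles passed as variadic arguments.
 * being variadic, it is the sort of function whose prologue may save
 * SSE registers to the stack on x86_64. */
static double sum_doubles(int n, ...)
{
	va_list ap;
	double total = 0.0;

	va_start(ap, n);
	for (int i = 0; i < n; i++)
		total += va_arg(ap, double);
	va_end(ap);
	return total;
}

/* result stored by the ctor so it can be inspected later */
static double ctor_result;

/* runs before main; calls the variadic function with double
 * arguments, which are passed in SSE registers on x86_64 */
__attribute__((__constructor__))
static void init_sum(void)
{
	ctor_result = sum_doubles(3, 1.0, 2.0, 3.5);
}

/* accessor for the value computed by the ctor */
double get_ctor_result(void)
{
	return ctor_result;
}
```

with a correctly aligned stack at startup, get_ctor_result() called
from main should return the sum computed before main was entered.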
looks like nik copied these "extra arguments" from the i386 code.
they're not actually arguments there, just 1-byte instructions to
make sure the stack is aligned to 16 bytes after all the other
arguments are pushed. since each push is 8 bytes on x86_64, they
happened to have no effect here, but their presence is confusing and a
minor waste of space.
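the padding arithmetic can be sketched in C as follows. this is only
an illustration, not code from the tree: pad_bytes is a hypothetical
helper, and the sketch assumes the stack pointer is 16-byte aligned
before anything is pushed.

```c
#include <stddef.h>

/* bytes of padding to push before nargs arguments of word_size bytes
 * each, so that the stack pointer is again a multiple of 16 once all
 * of them have been pushed. assumes (for this sketch) that the stack
 * pointer starts out 16-byte aligned. */
size_t pad_bytes(size_t nargs, size_t word_size)
{
	size_t used = nargs * word_size;

	return (16 - used % 16) % 16;
}
```

for example, three 4-byte pushes need 4 more bytes of padding (one
extra push on i386), while two 8-byte pushes already total 16 bytes
and need none.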
this is mainly in hopes of supporting c++ (not yet possible for other
reasons) but will also help applications/libraries which use (and more
often, abuse) the gcc __attribute__((__constructor__)) feature in "C"
code.
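for reference, a minimal sketch of the gcc constructor attribute as
used from C. the names library_init and library_ready are made up for
this example. the startup code is what is expected to run such
functions before main.

```c
/* set by the ctor; stays 0 if ctors are never run */
static int initialized;

/* gcc arranges for this function to be called by the startup code
 * before main is entered */
__attribute__((__constructor__))
static void library_init(void)
{
	initialized = 1;
}

/* returns nonzero once the ctor has run */
int library_ready(void)
{
	return initialized;
}
```

if ctors are supported, library_ready() called from main returns 1
without main ever calling library_init itself.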
x86_64 and arm versions of the new startup asm are untested and may
have minor problems.