This replaces mutation of the underlying bytes in the iterated slice with a shift counter, which is applied when reading the head byte. This avoids having to copy the entire slice for every new iterator.
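To illustrate the idea, here is a minimal sketch rather than the actual code from this change. It assumes a bit-level iterator over a borrowed byte slice; the names `BitIter`, `bytes`, and `shift` are hypothetical. The head byte is never written back: the counter records how many of its bits have been consumed, and the shift is applied on each read.

```rust
/// Hypothetical bit iterator; names and bit order are assumptions for illustration.
#[derive(Clone, Debug)]
struct BitIter<'a> {
    /// Remaining bytes, borrowed and never mutated.
    bytes: &'a [u8],
    /// Number of bits of the head byte already consumed (0..=7).
    shift: u8,
}

impl<'a> BitIter<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        BitIter { bytes, shift: 0 }
    }
}

impl<'a> Iterator for BitIter<'a> {
    type Item = bool;

    fn next(&mut self) -> Option<bool> {
        // Stop once no bytes remain.
        let head = *self.bytes.first()?;
        // Apply the shift on read instead of storing a shifted byte back.
        let bit = (head << self.shift) & 0x80 != 0;
        self.shift += 1;
        if self.shift == 8 {
            // Head byte exhausted: advance the borrowed view and reset the counter.
            self.bytes = &self.bytes[1..];
            self.shift = 0;
        }
        Some(bit)
    }
}
```

Because the iterator only holds a borrowed slice and a small counter, `clone()` copies a pointer, a length, and one byte of state, regardless of how long the slice is. A clone taken mid-byte resumes at the same bit position as the original, and the source bytes stay intact for any other reader.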