sbase

Commit Graph

Author	SHA1	Message	Date
Michael Forney	e5284b1537	sort: Don't do fallback top-level sort in check mode The fallback is useful to provide a consistent order of tied lines, but in check mode, we don't want it to report disorder for equal lines (according to the passed flags). Thanks to Richard Ipsum for the bug report and proposed patch.	2020-01-03 15:42:33 -08:00
Michael Forney	e9bfb97808	sort: Consider end field in keydef when additional fields are present Currently, if the delimiter is found after the last field of a keydef, only up to the beginning of the field is considered. This breaks `sort -k N,N`, as well as whenever the sorted order comes down to that last field. Thanks to Richard Ipsum for the bug report and proposed patch.	2020-01-01 12:23:48 -08:00
Richard Ipsum	57c9cab849	sort: Fix string length update math	2019-12-31 13:32:25 -08:00
Michael Forney	6b950e436b	sort: Use regular `double` for -n `long double` may require software emulation and the (possible) extra precision is unnecessary here.	2019-03-13 11:59:33 -07:00
Michael Forney	a944b682a6	sort: Fix line comparison when col buffer contains data from longer line I'm not sure if there are other implications of this or not, but the issue is that columns() uses len to store the allocated buffer size, but linecmp() compares up to len bytes. If those trailing bytes do not match, the line is considered not matching, even though the relevant parts of the buffer do match. To resolve this, also keep track of column capacity. Additionally, since there is no reason to keep the existing data when resizing, just use free and emalloc rather than erealloc. The simplest case I could reduce it to is this: if [ "$(printf '%s\n' a a xxb xxc | ./sort -u)" = "$(printf '%s\n' a xxb xxc)" ] ; then echo pass; else echo fail; fi	2016-07-09 10:09:50 +01:00
Michael Forney	75611997f9	sort: Fix -c option In `eb9bda8787`, a bug was introduced in the handling of -1 return values from getline. Since the type of the len field in struct line is unsigned, the break condition was never true. This caused sort -c to never succeed.	2016-03-13 11:08:36 +00:00
FRIGN	5ad71a466b	Error out when giving an empty delimiter to sort(1)	2016-03-10 08:48:09 +00:00
FRIGN	0fa5a3e5bb	Rename struct linebufline to struct line and add linecmp() This simplifies the handling in sort(1) and comm(1) quite a bit.	2016-03-10 08:48:09 +00:00
FRIGN	54d3f3b3a5	Rename linecmp and line-structs in join(1) and sort(1) We will steal the names for the global functions.	2016-03-10 08:48:09 +00:00
FRIGN	d585d4b028	No need for += when res is 0 anyway	2016-03-10 08:48:09 +00:00
FRIGN	9d120b7b32	Actually move past the field separator Previously, sort(1) failed on key-based sorting and was caught in an infinite loop with the c-flag.	2016-03-10 08:48:09 +00:00
FRIGN	0e25f09b56	Remove debug info	2016-03-10 08:48:09 +00:00
FRIGN	eb9bda8787	Support NUL-containing lines in sort(1) For sort(1) we need memmem(), which I imported from OpenBSD. Inside sort(1), the changes involved working with the explicit lengths given by getlines() earlier and rewriting some of the functions. Now we can handle NUL-characters in the input just fine.	2016-03-10 08:48:09 +00:00
pekka.jylha.ollila@gmail.com	fad1d35357	Add -d, -f and -i flags to sort(1) Here's the patch with updated manpage and usage().	2016-02-16 09:56:48 +00:00
sin	2366164de7	No need for semicolon after ARGEND This is also the style used in Plan 9.	2015-11-01 10:18:55 +00:00
FRIGN	51390a3c51	Audit sort(1) and mark it as finished 1) Remove the function prototypes. No need for them, as the functions are ordered. 2) Add fieldseplen, so the length of the field-separator is not calculated nearly each time skipcolumn() is called. 3) rename next_col to skip_to_next_col so the purpose is clear, also reorder the conditional accordingly. 4) Put parentheses around certain ternary expressions. 5) BUGFIX: Don't just exit() in check(), but make it return something, so we can cleanly fshut() everything. 6) OFF-POSIX: Posix for no apparent reason does not allow more than one file when the -c or -C flags are given. This can be problematic when you want to check multiple files. With the change 5), rewriting check() to return a value, I went off-posix after discussing this with Dimitris to just allow arbitrary numbers of files. Obviously, this does not break scripts and is convenient for everybody who wants to quickly check a big amount of files. As soon as 1 file is "unsorted", the return value is 1, as expected. For convenience reasons, check()'s warning now includes the filename. 7) BUGFIX: Set ret to 2 instead of 1 when the fshut(fp, *argv) fails. 8) BUGFIX: Don't forget to fshut stderr at the end. This would improperly return 1 in the following case: $ sort -c unsorted_file 2> /dev/full 9) Other style changes, line length, empty line before return.	2015-08-04 12:08:13 +01:00
FRIGN	e153447657	Make sort(1) utf-compliant and update README Make it clear that <blank> characters just are spaces or tabs and not a special group which needs special treatment for wide characters. Also, and that was the only problem here, correctly calculate the offset given by the key definitions for the start- and end-characters using libutf-utility-functions. Mark the progress in the README and put parentheses around the missing flags which are insane to implement for no real gain.	2015-08-03 19:14:52 +01:00
FRIGN	1622089a21	Reorder functions in sort(1) I kind of missed that the sorting was still not properly done. parse_flags() and addkeydef() are independent of everything else, so they can be put at the bottom. Sorting the other functions reveals the true hierarchy much better.	2015-08-03 10:00:00 +01:00
FRIGN	61ee561728	Factor out parse_keydef() into addkeydef() and reorder functions Add a small comment explaining the data-structure and sort the functions according to usage, not alphabetically.	2015-08-03 10:00:00 +01:00
FRIGN	e00cdf226a	Use queue.h-macros in sort(1) This is much easier to read than having yet another handrolled list implementation. Tested and more or less clearly equivalent. Now that I have uni-vac, I'll have enough time to refactor more.	2015-08-02 23:32:17 +01:00
FRIGN	d23cc72490	Simplify return & fshut() logic Get rid of the !!()-constructs and use ret where available (or introduce it). In some cases, there would be an "abort" on the first fshut-error, but we want to close all files and report all warnings and then quit, not just the warning for the first file.	2015-05-26 16:41:43 +01:00
FRIGN	9a074144c9	Remove handrolled strcmp()'s Favor readability over bare-metal.	2015-05-21 15:43:38 +01:00
FRIGN	0545d32ce9	Handle '-' consistently In general, POSIX does not define /dev/std{in, out, err} because it does not want to depend on the dev-filesystem. For utilities, it thus introduced the '-'-keyword to denote standard input (and output in some cases) and the programs have to deal with it accordingly. Sadly, the design of many tools doesn't allow strict shell-redirections and many scripts don't even use this feature when possible. Thus, we made the decision to implement it consistently across all tools where it makes sense (namely those which read files). Along the way, I spotted some behavioural bugs in libutil/crypt.c and others where it was forgotten to fshut the files after use.	2015-05-16 13:34:00 +01:00
Hiltjo Posthuma	72250324b1	sort: reuse buffer in columns() speeds up sorting for huge input as well.	2015-05-07 18:18:35 +01:00
Jakob Kramer	403b047a30	sort: allow keys where start_col > end_col Useful in (rare) cases like: $ printf 'aaaa c\nx a\n0 b\n' | sort -k 2,1.3 And this is how POSIX wants it.	2015-04-06 17:15:54 +01:00
Jakob Kramer	061932a31b	sort: allow 0 as key's end_char	2015-04-06 17:15:54 +01:00
Jakob Kramer	bddb7200b8	sort: apply -b only to "custom" keys	2015-04-06 17:15:54 +01:00
Jakob Kramer	2d9d224a1b	sort: add support for delimiter strings Instead of just single characters. This also fixes some bugs in columns(). Example bug: $ printf "a b\nc b x\n" | sort -k 2,2 -k 1,1	2015-04-06 17:15:54 +01:00
FRIGN	11e2d472bf	Add *fshut() functions to properly flush file streams This has been a known issue for a long time. Example: printf "word" > /dev/full wouldn't report there's not enough space on the device. This is due to the fact that every libc has internal buffers for stdout which store fragments of written data until they reach a certain size or on some callback to flush them all at once to the kernel. You can force the libc to flush them with fflush(). In case flushing fails, you can check the return value of fflush() and report an error. However, previously, sbase didn't have such checks and without fflush(), the libc silently flushes the buffers on exit without checking the errors. No offense, but there's no way for the libc to report errors in the exit- condition. GNU coreutils solve this by having onexit-callbacks to handle the flushing and report issues, but they have obvious deficiencies. After long discussions on IRC, we came to the conclusion that checking the return value of every io-function would be a bit too much, and having a general-purpose fclose-wrapper would be the best way to go. It turned out that fclose() alone is not enough to detect errors. The right way to do it is to fflush() + check ferror on the fp and then to a fclose(). This is what fshut does and that's how it's done before each return. The return value is obviously affected, reporting an error in case a flush or close failed, but also when reading failed for some reason, the error- state is caught. the !!( ... + ...) construction is used to call all functions inside the brackets and not "terminating" on the first. We want errors to be reported, but there's no reason to stop flushing buffers when one other file buffer has issues. Obviously, functionales come before the flush and ret-logic comes after to prevent early exits as well without reporting warnings if there are any. 
One more advantage of fshut() is that it is even able to report errors on obscure NFS-setups which the other coreutils are unable to detect, because they only check the return-value of fflush() and fclose(), not ferror() as well.	2015-04-05 09:13:56 +01:00
FRIGN	9144d51594	Check getline()-return-values properly It's not useful when 0 is returned anyway, so be sure that we have a string with length > 0, this also solves some indexing-gotchas like "len - 1" and so on. Also, add checked getline()'s whenever it has been forgotten and clean up the error-messages.	2015-03-27 14:49:48 +01:00
FRIGN	df8529f0a1	Fix syntax error in sort(1) Somehow went unnoticed...	2015-03-23 20:30:07 +01:00
FRIGN	49e27c1b0c	Add -m and -o flags to sort(1) Sort comes pretty much automatically, as no script relies on the undefined behaviour of the input _not_ being sorted, we might as well sort the sorted input already. The only downside is memory usage, which can be an issue for large files. The o-flag was trivial to implement.	2015-03-22 23:39:48 +01:00
Hiltjo Posthuma	ad6776e9a1	grep, kill, renice, sort: style: put main at bottom	2015-03-08 12:51:33 +01:00
Hiltjo Posthuma	31f0624f3d	code-style: minor cleanup and nitpicking	2015-02-20 13:29:38 +01:00
FRIGN	31572c8b0e	Clean up #includes	2015-02-14 21:12:23 +01:00
Jakob Kramer	0fcad66c75	make use of en*alloc functions	2015-02-11 01:17:21 +00:00
Jakob Kramer	4769b47dd7	Use size_t for number of lines in linebuf .nlines and .capacity are used as array indices and should therefore be of type size_t.	2015-01-31 22:49:43 +00:00
Jakob Kramer	572ad27110	sort: support sorting decimal numbers correctly sorry not to have used strtold from the beginning	2015-01-31 19:19:55 +00:00
sin	153b8428b1	Nuke another freelist()	2014-12-16 21:02:03 +00:00
Michael Forney	cb427d553a	sort: Implement -c and -C flags	2014-11-23 19:42:14 +00:00
FRIGN	1436518f9d	Use < 0 instead of == -1	2014-11-19 20:09:29 +00:00
FRIGN	7fc5856e64	Tweak NULL-pointer checks Use !p and p when comparing pointers as opposed to explicit checks against NULL. This is generally easier to read.	2014-11-14 10:54:30 +00:00
FRIGN	ec8246bbc6	Un-boolify sbase It actually makes the binaries smaller, the code easier to read (gems like "val == true", "val == false" are gone) and actually predictable in the sense of that we actually know what we're working with (one bitwise operator was quite adventurous and should now be fixed). This is also more consistent with the other suckless projects around which don't use boolean types.	2014-11-14 10:54:20 +00:00
FRIGN	eee98ed3a4	Fix coding style It was about damn time. Consistency is very important in such a big codebase.	2014-11-13 18:08:43 +00:00
sin	0c5b7b9155	Stop using EXIT_{SUCCESS,FAILURE}	2014-10-02 23:46:59 +01:00
sin	b712ef44ad	Fix warning 'array subscript of type char'	2014-09-02 13:32:32 +01:00
Jakob Kramer	7d1fd2621e	add -t flag to sort	2014-06-02 13:35:59 +01:00
Jakob Kramer	9366f48b1f	sort: simplify linecmp, rename curr => tail	2014-05-06 18:01:44 +01:00
Jakob Kramer	6f7e9a5078	sort: add support for "per-keydef" flags	2014-05-06 16:21:50 +01:00
Jakob Kramer	109e8963f5	sort: ignore trailing newline while sorting	2014-05-06 16:21:45 +01:00
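
The message for 11e2d472bf describes the fshut() idea as fflush(), then a ferror() check, then fclose(). The following is a minimal sketch of that sequence, not the sbase implementation: the name `fshut_sketch` and the plain `fprintf` warnings are assumptions made for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not the sbase implementation: flush the stream,
 * check its error indicator, then close it.
 * Returns 0 on success and 1 if any step reported a problem.
 */
int
fshut_sketch(FILE *fp, const char *name)
{
	int ret = 0;

	/* Push buffered data to the kernel; a failed write sets the
	 * stream's error indicator, which ferror() picks up below. */
	fflush(fp);

	if (ferror(fp)) {
		fprintf(stderr, "ferror %s\n", name);
		ret = 1;
	}

	/* Always close, even after an error, so no stream is left open. */
	if (fclose(fp) != 0 && !ret) {
		fprintf(stderr, "fclose %s\n", name);
		ret = 1;
	}

	return ret;
}
```

A caller would typically OR the results for every stream it opened into its exit status, so that one failing stream does not stop the others from being flushed and closed.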

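The message for 75611997f9 notes that a -1 return from getline was stored in an unsigned length field, so the break condition could never be true. One way to avoid that class of bug is sketched below under stated assumptions: `struct line_sketch` and `readline_sketch` are hypothetical names, not the structures or functions used in sort.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical stand-in for a line record with an unsigned length. */
struct line_sketch {
	char *data;
	size_t len;
};

/*
 * Read one line into l. Returns 1 if a line was read, 0 on EOF or error.
 * The getline() result is kept in a signed ssize_t so the -1 sentinel
 * can be tested before it is converted to the unsigned len field.
 */
int
readline_sketch(FILE *fp, struct line_sketch *l, size_t *cap)
{
	ssize_t n = getline(&l->data, cap, fp);

	if (n < 0)
		return 0;

	l->len = (size_t)n;
	return 1;
}
```

The caller is expected to start with `data` set to NULL and `*cap` set to 0, and to free `data` when done, as getline() requires.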

77 Commits