
Nah, the codebase just hadn't been touched by the FSF yet.

See also, "UNIX Style, or cat -v Considered Harmful" (http://harmful.cat-v.org/cat-v/).

It seems telling that the GNU echo's source is "derived from code echo.c in Bash."



I don't get why people are being snide about code that does more and therefore has more lines. A small amount of code is nice and elegant, but if it doesn't do what people need, it's pointless.


I think their point is that you should split it into separate utilities/binaries, each of which would be very simple and have fewer bugs, and let users combine them as they wish.

For example, instead of cat -v, you'd have a second utility called 'nonprint', which would just translate non-printing chars, and you'd call it like this:

     cat file1 file2 | nonprint
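
To give a feel for how small such a filter can be, here is a minimal sketch in Python. It assumes the caret and M- notation that cat -v uses; 'nonprint' is the hypothetical utility from the comment above, and the names visible() and main() are made up for illustration.

```python
import sys


def visible(data: bytes) -> bytes:
    """Render non-printing bytes in caret / M- notation (TAB and newline pass through)."""
    out = bytearray()
    for b in data:
        if b in (0x09, 0x0A):      # TAB and newline are left untouched
            out.append(b)
            continue
        if b >= 0x80:              # high bit set: prefix with M- and strip the bit
            out += b"M-"
            b -= 0x80
        if b < 0x20:               # control character: ^@ through ^_
            out += bytes([ord("^"), b + 0x40])
        elif b == 0x7F:            # DEL
            out += b"^?"
        else:                      # ordinary printable byte
            out.append(b)
    return bytes(out)


def main() -> None:
    # Filter stdin to stdout, so the tool composes with cat in a pipeline.
    sys.stdout.buffer.write(visible(sys.stdin.buffer.read()))
```

Calling main() at the bottom of a script would let it sit at the end of a pipeline the same way 'nonprint' does above.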


"nonprint" existed as "vis" on some systems AFAIR.


The Unix Programming Environment has a discussion of vis on page 172. It notes that you can use 'sed -n l' to do the same thing.


yes


You are making the system more complicated for everyone because of features that only a few users know about. This is how code bloat starts its life cycle.

If you need more features from a basic utility like echo or cat you should create your own version, maybe with a slightly different name, and leave the original as it is.


That's exactly what happened.

Those crazy kids at Berkeley cooked up BSD, which was written to meet their needs and subsequently forked into a few variants. The GNU people made a GNU collection of core utilities that met their particular needs and desires.

The Unix nerds at my university felt as you did, and ran a Unix System V variant into the late '90s.


Anyone can find those features in the man page. I guarantee that the number of people who use those features is much larger than the number of people who read the source before today.

The UNIX style promoted in the "cat -v Considered Harmful" paper may have made sense at one time, but it doesn't make sense anymore. For example:

"It seems that UNIX has become the victim of cancerous growth at the hands of organizations such as UCB. 4.2BSD is an order of magnitude larger than Version 5, but, Pike claims, not ten times better."

This logic gives the same consideration to people who are digging in the source code for these utilities as to people who actually use them. When you consider the relative numbers, that's a very elitist attitude (for some value of "elite").

Also consider the explanation given in another comment for why "cat -n" is unnecessary:

If you want to number a file's lines, 'echo ,n | ed file | sed 1d' or awk '{ print NR " " $0 }' will do just fine.

Munging text like that is a pretty common skill for Unix users, but by no means universal. If the man page for cat is simple and readable, the feature doesn't bloat the code to the point of causing maintenance problems, and somebody is willing to write the code, then enabling "cat -n" is a win for users.
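
For anyone who doesn't read awk: the one-liner quoted above just prefixes every line with its record number. A rough Python equivalent, as a sketch only (the function name number_lines is made up here):

```python
def number_lines(text: str) -> str:
    """Prefix each line with its 1-based number, like awk's NR."""
    lines = text.splitlines(keepends=True)
    return "".join(f"{n} {line}" for n, line in enumerate(lines, start=1))


print(number_lines("alpha\nbeta\n"), end="")
```

Note that this uses a single space as the separator, matching the awk version rather than the padded column that cat -n prints.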


That's not a very good strategy for avoiding bloat, not to mention that you would end up with 100 times as many tools as you have now. I would not call such a system simpler; indeed, it would be inferior on every point.

In a real system, even basic utilities rarely look like a CS101 homework result. This is perfectly fine, especially since in this case the number of features in GNU echo is perfectly reasonable, and the size of the executable will likely depend more on the various headers than on code size when the code is so small anyway.


You are making the system more complicated for everyone because of a program that only a few users know about. That is how code bloat starts its life cycle.

If you need a basic utility like echo or cat, you should create your own version and not bother others with it.


/bin/echo on Debian sid (x86):

     13 .text  000028dc  08048b90  08048b90  00000b90  2**4
               CONTENTS, ALLOC, LOAD, READONLY, CODE

/bin/cat on Debian sid (x86):

     13 .text  0000775c  080491b0  080491b0  000011b0  2**4
               CONTENTS, ALLOC, LOAD, READONLY, CODE

That's about 10 KB and 30 KB of code, respectively. That may have been a lot back in the day, but it's plenty small on today's systems.
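
As a quick check on those numbers: the size field in the listings above is hexadecimal and covers only the .text section, not the whole file on disk. A small sketch of the conversion (the helper name kib is made up):

```python
# .text section sizes copied from the listings above (hex).
text_sizes = {"/bin/echo": 0x28DC, "/bin/cat": 0x775C}


def kib(n_bytes: int) -> float:
    """Convert a byte count to KiB (1 KiB = 1024 bytes)."""
    return n_bytes / 1024


for name, size in text_sizes.items():
    print(f"{name}: {size} bytes = {kib(size):.1f} KiB")
```

This prints 10460 bytes (10.2 KiB) for echo and 30556 bytes (29.8 KiB) for cat.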


It's just good practice, in general, to keep individual programs simple. If you have a look at DMR's description of why the pipe was invented, it suddenly clicks.

All of these tools are intended to be composable, analogous to functions. 'cat -v' is like a function with too many arguments, one that does too much. If you need, for example, to allocate a block of zeroed memory, you don't add new flags to malloc(); you use memset() after allocation, you write a for loop, or you use calloc().

Likewise, the basic tools available on Unix can and should be thought of as functions, which take some number of arguments plus the implicit argument of an input channel. They produce as their output an integer return value and two output channels, stdout and stderr. Making a function that does too much (and this is as subjective for functions as it is for command-line tools) is known to be bad practice, but when it comes to the shell, this is often misunderstood. To misunderstand this is to misunderstand the core principles of the Unix environment.

It has nothing to do with the typical non-programmer user. On, say, Linux or OS X, the user doesn't write functions or talk to the shell very often. They click buttons in a GUI that doesn't offer composable programs in any meaningful sense. It's an inefficient but simple way to interact with the machine, one that matches their habits and understanding. cat, echo, sed, and awk aren't for these users; they're for programmers. The typical user does not know or care whether cat can show non-printing characters, but as a programmer, I certainly care about a clean design for my environment.
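
The function analogy above can be made fairly literal. Here is a minimal sketch in Python that treats a command as a function of (arguments, stdin) returning (exit status, stdout, stderr). The names Result, tool, and pipe are made up for illustration, and the example assumes cat, sort, and tr are on the PATH.

```python
import subprocess
from typing import NamedTuple, Sequence


class Result(NamedTuple):
    status: int      # the integer return value
    stdout: bytes    # first output channel
    stderr: bytes    # second output channel


def tool(argv: Sequence[str], stdin: bytes = b"") -> Result:
    """Call a command-line tool as if it were a function."""
    proc = subprocess.run(list(argv), input=stdin, capture_output=True)
    return Result(proc.returncode, proc.stdout, proc.stderr)


def pipe(data: bytes, *stages: Sequence[str]) -> bytes:
    """Compose tools left to right, roughly like `a | b | c` in the shell.

    Unlike a real pipeline, the stages run one after another, not concurrently.
    """
    for argv in stages:
        data = tool(argv, data).stdout
    return data


print(tool(["cat"], b"hi\n"))
print(pipe(b"b\na\n", ["sort"], ["tr", "a-z", "A-Z"]))
```

Each stage stays a small single-purpose function, and the composition lives in the caller, which is the same argument being made for keeping flags out of cat.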



