NEVER, EVER, NOT IN A MILLION YEARS use a signed int/char etc, unless you are 200% certain you're doing the right thing (that is, it's for something specific you need it)
You WILL have problems, period.
"Oh it's just a matter of knowing the C spec" then please go ahead as I grab the popcorn.
"For something specific you need it" meaning ... a negative number, like an array or memory address offset? I mean, sure, I agree that you should be doing anything sensitive to 2's complement behavior on unsigned quantities. And if you know the values are positive-definite, unsigned has much cleaner behavior. And I'd even entertain the argument that unsigned should have been the default in the language spec and that signed quantities are the ones that should have specially declared types.
But... your advice as written is just insane. They are real, and required routinely. You can't just apply this as a "for dummies" rule without essentially ruling out half of the code they'll need to write.
> For all other cases you would be using floating point or decimal numbers
If I'm trying to avoid mathematical anomalies, floating point is not what I would run to... "Equal" is a matter of degree, you have to be careful with anything near zero and you can't carelessly mix together numbers that are a few orders of magnitude different than each other.
signed usually promotes to unsigned in nice ways, such that if you really want to store -1 in your unsigned, it will just work. I've found using all unsigned types, with the occasional signed value jammed into them, is less error prone than mixing and matching types throughout the program. ymmv, and of course, here be the odd dragon or two.
My goodness, no, this is terrible advice. Never do this. Go fix all the code you wrote immediately. It is full of security vulnerabilities. I'm not kidding, this is so bad.
I'm pretty confused. I didn't know who Ted was, but a quick google search shows he is an OpenBSD dev and worked for Coverity. Coverity itself will flag this error. Now you are backing up that position. Historically, this exact thing has been the cause of many security vulnerabilities. It's especially precarious with the POSIX API due to many functions returning negative for error. I recall OpenSSH making sweeping changes to rid the code of signed ints being used for error, for this reason.
Can you explain why you would advocate this? Am I misunderstanding you, or missing something?
I replied to the other comment in this thread with an OpenBSD vulnerability caused by doing what is being advocated (I did choose OpenBSD to be funny).
Looks to me like the vulnerability you linked to demonstrates the exact opposite of what you think it demonstrates: The problem is that an unsigned value (a length) was being stored in a signed integer type, allowing it to pass a validation check by masquerading as a negative value.
Well, no, select() takes a signed value for the length (it is actually not a length, but the number of descriptors, later used to derive a length), and there is no changing that interface obviously. This is the source of the "negative value" in this example. The problem arises because internally, openbsd "jammed a negative value into an unsigned int", as Ted put it, and made it a very large positive value, leading to an overflow.
If the bounds check was performed after casting to unsigned, there would have been no problem. The vulnerability occurred because a bounds check was incorrectly performed on a signed value.
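To make the disagreement concrete, here is a minimal sketch of the two orderings being argued about. It is not the OpenBSD code; `TABLE_SIZE` and all three function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 256u   /* hypothetical capacity, not from any real kernel */

/* Check done on the signed value: a negative n slips through. */
int passes_signed_check(int n)
{
    return n <= (int)TABLE_SIZE;
}

/* Check done after conversion: a negative n becomes a huge unsigned
 * value and the same comparison rejects it. */
int passes_unsigned_check(int n)
{
    return (unsigned)n <= TABLE_SIZE;
}

/* What later code computes once n has been stored as unsigned. */
size_t bytes_needed(int n)
{
    assert(passes_unsigned_check(n));   /* precondition for this sketch */
    return (size_t)(unsigned)n * sizeof(int);
}
```

With `n == -1` the signed check passes and the unsigned check fails, which is the whole argument in two lines. An explicit `n < 0` test before the comparison would work just as well.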
Ah, thanks. From reading the summary, it seems that case would have been prevented by using signed integers throughout, but would also have been prevented by using unsigned integers throughout?
I've seen more problems from unsigned ints than signed ints (in particular, people doing things like comparing with n-1 in loop boundary conditions). There's a reason Java, C# etc. default to signed integers. Unsigned chars, I have no quibble (and Java, C#, use an unsigned byte here).
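For anyone who hasn't been bitten by the `n-1` thing, a small sketch of it follows. The function names are hypothetical; the point is only that `n - 1` wraps when `n` is an unsigned zero.

```c
#include <assert.h>
#include <stddef.h>

/* The bound that `for (i = 0; i < n - 1; i++)` would use. */
size_t naive_bound(size_t n)
{
    return n - 1;   /* wraps to the largest size_t when n == 0 */
}

/* Counts adjacent equal pairs. Writing the condition as `i + 1 < n`
 * avoids the subtraction, so an empty array means zero iterations. */
size_t count_pairs(const int *a, size_t n)
{
    size_t count = 0;
    assert(a != NULL || n == 0);
    for (size_t i = 0; i + 1 < n; i++) {
        if (a[i] == a[i + 1])
            count++;
    }
    return count;
}
```

Moving the arithmetic to the other side of the comparison is one common way around it; a signed index is the other, which is what the comment above is arguing for.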
Unsigned integer overflow has defined behaviour in C, while signed overflow doesn't. Is it really better to protect people from a simple logical error by exposing them to possible undefined behaviour?
With signed integers, you'll run into the same problem with comparing to n+1 at INT_MAX or n-1 at INT_MIN.
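Since signed overflow is undefined, the usual approach is to test before adding rather than after. A minimal sketch, with `add_int_checked` being a name I made up:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Adds a and b without ever evaluating an overflowing signed expression.
 * Returns false and leaves *out untouched if the sum is out of range. */
bool add_int_checked(int a, int b, int *out)
{
    assert(out != NULL);
    if (b > 0 && a > INT_MAX - b)
        return false;               /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;               /* would go below INT_MIN */
    *out = a + b;
    return true;
}
```

Both guards only subtract in the direction that cannot overflow, so the check itself stays inside defined behaviour.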
Signed arithmetic is generally only problematic when you hit the upper or lower bound. The right answer is almost never to use unsigned; instead, it's to use a wider signed type.
It's far too easy to get things wrong when you add unsigned integers into the mix; ever compare a size_t with a ptrdiff_t? Comes up all the time when you're working with resizable buffers, arrays, etc.
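A sketch of the `size_t` vs `ptrdiff_t` comparison, assuming a platform where the two types have the same width (true on the usual 32-bit and 64-bit targets). All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Naive form: d is implicitly converted to an unsigned type on common
 * platforms, so a negative offset compares as a huge value. */
bool naive_less(ptrdiff_t d, size_t n)
{
    return d < n;
}

/* Explicit form: reject negatives first, then compare as unsigned. */
bool offset_in_bounds(ptrdiff_t d, size_t n)
{
    return d >= 0 && (size_t)d < n;
}

int element_at(const int *buf, size_t n, ptrdiff_t d)
{
    assert(offset_in_bounds(d, n));   /* precondition */
    return buf[d];
}
```

Here `naive_less(-1, 10)` comes out false, which happens to be the safe answer for a bounds check, but flip the comparison to `d >= n` and the same conversion turns a negative offset into "past the end" for the wrong reason. gcc flags the naive form with `-Wsign-compare`.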
"Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up."
Unsigned is useful in a handful of situations: when on a 32-bit machine dealing with >2GB of address space; bit twiddling where you don't want any sign extension interference; and hardware / network / file protocols and formats where things are defined as unsigned quantities. But most of the time, it's more trouble than it's worth.
"The right answer is almost never to use unsigned; instead, it's to use a wider signed type."
Depends on the case of course, but yes, you'll rarely hit the limits in an int (in a char, all the time)
"It's far too easy to get things wrong when you add unsigned integers"
I disagree, it's very easy to get things wrong when dealing with signed
Why? For example, (x+1) < x is never true on unsigned ints. Now, think x may be a user provided value. See where this is going? Integer overflow exploit
Edit: stupid me, of course x+1 < x can be true on unsigned. But unsigned makes it easier (because you don't need to test for x < 0)
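To spell out the corrected version: on `unsigned` the wrap is well defined, so the after-the-fact test is legal C. A minimal sketch with made-up function names:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True only when x is UINT_MAX, because x + 1 wraps to 0. */
bool incr_wraps(unsigned x)
{
    return x + 1 < x;
}

/* Same idea for a general sum, e.g. two user-supplied lengths. */
bool add_wraps(unsigned a, unsigned b)
{
    return a + b < a;
}

unsigned checked_total(unsigned a, unsigned b)
{
    assert(!add_wraps(a, b));   /* caller must have validated the inputs */
    return a + b;
}
```

This only works as written for `unsigned int` and wider. Narrower unsigned types get promoted to `int` before the addition, so the same expression behaves differently there.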
"what unsigned arithmetic is"
This is computing 101, really (well, signed arithmetic as well). Then you have people who don't know what signed or unsigned is developing code. Sure, signed is more natural, but the limits are there, and then you end up with people who don't get why the sum of two positive numbers is a negative one.
As I said, if I'm not convincing please go ahead and use signed numbers as I get the popcorn ;)
Here's something you can try: resize a picture (a pure bitmap), with antialiasing, on a very slow machine (think 300MHz VIA x86). Difficulty: without libraries.
Java actually uses a signed byte type, which was I guess to keep a common theme along with all the other types, but in practice leads to a lot of b & 0xFF, upcasting to int etc when dealing with bytes coming from streams.
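The same masking shows up in C whenever a byte lands in a signed type. A rough C analogue of the Java idiom, assuming 8-bit bytes and two's complement; the function names are hypothetical:

```c
#include <assert.h>

/* Plain widening sign-extends: -1 stays -1. */
int naive_widen(signed char b)
{
    return b;
}

/* Masking after promotion recovers the 0..255 value of the byte. */
int byte_value(signed char b)
{
    int v = b & 0xFF;
    assert(v >= 0 && v <= 255);
    return v;
}
```

So a byte read as -1 comes back as 255 once masked, which is what you want when the data came off a stream.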
There's a certain aspect of "functions have domains, not just ranges" at work here as well -- e.g. - restricting the (math) tan() function to the domain of -90 to 90 degrees (exclusive), unless you really get off on watching it cycle over madness. If you are going to be playing around the edges of something, it behooves you to put some kind of pre-condition in with an assert or similar mechanism.
In fairness, I guess a function like this is a good example of why you should put in preconditions, as well as a good demonstration that "not all the world is a VAX" (nor MS C 7, nor GCC version N) :-)
Actually, a colleague recently convinced me to start using signed ints for, e.g., for loops instead of unsigned ints. His reasoning was that if you're overflowing with signed integers, you'll probably overflow with unsigned integers too (we work with very large numbers), but it's easier to notice if you have a negative number rather than silently wrapping around to a still-valid value.
99% of the time, you should not be using any sort of int for a loop variable. Loop variables are almost always array indexes, and array indexes should be size_t.
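The one place a `size_t` index tends to trip people is counting down, since `i >= 0` is always true. One common idiom, sketched with a hypothetical `find_last`:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the last element equal to key, or n if absent. */
size_t find_last(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);
    /* Test first, then decrement: the body sees n-1 down to 0, and the
     * loop exits when i is 0 before the body could see a wrapped value. */
    for (size_t i = n; i-- > 0; ) {
        if (a[i] == key)
            return i;
    }
    return n;
}
```

Returning `n` for "not found" keeps the result unsigned; returning -1 would drag a signed type back in.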
It's not just the C spec you've got to watch. I saw a wonderful bug last week where the author hadn't spotted that write(2) returned a ssize_t, not a size_t (or didn't appreciate the difference), so was gleefully checking an unsigned variable for a -1 error result.
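For reference, a minimal sketch of keeping the result of write(2) in an `ssize_t` until the error check is done. `write_all` is a name I made up, and the sketch deliberately skips EINTR retries.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes all len bytes to fd. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);   /* keep the result signed */
        if (n < 0)
            return -1;                   /* errno is set by write() */
        p += n;
        len -= (size_t)n;                /* safe: n is non-negative here */
    }
    return 0;
}
```

The cast to `size_t` happens only after the sign has been checked, which is the step the buggy code skipped.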
How did the bug manifest itself? You can store 0xffffffff(ffffffff) in a 32(64)-bit unsigned int, or a 32(64)-bit signed int. In the one case you'll get UINT_MAX, -1 in the other, but they should compare as equal. If you have -Wextra turned on, gcc will give a warning, though.
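A small sketch of that point, and of one way it can still go wrong, assuming `long long` is wider than `unsigned` (the case on mainstream platforms). The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Jam a signed return code into an unsigned variable. */
unsigned store_error(int rc)
{
    return (unsigned)rc;
}

/* Comparing against -1 converted to the same type still matches. */
int matches_minus_one(unsigned u)
{
    return u == (unsigned)-1;
}

/* Widening is where the value stops looking like -1. */
long long widen(unsigned u)
{
    assert(sizeof(long long) > sizeof(unsigned));  /* assumption of this sketch */
    return u;
}
```

So `store_error(-1)` equals `UINT_MAX` and still matches -1 at the same width, but after widening it is a large positive number. A `< 0` test on the unsigned value never fires either.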
Here's some sample C code tested on a 32-bit Linux system:
C is my absolute favorite language, and as such, I learned a long time ago to pay very close attention to compiler warnings and Valgrind memory errors.
No, you get room for one more negative int. x = x < 0 ? -x : x; is a common buggy abs(), e.g. when printing an integer. One should test if it's negative, perhaps print the '-', and then ensure it's negative for the rest of the per-digit code.
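A minimal sketch of the "keep it negative" approach for printing, assuming C99 division semantics (truncation toward zero). `format_int` is a hypothetical name, and the caller supplies a buffer large enough for the digits, the sign and the terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the decimal form of v into buf and returns its length.
 * Never negates a negative value, so INT_MIN is handled. */
size_t format_int(int v, char *buf)
{
    char tmp[32];
    size_t n = 0, len = 0;
    int neg = v < 0;

    assert(buf != NULL);
    if (!neg)
        v = -v;                 /* positive to negative always fits */
    do {
        tmp[n++] = (char)('0' - (v % 10));   /* v % 10 is in -9..0 */
        v /= 10;
    } while (v != 0);

    if (neg)
        buf[len++] = '-';
    while (n > 0)
        buf[len++] = tmp[--n];  /* digits were produced in reverse */
    buf[len] = '\0';
    return len;
}
```

Negating a non-negative `int` is always representable, while negating `INT_MIN` is not, so doing the per-digit work on the negative side avoids the overflow in the buggy abs().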