OMG Ponies (Aka Humanity: Epic Fail)

arohner · on Nov 3, 2009

With every passing year, I'm more and more convinced a language approaching Haskell's pedantic-ness is A Good Thing. A language where Joda is the most convenient option, where all numerical quantities have units, and where you have to explicitly opt in to do "unsafe" things. Naturally, the goal of the language is to perform as few unsafe operations as possible.

I think Lisp is a good example: most Lisps have a full numeric tower that automatically converts between ints and bignums and ratios as appropriate. You don't have to think about the floating-point representation of 0.3 unless you wrote (double 0.3) in your code somewhere. Correctness should be the default, because it's much easier to trade some correctness for performance. Going the other way is much harder, and most people don't need performance.

This goes a long way, but now I notice the biggest obstacle is "legacy systems". The only time I get in trouble with the Clojure numeric tower is when I want to store a Ratio in the DB.
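
To illustrate the numeric-tower point above, here is a minimal sketch in Python rather than Clojure (an assumption of this example, chosen only because the rest of the thread uses Python). `fractions.Fraction` stands in for a Lisp ratio, and Python's `int` already promotes to arbitrary precision on its own:

```python
from fractions import Fraction

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly,
# so the "obvious" equality fails.
float_sum_matches = (0.1 + 0.2 == 0.3)

# Exact rationals behave the way pencil-and-paper arithmetic does.
ratio_sum = Fraction(1, 10) + Fraction(2, 10)
ratio_sum_matches = (ratio_sum == Fraction(3, 10))

# Python ints grow into arbitrary precision automatically (no overflow).
big = 2 ** 64 + 1

# Converting the float 0.3 to a Fraction exposes the value actually stored,
# which is close to, but not equal to, 3/10.
stored = Fraction(0.3)

print(float_sum_matches, ratio_sum_matches, ratio_sum, big)
print(stored == Fraction(3, 10))
```

Storing such a value in a database still means choosing a representation (for example a numerator/denominator pair), which is the legacy-system obstacle mentioned above.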

RyanMcGreal · on Nov 3, 2009

>a language approaching Haskell's pedantic-ness

I believe the word you're looking for is "pedantry".

/pedant

LogicHoleFlaw · on Nov 3, 2009

Can we get significant digit support too? That would make my life a happier place.

edit: the joke about the horse misspelling "main()" by force of habit made me chuckle.

_csoo · on Nov 6, 2009

What you're saying is that people will finally understand that thinking about a problem and preventing bugs before you begin is better than trying to fight fires afterwards.

eru · on Nov 3, 2009

Yes, Haskell is not pedantic enough!

seldo · on Nov 3, 2009

"due to rainfall thousands of miles away, my unit test had moved Greenland into Argentina. Fail."

Argentina's 11-day notice for changing their DST rules wreaked all sorts of unexpected havoc. It's sort of awesome (small worlds are more fun) and also totally exasperating.

kylec · on Nov 3, 2009

The video, for those that missed the link in the second paragraph: http://vimeo.com/7403673

KevBurnsJr · on Nov 3, 2009

WAAAAY more entertaining w/ actual sock puppets.

blasdel · on Nov 3, 2009

Unicode has its own special line terminator character as well, just for kicks

<clippy>It looks like you're trying to refer to NEL, which is in Unicode because it was distinct in EBCDIC, and who doesn't want lossless round-trips? </clippy>

die_sekte · on Nov 3, 2009

Unicode also has separate Line Breaks and Paragraph Breaks somewhere around U+2000.
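
For reference, the code points being discussed are NEL (U+0085), LINE SEPARATOR (U+2028) and PARAGRAPH SEPARATOR (U+2029). A small Python 3 sketch of how they differ from a plain line feed; the sample text is made up for illustration:

```python
# Code points that Unicode-aware tools may treat as line boundaries.
NEL = "\u0085"  # NEXT LINE, carried over from EBCDIC
LS = "\u2028"   # LINE SEPARATOR
PS = "\u2029"   # PARAGRAPH SEPARATOR

text = "one" + NEL + "two" + LS + "three" + PS + "four\nfive"

# str.splitlines() recognises all of these boundaries...
unicode_lines = text.splitlines()

# ...while splitting on "\n" alone only sees the ASCII line feed.
ascii_lines = text.split("\n")

print(unicode_lines)
print(len(ascii_lines))
```

Code that counts lines by looking only for "\n" and code that uses `splitlines()` can therefore disagree about the same string.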

mumrah · on Nov 3, 2009

It's important to note that choice of language affects a lot of these issues. For example, the unicode issue in Python:

  >>> s = unicode('Les Misérables','utf-8')
  >>> print s[::-1]
  selbarésiM seL

blasdel · on Nov 3, 2009

Nope, Python fucks this up just the same, even in Python 3:

  >>> print u'Les Mise\u0301rables'[::-1]   # Python 2
  >>> print('Les Mise\u0301rables'[::-1])   # Python 3
  selbaŕesiM seL

Almost no implementation will fuck up LATIN SMALL LETTER E WITH ACUTE U+00E9, but nearly all programming languages will royally fuck up a COMBINING ACUTE ACCENT U+0301 even in much easier cases like string length. Almost all implementations that claim to be UTF-16 are actually UCS-2, and can't handle surrogates in the slightest.
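
A short Python 3 sketch of the two failure modes named above, string length with a combining accent and a character that needs a surrogate pair in UTF-16. The variable names are only for illustration:

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# Both render as the same glyph, but they are different code point sequences.
lengths = (len(precomposed), len(decomposed))
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

# Naive reversal puts the combining accent in front of the 'e',
# so it attaches to whatever character precedes it instead.
reversed_naive = "Mise\u0301"[::-1]

# A code point outside the Basic Multilingual Plane needs two UTF-16
# code units (a surrogate pair); UCS-2 has no way to represent it.
horse = "\U0001F40E"  # HORSE
utf16_units = len(horse.encode("utf-16-le")) // 2

print(lengths, same_after_nfc, utf16_units)
```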

simonw · on Nov 3, 2009

You can get the correct result by normalizing the string first:

    >>> import unicodedata
    >>> print unicodedata.normalize('NFC', u'Les Mise\u0301rables')[::-1]
    selbarésiM seL

fh · on Nov 3, 2009

That's in some sense even more broken: not all combining sequences have a precomposed form, so normalizing fixes some strings and not others, and the result is even less predictable. (I don't have an example ready.)
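
One possible example, as a Python 3 sketch: Unicode has no precomposed "q with acute", so NFC leaves that sequence as two code points and a naive reversal still breaks it. The helper `reverse_keeping_marks` is hypothetical and only keeps combining marks attached to their base character; it is not a full grapheme cluster implementation:

```python
import unicodedata

# NFC cannot compose this pair because no single code point exists for it.
no_precomposed = unicodedata.normalize("NFC", "q\u0301")


def reverse_keeping_marks(text):
    """Reverse text while keeping combining marks on their base character.

    Illustrative only: groups characters by canonical combining class and
    ignores the other grapheme cluster rules (emoji sequences, Hangul, ...).
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch   # attach the mark to the preceding base
        else:
            clusters.append(ch)  # start a new cluster
    return "".join(reversed(clusters))


print(len(no_precomposed))
print(reverse_keeping_marks("Les Mise\u0301rables"))
```

Reversing by cluster instead of by code point works whether or not the input was normalized first.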

stilist · on Nov 3, 2009

Here’s a good series of articles on how it works in Ruby: http://blog.grayproductions.net/categories/character_encodin...

Apparently Ruby 1.9 handles it well.

(I wouldn’t be surprised if this had previously been linked on HN.)

gord · on Nov 3, 2009

I think the s is round the wrong way.

gord · on Nov 3, 2009

and the grave should be an acute.

gord · on Nov 3, 2009

and the r.

for the love of Mike, down-modder, look at the text in a mirror.

nomoresecrets · on Nov 3, 2009

I saw this presentation at the London DevDays where Jon gave it.

I was quite far back, and when he walked on stage with the sock puppet on, I assumed it was one of those wrist support gloves that you use when you get RSI because you won't just stop typing all day long, even though you're harming yourself.

After all, I thought, it's Jon Skeet - he never stops typing, surely? :-)

gchpaco · on Nov 3, 2009

I'm embarrassed. I admit it, I forgot Turkish upcases 'i' to İ. But the core teaching, which is "if it involves internationalization or talking to humans find a library written by a legitimate expert and use it instead" is true gold.
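
A small Python 3 sketch of the Turkish case. Python's built-in case methods use the locale-independent Unicode mappings, so they never produce İ from 'i'. The helper `upper_tr` is hypothetical and deliberately incomplete; in line with the advice above, real code would lean on an expert-written library such as ICU:

```python
# Locale-independent default mappings.
default_upper = "i".upper()               # plain 'I', wrong for Turkish text
dotted_capital_lower = "\u0130".lower()   # 'i' + COMBINING DOT ABOVE (2 code points)

# Hypothetical helper for illustration only: map dotted 'i' to dotted
# capital 'İ' before applying the default upper-casing.
_TURKISH_UPPER = str.maketrans({"i": "\u0130"})


def upper_tr(text):
    """Upper-case text using the Turkish rule for 'i' (not a complete solution)."""
    return text.translate(_TURKISH_UPPER).upper()


print(default_upper, len(dotted_capital_lower), upper_tr("istanbul"))
```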

iron_ball · on Nov 3, 2009

Yeah, but who expects to need a special library for uppercasing some seemingly Latin text?

jmatt · on Nov 3, 2009

The article really was entertaining. These are all common problems for new C# programmers.

Doubles in .net are IEEE 754 64-bit (8-byte) double-precision floating-point numbers. I've written a few different comparers to deal with problems like the one he demoed. He should have typed it as a decimal if it was currency, or if he wanted it to hold an exact value rather than a double-precision approximation.
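
The same two remedies, sketched in Python rather than C# (an assumption of this example): a tolerance-based comparer for doubles, and a decimal type for currency. `nearly_equal` and its default tolerances are made up for illustration:

```python
import math
from decimal import Decimal


def nearly_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two floats with a tolerance instead of ==."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)


float_exact = (0.1 + 0.2 == 0.3)             # exact comparison on doubles
float_close = nearly_equal(0.1 + 0.2, 0.3)   # tolerance-based comparison

# For currency, build decimals from strings so no binary rounding sneaks in.
price = Decimal("0.10") + Decimal("0.20")
decimal_exact = (price == Decimal("0.30"))

print(float_exact, float_close, price, decimal_exact)
```

The right tolerances depend on the magnitudes involved, so the defaults here should not be taken as a recommendation.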

As for the reverse on the string: he reversed the chars, which in .net are 16-bit UTF-16 code units, not whole Unicode characters. If you want to play nice with Unicode in .net you need to work with text elements (for example via System.Globalization.StringInfo) rather than individual chars. A hugely common mistake for all of us who have done some internationalization in .net.

I remember all sorts of trouble with java.util.GregorianCalendar back in the day. Ugh.

I'm not defending .net or C#. I'm just not surprised he ran into these newbie problems. You guys should see the evilness that is Access, the inconsistencies there make this look like nothing. And it's a black art with almost no documentation anywhere.

barrkel · on Nov 3, 2009

Jon didn't run into most of these problems so much as answer others who did run into them on stackoverflow.com or the MS newsgroups.

WalkingDead · on Nov 4, 2009

> You guys should see the evilness that is Access, the inconsistencies there make this look like nothing. And it's a black art with almost no documentation anywhere.

Are you talking about MS Access? It's about 15 years old. There should be a lot of documentation. You need to look for old books from the last decade though.

nazgulnarsil · on Nov 3, 2009

he sums it up quite well at the end.

if you write a lot of code under a set of assumptions which then changes, you're in trouble.

I think the key question is: do my assumptions differ from my customers' assumptions, and if so, where?

kuda · on Nov 3, 2009

I've gotten to the point where I automatically dislike anything containing the phrase "Epic Fail."

brown9-2 · on Nov 3, 2009

Same here, but you should really look past that in this case.