
This looks interesting. But I was a bit disappointed by the amount of work needed to divide the job into subtasks (finding the characters, cropping them, etcetera). I was somehow hoping the ANN would take care of that as well.


Having a single ANN do everything is the best way to end up with a system that works wonders 95% of the time, but then you give it a simple 1+1 question and it answers 42.

It would also require way more neurons and a lot of processing power. Consider the curse of dimensionality: he's working with a vector of roughly 5000 dimensions, so you have to make things simpler for the machine at some point!


But I can also recognize digits by correlating them against a database of millions of test-characters. Recognizing perfectly cropped characters is not a hard problem. The only benefit of an ANN is the compactness of the representation. So, unfortunately, as a non-expert, I would say this is only a minor step in the direction of true artificial intelligence. The application looks quite fun and interesting though.
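For what it's worth, the "correlate against a database" approach described above can be sketched in a few lines. This is only a toy illustration, not anything from the article: the 3x3 bitmaps, the `TEMPLATES` list and the function names are all made up, and a real database would hold far more (and far larger) samples.

```python
# Hypothetical sketch: classify a cropped glyph by correlating it against
# labeled templates. All names and the toy 3x3 data are illustrative.
from math import sqrt

def correlation(a, b):
    """Pearson correlation between two equal-length pixel lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a flat image carries no shape information
    return cov / (norm_a * norm_b)

def classify(glyph, templates):
    """Return the label of the template that correlates best with glyph."""
    return max(templates, key=lambda item: correlation(glyph, item[1]))[0]

# Toy "database": one flattened 3x3 bitmap per label (assumed data).
TEMPLATES = [
    ("1", [0, 1, 0,  0, 1, 0,  0, 1, 0]),
    ("7", [1, 1, 1,  0, 0, 1,  0, 0, 1]),
    ("-", [0, 0, 0,  1, 1, 1,  0, 0, 0]),
]

# A "7" with one extra pixel of noise.
noisy_seven = [1, 1, 1,  0, 0, 1,  0, 1, 1]
print(classify(noisy_seven, TEMPLATES))
```

Note that this only works because the glyph is already cropped and aligned with the templates, and every lookup has to scan the whole database, which is where the compactness of a trained network comes in.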


That's because you're confusing ANNs with general/hard AI. They're not the same thing. ANNs are a tool that excels at classically difficult tasks (pattern matching, for example). But since the AI winters we've learned that, as good as they are at pattern matching, we can't rely on them to do all the work.

And I would like to disagree with you. A system with many small subsystems dedicated to specific tasks is not only simpler to develop, but also better from an engineering point of view.

To put it as a practical example: do you use the same "parts" of the brain to read a poem and to interpret a mathematical formula? If you did, you'd be quite bad at both. Your brain has specialized "parts" (not necessarily physical parts) to interpret different things correctly. Why should we not do the same with our AI systems?


The author probably could have used an ANN to segment the images as well, before feeding the segments into another NN to classify them. This has the benefit of re-using the character recognition network, which would better simulate what humans do, since we read new equations character by character into memory (unless we've seen so many that we begin chunking them like speed readers do with word groups).
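A rough sketch of that two-stage shape, with loud caveats: the segmenter below is a plain blank-column split, not an ANN, and `toy_classifier` is a lookup table standing in for the character recognition network. Every name and the tiny bitmap are invented for illustration. The point is only that the same classifier gets re-used on each segment in turn.

```python
# Hypothetical sketch of "segment first, then classify each piece".
# Stage 1 is a simple blank-column split (a stand-in for a learned
# segmenter); stage 2 is any per-glyph classifier passed in as a function.

def segment_columns(image):
    """Split a binary image (list of rows) into glyphs at blank columns."""
    width = len(image[0])
    # A column "has ink" if any row has a set pixel in it.
    ink = [any(row[x] for row in image) for x in range(width)]
    spans, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a glyph begins
        elif not has_ink and start is not None:
            spans.append((start, x))       # a glyph ends at a blank column
            start = None
    if start is not None:
        spans.append((start, width))       # glyph touching the right edge
    return [[row[a:b] for row in image] for a, b in spans]

def read_expression(image, classify_glyph):
    """Run the same classifier over every segment, left to right."""
    return "".join(classify_glyph(g) for g in segment_columns(image))

# Stand-in for the recognition network: exact-match lookup (assumed data).
KNOWN_GLYPHS = {
    ((1,), (1,), (1,)): "1",
    ((0, 0), (1, 1), (0, 0)): "-",
}

def toy_classifier(glyph):
    return KNOWN_GLYPHS.get(tuple(tuple(row) for row in glyph), "?")

# Three glyphs separated by blank columns: "1", "-", "1".
image = [
    [1, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 1],
]
print(read_expression(image, toy_classifier))
```

A blank-column split breaks as soon as characters touch or overlap, which is exactly the case where a learned segmenter would start to earn its keep.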



