Hacker News

> But the author just took pictures of food & expected a realistic response?

There are very popular apps on the App Store right now that are going viral among non-techie people that do exactly this, and they have no concept of how AI works. My wife was talking about one and I had to give her a reality check that the AI had no idea what ingredients were used to make the food. And she's a licensed nutritionalist.

Studies like this create something to point at for people who are confused and serve as a springboard for a conversation in the media.




That's true - I suppose I'm just disappointed that this study doesn't seem to include those apps in its analysis. Being able to point out that the top 100 calorie-counting apps on the App Store return similar results to simple frontier models would be of interest.

I think I'm just disappointed that this study doesn't go deep enough, and stays at a surface-level statistical analysis of frontier models.


I think it’s a very useful study specifically to debunk the apps that support this flow.

None of those apps have magic. They cannot do better than the frontier models.


To be fair these kinds of apps also existed before LLMs. They just used OpenCV or similar instead of the LLM APIs.

To be fair, my expectation is that those apps have done the prompt engineering, and schema, and tools (to query a nutrition database), etc... and although they're not 100% consistent, the margin of error should be narrow enough to barely matter, and they should do a bit better than a random ChatGPT chat session.

The problem isn't one that can be solved with prompts. If I gave a panel of food and nutrition experts (human or machine) a bunch of pictures of food, they still wouldn't be able to tell whether, e.g., a slice of cake was made with whole milk or skim.

The "pic of packaged food --> LLM --> nutrition DB call" pipeline is workable, but many users of these apps are using them for fresh prepared foods, which is just an unworkable problem without either an understanding of the preparation process or a bomb calorimeter.


The real benchmark should be comparing the amounts with a human guess. And as far as I know, with diabetes, if you are within 30% when guessing carbs then you should be fine.
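That 30% criterion is easy to make concrete (a hypothetical helper, not something from the study):

```python
def within_tolerance(estimate_g: float, actual_g: float, tol: float = 0.30) -> bool:
    """True if a carb estimate is within `tol` (default 30%) of the actual
    amount - the rough accuracy the comment suggests is workable for diabetes."""
    return abs(estimate_g - actual_g) <= tol * actual_g

# 50 g of carbs estimated as 60 g is off by 20%, inside the 30% band.
within_tolerance(60, 50)  # True
```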

Even simpler examples make the limitations obvious. Images can't distinguish Diet Coke from Coke.

diet coke in a glass, or in a can?

cuz it says in big bold letters DIET COKE on the side of one of those.

OCR in multimodal models is a thing, so I'd hope it could...


> licensed nutritionalist

Nutritionist?


Haha oops. English is hard...


