
You wouldn't download the lot, methinks. Not unless you have a big ole cluster to handle it.

The index is only 12GB and contains enough metadata that you can whittle it down to a subset, pull the comments and filter based on those, and ultimately produce a list of photo IDs to grab from the collection. That's a couple of days' work for a grad student; it's not even Big Data.
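
For what it's worth, the whittling-down could look roughly like the sketch below. It is only an illustration: the tab-separated layout, the column names (`photo_id`, `tags`), and the `fetch_comments` callback are all assumptions of mine, not the actual format of the index or any real API. The point is just that you stream the index once, filter cheaply on metadata first, and only then pay for the comment lookups.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def read_index(handle: Iterable[str]) -> Iterator[Row]:
    """Stream rows from a tab-separated index with a header line.

    The TSV layout is an assumption; adapt it to the real index format.
    """
    return csv.DictReader(handle, delimiter="\t")


def select_photo_ids(
    rows: Iterable[Row],
    keep_metadata: Callable[[Row], bool],
    fetch_comments: Callable[[str], List[str]],
    keep_comments: Callable[[List[str]], bool],
) -> Iterator[str]:
    """Yield photo IDs that pass the metadata filter, then the comment filter."""
    for row in rows:
        # Cheap filter first: uses only what is already in the index row.
        if not keep_metadata(row):
            continue
        # Expensive step: only done for rows that survived the metadata filter.
        comments = fetch_comments(row["photo_id"])  # hypothetical column name
        if keep_comments(comments):
            yield row["photo_id"]


# Tiny in-memory stand-ins for the index and the comment source.
sample_index = io.StringIO("photo_id\ttags\n1\tcat,pet\n2\tdog\n3\tcat\n")
sample_comments = {"1": ["great cat"], "3": ["blurry"]}

wanted = list(
    select_photo_ids(
        read_index(sample_index),
        keep_metadata=lambda r: "cat" in r["tags"].split(","),
        fetch_comments=lambda pid: sample_comments.get(pid, []),
        keep_comments=lambda cs: any("great" in c for c in cs),
    )
)
print(wanted)
```

Because everything is a generator, the 12GB index never has to sit in memory; you would swap the `io.StringIO` for an open file handle and write the resulting IDs out to a list for the download step.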


