
> Cheap writes, in an OLTP sense?

Sure, but HIBP is most certainly not an OLTP workload. It is updated infrequently as a batch process. The official mirror was last modified in July 2019!

Whenever a "new set" of millions of passwords is leaked, the HIBP maintainer(s) merge it with their existing data set of millions of passwords.

The "update" process is to download the new data set... and that's it. It's already sorted.

The only step I'm suggesting is to simply convert the pre-sorted HIBP SHA1 text file to a flat binary file. This takes a single linear pass over the file and requires only a tiny buffer in memory.
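
To make that concrete, here is a rough sketch of the conversion plus a lookup against the result. It assumes the text file has one `HEXHASH:COUNT` entry per line, sorted by hash, and it drops the count to keep records at a fixed 20 bytes. The names `convert` and `contains` are made up for illustration, not part of any HIBP tooling.

```python
import hashlib
import os

RECORD_SIZE = 20  # a raw SHA-1 digest is 20 bytes


def convert(text_path, bin_path):
    """Stream 'HEXHASH:COUNT' lines into fixed-width 20-byte records.

    Assumes the input is already sorted by hash; only one line is held
    in memory at a time.
    """
    with open(text_path, "r", encoding="ascii") as src, open(bin_path, "wb") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue
            hex_hash = line.split(":", 1)[0]  # discard the count (assumption)
            dst.write(bytes.fromhex(hex_hash))


def contains(bin_path, password):
    """Binary search the flat file for the SHA-1 of `password`."""
    target = hashlib.sha1(password.encode("utf-8")).digest()
    with open(bin_path, "rb") as f:
        lo, hi = 0, os.path.getsize(bin_path) // RECORD_SIZE
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * RECORD_SIZE)
            record = f.read(RECORD_SIZE)
            if record == target:
                return True
            if record < target:
                lo = mid + 1
            else:
                hi = mid
    return False
```

Sorted hex text and sorted raw bytes compare the same way, so the binary file stays in order without re-sorting. Each lookup is then a few dozen seeks into a file of fixed-width records, with no index to build or maintain.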


