
So, how do you do offset without using OFFSET?


Include the last id (which should be indexed) of the previous page in the next where filter.

https://use-the-index-luke.com/sql/partial-results/fetch-nex...
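The "seek" approach above can be sketched in a few lines. This is a minimal illustration using Python's stdlib sqlite3; the `posts` table and its columns are invented for the example:

```python
import sqlite3

# Assumed schema for illustration: an indexed integer primary key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts (id, title) VALUES (?, ?)",
                 [(i, f"post {i}") for i in range(1, 101)])

PAGE_SIZE = 10

def next_page(last_id):
    # Seek past the previous page via the indexed id column;
    # the database never touches rows before the cursor.
    return conn.execute(
        "SELECT id, title FROM posts WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, PAGE_SIZE),
    ).fetchall()

page1 = next_page(0)             # first page: ids 1..10
page2 = next_page(page1[-1][0])  # pass the last id forward: ids 11..20
```

The client only has to remember one value (the last id it saw) to fetch the next page.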


That's keyset pagination, not offset-based page scanning. Both have their place, but they're not the same. Keyset pagination handles updates between page loads much better, but it's generally more complicated to implement, since you're now tracking head/tail index cursors. Flat, boring OFFSET/LIMIT is amazingly simple for the happy lazy path, which is probably fine for most apps.
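For comparison, the flat OFFSET/LIMIT path is stateless: a page number alone identifies the slice. A minimal sketch, again using Python's stdlib sqlite3 with an invented `posts` table:

```python
import sqlite3

# Assumed schema for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts (id, title) VALUES (?, ?)",
                 [(i, f"post {i}") for i in range(1, 101)])

def page(page_num, page_size=10):
    # Stateless: the page number alone identifies the slice, but the
    # database still walks past page_num * page_size rows to find it.
    return conn.execute(
        "SELECT id, title FROM posts ORDER BY id LIMIT ? OFFSET ?",
        (page_size, page_num * page_size),
    ).fetchall()
```

No cursor state to carry around; an out-of-range page (say, page 215 of a 100-row table) just returns an empty list.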


Make a query whose parameters exclude the previous page results altogether. I learned about this from here: https://www.citusdata.com/blog/2016/03/30/five-ways-to-pagin...


If you need to iterate over all records, just do it. Why do you need offset?

Otherwise, using offset is usually fine, because users very rarely inspect page #2153. They're interested in page 1, sometimes page 2. LIMIT/OFFSET works fine for those cases, and it'll work for page 2153 for those who visit it once in a decade. Using ids makes the logic to track prev/next/page number incredibly complex, and you generally don't need it.


> If you need to iterate over all records, just do it?

Who is "you" here?

Usually what happens is that party A builds a REST API (or other connectionless protocol) for fetching a list of some entity-type; and they limit the number of items that can be fetched in one request (because they have a billion such entities, and they don't want to even try to imagine what kind of system would be necessary to generate and stream back a 5TB JSON response); which implies pagination, to get at "the rest of" the entities.

Then party B, a user of this API, decides that they want to suck party A's whole billion-entity database out through the straw of that REST API, by scraping their way through each page.

> it'll work for page 2153 for those who visit it once in a decade

To be looking at page 2153, the user probably first visited every page before it. Which means they didn't do one O(N) query (which would by itself be fine); rather, they made O(N) requests that each did an O(N) query, for O(N²) work in total.
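The scrape-everything case is exactly where a keyset cursor shines: one pass, linear total work. A sketch under assumed names (Python stdlib sqlite3; the `entities` table is invented):

```python
import sqlite3

# Assumed schema for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO entities (id, payload) VALUES (?, ?)",
                 [(i, f"e{i}") for i in range(1, 10001)])

def scan_all(batch_size=1000):
    # One keyset cursor walk: each batch costs O(batch_size) for the
    # database, so the whole scan is O(N). With OFFSET paging, fetching
    # page k first skips k * batch_size rows, which is what makes the
    # page-by-page scrape quadratic overall.
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM entities"
            " WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield from rows
        last_id = rows[-1][0]

total = sum(1 for _ in scan_all())
```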


I regularly change the 3 to a 2 in /new/345689 links when I'm bored with today's content.

> Using ids makes the logic to track prev/next/page number incredibly complex

On a public site, users may post so fast that this "next" shows part of the previous page again. Paging via ids is a must there.


> “next” can show some previous page

That is usually a non-issue; the cost in DB operations usually matters much more.

When people do actually care about full enumeration and uniqueness of the items they are querying, "pagination" itself tends to be too messy a concept.


> The cost in DB operations is usually much more relevant

As a result, users hit "next" repeatedly until a page looks like it contains new posts. That's multiple wasted requests.

Anyway, what exactly becomes messy?


What a "page" means when enough things appear between them that they push stuff a page away? Are those things reordered? Do people expect to see the new things before they get to the old?

A "page" is a very ill-defined thing that can only exist if your stuff is mostly static. Queries are very different on dynamic content.


You're overthinking (overquestioning) it. When a user hits "next", they want to see the next N posts from where they are, in the order they chose before; that's it.

Since there's still no evidence of a mess, I believe you're projecting a problem that isn't real.


Use a sorted index (e.g. a B-tree), and use those values to quickly find the start of the next page of results.

For good performance this also requires that your search criteria (if any) be compatible with this index.
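One way to make the search criteria index-compatible is a composite index that leads with the filter column and ends with the sort keys. A sketch with invented names, using Python's stdlib sqlite3 (row-value comparisons need SQLite >= 3.15):

```python
import sqlite3

# Assumed schema: filter on status, sort by (created_at, id).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY,"
             " status TEXT, created_at INTEGER)")
# The index leads with the filter column, then the sort keys, so one
# B-tree seek lands directly at the start of the next page.
conn.execute("CREATE INDEX posts_keyset ON posts (status, created_at, id)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(i, "live" if i % 2 else "draft", i // 3) for i in range(1, 31)])

def next_page(last_created, last_id, size=5):
    # The row-value comparison keeps the id tie-break correct when
    # many rows share the same created_at.
    return conn.execute(
        "SELECT created_at, id FROM posts"
        " WHERE status = 'live' AND (created_at, id) > (?, ?)"
        " ORDER BY created_at, id LIMIT ?",
        (last_created, last_id, size),
    ).fetchall()

page1 = next_page(-1, 0)       # cursor that sorts before everything
page2 = next_page(*page1[-1])  # resume from the last (created_at, id)
```

The cursor is now the pair (created_at, id) rather than a single id, but the client-side contract is the same: echo back the last values you saw.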




