> I worry that if the Postgres people do make that change, they'll find themselves hearing from a different set of kernel developers that they should have known direct IO doesn't work properly and they should be using buffered IO instead.
That definitely will happen. But the fact remains that at the moment you'll get considerably higher performance when expertly using O_DIRECT, and there's nothing on the horizon to change that.
> For example this message from ten years ago, and other strongly-worded views in that thread: https://lkml.org/lkml/2007/1/10/235
> In particular, I'd taken this bit as a suggestion that if people found problems with buffered IO then the right thing to do is to ask the kernel side to improve things, rather than switch:
I think partially that's just been overtaken by reality. A database is guaranteed to need its own buffer pool, and a) it's going to have more information about recency in there, b) the OS caching adds a good chunk of additional overhead. With buffered IO we (PostgreSQL) already had to add code to manage e.g. how much dirty data the OS caches. The only reason DIO isn't always going to be beneficial, after doing the necessary architectural improvements, is that the OS buffer pool is more adaptive in mixed-use / not so well tuned setups.
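For anyone wondering what "using O_DIRECT" means at the syscall level, here's a minimal Linux-only sketch in Python. It is not how PostgreSQL does it, just an illustration of bypassing the OS page cache for a read. The name `read_block_direct` and the 4096-byte alignment are assumptions for the example; the real alignment requirement depends on the device and filesystem, and some filesystems reject O_DIRECT entirely, hence the buffered fallback.

```python
import errno
import mmap
import os

ALIGN = 4096  # assumed alignment; the real requirement depends on device and filesystem


def read_block_direct(path, offset=0, length=ALIGN):
    """Read `length` bytes at `offset`, bypassing the OS page cache if possible.

    Returns (data, used_direct). Falls back to buffered IO when the
    filesystem rejects O_DIRECT with EINVAL.
    """
    if offset % ALIGN or length % ALIGN:
        raise ValueError("offset and length must be multiples of ALIGN")

    # Anonymous mmap memory is page-aligned, which satisfies the buffer
    # alignment that O_DIRECT typically requires.
    buf = mmap.mmap(-1, length)
    try:
        attempts = ((os.O_RDONLY | os.O_DIRECT, True), (os.O_RDONLY, False))
        for flags, direct in attempts:
            try:
                fd = os.open(path, flags)
            except OSError as e:
                if direct and e.errno == errno.EINVAL:
                    continue  # O_DIRECT not supported here, retry buffered
                raise
            try:
                n = os.preadv(fd, [buf], offset)
                return bytes(buf[:n]), direct
            except OSError as e:
                if direct and e.errno == errno.EINVAL:
                    continue  # alignment rejected, retry buffered
                raise
            finally:
                os.close(fd)
    finally:
        buf.close()
```

Calling `read_block_direct(path, offset=8192, length=4096)` returns the bytes plus a flag saying whether the direct path was actually taken. Note that this only shows the mechanics: with the OS cache out of the picture, all caching, readahead and write scheduling become the application's job, which is the "necessary architectural improvements" part.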