It looks like pgBackRest will likely continue: multiple companies are stepping up with sponsorships. I'm mentioning this in case anyone is making plans to move away; it's probably worth waiting a bit for things to settle.
I do think that as service providers we now have a new "attack vector" to be worried about. Up to now, having an API that deletes the whole volume, including backups, might have been acceptable, because users generally won't take such a destructive action via the API, or if they do, they likely understand the consequences, or at the very least don't complain if they did it without reading the docs carefully enough.
But now agents are overly eager to solve the problem and can be quite resourceful in finding an API to "start from a clean slate" to fix it.
> Up to now, having an API that deletes the whole volume, including backups, might have been acceptable
It was never acceptable. Major service providers figured this out a long time ago and added all sorts of guardrails long before LLMs. Other providers will learn from their own mistakes, or not.
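To make "guardrails" a bit more concrete, here is a minimal sketch of the kind of checks a delete endpoint could run before anything is destroyed. Everything in it is hypothetical and for illustration only (the `Volume` type, `request_delete`, `purge_if_due`, the 7-day grace period); it is not any particular provider's API. It assumes three guardrails: a deletion-protection flag that must be switched off first, a typed confirmation of the volume name, and a grace period during which the data and backups are left alone.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


class DeletionRefused(Exception):
    """Raised when a destructive request fails one of the guardrails."""


@dataclass
class Volume:
    # Hypothetical model of a volume and the backups stored with it.
    name: str
    backups: List[str] = field(default_factory=list)
    deletion_protection: bool = True
    pending_delete_at: Optional[datetime] = None


def request_delete(volume: Volume, confirm_name: str, now: datetime,
                   grace: timedelta = timedelta(days=7)) -> datetime:
    """Schedule a deletion instead of performing it immediately."""
    # Guardrail 1: protection has to be disabled in a separate, explicit step.
    if volume.deletion_protection:
        raise DeletionRefused("deletion protection is enabled")
    # Guardrail 2: the caller has to repeat the exact volume name.
    if confirm_name != volume.name:
        raise DeletionRefused("confirmation name does not match")
    # Guardrail 3: nothing is removed yet; only a purge time is recorded.
    volume.pending_delete_at = now + grace
    return volume.pending_delete_at


def purge_if_due(volume: Volume, now: datetime) -> bool:
    """Return True only once the grace period has fully elapsed."""
    return volume.pending_delete_at is not None and now >= volume.pending_delete_at
```

The point of the grace period is that an over-eager agent (or human) gets a window in which the deletion can still be cancelled, and the backups outlive the API call that asked for their removal.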
I agree and hope this is the case for anything serious enough. I also don't see this changing any time soon.
There are ways to give safe access to the data, at least read-only, that don't involve production risk and don't sacrifice privacy. For example, database branches with anonymization. Instead of accessing the prod/staging db, the agent creates a branch and has read/write access to that.
(disclaimer: I work at Xata, where we offer copy-on-write branches for Postgres, and the agent use-cases are the most popular right now)
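As a rough illustration of the anonymization step, here is a small sketch of deterministic pseudonymization applied to rows before an agent sees them. It is not how Xata or any other product implements it; `pseudonymize_email`, `anonymize_row`, the `email` column name, and the `example.invalid` domain are all assumptions made up for this example. It uses a keyed hash so that the same input always maps to the same fake value, which keeps joins and uniqueness constraints working on the branch.

```python
import hashlib
import hmac
from typing import Any, Dict, Iterable


def pseudonymize_email(email: str, secret: bytes) -> str:
    """Map an email to a stable fake address using a keyed hash (HMAC-SHA256)."""
    normalized = email.strip().lower().encode("utf-8")
    digest = hmac.new(secret, normalized, hashlib.sha256).hexdigest()
    # .invalid is a reserved TLD, so the fake address can never be delivered.
    return f"user_{digest[:12]}@example.invalid"


def anonymize_row(row: Dict[str, Any], secret: bytes,
                  email_columns: Iterable[str] = ("email",)) -> Dict[str, Any]:
    """Return a copy of the row with the listed email columns pseudonymized."""
    columns = set(email_columns)
    return {
        key: pseudonymize_email(value, secret)
        if key in columns and value is not None else value
        for key, value in row.items()
    }
```

Because the mapping depends on a secret key, someone with access to the branch cannot simply hash candidate emails to reverse it, as long as the key stays outside the branch.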
Looks interesting! Do you have ClickBench results or similar?
> Everything in core, no extensions. HTTP(S), S3 (anonymous public reads), Avro, Excel, Arrow, and SQLite read through the same core binary - no separate install/load step.
That is not so good for an embedded database, though; it raises security concerns.
I will follow this one for sure. There are a few more companies with the extremely ambitious goal of "a better AWS", and I am interested in the various strategies they take to approach that goal incrementally.
A service offering VMs for $20 is a long way from AWS, but I see how it makes sense as a first step. AWS also started with EC2, but in a completely different environment with no competition.
> You don’t need anything but vanilla pg and a supported file system to do it anymore; just clone the database using a template and a newish version of Postgres.
What I'm saying there is that if you run Postgres on top of a local ZFS volume, the child branches' Postgres instances need to be on the same server. So you are limited in how many branches you can have. One or two are fine, but if you want a branch per PR, that will likely not work.
If you separate the compute from storage via the network, this problem goes away.
ZFS snapshots can be transmitted over the network, with some diff-only and deduplication gains if the remote destination has an older instance of the same ZFS filesystem. It’s not perfect, and the worst case is still a full copy, but the tooling and efficiency wins for the ordinary case are battle-tested and capable.
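For anyone who hasn't used it, the transfer described above is usually a `zfs send` piped into `zfs receive` on the other host. Below is a small sketch that only builds the two commands (it does not run them). The dataset, snapshot, and host names are made up, and the `incremental_send_pipeline` and `as_shell` helpers are hypothetical; the `-i` flag is the standard way to send only the changes between two snapshots, and leaving it out gives the full-copy worst case.

```python
import shlex
from typing import List, Optional, Tuple


def incremental_send_pipeline(dataset: str, new_snap: str, remote_host: str,
                              remote_dataset: str,
                              base_snap: Optional[str] = None
                              ) -> Tuple[List[str], List[str]]:
    """Build the sender and receiver commands for replicating a snapshot."""
    send = ["zfs", "send"]
    if base_snap is not None:
        # Incremental stream: only blocks changed since base_snap are sent.
        send += ["-i", f"{dataset}@{base_snap}"]
    send.append(f"{dataset}@{new_snap}")
    # The receiving side applies the stream to its copy of the filesystem.
    recv = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    return send, recv


def as_shell(send: List[str], recv: List[str]) -> str:
    """Render the two commands as a single shell pipeline for display."""
    return " | ".join(shlex.join(cmd) for cmd in (send, recv))


send, recv = incremental_send_pipeline(
    "tank/pg", "tuesday", "backup-host", "backup/pg", base_snap="monday")
print(as_shell(send, recv))
# zfs send -i tank/pg@monday tank/pg@tuesday | ssh backup-host zfs receive backup/pg
```

The incremental form only works if the destination already holds the base snapshot, which is the "older instance of the same ZFS filesystem" condition mentioned above.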
Yes, for sure, and stuff like this is really useful when rebalancing storage nodes, for example.
My point is that for the use case of offering a Postgres service with CoW branching as a key feature, you can't really escape some form of separation of storage and compute.
Btw, I don't really want to talk too much about it yet, but our proprietary storage engine (Xatastor) is basically ZFS exposed over NVMe-oF. We'll announce it in a couple of weeks, and we'll have a detailed technical blog post then on the pros/cons.
You're still making the assumption in this comment: why does my 2nd (cloned) database need a separate postgres instance? One postgres server can host multiple databases.
Got it, yes, I've seen in the other comment that you're referring to the new Postgres 18 feature. If that works for you in local dev, so much the better :)
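For readers following along, here is a sketch of what the template-based clone looks like in SQL, wrapped in a small Python helper. I'm assuming the Postgres 18 feature in question is the `file_copy_method` setting, which, when set to `clone` on a file system with reflink/clone support, lets `CREATE DATABASE ... STRATEGY = FILE_COPY` make a copy-on-write copy instead of a full one. The helper names (`quote_ident`, `clone_database_sql`) and the database names are hypothetical, and the helper only builds the statement; it does not connect to a server.

```python
def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def clone_database_sql(new_db: str, template_db: str) -> str:
    """Build a CREATE DATABASE statement that copies template_db file by file.

    Assumption: the server has file_copy_method = clone configured, so the
    FILE_COPY strategy can use the file system's copy-on-write cloning.
    Without that setting the statement is still valid but does a regular copy.
    """
    return (
        f"CREATE DATABASE {quote_ident(new_db)} "
        f"TEMPLATE {quote_ident(template_db)} "
        "STRATEGY = FILE_COPY"
    )


print(clone_database_sql("pr_123", "app_template"))
# CREATE DATABASE "pr_123" TEMPLATE "app_template" STRATEGY = FILE_COPY
```

One thing to keep in mind is that Postgres refuses to copy a template database while other sessions are connected to it, and all the clones live in the same Postgres instance on the same machine, which is the trade-off being discussed in this thread.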