
We migrated some services to AKS because upper management thought it was a good deal to get so many credits, and now pods are randomly crashing and database nodes have random spikes in disk latency. What ran reliably on GCP became quite unpredictable.


Exact same story at my place. Upper management decided it's a good idea to build on Azure because Microsoft promised some benefits. Things that ran reliably on GCP now need active firefighting on Azure.


Interesting! We're using AKS with huge success so far, but lately our Pods are unresponsive and we get 503 Gateway Timeouts that we really can't trace down. And don't get me started on Azure Blob Tables...


In our case this was only a month ago, and now we're stuck because management thought it was a good idea to sign a hefty spend commitment.


In our case, we spent too much engineering time just putting up with Azure, with no good ROI. It took some time for upper management to realize Azure is shit and cut the cost.


Don't they have an SLA? You can break that open if they don't perform.


To what end? I've never seen an SLA which is clear cut enough to be worth pursuing if you want more than a free t-shirt.


> I've never seen an SLA which is clear cut enough to be worth pursuing if you want more than a free t-shirt.

I have, regularly. I am not sure what kind of business you are running but parties that rely on service providers for critical (primary business process driving) components routinely agree to SLAs with large penalties and the ability to open up an existing contract in case of non-performance. Obviously you would have to be willing to pay for such a service in the first place otherwise there is no point in setting up an SLA, this won't be cheap. But we're definitely not talking about 'free t-shirts' here, more about direct liability, per hour penalties and so on.


I'm thinking ISPs, colo, cloud.

By the time SLA thresholds are being breached you've been through months (or years) of pain. They're not strong enough or specific enough to save you from a bad provider. ymmv


Colo and cloud providers that provide real SLAs exist. But they're pricey because they tend to insure against breach of that SLA and they pass on the cost of that insurance. If you're a run-of-the-mill e-commerce company then it probably doesn't make much sense. But if you yourself are providing critical services to others and they have you by the short hairs in case you don't perform, you'd better make sure that you're not going to end up holding the bag.

One simple example: energy market services, 15 minute ahead and day ahead markets require participants to have the ability to perform or they will be penalized severely, to the point where they can lose that access, the damage of which could easily be in the 10's of millions to 100's of millions depending on their size. Asset owners and utilities both would be able to hit them hard if they do not perform, the asset owners for lost income and the utilities for both government penalties and possibly for outages and all associated costs. These are not the kind of contracts you enter into lightly.


Exactly what I was thinking. But then again, from what I've seen, the persons responsible for monitoring uptimes are often much further removed from the C suite in these "committed-spend" companies.


GCP is hard to beat on k8s stuff. Performance and stability are crazy good.

But it's not as famous as AWS, and it costs money. Hence moving away seems like a good idea :)

