Oh sure, we had many outages. More of them on the one service where we tried using load balancers, because the load balancers would take a one-hour break every 30 days (which is pretty shitty, but that was the load balancer available, unless we wanted to run a software load balancer, which didn't make any sense).
We didn't have many outages due to DNS, because we had fallback IPs to contact chat in our clients. Usage was down in the 24 hours after our domain was briefly hijacked (thanks Network Solutions), and I think we lost some usage when our DNS provider was DDoSed by 'angry gamers'. But when FB broke most of their load balancers, that was a much bigger outage. BGP-based outages broke everything, DNS and load balancers alike, so no wins there.
> We didn't have many outages due to DNS, because we had fallback IPs to contact chat in our clients.
Exactly! When you control the client, you don't even need DNS. Things are actually even more secure when you don't use it, since there's nothing to DDoS or hijack. When FB broke one set of LBs, the clients should have just routed to another set of LBs, by IP.
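To make the idea concrete, here's a rough sketch of what "fallback IPs in the client" could look like. This is not how any real client was built: the hostname, port, and addresses are made-up placeholders (documentation ranges), and the function names are mine. It tries whatever DNS returns first, then the baked-in list.

```python
import socket

# Hypothetical values for illustration only; a real client would ship its own.
CHAT_HOST = "chat.example.com"
CHAT_PORT = 443
FALLBACK_IPS = ["192.0.2.10", "198.51.100.10", "203.0.113.10"]


def candidate_addresses(host, fallback_ips, resolve=socket.gethostbyname_ex):
    """Return DNS answers first, then baked-in fallbacks, without duplicates."""
    try:
        _, _, resolved = resolve(host)
    except OSError:
        # DNS is down, hijacked into NXDOMAIN, or otherwise unusable.
        resolved = []
    seen = set()
    ordered = []
    for ip in list(resolved) + list(fallback_ips):
        if ip not in seen:
            seen.add(ip)
            ordered.append(ip)
    return ordered


def connect_with_fallback(host, port, fallback_ips,
                          resolve=socket.gethostbyname_ex,
                          connect=socket.create_connection,
                          timeout=3.0):
    """Try each candidate address in order and return the first connection."""
    last_error = None
    for ip in candidate_addresses(host, fallback_ips, resolve):
        try:
            return connect((ip, port), timeout=timeout)
        except OSError as exc:
            last_error = exc  # remember the failure, move to the next address
    raise ConnectionError(f"no reachable address for {host}") from last_error
```

Something like `connect_with_fallback(CHAT_HOST, CHAT_PORT, FALLBACK_IPS)` would be the call site. Note that this only protects against DNS failing outright; a hijacked domain that still resolves would be tried first, so a client that really distrusts DNS would put the baked-in list ahead of the resolved one or skip resolution entirely.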
FB likes to break everything all at once anyway... And health checking the load balancers wasn't working either. So DNS to regional balancers was sending people to the wrong place, and the anycast IPs might have worked if you were lucky, but you might have gotten a PoP that was broken.
The servers behind it were fine, if you could get to one. You could push broken DNS responses, I suppose, but it's harder than breaking a load balancer.