If anybody's curious about this "public beta hosting - HN bear hug" performance load test:
I just (sequentially) restarted all 4 servers behind my 2 load balancers - seems that by reaching 100 containers in each geography the provisioning threads may have died after many (unsuccessful) retries, so need to restart the thread pools.
That is one more error I need to handle (or increasing my limit in AWS).
Heya - just a note that I build these kinds of PaaS platforms for a living - if you need a hand building a super-massive cluster for lots of users, let me know!