Honest question: what kind of benchmark do others use for a 'reasonable' response time? Of course it fully depends on the use case (rendering a video in 500ms can be hard), but for user-facing stuff?
In my previous startup we tried to stay within 500ms.
Not saying this isn't a great improvement, but 3s still sounds quite long to me? (not saying it's easy to do quicker!)
There's a difference between requests that the user "expects" to take a long time and those that can never be fast enough. For example, POSTs such as credit card transactions and Wikipedia edits generally have lengthy forms prior to the request, so the user can tolerate a correspondingly lengthy response time. I prefer 2s as a target for anything like that, and rely on a queue to asynchronously process anything that takes longer.
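The respond-fast-then-enqueue pattern could be sketched roughly like this (a minimal in-process illustration with Python's standard library; real systems would use a proper job queue, and all names here are made up for the example):

```python
import queue
import threading
import time

# Hypothetical sketch: acknowledge the request within the latency
# budget and hand slow work to a background worker via a queue.
jobs: "queue.Queue" = queue.Queue()
results: list = []

def worker() -> None:
    while True:
        job = jobs.get()
        time.sleep(0.01)  # stand-in for slow processing (e.g. a charge)
        results.append({"id": job["id"], "status": "done"})
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_post(payload: dict) -> dict:
    # Enqueue the slow part; the response returns immediately.
    jobs.put(payload)
    return {"id": payload["id"], "status": "accepted"}

response = handle_post({"id": 42})  # fast: user sees "accepted" at once
jobs.join()  # only for this demo; a real server never blocks on the queue
```

The point is just that the user-visible response time is decoupled from the processing time: the POST returns "accepted" right away, and the queue absorbs however long the real work takes.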
For GET requests, particularly those reached by clicking a link from elsewhere on the site, faster is better... Luckily, many of these types of requests can leverage a cache.
It depends completely on the application. It's also usually best to focus on the tail end -- I've always found alerts on 95th or 99th percentile latency useful and easy to pick thresholds for (ask yourself: how slow can I tolerate this being for 1% or 5% of users?).
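A tail-latency check of that kind could look something like this (a toy sketch with made-up sample data and made-up thresholds, using a simple nearest-rank percentile; real monitoring stacks compute this for you):

```python
import math

def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: smallest sample >= pct% of the data."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds for one window.
latencies_ms = [120, 180, 200, 250, 300, 450, 500, 800, 1200, 3500]

p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

# "How slow can I tolerate this being for 5% (or 1%) of users?"
# The 1000ms / 3000ms thresholds here are arbitrary examples.
alert = p95 > 1000 or p99 > 3000
```

Note how the tail tells a different story than the average would: most requests here are well under a second, but the p95/p99 values expose the slow outliers that an average-based alert would hide.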