Hey HN,
I’ve been running a latency experiment on a few models I use often.
Here are the average response times so far:
• Google Gemini 2.5 Flash: 595 ms avg
• Anthropic Claude Opus 4.6: 1887 ms avg
• OpenAI GPT-4.1 Mini: 976 ms avg
Next, I’m planning quarterly deep-dive reports covering P99 latency, TTFT, regional performance, and cost analysis, plus potential weekly/monthly insight reports. Let me know what else could be improved. Thanks!
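For anyone curious how numbers like these can be collected, here is a minimal sketch of the measurement approach: time each request with a monotonic clock, then compute the average and P99 from the samples. The `fn` callable is a hypothetical stand-in for whatever makes the actual model request; this is not the exact harness used for the figures above.

```python
import statistics
import time

def time_call(fn, runs=20):
    """Time a callable (e.g. one API request) over several runs.

    `fn` is a placeholder for the real model request. Returns a list of
    per-run latencies in milliseconds, measured with a monotonic clock.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples

def summarize(latencies_ms):
    """Return (average, P99) of per-request latencies in milliseconds."""
    avg = statistics.mean(latencies_ms)
    # quantiles(n=100) yields 99 cut points; index 98 is the P99 boundary
    p99 = statistics.quantiles(latencies_ms, n=100)[98]
    return round(avg, 1), round(p99, 1)
```

A caveat worth noting when comparing averages like the ones above: means hide tail behavior, which is why the planned P99 numbers will be more telling for interactive use.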