Hey HN,
I’ve been running a latency experiment on a few models I use often.
Here are the average response times so far:
• Google Gemini 2.5 Flash: 595 ms avg
• Anthropic Claude Opus 4.6: 1887 ms avg
• OpenAI GPT-4.1 Mini: 976 ms avg
Next, I’m planning quarterly deep-dive reports covering P99 latency, TTFT, regional performance, and cost analysis, plus potential weekly/monthly insight reports. Let me know what else could be improved. Thanks!
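For anyone curious how numbers like these can be collected, here is a minimal sketch of the measurement approach: time each request with a monotonic clock, then compute the average and P99 from the samples. The `fn` callable is a hypothetical stand-in for whatever makes the actual model request; this is not the exact harness used for the figures above.

```python
import statistics
import time

def time_call(fn, runs=20):
    """Time a callable (e.g. one API request) over several runs.

    `fn` is a placeholder for the real model request. Returns a list of
    per-run latencies in milliseconds, measured with a monotonic clock.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples

def summarize(latencies_ms):
    """Return (average, P99) of per-request latencies in milliseconds."""
    avg = statistics.mean(latencies_ms)
    # quantiles(n=100) yields 99 cut points; index 98 is the P99 boundary
    p99 = statistics.quantiles(latencies_ms, n=100)[98]
    return round(avg, 1), round(p99, 1)
```

A caveat worth noting when comparing averages like the ones above: means hide tail behavior, which is why the planned P99 numbers will be more telling for interactive use.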