Scaling PostgreSQL to power 800 million ChatGPT users

Original URL: https://openai.com/index/scaling-postgresql/

Article Written: January 22, 2026

Added: March 19, 2026

Type: project

Summary

The article discusses how OpenAI has successfully scaled PostgreSQL to handle the demands of 800 million ChatGPT users, achieving millions of queries per second. It outlines the challenges faced during high write traffic, the optimizations implemented, and the architectural decisions made to maintain performance and reliability. Key strategies include offloading read traffic, optimizing queries, and managing workloads to prevent service degradation. The article also highlights the importance of connection pooling and caching to enhance database efficiency.

💭 Your Thoughts

Some learning notes:

- PostgreSQL scales well for read-heavy workloads, but challenges still arise during periods of high write traffic.
- Write-heavy workloads are offloaded to sharded systems such as Azure Cosmos DB.
- Requests are split into low-priority and high-priority tiers and routed to separate instances.
- PgBouncer sits in front of Postgres as a proxy layer to manage connections efficiently.
- When multiple requests miss on the same cache key, only one request acquires the lock, retrieves the data, and repopulates the cache.
- The primary must stream WAL to every replica; intermediate replicas relay WAL to downstream replicas.
- A small schema change, such as altering a column type, can trigger a full table rewrite.

Truly a lot of engineering optimisation work has been done. I still feel Postgres is not suitable for this kind of workload... but I guess the cost is relatively low with this solution.
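The tier-routing idea from the notes can be sketched in a few lines: tag each request with a priority and resolve it to a separate database endpoint. This is a minimal illustration, not OpenAI's actual setup; the DSNs and tier names here are placeholders I made up.

```python
# Hypothetical endpoints: high-priority traffic goes to the primary,
# low-priority traffic to a read replica pool.
DSN_BY_TIER = {
    "high": "postgresql://primary.internal:5432/app",
    "low": "postgresql://replica.internal:5432/app",
}

def dsn_for(request_tier: str) -> str:
    # Unknown tiers fall back to the low-priority pool, so a tagging bug
    # cannot accidentally flood the high-priority instance.
    return DSN_BY_TIER.get(request_tier, DSN_BY_TIER["low"])
```

The useful property is the failure mode: misrouted or untagged requests degrade only the low-priority tier, which is the one designed to absorb load.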
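The single-flight cache behavior from the notes (only one request repopulates on a shared miss) can be sketched with a per-key lock. This is an in-process toy, assuming a threaded app; a real deployment would likely use a distributed lock (e.g. Redis `SET NX`) instead. All names here are illustrative.

```python
import threading

class SingleFlightCache:
    """On concurrent misses for the same key, only one caller runs the
    loader; the others block on the key's lock and reuse the result."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._meta = threading.Lock()  # guards the lock table itself

    def _key_lock(self, key):
        with self._meta:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key, loader):
        if key in self._data:           # fast path: cache hit
            return self._data[key]
        with self._key_lock(key):
            # Re-check: another thread may have repopulated while we waited.
            if key not in self._data:
                self._data[key] = loader(key)
            return self._data[key]
```

The double-check after acquiring the lock is the crux: waiters that lose the race find the value already present and never call the loader, which is what protects the database from a thundering herd.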

Technologies Referenced