As a user of Heroku Postgres, I am getting out-of-memory errors, but my metrics still show memory in cached and free. I'd like to understand why that is.
Postgres out-of-memory (OOM) errors can be surprising, especially when there is still memory available in memory-cached. This happens because of a combination of VM settings and Postgres's preference for contiguous memory allocation. The effect is that smaller Postgres plans may run out of memory with half of their memory still in memory-cached. This does not mean that Postgres is not using this memory: Postgres is able to leverage the OS cache (the majority of memory-cached) to improve query performance. It does mean that Postgres is prevented from using this memory directly for things like creating new connections or running complex queries (which may use large work_mem allocations).
Understanding Postgres Memory Pressure and Out of Memory Errors
To monitor memory pressure on your server, you will want to track the amount of memory available to Postgres and other server processes. The two most important metrics for doing so, memory-free and memory-cached, come from the Postgres Metrics Logs.
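As a minimal sketch, these metrics can be pulled out of a log line with standard shell tools. The sample line below is illustrative, modeled on the `sample#metric=value` format that heroku-postgres log lines use; the exact fields and values will differ on your instance:

```shell
# Hypothetical heroku-postgres metrics log line (values are made up)
line='source=HEROKU_POSTGRESQL_VIOLET sample#memory-total=15664468kB sample#memory-free=255628kB sample#memory-cached=14213308kB'

# Extract a single metric value (in kB) by name
get_metric() {
  echo "$line" | grep -o "sample#$1=[0-9]*" | cut -d= -f2
}

echo "memory-free:   $(get_metric memory-free) kB"
echo "memory-cached: $(get_metric memory-cached) kB"
```

In practice you would feed real log output (e.g. from your log drain) into the same extraction.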
Server memory is a valuable resource, and in order not to waste it, an operating system prefers to use all of the memory available to it. It will use this memory either by granting it to a process that requests it, or by allocating it to memory-cached in order to improve overall performance. Because of this, it is not uncommon to see low numbers for the memory-free metric. These low numbers don't necessarily indicate that your server is low on memory, so to understand whether there is memory contention we must look at another metric, memory-cached.
memory-cached represents the amount of memory held in the operating system cache. It is common to see large portions of your memory-total being held by memory-cached. This is a feature, not a bug: it means that Postgres processes do not currently need additional memory, and the operating system is using it to optimize read and write performance.
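On a Linux box you can see the same effect in /proc/meminfo: MemFree looks alarmingly low while MemAvailable, which counts reclaimable cache, stays high. A small illustration using sample values (the numbers below are fabricated, not from a real instance):

```shell
# Sample /proc/meminfo excerpt with illustrative values (kB)
meminfo='MemTotal:       15664468 kB
MemFree:          255628 kB
MemAvailable:   14110000 kB
Cached:         13900000 kB'

# MemFree alone looks alarming; MemAvailable accounts for memory the
# kernel can reclaim from the cache when processes ask for it.
echo "$meminfo" | awk '/MemFree:|MemAvailable:/ {print $1, $2, "kB"}'
```

On a real server you would read `/proc/meminfo` directly instead of the embedded sample.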
In the event that Postgres needs additional memory, the operating system will begin to free cached memory. When this happens you should see an upward spike in the memory-free metric along with a similar downward spike in the memory-cached metric.
The operating system is not able to free up memory indefinitely, though. After certain limits are reached, the OS will no longer be able to move memory from the memory-cached bucket into the memory-free bucket. This is why you may see some of the smaller Postgres instances run out of memory with close to half of their memory still in the memory-cached bucket.
When this happens, your Postgres instance will begin reporting out-of-memory (OOM) errors. These errors, as you would expect, indicate that Postgres processes are no longer able to obtain the memory they are requesting from memory-free.
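These errors appear in the Postgres server logs as `ERROR: out of memory` lines, typically followed by a `DETAIL: Failed on request of size N` line. A quick way to count them with standard shell tools; the log excerpt below is fabricated for illustration:

```shell
# Fabricated Postgres log excerpt showing the typical OOM error shape
logs='2024-01-01 12:00:00 UTC ERROR:  out of memory
2024-01-01 12:00:00 UTC DETAIL:  Failed on request of size 8388608.'

# Count OOM occurrences (on Heroku you would pipe `heroku logs` output here)
echo "$logs" | grep -c "out of memory"
```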
If this happens frequently, you will want to consider upgrading your Postgres plan in order to take advantage of a larger amount of memory.
What is the best way to monitor memory pressure on an instance?
The best way to monitor memory pressure on a Postgres instance is to watch the memory-free and memory-cached metrics. As discussed above, when a Postgres process requests more memory, the operating system releases cached memory into free memory, where it can then be consumed by the requesting process. When the operating system is no longer able to release cached memory and only a very small amount of free memory remains, Postgres will throw "out of memory" errors. This can begin to happen when memory-cached drops below ~50% of the database's memory-total. On smaller plans this threshold may be skewed closer to 60%, and on larger plans closer to 40%.
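As a rough sketch, that ratio can be checked with simple shell arithmetic. The threshold here is the approximate ~50% figure discussed above, not an official limit, and the metric values are hypothetical:

```shell
# Hypothetical metric values in kB for a small plan
memory_total=8388608    # 8 GB
memory_cached=3984588

# Integer percentage of total memory still held in the OS cache
pct=$(( memory_cached * 100 / memory_total ))
echo "memory-cached is ${pct}% of memory-total"

# ~50% is the rough danger zone from the discussion above
if [ "$pct" -lt 50 ]; then
  echo "warning: approaching possible OOM territory"
fi
```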
Why is my Postgres server out of memory with memory still available in memory-cached?
Due to configuration on the server (overcommit_ratio) meant to protect Postgres from being killed unexpectedly, not all of the memory in memory-cached is available to be used by Postgres processes directly.
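For context, under Linux's strict overcommit mode (vm.overcommit_memory=2) the kernel caps total allocations at CommitLimit = swap + RAM × vm.overcommit_ratio / 100 (ignoring huge pages). A sketch of that arithmetic with illustrative numbers; the actual sysctl values on a given Heroku plan are not published here and will differ:

```shell
# Illustrative values for a hypothetical small instance (kB)
ram_kb=4194304          # 4 GB of RAM
swap_kb=0               # no swap
overcommit_ratio=100    # example vm.overcommit_ratio; real setting varies

# CommitLimit under vm.overcommit_memory=2 (huge pages ignored):
commit_limit_kb=$(( swap_kb + ram_kb * overcommit_ratio / 100 ))
echo "CommitLimit: ${commit_limit_kb} kB"
```

Allocations beyond this limit fail even if reclaimable cache remains, which is why a request for a large contiguous chunk can error out while memory-cached still looks healthy.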
This memory still benefits Postgres processes, though, due to the way Postgres relies on the operating system's page cache when its own shared buffers cache is full.