Welcome back to Serverhacks—a collection of tips, tricks, and troubleshooting guides for servers, networking, and system administration. I’m Corels from Emmanuel Corels Creatives, and today we’re tackling one of the most critical challenges for system administrators: Linux Out-Of-Memory (OOM) issues. When a server runs out of memory, processes can be killed abruptly, services may become unresponsive, and overall system stability can be compromised. In this guide, we’ll walk through a systematic approach to diagnose OOM issues, identify their causes, and implement solutions to stabilize your system.
Understanding OOM Issues
When your Linux server exhausts its available memory, the kernel’s OOM killer is triggered. This mechanism terminates processes to free up memory, typically targeting the processes that are consuming the most of it. Common causes include:
- Memory leaks in applications
- Misconfigured services consuming excessive cache
- Insufficient swap space
- Unexpected load spikes
A methodical approach helps you pinpoint the root cause and address the underlying problem before it leads to system instability.
Step 1: Identify OOM Events in Logs
Start by confirming that your system has encountered OOM issues.
- Check Kernel Logs:
sudo dmesg | grep -i -E 'oom|killed process'
This command searches for OOM-related messages in the kernel ring buffer. Look for lines indicating that a process was killed due to memory exhaustion.
- Review System Logs:
sudo journalctl -k | grep -i oom
This provides detailed kernel log entries related to OOM events. Identifying the exact time and affected processes can help narrow down the culprit.
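If you want timestamps next to those kernel messages so you can correlate OOM kills with load spikes or scheduled jobs, most modern distributions ship a dmesg that supports human-readable timestamps:
sudo dmesg -T | grep -i -E 'out of memory|killed process'
The -T flag prints wall-clock times instead of seconds since boot, which makes it much easier to line events up with journalctl output.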
Step 2: Monitor Memory Usage
Understanding your system’s memory consumption is crucial.
- Real-Time Monitoring with top or htop:
top -o %MEM
or install and run:
htop
These tools show you which processes are consuming the most memory. Look for any runaway processes or unexpected spikes.
- Check Free Memory and Swap:
free -m
This command displays memory and swap usage in megabytes. Note the values for total, used, free, and available memory. Low free memory and high swap usage are red flags.
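If a single snapshot isn’t enough and you want to watch memory change over time, either of these standard tools works (both ship with most distributions):
watch -n 5 free -m
vmstat 5
The first refreshes the free output every five seconds; the second prints a new line of memory, swap, and I/O statistics every five seconds, which is handy for spotting gradual swap growth.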
Step 3: Analyze Process Memory Consumption
Drill down into specific processes to see if they are leaking memory or consuming more than expected.
- Examine Detailed Process Information:
ps aux --sort=-%mem | head -n 10
This command lists the top 10 memory-consuming processes. Investigate any processes that seem abnormally high.
- Use pmap for Process Memory Maps: For a specific process ID (PID), run:
sudo pmap -x <PID> | tail -n 1
This shows the total memory used by the process. Consistent growth over time can indicate a memory leak.
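To confirm suspected growth, a rough sketch like the following samples one process’s resident memory at intervals; the PID, interval, and sample count are placeholders to adjust for your own investigation.
#!/bin/bash
# Log the resident set size (RSS, in KB) of one process every 60 seconds.
PID=1234        # replace with the PID you are investigating
for i in $(seq 1 10); do
  RSS=$(ps -o rss= -p "$PID")
  echo "$(date '+%F %T') PID ${PID} RSS: ${RSS} KB"
  sleep 60
done
If the logged RSS climbs steadily while the workload stays flat, a memory leak is a strong suspect.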
Step 4: Check and Configure Swap Space
Swap space acts as an overflow for RAM. Insufficient swap can lead to OOM conditions.
- View Swap Usage:
free -m
Check if swap is being used excessively or if it’s nearly full.
- Add or Increase Swap: If needed, create a swap file:
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
Then add it to /etc/fstab to make it persistent:
/swapfile none swap sw 0 0
Adjust the swap size based on your server’s needs.
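Once the swap file is in place, it is worth confirming that the kernel actually sees it:
sudo swapon --show
free -m
swapon --show should list /swapfile with its size, and the Swap row of free -m should reflect the new total.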
Step 5: Optimize Application and Service Configurations
Sometimes, tuning application settings can prevent OOM conditions.
- Configure PHP-FPM Memory Limits (if applicable): In your php.ini, set an appropriate memory limit:
memory_limit = 256M
Restart PHP-FPM (adjust the service name to match your installed PHP version):
sudo systemctl restart php7.4-fpm
- Optimize Java Applications: For Java-based applications, adjust the JVM heap size using the -Xms and -Xmx flags (see the example after this list).
- Tune Caching Services: If you use caching mechanisms like Redis or memcached, ensure their memory limits are configured appropriately to avoid consuming all available memory.
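As a concrete illustration of the Java and Redis points above, the snippets below are sketches only; the jar path, heap sizes, and memory cap are placeholder values you would tune to your workload and available RAM.
# Start a Java service with an explicit heap range (placeholder values)
java -Xms512m -Xmx1024m -jar /opt/myapp/app.jar

# In redis.conf, cap Redis memory and choose an eviction policy (example values)
maxmemory 256mb
maxmemory-policy allkeys-lru
Keeping -Xmx comfortably below the host’s physical RAM, with headroom left for the OS page cache, is usually what keeps the JVM out of the OOM killer’s sights.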
Step 6: Implement Resource Limits and Monitoring
To prevent runaway processes, you can set resource limits and employ proactive monitoring.
- Set ulimit for Processes: Edit /etc/security/limits.conf to limit the maximum memory or number of processes for users:
* soft rss 1048576
* hard rss 2097152
This sets soft and hard limits on the resident set size (in KB) for processes. Note that modern kernels no longer enforce the rss limit, so if you need an enforceable cap, limit virtual address space with the as item (also in KB) instead.
- Automate Monitoring with Scripts: Create a script that logs memory usage and alerts you when thresholds are exceeded. For example:
#!/bin/bash
# Current "used" memory in MB, compared against an alert threshold
MEM_USAGE=$(free -m | awk '/^Mem:/{print $3}')
THRESHOLD=800
if [ "$MEM_USAGE" -gt "$THRESHOLD" ]; then
  echo "Warning: High memory usage detected: ${MEM_USAGE}MB" | mail -s "Memory Alert" admin@yourdomain.com
fi
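Save it as memory_check.sh (the /path/to/ below is the same placeholder used in the cron entry) and make it executable:
chmod +x /path/to/memory_check.sh
This also assumes the mail command is available on the host; if it isn’t, install your distribution’s mailx/mailutils package or swap in another notification method.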
Schedule this script with cron:
crontab -e
Add:
*/5 * * * * /path/to/memory_check.sh
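If the process you’re worried about runs as a systemd service, a per-unit memory cap is another guard rail worth knowing about. This is a sketch only: myapp.service is a placeholder name, 1G is an example value, and MemoryMax requires a reasonably recent systemd with cgroup v2 support.
# Open a drop-in override for the service
sudo systemctl edit myapp.service

# Add these lines in the editor that opens, then save:
[Service]
MemoryMax=1G

# Apply the change
sudo systemctl daemon-reload
sudo systemctl restart myapp.service
With the cap in place, the kernel enforces the limit on that service alone, so a runaway process there is reclaimed or killed without dragging the whole host into an OOM situation.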
Final Thoughts
Diagnosing and resolving Linux OOM issues involves a comprehensive approach—reviewing logs, monitoring memory usage, analyzing process behavior, checking swap configuration, and optimizing application settings. By following these steps, you can identify the root causes of memory exhaustion and implement effective measures to keep your server stable.
Take your time to test each diagnostic step and adjust configurations based on your server’s specific workload. If you have any questions or need further assistance, feel free to reach out. Happy troubleshooting, and here’s to a smoothly running server environment!
Explained with clarity by
Corels – Admin, Emmanuel Corels Creatives