Memory Monitor
This guide explains how to use Async::Container::Supervisor::MemoryMonitor to detect and restart workers that exceed memory limits or develop memory leaks.
Overview
Long-running worker processes often accumulate memory over time, either through legitimate growth or memory leaks. Without intervention, workers can consume all available system memory, causing performance degradation or system crashes. The MemoryMonitor solves this by automatically detecting and restarting problematic workers before they impact system stability.
Use the MemoryMonitor when you need:
- Memory leak protection: Automatically restart workers that continuously accumulate memory.
- Resource limits: Enforce maximum memory usage per worker.
- System stability: Prevent runaway processes from exhausting system memory.
- Leak diagnosis: Capture memory samples when leaks are detected for debugging.
The monitor uses the memory-leak gem to track process memory usage over time, detecting abnormal growth patterns that indicate leaks.
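To make "abnormal growth patterns" concrete, here is a minimal sketch of one way growth could be tracked: count how often a process sets a new memory high-water mark by more than some threshold. The class name GrowthTracker and the threshold and increase_limit parameters are hypothetical, chosen for illustration; this is not the memory-leak gem's implementation.

```ruby
# Hypothetical sketch of growth tracking; not the memory-leak gem's code.
class GrowthTracker
  attr_reader :maximum_observed_size, :increase_count

  # threshold: growth in bytes above the previous maximum that counts as an increase.
  # increase_limit: number of such increases tolerated before a leak is flagged.
  def initialize(threshold: 10 * 1024 * 1024, increase_limit: 3)
    @threshold = threshold
    @increase_limit = increase_limit
    @maximum_observed_size = nil
    @increase_count = 0
  end

  # Record one memory measurement (in bytes) for the process.
  def sample!(current_size)
    if @maximum_observed_size.nil?
      # The first sample only establishes the baseline.
      @maximum_observed_size = current_size
    elsif current_size > @maximum_observed_size + @threshold
      # A new high-water mark well above the previous one.
      @maximum_observed_size = current_size
      @increase_count += 1
    end
    self
  end

  # True once the process has grown significantly too many times.
  def leaking?
    @increase_count >= @increase_limit
  end
end
```

A process whose memory fluctuates around a stable level never advances the counter, while one that keeps climbing eventually crosses the limit.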
Usage
Add a memory monitor to your supervisor service to automatically restart workers that exceed 500MB:
service "supervisor" do
  include Async::Container::Supervisor::Environment
  
  monitors do
    [
      Async::Container::Supervisor::MemoryMonitor.new(
        # Check worker memory every 10 seconds:
        interval: 10,
        # Restart workers exceeding 500MB:
        maximum_size_limit: 1024 * 1024 * 500
      )
    ]
  end
end
When a worker exceeds the limit:
- The monitor logs the leak detection.
- Optionally captures a memory sample for debugging.
- Sends SIGINT to gracefully shut down the worker.
- The container automatically spawns a replacement worker.
Configuration Options
The MemoryMonitor accepts the following options:
interval
The interval (in seconds) at which to check for memory leaks. Default: 10 seconds.
Async::Container::Supervisor::MemoryMonitor.new(interval: 30)
maximum_size_limit
The maximum memory size (in bytes) per process. When a process exceeds this limit, it will be restarted.
# 500MB limit
Async::Container::Supervisor::MemoryMonitor.new(maximum_size_limit: 1024 * 1024 * 500)
# 1GB limit
Async::Container::Supervisor::MemoryMonitor.new(maximum_size_limit: 1024 * 1024 * 1024)
total_size_limit
The total size limit (in bytes) for all monitored processes combined. If not specified, only per-process limits are enforced.
# Total limit of 2GB across all workers
Async::Container::Supervisor::MemoryMonitor.new(
  maximum_size_limit: 1024 * 1024 * 500, # 500MB per process
  total_size_limit: 1024 * 1024 * 1024 * 2 # 2GB total
)
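As a rough illustration of how the two limits could interact, the sketch below selects which processes to restart from a map of PID to memory size. The helper name processes_over_limit and the policy for the total limit (restart the largest remaining processes first) are assumptions made for this example; the monitor's actual selection logic may differ.

```ruby
MEGABYTE = 1024 * 1024

# Hypothetical helper: given {pid => size_in_bytes}, return the PIDs to restart.
# The "largest first" policy for the total limit is an assumption.
def processes_over_limit(sizes, maximum_size_limit: nil, total_size_limit: nil)
  # Per-process limit: any process above the maximum is selected.
  over = sizes.select { |_pid, size| maximum_size_limit && size > maximum_size_limit }.keys

  if total_size_limit
    remaining = sizes.reject { |pid, _size| over.include?(pid) }
    total = remaining.values.sum

    # Combined limit: select the largest processes until the total fits.
    remaining.sort_by { |_pid, size| -size }.each do |pid, size|
      break if total <= total_size_limit
      over << pid
      total -= size
    end
  end

  over
end
```

With only maximum_size_limit set, the combined usage is never checked, which matches the "only per-process limits are enforced" behavior described above.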
memory_sample
Options for capturing memory samples when a leak is detected. If nil, memory sampling is disabled.
Default: {duration: 30, timeout: 120}
# Customize memory sampling:
Async::Container::Supervisor::MemoryMonitor.new(
  memory_sample: {
    duration: 60, # Sample for 60 seconds
    timeout: 180 # Timeout after 180 seconds
  }
)
# Disable memory sampling:
Async::Container::Supervisor::MemoryMonitor.new(
  memory_sample: nil
)
Memory Leak Detection
When a memory leak is detected:
- The monitor logs the leak detection with process details.
- If memory_sample is configured, the monitor captures a memory sample from the worker.
- The monitor sends SIGINT to the worker to shut it down gracefully.
- The container automatically restarts the worker process.
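The sequence above can be sketched as a small function. The name handle_leak and its logger, sampler, and signal parameters are hypothetical stand-ins introduced for illustration, not the monitor's real interface; only Process.kill is a real Ruby call.

```ruby
# Hypothetical sketch of the leak-handling sequence; not the monitor's actual code.
# logger:  callable that receives a message string.
# sampler: optional callable that returns a memory report for a PID (nil disables sampling).
# signal:  callable that delivers a signal; defaults to Process.kill.
def handle_leak(pid, logger:, sampler: nil, signal: Process.method(:kill))
  # 1. Log the detection with process details.
  logger.call("Memory leak detected in process #{pid}")

  # 2. Capture a memory sample only when sampling is configured.
  if sampler
    report = sampler.call(pid)
    logger.call("Memory sample for process #{pid}: #{report}")
  end

  # 3. Ask the worker to shut down gracefully; the container then replaces it.
  signal.call(:INT, pid)
end
```

Passing the signal sender as a parameter keeps the sketch testable without terminating a real process.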
Memory Sampling
When a memory leak is detected and memory_sample is configured, the monitor requests a lightweight memory sample from the worker. This sample:
- Tracks allocations during the sampling period.
- Forces a garbage collection.
- Returns a JSON report showing retained objects.
The report includes:
- total_allocated: Total allocated memory and object count.
- total_retained: Total retained memory and count after GC.
- by_gem: Breakdown by gem/library.
- by_file: Breakdown by source file.
- by_location: Breakdown by specific file:line locations.
- by_class: Breakdown by object class.
- strings: String allocation analysis.
This is much more efficient than a full heap dump using ObjectSpace.dump_all.
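The general idea of "trace allocations, force a GC, report what survived" can be illustrated with Ruby's standard objspace library. The helper sample_retained_by_class below is hypothetical and much simpler than the real sampler: it only produces a by_class count, and it is not how the supervisor implements sampling.

```ruby
require "json"
require "objspace"

# Hypothetical sketch: count objects allocated during the block that survive a GC,
# grouped by class name, and return the result as a JSON string.
def sample_retained_by_class
  counts = Hash.new(0)

  ObjectSpace.trace_object_allocations do
    # Run the workload while allocation tracing is enabled.
    yield

    # Collect garbage so that only retained objects remain.
    GC.start

    ObjectSpace.each_object(Object) do |object|
      # Objects allocated before tracing started have no recorded generation.
      next if ObjectSpace.allocation_generation(object).nil?

      counts[object.class.name || "(anonymous)"] += 1
    end
  end

  JSON.generate(by_class: counts)
end
```

Counts for built-in classes such as String may include objects created by the sampling code itself, so treat them as approximate.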