Munin I/O latencies

When graphing a large number of hosts and probes, or sometimes just running on poorly performing disks, you may see munin driving up the load and your system struggling with I/O waits.

It’s not the first time I have had to deal with such problems. I’ve used tmpfs with hourly crontabs saving rrd data, SSD, RAID 0+1, … There are lots of ways to mitigate such troubles.
Today though, I decided to have another look on Google, and found out about rrdcached.

Switching to rrdcached on an existing munin setup is pretty straightforward: install the package, then give the service the proper options for use with munin. Then you’ll need to update your munin.conf, setting rrdcached_socket to the socket created by rrdcached. That’s it.
From there, you could consider updating munin jobs so that munin-html and munin-graph are not run every 5 minutes, which would drastically lower your I/O. Alternatively, you may mount your munin www directory using tmpfs: your rrd data remains either cached or written to disk, and the munin DocumentRoot is completely rewritten by the next munin-html/munin-graph job anyway, so nothing of value is lost when the tmpfs is emptied.
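
Once munin.conf points at the socket, it can be handy to check that rrdcached is actually answering on it. Here is a minimal Python sketch, not part of munin itself, that sends the daemon's STATS command over the UNIX socket and prints the counters. The socket path is an assumption: use whatever you set as rrdcached_socket. The reply format assumed here is a status line whose leading number gives the count of "Name: value" lines that follow, with a negative number signalling an error.

```python
import socket

# Assumed path: replace with the value of rrdcached_socket in munin.conf.
SOCKET_PATH = "/run/rrdcached.sock"


def parse_stats(response: str) -> dict:
    """Parse a STATS reply: one status line, then 'Name: value' lines."""
    lines = response.strip().splitlines()
    status, _, message = lines[0].partition(" ")
    count = int(status)
    if count < 0:
        # A negative status code means the daemon reported an error.
        raise RuntimeError(f"rrdcached error: {message}")
    stats = {}
    for line in lines[1:1 + count]:
        name, _, value = line.partition(":")
        stats[name.strip()] = int(value)
    return stats


def query_stats(path: str = SOCKET_PATH) -> dict:
    """Send STATS to rrdcached over its UNIX socket and return the counters."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(b"STATS\n")
        with sock.makefile("r") as reader:
            status_line = reader.readline()
            count = int(status_line.split(" ", 1)[0])
            body = [reader.readline() for _ in range(max(count, 0))]
    return parse_stats(status_line + "".join(body))


if __name__ == "__main__":
    for name, value in query_stats().items():
        print(f"{name}: {value}")
```

Run it as a user allowed to read and write the socket (typically the munin user or a member of the socket's group). Watching the queue length and written-updates counters over a few munin runs should show updates being batched rather than written on every poll.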

[Figure: Munin RRDCache – vmstat]

[Figure: Munin RRDCache – CPU usage]

[Figure: Munin RRDCache – load average]

[Figure: Munin RRDCache – processes priority]

[Figure: Munin RRDCache – diskstats]

[Figure: Munin RRDCache – disk latency]