Munin I/O latencies
When graphing a large number of hosts and probes, or sometimes just running on disks with poor performance, you may see munin loading your system and leaving it struggling with I/O waits.
It’s not the first time I’ve had to deal with such problems. I’ve used tmpfs with hourly cron jobs saving the RRD data, SSDs, RAID 0+1, … There are lots of ways to mitigate such troubles.
Today, though, I decided to have another look on Google, and found out about rrdcached.
Switching to rrdcached on an existing munin setup is pretty straightforward: install the package, then feed the service the proper options to use with munin. Then you’ll need to update your munin.conf, setting rrdcached_socket to the socket created by rrdcached. That’s it.
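To give an idea of what those options could look like, here is a rough sketch. The paths, the munin group and the timings are assumptions for a Debian-like layout, not necessarily what your distribution ships, so check them against your own dbdir and init scripts before using any of it.

```
# rrdcached command line (sketch; every path below is an assumption)
#   -s munin   group allowed to use the socket, must come before -l
#   -l         UNIX socket munin will talk to
#   -b / -B    base directory, and refuse to touch anything outside of it
#   -j         journal directory, replayed after a crash (must already exist)
#   -F         flush all pending updates on shutdown
#   -w 1800    write values to disk after 30 minutes
#   -z 1800    spread those writes over a random delay of up to 30 minutes
#   -f 3600    look for stale values to flush every hour
rrdcached -s munin -l unix:/var/run/rrdcached.sock \
    -b /var/lib/munin -B \
    -j /var/lib/munin/rrdcached-journal -F \
    -w 1800 -z 1800 -f 3600

# munin.conf: same socket path as given to -l above
rrdcached_socket /var/run/rrdcached.sock
```

The user running the munin jobs has to be able to read and write that socket, hence the -s option. With a long write timeout, the journal is what protects the updates still sitting in memory should the daemon die.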
From there, you could consider updating the munin jobs so that munin-html and munin-graph are not run every 5 minutes, which would drastically lower your I/O. Alternatively, you may mount your munin www directory on tmpfs: your RRD data stays safe, either cached by rrdcached or written to disk, and the munin DocumentRoot is completely rewritten on the next munin-html/munin-graph run anyway, so nothing is lost if the tmpfs is emptied.
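For the tmpfs option, an fstab entry along these lines should do. The mount point and the size are assumptions: use whatever htmldir points to in your munin.conf, and size it according to the number of graphs you generate.

```
# /etc/fstab (sketch; mount point and size are assumptions)
tmpfs  /var/cache/munin/www  tmpfs  rw,nosuid,nodev,size=256m,mode=0755  0  0
```

A freshly mounted tmpfs belongs to root, so either chown it to the munin user after mounting, or add the uid= and gid= mount options with the matching numeric IDs.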