Simple Fusion IO Monitor
July 10, 2013 - 12:00 pm
I had a powerful need to monitor a single Fusion IO card. This script is a simple check of media status, capacity reserves, blocks good and pages good, with an e-mail alert if any of them look wrong.
Adjust $confEmailTo and $confEmailFrom, and if needed the alert thresholds confCapReserveAlarm, confBlocksGoodAlarm and confPagesGoodAlarm.
Run on cron, run manually, run by intern, it's all the same to me.
#!/usr/bin/ruby
###############################################################################
# Copyright (c) 2013, Workhabit, Inc.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# * Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
#
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# * Neither the name of Workhabit, Inc., nor the names of its contributors
# may be used to endorse or promote products derived from this software
# without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
###############################################################################
# Authored by Gary Gogick (gary@workhabit.com)
#
# This is a simple script to monitor certain fio-status parameters and alert
# via e-mail if set thresholds are breached. Specifically, it checks the
# media status, capacity reserves, blocks good and pages good parameters
# of all ioDimm* sections of fio-status -fk.
#
# Configuration parameters are fairly straightforward; the conf*Alarm
# variables should be in the form of a float (eg, 25.0 for 25%)
#
# The following gems are required:
# inifile
# mail
require "rubygems"
require "inifile"
require "mail"
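### Configuration
$confEmailTo = "to@example.org" # alert recipient
$confEmailFrom = "root@server.example.org" # alert sender
# Thresholds are float percentages; an alert fires when a value drops below its threshold
confCapReserveAlarm = 75.0
confBlocksGoodAlarm = 75.0
confPagesGoodAlarm = 75.0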
### Helper functions
def sendmail(subj, msg)
  Mail.deliver do
    from $confEmailFrom
    to $confEmailTo
    subject subj
    body msg
  end
end
### Initial setup and fio probe
# Set message variables
host = `hostname`.strip
message = ""
# Generate fio status report
system 'fio-status -fk > /tmp/fio-status'
# Load report into IniFile
status = IniFile.load("/tmp/fio-status")
### Alert handling
# Loop through ioDimm sections
status.each_section do |i|
  # Matches the [ioDimm *] sections; additional sections or checks within this section should be easy to add.
  if i =~ /ioDimm/
    section = status[i]
    if section['media_status'] != 'Healthy'
      message += "Device #{i}: media_status has failed: #{section['media_status']} (vs Healthy)\r\n"
    end
    if section['capacity_reserves_percent'].to_f < confCapReserveAlarm
      message += "Device #{i}: capacity_reserves_percent is #{section['capacity_reserves_percent']} (vs #{confCapReserveAlarm})\r\n"
    end
    if section['blocks_good_percent'].to_f < confBlocksGoodAlarm
      message += "Device #{i}: blocks_good_percent is #{section['blocks_good_percent']} (vs #{confBlocksGoodAlarm})\r\n"
    end
    if section['pages_good_percent'].to_f < confPagesGoodAlarm
      message += "Device #{i}: pages_good_percent is #{section['pages_good_percent']} (vs #{confPagesGoodAlarm})\r\n"
    end
  end
end
# Send e-mail alert
if message != ""
  sendmail("FusionIO alert on #{host}", message)
end
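If you want to tinker with the thresholds without a Fusion IO card handy, the checks in the loop can be pulled out into a small function and run against a plain hash. This is only a rough sketch under a few assumptions: `check_section`, `THRESHOLDS` and `SAMPLE` are names I made up for illustration, and the sample section names and values are invented, not real fio-status output. It uses nothing beyond core Ruby, so no gems are needed.

```ruby
# Thresholds mirror the conf*Alarm values in the script above.
THRESHOLDS = {
  'capacity_reserves_percent' => 75.0,
  'blocks_good_percent'       => 75.0,
  'pages_good_percent'        => 75.0
}.freeze

# Hypothetical helper: returns an array of alert lines for one section.
# `section` is a Hash of string keys to string values, which is the
# shape the script reads out of IniFile.
def check_section(name, section, thresholds = THRESHOLDS)
  alerts = []
  status = section['media_status']
  if status != 'Healthy'
    alerts << "Device #{name}: media_status has failed: #{status} (vs Healthy)"
  end
  thresholds.each do |key, minimum|
    value = section[key].to_f # nil.to_f is 0.0, so a missing key also alerts
    alerts << "Device #{name}: #{key} is #{value} (vs #{minimum})" if value < minimum
  end
  alerts
end

# Invented sample data for illustration only.
SAMPLE = {
  'ioDimm 0' => {
    'media_status'              => 'Healthy',
    'capacity_reserves_percent' => '100.00',
    'blocks_good_percent'       => '99.80',
    'pages_good_percent'        => '99.90'
  },
  'ioDimm 1' => {
    'media_status'              => 'Healthy',
    'capacity_reserves_percent' => '62.50',
    'blocks_good_percent'       => '99.10',
    'pages_good_percent'        => '99.40'
  },
  'other' => {} # skipped, since the name does not contain ioDimm
}.freeze

# Keep only the ioDimm sections, then collect every alert line.
alerts = SAMPLE.select { |name, _| name =~ /ioDimm/ }
               .flat_map { |name, section| check_section(name, section) }

puts alerts
```

With the sample data above, only the second device trips an alarm, on capacity_reserves_percent. Keeping the thresholds in one hash means adding another percentage check is a one-line change, at the cost of being a little less explicit than the original if blocks.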