CPU Temperature Alert

Welcome to the home of Kul's 'CPU Temperature Alert' script

This page deals with CPU Temperature Alert on Cobalt/Sun RaQ3s and RaQ4s specifically (and possibly the RaQ550), though it is EASILY customisable to any Linux machine that has 'hardware' monitoring of the CPU temperature and whose kernel reports the temperature in a file / pseudo file (e.g. /proc/cpuinfo).
Tested/used on RaQ3s and RaQ4s.
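
As a rough illustration of the 'pseudo file' approach, the sketch below pulls a temperature reading out of a file with awk. The 'temperature : <number>' line format and the function name read_cpu_temp are assumptions made for illustration, not taken from the cpu_temp source; check what your own kernel actually reports before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print the first temperature reading found in a file.
# ASSUMPTION: the kernel exposes a line such as "temperature : 41.5" in the file.
read_cpu_temp() {
    file="${1:-/proc/cpuinfo}"
    awk -F: '
        tolower($1) ~ /^temperature/ {
            gsub(/[^0-9.]/, "", $2)   # drop everything except digits and dots
            print $2
            found = 1
            exit
        }
        END { if (!found) exit 1 }    # non-zero status when no reading exists
    ' "$file"
}
```

Calling read_cpu_temp with no argument reads /proc/cpuinfo; passing a file name makes it easy to try the function against a saved copy first.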

I wrote this script to alert me when the CPU temperature rises above a 'normal' level. This can (and did) indicate that the CPU fan had failed on my Sun/Cobalt RaQ3, and later that the PSU fan had failed too - Jinxed RaQ3 :). Detecting the dead PSU fan relied on noticing that one day the temperature rose by another 5 degrees C and stayed at least 5 degrees C above the previous week's temperatures for a comparable load. (No temperature sensor is installed in the RaQ3's chassis or PSU, so this is mostly guesswork, but with a history of previous temperatures it is not hard to spot abnormal trends.) I later used this script to warn me when my RaQ3 was around 30 seconds away from crashing, by alerting me when the temperature reached an unbearable 80 degrees C. Toasty! (80 degrees C is NOT a good temperature to run your CPU at, I'm told :-P)
My RaQ3's fans have now been replaced, and a healthy 33 degrees C is its daily minimum, with 45 being about as high as it gets most days, though a good set of md5sums and tar'ing a few files can push it a little above this (if I try hard). Previously, with two dead fans, 65 degrees C was its idle temperature, with 80-85 (sustained) and 90 (peak) being the 'Crash Zone' temperature range.


This small script needs to be installed via a shell (Telnet/SSH) and set to run frequently from 'crontab' (or similar). One of the IMPORTANT features of this script is that, unlike most similar scripts, it will not keep sending you e-mails/SMS every time the CPU is over a predetermined temperature. This script will *ONLY* e-mail/SMS you once in any set period (you define this period), so you won't be blasted with hundreds of SMS messages or e-mails should the temperature rise too high on your RaQ!

Source formats available: Plain Text (UNIX) | GZ (gzip compressed) | ZIP (Windows compressed format).

Installing

You can either download your chosen format directly to the server (the GZ file is recommended), or download it to your PC and FTP it up to a user account (this is a UNIX file and should be uploaded as BINARY, NOT ASCII), then move the script to somewhere that only 'root' can see / alter it.
I have a directory called /opt/bin that only 'root' can read; this example assumes the script lives there, so change the path as appropriate.
To download directly to the server, follow these instructions:

su -

{ENTER YOUR ROOT PASSWORD WHEN PROMPTED}

cd /opt/bin
wget http://scripts.akakul.co.uk/code/cpu_temp.gz
gunzip cpu_temp.gz
chown root:root cpu_temp
chmod 700 cpu_temp

Now you can either run it manually without any parameters, in which case it will display its usage (help text), or continue on to setting up 'crontab' to run it.
Edit root's crontab and insert this (or similar):

*/12 * * * * /opt/bin/cpu_temp -w 45.5 -p 360 -e your@email.address -l /home/cpu_info_log

This tells crontab to run the script every 12 minutes (the '*/12' in the minute field); change it to '*/5' to run it every 5 minutes instead.

  • -w 45.5 [REQUIRED] (45.5 is the temperature to trigger an alarm at; you can set this to any positive 'real' (floating point) or integer number)
  • -p 360 [REQUIRED] (360 is the number of minutes before re-alerting, i.e. 6 hours)
  • -e your@email.address [OPTIONAL] (your@email.address is the e-mail address to send the e-mails to in the event of an alert)
  • -l /home/cpu_info_log [OPTIONAL] (/home/cpu_info_log is the place to store all the results (history); leaving this out disables logging)
  • -s [OPTIONAL] (shows the parameters only; the CPU temperature processing is NOT run with this option. It is a validation option only.)
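
Because the -w value may be a decimal such as 45.5, a plain shell integer test is not enough to validate or compare it. The sketch below shows one way this could be handled with awk. The function names are hypothetical and not taken from the cpu_temp source, and whether the real script triggers at or only above the threshold is an assumption here.

```shell
#!/bin/sh
# Hypothetical helpers; not taken from the cpu_temp source.

# Succeeds if $1 is a positive integer or decimal number (e.g. 45 or 45.5).
is_positive_number() {
    case "$1" in
        ''|*[!0-9.]*|*.*.*|.) return 1 ;;   # empty, stray characters, two dots, lone dot
    esac
    awk -v n="$1" 'BEGIN { exit !(n + 0 > 0) }'
}

# Succeeds if the reading $1 is at or above the warning level $2.
# ASSUMPTION: the alarm fires when the reading equals the threshold as well.
temp_at_or_above() {
    awk -v t="$1" -v w="$2" 'BEGIN { exit !(t + 0 >= w + 0) }'
}
```

Adding 0 to each value forces awk to compare numerically, so a reading of 9.9 is correctly treated as below 45.5 rather than being compared as text.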

You can run this script manually as many times as you like; the only side effect is that the LOG (if any) will have extra entries in it.
You can check that the temperature trigger is working by setting it to a really low temperature (around the 30 mark).
A small file is touched (created / modified) each time the temperature trigger happens. It is stored in '/tmp/' and is called 'cpu_temp.panic'. This file is safe to delete at ANY time if you wish; all that will happen is that the next trigger event will re-create it! It contains nothing at all; it is JUST the modified date that is used to determine whether a new e-mail should or should not be sent (the -p [trigger period] option).
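
The marker-file idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: the function names are hypothetical, and it assumes a find that supports -mmin (GNU and BSD find both do).

```shell
#!/bin/sh
# Hypothetical sketch of "alert at most once per period" using a marker file.
# ASSUMPTION: find supports -mmin; function names are illustrative only.

# Succeeds if an alert may be sent: the marker is missing or older than $2 minutes.
alert_due() {
    marker="$1"
    period_minutes="$2"
    [ -e "$marker" ] || return 0
    # find prints the path only when the file was modified more than N minutes ago
    [ -n "$(find "$marker" -mmin +"$period_minutes" 2>/dev/null)" ]
}

# Record that an alert has just been sent by refreshing the marker's timestamp.
mark_alert_sent() {
    touch "$1"
}
```

A caller would check 'alert_due /tmp/cpu_temp.panic 360' before sending the e-mail and run 'mark_alert_sent /tmp/cpu_temp.panic' afterwards. Deleting the marker simply makes the next check report that an alert is due, which matches the behaviour described above.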

NOTE: The LOG file (if enabled) is checked on every 'execution' and rotated if required; ONLY ONE month's data will be kept in the log file. :-)
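
Trimming a log on every run could look something like the sketch below. The real log format is not documented here, so this ASSUMES each line starts with an ISO date (YYYY-MM-DD); the function name prune_log and the sample line are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of trimming a log to recent entries on every run.
# ASSUMPTION: each log line starts with an ISO date, e.g. "2004-03-17 12:05 41.5".

# Keep only the lines dated on or after the cutoff ($2, as YYYY-MM-DD).
prune_log() {
    log="$1"
    cutoff="$2"
    [ -f "$log" ] || return 0          # nothing to do if there is no log yet
    tmp="$log.tmp.$$"
    # ISO dates sort correctly as plain strings, so a string comparison is enough
    awk -v c="$cutoff" 'substr($0, 1, 10) >= c' "$log" > "$tmp" && mv "$tmp" "$log"
}
```

On a system with GNU date, the cutoff for "one month" could be produced with date -d '1 month ago' +%Y-%m-%d and passed as the second argument; other date implementations need a different invocation.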


 © akaKul.co.uk