Daemon - Version 6.0
This page provides just a simple link to download the source code and
differential patches (about 0.5MB tar/gzip file) that have
been used to create an "experimental version 6.0" of the Linux
Watchdog for use in Dundee with the following
updates/changes/improvements (depending on your point of view)
starting from the original 5.13 version:
- Works with the lm-sensors style of temperature monitoring
(NOTE: this changed 13-May-2013 to use 'temperature-sensor' in
place of 'temperature-device' in the config file).
- More robust parsing of the configuration file.
- Tests and cleans up its own PID file to prevent multiple
copies running and to properly cooperate with the "service
watchdog start|stop" operation.
- Ping code now reports ping-times when verbose mode is
- Running of multiple test scripts "in parallel" works correctly
(error codes always attributed to the script that caused them).
- Test scripts that return 255 (-1) (and 254 = (-2), etc) are
now properly interpreted as unconditional reboot (or hard reset,
- Fixed polling interval so now matches the "interval =" setting
in the configuration file.
- Has a retry timer so momentary faults are handled gracefully
(e.g. files go missing briefly during log-rotation), but
persistent faults are still acted upon.
- The "softboot" option is now deprecated; it now acts only to
disable the retry timer.
- [above now in GIT for 5.14+]
You can read the changelog here.
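The item above about test scripts returning 255 (-1) follows from how
exit statuses are reported: only the low 8 bits reach the parent
process, so 255 is what a signed reading treats as -1, and 254 as -2.
The sketch below only illustrates that arithmetic. The helper name
signed_status is made up for this example and is not part of the
watchdog code.

```shell
#!/bin/sh
# Illustration only: how an 8-bit exit status maps to the signed values
# mentioned above. signed_status is a hypothetical helper name.
signed_status() {
    status=$1
    if [ "$status" -gt 127 ]; then
        echo $((status - 256))
    else
        echo "$status"
    fi
}

# A stand-in for a test script that exits with status 255.
raw=0
sh -c 'exit 255' || raw=$?
echo "raw status: $raw, signed reading: $(signed_status "$raw")"
```

A test script written for portability should use "exit 255" rather
than "exit -1", since not every shell accepts a negative argument to
exit.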
- Uses the system 'umount' program (and if necessary 'swapoff'
program) to attempt to un-mount the file system in an orderly
manner (but with simple fall-back code if not possible).
Previously it used a lot of "borrowed code" that dated back
around 10 years and had not been updated with ext4-specific
- Now syncs the CMOS real-time clock to system time on reboot
using 'hwclock' program.
- File testing performed with process fork. This tests that
forking is OK, and also properly deals with access hanging on
- Can kill a process tree, not just the test/repair script
started by the watchdog.
- [above are still "work in progress" for integration as time
NOTE: This is not the official version, which is maintained by
Michael Meskes "meskes at debian dot org" and the GIT site is
A number of the above changes made in Dundee have been
incorporated into the official version, but some have not been done
NOTE: Check back periodically to get the archive file, as the "V6.0"
number mentioned here will not be updated until it is considered
'stable'. You did read the bit about this being experimental, didn't
This page is intended for folk who want to try this out, but before
doing so you should read the patch-log.txt file and take a look at
the source code to see it is doing what you want. The 'xref'
directory has two sub-directories that tell you which patches affect
which files, and which files have been changed by a list of patches.
WARNING: The information on this page, and in the linked source
code, is provided with absolutely no warranty at all so do not use
it for production systems, etc, without thorough testing. We accept
no liability for any errors or omissions!
To build the test version, grab the archive linked to from this
page, uncompress it, and then uncompress the watchdog-6.0.tar.gz
file from inside that. Change in to that directory (watchdog-6.0)
and then run the commands:
You need to have the appropriate tools installed. I think that running
"apt-get build-dep watchdog" on a Debian/Ubuntu machine (as root or
using sudo as appropriate) installs what you need. However, it
seems you may also need some extra tools:
apt-get install build-essential automake libtool
In addition you might want to install GIT and learn how to use it to
make a clone of the official version later, so that you can try code
To run that version, first stop any running copy of the official
package with "pkill watchdog" (as root or with sudo, but do not kill
with -9 or you will reboot by surprise if the hardware timer is in
use). Don't use the usual
"service watchdog stop" as that starts the wd_keepalive daemon in
Then run the compiled version with "./watchdog -c test.conf -v"
where test.conf is a local copy of your normal file edited to add
any special options (e.g. the temperature devices). The -v option
will result in lots of syslog messages about the temperature(s) and
any other checks, etc, but will help you verify it looks OK.
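The advice above about stopping the daemon with a plain "pkill
watchdog" and never with -9 can be sketched as a small helper that
sends SIGTERM and then waits for the process to go away. This is an
illustration only: stop_gracefully is a hypothetical name, and the
10 second default wait is an arbitrary assumption.

```shell
#!/bin/sh
# Sketch only: send SIGTERM (never SIGKILL) to a PID and wait for it to exit.
# stop_gracefully is a hypothetical helper, not part of the watchdog package.
stop_gracefully() {
    pid=$1
    max_wait=${2:-10}    # seconds to wait before giving up (assumed default)
    # Return 2 if the signal could not be sent (no such process, or no
    # permission to signal it).
    kill -TERM "$pid" 2>/dev/null || return 2
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "$max_wait" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example (as root), assuming exactly one watchdog process is running:
#   stop_gracefully "$(pgrep -x watchdog)" && ./watchdog -c test.conf -v
```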
See also the configuration
page and the command-line
options on this site.
Makefile Build Options
The Makefile created by the 5.13-era of git clone has the compile
CFLAGS = -g -O2
which enables debug symbols (-g) and normal levels of release code
optimisation (-O2). This results in a larger binary than the “as
shipped” version, but if you need to use gdb for debugging, there is
enough information to be usable. However, this is not optimal for
either release or debugging/development, so you might
want to manually edit the Makefile and change the above line to one
of the following choices. For release only:
CFLAGS = -O2 -Wall
If you keep the '-g' option for debug symbols you can use the
'strip' command to remove them later to make the binary smaller.
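As a rough sketch of the 'strip' approach, the helper below keeps the
unstripped binary for use with gdb and produces a stripped copy
alongside it. The helper name make_stripped_copy and the file name
watchdog.stripped are made up for this example.

```shell
#!/bin/sh
# Sketch only: keep the unstripped binary for gdb and make a stripped copy.
# make_stripped_copy is a hypothetical helper name.
make_stripped_copy() {
    src=$1
    dst=$2
    cp "$src" "$dst" && strip "$dst"
}

# Example, run from the directory holding the compiled binary:
#   make_stripped_copy ./watchdog ./watchdog.stripped
#   ls -l ./watchdog ./watchdog.stripped    # compare the two sizes
```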
For debugging/development:
CFLAGS = -g -O0 -DDEBUG -Wall
In this case the “-DDEBUG” option results in the fatal_error() call
in logmessage.c using an assert() call to cause a core-dump if such
an error message is output. This
is not recommended for production use or live testing! The
use of "-Wall" for picking up coding errors is also strongly
recommended; even if some of the warnings are pedantic, it is worth
taking a careful look at any problems it finds.
Generally to enable the core dump for analysis by gdb you need to
run the following bash command before you run the daemon:
ulimit -c unlimited
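The limit applies to the current shell and to anything started from
it, so it has to be set in the same shell that launches the daemon.
A minimal sketch for checking the current setting follows. The helper
name core_dump_status is made up for this example.

```shell
#!/bin/sh
# Sketch only: report whether core dumps are enabled in this shell.
# core_dump_status is a hypothetical helper name.
core_dump_status() {
    limit=$(ulimit -c)
    if [ "$limit" = "0" ]; then
        echo "disabled"
    else
        echo "enabled ($limit)"
    fi
}

core_dump_status
# After a crash, load the core file alongside the unstripped binary, e.g.:
#   gdb ./watchdog core
# The core file's name and location depend on the system's core_pattern
# setting, so "core" here is an assumption.
```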
Memory Checking - Electric Fence
In addition, you might be interested in checking for memory errors
to prevent leaks slowly using up system memory. One useful tool in
this respect is the Electric Fence library, which replaces the normal
glibc memory allocation and freeing routines with ones that use the
hardware virtual memory manager to enforce checking. Make sure you
install this with the command:
apt-get install electric-fence
Then edit the Makefile to change the "LIBS" line to add "-lefence":
LIBS = -lrt -lefence
If the program makes a mistake with access to dynamically allocated
memory then it will simply core-dump. So if testing with this option
you should disable the use of the hardware watchdog (by editing the
config file) or use the --no-action command-line option to stop it
from opening /dev/watchdog even when configured to do so.
Again, this is not for production
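By default Electric Fence catches accesses past the end of an
allocated block. Setting the EF_PROTECT_BELOW environment variable
makes it catch accesses before the start instead, so a fuller check
needs two runs. The sketch below shows one way to do that. The helper
name run_efence_passes is made up for this example, and the combination
of watchdog options in the usage comment is an assumption based on the
options mentioned on this page.

```shell
#!/bin/sh
# Sketch only: run a binary linked with -lefence twice, once with the default
# overrun detection and once with EF_PROTECT_BELOW=1 to catch underruns.
# run_efence_passes is a hypothetical helper name.
run_efence_passes() {
    "$@" || return 1                       # pass 1: access past the end of a block
    EF_PROTECT_BELOW=1 "$@" || return 1    # pass 2: access before the start
}

# Example, using options mentioned on this page (untested combination):
#   run_efence_passes ./watchdog -c test.conf --no-action --foreground --loop-exit 5
```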
Memory Checking - Valgrind
Another tool for solving memory access problems is 'valgrind' which
runs the program under test in what is essentially a virtual
machine. An example of using valgrind is given here: http://www.cprogramming.com/debugging/valgrind.html
When I ran it I used this command:
valgrind --tool=memcheck --leak-check=full --show-reachable=yes \
./watchdog --config-file ../watchdog.conf --verbose --foreground \
--force --loop-exit 5
One problem with this is you can't then easily send a signal to the
watchdog as it is being tested to stop it normally. To deal with
this type of test the '--loop-exit' command line option was
added, with the above example running for 5 intervals.
Last Updated on 28-Mar-2016 by
Copyright (c) 2014-16 by Paul S. Crawford. All rights reserved.
Absolutely no warranty, use this information at your own risk.