Welcome to my home page which is under construction. The areas of interest on this page are:

Recent and Current Projects

These projects are the ones people seem to have the most interest in or are being actively used.

Laying waste to waste

This is a talk I gave to Back Bay LISA about process improvement and reducing waste in IT.

Waste is part of every process. As with entropy, waste increases unless steps are taken to reduce it.

This talk covers the basic ideas and some background behind Lean and Six Sigma (LSS), discusses their origins, and then walks through some examples from IT while introducing the tools and techniques.

For further details on the talk, see the Back Bay LISA website for the announcement for Jun 11, 2014.

The slide deck is available in odp (Libre/Open Office Impress) format and in pdf format. A recording of the presentation can be seen on YouTube.

A System Administration Template for the Roundup Issue Tracker

This is a tracker for the Roundup Issue Tracker. This tracker requires version 1.6.0 or newer of Roundup. You can:

I originally wrote this in my spare time, intending to put it in place at the company where I was contracting. Sadly, the contract ended before I could place it into production. It is released under the GPL V2 license and is free to use.

If you are not familiar with Roundup, it is an issue-tracking system that is simple to install and use, with command-line, web, REST, XMLRPC, and e-mail interfaces. It is based on the winning design from Ka-Ping Yee in the Software Carpentry "Track" design competition.

Roundup's homepage is at https://roundup-tracker.org/. Roundup has been deployed for:

Sysadmin tracker intro

The sysadmin tracker for Roundup is designed to handle issues generated in a system administration or help desk environment.

It has more features than the classic tracker. Some of the functions were inspired by equivalent functionality in Request Tracker (RT), while others are things I have wanted in various trackers (req, reqng, queuemh, RT1, Clearquest, Remedy, Jira, Cherwell) through the years.

Features

Compared to the classic tracker, this tracker has:

History of the tracker

As mentioned above, this was originally developed many years ago but never deployed. The original full documentation (in html) is located here.

The tarball of the original code developed for version 0.7.0 of Roundup is located here. Note that this is probably only of historic interest, as it will not work with a current version of Roundup. You would be better off downloading the current copy from the fossil repository.

Configuration management with Config/DACS

From 2005 to 2008, "Config" was extensively modified and updated. It is now named DACS - Distribution and Configuration System. In DACS, Subversion replaces CVS as the source code control system, and it has greatly enhanced features compared to the original Config. You should visit its home page at http://www.cs.umb.edu/~rouilj/DACS.

The Config/DACS system started from a paper I wrote for LISA in 1994 with the assistance of Rick Martin. It was titled

Config: A Mechanism for Installing and Tracking System Configurations
It integrated: which I still claim are the 4 essential components of any configuration management system.

While the original software to implement Config can be downloaded from http://www.cs.umb.edu/~rouilj/config/config_tools-1.0.tgz, you should be using the newer DACS system.

Real time log analysis using SEC - the simple event correlator

In November 2004 I presented a paper at the USENIX LISA conference titled:
Real-time log file analysis using the Simple Event Correlator (SEC)

The main page, with links to the paper as published, the original longer author's cut, example rulesets, and slides, is located here.

I also created a coursebook for a class on SEC that I taught for the LISA 2009 conference. The coursebook is based on tiddlywiki and provides:

in a single HTML file with search capability.

You can get a stripped-down version that I used for a BBLISA presentation from http://www.cs.umb.edu/~rouilj/classes/SEC_bblisa1/SEC_tw.html.

Nagios system monitoring software.

I have two published changes for the Nagios network/application monitoring system:

I also have some unpublished work that makes tkwatcher output Nagios passive service external commands, so that tkwatcher becomes a framework for building plugins for Nagios.
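For readers unfamiliar with passive checks: a result is handed to Nagios by writing one line to its external command file. The sketch below only illustrates that line format. It is not tkwatcher's code, and the host name, service description, and command file path in the example are assumptions made up for illustration.

```shell
#!/bin/sh
# Format a Nagios passive service check result line.
# Arguments: host service return_code plugin_output [timestamp]
# return_code follows the plugin convention:
#   0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN
passive_result() {
    ts=${5:-$(date +%s)}    # default to the current epoch time
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$ts" "$1" "$2" "$3" "$4"
}

# Example (hypothetical host, service, and command file location):
# passive_result web01 "Disk /var" 2 "DISK CRITICAL - /var at 97%" \
#     >> /usr/local/nagios/var/rw/nagios.cmd
```

The optional fifth argument exists only so the timestamp can be fixed when testing the output.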

Combining SEC and Nagios.

I wrote a patch for Nagios to allow it to use SEC - the simple event correlator for finer-grained correlation. Nagios has dependencies; however, they are limited to using the exit codes for decisions and don't take the type of error generated into account. Also, Nagios's flapping service detection never quite worked for me. I gave a work-in-progress presentation at LISA 2006. A PDF of the slides and notes is available.

Some use cases of this integration are:

The patches have been released. The release is still labeled beta because the plugin is missing some functionality I want to add before a final release. The beta testers didn't turn up any major performance problems or other disasters in the code, so I am releasing it to a wider audience. I have been running it for 6+ months without any issues with Nagios 2.5 and 2.9.

In addition, I did a presentation for Back Bay LISA on January 10, 2006. This covered Nagios, SEC, and their integration. The PDF of my slides and notes is available for reading. The other documentation and the unpacked distribution for browsing, including the manual/release notes, are also available.

Nagios patch for webinject.

I made two sets of patches to webinject (http://www.webinject.org/) to better support Nagios. My patches and descriptive text are located here.

TkWatcher

If you are looking for TkWatcher, you have come to the right place. Just click here for its homepage. TkWatcher is a tcl program that allows monitoring and analysis of program output. It can use any tcl-based shell, including tclsh, expect, wish, or scotty. It runs with tcl-7.3 and tcl-8.0 and should work with future versions.

By default it reports problems by email, but it can also send problem reports to a file or to standard output. The reports can be human or program readable. In addition to these reporting modes, it can log errors using external programs to permit syslogging, paging, or other real-time notification of errors.

Benefits over Watcher

It was inspired by the program watcher by Kenneth Ingham (not "Inghman", as the docs have mistyped it these many years; I hope to fix the typo in version 1.5, sorry Mr. Ingham), but adds features that I found lacking in the original watcher. Among those features are the ability to:

It can Monitor

This tool has been used to:

It lets you parse any line of command output, extract values from the line, and perform tests on those values.
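The idea of "parse a line, extract a value, test it" can be illustrated outside TkWatcher with a few lines of shell. This is a rough sketch only: it is not TkWatcher's Tcl-based configuration syntax, and the function name and the 90 percent threshold are assumptions chosen for the example.

```shell
#!/bin/sh
# Read `df -P`-style lines on stdin and report file systems whose
# usage percentage exceeds a limit (first argument, default 90).
check_df() {
    limit=${1:-90}
    # Skip the header line. Field 5 is the capacity (e.g. "97%"),
    # field 6 is the mount point.
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)                 # extract the numeric value
        if (pct + 0 > limit)              # test the extracted value
            printf "WARNING %s at %s%%\n", $6, pct
    }'
}

# Typical use:
# df -P | check_df 90
```

Reading from standard input keeps the parsing separate from the command being monitored, so the same test can be run against saved output.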

Prior Projects

These are projects that are complete and available for people to use.

Personal LOgging Device modifications.

This tool was written by Hal Pomeranz and presented at LISA 93. The abstract reads:
PLOD (the Personal LOgging Device) is a simple text interface which allows System Administrators (and others) to keep a record of the work they do from day to day. The program was developed in Perl with device independence, flexibility, extensibility, and ease of use in mind. The user-interface is reminiscent of Berkeley mail, complete with many pre-defined tilde-escapes which perform various useful functions. Users may easily extend the program by defining their own personal escape sequences.
Plod is a tool for logging your daily tasks. I have modified the version available from Hal's web site. My modified version allows time tracking and assigning time and plod notes to particular tasks. I use it for recording the amount of time I spend reading email, working on a particular trouble ticket, responding to critical incidents, etc. The files are:

Some useful command lines or functions are:
  plod -T -d `date +%m/%d/%Y`
to display the number of minutes spent in each category for today.
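# timecard START_DATE [EXTRA_ARG]   (EXTRA_ARG is optional, e.g. 200511)
function timecard () {
   # minutes per category from START_DATE through today
   plod -T -d "$1" -D "$(date +%m/%d/%Y)" ${2:+"$2"};
   # all log entries for the same period
   plod -d "$1" -D "$(date +%m/%d/%Y)" -g : ${2:+"$2"};
}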
is used to display the timecard (i.e., the number of minutes per category) and all the log entries between the start date and the current day. This is useful for filling out timecards. If the start date and current date are in different months, it has to be run twice. For example, December 1, 2005 falls on a Thursday, so I would run:
   timecard 11/30/2005 200511
   timecard 12/1/2005
on December 2nd to get the time for the week. Sorry about that, I didn't change the code to allow this to work in a single command.

Majordomo

I was a significant contributor to the majordomo mailing list manager in the early 1990s. I was responsible for the current email-based configuration as well as the 1.90 through 1.93 (or was it 1.94) releases.

Software management with Depot Lite

At the USENIX LISA conference in 1994 I was fortunate enough to have a second paper accepted. It was titled:
Depot-Lite: A Mechanism for Managing Software
Depot-Lite is a software management, packaging and deployment method. It extends the Depot concept of software management with some additions that are useful in an academic environment where students are allowed to install and publish software for others to use.

Cygwin Involvement

I have been using Cygwin for many years. It is the one thing that makes Windows bearable. I am hosting the latest copy I have of Michael A Chase's clean_setup.zip version 1.0700 from July 02, 2003 because it is a useful tool and there doesn't seem to be another downloadable version on the internet. It used to be at:
http://home.ix.netcom.com/~mchase/zip/clean_setup.zip
Thanks to Angelo Graziosi for providing me with the copy. If you download it from here, please consider posting it on your own web page so that this tool will not once again face extinction. Thanks.

Using ssh and screen together

I use ssh with screen all the time because I work remotely and have my ssh session drop on a regular basis. I also use ssh-agent on my laptop and forward the agent via my ssh session. Now within my screen session, I often ssh to other hosts where I want my ssh-agent to be accessible.

When I initially log in, it works fine. The SSH_AUTH_SOCK variable is in screen's environment and is inherited by the sessions under screen. When I ssh to other systems, the SSH_AUTH_SOCK is again forwarded along. However, after the ssh connection drops and is re-established (using autossh or manually), the SSH_AUTH_SOCK variable in my screen sessions points to a dead socket. Fixing this for screen sessions on the ssh target host is easy: save the SSH_AUTH_SOCK variable before invoking screen -Dr, and source the new SSH_AUTH_SOCK into the shell running under screen.
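A minimal sketch of that save-and-source step follows. The function names are invented for this example, and it assumes the values contain no single quotes; the default file name ~/.auth_ssh is the one used on this page.

```shell
#!/bin/sh
# Write the current ssh-related environment to a file as "export" lines.
# Assumes the values contain no single quotes.
save_ssh_env() {
    file=${1:-$HOME/.auth_ssh}
    (
        umask 077    # keep the file private if it has to be created
        for var in SSH_AUTH_SOCK SSH_CLIENT SSH_CONNECTION SSH_TTY DISPLAY; do
            eval "val=\${$var-}"
            if [ -n "$val" ]; then
                printf "export %s='%s'\n" "$var" "$val"
            fi
        done > "$file"
    )
}

# Re-read the saved values inside a shell running under screen.
load_ssh_env() {
    file=${1:-$HOME/.auth_ssh}
    [ -r "$file" ] && . "$file"
}

# On login, before reattaching:   save_ssh_env && screen -Dr
# Inside each shell under screen: load_ssh_env
```

Variables that are unset at login are simply left out of the file, so sourcing it never overwrites a value with an empty string.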

However, for remote ssh sessions it is more difficult, as we have to redirect their remote SSH_AUTH_SOCK to the newly created socket. To do this, I created ssh_auth_shuffle, a bash script that combs the environments of open ssh sessions and symbolically links their SSH_AUTH_SOCK's to the newly created socket, allowing access to the ssh-agent.

So to recap, the sequence:

can be established without having to type a password. However this depends on an unbroken sequence of connections from host2 all the way back to the desktop.

This works fine if you don't use something like screen(1) on access_host. If the network link between your laptop/desktop and access_host breaks, it also tears down all of the other ssh processes and they have to be re-established. When you reestablish the links, the ssh-auth tunnel is automatically connected through all the segments and things work fine.

However, if your access_host session runs screen(1) and your desktop disconnects, the rest of the ssh connections survive, but the access_host end of the access_host->host1 ssh link doesn't have a link to the desktop anymore. When you ssh back in, ssh establishes a new endpoint for the ssh-agent tunnel.

Indeed the shells that run within screen still have the old endpoint that they propagate to new ssh connections. So we need to propagate the new ssh-agent endpoint to the local shell. http://www.deadman.org/sshscreen.html addresses this issue but is not able to fix the loss of agent access for established ssh sessions to other hosts.

A solution is to ssh from your desktop to access_host using:

  ssh -t access_host 'ssh_auth_shuffle && screen -d -r'
The ssh_auth_shuffle script locates all your established ssh commands and figures out where their end of the ssh-auth tunnel is located. It then links that end to the new ssh-auth tunnel created by the ssh from your desktop to access_host. You can install ssh_auth_shuffle anywhere in your path.
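To make the mechanism concrete, here is a rough sketch of the relinking idea. It is not the real ssh_auth_shuffle script: the function names are hypothetical, it assumes a Linux-style /proc file system and the pgrep utility, and it ignores corner cases such as who owns a re-created socket directory.

```shell
#!/bin/sh
# Print the SSH_AUTH_SOCK value stored in a NUL-separated environ file
# (the format of /proc/PID/environ on Linux).
sock_from_environ() {
    tr '\0' '\n' < "$1" | sed -n 's/^SSH_AUTH_SOCK=//p' | head -n 1
}

# Point the (now dead) socket path of each of my running ssh clients
# at the socket given as the first argument.
relink_agent_socks() {
    new=$1
    for pid in $(pgrep -u "$(id -u)" -x ssh); do
        old=$(sock_from_environ "/proc/$pid/environ" 2>/dev/null)
        [ -n "$old" ] && [ "$old" != "$new" ] || continue
        # Re-create the directory if it was removed, then link old -> new.
        mkdir -p -m 700 "$(dirname "$old")" && ln -sf "$new" "$old"
    done
}

# After logging back in to access_host:
# relink_agent_socks "$SSH_AUTH_SOCK"
```

Because the running ssh clients keep the socket path they started with, replacing that path with a symbolic link is what lets them reach the agent again without being restarted.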

Some useful aliases are:

The file ~/.auth_ssh is a link to the file that stores the current ssh connection parameters. A sample file is:

export SSH_AUTH_SOCK=/tmp/ssh-xNQZo23620/agent.23620
export SSH_CLIENT='::ffff:65.33.255.162 1100 22'
export SSH_CONNECTION='::ffff:63.33.222.162 1100 ::ffff:192.168.7.14 22'
export SSH_TTY=/dev/pts/7
export DISPLAY=localhost:11.0
The values of these environment variables (except SSH_CLIENT) are described in the ssh man page.

In addition, the current DISPLAY that is in use by ssh is exported as well, so you can use X11 forwarding after reconnecting to access_host, although this does not work for host1 or host2.

Note that this script is very Bourne shell centric, since the ~/.auth_ssh file uses "export var=value" syntax to set variables. However, modifying it to support csh-style shells should be pretty straightforward.

Korn shell semaphore implementation

This was originally from:
Implementing Semaphores in the Shell
10/6/2004 By Ed Schaefer and John Spurgeon for UNIX Review

Summary: The authors present a Korn shell implementation of a counting semaphore.
The code implements a fair-queuing semaphore in ksh.

I had a couple of problems with it when I used it for controlling resource usage by BackupPC, so I patched it and the original source plus the patch are available here. It is released under GPL V2 as are my patches to it.
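For readers who have not seen a semaphore in shell before, the sketch below shows the basic counting idea using mkdir, which either creates a directory or fails in one atomic step. It is deliberately much simpler than the article's code: there is no fair queuing and no cleanup of slots left behind by dead processes, and all names (the functions, the directory, run_backup) are hypothetical.

```shell
#!/bin/sh
# sem_acquire DIR COUNT: try to take one of COUNT slots under DIR.
# Prints the slot path and returns 0 on success; returns 1 if every
# slot is already taken.
sem_acquire() {
    mkdir -p "$1" || return 1
    i=1
    while [ "$i" -le "$2" ]; do
        # mkdir succeeds for exactly one caller per slot
        if mkdir "$1/slot.$i" 2>/dev/null; then
            echo "$1/slot.$i"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# sem_release SLOT: give the slot back.
sem_release() {
    rmdir "$1"
}

# Example: allow at most two concurrent jobs (run_backup is a placeholder)
# until slot=$(sem_acquire /tmp/backup.sem 2); do sleep 5; done
# run_backup; sem_release "$slot"
```

Polling with sleep means waiters are served in no particular order, which is the gap a fair-queuing implementation closes.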

I am putting the shell semaphore code here because the original link to the article (and in theory the source code) at: http://unix.ittoolbox.com/documents/implementing-semaphores-in-the-shell-15726 is dead.

A little about me

I have posted my resume (in Adobe Acrobat (.pdf) format) and will post it in other formats along with my picture here when I get a chance. If I am not working on an ambulance (I've been an emergency medical technician since the mid-80s), I can be found playing Ultimate Frisbee or giving somebody a massage.

I was formerly employed as a system administrator with MathWorks.

Also you may be interested in my profile on LinkedIn.
