Did a 16-bit counter overflow shut down Comair?
Hi,
On Christmas Day last Saturday, Comair Airlines had to completely stop
flying
all of its planes due to computer problems. Comair blamed the computer
problems on their pilot scheduling software being overloaded after bad
weather earlier in the week forced many flights to be rescheduled. Comair
now hopes to have all of its 1,100 daily flights restored by tomorrow.
An article which was published today at the Cincinnati Post Web site
provides some interesting details of a software failure in Comair's pilot
scheduling software:
How it happened
http://www.cincypost.com/2004/12/28/comp12-28-2004.html
According to the article, Comair is running a 15-year old scheduling
software package from SBS International (www.sbsint.com). The software has
a hard limit of 32,000 schedule changes per month. With all of the bad
weather last week, Comair apparently hit this limit and then was unable to
assign pilots to planes.
It sounds like 16-bit integers are being used in the SBS International
scheduling software to identify transactions. Given that the software is 15
years old, this design decision perhaps was made to save on memory usage.
In retrospect, 16-bit integers were probably not a good choice.
An anonymous message posted to Slashdot the day after Christmas first
described the software failure at Comair:
http://slashdot.org/comments.pl?sid=134005&cid=11185556
Earlier this year, an overflow of a 32-bit counter in Windows shut down air
traffic control over southern California for 3 hours:
Microsoft server crash nearly causes 800-plane pile-up
http://www.techworld.com/opsys/news/index.cfm?NewsID=2275
This problem occurred because of a known design flaw in older versions of
Windows:
http://tinyurl.com/5n9gc
Richard M. Smith
http://www.ComputerBytesMan.com