Advanced pattern recognition for detection of complex software aging phenomena in online transaction processing servers

Software aging phenomena have recently been studied; one particularly complex type is shared memory pool latch contention in large OLTP servers. The onset of latch contention leads to severe performance degradation that persists until a manual rejuvenation of the DBMS shared memory pool is triggered. Conventional approaches to automated rejuvenation have failed for latch contention because no single resource metric has been identified that can be monitored to signal the onset of this complex mechanism. The current investigation explores the feasibility of applying an advanced pattern recognition method, embodied in a commercially available equipment condition monitoring system (SmartSignal eCM™), for proactive annunciation of software-aging faults. One hundred data signals from a large OLTP server are monitored, collected at 20-60 second intervals over a 5-month period. Results show that 13 variables consistently deviate from normal operation prior to a latch event, providing up to 2 hours of early warning.
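
The idea of several monitored signals jointly deviating from normal operation can be illustrated with a minimal sketch. It is not the SmartSignal eCM method, whose internals are not described here; it assumes a much simpler per-signal z-score baseline learned from samples known to be normal. The function names (fit_baseline, deviating_signals, first_alert) and the parameters z_limit, min_signals, and persistence are hypothetical choices made only for illustration.

```python
from statistics import mean, stdev

def fit_baseline(normal_rows):
    """Learn a per-signal (mean, standard deviation) pair from normal-operation samples.

    Each row is one sampling instant; each column is one monitored signal.
    """
    columns = list(zip(*normal_rows))
    return [(mean(c), stdev(c)) for c in columns]

def deviating_signals(row, baseline, z_limit=3.0):
    """Return indices of signals whose z-score magnitude exceeds z_limit in this sample."""
    flagged = []
    for i, (x, (mu, sigma)) in enumerate(zip(row, baseline)):
        # Signals with zero variance in the baseline are skipped (assumption).
        if sigma > 0 and abs(x - mu) / sigma > z_limit:
            flagged.append(i)
    return flagged

def first_alert(rows, baseline, min_signals=2, persistence=3, z_limit=3.0):
    """Return the index of the first sample that completes `persistence`
    consecutive samples in which at least `min_signals` signals deviate,
    or None if no such run occurs.
    """
    run = 0
    for t, row in enumerate(rows):
        if len(deviating_signals(row, baseline, z_limit)) >= min_signals:
            run += 1
            if run >= persistence:
                return t
        else:
            run = 0
    return None
```

Requiring several signals to deviate over consecutive samples, rather than alarming on one metric at one instant, reflects the observation above that no single resource metric signals latch contention; the specific thresholds would have to be tuned against real server data.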