A general-purpose data analysis monitoring system with case studies from the National Fusion Grid and the DIII–D MDSplus between-pulse analysis system

Abstract: As computing infrastructures become more complex, it is important to centralize information in order to monitor and maintain computing processes efficiently. A general monitoring system has been developed at the DIII–D National Fusion Facility that combines code run status, data analysis tracking, logfile access, complex error detection, and expert system capabilities. The monitoring system’s flexibility and ease of deployment have enabled it to be applied successfully to two significantly different computing environments. At DIII–D, the system is used as the data analysis monitor (http://nssrv1.gat.com:8000/dam) to allow both application scientists and computer scientists to monitor the status of between-pulse, MDSplus-dispatched data analysis codes. The monitoring system is also used by the National Fusion Collaboratory Project as the Fusion Grid monitor (http://nssrv1.gat.com:8000/fgm) to track multiple asynchronous complex code runs on the Fusion Grid.