Poster: Autonomic Modeling of Data-Driven Application Behavior

The computational behavior of large-scale data-driven applications is a complex function of their input, their configuration settings, and the underlying system architecture. The resulting difficulty in predicting this behavior complicates both optimizing application performance and scheduling applications onto compute resources. Manually diagnosing performance problems and reconfiguring resource settings to improve performance is cumbersome and inefficient. We therefore need autonomic optimization techniques that observe the application, learn from the observations, and then predict application behavior across different systems and load scenarios. This work presents a modular modeling approach for complex data-driven applications that uses statistical techniques to capture pertinent characteristics of input data, dynamic application behaviors, and system properties, and to predict application behavior with minimal human intervention. The work demonstrates how to adaptively structure and configure the model based on the observed complexity of application behavior in different input and execution contexts.