Identification and approximations for systems with multi-stage workflows

Distributed systems with multi-stage workflows consist of multiple logical stages that execute either sequentially or concurrently, and a single stage can run on one or more physical nodes. Knowing the mapping of logical stages to physical nodes is important for characterizing performance and studying resource bottlenecks. Because of the physical scale of such systems and the complexity of their software, it is often difficult to obtain detailed information about all the system parameters. We show that under light load conditions, the system can be well approximated using first-order models, which simplifies the system identification problem. For general load, we develop a parameter estimation technique based on maximum likelihood and propose a heuristic to solve it efficiently.
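
To make the idea of maximum-likelihood parameter estimation concrete, the following is a minimal sketch for the simplest possible case: a single stage whose service times are assumed to be independent and exponentially distributed. The exponential assumption, the function name exponential_rate_mle, and the synthetic data are illustrative choices made here. They are not the model or the heuristic developed in this work, which addresses the harder multi-stage problem under general load.

```python
import random


def exponential_rate_mle(samples):
    """Closed-form maximum-likelihood estimate of an exponential rate.

    Assumption (illustrative only): service times of one stage are i.i.d.
    exponential. Maximizing the log-likelihood n*log(rate) - rate*sum(x)
    gives rate_hat = n / sum(x).
    """
    if not samples or any(x <= 0 for x in samples):
        raise ValueError("need a non-empty list of positive service times")
    return len(samples) / sum(samples)


if __name__ == "__main__":
    # Synthetic service times for a hypothetical stage with a known rate.
    rng = random.Random(0)
    true_rate = 2.0
    observed = [rng.expovariate(true_rate) for _ in range(10_000)]
    print("estimated rate:", exponential_rate_mle(observed))
```

In this one-parameter case the likelihood has a closed-form maximizer. With several stages mapped onto shared nodes, no such closed form should be expected in general, which is presumably why a heuristic solution method is needed.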