Effective performance measurement and analysis of multithreaded applications

Understanding why the performance of a multithreaded program does not scale linearly with the number of cores on a shared-memory node with one or more multicore processors is a problem ...