Improving the Scalability of an Operational Scientific Application in a Large Multi-core Cluster

Current high-performance computers are built from nodes with a steadily increasing number of cores per chip. In this scenario, enhancing the scalability of an existing application requires a comprehensive approach, since system parameters such as memory per core and I/O speed grow more slowly over time than the number of cores per chip. This work describes the enhancements incorporated into BRAMS, a regional weather forecasting model, to reach a target execution time using 9,600 cores. We show that some common coding techniques may prevent scalability and that I/O and memory become constraints as core counts increase.