Design, Implementation, and Usage of LibNBC

We describe the design and implementation of LibNBC, a library that implements nonblocking collective operations. Its main goals are high portability and high performance. The library is written in ANSI C on top of MPI-1. This document describes the internal design and implementation of LibNBC, as well as its internal and external programming interfaces.