Page table management in local/remote architectures

We conjecture that a paged memory with page migration by the operating system may be an effective system environment for a local/remote shared-memory architecture executing a single parallel computation. Implementing a paged memory in such an architecture raises several page table management issues: page table placement, the level of page table replication, and page table storage overhead. We discuss these issues, propose alternative solutions, and present an experimental evaluation of the solutions. The experiments were conducted using software-implemented page tables on a 32-node BBN Butterfly. They investigated the case of a single shared-memory parallel computation with one user process per processor. The implementation captures the costs of locking page table entries and updating reference information. Each user process has a copy of the computation's code and non-shared variables in local memory, so only shared data references use the page tables. A migration daemon runs on a separate processor; it periodically unblocks itself and examines the page tables to make policy decisions concerning page migration. We conclude that (1) a fully replicated page-indexed page table significantly reduces network, memory, and lock contention in comparison to a single copy, (2) a fully replicated page-indexed page table faces a severe memory utilization problem in large-scale architectures, and (3) a proposed approach based on inverted page tables appears to be a promising alternative to a fully replicated page-indexed page table.
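The storage overhead trade-off among the three organizations can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the experiments. It assumes that a page-indexed table holds one entry per shared virtual page, that full replication places one such table on every node, and that an inverted table holds one entry per local physical frame on each node (hash anchors and collision chains are ignored). The names `Machine`, `single_copy_bytes`, `fully_replicated_bytes`, `inverted_bytes`, and `overhead_fraction`, and all numeric parameters, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical machine parameters; none of these values come from the paper."""
    nodes: int            # processor/memory nodes
    frames_per_node: int  # physical page frames in each node's local memory
    shared_pages: int     # shared virtual pages used by the computation
    entry_bytes: int      # size of one page table entry


def single_copy_bytes(m: Machine) -> int:
    # One page-indexed table for the whole machine: one entry per shared page.
    return m.shared_pages * m.entry_bytes


def fully_replicated_bytes(m: Machine) -> int:
    # A complete page-indexed table on every node.
    return m.nodes * m.shared_pages * m.entry_bytes


def inverted_bytes(m: Machine) -> int:
    # Assumed inverted organization: each node describes only its own frames.
    return m.nodes * m.frames_per_node * m.entry_bytes


def overhead_fraction(table_bytes: int, m: Machine, page_bytes: int) -> float:
    # Fraction of total physical memory consumed by page tables.
    return table_bytes / (m.nodes * m.frames_per_node * page_bytes)


if __name__ == "__main__":
    PAGE_BYTES = 4096  # hypothetical page size
    for n in (32, 256, 1024):
        # Assume the shared data set grows to fill all physical memory.
        m = Machine(nodes=n, frames_per_node=256,
                    shared_pages=n * 256, entry_bytes=8)
        rep = overhead_fraction(fully_replicated_bytes(m), m, PAGE_BYTES)
        inv = overhead_fraction(inverted_bytes(m), m, PAGE_BYTES)
        print(f"{n:5d} nodes: replicated {rep:.2%}, inverted {inv:.2%}")
```

Under these assumptions the per-node cost of a fully replicated table grows linearly with the number of nodes whenever the shared data set scales with the machine, while the assumed inverted organization stays at a fixed fraction of local memory. This matches the direction of conclusions 2 and 3, but the magnitudes depend entirely on the hypothetical parameters chosen here.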