Paper Information - Virtualized I/O

Virtualized I/O

[1] Seth Copen Goldstein,et al. Active messages: a mechanism for integrating communication and computation , 1998, ISCA '98.

[2] Robert W. Numrich,et al. Co-array Fortran for parallel programming , 1998, FORF.

[3] Karl S. Hemmert,et al. High message rate, NIC-based atomics: Design and performance considerations , 2008, 2008 IEEE International Conference on Cluster Computing.

[4] Thorsten von Eicken,et al. U-Net: a user-level network interface for parallel and distributed computing , 1995, SOSP.

[5] Fabrizio Petrini,et al. Transparent system-level migration of PGAS applications using Xen on InfiniBand , 2007, 2007 IEEE International Conference on Cluster Computing.

[6] Karsten Schwan,et al. Resource-Aware Distributed Stream Management Using Dynamic Overlays , 2005, 25th IEEE International Conference on Distributed Computing Systems (ICDCS'05).

[7] Chris Smith,et al. An Open Grid Services Architecture Primer , 2009, Computer.

[8] Wolf-Dietrich Weber,et al. Power provisioning for a warehouse-sized computer , 2007, ISCA '07.

[9] Charles E. Leiserson,et al. Fat-trees: Universal networks for hardware-efficient supercomputing , 1985, IEEE Transactions on Computers.

[10] Dilma Da Silva,et al. Libra: a library operating system for a jvm in a virtualized execution environment , 2007, VEE '07.

[11] Patrick Th. Eugster,et al. Type-based publish/subscribe: Concepts and experiences , 2007, TOPL.

[12] Rupak Biswas,et al. Impact of the Columbia Supercomputer on NASA Science and Engineering Applications , 2005, IWDC.

[13] Alan L. Cox,et al. Achieving 10 Gb/s using safe and transparent network interface virtualization , 2009, VEE '09.

[14] Qian Zhang,et al. A Compound TCP Approach for High-Speed and Long Distance Networks , 2006, Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference on Computer Communications.

[15] Rolf Riesen,et al. SUNMOS for the Intel Paragon - a brief user's guide , 1994.

[16] Karsten Schwan,et al. Lightweight Morphing Support for Evolving Middleware Data Exchanges in Distributed Applications , 2005, 25th IEEE International Conference on Distributed Computing Systems (ICDCS'05).

[17] Fabrizio Petrini,et al. Challenges in Mapping Graph Exploration Algorithms on Advanced Multi-core Processors , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.

[18] Scott Rixner,et al. An efficient programmable 10 gigabit Ethernet network interface card , 2005, 11th International Symposium on High-Performance Computer Architecture.

[19] Adit Ranadive,et al. Performance implications of virtualizing multicore cluster machines , 2008, HPCVirt '08.

[20] David F. Heidel,et al. An Overview of the BlueGene/L Supercomputer , 2002, ACM/IEEE SC 2002 Conference (SC'02).

[21] Sally Floyd,et al. Simulation-based comparisons of Tahoe, Reno and SACK TCP , 1996, CCRV.

[22] Amith R. Mamidala,et al. Hot-Spot Avoidance With Multi-Pathing Over InfiniBand: An MPI Perspective , 2007, Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07).

[23] D.E. Culler,et al. Effects Of Communication Latency, Overhead, And Bandwidth In A Cluster Architecture , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[24] L. W. Tucker,et al. Architecture and applications of the Connection Machine , 1988, Computer.

[25] William Gropp,et al. Design and implementation of message-passing services for the Blue Gene/L supercomputer , 2005, IBM J. Res. Dev.

[26] Wei Qin,et al. A formal concurrency model based architecture description language for synthesis of software development tools , 2004.

[27] Wu-chun Feng,et al. Asymmetric interactions in symmetric multi-core systems: analysis, enhancements and evaluation , 2008, HiPC 2008.

[28] Marvin H. Solomon,et al. Dense Trivalent Graphs for Processor Interconnection , 1982, IEEE Transactions on Computers.

[29] Norman P. Jouppi,et al. High-performance ethernet-based communications for future multi-core processors , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).

[30] Charles Clos,et al. A study of non-blocking switching networks , 1953.

[31] Duncan H. Lawrie,et al. Access and Alignment of Data in an Array Processor , 1975, IEEE Transactions on Computers.

[32] Keith D. Underwood,et al. A preliminary analysis of the MPI queue characteristics of several applications , 2005, 2005 International Conference on Parallel Processing (ICPP'05).

[33] Joel H. Saltz,et al. The virtual microscope , 2003, IEEE transactions on information technology in biomedicine : a publication of the IEEE Engineering in Medicine and Biology Society.

[34] Philip Heidelberger,et al. The deep computing messaging framework: generalized scalable message passing on the blue gene/P supercomputer , 2008, ICS '08.

[35] Ivan Stojmenovic,et al. Honeycomb Networks: Topological Properties and Communication Algorithms , 1997, IEEE Trans. Parallel Distributed Syst.

[36] Courtenay T. Vaughan,et al. A Simple Synchronous Distributed-Memory Algorithm for the HPCC RandomAccess Benchmark , 2006, 2006 IEEE International Conference on Cluster Computing.

[37] David B. Loveman. High performance Fortran , 1993, IEEE Parallel & Distributed Technology: Systems & Applications.

[38] Harold S. Stone,et al. Parallel Processing with the Perfect Shuffle , 1971, IEEE Transactions on Computers.

[39] Rolf Riesen,et al. Instruction-level simulation of a cluster at scale , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.

[40] Brent Callaghan,et al. NFS over RDMA , 2003, NICELI '03.

[41] Taisuke Boku,et al. The architecture of massively parallel processor CP-PACS , 1997, Proceedings of IEEE International Symposium on Parallel Algorithms Architecture Synthesis.

[42] V. Glushkov. The abstract theory of automata , 1961.

[43] Harold S. Stone,et al. Dynamic Memories with Enhanced Data Access , 1972, IEEE Transactions on Computers.

[44] Russ Miller,et al. Data Movement Techniques for the Pyramid Computer , 1987, SIAM J. Comput.

[45] Dharma P. Agrawal,et al. Generalized Hypercube and Hyperbus Structures for a Computer Network , 1984, IEEE Transactions on Computers.

[46] Fred Kuhns,et al. A remotely accessible network processor-based router for network experimentation , 2008, ANCS '08.

[47] Karl S. Hemmert,et al. An architecture to perform NIC based MPI matching , 2007, 2007 IEEE International Conference on Cluster Computing.

[48] Dhabaleswar K. Panda,et al. Nomad: migrating OS-bypass networks in virtual machines , 2007, VEE '07.

[49] Robert E. Kahn,et al. A Protocol for Packet Network Intercommunication , 1974.

[50] Sheldon B. Akers,et al. A Group-Theoretic Model for Symmetric Interconnection Networks , 1989, IEEE Trans. Computers.

[51] Xiaola Lin,et al. Recursive Cube of Rings: A New Topology for Interconnection Networks , 2000, IEEE Trans. Parallel Distributed Syst.

[52] Karsten Schwan,et al. Service Augmentation for High End Interactive Data Services , 2005, 2005 IEEE International Conference on Cluster Computing.

[53] Thorsten von Eicken,et al. Evolution of the Virtual Interface Architecture , 1998, Computer.

[54] Andrew A. Chien,et al. Software overhead in messaging layers: where does the time go? , 1994, ASPLOS VI.

[55] Mohan Kumar,et al. Extended Hypercube: A Hierarchical Interconnection Network of Hypercubes , 1992, IEEE Trans. Parallel Distributed Syst.

[56] Ron Brightwell,et al. Architectural specification for massively parallel computers: an experience and measurement-based approach , 2003, Concurr. Pract. Exp.

[57] D. Tolmie,et al. HIPPI: simplicity yields success , 1993, IEEE Network.

[58] Janak H. Patel,et al. Processor-memory interconnections for multiprocessors , 1979, ISCA '79.

[59] Samuel Thibault,et al. Improving performance by embedding HPC applications in lightweight Xen domains , 2008, HPCVirt '08.

[60] Jian Liu,et al. Optical MEMS devices for telecom systems , 2003, SPIE Microtechnologies.

[61] Wu-chun Feng,et al. A comparison of TCP automatic tuning techniques for distributed computing , 2002, Proceedings 11th IEEE International Symposium on High Performance Distributed Computing.

[62] Ramesh Subramonian,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.

[63] José Duato,et al. High-radix crossbar switches enabled by proximity communication , 2008, HiPC 2008.

[64] Dhabaleswar K. Panda,et al. High Performance Remote Memory Access Communication: The Armci Approach , 2006, Int. J. High Perform. Comput. Appl.

[65] Dave Olson,et al. Pathscale InfiniPath: a first look , 2005, 13th Symposium on High Performance Interconnects (HOTI'05).

[66] Johannes Gehrke,et al. Cayuga: a high-performance event processing engine , 2007, SIGMOD '07.

[67] Hyun-Wook Jin,et al. Designing next generation data-centers with advanced communication protocols and systems services , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.

[68] Wei Huang,et al. High performance virtual machine migration with RDMA over modern interconnects , 2007, 2007 IEEE International Conference on Cluster Computing.

[69] Ulrich Brüning,et al. An open-source HyperTransport core , 2008, TRETS.

[70] R. Brightwell,et al. Design and implementation of MPI on Puma portals , 1996, Proceedings. Second MPI Developer's Conference.

[71] Scott Pakin,et al. The Impact of Message-buffer Alignment on Communication Performance , 2005, Parallel Process. Lett.

[72] Franco P. Preparata,et al. The cube-connected-cycles: A versatile network for parallel computation , 1979, 20th Annual Symposium on Foundations of Computer Science (sfcs 1979).

[73] Muli Ben-Yehuda,et al. Loosely Coupled TCP Acceleration Architecture , 2006, 14th IEEE Symposium on High-Performance Interconnects (HOTI'06).

[74] Nicholas Pippenger,et al. On Crossbar Switching Networks , 1975, IEEE Trans. Commun.

[75] Viktor K. Prasanna,et al. A Memory-Balanced Linear Pipeline Architecture for Trie-based IP Lookup , 2007.

[76] John L. Henning. SPEC CPU2006 benchmark descriptions , 2006, CARN.

[77] Johannes Gehrke,et al. Towards Expressive Publish/Subscribe Systems , 2006, EDBT.

[78] Rami G. Melhem,et al. On the Feasibility of Optical Circuit Switching for High Performance Computing Systems , 2005, ACM/IEEE SC 2005 Conference (SC'05).

[79] Karl S. Hemmert,et al. A hardware acceleration unit for MPI queue processing , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.

[80] Pascal Caron,et al. Characterization of Glushkov automata , 2000, Theor. Comput. Sci.

[81] Huai-An Lin,et al. Estimation of the optimal performance of ASN.1/BER transfer syntax , 1993, CCRV.

[82] Charles L. Seitz,et al. Myrinet: A Gigabit-per-Second Local Area Network , 1995, IEEE Micro.

[83] Vivek Sarkar,et al. X10: an object-oriented approach to non-uniform cluster computing , 2005, OOPSLA '05.

[84] Fabrizio Petrini,et al. Hardware- and software-based collective communication on the Quadrics network , 2001, Proceedings IEEE International Symposium on Network Computing and Applications. NCA 2001.

[85] Darren J. Kerbyson. A look at application performance sensitivity to the bandwidth and latency of InfiniBand networks , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.

[86] Dhiraj K. Pradhan,et al. The De Bruijn Multiprocessor Network: A Versatile Parallel Processing and Sorting Network for VLSI , 1989, IEEE Trans. Computers.

[87] D. Frank Hsu,et al. Distributed Loop Computer Networks: A Survey , 1995, J. Parallel Distributed Comput.

[88] Injong Rhee,et al. CUBIC: a new TCP-friendly high-speed TCP variant , 2008, OPSR.

[89] Ian T. Foster,et al. The Anatomy of the Grid: Enabling Scalable Virtual Organizations , 2001, Int. J. High Perform. Comput. Appl.

[90] Rajgopal Kannan. The KR-Benes Network: A Control-Optimal Rearrangeable Permutation Network , 2005, IEEE Trans. Computers.

[91] William J. Dally,et al. Flattened butterfly: a cost-efficient topology for high-radix networks , 2007, ISCA '07.

[92] Patrick Crowley,et al. Application development on hybrid systems , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).

[93] Dhabaleswar K. Panda,et al. Can user-level protocols take advantage of multi-CPU NICs? , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.

[94] Keith D. Underwood,et al. A preliminary analysis of the InfiniPath and XD1 network interfaces , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.

[95] William J. Dally,et al. Performance Analysis of k-Ary n-Cube Interconnection Networks , 1987, IEEE Trans. Computers.

[96] Fabrizio Petrini,et al. Accelerating Real-Time String Searching with Multicore Processors , 2008, Computer.

[97] G. Lafontant,et al. Packaging the Cell Broadband Engine microprocessor for supercomputer applications , 2008, 2008 58th Electronic Components and Technology Conference.

[98] Fabrizio Petrini,et al. Peak-Performance DFA-based String Matching on the Cell Processor , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.

[99] David A. Patterson,et al. X-Tree: A tree structured multi-processor computer architecture , 1978, ISCA '78.

[100] Tilman Wolf,et al. Massively Parallel Anomaly Detection in Online Network Measurement , 2008, 2008 Proceedings of 17th International Conference on Computer Communications and Networks.

[101] Larry L. Peterson,et al. binpac: a yacc for writing application protocol parsers , 2006, IMC '06.

[102] B. W. Arden,et al. Analysis of Chordal Ring Network , 1981, IEEE Transactions on Computers.

[103] P. Wyckoff,et al. EMP: Zero-Copy OS-Bypass NIC-Driven Gigabit Ethernet Message Passing , 2001, ACM/IEEE SC 2001 Conference (SC'01).

[104] William Gropp,et al. MPI-2: Extending the Message-Passing Interface , 1996, Euro-Par, Vol. I.

[105] Jason Leigh,et al. Reliable Blast UDP: predictable high performance bulk data transfer , 2002, Proceedings. IEEE International Conference on Cluster Computing.

[106] Michael J. Flynn,et al. Very high-speed computing systems , 1966.

[107] Cheng Jin,et al. FAST TCP: Motivation, Architecture, Algorithms, Performance , 2006, IEEE/ACM Transactions on Networking.

[108] Yi Huang,et al. WS-Messenger: a Web services-based messaging system for service-oriented grid computing , 2006, Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06).

[109] Willy Zwaenepoel,et al. Diagnosing performance overheads in the xen virtual machine environment , 2005, VEE '05.

[110] Keith D. Underwood,et al. Accelerating List Management for MPI , 2005, 2005 IEEE International Conference on Cluster Computing.

[111] Rajkumar Buyya,et al. High Performance Mass Storage and Parallel I/O: Technologies and Applications , 2001.

[112] Yoshiko Yasuda,et al. Architecture and performance of the Hitachi SR2201 massively parallel processor system , 1997, Proceedings 11th International Parallel Processing Symposium.

[113] Karthick Rajamani,et al. Energy Management for Commercial Servers , 2003, Computer.

[114] Jack J. Dongarra,et al. The LINPACK Benchmark: past, present and future , 2003, Concurr. Comput. Pract. Exp.

[115] Yin-Ling Liong,et al. The Scheduled Transfer (ST) Protocol , 1999, CANPC.

[116] Robert W. Horst,et al. ServerNet deadlock avoidance and fractahedral topologies , 1996, Proceedings of International Conference on Parallel Processing.

[117] Wenke Lee,et al. Secure and Flexible Monitoring of Virtual Machines , 2007, Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007).

[118] Adrian Schüpbach,et al. The multikernel: a new OS architecture for scalable multicore systems , 2009, SOSP '09.