The statement of work and the report of the PoT Project can be downloaded from this page or viewed online in HTML form.

Statement of work

The goal of this project is to evaluate the possibility of developing a high-performance implementation of the Portals 3.0 API on TNet.

Background

The Portals 3.0 API was developed as a joint project between Sandia National Laboratories and the Scalable Systems Lab at the University of New Mexico. Like many other high-performance message passing APIs (e.g. Scheduled Transfer and Virtual Interface Architecture), the Portals API supports OS-bypass. OS-bypass is motivated by the high cost, in terms of time, of servicing interrupts during high-speed communication. In OS-bypass, the relevant policies of the OS are implemented in a control program that runs on the Network Interface Card (NIC), which eliminates many of the interrupts associated with high-speed communication. In addition to OS-bypass, the Portals API also supports "application-bypass." Application-bypass is motivated by the need to minimize memory copies during communication. In application-bypass, the policies of the application regarding message placement are implemented on the NIC. Because the NIC can deliver each message to the correct location based on the contents of the message, the application avoids a costly memory copy operation.
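The placement policy behind application-bypass can be illustrated with a minimal sketch in C. The type and function names (match_entry, message, deliver) and the match/ignore bit scheme are assumptions made for this illustration only. They are not the Portals 3.0 data structures or calls. The sketch shows one plausible way a control program could use header bits to pick a buffer that the application posted ahead of time, so the payload lands in its final location without a later copy by the application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical match entry; NOT a Portals 3.0 structure. */
typedef struct {
    uint64_t match_bits;   /* value the incoming header must match */
    uint64_t ignore_bits;  /* bits set here are excluded from the comparison */
    void    *buffer;       /* application memory posted ahead of time */
    size_t   length;       /* capacity of buffer in bytes */
} match_entry;

/* Hypothetical incoming message as seen by the control program. */
typedef struct {
    uint64_t    header_bits;
    const void *payload;
    size_t      length;
} message;

/*
 * Walk the list in order and place the payload into the first entry
 * whose bits match and whose buffer is large enough.
 * Returns the index of the entry used, or -1 if nothing matched.
 */
int deliver(const match_entry *list, size_t n, const message *msg)
{
    assert(msg != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t diff = (list[i].match_bits ^ msg->header_bits)
                        & ~list[i].ignore_bits;
        if (diff == 0 && msg->length <= list[i].length) {
            /* Placement happens here, not later in the application. */
            memcpy(list[i].buffer, msg->payload, msg->length);
            return (int)i;
        }
    }
    return -1;
}
```

In this sketch the application only posts entries and later reads its own buffers. The decision about where a message goes is made entirely by the matching loop, which is the part that would run on the NIC.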

Supercomputing Systems, a company in Zurich, designed a custom network called TNet for the "Swiss-Tx" parallel computing project at the Swiss Federal Institute of Technology in Lausanne (EPFL). The MPI message passing library runs through the hardware-interpreted Fast Communication Interface (FCI), which enables a direct store from one processor into the memory of another processor. Because the network interface card carries a large FPGA and 16 MB or more of memory, it allows a flexible and fast implementation of any communication protocol. Putting the time-critical parts of the protocol into hardware makes it possible to optimize the latency and throughput of high-performance networks.

Project Scope

The goal of this project is to design and develop an initial implementation of the Portals 3.0 API for TNet. We will start from the reference implementation of the Portals 3.0 API. The reference implementation uses a Network Abstraction Layer (NAL) to remain independent of protection domains. That is, all calls to functions in the NAL are implemented as call-backs, which may or may not cross protection domain boundaries. The three protection domains of interest are the application, the OS (kernel), and the domain defined by the control program on the NIC.
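The call-back idea behind the NAL can be sketched in C as a table of function pointers. Everything named here (nal, local_state, local_send, lib_put) is hypothetical and is not taken from the Portals 3.0 reference implementation. The sketch only shows the general pattern: library code calls through the table, and the back end installed in the table decides whether the call stays in the caller's protection domain or crosses into another one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical call-back table; illustrative names only. */
typedef struct nal nal;
struct nal {
    /* Send len bytes; returns 0 on success, -1 on failure. */
    int (*send)(nal *self, const void *data, size_t len);
    void *state;  /* data private to the back end */
};

/*
 * Back end that stays in the caller's protection domain:
 * a plain function call that copies into a staging buffer.
 * A kernel or NIC back end would instead issue a system call
 * or write to a device mailbox behind the same signature.
 */
typedef struct {
    char   staged[64];
    size_t used;
} local_state;

static int local_send(nal *self, const void *data, size_t len)
{
    local_state *st = self->state;
    if (len > sizeof st->staged)
        return -1;
    memcpy(st->staged, data, len);
    st->used = len;
    return 0;
}

/* Library code is written against the table only. */
int lib_put(nal *n, const void *data, size_t len)
{
    assert(n != NULL && n->send != NULL);
    return n->send(n, data, len);
}
```

Because lib_put never names a concrete back end, the same library code could be linked against an application-level, kernel-level, or NIC-level table without change, which is the kind of independence the paragraph above describes.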

The primary goal of this project is to design an implementation of the Portals API that places as much of the functionality on the NIC as is feasible. This design would define the goal of a full implementation. A secondary goal is to develop a preliminary implementation of this design. In the preliminary implementation, much of the Portals functionality will remain in the application and OS domains and the NIC will have minimal functionality.

Report Portals over TNet

High Performance Computing using Portals over TNet - final report in pdf (320kB)

Report Portals on Intercept

High Performance Computing using Portals on Intercept - by Gerard Basler (2MB)
This is a related project on Portals that uses the Intercept network technology.
My colleague Gerard Basler wrote his diploma thesis at the UNM in winter 2002/2003.
© 2000 | Adrian Riedo | University of New Mexico