About Portals

Portals is a low-level network protocol developed by Sandia National Labs for its commodity-based, large-scale computing cluster, CPlant.

Background

Portals 3 is the data movement layer used by the CPlant cluster at Sandia National Labs. Its roots go back to the SUNMOS and TeraFLOPS projects, and it is designed to support commodity-based clusters on the order of ten thousand nodes.

Addressing

A Portal represents an opening in the address space of a process. A get operation reads data from another process through a Portal, while a put operation writes data to it.

The Portals addressing scheme for incoming data is a multi-level hierarchy. The first identifier is a match list: a list of match entries, each of which contains a set of match bits. These bits describe a specific pattern that the incoming data must match in order to use that match entry. Each match entry holds a list of memory descriptors. A memory descriptor defines a region in memory together with the behavior associated with that region, such as how many and what kind of operations may be performed on it. Although a match entry contains a list of memory descriptors, only the first one is considered when matching incoming data. Each memory descriptor can also refer to an event queue, which is updated whenever an operation is performed on the region so that the application knows what has happened. The final piece of addressing information is an offset within the memory descriptor. A remote memory address is therefore reached through a match entry, a memory descriptor, and an offset into the region defined by that memory descriptor.
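
The lookup described above can be sketched in a few lines of Python. This is only an illustrative model, not the Portals 3 API: the names MemoryDescriptor, MatchEntry, find_descriptor, put and get are hypothetical, match bits are compared for exact equality, and the limit on the number of operations per descriptor is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemoryDescriptor:
    """A memory region plus the rules for using it (hypothetical model)."""
    buffer: bytearray
    allow_put: bool = True
    allow_get: bool = True
    events: Optional[list] = None  # plain list standing in for an event queue


@dataclass
class MatchEntry:
    """A set of match bits and an ordered list of memory descriptors."""
    match_bits: int
    descriptors: List[MemoryDescriptor] = field(default_factory=list)


def find_descriptor(match_list, match_bits):
    """Walk the match list; only the first descriptor of a matching entry counts."""
    for entry in match_list:
        # Assumption: exact equality stands in for the real matching rule.
        if entry.match_bits == match_bits and entry.descriptors:
            return entry.descriptors[0]
    return None


def put(match_list, match_bits, offset, data):
    """Write data at the given offset; return True on success, False otherwise."""
    md = find_descriptor(match_list, match_bits)
    if md is None or not md.allow_put:
        return False
    if offset < 0 or offset + len(data) > len(md.buffer):
        return False
    md.buffer[offset:offset + len(data)] = data
    if md.events is not None:
        md.events.append(("put", offset, len(data)))
    return True


def get(match_list, match_bits, offset, length):
    """Read length bytes at the given offset; return None if the request fails."""
    md = find_descriptor(match_list, match_bits)
    if md is None or not md.allow_get:
        return None
    if offset < 0 or length < 0 or offset + length > len(md.buffer):
        return None
    if md.events is not None:
        md.events.append(("get", offset, length))
    return bytes(md.buffer[offset:offset + length])


# Usage: one match entry exposing an 8-byte region with an event queue.
event_queue = []
region = MemoryDescriptor(buffer=bytearray(8), events=event_queue)
match_list = [MatchEntry(match_bits=0x2A, descriptors=[region])]
put(match_list, 0x2A, 2, b"hi")        # writes into bytes 2..3 of the region
data = get(match_list, 0x2A, 2, 2)     # reads the same two bytes back
```

In this model a request carrying match bits that no entry recognizes is simply rejected, and each successful operation leaves one record in the event queue of the descriptor that was used.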

NAL

The implementation strategy of Portals 3 is to provide a highly platform- and network-independent API for message-passing applications. A network abstraction layer (NAL) is used to make migration from one network environment to another easy. Portals 3 is divided into two parts: an application programming interface (API) and a library (LIB).

The current CPlant network is Myrinet, and the network abstraction layer is therefore called MyrNAL. On the API side, the NAL is defined by myrnal.c and controls communication out of the application. The main LIB-side NAL source file is lib_myrnal.c; it accesses the library, which is located in the kernel as a driver. These files separate hardware-dependent calls from logical routines.

The function forward on the API side is used to communicate with the library. When the library takes the form of a kernel module, this is done with ioctl, a standard Unix system call. The NAL on the library side requires more functions, as most of the Portals work is done there. Its main functions are open, dispatch and close for the library, plus communication with the next layer (send, receive).
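
The split between the two sides of the NAL can be pictured with the following minimal Python sketch. It is an illustration under stated assumptions, not the MyrNAL code: the class names LibSideNal, LoopbackNal and ApiSideNal are hypothetical, forward is modeled as a direct function call where a kernel-resident library would be reached with ioctl, and an in-memory queue stands in for Myrinet.

```python
from abc import ABC, abstractmethod


class LibSideNal(ABC):
    """Library-side NAL (hypothetical interface): open, dispatch, close,
    plus the network-specific send and receive."""

    def __init__(self):
        self.is_open = False

    def open(self):
        self.is_open = True

    def close(self):
        self.is_open = False

    def dispatch(self, command, *args):
        """Route a forwarded request to the matching library routine."""
        if not self.is_open:
            raise RuntimeError("library is not open")
        if command == "send":
            return self.send(*args)
        if command == "receive":
            return self.receive()
        raise ValueError("unknown command: " + command)

    @abstractmethod
    def send(self, node, payload):
        """Hand a message to the next layer down (network-specific)."""

    @abstractmethod
    def receive(self):
        """Fetch the next message from the layer below, or None."""


class LoopbackNal(LibSideNal):
    """Toy network: messages are queued in memory instead of sent on a wire."""

    def __init__(self):
        super().__init__()
        self.wire = []

    def send(self, node, payload):
        self.wire.append((node, payload))

    def receive(self):
        return self.wire.pop(0) if self.wire else None


class ApiSideNal:
    """API-side NAL: forward() is the only crossing point into the library."""

    def __init__(self, library):
        self.library = library

    def forward(self, command, *args):
        # Assumption: a direct call replaces the ioctl into a kernel module.
        return self.library.dispatch(command, *args)


# Usage: the application only ever talks to forward().
lib = LoopbackNal()
lib.open()
api = ApiSideNal(lib)
api.forward("send", 3, b"ping")
message = api.forward("receive")   # (3, b"ping")
lib.close()
```

In this sketch, supporting a different network means writing another subclass with its own send and receive; dispatch and the API side stay unchanged.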

The network abstraction layer makes migration to another network environment much easier, because the programmer does not have to deal with the Portals internals in depth. This is especially true when the library remains in kernel space in the form of a driver.

© 2000 | Adrian Riedo | University of New Mexico