MBTA: Quick Overview
Andre DeHon
Original Issue: January 1991
Last Updated: Mon Nov 8 14:57:22 EST 1993
The modular bootstrapping transit architecture (MBTA) aims to satisfy two major goals:
The METRO routing component is the basic VLSI building block for the Transit
network. METRO is a custom CMOS routing component (the successor to
RN1) currently under construction. It provides simple, high-speed
switching for fault-tolerant networks. The target router for use in MBTA
will have eight ten-bit wide input channels and eight ten-bit
wide output channels. These channels provide byte-wide data transfer with
the ninth bit serving as a signal for the beginning and end of
transmissions and the tenth bit serving for reverse flow-control.
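The channel layout just described can be illustrated with a small sketch. It models one ten-bit channel word as an integer purely for illustration; the bit positions, the names `FRAME_BIT` and `FLOW_BIT`, and the active-high polarity are assumptions, not details taken from the METRO design.

```python
# Hypothetical model of one ten-bit METRO channel word.
# Bit positions and polarity are assumptions made for illustration only.
DATA_MASK = 0xFF      # bits 0-7: one byte of data
FRAME_BIT = 1 << 8    # ninth bit: marks the beginning/end of a transmission
FLOW_BIT = 1 << 9     # tenth bit: reverse flow-control


def pack_word(data, frame=False, flow=False):
    """Combine a data byte and the two control bits into one ten-bit word."""
    if not 0 <= data <= DATA_MASK:
        raise ValueError("data must fit in one byte")
    return data | (FRAME_BIT if frame else 0) | (FLOW_BIT if flow else 0)


def unpack_word(word):
    """Split a ten-bit word into (data byte, frame flag, flow flag)."""
    if not 0 <= word < (1 << 10):
        raise ValueError("word must fit in ten bits")
    return word & DATA_MASK, bool(word & FRAME_BIT), bool(word & FLOW_BIT)
```

For example, `unpack_word(pack_word(0xA5, frame=True))` recovers the byte `0xA5` with the framing flag set and the flow-control flag clear.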
The primary METRO configuration is a crossbar
router with a dilation of two. In this configuration, all eight input
channels are logically equivalent. Alternatively, the component can be
configured as dilation-one crossbars.
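As a rough illustration of what dilation means for routing, the sketch below models a crossbar whose outputs are grouped into logical directions, with as many equivalent outputs per direction as the dilation. With eight outputs and a dilation of two this gives four logical directions. The class name `DilatedCrossbar`, the grouping of adjacent outputs, and the random choice among free outputs are all assumptions made for the example, not a description of METRO's actual allocation logic.

```python
import random


class DilatedCrossbar:
    """Toy model of a dilated crossbar (hypothetical, for illustration)."""

    def __init__(self, inputs=8, outputs=8, dilation=2, rng=None):
        if outputs % dilation:
            raise ValueError("outputs must be a multiple of dilation")
        self.inputs = inputs
        self.dilation = dilation
        self.directions = outputs // dilation   # logical output directions
        self.owner = [None] * outputs           # output index -> owning input
        self.rng = rng or random.Random()

    def connect(self, inp, direction):
        """Claim a free output in the requested direction.

        Returns the output index, or None if every equivalent output in
        that direction is already in use (the connection is blocked).
        """
        if not 0 <= inp < self.inputs:
            raise ValueError("no such input")
        if not 0 <= direction < self.directions:
            raise ValueError("no such direction")
        base = direction * self.dilation
        free = [o for o in range(base, base + self.dilation)
                if self.owner[o] is None]
        if not free:
            return None
        out = self.rng.choice(free)   # assumed policy: pick any free output
        self.owner[out] = inp
        return out

    def release(self, out):
        """Free an output when its connection closes."""
        self.owner[out] = None
```

In this model, two connections headed in the same logical direction can both succeed, while a third is blocked until one of the first two is released.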
METRO is packaged in a 372-pin dual-sided pad grid array package (DSPGA372), or perhaps its successor (tn33). DSPGA372 is a custom package designed by Fred Drenckhahn and Tom Knight for our high-performance, high-density, three-dimensional packaging strategy. The package is designed to accommodate high-speed, controlled-impedance signalling for large pin count VLSI components. DSPGA372 integrates features for robust alignment, cooling, and effective use of printed circuit board space.
DSPGA372 mates with a button board connector (BB372) to provide very low-profile solderless interconnect. BB372 is a custom button board connector designed to facilitate our dense three-dimensional packaging strategies. BB372 provides low-resistance, solderless vertical connection between pad grid arrays and printed circuit boards (tn33) [Kni89].
MBTA is based around an indirect, multistage network built from METRO routing components and organized in a bidelta configuration. Routing components are connected as described in [DKM90] to achieve fault-tolerance.
The Transit network is packaged in a three-dimensional stack structure
(Figure ). The network stack is composed of alternating
layers, or planes, of components and printed circuit boards. Components
and printed circuit boards are interconnected using the BB372 button board
connectors. The component layers perform switching while the printed
circuit boards interconnect routing components to effect the network
organization just described. Figure shows a cross section of the stack,
demonstrating how the button board carriers, routing components, and
printed circuit boards mate to form a stack.
MBTA will offer a complete field test of this network structure and packaging scheme. The nodes will be able to keep the network fully utilized at its projected full-speed operation of 100 megabytes/second/port. MBTA will exercise the METRO routing component fully. The performance of METRO's routing and fault-tolerance protocols can be easily evaluated in this representative setting. This field test will allow us to evaluate the robustness of the packaging components and stack structure. From MBTA we should gain experience with critical issues such as clocking, powering, and cooling this dense three-dimensional packaging structure.
Building a large-scale parallel computer is a huge task. There is a considerable amount that we do not understand about how to build and efficiently program parallel computers. To successfully construct and evaluate a parallel computer, numerous system components must be developed and integrated. Such components typically include a network, memory, network and cache interfaces, processors, compilers, operating systems, programming paradigms, and application programs. The volume and breadth of components necessary to construct a parallel computer make the task of bootstrapping and evaluating a particular parallel architecture, or even a single component of a parallel architecture, very difficult. Once a system is built, many of the components are set in stone, and it is not possible to easily evaluate changes to single components of the system. As a result, all of the effort expended to realize a particular parallel computer usually results in only a few datapoints in the multidimensional space of potential parallel computer architectures.
By providing parallel emulation of the hardware components in a parallel computer, MBTA attempts to provide a modular framework in which parallel architectures can be evaluated. Architecture and protocol ideas can be soft coded onto the machine. MBTA can then be used to emulate the system under study. Software can be targeted at the emulated system. Efforts at all levels of software, architecture, protocols, and paradigms can feed back on each other to identify the best ensemble of components necessary for efficient parallel architectures.
The modularity provided allows experimenters to develop or experiment with one component of the system independently of the others. The MBTA scaffolding allows the system component under study to be examined in the context of an operational machine. Variants of system components can be mixed and matched to study their interaction. As the pieces of the system are better understood, designs can be spawned off that replace the generic MBTA modules with hardwired components. The modular architecture should allow the rapid incorporation of such developments into complete parallel computer systems.
In order to take full advantage of MBTA's modular hardware architecture, we
aim to utilize a software model which is modular at the compiler and
programming level, as well. Figure shows the software
modularity we aim to achieve. That is, at a software level, MBTA attempts
to exemplify an approach to parallel computing which fully decouples the
hardware architecture from the programming paradigm (tn34). This scheme
makes the hardware fully modular with respect to the programming model. It
allows any architecture to leverage evaluation code from many programming
models. The process of defining the parallel register transfer language
allows us to focus on the computational mechanisms
necessary to efficiently support the wide range of evolving parallel
programming paradigms. With this software model and the ability to emulate
a wide range of parallel computer hardware configurations, MBTA will be a
powerful tool for evaluating parallel computer architectures.
Each MBTA node will be composed of a processor, memory, bus
logic, and network interfaces. Figure shows the high level
composition and organization of an MBTA node. This node architecture is
intended to allow both fast network testing and fair emulation (tn25).
Nodes are interconnected using the Transit network described in Section .
During each emulation cycle, the processor on each node emulates the function of the node. Thus, within the emulation cycle, the processor will execute an appropriate time-slice for each piece of hardware which is being modeled in the node. Fair emulation is guaranteed by keeping the relative progression of each node synchronized. When the emulation utilizes the underlying Transit network directly, dummy network cycles are introduced between the transmission of real data in order to match the bandwidth of the network to the bandwidth of the emulating nodes.
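The emulation scheme above can be sketched in a few lines. This is a minimal, sequential illustration under stated assumptions rather than the MBTA implementation: `TimeSlicedNode`, `run_lockstep`, and `dummy_cycles_per_word` are hypothetical names, each modelled hardware piece is represented by a plain callable, and both rates passed to `dummy_cycles_per_word` are assumed to be expressed in the same units.

```python
import math


class TimeSlicedNode:
    """One emulated node: a list of per-component step functions."""

    def __init__(self, components):
        self.components = components   # callables taking the cycle number
        self.cycle = 0

    def run_cycle(self):
        # Execute one time-slice for each piece of modelled hardware.
        for step in self.components:
            step(self.cycle)
        self.cycle += 1


def run_lockstep(nodes, cycles):
    """Advance every node one emulation cycle at a time.

    Finishing the inner loop before starting the next cycle stands in for
    the synchronization that keeps the nodes' relative progress equal.
    """
    for _ in range(cycles):
        for node in nodes:
            node.run_cycle()
        assert len({n.cycle for n in nodes}) == 1   # all nodes in step


def dummy_cycles_per_word(network_rate, node_rate):
    """Idle network cycles to insert after each real data word so the
    delivered bandwidth does not exceed what the emulating node sustains."""
    if network_rate <= 0 or node_rate <= 0:
        raise ValueError("rates must be positive")
    if node_rate >= network_rate:
        return 0
    return math.ceil(network_rate / node_rate) - 1
```

As an example with assumed figures, a network running at 100 megabytes/second/port driven by a node that sustains 25 megabytes/second would need three dummy cycles after every real data word.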
Intel's 80960CF [MB88] [Int89c] serves as the node processor. Initially, the node memory will be flat, fast static RAM for simplicity and maximum emulation flexibility. The basic design will accommodate additional DRAM memory and co-processors if these seem appropriate during the design and use of the experimental prototypes. A custom network interface component of our design connects the node to the METRO-based network (tn75). Additionally, a custom node bus controller is responsible for coordinating the interactions of the processor, memory, and network interfaces (tn30).