Transit Note #103
Computational Quasistatics
Andre DeHon
Ian Eslick
Original Issue: February, 1994
Last Updated: Wed Oct 25 01:23:17 EDT 1995
Abstract:
We present a quasistatic computational model which provides the
performance advantages of static computation while achieving the
expressiveness and flexibility inherent in dynamic computing models.
Feedback information from run time provides the compiler with details of
program dynamics. This additional information enables the compiler to
evolve efficient and specialized static execution sequences for commonly
occurring dynamic computations. By incorporating the compiler into the
lifetime of each program, quasistatic computation allows the compiler to
automatically adapt programs to actual usage. This paper introduces a
conceptual framework intended to facilitate system-level thought about the
role of feedback and program adaptation in the design and optimization of
computer systems. We conclude the paper by examining the implications of
this computational paradigm for future computer systems, programming
languages, and architectures.
Introduction
In present computing systems, we can distinctly characterize many
elements of our computations as static or dynamic. Informally, static
computations are characterized by predictable control flow, timing, and
data access patterns. Static elements of computation can be determined and
scheduled at compile time. In contrast, dynamic elements of computation
are characterized by frequent, often unpredictable, changes in control
flow, variable timing, and random data access patterns. Over the past
decade, we have seen continuous advances in computational performance.
Notably, however, the rate of performance increase for static computations
has been much larger than the rate for dynamic computation. There are two
key reasons for the growing performance gap:
- Compiler Technology -- Advances in compiler technology have
allowed us to analyze and aggressively reorganize code around information
known at compile time in order to reduce execution latency. Since static
computations expose more information that is known and predictable at
compile time, the compiler can apply higher-quality and more aggressive
optimization to static computations. Many of these optimizations
exploit properties and values known at compile time to effectively move
computation from run time to compile time.
- Architecture Trends -- Our ability to deliver high throughput
at reasonable costs has increased more rapidly than our ability to deliver
comparable reductions in latency. For instance, we can easily employ
parallelism at various levels ( e.g. ILP, pipelining, multiprocessing)
to increase raw throughput. With predictable control flow and data
access, we can better exploit throughput to reduce task execution latency.
When dynamic dependencies must be resolved to determine the execution path
or data access, resolution latency is the key factor limiting the rate of
task execution rather than available throughput.
Nonetheless, dynamic computations are essential to our computing systems:
- Natural Unpredictability -- Many events and objects with
which our computations must contend are naturally unpredictable.
- Expressiveness -- Many computations are more easily
expressed in a dynamic form.
- Flexibility -- The late binding inherent in dynamic
computations often defers commitments offering considerably more
flexibility than purely static computations.
The goal of quasistatic computation is to narrow the gap between static
and dynamic computation by capturing the best elements of both.
Quasistatics introduces a feedback path between run-time execution and
compilation to provide the compiler with information about run-time
dynamics. Consequently, the compiler obtains information about program
behavior that is not traditionally available during the compilation of
dynamic code. This feedback allows the compiler to better predict control
flow, timing, and data access patterns.
In micro forms, we have already seen quasistatic approaches used to
reduce execution latency ( e.g. Trace Scheduling
[Fis81], Profile-driven Compilation [Wal91]
[Sam91], Synthesis [MP89]). Nonetheless, the
techniques to date have been relatively independent and ad hoc. In
this paper, we provide a paradigmatic framework for integrating feedback
and recompilation into the basic computational model. This framework is
intended to facilitate system-level thought about the role of feedback and
program adaptation in the design and optimization of computer systems.
Since ``static'' and ``dynamic'' are actually relative terms, the
following section examines static and dynamic computation in more depth to
better characterize their computational properties and implications.
The section on Quasistatic Computation then formulates the quasistatic
computational paradigm and provides examples of quasistatic utility and
employment. Mature research in several areas provides many of the basic
building blocks which make quasistatic techniques ripe for exploitation. We
highlight some of these enabling technologies and recognize the intellectual
heritage of quasistatic computation in the section on Foundational and
Enabling Work. Before concluding, we take a brief look at the implications
which the adoption of quasistatic computing styles will have on future
computing systems (see Implications).
Static and Dynamic Computation
Static: showing little change; characterized by a lack of movement,
animation or progression
Dynamic: marked by a continuous usually productive activity or change
When describing computer systems and artifacts, we generally apply the
words ``static'' and ``dynamic'' in concrete terms to distinguish
characteristics. The terms are applied to indicate when changes may occur,
when decisions are made, or when binding occurs. Usually these terms are
used to refer to particular, often orthogonal, aspects of a computer system
such as typing, scheduling, or memory management. In this section we
highlight common static and dynamic aspects of computer systems and their
implications.
Static Computation
Loosely, computer scientists define aspects of a computation which
can be resolved and fixed at compile time as static. Hence, we have:
- Static typing ( e.g. FORTRAN)
- Static scheduling ( e.g. traditional basic-block scheduling;
traditional, single-threaded control flow)
- Statically allocated memory ( e.g. top-level globals in most
languages)
Beyond these examples, where our nomenclature clearly identifies
them as static, many other aspects of our programming languages can exhibit
static properties:
- Operators -- In languages like C, Pascal, and FORTRAN the
basic operators have fixed meaning which can be unambiguously
determined at compile time.
- Functions, Subroutines -- Some languages and systems require
that all functions be defined and resolvable at compile or link time.
- Data Structures -- Arrays in most languages and fixed
record structures in languages like COBOL, Pascal, or C, can exhibit
regular structure which makes data access predictable and allows
compile-time computation of many addresses and offsets.
Modern compiler technologies allow us to exploit all of these static
structures to achieve high performance. With full knowledge of operators
and subroutines, a compiler can aggressively reorder code for lower latency
execution, sometimes breaking down artificial abstraction barriers like
procedure and function invocations. Much of the work associated with
evaluating a computation can be performed at compile time. The executable
produced simply evaluates the data- and control-flow dependent residue of
the computation described by the original program.
Almost no practical programming language can be termed purely static,
but languages do vary in their static expressiveness. FORTRAN serves as a
strong example of a mostly static language. FORTRAN does have dynamic
aspects in flow control, but the language tends to encourage structures
with large amounts of predictability. Numerical computations in FORTRAN
constitute the most canonical example of ``static'' computations. The
literature contains considerable evidence that the static aspects of this
class of computations can be exploited to achieve very high computational
performance.
Dynamic Computation
Similarly, computer scientists define aspects of a computation
which are not resolved until employed at run time as dynamic. We
have:
- Dynamic typing ( e.g. run-time, polymorphic typing in LISP and
many object-oriented programming languages)
- Dynamic scheduling ( e.g. dynamic dataflow, traditional
operating-system scheduling and multithreading)
- Dynamically allocated memory ( e.g. heap allocated data such as
cons in LISP, or malloc in C)
- Dynamic exception handling and error detection ( e.g. page faults)
- Dynamic address translation ( e.g. virtual memory systems)
- Dynamic linking ( e.g. Multics, SunOS)
Again, beyond our nomenclature, many aspects of our programming
languages can exhibit dynamic properties:
- Operators -- In most dialects of LISP, the
basic operators can be redefined as part of a computation and hence
must be resolved each time they are used at run time.
- Functions, Subroutines -- Languages like LISP
provide expressive freedom by allowing operator creation and
redefinition during computation. This expressive power is accompanied
by the requirement for run-time resolution. Similarly, many
object-oriented languages have convenient operator inheritance
relationships which can only be resolved at run time.
- Pointers -- Languages like C and LISP allow flexible and
spatially efficient data structures to be created from pointers.
Data access requires run-time traversal of these data structures.
- Aliasing -- All languages suffer to some extent from the
dynamic aspects of aliasing. Two different ``names'' can refer to
the same ultimate object. This property forces compilers to forgo
many potential optimizations to preserve the language semantics.
- Unpredictable Latency -- Almost all non-dedicated computer
systems exhibit unpredictable latencies in memory and network access
times. Time-shared systems further complicate the predictability of
task completion rates. This lack of predictability makes static
scheduling inefficient, virtually necessitating some level of dynamic
scheduling to efficiently tolerate variable latency operations.
In many ways, dynamic support allows programs to be more efficient,
expressive, and flexible than they would be otherwise. Nonetheless,
bindings and decisions which must be resolved at run time are costly for
several reasons:
- Interpretive overhead -- Decisions cost execution time to
resolve when postponed to run time.
- Indirection overhead -- Run-time resolution often requires
translation or lookup. This adds overhead time, often requires
additional memory bandwidth for resolution, and makes memory
access less predictable.
- Reduced knowledge and power -- Decisions
made at run time must be cheap to compute and must rely on a minimum
amount of information. Consequently, these decisions are forced
to be simple and local. In contrast, off-line decisions made by
the compiler can afford to take a more global perspective and
perform more detailed analysis before committing to a decision.
LISP and C both exhibit a wealth of dynamic properties. Clever
programmers are aware of the dynamic and static aspects of their respective
programming languages and the system implementations they employ.
Consequently, they will strategically strength reduce the facilities they
employ as dictated by the requirements of their application.
Unfortunately, the distinction between static and dynamic computational
aspects is often subtle. The programmer must have a sophisticated
understanding of operator semantics to properly perform this kind of
optimization. Further, if a value is going to change at all after
compilation, the programmer is forced to adopt a dynamic solution
regardless of the actual frequency of the change.
Symbolic processing and system-level management are canonical examples of
``dynamic'' computations. These are characterized by heavy pointer
manipulation, frequent and statically unpredictable jumps in control flow,
statically unpredictable operation latency, and regular rebinding of data
and operators. The literature presents evidence that this class of
programs is much harder to optimize than FORTRAN numerical code.
Quasistatic Computation
Outside of computer science the terms ``static'' and ``dynamic'' serve
as relative quantifiers rather than absolutes. ``Static'' does not mean
fixed and unchanging. Rather it connotes infrequent change. Similarly, we
can adopt a compilation and computation model which does not make such a
rigid and absolute resolution-time distinction. We use the term
``quasistatic'' to describe this computational model which allows dynamic
changes to occur but also judiciously employs recompilation to update the
executable instruction stream to reflect information acquired from
run time. Here, the executable may remain unchanged for relatively long
periods of time, for example a program run, but is allowed to change to
adapt to common usage conditions.
An important observation is that entities whose values and properties are
allowed to change under dynamic computational models do not, necessarily,
change as frequently as they are used. For example, an operator binding
does not, normally, change with every use of the operator. Further, the
sequence of operator bindings employed by a program may remain mostly
constant from run to run. Similarly, a variable's value may remain mostly
constant over a program run and may even be constant or highly
predictable over a sequence of program runs. If we can discover these
quasistatic values, bindings, or other characteristics of a program, we can
feed them back to the compiler. The compiler can use its knowledge of the
quasistatic values to produce code optimized for the stable and predictable
epochs of these values. While optimizing around fed back quasistatic data,
the compiler may perform non-local code reorganization and take advantage
of other techniques normally restricted to the compilation of mostly static
code. Of course, the compiler must also assure that code so compiled
can properly handle changes that invariably occur and signal a change
of quasistatic epochs.
We can view this invariant factorization as a complex form of code motion
or hoisting. By empirically recognizing epochs and properties which are
invariant within each epoch, we can factor computations involving the
invariant out of the epoch, leaving only the residue computation which
results from the invariant evaluation. Factoring and residue generation
are exactly the function performed by partial evaluation
[JGS93].
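As a minimal illustration of factoring and residue generation (the routine
and values here are hypothetical, chosen only to show the shape of the
transformation), suppose feedback shows that the exponent passed to a small
integer-power routine is invariably 3 over long stretches of execution:

/* General routine: the exponent is a run-time value. */
int power(int x, int n)
{
  int result = 1;
  while (n-- > 0)
    result *= x;
  return result;
}

/* Residue after factoring out the empirically invariant n == 3:
   all loop control has been evaluated away at compile time. */
int power_3(int x)
{
  return x * x * x;
}

/* Guarded use: the specialized residue is valid only while the
   invariant holds; otherwise fall back to the general routine. */
int power_dispatch(int x, int n)
{
  return (n == 3) ? power_3(x) : power(x, n);
}

The check in power_dispatch is the price paid for having no static
guarantee that the invariant actually holds.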
With quasistatics, we use run-time feedback to identify invariants which
empirically exist over portions of code for relatively long periods of
execution time even when there is no static evidence or guarantee that such
invariants exist. There are two key reasons why such invariants may
arise:
- A value or property may have the potential to be dynamic, but
that option is not being exercised or the option is being exercised
only infrequently. For example, a variable in a
program may be taken from a configuration file read when the program
is invoked. The user may have the option to change the
configuration, but may change it very infrequently. By discovering
that the value is constant across many computations, the compiler
can generate an executable for the user where this
value is treated as a constant. The constant can be propagated
using classical propagation techniques to generate a more
efficient, specialized executable for the user. If the user does
change the variable, the compiled code recognizes that its
invariant has changed and invokes the more general execution
sequence instead of the specialized one. If this new value remains
invariant, the compiler can then generate a version specialized
around the new quasistatic value.
- A value or property may, in fact, be invariant. Discovery of the
invariant may be beyond the state of the art in static analysis
or may require exponential computation. Without resorting
to overly expensive static analysis, empirical, run-time feedback can
be used to identify promising invariants.
Dynamic, Run-time Feedback
Dynamic, run-time feedback is the key to quasistatic computation. By
monitoring run-time behavior we can acquire information about quantities
unknown at compile time. For these unknown quantities, we can use feedback
to obtain:
- Values -- the typical value or range of values exhibited by
a variable or property ( e.g. a given variable is always an integer;
executing a given procedure requires 1000 cycles)
- Stable Epochs -- ranges over which a variable or property
is usually constant ( e.g. a given variable is not changed during this
procedure)
- Variance -- how tightly bound is a value, or how well can we
predict a value ( e.g. 98% of the time a given procedure
executes in 1000±10 cycles)
- Correlations -- predictability of values based on other
values ( e.g. when one variable holds a particular value, 99.9% of the
time a given procedure executes in about 100 cycles; a sketch of how such
statistics might be collected follows this list)
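As a concrete, purely illustrative sketch of how such value statistics
might be gathered cheaply, the following fragment shows the kind of
compiler-inserted instrumentation a quasistatic system could attach to an
indirect call site; the structure and function names are hypothetical, not
part of any existing profiler:

#include <stddef.h>

#define PROFILE_SLOTS 4

/* One record per profiled call site: a small table of observed
   callee addresses and how often each was seen. */
struct callsite_profile {
  void *callee[PROFILE_SLOTS];
  long  count[PROFILE_SLOTS];
  long  other;                 /* values that did not fit in the table */
};

/* Compiler-inserted instrumentation: executed just before the indirect
   call, it records which function the pointer resolved to. */
static void profile_indirect_call(struct callsite_profile *p, void *callee)
{
  int i;
  for (i = 0; i < PROFILE_SLOTS; i++) {
    if (p->callee[i] == callee) { p->count[i]++; return; }
    if (p->callee[i] == NULL)   { p->callee[i] = callee; p->count[i] = 1; return; }
  }
  p->other++;                  /* table full: lump remaining values together */
}

A later recompilation pass that finds a single callee dominating the counts
can specialize the call site around that binding, as in the placement
example below.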
It is important to remember that information obtained empirically from
run-time is only predictive. We can never use this information to
conclude that we will only see certain values. Nonetheless, we can
determine the normal cases and specialize the executable code accordingly.
When we take advantage of run-time feedback, we make the explicit
assumption that the future resembles the past. This is often a good
assumption to make, and the variance over time provides a good indication
of where it is most fruitful to apply this assumption.
To see why the assumption is useful, we can review some of the components
which contribute to each execution signature captured by run-time feedback:
- (1) Some of the characteristics we are capturing are simply
representative of the program being run ( e.g. function
usage, certain access patterns).
- (2) Some characteristics are representative of the program running on a
particular computer ( e.g. access time for references,
miss rate).
- (3) Some characteristics will represent the effects of external
resource usage and sharing ( e.g. network loading while the
program was run).
- (4) Some characteristics are indicative of the gross data set
( e.g. the characteristics may be a signature of running spice
on a 100-node CMOS circuit).
- (5) Some characteristics will be representative of the particular data
sets employed.
We can optimize around characteristics generated by (1) and (2) and
reasonably expect that the future will resemble the past. By determining
variance and correlation, we can determine the predictable effects of (3)
and optimize around them. We may also be able to determine strategic
correlations ( e.g. at 1 AM, we can expect to see higher
throughput and lower latency out of our LAN than at 1 PM). To the
extent that a user runs similar datasets, we can expect the future to
display similar characteristics due to (4) and (5). Again, variance
over time helps the compiler to separate gross characteristics which recur
from characteristics which are unique to a few datasets.
A Simple Example
Consider the following code snippet from a hypothetical placement program
written in C:
/* Cheap cost metric: Manhattan distance between the two endpoints. */
manhattan_cost(x1,y1,x2,y2,placed)
int x1,y1,x2,y2;
OBJECT_ARRAY placed;
{return(abs(x1-x2)+abs(y1-y2));
}
/* More expensive, more accurate metric based on channel densities. */
density_cost(x1,y1,x2,y2,placed)
int x1,y1,x2,y2;
OBJECT_ARRAY placed;
{return(horizontal_channel_density_approximation(x1,x2,placed)+
        vertical_channel_density_approximation(y1,y2,placed));
}
. . .
place_objects()
{
. . .
 for (obj=all_objects;obj!=NULL;obj=obj->next)
   {
    . . .
    pick_new_location(&xnew,&ynew,obj,placed);
    new_cost = 0;
    /* The cost metric is selected indirectly through a function pointer. */
    for (links=obj->connections;links!=NULL;links=links->next)
      new_cost += (*cost_function)(xnew,ynew,links->x,links->y,placed);
    . . .
   }
. . .
}
. . .
. . .
place_objects is reused during different phases of placement and
optimization to place the objects and recompute the value of each
placement. During early phases of optimization, the cheap cost function
( manhattan_cost) is used. In later phases, a more expensive and
accurate cost function ( density_cost) is used instead. At a high
level, the optimization manager decides when to change the cost metric used
for each phase of optimization.
Unfortunately, since the cost function is accessed via indirection
through a pointer whose value is not constant, the cost function
computation cannot be inlined and optimized at compile time. With run-time
feedback, the compiler can discover the range of values taken on by
cost_function. During a recompilation pass, the compiler can then
inline the computation. For example, after determining that 80% of the
time cost_function resolves to manhattan_cost, the compiler might
reorganize the code as follows:
place_objects()
{
. . .
if (cost_function == &manhattan_cost)
{for (obj=all_objects;obj!=NULL;obj=obj->next)
{
. . .
pick_new_location(&xnew,&ynew,obj,placed);
new_cost = 0;
for (links=obj->connections;links!=NULL;links=links->next)
new_cost +=(abs(xnew-links->x)+abs(ynew-links->y));
. . .
}
}
else
{ [original code with indirection]
}
. . .
}
. . .
Here, we avoid a pointer indirection and anonymous procedure call overhead
in a frequently executed portion of code.
If the compiler cannot guarantee that the value of the
cost_function pointer is not modified inside the if block, it may
mark the page containing cost_function as read-only. In this
manner, the code can identify any unanticipated change in the value. When
an unexpected change does occur, patch-up code can be executed to assure
proper program semantics.
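A minimal sketch of this page-protection guard follows, assuming a POSIX
system on which cost_function has been placed on a page by itself; the
helper names are ours, and a production system would need considerably more
care, so this is illustrative only:

#include <signal.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

extern int (*cost_function)();     /* assumed to sit alone on its own page */
static volatile sig_atomic_t invariant_broken = 0;

/* Round an address down to the start of its page (page size assumed
   to be a power of two). */
static void *page_of(void *addr)
{
  uintptr_t mask = ~((uintptr_t)sysconf(_SC_PAGESIZE) - 1);
  return (void *)((uintptr_t)addr & mask);
}

/* Fault handler: an unexpected write to the protected page means the
   quasistatic assumption no longer holds.  Re-enable writes so the
   store can complete, and remember to use the general code path. */
static void on_protection_fault(int sig, siginfo_t *si, void *ctx)
{
  invariant_broken = 1;
  mprotect(page_of(si->si_addr), (size_t)sysconf(_SC_PAGESIZE),
           PROT_READ | PROT_WRITE);
}

static void guard_cost_function(void)
{
  struct sigaction sa;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  sa.sa_sigaction = on_protection_fault;
  sigaction(SIGSEGV, &sa, NULL);

  /* Write-protect the page holding cost_function; any later update to
     the pointer traps into on_protection_fault above. */
  mprotect(page_of((void *)&cost_function), (size_t)sysconf(_SC_PAGESIZE),
           PROT_READ);
}

The specialized code would consult invariant_broken (or simply re-test the
pointer) at a coarse granularity; lighter-weight, trap-free alternatives are
discussed in the section on enabling work.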
Range of Application
Feedback can be beneficial for almost any quantity which is unknown at
initial compile time. For example:
- Aliasing -- Feedback can indicate typical address values
and hence aliasing. The compiler can compile code specialized to
the aliasing which actually occurs. To assure correctness, the
code can insert dynamic checks to guarantee that specialized code is
only used in place of general code when appropriate. Of course,
this kind of transformation is only beneficial when the aggregate
savings due to specialization justify the added checks. As we
saw in the placement example above, these assumption checks can
often be hoisted out of frequently executed code segments (a sketch of
such an aliasing guard follows this list).
- Timing -- Often unpredictable timing necessitates dynamic
scheduling. With feedback on operation timings, including data
access latencies and interprocessor communication delays, the
compiler can more reasonably schedule many computations statically.
( e.g. With accurate access latency estimates, the compiler
can schedule prefetch to minimize stalls in computations.)
- Procedure Bindings -- As demonstrated in the previous
section, anonymous function bindings can be resolved via
feedback. When bindings are highly predictable, they can be
inlined and optimized just as efficiently as function calls
which are statically apparent.
- Values -- Common values can be identified via feedback.
Using partial-evaluation techniques, we can generate code
segments specialized around the common values. When
encountered, the specialized code can be used in place of the
general code.
- Ranges -- Knowing the typical range of a data value often
enables optimizations which are not possible when the value is
completely unknown. Here again, we can specialize code to run
most efficiently when the data value falls into the range most
commonly seen at run time.
- Control Flow -- Information on past control flow increases
the predictability of program control flow. Increased
predictability allows the compiler to exploit additional
throughput to reduce task execution latency.
- Access/Usage Patterns -- Good a priori estimates
of resource and memory usage allow the compiler to tailor better
strategies for resource management. ( e.g. If we can
predict the typical amount of dynamic memory and chunk-size used
by a program, we can do a better job of allocating the memory to
reduce management overhead and avoid memory fragmentation than
traditional, knowledgeless algorithms
[BZ93] [GZ92]. Knowledge
of access and communication patterns allows the compiler to
redistribute data and code to improve locality and minimize cache
conflicts.)
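To make the aliasing guard above concrete, here is a small, hedged sketch
(a hypothetical routine; the restrict qualifiers stand in for whatever
no-aliasing assumption the compiler would exploit) of specialized and
general versions selected by a run-time overlap check:

#include <stddef.h>
#include <stdint.h>

/* Version compiled under the no-aliasing assumption: the restrict
   qualifiers let the compiler reorder, unroll, or vectorize freely. */
static void scale_noalias(double * restrict dst, const double * restrict src,
                          size_t n, double k)
{
  size_t i;
  for (i = 0; i < n; i++)
    dst[i] = k * src[i];
}

/* General version: makes no assumption about overlap. */
static void scale_general(double *dst, const double *src, size_t n, double k)
{
  size_t i;
  for (i = 0; i < n; i++)
    dst[i] = k * src[i];
}

/* Dispatch: a cheap run-time check selects the specialized code only
   when the no-aliasing assumption actually holds. */
void scale(double *dst, const double *src, size_t n, double k)
{
  uintptr_t d = (uintptr_t)dst, s = (uintptr_t)src;
  if (d + n * sizeof(double) <= s || s + n * sizeof(double) <= d)
    scale_noalias(dst, src, n, k);
  else
    scale_general(dst, src, n, k);
}

The guard costs a handful of instructions and, as noted above, pays off only
when the specialized loop does enough work to amortize it; where possible it
should be hoisted outside frequently executed inner loops.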
Advantages
Quasistatic computation is aimed at reducing run-time execution latency
while allowing dynamic expressiveness and flexibility. In this section we
examine the general advantages of quasistatic computing from several
perspectives.
Eliminate Unused Functionality
One key optimization is
aggressive operator and data structure strength reduction. When values and
functions are statically known, the compiler can often determine the exact
functionality required by an operator or data structure and strength reduce
it to cheaper operations which have the same semantic effect. Using
feedback data in a quasistatic computing model, we gain the same advantages
by noting the features actually employed. As always, we need to provide an
escape mechanism which allows the computation to be patched up if the
additional functionality is, unexpectedly, required. In many ways, this
strength reduction can take the burden of optimization and function
selection off the programmer since unused functionality contained in the
operators will be discovered and specialized away. For instance, a LISP
list which is created with uniform data elements and is not
arbitrarily ``grown'' can be replaced with an array. In many cases, this
will make traversal and access much cheaper since the array requires less
indirection to access any given element. This class of transforms can
achieve a significant reduction in the run-time overhead associated with
more powerful data structures by moving the overhead associated with
data-structure traversal from run time to compile time.
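A rough C rendering of this transformation (illustrative only) shows the
per-element indirection that disappears once feedback establishes that the
sequence has a fixed length and a uniform element type:

#include <stddef.h>

/* General representation: a heap-allocated list of uniform elements.
   Every access chases a pointer per element. */
struct cell { double value; struct cell *next; };

double sum_list(struct cell *head)
{
  double total = 0.0;
  struct cell *c;
  for (c = head; c != NULL; c = c->next)
    total += c->value;
  return total;
}

/* Specialized representation after feedback shows the list is never
   grown and always holds doubles: a flat array, no per-element
   indirection, and addresses computable at compile time. */
double sum_array(const double *values, int n)
{
  double total = 0.0;
  int i;
  for (i = 0; i < n; i++)
    total += values[i];
  return total;
}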
Common-Case Optimization
Automated, common-case optimization
will be another key benefit of quasistatic compilation. Feedback from
run-time tells the compiler what paths are most commonly executed and what
data values occur most commonly. The compiler can reorganize and
restructure executable code to reduce the execution latency in the most
common cases.
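A simple cost model (our own hedged formulation, not drawn from the cited
literature) makes the trade-off explicit. If the specialized path applies
with probability p, costs T_spec when it applies, and is guarded by a check
costing T_check, then the expected latency is

    E[T] = p\,(T_{\mathrm{check}} + T_{\mathrm{spec}}) + (1-p)\,(T_{\mathrm{check}} + T_{\mathrm{general}})

and specialization is a net win over always running the general code exactly
when

    p\,(T_{\mathrm{general}} - T_{\mathrm{spec}}) > T_{\mathrm{check}},

that is, when the savings on the common path outweigh the cost of the guard.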
Execution Latency Compression
We can also view quasistatic
computation from an information theoretic perspective. If a quantity is
known statically, or is highly predictable, the run-time value of the
quantity conveys little or no information. The known value can be folded
into computation at little run-time cost on average. Values whose
probability distributions are highly non-uniform admit compression using
non-uniform coding sequences, or, in this case, non-uniform execution
sequences. Only
quantities which are completely unpredictable cannot be compressed by
recoding and hence must be fully evaluated each time they are encountered
at run time. Quasistatic computation including adaptive recompilation,
then, is directly analogous to adaptive data compression.
Embodiments
Quasistatic computation is only a net benefit to the extent that it
reduces execution latency. Certainly feedback analysis and recompilation
require additional computation. Fortunately, here we can readily exploit
throughput to perform the additional computation.
At a coarse granularity, we can exploit otherwise ``idle'' machine cycles
between program runs to perform analysis and recompilation. When the
user is away pondering the results of a previous run, or at home for the
night, his workstation can analyze feedback and regenerate executables. In
the most basic incarnation, the compiler can tune execution and replace
whole executable binaries between program invocations. This approach is
probably the easiest to introduce into existing computer systems. However,
limited collection bandwidth and storage space may necessitate very frugal
collection of feedback data. Additionally, this scheme does not allow
programs to adapt to usage during a single execution.
On parallel or multithreaded systems, we might implement one or more
compilation servers. When application tasks do not consume all the
resources on the machine, compiler threads can analyze feedback data and
generate new execution sequences. Compilation threads will be
coarse-grained, so they can easily and efficiently exploit the throughput
available on parallel computers. ``Idle'' workstations on a LAN can also
serve efficiently in this capacity. Again, care must be taken to limit the
quantity of feedback data. Dynamic adjustment can be used to
match the data to the available collection bandwidth and analysis cycles.
Both of the previous suggestions borrow only spare throughput to perform
recompilation. However, strategic analysis and recompilation interspersed
with a program can also result in net reductions in application latency.
Interspersed analysis requires much more attention to analysis and
recompilation efficiency. This approach allows executable adaptation
during long program runs.
In addition to these cases where recompilation is initiated by an
external agent in response to feedback, the executable can explicitly
request compilation. In cases where earlier feedback suggests that a value
is constant past a point in the program, but the value of the constant is
unpredictable, it may be beneficial for the program to explicitly request
recompilation at that point during program execution. Several researchers
have demonstrated the advantages of this kind of run-time code synthesis in
various scenarios. Massalin and Pu employ run-time compilation in the
Synthesis kernel [MP89] to generate specialized kernel
operations. Keppel, Eggers, and Henry demonstrate that the explicit use of
run-time code generation can often produce better executable code than is
possible with purely static compilation [KEH93].
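To fix ideas, the following sketch shows what an explicit request might look
like at the source level. The quasistatic_specialize() hook and its behavior
are purely hypothetical, standing in for whatever interface a quasistatic
runtime would actually export:

extern void simulate_network(int node_count);
extern int  more_timesteps(void);

/* Hypothetical runtime hook: return a version of the given routine
   specialized around the supplied invariant, generating it on the fly
   if necessary. */
extern void (*quasistatic_specialize(void (*general)(int), int invariant))(int);

void run_simulation(int node_count)
{
  void (*simulate)(int);

  /* Feedback indicates node_count never changes after this point in a
     run, but its value cannot be predicted at initial compile time. */
  simulate = quasistatic_specialize(simulate_network, node_count);

  while (more_timesteps())
    simulate(node_count);
}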
Foundational and Enabling Work
Quasistatics continues the intellectual legacy of common-case
optimization, trace-scheduling, and profile-driven compilation. Manual
optimization for the normal case is a well known system technique
[Lam84] for achieving good average performance. As a system
design technique, common-case optimization relies on the designer to
collect data, analyze the feedback, and optimize accordingly. Trace
Scheduling [Fis81] [FERN94] and
profile-driven ( e.g. [Wal91] [Sam91])
compilation take this idea one step further and automate code
reorganization based on profile traces from sample program runs.
Quasistatics takes the next step beyond these techniques by integrating
optimization based on observed common cases into the lifetime of the
program.
Run-time code generation research also plays an inspirational and
foundational role in quasistatic computation. The Synthesis kernel
effectively demonstrates the power of invariant factorization to produce
highly efficient, specialized functions at run time
[MP89]. Value-specific optimization demonstrates more
general and portable application of run-time code generation
[KEH93]. Deferred compilation employs static, staging
analysis on functional programming languages to automatically determine
opportunities for invariant factorization [LL93].
Quasistatic computation completes the picture by allowing the compiler to
automatically recognize opportunities for run-time code generation which
are not easily identified by static analysis.
Maturing research in partial evaluation and specialization ( e.g.
[Ber90] [JGS93]) provides us with
the tools to take advantage of known data values to produce streamlined
executables.
Quasistatic optimization would be of limited utility without efficient
techniques for lightweight profiling and tracing of execution flow and data
values. Many recent developments in this area play a key role enabling
quasistatic computation. Appel introduces simple, compiler-inserted
profiling annotations which support lightweight profile collection
[ADM88]. Larus demonstrates the efficiency of modern
techniques for compiler-based program tracing
[Lar93]. Wahbe presents efficient schemes for data
value monitoring during program execution [WLG93]. His data
tracing techniques allow programs to identify when values change and can be
much more lightweight than trap-based techniques such as the read-only,
virtual-memory page scheme suggested in the placement example above.
In quasistatic computation, the compiler may be called upon continuously
to incrementally recompile portions of the program. Several compilers for
object-oriented languages, including the SELF compiler
[CUL91], employ dynamic compilation to generate executable
code from a simple intermediate form during execution. Notably, since the
SELF compiler is employed dynamically, it can often exploit knowledge of
data values which are not known until run time. Researchers at Sun
Microsystems have developed techniques for modifying more conventional
compilers for efficient incremental recompilation
[FST91].
Implications
As a discipline, we recognize the importance of analyzing dynamic
behavior and optimizing our systems accordingly. While feedback is used in
spot applications, we have not, traditionally, included it as a fundamental
part of our computational model. As noted herein, there are many good
reasons to integrate automatic feedback generation, consumption, and
feedback-oriented adaptation into our base-line computing model.
Once we accept this new model, many optimizations which the clever
programmer could make can now be automated and performed by a clever
compiler. This has the obvious benefit of taking some of the burden for
extracting performance off the application and system programmers. More
fundamentally, though, the model protects our programs against inevitable
changes in technology, changes in external systems, and shifts in usage
patterns.
One key shift which quasistatic computation encourages is a general move
to more explicit management of resources ( e.g. caches, virtual memory
paging, memory management schemes). Rather than relying on oblivious
mechanisms which simply react to events at run time, we can predict likely
events and usage patterns. We can often avoid the overhead associated with
belated discovery of an exceptional case ( e.g. data item not in the
cache, TLB miss) by explicitly predicting that it will occur and providing
a hint in the instruction stream to explicitly handle or prepare for the
event.
Future Computer Systems
Our systems should be designed to:
- Collect statistical and usage information -- Collection should be
cheap and flexible. At times the compiler will require more
detailed information. Ideally, it should be possible to get more
detailed feedback data on some aspect or portion of a system while
collecting less information on the rest of the system.
- Communicate statistical and usage information to the compiler --
Many of our contemporary systems can collect this kind of
information, but the consumers have traditionally been people.
Future systems will need to seamlessly provide this information to
the compiler as needed.
- Accept hints issued explicitly by the instruction
stream -- When a priori knowledge of future events can make
processing more efficient, the system should be designed to accept
and utilize hints provided by the executing program. Ideally,
there will be correlation between the kinds of statistics and usage
information a system collects and the kinds of hints it can
exploit.
- Support speculative operations whenever possible -- Much of the
advantage afforded by quasistatic computation results from
aggressive optimization for identified, common-case execution.
Speculative support allows programs to execute down paths which may
turn out to be incorrect without causing irreparable damage.
The more freedom the system provides for such speculation, the more
performance quasistatic computation can extract.
Future Architectures
To some extent, quasistatic computation allows us to better exploit the
characteristics we already know how to architect well. In particular,
quasistatic computation sets out to increase predictability allowing us to
better exploit throughput to increase performance. This shows up partially
as larger statically scheduled code blocks. It also shows up in the
accurate estimates of timing and data requirements, allowing data access
latency to be more completely overlapped with useful computation.
New architectures should explicitly consider the system support suggested
in the previous section:
- Feedback -- Computational devices should be designed
to allow lightweight extraction of relevant statistical
information.
- Hints -- Almost any heuristically managed resource (
e.g. caches, TLBs) should provide support for explicit
management from the instruction stream. Support for software
prefetching [CKP91] is a good example of such a
hint; non-binding prefetch [RL92] interacts better with speculatively
executed prefetch operations (a sketch of a feedback-guided prefetch hint
follows this list).
- Speculation -- In traditional architectures, a crude form of
common-case speculation and backout is provided via faults and
traps. Unfortunately, the overhead associated with these traps on
modern processors is extremely high, making such optimization
fruitful only for highly predictable events. Support for
common-case speculation with lightweight backout would increase the
range of worthwhile application. Torch, for example, introduces
support for instruction boosting across basic-block boundaries to
efficiently exploit speculation on superscalar architectures
[SLH90].
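As a small illustration of such a hint, the sketch below uses the GCC-style
__builtin_prefetch intrinsic as a stand-in for the hint mechanisms discussed
here; the prefetch distance of 8 is hypothetical and would in practice be
derived from measured miss rates and access latencies:

struct record { double weight; };

/* Walk a table of records whose access pattern feedback has shown to be
   regular.  The prefetch hint asks the memory system to begin fetching a
   future element so its latency overlaps useful computation. */
double sum_weights(struct record **table, int n)
{
  double total = 0.0;
  int i;
  for (i = 0; i < n; i++) {
    if (i + 8 < n)
      __builtin_prefetch(table[i + 8], 0, 1);  /* read, low temporal locality */
    total += table[i]->weight;
  }
  return total;
}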
With the advent of reconfigurable logic, specialization is not limited to
the executable instruction stream. Architecturally, we can now design
general-purpose computing systems which can be configured, on the fly, to
provide application-specific acceleration. While the field of custom
computing using reconfigurable logic is relatively young, researchers using
modest amounts of reconfigurable logic have already demonstrated orders of
magnitude higher performance than general-purpose microprocessors or
workstations on numerous applications [BP93]. One of the key
advantages of reconfigurable logic is that the architecture can be
specialized to match the computational requirements of the application.
Quasistatic computation can employ feedback to extract the critical
computational requirements for an application. Further, the quasistatic
compiler gathers information on the properties of the data normally seen
during the computation. Combining this information, the compiler can
synthesize minimum cost computing structures tailored to common-case
execution requirements.
Future Programming Languages
One key benefit of quasistatic computation is that we do not have to
forgo dynamic expressibility in order to achieve high performance. We have
seen that dynamic features do not inherently limit performance. Rather,
the unpredictability of dynamic quantities ultimately limits performance.
By adopting a quasistatic computational model, we recognize that all
dynamics are not equal, and we only pay a run-time cost for dynamic
features when they are used in unpredictable ways.
Languages and programming environments should be designed with the
feedback model in mind. Libraries should be able to provide multiple
implementation options for routines or data structures. Run-time feedback
will then allow the compiler to select the appropriate version for each
application ( e.g. [Low78] [Sam91]). As noted earlier,
some versions of a routine or data structure may be applicable only in a
limited context; that is, a version may not provide all the functionality
or be valid over the whole range of values supported by other versions.
As long as the programmer communicates the
range of applicability of each version to the compiler, the compiler can
select appropriate versions based on actual requirements seen during
execution.
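One possible shape for such a multi-version library interface, sketched in C
with hypothetical names; the essential point is simply that each version
advertises the conditions under which it is valid, so the compiler can bind
call sites to a version based on feedback and guard them with the
corresponding predicate:

#include <stddef.h>

/* A sort routine packaged with an applicability predicate.  The
   insertion-sort version is only worthwhile for short inputs; the
   general version is always valid. */
typedef struct {
  void (*sort)(int *data, size_t n);
  int  (*applicable)(size_t n);        /* 1 if this version may be used */
} sort_version;

extern void insertion_sort(int *data, size_t n);
extern void merge_sort(int *data, size_t n);

static int small_only(size_t n) { return n <= 32; }
static int always(size_t n)     { (void)n; return 1; }

/* The library exports its versions; feedback on the typical n seen at a
   call site lets the compiler bind that site to one version and guard it
   with the matching predicate. */
static const sort_version sort_versions[] = {
  { insertion_sort, small_only },
  { merge_sort,     always     },
};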
Conclusion
We have articulated a paradigm for computing which differs substantially
from traditional static and dynamic computational models. Quasistatic
computation allows us to have dynamic expressibility and flexibility while
obtaining performance comparable to purely static code. Feedback and
adaptive recompilation throughout program lifetime provide the key elements
which make this kind of optimization possible. In effect, quasistatic
computation embodies classical common-case optimizations in an automated
manner. Quasistatic computation can also be viewed as a technique which
exploits computational regularities to adaptively compress program run
time. Through continuous adaptation and recompilation, quasistatic
computation allows programs to deliver high performance even across changes
in external assumptions, technology, and usage patterns. Many technologies
which enable quasistatic computation are beginning to mature, making this
style of computing ripe for exploitation. Future computing systems should
support feedback, hints, and speculation to take full advantage of the
opportunities afforded by quasistatic computing.
References
- ADM88
-
Andrew Appel, Bruce Duba, and David MacQueen.
Profiling in the Presence of Optimization and Garbage Collection.
CS-TR 197-88, Princeton University, November 1988.
- Ber90
-
Andrew A. Berlin.
Partial Evaluation Applied to Numerical Computation.
In ACM Conference on LISP and Functional Programming. ACM, 1990.
- BP93
-
Duncan A. Buell and Kenneth L. Pocek, editors.
Proceedings of the IEEE Workshop on FPGAs for Custom Computing
Machines, Los Alamitos, California, April 1993. IEEE Computer Society, IEEE
Computer Society Press.
- BZ93
-
David Barret and Benjamin Zorn.
Using Lifetime Predictors to Improve Memory Allocation Performance.
In Proceedings of the SIGPLAN Conference on Programming Language
Design and Implementation, pages 187-196. SIGPLAN, ACM, June 1993.
- CKP91
-
David Callahan, Ken Kennedy, and Allan Porterfield.
Software Prefetching.
In Proceedings of the Fourth International Conference on the
Architectural Support for Programming Languages and Operating Systems, pages
40-52. ACM, April 1991.
- CUL91
-
Craig Chambers, David Ungar, and Elgin Lee.
An Efficient Implementation of SELF, a Dynamically-Typed
Object-Oriented Language Based on Prototypes.
LISP and Symbolic Computation, 4(3), 1991.
- DeH93
-
Andre DeHon.
Smart Compiler Advantages.
Transit Note 99, MIT Artificial Intelligence Laboratory, December
1993.
- DEK93
-
Andre DeHon, Ian Eslick, and Thomas F. Knight Jr.
Toward 21st Century Computing: Transit Perspective.
Transit Note 93, MIT Artificial Intelligence Laboratory, September
1993.
- DEMK93
-
Andre DeHon, Ian Eslick, John Mallery, and Thomas F. Knight Jr.
Prospects for a Smart Compiler.
Transit Note 87, MIT Artificial Intelligence Laboratory, June 1993.
- FERN94
-
Joseph A. Fisher, John R. Ellis, John C. Ruttenberg, and Alexandru Nicolau.
Parallel Processing: A Smart Compiler and a Dumb Machine.
ACM SIGPLAN Notices, 19(6):37-47, June 1984.
- Fis81
-
Joseph Fisher.
Trace Scheduling: A Technique for Global Microcode Compaction.
IEEE Transactions on Computers, 30(7):478-490, July 1981.
- FST91
-
Alastair Fyfe, Ivan Soleimanipour, and Vijay Tatkar.
Compiling from Saved State: Fast Incremental Compilation with
Traditional UNIX Compilers.
In Proceedings of the Winter 1991 USENIX Conference, pages
161-170. USENIX Association, 1991.
- GZ92
-
Dick Grunwald and Benjamin Zorn.
CustoMalloc: Efficient Synthesized Memory Allocators.
CU-CS-602-92, Department of Computer Science, University of Colorado,
Boulder, Colorado, July 1992.
- JGS93
-
Neil D. Jones, Carsten K. Gomard, and Peter Sestoft.
Partial Evaluation and Automatic Program Generation.
Prentice Hall International Series in Computer Science. Prentice
Hall, 1993.
- KEH93
-
David Keppel, Susan Eggers, and Robert Henry.
Evaluating Runtime-Compiled Value-Specific Optimizations.
Technical Report 93-11-02, Department of Computer Science and
Engineering, University of Washington, Seattle, WA 98195, November 1993.
- Lam84
-
Butler W. Lampson.
Hints for Computer System Design.
IEEE Software, 1(1):11-28, January 1984.
- Lar93
-
James R. Larus.
Efficient Program Tracing.
Computer, 26(5):52-60, May 1993.
- LL93
-
Mark Leone and Peter Lee.
Deferred Compilation: The Automation of Run-Time Code Generation.
CMU-CS-93-225, Carnegie Mellon University, December 1993.
- Low78
-
James Low.
Automatic Data Structure Selection: An Example and Overview.
Communications of the ACM, 21(5):376-385, May 1978.
- MP89
-
Henry Massalin and Calton Pu.
Threads and Input/Output in the Synthesis Kernel.
In Proceedings of the 12th ACM Symposium on Operating
Systems Principles, pages 191-201. ACM, December 1989.
- RL92
-
Anne Rogers and Kai Li.
Software Support for Speculative Loads.
In Proceedings of the Fifth International Conference on the
Architectural Support for Programming Languages and Operating Systems, pages
38-50. ACM, October 1992.
- Sam91
-
Alan Dain Samples.
Profile-driven Compilation.
PhD thesis, U.C. Berkeley, April 1991.
U.C. Berkeley CSD-91-627.
- SLH90
-
Michael Smith, Monica Lam, and Mark Horowitz.
Boosting Beyond Static Scheduling in a Superscalar Processor.
In Proceedings of the 17th International Symposium on Computer
Architecture, pages 344-354. ACM, May 1990.
- Wal91
-
David Wall.
Predicting Program Behavior Using Real or Estimated Profiles.
In Proceedings of the ACM SIGPLAN 1991 Conference on Programming
Language Design and Implementation, pages 59-70. SIGPLAN, ACM, 1991.
- WLG93
-
Robert Wahbe, Steven Lucco, and Susan Graham.
Practical Data Breakpoints: Design and Implementation.
In Proceedings of the ACM SIGPLAN 1993 Conference on Programming
Language Design and Implementation. SIGPLAN, ACM, 1993.
Sigplan Notices, Volume 28, Number 6.