Transit Note #103

Computational Quasistatics

Andre DeHon

Ian Eslick

Original Issue: February, 1994

Last Updated: Wed Oct 25 01:23:17 EDT 1995

Abstract:

We present a quasistatic computational model which provides the performance advantages of static computation while achieving the expressiveness and flexibility inherent in dynamic computing models. Feedback information from run time provides the compiler with details of program dynamics. This additional information enables the compiler to evolve efficient and specialized static execution sequences for commonly occurring dynamic computations. By incorporating the compiler into the lifetime of each program, quasistatic computation allows the compiler to automatically adapt programs to actual usage. This paper introduces a conceptual framework intended to facilitate system-level thought about the role of feedback and program adaptation in the design and optimization of computer systems. We conclude the paper by examining the implications of this computational paradigm for future computer systems, programming languages, and architectures.

Introduction

In present computing systems, we can distinctly characterize many elements of our computations as static or dynamic. Informally, static computations are characterized by predictable control flow, timing, and data access patterns. Static elements of computation can be determined and scheduled at compile time. In contrast, dynamic elements of computation are characterized by frequent, often unpredictable, changes in control flow, variable timing, and random data access patterns. Over the past decade, we have seen continuous advances in computational performance. Notably, however, the rate of performance increase for static computations has been much larger than the rate for dynamic computation. There are two key reasons for the growing performance gap:

  1. Compiler Technology -- Advances in compiler technology have allowed us to view and reorganize code heavily around information known at compile time in order to reduce execution latency. Since static computations exhibit more known and predictable information at compile time, the compiler can apply higher quality and more aggressive optimization to static computations. Many of these optimizations exploit properties and values known at compile time to effectively move computation from run time to compile time.
  2. Architecture Trends -- Our ability to deliver high throughput at reasonable costs has increased more rapidly than our ability to deliver comparable reductions in latency. For instance, we can easily employ parallelism at various levels (e.g. ILP, pipelining, multiprocessing) to increase raw throughput. With predictable control flow and data access, we can better exploit throughput to reduce task execution latency. When dynamic dependencies must be resolved to determine the execution path or data access, resolution latency, rather than available throughput, is the key factor limiting the rate of task execution.
Nonetheless, dynamic computations are essential to our computing systems:

The goal of quasistatic computation is to narrow the gap between static and dynamic computation by capturing the best elements of both. Quasistatics introduces a feedback path between run-time execution and compilation to provide the compiler with information about run-time dynamics. Consequently, the compiler obtains information about program behavior that is not traditionally available during the compilation of dynamic code. This feedback allows the compiler to better predict control flow, timing, and data access patterns.

In micro forms, we have already seen quasistatic approaches used to reduce execution latency (e.g. Trace Scheduling [Fis81], Profile-driven Compilation [Wal91] [Sam91], Synthesis [MP89]). Nonetheless, the techniques to date have been relatively independent and ad hoc. In this paper, we provide a paradigmatic framework for integrating feedback and recompilation into the basic computational model. This framework is intended to facilitate system-level thought about the role of feedback and program adaptation in the design and optimization of computer systems.

Since ``static'' and ``dynamic'' are actually relative terms, the following section examines static and dynamic computation in more depth to better characterize their computational properties and implications. Section formulates the quasistatic computational paradigm and provides examples of quasistatic utility and employment. Mature research in several areas provides many of the basic building blocks which make quasistatic techniques ripe for exploitation. We highlight some of these enabling technologies and recognize the intellectual heritage of quasistatic computation in Section . Before concluding, we take a brief look at the implications which the adoption of quasistatic computing styles will have on future computing systems (Section ).

Static and Dynamic Computation

Static: showing little change; characterized by a lack of movement, animation or progression

Dynamic: marked by a continuous usually productive activity or change

When describing computer systems and artifacts, we generally apply the words ``static'' and ``dynamic'' in concrete terms to distinguish characteristics. The terms are applied to indicate when changes may occur, when decisions are made, or when binding occurs. Usually these terms are used to refer to particular, often orthogonal, aspects of a computer system such as typing, scheduling, or memory management. In this section we highlight common static and dynamic aspects of computer systems and their implications.

Static Computation

Loosely, computer scientists define aspects of a computation which can be resolved and fixed at compile time as static. Hence, we have:

Beyond these examples, where our nomenclature clearly identifies them as static, many other aspects of our programming languages can exhibit static properties:

Modern compiler technologies allow us to exploit all of these static structures to achieve high performance. With full knowledge of operators and subroutines, a compiler can aggressively reorder code for lower latency execution, sometimes breaking down artificial abstraction barriers like procedure and function invocations. Much of the work associated with evaluating a computation can be performed at compile time. The executable produced simply evaluates the data- and control-flow dependent residue of the computation described by the original program.

Almost no practical programming language can be termed purely static, but languages do vary in their static expressiveness. FORTRAN serves as a strong example of a mostly static language. FORTRAN does have dynamic aspects in flow control, but the language tends to encourage structures with large amounts of predictability. Numerical computation in FORTRAN is the canonical example of ``static'' computation. The literature contains considerable evidence that the static aspects of this class of computations can be exploited to achieve very high computational performance.

Dynamic Computation

Similarly, computer scientists define aspects of a computation which are not resolved until employed at run time as dynamic. We have:

Again, beyond our nomenclature, many aspects of our programming languages can exhibit dynamic properties:

In many ways, dynamic support allows programs to be more efficient, expressive, and flexible than they would be otherwise. Nonetheless, bindings and decisions which must be resolved at run time are costly for several reasons:

  1. Interpretive overhead -- Decisions cost execution time to resolve when postponed to run time.
  2. Indirection overhead -- Run-time resolution often requires translation or lookup. This adds overhead time, often requires additional memory bandwidth for resolution, and makes memory access less predictable.
  3. Reduced knowledge and power -- Decisions made at run time must be resolved quickly and can consult only a limited amount of information. Consequently, these decisions are forced to be simple and local. In contrast, off-line decisions made by the compiler can afford to take a more global perspective and perform more detailed analysis before committing to a decision.

LISP and C both exhibit a wealth of dynamic properties. Clever programmers are aware of the dynamic and static aspects of their respective programming languages and the system implementations they employ. Consequently, they will strategically strength reduce the facilities they employ as dictated by the requirements of their application. Unfortunately, the distinction between static and dynamic computational aspects is often subtle. The programmer must have a sophisticated understanding of operator semantics to properly perform this kind of optimization. Further, if a value is going to change at all after compilation, the programmer is forced to adopt a dynamic solution regardless of the actual frequency of the change.

Symbolic processing and system-level management are canonical examples of ``dynamic'' computations. These are characterized by heavy pointer manipulation, frequent and statically unpredictable jumps in control flow, statically unpredictable operation latency, and regular rebinding of data and operators. The literature presents evidence that this class of programs is much harder to optimize than FORTRAN numerical code.

Quasistatic Computation

Outside of computer science, the terms ``static'' and ``dynamic'' serve as relative qualifiers rather than absolutes. ``Static'' does not mean fixed and unchanging; rather, it connotes infrequent change. Similarly, we can adopt a compilation and computation model which does not make such a rigid and absolute resolution-time distinction. We use the term ``quasistatic'' to describe this computational model, which allows dynamic changes to occur but also judiciously employs recompilation to update the executable instruction stream to reflect information acquired at run time. Here, the executable may remain unchanged for relatively long periods of time, for example a program run, but is allowed to change to adapt to common usage conditions.

An important observation is that entities whose values and properties are allowed to change under dynamic computational models do not necessarily change as frequently as they are used. For example, an operator binding does not normally change with every use of the operator. Further, the sequence of operator bindings employed by a program may remain mostly constant from run to run. Similarly, a variable's value may remain mostly constant over a program run and may even be constant or highly predictable over a sequence of program runs. If we can discover these quasistatic values, bindings, or other characteristics of a program, we can feed them back to the compiler. The compiler can use its knowledge of the quasistatic values to produce code optimized for the stable and predictable epochs of these values. While optimizing around fed-back quasistatic data, the compiler may perform non-local code reorganization and take advantage of other techniques normally restricted to the compilation of mostly static code. Of course, the compiler must also ensure that code so compiled can properly handle the changes that invariably occur, signaling a change of quasistatic epochs when they do.

We can view this invariant factorization as a complex form of code motion or hoisting. By empirically recognizing epochs and properties which are invariant within each epoch, we can factor computations involving the invariant out of the epoch, leaving only the residue computation which results from the invariant evaluation. Factoring and residue generation are exactly the functions performed by partial evaluation [JGS93].
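
As a minimal illustration of this factoring (the routine and the weight values here are invented, not drawn from any particular program), suppose run-time feedback shows that the weight vector passed to a small inner-product routine has been identical in every recent call. Partial evaluation can fold the invariant weights into a specialized residue:

 /* General routine: weighted sum of n inputs. */
 int weighted_sum(const int *x, const int *w, int n)
 {
     int i, sum = 0;
     for (i = 0; i < n; i++)
         sum += w[i] * x[i];
     return sum;
 }

 /* Residue after factoring out the empirically invariant weights
  * {2, 0, 5} and length 3: multiplications by known constants fold
  * away, and the zero-weight term disappears entirely.            */
 int weighted_sum_2_0_5(const int *x)
 {
     return 2*x[0] + 5*x[2];
 }

The general routine remains available for calls in which the weights turn out to differ.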

With quasistatics, we use run-time feedback to identify invariants which empirically exist over portions of code for relatively long periods of execution time even when there is no static evidence or guarantee that such invariants exist. There are two key reasons why such invariants may arise:

  1. A value or property may have the potential to be dynamic, but that option is not being exercised or is being exercised only infrequently. For example, a variable in a program may be taken from a configuration file read when the program is invoked. The user may have the option to change the configuration, but may change it very infrequently. By discovering that the value is constant across many computations, the compiler can generate an executable for the user where this value is treated as a constant. The constant can be propagated using classical propagation techniques to generate a more efficient, specialized executable for the user. If the user does change the variable, the compiled code recognizes that its invariant has changed and invokes the more general execution sequence instead of the specialized one. If this new value remains invariant, the compiler can then generate a version specialized around the new quasistatic value. (A sketch of this guard structure appears after this list.)
  2. A value or property may, in fact, be invariant. Discovery of the invariant may be beyond the state of the art in static analysis or may require exponential computation. Without resorting to overly expensive static analysis, empirical, run-time feedback can be used to identify promising invariants.
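
The following sketch illustrates the guard structure described in the first item above; the configuration variable and the observed value of 4096 are assumptions made purely for illustration:

 extern int config_blocksize;   /* read from the configuration file at startup */

 long count_blocks(long nbytes)
 {
     if (config_blocksize == 4096) {
         /* specialized path: the observed value has been propagated,
          * so the division strength-reduces to a shift               */
         return nbytes >> 12;
     } else {
         /* general path, retained for correctness; taking it also
          * signals that a new quasistatic epoch may have begun     */
         return nbytes / config_blocksize;
     }
 }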

Dynamic, Run-time Feedback

Dynamic, run-time feedback is the key to quasistatic computation. By monitoring run-time behavior we can acquire information about quantities unknown at compile time. For these unknown quantities, we can use feedback to obtain:

It is important to remember that information obtained empirically at run time is only predictive. We can never use this information to conclude that we will see only certain values. Nonetheless, we can determine the normal cases and specialize the executable code accordingly.

When we take advantage of run-time feedback, we make the explicit assumption that the future resembles the past. This is often a good assumption to make, and the variance over time provides a good indication of where it is most fruitful to apply this assumption. To see why the assumption is useful, we can review some of the components which contribute to each execution signature captured by run-time feedback:

We can optimize around characteristics generated by (1) and (2) and reasonably expect that the future will resemble the past. By determining variance and correlation, we can determine the predictable effects of (3) and optimize around them. We may also be able to determine strategic correlations (e.g. at 1 AM, we can expect to see higher throughput and lower latency out of our LAN than at 1 PM). To the extent that a user runs similar datasets, we can expect the future to display similar characteristics due to (4) and (5). Again, variance over time helps the compiler to separate gross characteristics which recur from characteristics which are unique to a few datasets.

A Simple Example

Consider the following code snippet from a hypothetical placement program written in C:


 manhattan_cost(x1,y1,x2,y2,placed)
   int x1,y1,x2,y2;
   OBJECT_ARRAY placed;
  {return(abs(x1-x2)+abs(y1-y2));
   }
 density_cost(x1,y1,x2,y2,placed)
   int x1,y1,x2,y2;
   OBJECT_ARRAY placed;
  {return(horizontal_channel_density_approximation(x1,x2,placed)+
             vertical_channel_density_approximation(y1,y2,placed));
   }
              . . .
 place_objects()
  {    
              . . .
   for (obj=all_objects;obj!=NULL;obj=obj->next)
     {
              . . .
      pick_new_location(&xnew,&ynew,obj,placed);
      new_cost = 0;
       for (links=obj->connections;links!=NULL;links=links->next)
          new_cost += (*cost_function)(xnew,ynew,links->x,links->y,placed);
              . . .
      }
              . . .
  }
              . . .
  

place_objects is reused during different phases of placement and optimization to place the objects and recompute the value of each placement. During early phases of optimization, the cheap cost function (manhattan_cost) is used. In later phases, a more expensive and accurate cost function (density_cost) is used instead. At a high level, the optimization manager decides when to change the cost metric used for each phase of optimization.

Unfortunately, since the cost function is accessed via indirection through a pointer whose value is not constant, the cost function computation cannot be inlined and optimized at compile time. With run-time feedback, the compiler can discover the range of values taken on by cost_function. During a recompilation pass, the compiler can then inline the computation. For example, after determining that cost_function refers to manhattan_cost 80% of the time it is resolved, the compiler might reorganize the code as follows:


 place_objects()
  {    
           . . .
  if (cost_function == &manhattan_cost)
     {for (obj=all_objects;obj!=NULL;obj=obj->next)
       {
           . . .
          pick_new_location(&xnew,&ynew,obj,placed);
          new_cost = 0;
           for (links=obj->connections;links!=NULL;links=links->next)
              new_cost +=(abs(xnew-links->x)+abs(ynew-links->y));
           . . .
       }
     }
   else
     { [original code with indirection]
     }
           . . .
  }
           . . .
  

Here, we avoid a pointer indirection and anonymous procedure call overhead in a frequently executed portion of code.

If the compiler cannot guarantee that the value of the cost_function pointer is never modified inside the if block, it may mark the page containing cost_function as read-only. In this manner, the code can identify any unanticipated change in the value. When an unexpected change does occur, patch-up code can be executed to assure proper program semantics.
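
One possible realization of this page guard is sketched below. It is POSIX-specific and deliberately simplified: a production version would use sigaction to examine the faulting address, cope with other data sharing the page, and respect async-signal-safety rules.

 #include <signal.h>
 #include <stdint.h>
 #include <sys/mman.h>
 #include <unistd.h>

 typedef int (*cost_fn)();
 extern cost_fn cost_function;                  /* the guarded pointer */
 static volatile sig_atomic_t specialization_valid = 1;

 static void guard_handler(int sig)
 {
     long pagesize = sysconf(_SC_PAGESIZE);
     uintptr_t page = (uintptr_t)&cost_function & ~(uintptr_t)(pagesize - 1);
     (void)sig;
     /* restore write access so the trapped store can complete, and
      * note that the specialized code can no longer be trusted      */
     mprotect((void *)page, pagesize, PROT_READ | PROT_WRITE);
     specialization_valid = 0;
 }

 void install_guard(void)
 {
     long pagesize = sysconf(_SC_PAGESIZE);
     uintptr_t page = (uintptr_t)&cost_function & ~(uintptr_t)(pagesize - 1);
     signal(SIGSEGV, guard_handler);
     mprotect((void *)page, pagesize, PROT_READ);  /* writes now trap */
 }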

Range of Application

Feedback can be beneficial for almost any quantity which is unknown at initial compile time. For example:

Advantages

Quasistatic computation is aimed at reducing run-time execution latency while allowing dynamic expressiveness and flexibility. In this section we examine the general advantages of quasistatic computing from several perspectives.

Eliminate Unused Functionality

One key optimization is aggressive operator and data structure strength reduction. When values and functions are statically known, the compiler can often determine the exact functionality required by an operator or data structure and strength reduce it to cheaper operations which have the same semantic effect. Using feedback data in a quasistatic computing model, we gain the same advantages by noting the features actually employed. As always, we need to provide an escape mechanism which allows the computation to be patched up if the additional functionality is unexpectedly required. In many ways, this strength reduction can take the burden of optimization and function selection off the programmer since unused functionality contained in the operators will be discovered and specialized away. For instance, a LISP list which is created with uniform data elements and is not arbitrarily ``grown'' can be replaced with an array. In many cases, this will make traversal and access much cheaper since the array requires less indirection to access any given element. This class of transforms can achieve a significant reduction in the run-time overhead associated with more powerful data structures by moving the overhead associated with data-structure traversal from run time to compile time.
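
An illustrative sketch of that list-to-array reduction (invented code, not from any particular program): the general representation pays a tag and a pointer dereference per element, while the specialized form, valid only as long as the elements stay uniform and the structure is never grown, walks a flat array:

 #include <stddef.h>

 /* general form: heterogeneous list of tagged cells,
  * one indirection (plus a tag) per element           */
 struct cell { int tag; void *datum; struct cell *next; };

 long sum_list(const struct cell *l)
 {
     long s = 0;
     for (; l != NULL; l = l->next)
         s += *(const int *)l->datum;
     return s;
 }

 /* specialized form: contiguous ints, no tags, no pointer chasing */
 long sum_array(const int *a, size_t n)
 {
     long s = 0;
     size_t i;
     for (i = 0; i < n; i++)
         s += a[i];
     return s;
 }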

Common-Case Optimization

Automated, common-case optimization will be another key benefit of quasistatic compilation. Feedback from run time tells the compiler which paths are most commonly executed and which data values occur most commonly. The compiler can reorganize and restructure executable code to reduce the execution latency in the most common cases.

Execution Latency Compression

We can also view quasistatic computation from an information-theoretic perspective. If a quantity is known statically, or is highly predictable, the run-time value of the quantity conveys little or no information. The known value can be folded into the computation at little run-time cost on average. Values whose probability distributions are highly non-uniform are amenable to compression using non-uniform coding, or in this case execution, sequences. Only quantities which are completely unpredictable cannot be compressed by recoding and hence must be fully evaluated each time they are encountered at run time. Quasistatic computation, including adaptive recompilation, is then directly analogous to adaptive data compression.
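
As a purely illustrative calculation (the cycle counts below are assumptions, not measurements), suppose the original indirect call sequence in the placement example costs 40 cycles, the inlined manhattan computation costs 10 cycles, and the guard test costs 2 cycles. With the 80/20 split observed above, the expected per-call latency of the specialized code is 0.8 x (2 + 10) + 0.2 x (2 + 40) = 18 cycles, versus 40 cycles before specialization; the more skewed the distribution, the closer execution comes to the 12-cycle fast path.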

Embodiments

Quasistatic computation is only a net benefit to the extent that it reduces execution latency. Certainly feedback analysis and recompilation require additional computation. Fortunately, here we can readily exploit throughput to perform the additional computation.

At a coarse granularity, we can exploit otherwise ``idle'' machine cycles between program runs to perform analysis and recompilation. When the user is away pondering the results of a previous run, or at home for the night, his workstation can analyze feedback and regenerate executables. In the most basic incarnation, the compiler can tune execution and replace whole executable binaries between program invocations. This approach is probably the easiest to introduce into existing computer systems. However, limited collection bandwidth and storage space may necessitate very frugal collection of feedback data. Additionally, this scheme does not allow programs to adapt to usage during a single execution.

On parallel or multithreaded systems, we might implement one or more compilation servers. When application tasks do not consume all the resources on the machine, compiler threads can analyze feedback data and generate new execution sequences. Compilation threads will be coarse-grained, so they can easily and efficiently exploit the throughput available on parallel computers. ``Idle'' workstations on a LAN can also serve efficiently in this capacity. Again, care must be taken to limit the quantity of feedback data. Dynamic adjustment can be used to match the data to the available collection bandwidth and analysis cycles.

Both of the previous suggestions borrow only spare throughput to perform recompilation. However, strategic analysis and recompilation interspersed with program execution can also result in net reductions in application latency. Interspersed analysis requires much more attention to analysis and recompilation efficiency. This approach allows executable adaptation during long program runs.

In addition to these cases where recompilation is initiated by an external agent in response to feedback, the executable can explicitly request compilation. In cases where earlier feedback suggests that a value is constant past a point in the program, but the value of the constant is unpredictable, it may be beneficial for the program to explicitly request recompilation at that point during program execution. Several researchers have demonstrated the advantages of this kind of run-time code synthesis in various scenarios. Massalin and Pu employ run-time compilation in the Synthesis kernel [MP89] to generate specialized kernel operations. Keppel, Eggers, and Henry demonstrate that the explicit use of run-time code generation can often produce better executable code than is possible with purely static compilation [KEH93].
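
The sketch below shows what such an explicit request might look like from the program's side; qs_specialize_convolve() is an invented entry point standing in for whatever interface a quasistatic runtime actually provides:

 typedef void (*kernel_fn)(double *out, const double *in, int n);

 /* general routine: filter width supplied at run time */
 extern void convolve(double *out, const double *in, int n, int width);

 /* assumed quasistatic runtime entry point (invented name): returns
  * code for convolve() specialized to the given width, or NULL      */
 extern kernel_fn qs_specialize_convolve(int width);

 void run_filter(double *out, const double *in, int n, int width)
 {
     static kernel_fn fast = NULL;
     static int fast_width = -1;

     if (width != fast_width) {          /* new epoch: request new code */
         fast = qs_specialize_convolve(width);
         fast_width = width;
     }
     if (fast != NULL)
         fast(out, in, n);               /* specialized residue */
     else
         convolve(out, in, n, width);    /* fall back to general code */
 }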

Foundational and Enabling Work

Quasistatics continues the intellectual legacy of common-case optimization, trace scheduling, and profile-driven compilation. Manual optimization for the normal case is a well-known system technique [Lam84] for achieving good average performance. As a system design technique, common-case optimization relies on the designer to collect data, analyze the feedback, and optimize accordingly. Trace Scheduling [Fis81] [FERN94] and profile-driven compilation (e.g. [Wal91] [Sam91]) take this idea one step further and automate code reorganization based on profile traces from sample program runs. Quasistatics takes the next step beyond these techniques by integrating optimization based on observed common cases into the lifetime of the program.

Run-time code generation research also plays an inspirational and foundational role in quasistatic computation. The Synthesis kernel effectively demonstrates the power of invariant factorization to produce highly efficient, specialized functions at run time [MP89]. Value-specific optimization demonstrates more general and portable application of run-time code generation [KEH93]. Deferred compilation employs static, staging analysis on functional programming languages to automatically determine opportunities for invariant factorization [LL93]. Quasistatic computation completes the picture by allowing the compiler to automatically recognize opportunities for run-time code generation which are not easily identified by static analysis.

Maturing research in partial evaluation and specialization (e.g. [Ber90] [JGS93]) provides us with the tools to take advantage of known data values to produce streamlined executables.

Quasistatic optimization would be of limited utility without efficient techniques for lightweight profiling and tracing of execution flow and data values. Many recent developments in this area play a key role in enabling quasistatic computation. Appel introduces simple, compiler-inserted profiling annotations which support lightweight profile collection [ADM88]. Larus demonstrates the efficiency of modern techniques for compiler-based program tracing [Lar93]. Wahbe presents efficient schemes for data value monitoring during program execution [WLG93]. His data tracing techniques allow programs to identify when values change and can be much more lightweight than trap-based techniques such as the read-only, virtual-memory page scheme suggested in Section .
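
As an illustration of how inexpensive such annotations can be, the sketch below shows the sort of compiler-inserted counter that could collect the cost_function target histogram used in the earlier placement example; all of the names here are invented:

 #include <stdio.h>

 #define QS_MAX_TARGETS 4

 struct qs_callsite {
     const char   *name;                     /* source location label     */
     void         *target[QS_MAX_TARGETS];   /* distinct targets seen     */
     unsigned long count[QS_MAX_TARGETS];    /* per-target frequencies    */
     unsigned long other;                    /* targets beyond the table  */
 };

 static struct qs_callsite cs_cost = { "place_objects:cost_function" };

 /* inserted by the compiler immediately before the indirect call */
 void qs_note_target(struct qs_callsite *cs, void *t)
 {
     int i;
     for (i = 0; i < QS_MAX_TARGETS; i++) {
         if (cs->target[i] == t)    { cs->count[i]++; return; }
         if (cs->target[i] == NULL) { cs->target[i] = t; cs->count[i] = 1; return; }
     }
     cs->other++;
 }

 /* written out at program exit and read by the recompiler between runs */
 void qs_dump(FILE *f)
 {
     int i;
     for (i = 0; i < QS_MAX_TARGETS && cs_cost.target[i] != NULL; i++)
         fprintf(f, "%s %p %lu\n", cs_cost.name,
                 cs_cost.target[i], cs_cost.count[i]);
 }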

In quasistatic computation, the compiler may be called upon continuously to incrementally recompile portions of the program. Several compilers for object-oriented languages, including the SELF compiler [CUL91], employ dynamic compilation to generate executable code from a simple intermediate form during execution. Notably, since the SELF compiler is employed dynamically, it can often exploit knowledge of data values which are not known until run time. Researchers at Sun Microsystems have developed techniques for modifying more conventional compilers for efficient incremental recompilation [FST91].

Implications

As a discipline, we recognize the importance of analyzing dynamic behavior and optimizing our systems accordingly. While feedback is used in spot applications, we have not, traditionally, included it as a fundamental part of our computational model. As noted herein, there are many good reasons to integrate automatic feedback generation, consumption, and feedback-oriented adaptation into our baseline computing model.

Once we accept this new model, many optimizations which the clever programmer could make can now be automated and performed by a clever compiler. This has the obvious benefit of taking some of the burden for extracting performance off the application and system programmers. More fundamentally, though, the model protects our programs against inevitable changes in technology, changes in external systems, and shifts in usage patterns.

One key shift which quasistatic computation encourages is a general move to more explicit management of resources (e.g. caches, virtual memory paging, memory management schemes). Rather than relying on oblivious mechanisms which simply react to events at run time, we can predict likely events and usage patterns. We can often avoid the overhead associated with belated discovery of an exceptional case (e.g. a data item not in the cache, a TLB miss) by explicitly predicting that it will occur and providing a hint in the instruction stream to explicitly handle or prepare for the event.
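
For example, a loop whose feedback shows a regular, predictable access stride can ask for data well before it is needed rather than waiting to discover each miss. The sketch uses the GCC/Clang __builtin_prefetch intrinsic as a stand-in for whatever hint mechanism a given architecture exposes; the prefetch distance of 64 elements is an assumption, not a tuned value:

 /* Scale one array into another, hinting the memory system about
  * upcoming accesses instead of reacting to each cache miss.      */
 void scale(double *a, const double *b, long n, double k)
 {
     long i;
     for (i = 0; i < n; i++) {
         __builtin_prefetch(&b[i + 64], 0, 0);  /* read, low temporal locality */
         a[i] = k * b[i];
     }
 }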

Future Computer Systems

Our systems should be designed to:

Future Architectures

To some extent, quasistatic computation allows us to better exploit the characteristics we already know how to architect well. In particular, quasistatic computation sets out to increase predictability, allowing us to better exploit throughput to increase performance. This shows up partially as larger statically scheduled code blocks. It also shows up in more accurate estimates of timing and data requirements, allowing data access latency to be more completely overlapped with useful computation.

New architectures should explicitly consider the system support suggested in the previous section:

With the advent of reconfigurable logic, specialization is not limited to the executable instruction stream. Architecturally, we can now design general-purpose computing systems which can be configured, on the fly, to provide application-specific acceleration. While the field of custom computing using reconfigurable logic is relatively young, researchers using modest amounts of reconfigurable logic have already demonstrated orders of magnitude higher performance than general-purpose microprocessors or workstations on numerous applications [BP93]. One of the key advantages of reconfigurable logic is that the architecture can be specialized to match the computational requirements of the application. Quasistatic computation can employ feedback to extract the critical computational requirements for an application. Further, the quasistatic compiler gathers information on the properties of the data normally seen during the computation. Combining this information, the compiler can synthesize minimum cost computing structures tailored to common-case execution requirements.

Future Programming Languages

One key benefit of quasistatic computation is that we do not have to forgo dynamic expressibility in order to achieve high performance. We have seen that dynamic features do not inherently limit performance. Rather, the unpredictability of dynamic quantities ultimately limits performance. By adopting a quasistatic computational model, we recognize that all dynamics are not equal, and we only pay a run-time cost for dynamic features when they are used in unpredictable ways.

Languages and programming environments should be designed with the feedback model in mind. Libraries should be able to provide multiple implementation options for routines or data structures. Run-time feedback will then allow the compiler to select the appropriate version for each application (e.g. [Low78] [Sam91]). As noted in Section , some versions of a routine or data structure may be applicable only in a limited context; that is, a version may not provide all the functionality or be valid over the whole range of values supported by other versions. As long as the programmer communicates the range of applicability of each version to the compiler, the compiler can select appropriate versions based on actual requirements seen during execution.
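
A sketch of the kind of interface such a library might expose appears below; the version table, usage summary, and membership routines are invented purely for illustration:

 /* Three implementations of a set-membership test, each valid only
  * under the usage conditions its predicate declares.  A feedback-
  * directed compiler can pick the cheapest version whose predicate
  * holds for the usage actually observed.                          */
 typedef struct set set_t;

 extern int member_linear(set_t *s, int x);   /* always valid               */
 extern int member_sorted(set_t *s, int x);   /* needs ordered insertion    */
 extern int member_bitvec(set_t *s, int x);   /* needs small element values */

 struct qs_usage { int max_value; int inserts_in_order; };  /* from feedback */

 struct qs_version {
     int (*fn)(set_t *, int);
     int (*applicable)(const struct qs_usage *);  /* programmer-declared range */
 };

 static int always_ok(const struct qs_usage *u)  { (void)u; return 1; }
 static int small_ints(const struct qs_usage *u) { return u->max_value < 1024; }
 static int ordered(const struct qs_usage *u)    { return u->inserts_in_order; }

 static const struct qs_version member_versions[] = {
     { member_bitvec, small_ints },   /* cheapest, narrowest applicability */
     { member_sorted, ordered },
     { member_linear, always_ok },    /* general fallback, listed last     */
 };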

Conclusion

We have articulated a paradigm for computing which differs substantially from traditional static and dynamic computational models. Quasistatic computation allows us to have dynamic expressibility and flexibility while obtaining performance comparable to purely static code. Feedback and adaptive recompilation throughout program lifetime provide the key elements which make this kind of optimization possible. In effect, quasistatic computation embodies classical common-case optimizations in an automated manner. Quasistatic computation can also be viewed as a technique which exploits computational regularities to adaptively compress program run time. Through continuous adaptation and recompilation, quasistatic computation allows programs to deliver high performance even across changes in external assumptions, technology, and usage patterns. Many technologies which enable quasistatic computation are beginning to mature, making this style of computing ripe for exploitation. Future computing systems should support feedback, hints, and speculation to take full advantage of the opportunities afforded by quasistatic computing.

References

ADM88
Andrew Appel, Bruce Duba, and David MacQueen. Profiling in the Presence of Optimization and Garbage Collection. CS-TR 197-88, Princeton University, November 1988.

Ber90
Andrew A. Berlin. Partial Evaluation Applied to Numerical Computation. In ACM Conference on LISP and Functional Programming. ACM, 1990.

BP93
Duncan A. Buell and Kenneth L. Pocek, editors. Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines, Los Alamitos, California, April 1993. IEEE Computer Society, IEEE Computer Society Press.

BZ93
David Barrett and Benjamin Zorn. Using Lifetime Predictors to Improve Memory Allocation Performance. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, pages 187-196. SIGPLAN, ACM, June 1993.

CKP91
David Callahan, Ken Kennedy, and Allan Porterfield. Software Prefetching. In Proceedings of the Fourth International Conference on the Architectural Support for Programming Languages and Operating Systems, pages 40-52. ACM, April 1991.

CUL91
Craig Chambers, David Ungar, and Elgin Lee. An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes. LISP and Symbolic Computation, 4(3), 1991.

DeH93
Andre DeHon. Smart Compiler Advantages. Transit Note 99, MIT Artificial Intelligence Laboratory, December 1993.

DEK93
Andre DeHon, Ian Eslick, and Thomas F. Knight Jr. Toward 21st Century Computing: Transit Perspective. Transit Note 93, MIT Artificial Intelligence Laboratory, September 1993.

DEMK93
Andre DeHon, Ian Eslick, John Mallery, and Thomas F. Knight Jr. Prospects for a Smart Compiler. Transit Note 87, MIT Artificial Intelligence Laboratory, June 1993.

FERN94
Joseph A. Fisher, John R. Ellis, John C. Ruttenberg, and Alexandru Nicolau. Parallel Processing: A Smart Compiler and a Dumb Machine. ACM SIGPLAN Notices, 19(6):37-47, June 1984.

Fis81
Joseph Fisher. Trace Scheduling: A Technique for Global Microcode Compaction. IEEE Transactions on Computers, 30(7):478-490, July 1981.

FST91
Alastair Fyfe, Ivan Soleimanipour, and Vijay Tatkar. Compiling from Saved State: Fast Incremental Compilation with Traditional UNIX Compilers. In Proceedings of the Winter 1991 USENIX Conference, pages 161-170. USENIX Association, 1991.

GZ92
Dirk Grunwald and Benjamin Zorn. CustoMalloc: Efficient Synthesized Memory Allocators. CU-CS 602-92, Department of Computer Science, University of Colorado, Boulder, Colorado, July 1992.

JGS93
Neil D. Jones, Carsten K. Gomard, and Peter Sestoft. Partial Evaluation and Automatic Program Generation. Prentice Hall International Series in Computer Science. Prentice Hall, 1993.

KEH93
David Keppel, Susan Eggers, and Robert Henry. Evaluating Runtime-Compiled Value-Specific Optimizations. Technical Report 93-11-02, Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195, November 1993.

Lam84
Butler W. Lampson. Hints for Computer System Design. IEEE Software, 1(1):11-28, January 1984.

Lar93
James R. Larus. Efficient Program Tracing. Computer, 26(5):52-60, May 1993.

LL93
Mark Leone and Peter Lee. Deferred Compilation: The Automation of Run-Time Code Generation. CMU-CS 93-225, Carnegie Mellon University, December 1993.

Low78
James Low. Automatic Data Structure Selection: An Example and Overview. Communications of the ACM, 21(5):376-385, May 1978.

MP89
Henry Massalin and Calton Pu. Threads and Input/Output in the Synthesis Kernel. In Proceedings of the 12th ACM Symposium on Operating Systems Principles, pages 191-201. ACM, December 1989.

RL92
Anne Rogers and Kai Li. Software Support for Speculative Loads. In Proceedings of the Fifth International Conference on the Architectural Support for Programming Languages and Operating Systems, pages 38-50. ACM, October 1992.

Sam91
Alan Dain Samples. Profile-driven Compilation. PhD thesis, U.C. Berkeley, April 1991. U.C. Berkeley CSD-91-627.

SLH90
Michael Smith, Monica Lam, and Mark Horowitz. Boosting Beyond Static Scheduling in a Superscalar Processor. In Proceedings of the 17th International Symposium on Computer Architecture, pages 344-354. ACM, May 1990.

Wal91
David Wall. Predicting Program Behavior Using Real or Estimated Profiles. In Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation, pages 59-70. SIGPLAN, ACM, 1991.

WLG93
Robert Wahbe, Steven Lucco, and Susan Graham. Practical Data Breakpoints: Design and Implementation. In Proceedings of the ACM SIGPLAN 1993 Conference on Programming Language Design and Implementation. SIGPLAN, ACM, 1993. SIGPLAN Notices, 28(6).
