Speeding up mobile code execution on
resource-constrained embedded processors.

Embedded platforms are increasingly connected to the Web and execute mobile code.
These platforms constitute a resource-constrained environment in which interpreted execution
of mobile code is the norm, and highly optimized or dynamic compilation systems are not
a suitable choice, primarily because of their high memory requirements.
At the same time, the performance of the executed code is of critical importance and
is often the limiting factor in both the capabilities of the system and the user's perception of it.

The goals of this research project were to significantly improve interpreter performance
for mobile code on embedded platforms without increasing the interpreter's resource requirements,
and to design a resource-constrained, basic-block dynamic compilation system that can be used
with an interpreter for adaptive optimization at "low cost".

The framework proposed to achieve both of these goals is based on "superoperators" and
code "annotations". The former are groups of instructions that can be executed as a unit and
optimized together. The latter are a mechanism for passing information from the compiler that
produces the mobile code to the interpreter running on a client system. The proposed approach shifts
as much of the work of identifying, compiling, and optimizing superoperators as possible to
the compiler, and thus both simplifies and speeds up interpreted execution. This is possible
because the annotations carry the additional information (otherwise unavailable in the
mobile code) from the compiler to the interpreter. When used with adaptive dynamic optimization,
annotations can also reduce delays and allow small applets to be optimized with little or no
overhead; currently, such optimization requires dynamic profiling and incurs
the associated overhead.
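
As a concrete illustration, the sketch below shows one plausible in-memory form such an
annotation could take once the client VM has parsed it: a per-method table, produced offline
by the compiler, marking which bytecode regions form superoperators. The structure and field
names are hypothetical and are not the actual encoding used by the system.

    /* Hypothetical in-memory layout of a superoperator annotation after the
     * client VM has parsed it from the class file.  Field names and widths
     * are illustrative only. */
    #include <stdint.h>

    typedef struct {
        uint16_t start_pc;   /* bytecode offset where the annotated region begins */
        uint16_t length;     /* number of original bytecodes the region covers    */
        uint16_t so_index;   /* which precompiled superoperator handler to use    */
    } so_entry;

    typedef struct {
        uint16_t count;      /* number of annotated regions in this method */
        so_entry *entries;   /* table produced offline by the compiler     */
    } so_annotation_table;

Because such a table would be computed offline, the interpreter only needs to look up
start_pc values rather than rediscover profitable bytecode sequences at run time.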

Superoperators provide two main advantages for optimizing interpreter performance: they eliminate
the per-bytecode dispatch overhead for the bytecodes that make up the superoperator, and they allow
stack-based communication between those bytecodes to be converted into a more efficient,
register-based form. The registers can be utilized without dynamic compilation if the superoperators
are created statically; alternatively, they can be introduced by a simplified dynamic compilation
module that is invoked when specified by an annotation. Both effects are illustrated in the sketch below.
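
The following minimal sketch of a switch-dispatched interpreter shows both effects with one
hypothetical fused case. The opcode names, the SO_IADD_IMUL superoperator, and the overall
structure are assumptions for illustration, not the actual implementation.

    /* Minimal sketch of a switch-dispatched bytecode interpreter with one
     * hypothetical superoperator case.  The ordinary path pays one dispatch
     * per bytecode and communicates through the operand stack; the fused
     * SO_IADD_IMUL case does the same work with a single dispatch and keeps
     * the intermediate sum in a C local, which the compiler can hold in a
     * register instead of spilling it to the operand stack. */
    #include <stdint.h>

    enum { OP_ILOAD_0, OP_ILOAD_1, OP_IADD, OP_IMUL, OP_RETURN, SO_IADD_IMUL };

    void interpret(const uint8_t *code, int32_t *locals, int32_t *stack)
    {
        int sp = 0, pc = 0;
        for (;;) {
            switch (code[pc++]) {
            case OP_ILOAD_0: stack[sp++] = locals[0]; break;
            case OP_ILOAD_1: stack[sp++] = locals[1]; break;
            case OP_IADD:    sp--; stack[sp - 1] += stack[sp]; break;
            case OP_IMUL:    sp--; stack[sp - 1] *= stack[sp]; break;
            case SO_IADD_IMUL: {
                /* fused (iload_0, iload_1, iadd, imul): one dispatch, and the
                 * intermediate sum never touches the operand stack; like the
                 * original sequence, it multiplies the value already on top
                 * of the stack by the sum */
                int32_t sum = locals[0] + locals[1];
                stack[sp - 1] *= sum;
                break;
            }
            case OP_RETURN:
            default:
                return;
            }
        }
    }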

The resulting combination of some of the main benefits of JIT compilation, superoperators (SOs),
and profile-guided optimization yields a lightweight Java bytecode compilation system for
resource-constrained environments that achieves runtime performance similar to that of
state-of-the-art JIT/Adaptive Optimization (JIT/AO) systems, while having a minimal impact on
runtime memory consumption.

For experimental evaluation, we developed three Virtual Machines (VMs). One implements
our proposed techniques; its runtime performance is compared first to a baseline interpreted VM
and then to a VM that employs state-of-the-art JIT/AO.
Relative to the baseline VM, our system attains speedups ranging from a factor of 1.52 to a
factor of 3.07. Compared to the state-of-the-art JIT/AO VM,
our system achieves performance within a factor of two: it is faster on three of the benchmarks
and slower by less than a factor of two on three others. At the same time, our SO-extended VM
outperforms the JIT/AO system by a factor of 16, on average, in runtime memory consumption.