Stream-Based Trace Compression
Assistant Professor of Electrical and Computer Engineering
The University of Alabama in Huntsville
301 Sparkman Drive, Huntsville, AL 35899
Trace-driven simulations have been widely used in quantitative evaluations
of new techniques in computer architecture. To offer a faithful representation
of a specific workload, traces are very large, encompassing billions of
memory references and/or instructions. To efficiently store and use even
a small collection of traces, its size must be reduced beyond traditional
Stream-based compression (SBC) is a new method for single-pass compression of address traces and various extended trace formats. The SBC algorithm relies on extracting instruction streams. An instruction stream is defined as a sequential run of instructions, from the target of a taken branch to the first taken branch in sequence. A stream table is created during compression, encompassing all relevant information about streams, such as starting address, stream length, instruction words and their types. All instructions from a stream are replaced by its index in the stream table, creating a trace of instruction streams.
Unlike instruction addresses, data address for a memory referencing instruction stays rarely constant during program execution, but it a can have a regular stride. The SBC trace compression features a very simple and yet efficient on-line algorithm for compression of data address references. The compressed data address trace encompasses the data address stride and the number or repetitions for each memory-referencing instruction in a stream. A change in a data address stride results in another compressed record. Compressed records are ordered by the corresponding stream appearances in the trace.
The SBC algorithm exploits several inherent characteristics of program
execution traces. Instruction traces consist of a fairly limited number
of different instruction streams, and the most of memory references exhibit
strong spatial and/or temporal locality, for example, a load having a constant
address stride across loop iterations.
Figure 1 illustrates the compression flow assuming Dinero+ input. A Dinero+ trace record has fixed length fields: the header field (0 – data read, 1 – data write, and 2 – instruction read), the address field, and the instruction word field for the instruction read type. The stream-based compression results in three files: Stream Table File (STF), Stream-Based Instruction Trace (SBIT), and Stream-Based Data Trace (SBDT). Figure 2 demonstrates the compression process using a short excerpt of a trace. Figure 3 shows the SBC format for data address references.
Header file: mytrace.h
Dinero+ to SBC conversion: din_plus2sbc.c
SBC to Dinero+ conversion: sbc2din_plus.c
SBC to Dinero conversion: sbc2din.c
A. Milenkovic, M. Milenkovic, "Exploiting Streams in Instruction and
Data Address Trace Compression,"
Proceedings of the IEEE 6th Annual Workshop on Workload Characterization, Austin, TX, USA, October 27, 2003, pp. 99-107.
[PDF] [ps.gz] [PPT slides]
free hit counter
Last update: March 4, 2004