Improving Speculative Thread-Level Parallelism Through Module Run-Length Prediction
Authors: Fredrik Warg and Per Stenström
Abstract:

Exploiting speculative thread-level parallelism across modules, e.g., methods, procedures, or functions, has shown promise. However, misspeculations and task creation overhead are known to adversely impact the speedup if too many small modules are executed speculatively. Our approach to reducing the impact of these overheads is to disable speculation on modules with a run-length below a certain threshold.

We first confirm that if speculation is disabled on modules with an execution time, or run-length, shorter than a threshold comparable to the overheads, we obtain nearly as high a speedup as if the overhead were zero. For example, with an overhead of 200 cycles and a run-length threshold of 500 cycles, six of the nine applications we ran achieved nearly the same speedup as with zero overhead. We then propose a mechanism by which the run-length can be predicted at run-time based on previous invocations of the module. This simple predictor achieves an accuracy between 83% and 99%. Finally, our run-length predictor is shown to improve the efficiency of module-level speculation by preventing modules with short run-lengths from occupying precious processing resources.
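
The abstract does not spell out how the predictor works, so the following is only a minimal sketch of one plausible policy: a last-value predictor that remembers the run-length of each module's most recent invocation and disables speculation when that value falls below the threshold. The class name, the method names, the per-module table, and the choice to speculate on modules with no history are all assumptions made for illustration; only the 500-cycle threshold comes from the example above.

```python
class RunLengthPredictor:
    """Last-value run-length predictor (assumed policy, not the paper's exact design)."""

    def __init__(self, threshold_cycles=500):
        # Threshold below which a module is considered too short to speculate on.
        self.threshold_cycles = threshold_cycles
        # Hypothetical table: module id -> cycles observed on its latest invocation.
        self.last_run_length = {}

    def should_speculate(self, module_id):
        """Predict that the next run-length equals the last observed one."""
        last = self.last_run_length.get(module_id)
        if last is None:
            # Assumption: with no history, allow speculation so a run-length gets measured.
            return True
        return last >= self.threshold_cycles

    def record(self, module_id, cycles):
        """Store the measured run-length once the module invocation completes."""
        self.last_run_length[module_id] = cycles


predictor = RunLengthPredictor(threshold_cycles=500)
predictor.record("short_helper", 120)   # hypothetical module that ran for 120 cycles
predictor.record("long_kernel", 4000)   # hypothetical module that ran for 4000 cycles

decisions = {
    name: predictor.should_speculate(name)
    for name in ("short_helper", "long_kernel", "unseen")
}
```

In this sketch a module is filtered out only after it has been observed running below the threshold at least once, and it becomes eligible again as soon as a later invocation runs long enough. A hardware realization would presumably use a small table indexed by call address rather than a dictionary.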

Keywords: Multiprocessors, thread-level speculation, module-level parallelism, module run-length prediction, performance evaluation.
Fulltext: pdf
Published: Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS 2003
DOI: 10.1109/IPDPS.2003.1213089
Presentation: pdf

Note: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
