## **Proceedings**

## 20th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)

### **Table of Contents**

| Front matter | Page |
|---|---|
| Message from the General Chair | viii |
| Message from the Program Committee Chairs | |
| Conference Organizers | x |
| Program Committee | xi |
| Reviewers | |
| Brazilian Computer Society | xiv |

#### **Session 1: Architecture I**

| Title | Authors | Page |
|---|---|---|
| Accurate and Low-Overhead Dynamic Detection and Prediction of Program Phases Using Branch Signatures | Balaji Vijayan and Dmitry V. Ponomarev | 3 |
| Aggressive Scheduling and Speculation in Multithreaded Architectures: Is it Worth its Salt? | | 11 |
| An Optimization Mechanism Intended for Two-Level Cache Hierarchy to Improve Energy and Performance Using the NSGAII Algorithm | Abel G. Silva-Filho, Carmelo J.A. Bastos-Filho, Davi M.A. Falcão, Filipe R. Cordeiro, and Rodrigo M.C.S. Castro | 19 |

#### **Session 2: Applications I**

| Title | Authors | Page |
|---|---|---|
| On Simulated Annealing for the Scheduling of Parallel Applications | Rodrigo Fernandes de Mello and Luciano José Senger | 29 |
| Controlling Processes Reassignment in BSP Applications | | 37 |
| A High Performance Massively Parallel Approach for Real Time Deformable Body Physics Simulation | Thiago S.M.C de Farias, Mozart W.S. Almeida, João Marcelo X.N. Teixeira, Veronica Teichrieb, and Judith Kelner | |

#### **Session 3: Multicore**

| Title | Authors | Page |
|---|---|---|
| A Methodology for Developing High Fidelity Communication Models for Large-Scale Applications Targeted on Multicore Systems | Charles W. Lively, Valerie E. Taylor, Sadaf R. Alam, and Jeffrey S. Vetter | 55 |
| Selection of the Register File Size and the Resource Allocation Policy on SMT Processors | Jesús Alastruey, Teresa Monreal, Francisco Cazorla, Víctor Viñals, and Mateo Valero | 63 |
| ORBIT: Effective Issue Queue Soft-Error Vulnerability Mitigation on Simultaneous Multithreaded Architectures Using Operand Readiness-Based Instruction Dispatch | Xin Fu, Tao Li, and José Fortes | 71 |

#### **Session 4: Applications II**

| Title | Authors | Page |
|---|---|---|
| Processing Neocognitron of Face Recognition on High Performance Environment Based on GPU with CUDA Architecture | Gustavo Poli, José Hiroki Saito, João F. Mari, and Marcelo R. Zorzan | 81 |
| Parallel Verified Linear System Solver for Uncertain Input Data | Mariana Kolberg, Márcio Dorn, Luiz Gustavo Fernandes, and Gerd Bohlender | 89 |
| Applying Virtualization and System Management in a Cluster to Implement an Automated Emulation Testbed for Grid Applications | | 97 |

#### **Session 5: Architecture II**

| Title | Authors | Page |
|---|---|---|
| Hiding Communication Delays in Clustered Microarchitectures | | 107 |
| Software Synthesis for Hard Real-Time Embedded Systems with Energy Constraints | Eduardo Tavares, Bruno Silva, Paulo Maciel, and Pedro Dallegrave | 115 |
| A Segmented Bloom Filter Algorithm for Efficient Predictors | M. Breternitz, Gabriel H. Loh, Bryan Black, Jeffrey Rupley, Peter G. Sassone, Wesley Attrot, and Youfeng Wu | 123 |

#### **Session 6: Grid, Cluster, and Operating Systems**

| Title | Authors | Page |
|---|---|---|
| Measuring Operating System Overhead on CMT Processors | | 133 |
| Aspect-Based Patterns for Grid Programming | Luis Daniel Benavides Navarro, Rémi Douence, Fabien Hermenier, Jean-Marc Menaud, and Mario Südholt | 141 |
| A Reconfigurable Run-Time System for Filter-Stream Applications | Daniel Fireman, George Teodoro, André Cardoso, and Renato Ferreira | 149 |

#### **Session 7: Memory Systems**

| Title | Authors | Page |
|---|---|---|
| Using Analytical Models to Efficiently Explore Hardware Transactional Memory and Multi-Core Co-Design | James Poe, Chang-Burm Cho, and Tao Li | 159 |
| Performance Sensitivity of NUCA Caches to On-Chip Network Parameters | Alessandro Bardine, Manuel Comparetti, Pierfrancesco Foglia, Giacomo Gabrielli, and Cosimo A. Prete | 167 |
| A Software Transactional Memory System for an Asymmetric Processor Architecture | Felipe Goldstein, Alexandro Baldassin, Paulo Centoducatte, Rodolfo Azevedo, and Leonardo A.G. Garcia | 175 |
| Transactional WaveCache: Towards Speculative and Out-of-Order DataFlow Execution of Memory Operations | Leandro A.J. Marzulo, Felipe M.G. Franca, and Vítor Santos Costa | 183 |

| Back matter | Page |
|---|---|
| Author Index | 191 |