# The 39<sup>th</sup> Annual IEEE/ACM International Symposium on Microarchitecture

December 9-13, 2006, Orlando, Florida, USA

**IEEE**

Los Alamitos, California<br>
Washington<br>
Tokyo

#### All rights reserved.

Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries may photocopy beyond the limits of US copyright law, for private use of patrons, those articles in this volume that carry a code at the bottom of the first page, provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Other copying, reprint, or republication requests should be addressed to: IEEE Copyrights Manager, IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331.

The papers in this book comprise the proceedings of the meeting mentioned on the cover and title page. They reflect the authors' opinions and, in the interests of timely dissemination, are published as presented and without change. Their inclusion in this publication does not necessarily constitute endorsement by the editors, the IEEE Computer Society, or the Institute of Electrical and Electronics Engineers, Inc.

IEEE Computer Society Order Number P2732<br>
ISBN 0-7695-2732-9<br>
ISBN 978-0-7695-2732-1<br>
ISSN Number 1072-4451

Additional copies may be ordered from:

**IEEE Computer Society Customer Service Center**<br>
10662 Los Vaqueros Circle<br>
P.O. Box 3014<br>
Los Alamitos, CA 90720-1314<br>
Tel: +1 800 272 6657<br>
Fax: +1 714 821 4641<br>
http://computer.org/cspress<br>
csbooks@computer.org

**IEEE Service Center**<br>
445 Hoes Lane<br>
P.O. Box 1331<br>
Piscataway, NJ 08855-1331<br>
Tel: +1 732 981 0060<br>
Fax: +1 732 981 9667<br>
http://shop.ieee.org/store/<br>
customer-service@ieee.org

**IEEE Computer Society Asia/Pacific Office**<br>
Watanabe Bldg., 1-4-2 Minami-Aoyama<br>
Minato-ku, Tokyo 107-0062 JAPAN<br>
Tel: +81 3 3408 3118<br>
Fax: +81 3 3408 3553<br>
tokyo.ofc@computer.org

Individual paper REPRINTS may be ordered at: <reprints@computer.org>

Editorial production by Bob Werner<br>
Cover art production by Joe Daigle/Studio Productions<br>
Printed in the United States of America by The Printing House

**IEEE Computer Society**<br>
Conference Publishing Services<br>
http://www.computer.org/proceedings/

# Table of Contents: MICRO-39 2006

## 39<sup>th</sup> Annual International Symposium on Microarchitecture

| Contents | Page |
|----------|------|
| Message from the General Co-Chairs | ix |
| Message from the Program Co-Chairs | x |
| Conference Committees | xi |
| Reviewers | |
| Sponsors | |
| **Session 1: Reliability and Bug Detection** | |
| A Floorplan-Aware Dynamic Inductive Noise Controller for Reliable Processor Design<br>Fayez Mohamood, Michael Healy, Sung Kyu Lim, and Hsien-Hsin S. Lee | 3 |
| Yield-Aware Cache Architectures<br>Serkan Ozdemir, Debjit Sinha, Gokhan Memik, Jonathan Adams, and Hai Zhou | 15 |
| Phoenix: Detecting and Recovering from Permanent Processor Design Bugs with Programmable Hardware | 26 |
| PathExpander: Architectural Support for Increasing the Path Coverage of Dynamic Bug Detection | 38 |
| **Session 2A: Compiler and Branch Handling** | |
| Diverge-Merge Processor (DMP): Dynamic Predicated Execution of Complex Control-Flow Graphs Based on Frequently Executed Paths<br>Hyesoon Kim, José A. Joao, Onur Mutlu, and Yale N. Patt | 53 |
| Merging Head and Tail Duplication for Convergent Hyperblock Formation | 65 |
| Data-Dependency Graph Transformations for Superblock Scheduling | 77 |
| Dataflow Predication | 89 |
| **Session 2B: Security** | |
| Authentication Control Point and its Implications for Secure Processor Design | 103 |
| Using Branch Correlation to Identify Infeasible Paths for Anomaly Detection | 113 |
| Memory Protection through Dynamic Access Control | 123 |
| LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks<br>Feng Qin, Zhenmin Li, Yuanyuan Zhou, Cheng Wang, Ho-seop Kim, and Youfeng Wu | 135 |
| **Session 3A: Superscalar Processors** | |
| Fairness and Throughput in Switch on Event Multithreading<br>Ron Gabor, Shlomo Weiss, and Avi Mendelson | 149 |
| A Predictive Performance Model for Superscalar Processors<br>P.J. Joseph, Kapil Vaswani, and Matthew J. Thazhuthaveetil | 161 |
| Serialization-Aware Mini-Graphs: Performance with Fewer Resources<br>Anne Bracy and Amir Roth | 171 |
| **Session 3B: Memory Systems** | |
| Architectural Support for Software Transactional Memory<br>Bratin Saha, Ali-Reza Adl-Tabatabai, and Quinn Jacobson | 185 |
| Virtually Pipelined Network Memory<br>Banit Agrawal and Timothy Sherwood | 197 |
| Fair Queuing Memory Systems<br>Kyle J. Nesbit, Nidhi Aggarwal, James Laudon, and James E. Smith | 208 |
| **Session 4: CMP Execution** | |
| Reunion: Complexity-Effective Multicore Redundancy<br>Jared C. Smolens, Brian T. Gold, Babak Falsafi, and James C. Hoe | 223 |
| Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers<br>Jack Sampson, Rubén González, Jean-Francois Collard, Norman P. Jouppi, Mike Schlansker, and Brad Calder | 235 |
| CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs<br>Pierre Palatin, Yves Lhuillier, and Olivier Temam | 247 |
| Support for High-Frequency Streaming in CMPs<br>Ram Rangan, Neil Vachharajani, Adam Stoler, Guilherme Ottoni, David August, and George Cai | 259 |
| **Session 5A: Memory Dependences** | |
| Fire-and-Forget: Load/Store Scheduling with No Store Queue at All<br>Samantika Subramaniam and Gabriel H. Loh | 273 |
| NoSQ: Store-Load Communication without a Store Queue<br>Tingting Sha, Milo M.K. Martin, and Amir Roth | 285 |
| DMDC: Delayed Memory Dependence Checking through Age-Based Filtering<br>Fernando Castro, Daniel Chaver, Luis Pinuel, Manuel Prieto, Michael C. Huang, and Francisco Tirado | 297 |
| **Session 5B: Networks and Coherence** | |
| Coherence Ordering for Ring-Based Chip Multiprocessors | 309 |
| In-Network Cache Coherence | 321 |
| ViChaR: A Dynamic Virtual Channel Regulator for Network-on-Chip Routers<br>Narayanan Vijaykrishnan, Mazin S. Yousif, and Chita R. Das | 333 |
| **Session 6A: Power** | |
| An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget<br>Canturk Isci, Alper Buyuktosunoglu, Pradip Bose, Margaret Martonosi, and Chen-Yong Cher | 347 |
| Live, Runtime Phase Monitoring and Prediction on Real Systems with Application to Dynamic Power Management<br>Canturk Isci, Gilberto Contreras, and Margaret Martonosi | 359 |
| Dynamic Standby Prediction for Leakage Tolerant Microprocessor Functional Units<br>Ahmed Youssef, Mohab Anis, and Mohamed Elmasry | 371 |
| **Session 6B: Caches and Prefetching** | |
| Adaptive Caches: Effective Shaping of Cache Behavior to Workloads<br>Ranjith Subramanian, Yannis Smaragdakis, and Gabriel H. Loh | 385 |
| Memory Prefetching Using Adaptive Stream Detection<br>Ibrahim Hur and Calvin Lin | 397 |
| Scalable Cache Miss Handling for High Memory-Level Parallelism<br>James Tuck, Luis Ceze, and Josep Torrellas | 409 |
| **Session 7: Managing CMP Caches** | |
| Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches<br>Moinuddin K. Qureshi and Yale N. Patt | 423 |
| Molecular Caches: A Caching Structure for Dynamic Creation of Application-Specific Heterogeneous Cache Regions<br>Keshavan Varadarajan, S.K. Nandy, Vishal Sharda, Amrutur Bharadwaj, Ravi Iyer, Srihari Makineni, and Donald Newell | 433 |
| ASR: Adaptive Selective Replication for CMP Caches<br>Bradford M. Beckmann, Michael R. Marty, and David A. Wood | 443 |
| Managing Distributed, Shared L2 Caches through OS-Level Page Allocation<br>Sangyeun Cho and Lei Jin | 455 |
| **Session 8: Technology-Driven Architecture** | |
| Die Stacking (3D) Microarchitecture<br>Bryan Black, Murali Annavaram, Ned Brekelbaum, John DeVale, Lei Jiang, Gabriel H. Loh, Don McCauley, Pat Morrow, Donald W. Nelson, Daniel Pantuso, Paul Reed, Jeff Rupley, Sadasivan Shankar, John Shen, and Clair Webb | 469 |
| Distributed Microarchitectural Protocols in the TRIPS Prototype Processor<br>Karthikeyan Sankaralingam, Ramadass Nagarajan, Robert McDonald, Rajagopalan Desikan, Saurabh Drolia, Madhu Saravana Sibi Govindan, Paul Gratz, Divya Gulati, Heather Hanson, Changkyu Kim, Haiming Liu, Nitya Ranganathan, Simha Sethumadhavan, Sadia Sharif, Premkishore Shivakumar, Stephen W. Keckler, and Doug Burger | 480 |
| Leveraging Optical Technology in Future Bus-Based Chip Multiprocessors<br>Nevin Kırman, Meyrem Kırman, Rajeev K. Dokania, José F. Martínez, Alyssa B. Apsel, Matthew A. Watkins, and David H. Albonesi | 492 |
| Mitigating the Impact of Process Variations on Processor Register Files and Execution Units<br>Xiaoyao Liang and David Brooks | 504 |
| Author Index | 515 |