## **2009 IEEE International Conference on Computer Design**

(ICCD 2009)

Lake Tahoe, California, USA
4 – 7 October 2009

**IEEE Catalog Number: CFP09ICD-PRT**
**ISBN**: 978-1-4244-5029-9

## Table of Contents

## Session I.

| Session 1.1. Disruptive Computing Technology<br>Session chair: Davia J Lu, Intel Corporation | Page |
|---|---|
| Imperfection-Immune Carbon Nanotube Digital VLSI (invited talk) | 1 |
| Computer-Aided Design for Microfluidic Chips Based on Multilayer Soft Lithography (invited talk) | 2 |
| Reincarnate Historic Systems On FPGA with Novel Design Methodology (invited talk) | 10 |

| Session 1.2. Advances in Timing Analysis and Optimization<br>Session chair: Stephan Wong, TU Delft | Page |
|---|---|
| Automatic Synthesis of Computation Interference Constraints for Relative Timing Verification | 16 |
| Symmetrical Buffer Placement in Clock Trees for Minimal Skew Immune to Global On-chip Variations | 23 |
| Statistical Timing Analysis based on simulation of Lithographic process<br>Aswin Sreedhar, Sandip Kundu | 29 |

| Session 1.3. System Power and Thermal Issues<br>Session chair: Xiaoyao Liang, NVIDIA | Page |
|---|---|
| Compiler-Directed Leakage Reduction in Embedded Microprocessors<br>Soumyaroop Roy, Nagarajan Ranganathan, Srinivas Katkoori | 35 |
| Efficient Calibration of Thermal Models based on Application Behavior<br>Youngwoo Ahn, Inchoon Yeo, Riccardo Bettati | 41 |
| Using Checksum to Reduce Power Consumption of the Display System for Low-Motion Content<br>Kyungtae Han, Zhen Fang, Richard Forand, Paul Diefenbaugh, Ravi Iyer, Donald Newell | 47 |

## Session II.

| Session 2.1. Keynote Series: Disruptive Computing Design | Page |
|---|---|
| A Disruptive Computer Design Idea: Architectures with Repeatable Timing (invited talk)<br>Stephen Edwards, Sungjun Kim, Edward Lee, Isaac Liu, Hiren Patel, Martin Schoeberl | 54 |
| Algorithmic Approach to Designing an Easy-To-Program System: Can It Lead to a HW-Enhanced Programmer's Workflow Add-On? (invited talk) | 60 |

| Session 2.2. Hierarchical Testing and Design for Test<br>Session chair: Prab Varma, Blue Pearl Software | Page |
|---|---|
| Quality Improvement and Cost Reduction Using Statistical Outlier Methods<br>Amit Nahar, Kenneth Butler, John Carulli, Charles Weinberger | 64 |
| Test-Wrapper Optimization for Embedded Cores in TSV-Based Three-Dimensional SOCs<br>Brandon Noia, Krishnendu Chakrabarty, Yuan Xie | 70 |
| Hierarchical Parametric Test Metrics Estimation: A Sigma-Delta Converter BIST Case-Study | 78 |
| Design and Test Strategies for Microarchitectural Post-Fabrication Tuning<br>Xiaoyao Liang, Benjamin Lee, Gu-Yeon Wei, David Brooks | 84 |
| Impact Analysis of Performance Faults in Modern Microprocessors<br>Naghmeh Karimi, Michail Maniatakos, Chandra Tirumurti, Abhijit Jas, Yiorgos Makris | 91 |
| A Robust Pulse-triggered Flip-Flop based Enhanced Scan Cell Design<br>Rajesh Kumar, Kalyana Bollapalli, Rajesh Garg, Tarun Soni, Sunil Khatri | 97 |

| Session 2.3. Clocking, Synchronization and Interconnect<br>Session chair: Nestoras Tzartzanis, Apple | Page |
|---|---|
| Enabling Resonant Clock Distribution with Scaled On-Chip Magnetic Inductors<br>Saurabh Sinha, Wei Xu, Jyothi Velamala, Tawab Dastagir, Bertan Bakkaloglu, Hongbin Yu, Yu Cao | 103 |
| A Flexible Communication Scheme for Rationally-Related Clock Frequencies<br>Jean-Michel Chabloz, Ahmed Hemani | 109 |
| VariPipe: Low-overhead Variable-clock Synchronous Pipelines | 117 |
| N-way Ring and Square Arbiters | 125 |
| On-chip Bidirectional Wiring for Heavily Pipelined Systems using Network Coding | 131 |

## Session III.

| Session 3.1. Energy Efficient Architectures<br>Session chair: Allen Cheng, University of Pittsburgh | Page |
|---|---|
| WHOLE: A Low Energy I-Cache with Separate Way History<br>Zichao Xie, Dong Tong, Xu Cheng | 137 |
| Reducing Dynamic Power Dissipation in Pipelined Forwarding Engines<br>Weirong Jiang, Viktor Prasanna | 144 |
| A Power-aware Hybrid RAM-CAM Renaming Mechanism for Fast Recovery | 150 |
| Resource Sharing of Pipelined Custom Hardware Extension for Energy-efficient Application-specific Instruction Set Processor Design | 158 |
| Deterministic Clock Gating to Eliminate Wasteful Activity in Out-of-Order Superscalar Processors due to Wrong Path Instructions<br>Nasir Mohyuddin, Kimish Patel, Massoud Pedram | 166 |

| Session 3.2. System Level Test and Verification | Page |
|---|---|
| Real-time, Unobtrusive, and Efficient Program Execution Tracing with Stream Caches and Last Stream Predictors | 173 |
| A Distributed Concurrent On-Line Test Scheduling Protocol for Many-Core NoC-Based Systems | 179 |
| Transaction-Based Debugging of System-on-Chips with Patterns | 186 |
| A New Verification Method for Embedded Systems | 193 |
| A Hierarchical Approach Towards System Level Static Timing Verification of SoCs | 201 |

| Session 3.3. Synthesis and Optimization under Reliability Constraints<br>Session chair: Mehdi Tahoori, Northeastern University | Page |
|---|---|
| Timing Variation-Aware High-Level Synthesis Considering Accurate Yield Computation | 207 |
| Fault-Tolerant Synthesis using Non-Uniform Redundancy | 213 |
| Low-Overhead Error Detection for Networks-on-Chip | 219 |
| Reliable 3D Stacked Power Distribution Considering Substrate Coupling<br>Amirali Shayan Arani, Xiang Hu, A. Ege Engin, Xiaoming Chen, Mikhail Popovich, Wanping Zhang, Chung-Kuan Cheng | 225 |
| Interconnect Performance Corners considering Crosstalk Noise | 231 |

## Session IV.

| Session 4.1. High Performance Architecture and Advanced Memory<br>Session chair: Sung Woo Chung, Korea University, Korea | Page |
|---|---|
| Reducing Register File Size through Instruction Pre-Execution Enhanced by Value Prediction | 238 |
| Reusing cached schedules in an out-of-order processor with in-order issue logic | 246 |
| 3D GPU Architecture using Cache Stacking: Performance, Cost, Power and Thermal analysis | 254 |
| Extending Data Prefetching to Cope with Context Switch Misses<br>Suleyman Sair, Hanyu Cui | 260 |
| The Salvage Cache: A fault-tolerant cache architecture for next-generation memory technologies | 268 |

| Session 4.2. Memory and Processors<br>Session chair: John Kim, Cray/Northwestern University | Page |
|---|---|
| LRU-PEA: A Smart Replacement Policy for Non-Uniform Cache Architectures on Chip Multiprocessors | 275 |
| Avoiding Cache Thrashing due to Private Data Placement in Last-level Cache For Manycore Scaling | 282 |
| SHIELDSTRAP: Making Secure Processors Truly Secure | 289 |
| Rapid Early-Stage Microarchitecture Design Using Predictive Models<br>Christophe Dubach, Timothy Jones, Michael O'Boyle | 297 |
| Efficient Binary Translation System with Low Hardware Cost | 305 |

| Session 4.3. Disruptive Trends in Test and Verification (Invited)<br>Session chair: Rathish Jayabharathi, Intel Corporation | Page |
|---|---|
| Defect-Based Test Optimization for Analog/RF Circuits for Near-Zero DPPM Applications (invited talk) | 313 |
| (invited talk) | 319 |
| Testing Bio-chips (invited talk) | 327 |
| Framework for Massively Parallel Testing at Wafer and Package Test (invited talk) | 328 |
| Online Multiple Error Detection in Crossbar Nano-architectures (invited talk) | 335 |

## Best Paper Session.

| Session chairs: Sofiene Tahar (Concordia University), Georgi Gaydadjiev (Delft University of Technology) | Page |
|---|---|
| Adaptive Online Testing for Efficient Hard Fault Detection | 343 |
| FinFET-based Dynamic Power Management of On-chip Interconnection Networks through Adaptive Back-gate Biasing | 350 |
| Analysis and Optimization of Pausible Clocking based GALS Design<br>Xin Fan, Milos Krstic, Eckhard Grass | 358 |
| Reliable Cache Design with Detection of Gate Oxide Breakdown Using BIST | 366 |

## Session V.

| Session 5.1. Logic and Memory Design<br>Session chair: Amy Novak, AMD | Page |
|---|---|
| Efficient Architectures for Elliptic Curve Cryptographic Processors for RFID | 372 |
| Multiplier-less and Table-less Linear Approximation for Square and Square-root | 378 |
| | 384 |
| | 390 |
| | 398 |

| | Page |
|---|---|
| | 404 |
| | 412 |
| | 419 |

| | Page |
|---|---|
| | 427 |
| | 433 |
| | 439 |
| | 445 |

| Session 6.2. System Level Influence on Architecture<br>Session chair: Hai Li, Polytechnic Institute of NYU | Page |
|---|---|
| Code Density Concerns for New Architectures | 459 |
| Performance Analysis of Decimal Floating-Point Libraries and Its Impact on Decimal Hardware and Software Solutions | 465 |
| The Impact of Liquid Cooling on 3D Multi-Core Processors | 472 |
| Intra-Vector SIMD Instructions for Core Specialization | 479 |
| A High Throughput FFT Processor with no Multipliers | 485 |

| Session 6.3. Low Voltage and Low Power<br>Session chair: Lars Svensson, Chalmers | Page |
|---|---|
| Panoptic DVS: A Fine-Grained Dynamic Voltage Scaling Framework for Energy Scalable CMOS Design | 491 |
| 3D Simulation and Analysis of the Radiation Tolerance of Voltage Scaled Digital Circuits | 498 |
| A Radiation Tolerant Phase Locked Loop Design for Digital Electronics<br>Rajesh Kumar, Vinay Karkala, Rajesh Garg, Tanuj Jindal, Sunil Khatri | 505 |
| A PLL Design based on a Standing Wave Resonant Oscillator | 511 |
| Mid-range Wireless Energy Transfer Using Inductive Resonance for Wireless Sensors | 517 |
| A Technology-Agnostic Simulation Environment (TASE) for Iterative Custom IC Design Across Processes | 523 |

Author index