# **2014 19th Asia and South Pacific Design Automation Conference**

## (ASP-DAC 2014)

Singapore 20-23 January 2014



IEEE Catalog Number: CFP14ASP-POD

ISBN: 978-1-4799-2817-0

#### 2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC) Date: Jan 20-23, 2014 Place: Suntec, Singapore

### **Table of Contents**

#### **Technical Papers**

| No | Title                                                          | Page No   |
|----|----------------------------------------------------------------|-----------|
| 1  | Normally-Off Computing Project: Challenges and Opportunities   | 1         |
| 2  | Novel Nonvolatile Memory Hierarchies to Realize "Normally-Off  | 6         |
|    | Mobile Processors"                                             |           |
| 3  | Normally-Off MCU Architecture for Low-Power Sensor Node        | 12        |
| 4  | Normally-Off Technologies for Healthcare Appliance             | 17        |
| 5  | A Dual-Loop Injection-Locked PLL with All-Digital Background   | 21        |
|    | Calibration System for On-Chip Clock Generation                |           |
| 6  | A 950µW 5.5-GHz Low Voltage PLL with Digitally-Calibrated      | 23        |
|    | ILFD and Linearized Varactor                                   |           |
| 7  | A Swing-Enhanced Current-Reuse Class-C VCO with Dynamic        | 25        |
|    | Bias Control Circuits                                          |           |
| 8  | Design of A High-Performance Millimeter-Wave Amplifier         | 27        |
|    | Using Specific Modeling                                        |           |
| 9  | A Multi-Mode Reconfigurable Analog Baseband with I/Q           | 29        |
|    | Calibration for GNSS Receivers                                 |           |
| 10 | An 8b Extremely Area Efficient Threshold Configuring SAR       | 31        |
|    | ADC with Source Voltage Shifting Technique                     |           |
| 11 | A Single-Inductor 8-Channel Output DC-DC Boost Converter       | 33        |
|    | with Time-Limited Power Distribution Control and Single Shared |           |
|    | Hysteresis Comparator                                          |           |

| No | Title                                                            | Page No |
|----|------------------------------------------------------------------|---------|
| 12 | A DC-DC Boost Converter with Variation Tolerant MPPT             | 35      |
|    | Technique and Efficient ZCS Circuit for Thermoelectric Energy    |         |
|    | Harvesting Applications                                          |         |
| 13 | 7.3 Gb/s Universal BCH Encoder and Decoder for SSD               | 37      |
|    | Controllers                                                      |         |
| 14 | A High-Speed and Low-Complexity Lens Distortion Correction       | 39      |
|    | Processor for Wide-Angle Cameras                                 |         |
| 15 | Analytical Placement of Mixed-Size Circuits for Better Detailed- | 41      |
|    | Routability                                                      |         |
| 16 | Lithographic Defect Aware Placement Using Compact Standard       | 47      |
|    | Cells Without Inter-Cell Margin                                  |         |
| 17 | Structural Planning of 3D-IC Interconnects by Block Alignment    | 53      |
| 18 | Comprehensive Die-Level Assessment of Design Rules and           | 61      |
|    | Layouts                                                          |         |
| 19 | Prefetching Techniques for STT-RAM Based Last-Level Cache        | 67      |
|    | in CMP Systems                                                   |         |
| 20 | CNPUF: A Carbon Nanotube-based Physically Unclonable             | 73      |
|    | Function for Secure Low-Energy Hardware Design                   |         |
| 21 | 3DCoB: A New Design Approach for Monolithic 3D Integrated        | 79      |
|    | Circuits                                                         |         |
| 22 | Emulator-Oriented Tiny Processors for Unreliable Post-Silicon    | 85      |
|    | Devices: A Case Study                                            |         |
| 23 | Applying VLSI EDA to Energy Distribution System Design           | 91      |
| 24 | A Model-Based Design of Cyber-Physical Energy Systems            | 97      |
| 25 | The Data Center as a Grid Load Stabilizer                        | 105     |
| 26 | Bounding Buffer Space Requirements for Real-Time Priority-       | 113     |
|    | Aware Networks                                                   |         |
| 27 | Task- and Network-Level Schedule Co-Synthesis of Ethernet-       | 119     |
|    | Based Time-Triggered Systems                                     |         |
| 28 | Service Adaptions for Mixed-Criticality Systems                  | 125     |
| 29 | Efficient Feasibility Analysis of DAG Scheduling with Real-      | 131     |
|    | Time Constraints in the Presence of Faults                       |         |
| 30 | Flexible Packed Stencil Design with Multiple Shaping Apertures   | 137     |
|    | for E-Beam Lithography                                           |         |
| 31 | Self-Aligned Double Patterning Layout Decomposition with         | 143     |
|    | Complementary E-Beam Lithography                                 |         |
| 32 | Fixing Double Patterning Violations with Look-Ahead              | 149     |
| 33 | EUV-CDA: Pattern Shift Aware Critical Density Analysis for       | 155     |
|    | EUV Mask Layouts                                                 |         |
| 34 | Statistical Analysis of Random Telegraph Noise in Digital        | 161     |
|    | Circuits                                                         |         |
| 35 | Semi-Analytical Current Source Modeling of FinFET Devices        | 167     |
|    | Operating in Near/Sub-Threshold Regime with Independent Gate     |         |
|    | Control and Considering Process Variation                        |         |
| 36 | 2-SAT Based Linear Time Optimum Two-Domain Clock Skew            | 173     |
|    | Scheduling                                                       |         |
| 37 | Power Minimization of Pipeline Architecture through 1-Cycle      | 179     |
|    | Error Correction and Voltage Scaling                             |         |
| 38 | A Silicon Nanodisk Array Structure Realizing Synaptic Response   | 185     |
|    | of Spiking Neuron Models with Noise                              |         |
| 39 | Energy Efficient In-Memory Machine Learning for Data             | 191     |
|    | Intensive Image-Processing by Non-Volatile Domain-Wall           |         |
|    | Memory                                                           |         |
| 40 | Lessons from the Neurons Themselves                              | 197     |

| No | Title                                                            | Page No |
|----|------------------------------------------------------------------|---------|
| 41 | Leveraging the Error Resilience of Machine-Learning              | 201     |
|    | Applications for Designing Highly Energy Efficient Accelerators  |         |
| 42 | ArISE: Aging-Aware Instruction Set Encoding for Lifetime         | 207     |
|    | Improvement                                                      |         |
| 43 | DRuiD: Designing Reconfigurable Architectures with Decision-     | 213     |
|    | Making Support                                                   |         |
| 44 | Edit Distance Based Instruction Merging Technique to Improve     | 219     |
|    | Flexibility of Custom Instructions toward Flexible Accelerator   |         |
|    | Design                                                           |         |
| 45 | A Network-Flow-Based Optimal Sample Preparation Algorithm        | 225     |
| 46 | Exploring Speed and Energy Tradeoffs in Droplet Transport for    | 231     |
|    | Digital Microfluidic Biochips                                    |         |
| 47 | General Purpose Cross-Referencing Microfluidic Biochip with      | 238     |
|    | Reduced Pin-Count                                                |         |
| 48 | Wash Optimization for Cross-Contamination Removal in Flow-       | 244     |
|    | Based Microfluidic Biochips                                      |         |
| 49 | ABCD-NL: Approximating Continuous Non-Linear Dynamical           | 250     |
|    | Systems Using Purely Boolean Models for Analog/Mixed-Signal      |         |
|    | Verification                                                     |         |
| 50 | Toward Efficient Programming of Reconfigurable Radio             | 256     |
|    | Frequency (RF) Receivers                                         |         |
| 51 | Efficient Matrix Exponential Method Based on Extended Krylov     | 262     |
|    | Subspace for Transient Simulation of Large-Scale Linear Circuits |         |
| 52 | SDG2KPN: System Dependency Graph to Function-Level KPN           | 267     |
|    | Generation of Legacy Code for MPSoCs                             |         |
| 53 | Low Power Design of the Next-Generation High Efficiency          | 274     |
|    | Video Coding                                                     |         |
| 54 | [...] Synthesis                                                  | 282     |
| 55 | Leveraging Parallelism in the Presence of Control Flow on        | 285     |
|    | CGRAs                                                            |         |
| 56 | Physical-Aware Task Migration Algorithm for Dynamic Thermal      | 292     |
|    | Management of SMT Multi-Core Processors                          |         |
| 57 | Agile Frequency Scaling for Adaptive Power Allocation in         | 298     |
|    | Many-Core Systems Powered by Renewable Energy Sources            |         |
| 58 | Variation-Aware Voltage Island Formation for Power Efficient     | 304     |
|    | Near-Threshold Manycore Architectures                            |         |
| 59 | An Evaluation of an Energy Efficient Many-Core SoC with          | 311     |
|    | Parallelized Face Detection                                      |         |
| 60 | Energy-Aware Real-Time Scheduling Policy with Guaranteed         | 317     |
|    | Security Protection                                              |         |
| 61 | A Comprehensive and Accurate Latency Model for Network-on-       | 323     |
|    | Chip Performance Analysis                                        |         |
| 62 | A Low-Latency Asynchronous Interconnection Network with          | 329     |
|    | Early Arbitration Resolution                                     |         |
| 63 | A Vertically Integrated and Interoperable Multi-Vendor Synthesis | 337     |
|    | Flow for Predictable NoC Design in Nanoscale Technologies        |         |
| 64 | Fuzzy Flow Regulation for Network-on-Chip Based Chip             | 343     |
|    | Multiprocessors Systems                                          |         |
| 65 | Adjustable Contiguity of Run-Time Task Allocation in             | 349     |
|    | Networked Many-Core Systems                                      |         |
| 66 | STD-TLB: A STT-RAM-Based Dynamically-Configurable                | 355     |
|    | Translation Lookaside Buffer for GPU Architectures               |         |

| No       | Title                                                                            | Page No |
|----------|----------------------------------------------------------------------------------|---------|
| 67       | Training Itself: Mixed-Signal Training Acceleration for                          | 361 |
|          | Memristor-Based Neural Network                                                   |     |
| 68       | HDTV1080p HEVC Intra Encoder with Source Texture Based                           | 367 |
|          | CU/PU Mode Pre-decision                                                          |     |
| 69       | Fast Large-Scale Optimal Power Flow Analysis for Smart Grid                      | 373 |
|          | through Network Reduction                                                        |     |
| 70       | Storage-Less and Converter-Less Maximum Power Point                              | 379 |
|          | Tracking of Photovoltaic Cells for a Nonvolatile Microprocessor                  |     |
| 71       | Soft Error Resiliency Characterization on IBM BlueGene/Q                         | 385 |
|          | Processor                                                                        |     |
| 72       | Resiliency for Many-Core System on a Chip                                        | 388 |
| 74       | Amphisbaena: Modeling Two Orthogonal Ways to Hunt on                             | 394 |
|          | Heterogeneous Many-Cores                                                         |     |
| 75       | Co-Simulation Framework for Streamlining Microprocessor                          | 400 |
|          | Development on Standard ASIC Design Flow                                         |     |
| 76       | Annotation and Analysis Combined Cache Modeling for Native                       | 406 |
|          | Simulation                                                                       |     |
| 77       | A Scorchingly Fast FPGA-Based Precise L1 LRU Cache                               | 412 |
|          | Simulator                                                                        |     |
| 78       | Redundant-Via-Aware ECO Routing                                                  | 418 |
| 79       | A Fast and Provably Bounded Failure Analysis of Memory                           | 424 |
|          | Circuits in High Dimensions                                                      |     |
| 80       | Predicting Circuit Aging Using Ring Oscillators                                  | 430 |
| 81       | Statistical Analysis of Process Variation Based on Indirect                      | 436 |
|          | Measurements for Electronic System Design                                        |     |
| 82       | Symbolic Computation of SNR for Variational Analysis of                          | 443 |
|          | Sigma-Delta Modulator                                                            |     |
| 83       | Sparse Statistical Model Inference for Analog Circuits under                     | 449 |
|          | Process Variations                                                               |     |
| 84       | Time-Domain Performance Bound Analysis for Analog and                            | 455 |
|          | Interconnect Circuits Considering Process Variations                             |     |
| 85       | A Robustness Optimization of SRAM Dynamic Stability by                           | 461 |
|          | Sensitivity-Based Reachability Analysis                                          |     |
| 86       | Accurate and Inexpensive Performance Monitoring for                              | 467 |
|          | Variability-Aware Systems                                                        |     |
| 87       | Quantifying Workload Dependent Reliability in Embedded                           | 474 |
|          | Processors                                                                       |     |
| 88       | QED Post-Silicon Validation and Debug: Frequently Asked                          | 478 |
|          | Questions                                                                        |     |
| 89       | Efficient Synthesis of Quantum Circuits Implementing Clifford                    | 483 |
|          | Group Operations                                                                 |     |
| 90       | Optimal SWAP Gate Insertion for Nearest Neighbor Quantum                         | 489 |
|          | Circuits                                                                         |     |
| 91       | Qubit Placement to Minimize Communication Overhead in 2D                         | 495 |
|          | Quantum Architectures                                                            |     |
| 92       | A Novel Wirelength-Driven Packing Algorithm for FPGAs with                       | 501 |
|          | Adaptive Logic Modules                                                           |     |
| 93       | A Topology-Based ECO Routing Methodology for Mask Cost                           | 507 |
|          | Minimization                                                                     |     |
| 94       | BOB-Router: A New Buffering-Aware Global Router with Over-                       | 513 |
|          | the-Block Routing Resources Optimization                                         |     |
| 95       | Routability-Driven Bump Assignment for Chip-Package Co-                          | 519 |
|          | Design                                                                           |     |

| No  | Title                                                          | Page No      |
|-----|----------------------------------------------------------------|--------------|
| 96  | VFGR: A Very Fast Parallel Global Router with Accurate         | 525          |
|     | Congestion Modeling                                            |              |
| 97  | Efficient Simulation-Based Optimization of Power Grid with On- | 531          |
|     | Chip Voltage Regulator                                         |              |
| 98  | Walking Pads: Fast Power-Supply Pad-Placement Optimization     | 537          |
| 99  | Power Supply Noise-Aware Workload Assignments for              | 544          |
|     | Homogenous 3D MPSoCs with Thermal Consideration                |              |
| 100 | SwimmingLane: A Composite Approach to Mitigate Voltage         | 550          |
|     | Droop Effects in 3D Power Delivery Network                     |              |
| 101 | Spiking Brain Models: Computation, Memory and                  | 556          |
|     | Communication Constraints for Custom Hardware                  |              |
|     | Implementation                                                 |              |
| 102 | Advanced Technologies for Brain-Inspired Computing             | 563          |
| 103 | GPGPU Accelerated Simulation and Parameter Tuning for          | 570          |
|     | Neuromorphic Applications                                      |              |
| 104 | A Scalable Custom Simulation Machine for the Bayesian          | 578          |
|     | Confidence Propagation Neural Network Model of the Brain       |              |
| 105 | NoΔ: Leveraging Delta Compression for End-to-End Memory        | 586          |
|     | Access in NoC Based Multicores                                 |              |
| 106 | DPA: A Data Pattern Aware Error Prevention Technique for       | 592          |
|     | NAND Flash Lifetime Extension                                  |              |
| 107 | Scattered Refresh: An Alternative Refresh Mechanism to Reduce  | 598          |
|     | Refresh Cycle Time                                             |              |
| 108 | A Read-Write Aware DRAM Scheduling for Power Reduction in      | 604          |
|     | Multi-Core Systems                                             |              |
| 109 | A Coherent Hybrid SRAM and STT-RAM L1 Cache                    | 610          |
|     | Architecture for Shared Memory Multicores                      |              |
| 110 | Allocation of FPGA DSP-Macros in Multi-Process High-Level      | 616          |
|     | Synthesis Systems                                              |              |
| 111 | Array Scalarization in High Level Synthesis                    | 622          |
| 112 | Data Compression via Logic Synthesis                           | 628          |
| 113 | Synthesis of Power- and Area-Efficient Binary Machines for     | 634          |
|     | Incompletely Specified Sequences                               |              |
| 114 | Multi-Mode Trace Signal Selection for Post-Silicon Debug       | 640          |
| 115 | Implicit Intermittent Fault Detection in Distributed Systems   | 646          |
| 116 | A Segmentation-Based BISR Scheme                               | 652          |
| 117 | Fault-Tolerant TSV by Using Scan-Chain Test TSV                | 658          |
| 118 | Suppressing Test Inflation in Shared-Memory Parallel Automatic | 664          |
|     | Test Pattern Generation                                        |              |
| 119 | A Volume Diagnosis Method for Identifying Systematic Faults in | 670          |
|     | Lower-Yield Wafer Occurring during Mass Production             |              |
| 120 | An Overview of Spin-Based Integrated Circuits                  | 676          |
| 121 | Advances in Spintronics Devices for Microelectronics - from    | 684          |
|     | Spin-Transfer Torque to Spin-Orbit Torque                      |              |
| 122 | Hybrid CMOS/Magnetic Process Design Kit and SOT-Based          | 692          |
|     | Non-Volatile Standard Cell Architectures                       |              |
| 123 | Architectural Aspects in Design and Analysis of SOT-Based      | 700          |
|     | Memories                                                       |              |
| 124 | Timing Anomalies in Multi-Core Architectures due to the        | 708          |
|     | Interference on the Shared Resources                           |              |
| 125 | A Unified Online Directed Acyclic Graph Flow Manager for       | 714          |
|     | Multicore Schedulers                                           |              |

| No  | Title                                                         | Page No |
|-----|---------------------------------------------------------------|---------|
| 126 | Variation-Aware Statistical Energy Optimization on Voltage-   | 720 |
|     | Frequency Island Based MPSoCs under Performance Yield         |     |
|     | Constraints                                                   |     |
| 127 | QoS-Aware Dynamic Resource Allocation for Spatial-            | 726 |
|     | Multitasking GPUs                                             |     |
| 128 | Automated Debugging of Missing Assumptions                    | 732 |
| 129 | Property Directed Reachability for QF_BV with Mixed Type      | 738 |
|     | Atomic Reasoning Units                                        |     |
| 130 | Adaptive Interpolation-Based Model Checking                   | 744 |
| 131 | Efficient Parallel GPU Algorithms for BDD Manipulation        | 750 |
| 132 | Efficient Techniques for the Capacitance Extraction of Chip-  | 756 |
|     | Scale VLSI Interconnects Using Floating Random Walk           |     |
|     | Algorithm                                                     |     |
| 133 | 3DLAT: TSV-Based 3D ICs Crosstalk Minimization Utilizing      | 762 |
|     | Less Adjacent Transition Code                                 |     |
| 134 | Tackling Close-to-Band Passivity Violations in Passive Macro- | 768 |
|     | Modeling                                                      |     |
| 135 | HIE-Block Latency Insertion Method for Fast Transient         | 774 |
|     | Simulation of Nonuniform Multiconductor Transmission Lines    |     |
| 136 | The Role of Photons in Cryptanalysis                          | 780 |
| 137 | SPADs for Quantum Random Number Generators and Beyond         | 788 |
| 138 | Quantum Key Distribution with Integrated Optics               | 795 |
| 139 | Constraint-Based Platform Variants Specification for Early    | 800 |
|     | System Verification                                           |     |
| 140 | A Transaction-Oriented UVM-Based Library for Verification of  | 806 |
|     | Analog Behavior                                               |     |
| 141 | Automata-Theoretic Modeling of Fixed-Priority Non-Preemptive  | 812 |
|     | Scheduling for Formal Timing Verification                     |     |
| 142 | PROCEED: A Pareto Optimization-Based Circuit-Level            | 818 |
|     | Evaluator for Emerging Devices                                |     |
| 143 | Modeling and Design Analysis of 3D Vertical Resistive Memory  | 825 |
|     | - A Low Cost Cross-Point Architecture                         |     |
| 144 | The Stochastic Modeling of TiO2 Memristor and Its Usage in    | 831 |
|     | Neuromorphic System Design                                    |     |
| 145 | Through-Silicon-Via Inductor: Is It Real or Just A Fantasy?   | 837 |
| 146 | Design and Control Methodology for Fine Grain Power Gating    | 843 |
|     | Based on Energy Characterization and Code Profiling of        |     |
|     | Microprocessors                                               |     |
| 147 | A Hybrid Random Walk Algorithm for 3-D Thermal Analysis of    | 849 |
|     | Integrated Circuits                                           |     |
| 148 | LightSim: A Leakage Aware Ultrafast Temperature Simulator     | 855 |
| 149 | Fast Vectorless Power Grid Verification Using Maximum         | 861 |
|     | Voltage Drop Location Estimation                              |     |