CGO 2019: Washington, DC, USA
- Mahmut Taylan Kandemir, Alexandra Jimborean, Tipp Moseley:
IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2019, Washington, DC, USA, February 16-20, 2019. IEEE 2019, ISBN 978-1-7281-1436-1
Research Papers
Binary Optimization
- Maksim Panchenko, Rafael Auler, Bill Nell, Guilherme Ottoni:
BOLT: A Practical Binary Optimizer for Data Centers and Beyond. 2-14
- Ruoyu Zhou, Timothy M. Jones:
Janus: Statically-Driven and Profile-Guided Automatic Dynamic Binary Parallelisation. 15-25
Bugs and Security
- Misiker Tadesse Aga, Todd M. Austin:
Smokestack: Thwarting DOP Attacks with Runtime Stack Layout Randomization. 26-36
- Jay P. Lim, Santosh Nagarakatte:
Automatic Equivalence Checking for Assembly Implementations of Cryptography Libraries. 37-49
- Hongyu Liu, Sam Silvestro, Xiaoyin Wang, Lide Duan, Tongping Liu:
CSOD: Context-Sensitive Overflow Detection. 50-60
- Haiyang Sun, Daniele Bonetta, Filippo Schiavio, Walter Binder:
Reasoning about the Node.js Event Loop using Async Graphs. 61-72
GPUs and Tensors
- Simon Garcia De Gonzalo, Sitao Huang, Juan Gómez-Luna, Simon D. Hammond, Onur Mutlu, Wen-Mei Hwu:
Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs. 73-84
- Jinsung Kim, Aravind Sukumaran-Rajam, Vineeth Thumma, Sriram Krishnamoorthy, Ajay Panyala, Louis-Noël Pouchet, Atanas Rountev, P. Sadayappan:
A Code Generator for High-Performance Tensor Contractions on GPUs. 85-95
Potpourri
- Ruiqin Tian, Junqiao Qiu, Zhijia Zhao, Xu Liu, Bin Ren:
Transforming Query Sequences for High-Throughput B+ Tree Processing on Many-Core Processors. 96-108
- Girish Mururu, Ada Gavrilovska, Santosh Pande:
Quantifying and Reducing Execution Variance in STM via Model Driven Commit Optimization. 109-121
- Wen-Chuan Lee, Yingqi Liu, Peng Liu, Shiqing Ma, Hongjun Choi, Xiangyu Zhang, Rajiv Gupta:
White-Box Program Tuning. 122-135
- Marcus Rodrigues, Breno Guimarães, Fernando Magno Quintão Pereira:
Generation of In-Bounds Inputs for Arrays in Memory-Unsafe Languages. 136-148
Code Generation
- Rodrigo C. O. Rocha, Pavlos Petoumenos, Zheng Wang, Murray Cole, Hugh Leather:
Function Merging by Sequence Alignment. 149-163
- Aleksandar Prokopec, Gilles Duboscq, David Leopoldseder, Thomas Würthinger:
An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers. 164-179
- Fredrik Kjolstad, Willow Ahrens, Shoaib Kamil, Saman P. Amarasinghe:
Tensor Algebra Compilation with Workspaces. 180-192
Kernel Optimization
- Riyadh Baghdadi, Jessica Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Abdurrahman Akkas, Yunming Zhang, Patricia Suriana, Shoaib Kamil, Saman P. Amarasinghe:
Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code. 193-205
- Vasileios Porpodas, Rodrigo C. O. Rocha, Evgueni Brevnov, Luís F. W. Góes, Timothy G. Mattson:
Super-Node SLP: Optimized Vectorization for Code Sequences Containing Operators and Their Inverse Elements. 206-216
- Thiago S. F. X. Teixeira, Corinne Ancourt, David A. Padua, William Gropp:
Locus: A System and a Language for Program Optimization. 217-228
GPUs
- Ari B. Hayes, Fei Hua, Jin Huang, Yan-Hao Chen, Eddy Z. Zhang:
Decoding CUDA Binary. 229-241
- Bo Qiao, Oliver Reiche, Frank Hannig, Jürgen Teich:
From Loop Fusion to Kernel Fusion: A Domain-Specific Approach to Locality Optimization. 242-253
- Anupama Chandrasekhar, Gang Chen, Po-Yu Chen, Wei-Yu Chen, Junjie Gu, Peng Guo, Shruthi Hebbur Prasanna Kumar, Guei-Yuan Lueh, Pankaj Mistry, Wei Pan, Thomas Raoux, Konrad Trifunovic:
IGC: The Open Source Intel Graphics Compiler. 254-265
Student Research Competition
Undergraduate
- Brandon Neth, Michelle Mills Strout:
Automatic Parallelization of Irregular x86-64 Loops. 266
Graduate
- Moumita Das, Ansuman Banerjee, Bhaskar Sardar:
A Shared BTB Design for Multicore Systems. 267-268
- Swetha Varadarajan:
Optimizing RNA-RNA Interaction Computations. 269-270
- Renata Martins Gomes, Marcel Baunach:
Code Generation from Formal Models for Automatic RTOS Portability. 271-272
- Jacob Nelson, Roberto Palmieri:
Understanding RDMA Behavior in NUMA Systems. 273-274
- Sheng-Yu Fu, Wei-Chung Hsu:
Translating Traditional SIMD Instructions to Vector Length Agnostic Architectures. 275
- Guangli Li, Lei Liu, Xiaobing Feng:
Accelerating GPU Computing at Runtime with Binary Optimization. 276-277
- Robin Kruppe, Julian Oppermann, Lukas Sommer, Andreas Koch:
Extending LLVM for Lightweight SPMD Vectorization: Using SIMD and Vector Instructions Easily from Any Language. 278-279
- Oscar Castro-López, Inés Fernando Vega López:
Multi-target Compiler for the Deployment of Machine Learning Models. 280-281
- Keren Zhou, John M. Mellor-Crummey:
A Tool for Performance Analysis of GPU-Accelerated Applications. 282
- Alok Mishra, Martin Kong, Barbara M. Chapman:
Kernel Fusion/Decomposition for Automatic GPU-Offloading. 283-284
- Yonghae Kim, Hyesoon Kim:
Translating CUDA to OpenCL for Hardware Generation using Neural Machine Translation. 285-286