33rd SBAC-PAD 2021: Belo Horizonte, Brazil
- 33rd IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2021, Belo Horizonte, Brazil, October 26-29, 2021. IEEE 2021, ISBN 978-1-6654-4301-2
Session 1: Accelerated Computing
- Hao Zhou, David Troendle, Byunghyun Jang:
DACHash: A Dynamic, Cache-Aware and Concurrent Hash Table on GPUs. 1-10
- Raúl Taranco, José-María Arnau, Antonio González:
A Low-Power Hardware Accelerator for ORB Feature Extraction in Self-Driving Cars. 11-21
- Dominik Ernst, Georg Hager, Matthias Knorr, Gerhard Wellein, Markus Holzer:
Opening the Black Box: Performance Estimation during Code Generation for GPUs. 22-32
- Jude Haris, Perry Gibson, José Cano, Nicolas Bohm Agostini, David R. Kaeli:
SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference. 33-43
Session 2: Memory Systems
- Catalina Munoz Morales, Bruno C. Honorio, Alexandro Baldassin, Guido Araujo:
Improving Phased Transactional Memory via Commit Throughput and Capacity Estimation. 44-53
- Jonathas Silveira, Lucas Wanner:
Design and evaluation of associative processing kernels. 64-73
- João Vicente Souto, Márcio Castro, Pedro Henrique Penna:
A Task-based Execution Engine for Distributed Operating Systems Tailored to Lightweight Manycores with Limited On-Chip Memory. 74-83
Session 3: Computer Architecture
- Rafael C. F. Sousa, Byungmin Jung, Jaehwa Kwak, Michael Frank, Guido Araujo:
Efficient Tensor Slicing for Multicore NPUs using Memory Burst Modeling. 84-93
- Ehsan Atoofian:
Sparsity-aware Power Gating for Tensor Cores. 94-103
- Vanderson Martins do Rosario, Raphael Zinsly, Sandro Rigo, Edson Borin:
Employing Simulation to Facilitate the Design of Dynamic Binary Translators. 104-113
- Hikaru Takayashiki, Masayuki Sato, Kazuhiko Komatsu, Hiroaki Kobayashi:
Register Flush-free Runahead Execution for Modern Vector Processors. 114-125
Session 4: Scheduling and Distributed Systems
- Anne Benoit, Louis-Claude Canon, Redouane Elghazi, Pierre-Cyrille Héam:
Shelf schedules for independent moldable tasks to minimize the energy consumption. 126-136
- Zeina Houmani, Daniel Balouek-Thomert, Eddy Caron, Manish Parashar:
Enabling microservices management for Deep Learning applications across the Edge-Cloud Continuum. 137-146
- Michael Guilherme Jordan, Guilherme Korol, Mateus Beck Rutzig, Antonio Carlos Schneider Beck:
FAIR: Fully-Adaptive Framework for Improving Resource Provisioning in Collaborative CPU-FPGA Cloud Environments. 147-156
- André Ramos Carneiro, Jean Luca Bez, Carla Osthoff, Lucas Mello Schnorr, Philippe O. A. Navaux:
HPC Data Storage at a Glance: The Santos Dumont Experience. 157-166
- Wilton Jaciel Loch, Guilherme Piêgas Koslovski:
Sparbit: a new logarithmic-cost and data locality-aware MPI Allgather algorithm. 167-176
Session 5: Applications
- Marco Barbone, Andreas Wetscherek, Thomas Yung, Uwe Oelfke, Wayne Luk, Georgi Gaydadjiev:
Efficient Online 4D Magnetic Resonance Imaging. 177-187
- Lucas Reis, Lucas Wanner:
Functional Approximation and Approximate Parallelization with the ACCEPT compiler. 188-197
- Gangyi Zhu, Gagan Agrawal:
Sampling-based Sparse Format Selection on GPUs. 198-208
- Erfan Bank Tavakoli, Michael Riera, Masudul Hassan Quraishi, Fengbo Ren:
FSCHOL: An OpenCL-based HPC Framework for Accelerating Sparse Cholesky Factorization on FPGAs. 209-220