


36th ISC 2021: Virtual Event
- Bradford L. Chamberlain, Ana Lucia Varbanescu, Hatem Ltaief, Piotr Luszczek:
High Performance Computing - 36th International Conference, ISC High Performance 2021, Virtual Event, June 24 - July 2, 2021, Proceedings. Lecture Notes in Computer Science 12728, Springer 2021, ISBN 978-3-030-78712-7
Architecture, Networks, and Storage
- Yi Dai, Kai Lu, Junsheng Chang, Xingyun Qi, Jijun Cao, Jianmin Zhang:
Microarchitecture of a Configurable High-Radix Router for the Post-Moore Era. 3-17
- Mohammadreza Bayatpour, Nick Sarkauskas, Hari Subramoni, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda:
BluesMPI: Efficient MPI Non-blocking Alltoall Offloading Designs on Modern BlueField Smart NICs. 18-37
- Jesmin Jahan Tithi, Fabrizio Petrini, David F. Richards:
Lessons Learned from Accelerating Quicksilver on Programmable Integrated Unified Memory Architecture (PIUMA) and How That's Different from CPU. 38-56
- Narasinga Rao Miniskar, Frank Liu, Aaron R. Young, Dwaipayan Chakraborty, Jeffrey S. Vetter:
A Hierarchical Task Scheduler for Heterogeneous Computing. 57-76
Machine Learning, AI, and Emerging Technologies
- Ruobing Han, James Demmel, Yang You:
Auto-Precision Scaling for Distributed Deep Learning. 79-97
- Tian Ye, Yang Yang, Sanmukh R. Kuppannagari, Rajgopal Kannan, Viktor K. Prasanna:
FPGA Acceleration of Number Theoretic Transform. 98-117
- Kawthar Shafie Khorassani, Jahanzeb Maqbool Hashmi, Ching-Hsiang Chu, Chen-Chun Chen, Hari Subramoni, Dhabaleswar K. Panda:
Designing a ROCm-Aware MPI Library for AMD GPUs: Early Experiences. 118-136
- Kevin A. Brown, Neil McGlohon, Sudheer Chunduri, Eric Borch, Robert B. Ross, Christopher D. Carothers, Kevin Harms:
A Tunable Implementation of Quality-of-Service Classes for HPC Networks. 137-156
- Brian A. Page, Peter M. Kogge:
Scalability of Streaming Anomaly Detection in an Unbounded Key Space Using Migrating Threads. 157-175
- Pouya Fotouhi, Marjan Fariborz, Roberto Proietti, Jason Lowe-Power, Venkatesh Akella, S. J. Ben Yoo:
HTA: A Scalable High-Throughput Accelerator for Irregular HPC Workloads. 176-194
- Burak Aksar, Yijia Zhang, Emre Ates, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Manuel Egele, Ayse K. Coskun:
Proctor: A Semi-Supervised Performance Anomaly Diagnosis Framework for Production HPC Systems. 195-214
HPC Algorithms and Applications
- Marko Kabic, Simon Pintarelli, Anton Kozhevnikov, Joost VandeVondele:
COSTA: Communication-Optimal Shuffle and Transpose Algorithm with Process Relabeling. 217-236
- Yicong Zhu, Peng Zhang, Changnian Han, Guojing Cong, Yuefan Deng:
Enabling AI-Accelerated Multiscale Modeling of Thrombogenesis at Millisecond and Molecular Resolutions on Supercomputers. 237-254
- Keith Obenschain, Yu Yu Khine, Raghunandan Mathur, Gopal Patnaik, Robert Rosenberg:
Evaluation of the NEC Vector Engine for Legacy CFD Codes. 255-271
- Pietro Incardona, Tommaso Bianucci, Ivo F. Sbalzarini:
Distributed Sparse Block Grids on GPUs. 272-290
- Luk Burchard, Johannes Moe, Daniel Thilo Schroeder, Konstantin Pogorelov, Johannes Langguth:
iPUG: Accelerating Breadth-First Graph Traversals Using Manycore Graphcore IPUs. 291-309
Performance Modeling, Evaluation, and Analysis
- Richard Todd Evans, Matthew Cawood, Stephen Lien Harrell, Lei Huang, Si Liu, Chun-Yaung Lu, Amit Ruhela, Yinzhi Wang, Zhao Zhang:
Optimizing GPU-Enhanced HPC System and Cloud Procurements for Scientific Workloads. 313-331
- Andrei Poenaru, Wei-Chen Lin, Simon McIntosh-Smith:
A Performance Analysis of Modern Parallel Programming Models Using a Compute-Bound Application. 332-350
- Ayesha Afzal, Georg Hager, Gerhard Wellein:
Analytic Modeling of Idle Waves in Parallel Programs: Communication, Cluster Topology, and Noise Impact. 351-371
- Masahiro Nakao, Koji Ueno, Katsuki Fujisawa, Yuetsu Kodama, Mitsuhisa Sato:
Performance of the Supercomputer Fugaku for Breadth-First Search in Graph500 Benchmark. 372-390
- István Z. Reguly, Andrew M. B. Owenson, Archie Powell, Stephen A. Jarvis, Gihan R. Mudalige:
Under the Hood of SYCL - An Initial Performance Analysis with An Unstructured-Mesh CFD Application. 391-410
- Amit Ruhela, Stephen Lien Harrell, Richard Todd Evans, Gregory J. Zynda, John M. Fonner, Matt Vaughn, Tommy Minyard, John Cazes:
Characterizing Containerized HPC Applications Performance at Petascale on CPU and GPU Architectures. 411-430
- David Böhme, Pascal Aschwanden, Olga Pearce, Kenneth Weiss, Matthew P. LeGendre:
Ubiquitous Performance Analysis. 431-449
Programming Environments and Systems Software
- Chad Wood, Giorgis Georgakoudis, David Beckingsale, David Poliakoff, Alfredo Giménez, Kevin A. Huck, Allen D. Malony, Todd Gamblin:
Artemis: Automatic Runtime Tuning of Parallel Execution Parameters Using Machine Learning. 453-472
