33rd PACT 2024: Long Beach, CA, USA
- Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, PACT 2024, Long Beach, CA, USA, October 14-16, 2024. ACM 2024, ISBN 979-8-4007-0631-8
- An Qi Zhang, Andrés Goens, Nicolai Oswald, Tobias Grosser, Daniel J. Sorin, Vijay Nagarajan:
PipeGen: Automated Transformation of a Single-Core Pipeline into a Multicore Pipeline for a Given Memory Consistency Model. 1-13
- Gyeongseo Park, Minho Kim, Ki-Dong Kang, Yunhyeong Jeon, Sungju Kim, Hyosang Kim, Daehoon Kim:
vSPACE: Supporting Parallel Network Packet Processing in Virtualized Environments through Dynamic Core Management. 14-25
- Devesh Singh, Donald Yeung:
MORSE: Memory Overwrite Time Guided Soft Writes to Improve ReRAM Energy and Endurance. 26-39
- Jakob Hartmann, Guoliang He, Eiko Yoneki:
Optimizing Tensor Computation Graphs with Equality Saturation and Monte Carlo Tree Search. 40-52
- Maurice Bailleu, Dimitrios Stavrakakis, Rodrigo Caetano Rocha, Soham Chakraborty, Deepak Garg, Pramod Bhatotia:
Toast: A Heterogeneous Memory Management System. 53-65
- Tri Nguyen, Michela Becchi:
A Transducers-based Programming Framework for Efficient Data Transformation. 66-77
- Sowoong Kim, Eunyeong Sim, Youngsam Shin, YeonGon Cho, Woongki Baek:
Activation Sequence Caching: High-Throughput and Memory-Efficient Generative Inference with a Single GPU. 78-90
- Jaeyong Song, Hongsun Jang, Hunseong Lim, Jaewon Jung, Youngsok Kim, Jinho Lee:
GraNNDis: Fast Distributed Graph Neural Network Training Framework for Multi-Server Clusters. 91-107
- Yiwei Li, Boyu Tian, Mingyu Gao:
Trimma: Trimming Metadata Storage and Latency for Hybrid Memory Systems. 108-120
- Ardhi Wiratama Baskara Yudha, Jiaqi Xue, Qian Lou, Huiyang Zhou, Yan Solihin:
BoostCom: Towards Efficient Universal Fully Homomorphic Encryption by Boosting the Word-wise Comparisons. 121-132
- Alberto Zeni, Seth Onken, Marco Domenico Santambrogio, Mehrzad Samadi:
Leveraging Difference Recurrence Relations for High-Performance GPU Genome Alignment. 133-143
- Shuiyi He, Zicong Wang, Xuan Tang, Qiyao Sun, Dezun Dong:
Chimera: Leveraging Hybrid Offsets for Efficient Data Prefetching. 144-155
- Akash Dutta, Ali Jannesari:
MIREncoder: Multi-modal IR-based Pretrained Embeddings for Performance Optimizations. 156-167
- Juseong Park, Boseok Kim, Hyojin Sung:
NavCim: Comprehensive Design Space Exploration for Analog Computing-in-Memory Architectures. 168-182
- Vignesh Suresh, Bakshree Mishra, Ying Jing, Zeran Zhu, Naiyin Jin, Charles Block, Paolo Mantovani, Davide Giri, Joseph Zuckerman, Luca P. Carloni, Sarita V. Adve:
Mozart: Taming Taxes and Composing Accelerators with Shared-Memory. 183-200
- Steve Rhyner, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, Jiawei Jiang, Ataberk Olgun, Harshita Gupta, Ce Zhang, Onur Mutlu:
PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System. 201-218
- Fangzhou Liu, Yifan Zhu, Shaotong Sun, Chen Ding, Wesley Smith, Kaave Seyed Hosseini:
Parallel Loop Locality Analysis for Symbolic Thread Counts. 219-232
- Daon Park, Bernhard Egger:
Improving Throughput-oriented LLM Inference with CPU Computations. 233-245
- Pranav Dangi, Zhenyu Bai, Rohan Juneja, Dhananjaya Wijerathne, Tulika Mitra:
ZeD: A Generalized Accelerator for Variably Sparse Matrix Computations in ML. 246-257
- Sankeerth Durvasula, Adrian Zhao, Raymond Kiguru, Yushi Guan, Zhonghan Chen, Nandita Vijaykumar:
ACE: Efficient GPU Kernel Concurrency for Input-Dependent Irregular Computational Graphs. 258-270
- Alhad Daftardar, Brandon Reagen, Siddharth Garg:
SZKP: A Scalable Accelerator Architecture for Zero-Knowledge Proofs. 271-283
- Qidong Su, Jiacheng Yang, Gennady Pekhimenko:
BOOM: Use your Desktop to Accurately Predict the Performance of Large Deep Neural Networks. 284-296
- Magnus Östgren, Ioannis Sourdis:
A Parallel Hash Table for Streaming Applications. 297-308
- Enhyeok Jang, Dongho Ha, Seungwoo Choi, Youngmin Kim, Jaewon Kwon, Yongju Lee, Sungwoo Ahn, Hyungseok Kim, Won Woo Ro:
Recompiling QAOA Circuits on Various Rotational Directions. 309-324
- Sungbin Jang, Junhyeok Park, Osang Kwon, Yongho Lee, Seokin Hong:
Rethinking Page Table Structure for Fast Address Translation in GPUs: A Fixed-Size Hashed Page Table. 325-337
- Hyoungwook Nam, Raghavendra Pradyumna Pothukuchi, Bo Li, Nam Sung Kim, Josep Torrellas:
FriendlyFoe: Adversarial Machine Learning as a Practical Architectural Defense against Side Channel Attacks. 338-350
- Pranav Gokhale, Teague Tomesh, Martin Suchara, Fred Chong:
Faster and More Reliable Quantum SWAPs via Native Gates. 351-362