Antonio J. Peña
2020 – today
- 2024
- [j23] Ahmad Tarraf, Martin Schreiber, Alberto Cascajo, Jean-Baptiste Besnard, Marc-André Vef, Dominik Huber, Sonja Happ, André Brinkmann, David E. Singh, Hans-Christian Hoppe, Alberto Miranda, Antonio J. Peña, Rui Machado, Marta Garcia-Gasulla, Martin Schulz, Paul M. Carpenter, Simon Pickartz, Tiberiu Rotaru, Sergio Iserte, Víctor López, Jorge Ejarque, Heena Sirwani, Jesús Carretero, Felix Wolf:
Malleability in Modern HPC Systems: Current Experiences, Challenges, and Future Opportunities. IEEE Trans. Parallel Distributed Syst. 35(9): 1551-1564 (2024)
- 2023
- [c47] Kazuaki Matsumura, Simon Garcia De Gonzalo, Antonio J. Peña:
A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code. CC 2023: 110-121
- [c46] Muhammad Usman, Sergio Iserte, Roger Ferrer, Antonio J. Peña:
OpenMP Offloading to DPU. CLUSTER Workshops 2023: 64-65
- [c45] Muhammad Usman, Sergio Iserte, Roger Ferrer, Antonio José Peña:
DPU Offloading Programming with the OpenMP API. SC Workshops 2023: 884-891
- [i11] Kazuaki Matsumura, Simon Garcia De Gonzalo, Antonio J. Peña:
A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code. CoRR abs/2301.11389 (2023)
- [i10] Kazuaki Matsumura, Simon Garcia De Gonzalo, Antonio J. Peña:
ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code. CoRR abs/2306.13002 (2023)
- 2022
- [j22] Marc Jordà, Pedro Valero-Lara, Antonio J. Peña:
cuConv: CUDA implementation of convolution for CNN inference. Clust. Comput. 25(2): 1459-1473 (2022)
- [j21] Guillermo Lloret-Talavera, Marc Jordà, Harald Servat, Fabian Boemer, Chetan Chauhan, Shigeki Tomishima, Nilesh N. Shah, Antonio J. Peña:
Enabling Homomorphically Encrypted Inference for Large DNN Models. IEEE Trans. Computers 71(5): 1145-1155 (2022)
- [j20] Jidong Zhai, Min Si, Antonio J. Peña:
Guest Editorial. IEEE Trans. Parallel Distributed Syst. 33(11): 2644-2647 (2022)
- [c44] Marc Jordà, Siddharth Rai, Eduard Ayguadé, Jesús Labarta, Antonio J. Peña:
ecoHMEM: Improving Object Placement Methodology for Hybrid Memory Systems in HPC. CLUSTER 2022: 278-288
- [c43] Orestis Korakitis, Simon Garcia De Gonzalo, Nicolas L. Guidotti, João Pedro Barreto, José C. Monteiro, Antonio J. Peña:
Towards OmpSs-2 and OpenACC interoperation. PPoPP 2022: 433-434
- [c42] Orestis Korakitis, Simon Garcia de Gonzalo, Nicolas L. Guidotti, João Barreto, José Monteiro, Antonio J. Peña:
OmpSs-2 and OpenACC Interoperation. WACCPD@SC 2022: 11-21
- 2021
- [j19] Sergio Iserte, Rafael Mayo, Enrique S. Quintana-Ortí, Antonio J. Peña:
DMRlib: Easy-Coding and Efficient Resource Management for Job Malleability. IEEE Trans. Computers 70(9): 1443-1457 (2021)
- [c41] Nicolas L. Guidotti, Pedro Ceyrat, João Barreto, José Monteiro, Rodrigo Rodrigues, Ricardo Fonseca, Xavier Martorell, Antonio J. Peña:
Particle-In-Cell Simulation Using Asynchronous Tasking. Euro-Par 2021: 482-498
- [c40] Kazuaki Matsumura, Simon Garcia de Gonzalo, Antonio J. Peña:
JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization. HiPC 2021: 182-191
- [c39] Leonel Toledo, Pedro Valero-Lara, Jeffrey S. Vetter, Antonio J. Peña:
Static Graphs for Coding Productivity in OpenACC. HiPC 2021: 364-369
- [i9] Guillermo Lloret-Talavera, Marc Jordà, Harald Servat, Fabian Boemer, Chetan Chauhan, Shigeki Tomishima, Nilesh N. Shah, Antonio J. Peña:
Enabling Homomorphically Encrypted Inference for Large DNN Models. CoRR abs/2103.16139 (2021)
- [i8] Marc Jordà, Pedro Valero-Lara, Antonio J. Peña:
cuConv: A CUDA Implementation of Convolution for CNN Inference. CoRR abs/2103.16234 (2021)
- [i7] Nicolas L. Guidotti, Pedro Ceyrat, João Barreto, José Monteiro, Rodrigo Rodrigues, Ricardo Fonseca, Xavier Martorell, Antonio J. Peña:
Particle-In-Cell Simulation using Asynchronous Tasking. CoRR abs/2106.12485 (2021)
- [i6] Kazuaki Matsumura, Simon Garcia De Gonzalo, Antonio J. Peña:
JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization. CoRR abs/2110.14340 (2021)
- 2020
- [j18] Antonio J. Peña, Min Si:
Guest editorial: Special Issue on Applications and System Software for Hybrid Exascale Systems. Parallel Comput. 91 (2020)
- [j17] Adrián Castelló, Rafael Mayo Gual, Sangmin Seo, Pavan Balaji, Enrique S. Quintana-Ortí, Antonio J. Peña:
Analysis of Threading Libraries for High Performance Computing. IEEE Trans. Computers 69(9): 1279-1292 (2020)
- [i5] Harald Servat, Jesús Labarta, Hans-Christian Hoppe, Judit Giménez, Antonio J. Peña:
Understanding Memory Access Patterns Using the BSC Performance Tools. CoRR abs/2005.05872 (2020)
- [i4] Sergio Iserte, Rafael Mayo, Enrique S. Quintana-Ortí, Vicenç Beltran, Antonio J. Peña:
DMR API: Improving cluster productivity by turning applications into malleable. CoRR abs/2005.05910 (2020)
- [i3] Pedro Valero-Lara, Raül Sirvent, Antonio J. Peña, Jesús Labarta:
MPI+OpenMP Tasking Scalability for Multi-Morphology Simulations of the Human Brain. CoRR abs/2005.06332 (2020)
2010 – 2019
- 2019
- [j16] Marc Jordà, Pedro Valero-Lara, Antonio J. Peña:
Performance Evaluation of cuDNN Convolution Algorithms on NVIDIA Volta GPUs. IEEE Access 7: 70461-70473 (2019)
- [j15] Sergio Iserte, Héctor Martínez, Sergio Barrachina, Maribel Castillo, Rafael Mayo, Antonio J. Peña:
Dynamic reconfiguration of noniterative scientific applications: A case study with HPG aligner. Int. J. High Perform. Comput. Appl. 33(5) (2019)
- [j14] Pedro Valero-Lara, Raül Sirvent, Antonio J. Peña, Jesús Labarta:
MPI+OpenMP tasking scalability for multi-morphology simulations of the human brain. Parallel Comput. 84: 50-61 (2019)
- [j13] Kevin Sala, Xavier Teruel, Josep M. Pérez, Antonio J. Peña, Vicenç Beltran, Jesús Labarta:
Integrating blocking and non-blocking MPI primitives with task-based programming models. Parallel Comput. 85: 153-166 (2019)
- [c38] Antonio J. Peña:
Introduction to AsHES 2019. IPDPS Workshops 2019: 460
- [c37] Leonel Toledo, Antonio J. Peña, Sandra Catalán, Pedro Valero-Lara:
Tasking in Accelerators: Performance Evaluation. PDCAT 2019: 127-132
- [e2] Michela Taufer, Pavan Balaji, Antonio J. Peña:
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2019, Denver, Colorado, USA, November 17-19, 2019. ACM 2019, ISBN 978-1-4503-6229-0
- [i2] Kevin Sala, Xavier Teruel, Josep M. Pérez, Antonio J. Peña, Vicenç Beltran, Jesús Labarta:
Integrating Blocking and Non-Blocking MPI Primitives with Task-Based Programming Models. CoRR abs/1901.03271 (2019)
- 2018
- [j12] Pedro Valero-Lara, Ivan Martínez-Pérez, Raül Sirvent, Xavier Martorell, Antonio J. Peña:
cuThomasBatch and cuThomasVBatch, CUDA Routines to compute batch of tridiagonal systems on NVIDIA GPUs. Concurr. Comput. Pract. Exp. 30(24) (2018)
- [j11] Adrián Castelló, Rafael Mayo, Kevin Sala, Vicenç Beltran, Pavan Balaji, Antonio J. Peña:
On the adequacy of lightweight thread approaches for high-level parallel programming models. Future Gener. Comput. Syst. 84: 22-31 (2018)
- [j10] Sunita Chandrasekaran, Antonio J. Peña:
Special issue on applications for the heterogeneous computing era 2017. Parallel Comput. 77: 125-127 (2018)
- [j9] Harald Servat, Jesús Labarta, Hans-Christian Hoppe, Judit Giménez, Antonio J. Peña:
Understanding memory access patterns using the BSC performance tools. Parallel Comput. 78: 1-14 (2018)
- [j8] Sergio Iserte, Rafael Mayo, Enrique S. Quintana-Ortí, Vicenç Beltran, Antonio J. Peña:
DMR API: Improving cluster productivity by turning applications into malleable. Parallel Comput. 78: 54-66 (2018)
- [j7] Adrián Castelló, Antonio J. Peña, Rafael Mayo, Judit Planas, Enrique S. Quintana-Ortí, Pavan Balaji:
Exploring the interoperability of remote GPGPU virtualization using rCUDA and directive-based programming models. J. Supercomput. 74(11): 5628-5642 (2018)
- [j6] Min Si, Antonio J. Peña, Jeff R. Hammond, Pavan Balaji, Masamichi Takagi, Yutaka Ishikawa:
Dynamic Adaptable Asynchronous Progress Model for MPI RMA Multiphase Applications. IEEE Trans. Parallel Distributed Syst. 29(9): 1975-1989 (2018)
- [c36] Sunita Chandrasekaran, Antonio J. Peña, Min Si:
Introduction to AsHES 2018. IPDPS Workshops 2018: 520
- [c35] Sergio Rivas-Gomez, Antonio J. Peña, David Moloney, Erwin Laure, Stefano Markidis:
Exploring the Vision Processing Unit as Co-Processor for Inference. IPDPS Workshops 2018: 589-598
- [c34] Pedro Valero-Lara, Raül Sirvent, Antonio J. Peña, Xavier Martorell, Jesús Labarta:
MPI+OpenMP Tasking Scalability for the Simulation of the Human Brain: Human Brain Project. EuroMPI 2018: 5:1-5:8
- [c33] Kevin Sala, Jorge Bellón, Pau Farré, Xavier Teruel, Josep M. Pérez, Antonio J. Peña, Daniel J. Holmes, Vicenç Beltran, Jesús Labarta:
Improving the Interoperability between MPI and Task-Based Programming Models. EuroMPI 2018: 6:1-6:11
- [i1] Sergio Rivas-Gomez, Antonio J. Peña, David Moloney, Erwin Laure, Stefano Markidis:
Exploring the Vision Processing Unit as Co-processor for Inference. CoRR abs/1810.04150 (2018)
- 2017
- [j5] Sunita Chandrasekaran, Antonio J. Peña:
Special Issue on Topics on Heterogeneous Computing. Parallel Comput. 68: 1-2 (2017)
- [c32] Harald Servat, Antonio J. Peña, Germán Llort, Estanislao Mercadal, Hans-Christian Hoppe, Jesús Labarta:
Automating the Application Data Placement in Hybrid Memory Systems. CLUSTER 2017: 126-136
- [c31] Adrián Castelló, Sangmin Seo, Rafael Mayo, Pavan Balaji, Enrique S. Quintana-Ortí, Antonio J. Peña:
GLT: A Unified API for Lightweight Thread Libraries. Euro-Par 2017: 470-481
- [c30] Pedro Valero-Lara, Ivan Martínez-Pérez, Antonio J. Peña, Xavier Martorell, Raül Sirvent, Jesús Labarta:
cuHinesBatch: Solving Multiple Hines systems on GPUs Human Brain Project*. ICCS 2017: 566-575
- [c29] Adrián Castelló, Sangmin Seo, Rafael Mayo, Pavan Balaji, Enrique S. Quintana-Ortí, Antonio J. Peña:
GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations. ICPP 2017: 60-69
- [c28] Victor Garcia-Flores, Eduard Ayguadé, Antonio J. Peña:
Efficient Data Sharing on Heterogeneous Systems. ICPP 2017: 121-130
- [c27] Sergio Iserte, Rafael Mayo, Enrique S. Quintana-Ortí, Vicenç Beltran, Antonio J. Peña:
Efficient Scalable Computing through Flexible Applications and Adaptive Workloads. ICPP Workshops 2017: 180-189
- [c26] Harald Servat, Jesús Labarta, Hans-Christian Hoppe, Judit Giménez, Antonio J. Peña:
Integrating Memory Perspective into the BSC Performance Tools. ICPP Workshops 2017: 231-232
- [c25] Antonio J. Peña, Vicenç Beltran, Carsten Clauss, Thomas Moschny:
Supporting automatic recovery in offloaded distributed programming models through MPI-3 techniques. ICS 2017: 22:1-22:10
- [c24] Juan Gómez-Luna, Izzat El Hajj, Li-Wen Chang, Victor Garcia-Flores, Simon Garcia De Gonzalo, Thomas B. Jablin, Antonio J. Peña, Wen-mei W. Hwu:
Chai: Collaborative heterogeneous applications for integrated-architectures. ISPASS 2017: 43-54
- [c23] Pedro Valero-Lara, Ivan Martínez-Pérez, Raül Sirvent, Xavier Martorell, Antonio J. Peña:
NVIDIA GPUs Scalability to Solve Multiple (Batch) Tridiagonal Systems Implementation of cuThomasBatch. PPAM (1) 2017: 243-253
- [e1] Antonio J. Peña, Pavan Balaji, William Gropp, Rajeev Thakur:
Proceedings of the 24th European MPI Users' Group Meeting, EuroMPI/USA 2017, Chicago, IL, USA, September 25-28, 2017. ACM 2017, ISBN 978-1-4503-4849-2
- 2016
- [j4] Antonio J. Peña, Pavan Balaji:
A data-oriented profiler to assist in data partitioning and distribution for heterogeneous memory in HPC. Parallel Comput. 51: 46-55 (2016)
- [j3] Ashwin M. Aji, Antonio J. Peña, Pavan Balaji, Wu-chun Feng:
MultiCL: Enabling automatic scheduling for task-parallel workloads in OpenCL. Parallel Comput. 58: 37-55 (2016)
- [c22] Adrián Castelló, Antonio J. Peña, Sangmin Seo, Rafael Mayo, Pavan Balaji, Enrique S. Quintana-Ortí:
A Review of Lightweight Thread Approaches for High Performance Computing. CLUSTER 2016: 471-480
- [c21] Sayan Ghosh, Jeff R. Hammond, Antonio J. Peña, Pavan Balaji, Assefaw Hadish Gebremedhin, Barbara M. Chapman:
One-Sided Interface for Matrix Operations Using MPI-3 RMA: A Case Study with Elemental. ICPP 2016: 185-194
- [c20] Victor Garcia, Juan Gómez-Luna, Thomas Grass, Alejandro Rico, Eduard Ayguadé, Antonio J. Peña:
Evaluating the effect of last-level cache sharing on integrated GPU-CPU systems with heterogeneous applications. IISWC 2016: 168-177
- 2015
- [j2] Carlos Reaño, Federico Silla, Adrián Castelló, Antonio J. Peña, Rafael Mayo, Enrique S. Quintana-Ortí, José Duato:
Improving the user experience of the rCUDA remote GPU virtualization framework. Concurr. Comput. Pract. Exp. 27(14): 3746-3770 (2015)
- [c19] Min Si, Antonio J. Peña, Jeff R. Hammond, Pavan Balaji, Yutaka Ishikawa:
Scaling NWChem with Efficient and Portable Asynchronous Communication in MPI RMA. CCGRID 2015: 811-816
- [c18] Antonio J. Peña, Pavan Balaji:
Understanding Data Access Patterns Using Object-Differentiated Memory Profiling. CCGRID 2015: 1143-1146
- [c17] Ken Raffenetti, Antonio J. Peña, Pavan Balaji:
Toward Implementing Robust Support for Portals 4 Networks in MPICH. CCGRID 2015: 1173-1176
- [c16] Ashwin Mandayam Aji, Antonio J. Peña, Pavan Balaji, Wu-chun Feng:
Automatic Command Queue Scheduling for Task-Parallel Workloads in OpenCL. CLUSTER 2015: 42-51
- [c15] Adrián Castelló, Antonio J. Peña, Rafael Mayo, Pavan Balaji, Enrique S. Quintana-Ortí:
Exploring the Suitability of Remote GPGPU Virtualization for the OpenACC Programming Model Using rCUDA. CLUSTER 2015: 92-95
- [c14] Min Si, Antonio J. Peña, Jeff R. Hammond, Pavan Balaji, Masamichi Takagi, Yutaka Ishikawa:
Casper: An Asynchronous Progress Model for MPI RMA on Many-Core Architectures. IPDPS 2015: 665-676
- [c13] Antonio J. Peña, Wesley Bland, Pavan Balaji:
VOCL-FT: introducing techniques for efficient soft error coprocessor recovery. SC 2015: 71:1-71:12
- 2014
- [j1] Antonio J. Peña, Carlos Reaño, Federico Silla, Rafael Mayo, Enrique S. Quintana-Ortí, José Duato:
A complete and efficient CUDA-sharing solution for HPC clusters. Parallel Comput. 40(10): 574-588 (2014)
- [c12] Antonio J. Peña, Pavan Balaji:
Toward the efficient use of multiple explicitly managed memory subsystems. CLUSTER 2014: 123-131
- [c11] Carlos Reaño, Federico Silla, Antonio J. Peña, Gilad Shainer, Scot Schultz, Adrián Castelló, Enrique S. Quintana-Ortí, José Duato:
Boosting the performance of remote GPU virtualization using InfiniBand connect-IB and PCIe 3.0. CLUSTER 2014: 266-267
- [c10] Antonio J. Peña, Pavan Balaji:
A Framework for Tracking Memory Accesses in Scientific Applications. ICPP Workshops 2014: 235-244
- [c9] Min Si, Antonio J. Peña, Pavan Balaji, Masamichi Takagi, Yutaka Ishikawa:
MT-MPI: multithreaded MPI for many-core environments. ICS 2014: 125-134
- 2013
- [c8] Antonio J. Peña, Sadaf R. Alam:
Evaluation of Inter- and Intra-node Data Transfer Efficiencies between GPU Devices and their Impact on Scalable Applications. CCGRID 2013: 144-151
- [c7] Carlos Reaño, Rafael Mayo, Enrique S. Quintana-Ortí, Federico Silla, José Duato, Antonio J. Peña:
Influence of InfiniBand FDR on the performance of remote GPU virtualization. CLUSTER 2013: 1-8
- [c6] Antonio J. Peña, Ralf G. Correa Carvalho, James Dinan, Pavan Balaji, Rajeev Thakur, William Gropp:
Analysis of topology-dependent MPI performance on Gemini networks. EuroMPI 2013: 61-66
- 2012
- [c5] Carlos Reaño, Antonio J. Peña, Federico Silla, José Duato, Rafael Mayo, Enrique S. Quintana-Ortí:
CU2rCU: Towards the complete rCUDA remote GPU virtualization and sharing solution. HiPC 2012: 1-10
- 2011
- [c4] José Duato, Antonio J. Peña, Federico Silla, Juan Carlos Fernández, Rafael Mayo, Enrique S. Quintana-Ortí:
Enabling CUDA acceleration within virtual machines using rCUDA. HiPC 2011: 1-10
- [c3] José Duato, Antonio J. Peña, Federico Silla, Rafael Mayo, Enrique S. Quintana-Ortí:
Performance of CUDA Virtualized Remote GPUs in High Performance Clusters. ICPP 2011: 365-374
- 2010
- [c2] José Duato, Antonio J. Peña, Federico Silla, Rafael Mayo, Enrique S. Quintana-Ortí:
rCUDA: Reducing the number of GPU-based accelerators in high performance clusters. HPCS 2010: 224-231
2000 – 2009
- 2009
- [c1] José Duato, Francisco D. Igual, Rafael Mayo, Antonio J. Peña, Enrique S. Quintana-Ortí, Federico Silla:
An Efficient Implementation of GPU Virtualization in High Performance Clusters. Euro-Par Workshops 2009: 385-394
last updated on 2024-12-10 21:41 CET by the dblp team
all metadata released as open data under CC0 1.0 license