


Dhabaleswar K. Panda 0001
Person information
- affiliation: Ohio State University, Columbus, USA
2020 – today
- 2025
- [i20]Lang Xu, Quentin Anthony, Jacob Hatef, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Scaling Large Language Model Training on Frontier with Low-Bandwidth Partitioning. CoRR abs/2501.04266 (2025) - 2024
- [j65]Dhabaleswar K. Panda, Vipin Chaudhary, Eric Fosler-Lussier, Raghu Machiraju, Amit Majumdar, Beth Plale, Rajiv Ramnath, Ponnuswamy Sadayappan, Neelima Savardekar, Karen Tomko:
Creating intelligent cyberinfrastructure for democratizing AI. AI Mag. 45(1): 22-28 (2024) - [j64]Tu Tran, Bharath Ramesh, Benjamin Michalowicz, Mustafa Abduljabbar, Hari Subramoni, Aamir Shafi, Dhabaleswar K. Panda:
Accelerating communication with multi-HCA aware collectives in MPI. Concurr. Comput. Pract. Exp. 36(1) (2024) - [c515]Lang Xu, Quentin Anthony, Qinghua Zhou, Nawras Alnaasan, Radha Gulhane, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Accelerating Large Language Model Training with Hybrid GPU-based Compression. CCGrid 2024: 196-205 - [c514]Nawras Alnaasan, Horng-Ruey Huang, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Characterizing Communication in Distributed Parameter-Efficient Fine-Tuning for Large Language Models. HOTI 2024: 11-19 - [c513]Tu Tran, Goutham Kalikrishna Reddy Kuncham, Bharath Ramesh, Shulei Xu, Hari Subramoni, Mustafa Abduljabbar, Dhabaleswar K. Panda:
OHIO: Improving RDMA Network Scalability in MPI_Alltoall Through Optimized Hierarchical and Intra/Inter-Node Communication Overlap Design. HOTI 2024: 47-56 - [c512]Quentin Anthony, Benjamin Michalowicz
, Jacob Hatef, Lang Xu, Mustafa Abdul Jabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Demystifying the Communication Characteristics for Distributed Transformer Models. HOTI 2024: 57-65 - [c511]Quentin Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman, Stas Bekman, Junqi Yin, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
The Case for Co-Designing Model Architectures with Hardware. ICPP 2024: 84-96 - [c510]Dhabaleswar K. Panda, Hari Subramoni:
Message from the HCW 2024 Technical Program Committee Co-Chairs. IPDPS (Workshops) 2024: 1 - [c509]Dhabaleswar K. Panda, Hari Subramoni:
Message from the HCW 2024 Technical Program Committee Co-Chairs. IPDPS (Workshops) 2024: 4 - [c508]HooYoung Ahn, SeonYoung Kim, Yoo-Mi Park, Woojong Han, Nick Contini, Bharath Ramesh, Mustafa Abduljabbar, Dhabaleswar K. Panda:
Towards Accelerating k-NN with MPI and Near-Memory Processing. IPDPS (Workshops) 2024: 608-615 - [c507]Mingzhe Han, Goutham Kalikrishna Reddy Kuncham, Benjamin Michalowicz, Rahul Vaidya, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
PML-MPI: A Pre-Trained ML Framework for Efficient Collective Algorithm Selection in MPI. IPDPS (Workshops) 2024: 761-770 - [c506]Bharath Ramesh, Nick Contini, Nawras Alnaasan, Kaushik Kandadi Suresh, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
HINT: Designing Cache-Efficient MPI_Alltoall using Hybrid Memory Copy Ordering and Non-Temporal Instructions. IPDPS 2024: 802-813 - [c505]Jinghan Yao, Quentin Anthony, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference. IPDPS 2024: 915-925 - [c504]Qinghua Zhou, Bharath Ramesh, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
Accelerating MPI AllReduce Communication with Efficient GPU-Based Compression Schemes on Modern GPU Clusters. ISC 2024: 1-12 - [c503]Nicholas Contini, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
OMB-FPGA: A Microbenchmark Suite for FPGA-aware MPIs using OpenCL and SYCL. PEARC 2024: 1:1-1:9 - [c502]Radha Gulhane, Quentin Anthony, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Infer-HiRes: Accelerating Inference for High-Resolution Images with Quantization and Distributed Deep Learning. PEARC 2024: 5:1-5:9 - [c501]Chen-Chun Chen, Goutham Kalikrishna Reddy Kuncham, Pouya Kousha, Hari Subramoni, Dhabaleswar K. Panda:
Design and Implementation of an IPC-based Collective MPI Library for Intel GPUs. PEARC 2024: 17:1-17:9 - [c500]Tu Tran, Mustafa Abduljabbar, HooYoung Ahn, SeonYoung Kim, Yoo-Mi Park, Woojong Han, Shin-Young Ahn, Hari Subramoni, Dhabaleswar K. Panda:
OMB-CXL: A Micro-Benchmark Suite for Evaluating MPI Communication Utilizing Compute Express Link Memory Devices. PEARC 2024: 27:1-27:8 - [i19]Jinghan Yao, Quentin Anthony, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference. CoRR abs/2401.08383 (2024) - [i18]Quentin Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman, Stas Bekman, Junqi Yin, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
The Case for Co-Designing Model Architectures with Hardware. CoRR abs/2401.14489 (2024) - [i17]Quentin Anthony, Benjamin Michalowicz, Jacob Hatef, Lang Xu, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Demystifying the Communication Characteristics for Distributed Transformer Models. CoRR abs/2408.10197 (2024) - [i16]Jinghan Yao, Sam Ade Jacobs, Masahiro Tanaka, Olatunji Ruwase, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer. CoRR abs/2408.16978 (2024) - [i15]Lang Xu, Quentin Anthony, Qinghua Zhou, Nawras Alnaasan, Radha Gulhane, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Accelerating Large Language Model Training with Hybrid GPU-based Compression. CoRR abs/2409.02423 (2024) - 2023
- [j63]Kawthar Shafie Khorassani, Chen-Chun Chen, Bharath Ramesh, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
High Performance MPI over the Slingshot Interconnect. J. Comput. Sci. Technol. 38(1): 128-145 (2023) - [j62]Kaushik Kandadi Suresh
, Kawthar Shafie Khorassani, Chen-Chun Chen, Bharath Ramesh, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Network-Assisted Noncontiguous Transfers for GPU-Aware MPI Libraries. IEEE Micro 43(2): 131-139 (2023) - [c499]Pouya Kousha, Qinghua Zhou, Hari Subramoni, Dhabaleswar K. Panda:
Benchmarking Modern Databases for Storing and Profiling Very Large Scale HPC Communication Data. Bench 2023: 104-119 - [c498]Nawras Alnaasan
, Matthew Lieber, Aamir Shafi, Hari Subramoni, Scott A. Shearer, Dhabaleswar K. Panda:
HARVEST: High-Performance Artificial Vision Framework for Expert Labeling using Semi-Supervised Training. IEEE Big Data 2023: 139-148 - [c497]Kinan Al-Attar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
MPI4Spark Meets YARN: Enhancing MPI4Spark through YARN support for HPC. IEEE Big Data 2023: 2265-2274 - [c496]Chen-Chun Chen, Kawthar Shafie Khorassani, Goutham Kalikrishna Reddy Kuncham, Rahul Vaidya, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Implementing and Optimizing a GPU-aware MPI Library for Intel GPUs: Early Experiences. CCGrid 2023: 131-140 - [c495]Quentin Anthony, Lang Xu, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
ScaMP: Scalable Meta-Parallelism for Deep Learning Search. CCGridW 2023: 346-348 - [c494]Quentin Anthony, Lang Xu, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
ScaMP: Scalable Meta-Parallelism for Deep Learning Search. CCGrid 2023: 391-402 - [c493]Dhabaleswar K. Panda:
How to Educate HPC-Enabled AI and Data Science to Students and Professionals in a Holistic Manner? HiPCW 2023: 4 - [c492]Shulei Xu, Goutham Kalikrishna Reddy Kuncham, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
Optimized All-to-All Connection Establishment for High-Performance MPI Libraries Over InfiniBand. HiPC 2023: 41-50 - [c491]Jinghan Yao, Nawras Alnaasan, Tian Chen, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference. HiPC 2023: 107-116 - [c490]Bharath Ramesh, Goutham Kalikrishna Reddy Kuncham, Kaushik Kandadi Suresh, Rahul Vaidya, Nawras Alnaasan, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Designing In-network Computing Aware Reduction Collectives in MPI. HOTI 2023: 25-32 - [c489]Benjamin Michalowicz, Kaushik Kandadi Suresh, Hari Subramoni, Dhabaleswar K. Panda, Stephen W. Poole:
Battle of the BlueFields: An In-Depth Comparison of the BlueField-2 and BlueField-3 SmartNICs. HOTI 2023: 41-48 - [c488]Hyunho Ahn, Tian Chen, Nawras Alnaasan
, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
Performance Characterization of Using Quantization for DNN Inference on Edge Devices. ICFEC 2023: 1-6 - [c487]Nicholas Contini, Bharath Ramesh, Kaushik Kandadi Suresh, Tu Tran, Benjamin Michalowicz, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
Enabling Reconfigurable HPC through MPI-based Inter-FPGA Communication. ICS 2023: 477-487 - [c486]Kaushik Kandadi Suresh, Benjamin Michalowicz
, Bharath Ramesh, Nicholas Contini, Jinghan Yao, Shulei Xu, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
A Novel Framework for Efficient Offloading of Communication Operations to Bluefield SmartNICs. IPDPS 2023: 123-133 - [c485]Qinghua Zhou, Quentin Anthony, Lang Xu, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
Accelerating Distributed Deep Learning Training with Compression Assisted Allgather and Reduce-Scatter Communication. IPDPS 2023: 134-144 - [c484]Benjamin Michalowicz
, Kaushik Kandadi Suresh, Bharath Ramesh, Aamir Shafi, Hari Subramoni, Mustafa Abduljabbar, Dhabaleswar K. Panda:
In-Depth Evaluation of a Lower-Level Direct-Verbs API on InfiniBand-based Clusters: Early Experiences. IPDPS Workshops 2023: 354-363 - [c483]Kawthar Shafie Khorassani, Chen-Chun Chen, Hari Subramoni, Dhabaleswar K. Panda:
Designing and Optimizing GPU-aware Nonblocking MPI Neighborhood Collective Communication for PETSc*. IPDPS 2023: 646-656 - [c482]Quentin Anthony, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
MCR-DL: Mix-and-Match Communication Runtime for Deep Learning. IPDPS 2023: 996-1006 - [c481]Pouya Kousha, Vivekananda Sathu, Matthew Lieber, Hari Subramoni, Dhabaleswar K. Panda:
Democratizing HPC Access and Use with Knowledge Graphs. SC Workshops 2023: 242-251 - [c480]Chen-Chun Chen, Kawthar Shafie Khorassani, Pouya Kousha, Qinghua Zhou, Jinghan Yao, Hari Subramoni, Dhabaleswar K. Panda:
MPI-xCCL: A Portable MPI Library over Collective Communication Libraries for Various Accelerators. SC Workshops 2023: 847-854 - [c479]Pouya Kousha
, Arpan Jain, Ayyappa Kolli, Matthew Lieber, Mingzhe Han, Nicholas Contini, Hari Subramoni, Dhabaleswar K. Panda:
SAI: AI-Enabled Speech Assistant Interface for Science Gateways in HPC. ISC 2023: 402-424 - [c478]Benjamin Michalowicz, Kaushik Kandadi Suresh, Hari Subramoni, Dhabaleswar K. Panda, Steve Poole:
DPU-Bench: A Micro-Benchmark Suite to Measure Offload Efficiency Of SmartNICs. PEARC 2023: 94-101 - [c477]Samuel Khuvis, Karen Tomko, Scott R. Brozell, Chen-Chun Chen, Hari Subramoni, Dhabaleswar K. Panda:
Optimizing Amber for Device-to-Device GPU Communication. PEARC 2023: 200-205 - [i14]Hyunho Ahn, Tian Chen, Nawras Alnaasan, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
Performance Characterization of using Quantization for DNN Inference on Edge Devices: Extended Version. CoRR abs/2303.05016 (2023) - [i13]Quentin Anthony, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
MCR-DL: Mix-and-Match Communication Runtime for Deep Learning. CoRR abs/2303.08374 (2023) - [i12]Jinghan Yao, Nawras Alnaasan, Tian Chen, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference. CoRR abs/2305.13484 (2023) - 2022
- [j61]Arpan Jain, Nawras Alnaasan, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Optimizing Distributed DNN Training Using CPUs and BlueField-2 DPUs. IEEE Micro 42(2): 53-60 (2022) - [c476]Kinan Al-Attar, Aamir Shafi, Mustafa Abduljabbar
, Hari Subramoni, Dhabaleswar K. Panda:
Spark Meets MPI: Towards High-Performance Communication Framework for Spark using MPI. CLUSTER 2022: 71-81 - [c475]Apan Qasem, Hartwig Anzt, Eduard Ayguadé, Katharine J. Cahill, Ramon Canal, Jany Chan
, Eric Fosler-Lussier, Fritz Göbel, Arpan Jain, Marcel Koch, Mateusz Kuzak, Josep Llosa, Raghu Machiraju, Xavier Martorell, Pratik Nayak, Shameema Oottikkal, Marcin Ostasz, Dhabaleswar K. Panda, Dirk Pleiter, Rajiv Ramnath, Maria-Ribera Sancho, Alessio Sclocco, Aamir Shafi, Hanno Spreeuw, Hari Subramoni, Karen Tomko
:
Lightning Talks of EduHPC 2022. EduHPC@SC 2022: 42-49 - [c474]Qinghua Zhou, Quentin Anthony, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Accelerating Broadcast Communication with GPU Compression for Deep Learning Workloads. HIPC 2022: 22-31 - [c473]Nawras Alnaasan
, Arpan Jain, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
AccDP: Accelerated Data-Parallel Distributed DNN Training for Modern GPU-Based HPC Clusters. HIPC 2022: 32-41 - [c472]Bharath Ramesh, Qinghua Zhou, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda:
Designing Efficient Pipelined Communication Schemes using Compression in MPI Libraries. HIPC 2022: 95-99 - [c471]Kaushik Kandadi Suresh, Akshay Paniraja Guptha, Benjamin Michalowicz
, Bharath Ramesh, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Efficient Personalized and Non-Personalized Alltoall Communication for Modern Multi-HCA GPU-Based Clusters. HIPC 2022: 100-104 - [c470]Kaushik Kandadi Suresh, Kawthar Shafie Khorassani, Chen-Chun Chen, Bharath Ramesh, Mustafa Abduljabbar
, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Network Assisted Non-Contiguous Transfers for GPU-Aware MPI Libraries. HOTI 2022: 13-20 - [c469]Tu Tran
, Benjamin Michalowicz
, Bharath Ramesh, Hari Subramoni, Aamir Shafi, Dhabaleswar K. Panda:
Designing Hierarchical Multi-HCA Aware Allgather in MPI. ICPP Workshops 2022: 28:1-28:10 - [c468]Dhabaleswar K. Panda:
Challenges and Opportunities in Designing High-Performance and Scalable Middleware for HPC and AI: Past, Present, and Future. IPDPS 2022: 1 - [c467]Chen-Chun Chen, Kawthar Shafie Khorassani, Quentin G. Anthony, Aamir Shafi
, Hari Subramoni, Dhabaleswar K. Panda:
Highly Efficient Alltoall and Alltoallv Communication Algorithms for GPU Systems. IPDPS Workshops 2022: 24-33 - [c466]Shulei Xu, Aamir Shafi
, Hari Subramoni, Dhabaleswar K. Panda:
Arm meets Cloud: A Case Study of MPI Library Performance on AWS Arm-based HPC Cloud with Elastic Fabric Adapter. IPDPS Workshops 2022: 449-456 - [c465]Kinan Al-Attar, Aamir Shafi
, Hari Subramoni, Dhabaleswar K. Panda:
Towards Java-based HPC using the MVAPICH2 Library: Early Experiences. IPDPS Workshops 2022: 510-519 - [c464]Nawras Alnaasan
, Arpan Jain, Aamir Shafi
, Hari Subramoni, Dhabaleswar K. Panda:
OMB-Py: Python Micro-Benchmarks for Evaluating Performance of MPI Libraries on HPC Systems. IPDPS Workshops 2022: 870-879 - [c463]Qinghua Zhou, Pouya Kousha
, Quentin Anthony, Kawthar Shafie Khorassani, Aamir Shafi
, Hari Subramoni
, Dhabaleswar K. Panda:
Accelerating MPI All-to-All Communication with Online Compression on Modern GPU Clusters. ISC 2022: 3-25 - [c462]Pouya Kousha
, Arpan Jain, Ayyappa Kolli, Prasanna Sainath, Hari Subramoni
, Aamir Shafi
, Dhabaleswar K. Panda:
"Hey CAI" - Conversational AI Enabled User Interface for HPC Tools. ISC 2022: 87-108 - [c461]Arpan Jain, Aamir Shafi
, Quentin Anthony, Pouya Kousha
, Hari Subramoni, Dhabaleswar K. Panda:
Hy-Fi: Hybrid Five-Dimensional Parallel DNN Training on High-Performance GPU Clusters. ISC 2022: 109-130 - [c460]Kawthar Shafie Khorassani, Chen-Chun Chen, Bharath Ramesh, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
High Performance MPI over the Slingshot Interconnect: Early Experiences. PEARC 2022: 15:1-15:7 - [e8]Dhabaleswar K. Panda, Michael B. Sullivan:
Supercomputing Frontiers - 7th Asian Conference, SCFA 2022, Singapore, March 1-3, 2022, Proceedings. Lecture Notes in Computer Science 13214, Springer 2022, ISBN 978-3-031-10418-3 [contents] - 2021
- [j60]Dhabaleswar Kumar Panda, Hari Subramoni, Ching-Hsiang Chu, Mohammadreza Bayatpour:
The MVAPICH project: Transforming research into high-performance MPI library for HPC community. J. Comput. Sci. 52: 101208 (2021) - [c459]Kawthar Shafie Khorassani, Ching-Hsiang Chu, Quentin G. Anthony, Hari Subramoni, Dhabaleswar K. Panda:
Adaptive and Hierarchical Large Message All-to-all Communication Algorithms for Large-scale Dense GPU Systems. CCGRID 2021: 113-122 - [c458]Aamir Shafi, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda:
Efficient MPI-based Communication for GPU-Accelerated Dask Applications. CCGRID 2021: 277-286 - [c457]Bharath Ramesh, Jahanzeb Maqbool Hashmi, Shulei Xu, Aamir Shafi
, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda:
Towards Architecture-aware Hierarchical Communication Trees on Modern HPC Systems. HiPC 2021: 272-281 - [c456]Yuntian He, Saket Gurukar, Pouya Kousha
, Hari Subramoni, Dhabaleswar K. Panda, Srinivasan Parthasarathy:
DistMILE: A Distributed Multi-Level Framework for Scalable Graph Embedding. HiPC 2021: 282-291 - [c455]Kaushik Kandadi Suresh, Bharath Ramesh, Chen-Chun Chen, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Aamir Shafi
, Hari Subramoni, Dhabaleswar K. Panda:
Layout-aware Hardware-assisted Designs for Derived Data Types in MPI. HiPC 2021: 302-311 - [c454]Nick Sarkauskas, Mohammadreza Bayatpour, Tu Tran
, Bharath Ramesh, Hari Subramoni, Dhabaleswar K. Panda:
Large-Message Nonblocking MPI_Iallgather and MPI Ibcast Offload via BlueField-2 DPU. HiPC 2021: 388-393 - [c453]Arpan Jain, Nawras Alnaasan
, Aamir Shafi
, Hari Subramoni, Dhabaleswar K. Panda:
Accelerating CPU-based Distributed DNN Training on Modern HPC Clusters using BlueField-2 DPUs. HOTI 2021: 17-24 - [c452]Q. Zhou, C. Chu, N. S. Kumar, Pouya Kousha
, Seyedeh Mahdieh Ghazimirsaeed, Hari Subramoni
, Dhabaleswar K. Panda:
Designing High-Performance MPI Libraries with On-the-fly Compression for Modern GPU Clusters*. IPDPS 2021: 444-453 - [c451]Arpan Jain, Tim Moon, Tom Benson, Hari Subramoni, Sam Adé Jacobs, Dhabaleswar K. Panda, Brian Van Essen:
SUPER: SUb-Graph Parallelism for TransformERs. IPDPS 2021: 629-638 - [c450]Quentin Anthony, Lang Xu, Hari Subramoni, Dhabaleswar K. Panda:
Scaling Single-Image Super-Resolution Training on Modern HPC Clusters: Early Experiences. IPDPS Workshops 2021: 923-932 - [c449]Mohammadreza Bayatpour, Nick Sarkauskas, Hari Subramoni, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda:
BluesMPI: Efficient MPI Non-blocking Alltoall Offloading Designs on Modern BlueField Smart NICs. ISC 2021: 18-37 - [c448]Kawthar Shafie Khorassani, Jahanzeb Maqbool Hashmi, Ching-Hsiang Chu
, Chen-Chun Chen, Hari Subramoni, Dhabaleswar K. Panda:
Designing a ROCm-Aware MPI Library for AMD GPUs: Early Experiences. ISC 2021: 118-136 - [c447]Pouya Kousha
, Kamal Raj Sankarapandian Dayala Ganesh Ram, Mansa Kedia, Hari Subramoni
, Arpan Jain, Aamir Shafi
, Dhabaleswar K. Panda, Trey Dockendorf, Heechang Na, Karen Tomko
:
INAM: Cross-stack Profiling and Analysis of Communication in MPI-based Applications. PEARC 2021: 14:1-14:11 - [i11]Aamir Shafi, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda:
Efficient MPI-based Communication for GPU-Accelerated Dask Applications. CoRR abs/2101.08878 (2021) - [i10]Pouya Kousha, Quentin Anthony, Hari Subramoni, Dhabaleswar K. Panda:
Cross-layer Visualization and Profiling of Network and I/O Communication for HPC Clusters. CoRR abs/2109.08329 (2021) - [i9]Nawras Alnaasan, Arpan Jain, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
OMB-Py: Python Micro-Benchmarks for Evaluating Performance of MPI Libraries on HPC Systems. CoRR abs/2110.10659 (2021) - 2020
- [j59]Sourav Chakraborty, Ignacio Laguna
, Murali Emani, Kathryn M. Mohror, Dhabaleswar K. Panda, Martin Schulz
, Hari Subramoni:
EReinit: Scalable and efficient fault-tolerance for bulk-synchronous MPI applications. Concurr. Comput. Pract. Exp. 32(3) (2020) - [j58]Jahanzeb Maqbool Hashmi, Ching-Hsiang Chu
, Sourav Chakraborty, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda:
FALCON-X: Zero-copy MPI derived datatype processing on modern CPU and GPU architectures. J. Parallel Distributed Comput. 144: 1-13 (2020) - [j57]Ammar Ahmad Awan, Arpan Jain, Ching-Hsiang Chu
, Hari Subramoni, Dhabaleswar K. Panda:
Communication Profiling and Characterization of Deep-Learning Workloads on Clusters With High-Performance Interconnects. IEEE Micro 40(1): 35-43 (2020) - [c446]Mohammadreza Bayatpour, Seyedeh Mahdieh Ghazimirsaeed, Shulei Xu, Hari Subramoni, Dhabaleswar K. Panda:
Design and Characterization of InfiniBand Hardware Tag Matching in MPI. CCGRID 2020: 101-110 - [c445]Ching-Hsiang Chu
, Kawthar Shafie Khorassani, Qinghua Zhou, Hari Subramoni, Dhabaleswar K. Panda:
Dynamic Kernel Fusion for Bulk Non-contiguous Data Transfer on GPU Clusters. CLUSTER 2020: 130-141 - [c444]Aamir Shafi
, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda:
Blink: Towards Efficient RDMA-based Communication Coroutines for Parallel Python Applications. HiPC 2020: 111-120 - [c443]Ching-Hsiang Chu, Pouya Kousha, Ammar Ahmad Awan, Kawthar Shafie Khorassani, Hari Subramoni, Dhabaleswar K. Panda:
NV-group: link-efficient reduction for distributed deep learning on modern dense GPU systems. ICS 2020: 6:1-6:12 - [c442]Jahanzeb Maqbool Hashmi, Shulei Xu, Bharath Ramesh, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda:
Machine-agnostic and Communication-aware Designs for MPI on Emerging Architectures. IPDPS 2020: 32-41 - [c441]Amit Ruhela
, Shulei Xu, Karthik Vadambacheri Manian
, Hari Subramoni, Dhabaleswar K. Panda:
Analyzing and Understanding the Impact of Interconnect Performance on HPC, Big Data, and Deep Learning Applications: A Case Study with InfiniBand EDR and HDR. IPDPS Workshops 2020: 869-878 - [c440]Kaushik Kandadi Suresh, Bharath Ramesh, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda:
Performance Characterization of Network Mechanisms for Non-Contiguous Data Transfers in MPI. IPDPS Workshops 2020: 896-905 - [c439]Quentin Anthony, Ammar Ahmad Awan, Arpan Jain, Hari Subramoni, Dhabaleswar K. Panda:
Efficient Training of Semantic Image Segmentation on Summit using Horovod and MVAPICH2-GDR. IPDPS Workshops 2020: 1015-1023 - [c438]Bharath Ramesh, Kaushik Kandadi Suresh, Nick Sarkauskas, Mohammadreza Bayatpour, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda:
Scalable MPI Collectives using SHARP: Large Scale Performance Evaluation on the TACC Frontera System. ExaMPI@SC 2020: 11-20 - [c437]Seyedeh Mahdieh Ghazimirsaeed, Quentin Anthony, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Accelerating GPU-based Machine Learning in Python using MPI Library: A Case Study with MVAPICH2-GDR. MLHPC/AI4S@SC 2020: 17-28 - [c436]Shulei Xu, Seyedeh Mahdieh Ghazimirsaeed, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda:
MPI Meets Cloud: Case Study with Amazon EC2 and Microsoft Azure. IPDRM@SC 2020: 41-48 - [c435]Arpan Jain, Ammar Ahmad Awan, Asmaa M. Aljuhani, Jahanzeb Maqbool Hashmi, Quentin G. Anthony, Hari Subramoni, Dhabaleswar K. Panda, Raghu Machiraju, Anil Parwani:
GEMS: GPU-enabled memory-aware model-parallelism system for distributed DNN training. SC 2020: 45 - [c434]Samuel Khuvis
, Karen Tomko
, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda:
Exploring Hybrid MPI+Kokkos Tasks Programming Model. PAW-ATM@SC 2020: 66-73 - [c433]Ammar Ahmad Awan, Arpan Jain, Quentin Anthony, Hari Subramoni, Dhabaleswar K. Panda:
HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training with TensorFlow. ISC 2020: 83-103 - [c432]Mohammadreza Bayatpour, Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Kaushik Kandadi Suresh, Seyedeh Mahdieh Ghazimirsaeed, Bharath Ramesh, Hari Subramoni, Dhabaleswar K. Panda:
Communication-Aware Hardware-Assisted MPI Overlap Engine. ISC 2020: 517-535 - [c431]Dan Stanzione, John West, R. Todd Evans, Tommy Minyard, Omar Ghattas, Dhabaleswar K. Panda:
Frontera: The Evolution of Leadership Computing at the National Science Foundation. PEARC 2020: 106-111 - [c430]Pouya Kousha
, Kamal Raj S. D., Hari Subramoni
, Dhabaleswar K. Panda, Heechang Na, Trey Dockendorf, Karen Tomko
:
Accelerated Real-time Network Monitoring and Profiling at Scale using OSU INAM. PEARC 2020: 215-223 - [e7]Dhabaleswar K. Panda:
Supercomputing Frontiers - 6th Asian Conference, SCFA 2020, Singapore, February 24-27, 2020, Proceedings. Lecture Notes in Computer Science 12082, Springer 2020, ISBN 978-3-030-48841-3 [contents] - [i8]Ritu Arora, Xiaosong Li, Bonnie Hurwitz, Daniel Fay, Dhabaleswar K. Panda, Edward F. Valeev, Shaowen Wang, Shirley Moore, Sunita Chandrasekaran, Ting Cao, Holly Bik, Matthew Curry, Tanzima Z. Islam:
Future Directions of the Cyberinfrastructure for Sustained Scientific Innovation (CSSI) Program. CoRR abs/2010.15584 (2020)
2010 – 2019
- 2019
- [j56]Depai Qian, Dhabaleswar K. Panda:
CCF THPC inaugural issue editorial. CCF Trans. High Perform. Comput. 1(1): 1-2 (2019) - [j55]Amit Ruhela
, Hari Subramoni, Sourav Chakraborty, Mohammadreza Bayatpour, Pouya Kousha
, Dhabaleswar K. Panda:
Efficient design for MPI asynchronous progress without dedicated resources. Parallel Comput. 85: 13-26 (2019) - [j54]Ammar Ahmad Awan
, Karthik Vadambacheri Manian
, Ching-Hsiang Chu
, Hari Subramoni, Dhabaleswar K. Panda:
Optimized large-message broadcast for deep learning workloads: MPI, MPI+NCCL, or NCCL2? Parallel Comput. 85: 141-152 (2019) - [j53]Ching-Hsiang Chu, Xiaoyi Lu, Ammar Ahmad Awan, Hari Subramoni, Bracy Elton, Dhabaleswar K. Panda:
Exploiting Hardware Multicast and GPUDirect RDMA for Efficient Broadcast. IEEE Trans. Parallel Distributed Syst. 30(3): 575-588 (2019) - [c429]Karthik Vadambacheri Manian
, A. A. Ammar, Amit Ruhela
, Ching-Hsiang Chu, Hari Subramoni, Dhabaleswar K. Panda:
Characterizing CUDA Unified Memory (UM)-Aware MPI Designs on Modern GPU Architectures. GPGPU@ASPLOS 2019: 43-52 - [c428]Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda:
Design and Characterization of Shared Address Space MPI Collectives on Modern Architectures. CCGRID 2019: 410-419 - [c427]Ammar Ahmad Awan, Jeroen Bédorf, Ching-Hsiang Chu, Hari Subramoni, Dhabaleswar K. Panda:
Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation. CCGRID 2019: 498-507 - [c426]Arpan Jain, Ammar Ahmad Awan, Quentin Anthony, Hari Subramoni, Dhabaleswar K. Panda:
Performance Characterization of DNN Training using TensorFlow and PyTorch on Modern Clusters. CLUSTER 2019: 1-11 - [c425]Pouya Kousha
, Bharath Ramesh, Kaushik Kandadi Suresh, Ching-Hsiang Chu, Arpan Jain, Nick Sarkauskas, Hari Subramoni
, Dhabaleswar K. Panda:
Designing a Profiling and Visualization Tool for Scalable and In-depth Analysis of High-Performance GPU Clusters. HiPC 2019: 93-102 - [c424]Dipti Shankar, Xiaoyi Lu, Dhabaleswar K. Panda:
SCOR-KV: SIMD-Aware Client-Centric and Optimistic RDMA-Based Key-Value Store for Emerging CPU Architectures. HiPC 2019: 257-266 - [c423]Ching-Hsiang Chu, Jahanzeb Maqbool Hashmi, Kawthar Shafie Khorassani, Hari Subramoni, Dhabaleswar K. Panda:
High-Performance Adaptive MPI Derived Datatype Communication for Modern Multi-GPU Systems. HiPC 2019: 267-276 - [c422]Sourav Chakraborty, Shulei Xu, Hari Subramoni, Dhabaleswar K. Panda:
Designing Scalable and High-Performance MPI Libraries on Amazon Elastic Fabric Adapter. Hot Interconnects 2019: 40-44 - [c421]Ammar Ahmad Awan, Arpan Jain, Ching-Hsiang Chu, Hari Subramoni, Dhabaleswar K. Panda:
Communication Profiling and Characterization of Deep Learning Workloads on Clusters with High-Performance Interconnects. Hot Interconnects 2019: 49-53 - [c420]Haiyang Shi, Xiaoyi Lu, Dipti Shankar, Dhabaleswar K. Panda:
UMR-EC: A Unified and Multi-Rail Erasure Coding Library for High-Performance Distributed Storage Systems. HPDC 2019: 219-230 - [c419]Dipti Shankar, Xiaoyi Lu, Dhabaleswar K. Panda:
SimdHT-Bench: Characterizing SIMD-Aware Hash Table Designs on Emerging CPU Architectures. IISWC 2019: 178-188 - [c418]Jie Zhang, Xiaoyi Lu, Ching-Hsiang Chu, Dhabaleswar K. Panda:
C-GDR: High-Performance Container-Aware GPUDirect MPI Communication Schemes on RDMA Networks. IPDPS 2019: 242-251 - [c417]Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda:
FALCON: Efficient Designs for Zero-Copy MPI Datatype Processing on Emerging Architectures. IPDPS 2019: 355-364 - [c416]Xiaoyi Lu, Jianfeng Zhan, Dhabaleswar K. Panda:
Introduction to HPBDC 2019. IPDPS Workshops 2019: 394 - [c415]Dhabaleswar K. Panda, Ammar Ahmad Awan, Hari Subramoni:
High performance distributed deep learning: a beginner's guide. PPoPP 2019: 452-454 - [c414]Amit Ruhela
, Bharath Ramesh, Sourav Chakraborty, Hari Subramoni, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda:
Leveraging Network-level parallelism with Multiple Process-Endpoints for MPI Broadcast. IPDRM@SC 2019: 34-41 - [c413]Shulei Xu, Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Hari Subramoni, Dhabaleswar K. Panda:
Design and Evaluation of Shared Memory Communication Benchmarks on Emerging Architectures using MVAPICH2. IPDRM@SC 2019: 42-49 - [c412]Arpan Jain, Ammar Ahmad Awan, Hari Subramoni, Dhabaleswar K. Panda:
Scaling TensorFlow, PyTorch, and MXNet using MVAPICH2 for High-Performance Deep Learning on Frontera. DLS@SC 2019: 76-83 - [c411]Kawthar Shafie Khorassani, Ching-Hsiang Chu
, Hari Subramoni, Dhabaleswar K. Panda:
Performance Evaluation of MPI Libraries on GPU-Enabled OpenPOWER Architectures: Early Experiences. ISC Workshops 2019: 361-378 - [i7]Ammar Ahmad Awan, Arpan Jain, Quentin Anthony, Hari Subramoni, Dhabaleswar K. Panda:
HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow. CoRR abs/1911.05146 (2019) - 2018
- [j52]Md. Wasi-ur-Rahman, Nusrat Sharmin Islam, Xiaoyi Lu, Dipti Shankar, Dhabaleswar K. Panda:
MR-Advisor: A comprehensive tuning, profiling, and prediction tool for MapReduce execution frameworks on HPC clusters. J. Parallel Distributed Comput. 120: 237-250 (2018) - [j51]Dhabaleswar K. Panda, Xiaoyi Lu
, Hari Subramoni:
Networking and communication challenges for post-exascale systems. Frontiers Inf. Technol. Electron. Eng. 19(10): 1230-1235 (2018) - [j50]Srinivasan Ramesh, Aurèle Mahéo, Sameer Shende, Allen D. Malony, Hari Subramoni, Amit Ruhela
, Dhabaleswar K. Panda:
MPI performance engineering with the MPI tool interface: The integration of MVAPICH and TAU. Parallel Comput. 77: 19-37 (2018) - [j49]Xiaoyi Lu
, Haiyang Shi, Rajarshi Biswas, M. Haseeb Javed
, Dhabaleswar K. Panda:
DLoBD: A Comprehensive Study of Deep Learning over Big Data Stacks on HPC Clusters. IEEE Trans. Multi Scale Comput. Syst. 4(4): 635-648 (2018) - [c410]Haiyang Shi, Xiaoyi Lu, Dhabaleswar K. Panda:
EC-Bench: Benchmarking Onload and Offload Erasure Coders on Modern Hardware Architectures. Bench 2018: 215-230 - [c409]Xiaoyi Lu, Dipti Shankar, Haiyang Shi, Dhabaleswar K. Panda:
Spark-uDAPL: Cost-Saving Big Data Analytics on Microsoft Azure Cloud with RDMA Networks*. IEEE BigData 2018: 321-326 - [c408]Haiyang Shi, Xiaoyi Lu, Dipti Shankar, Dhabaleswar K. Panda:
High-Performance Multi-Rail Erasure Coding Library over Modern Data Center Architectures: Early Experiences. SoCC 2018: 530-531 - [c407]Mohammadreza Bayatpour, Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Hari Subramoni, Pouya Kousha, Dhabaleswar K. Panda:
SALaR: Scalable and Adaptive Designs for Large Message Reduction Collectives. CLUSTER 2018: 12-23 - [c406]M. Haseeb Javed, Xiaoyi Lu, Dhabaleswar K. Panda:
Cutting the Tail: Designing High Performance Message Brokers to Reduce Tail Latencies in Stream Processing. CLUSTER 2018: 223-233 - [c405]Rajarshi Biswas, Xiaoyi Lu, Dhabaleswar K. Panda:
Accelerating TensorFlow with Adaptive RDMA-Based gRPC. HiPC 2018: 2-11 - [c404]Ammar Ahmad Awan, Ching-Hsiang Chu, Hari Subramoni, Xiaoyi Lu, Dhabaleswar K. Panda:
OC-DNN: Exploiting Advanced Unified Memory Capabilities in CUDA 9 and Volta GPUs for Out-of-Core DNN Training. HiPC 2018: 143-152 - [c403]Xiaoyi Lu, Jianfeng Zhan, Dhabaleswar K. Panda:
Introduction to HPBDC 2018. IPDPS Workshops 2018: 446 - [c402]Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda:
Designing Efficient Shared Address Space Reduction Collectives for Multi-/Many-cores. IPDPS 2018: 1020-1029 - [c401]Ammar Ahmad Awan, Ching-Hsiang Chu, Hari Subramoni, Dhabaleswar K. Panda:
Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL? EuroMPI 2018: 2:1-2:9 - [c400]Mingzhe Li, Xiaoyi Lu, Hari Subramoni, Dhabaleswar K. Panda:
Multi-Threading and Lock-Free MPI RMA Based Graph Processing on KNL and POWER Architectures. EuroMPI 2018: 4:1-4:10 - [c399]Amit Ruhela, Hari Subramoni, Sourav Chakraborty, Mohammadreza Bayatpour, Pouya Kousha, Dhabaleswar K. Panda:
Efficient Asynchronous Communication Progress for MPI without Dedicated Resources. EuroMPI 2018: 14:1-14:11 - [c398]Sourav Chakraborty, Mohammadreza Bayatpour, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda:
Cooperative rendezvous protocols for improved performance and overlap. SC 2018: 28:1-28:13 - [c397]Shashank Gugnani, Xiaoyi Lu, Dhabaleswar K. Panda:
Analyzing, Modeling, and Provisioning QoS for NVMe SSDs. UCC 2018: 247-256 - [e6]Esam El-Araby, Dhabaleswar K. Panda, Sandra Gesing, Amy W. Apon, Volodymyr V. Kindratenko, Massimo Cafaro, Alfredo Cuzzocrea:
18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID 2018, Washington, DC, USA, May 1-4, 2018. IEEE Computer Society 2018, ISBN 978-1-5386-5815-4 - [i6]Rajarshi Biswas, Xiaoyi Lu, Dhabaleswar K. Panda:
Designing a Micro-Benchmark Suite to Evaluate gRPC for TensorFlow: Early Experiences. CoRR abs/1804.01138 (2018) - [i5]Ammar Ahmad Awan, Jeroen Bédorf, Ching-Hsiang Chu, Hari Subramoni, Dhabaleswar K. Panda:
Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation. CoRR abs/1810.11112 (2018) - 2017
- [j48]Xiaoyi Lu, Dipti Shankar, Dhabaleswar K. Panda:
Scalable and Distributed Key-Value Store-based Data Management Using RDMA-Memcached. IEEE Data Eng. Bull. 40(1): 50-61 (2017) - [j47]Md. Wasi-ur-Rahman, Nusrat Sharmin Islam, Xiaoyi Lu, Dhabaleswar K. Panda:
A Comprehensive Study of MapReduce Over Lustre for Intermediate Data Placement and Shuffle Strategies on HPC Clusters. IEEE Trans. Parallel Distributed Syst. 28(3): 633-646 (2017) - [c396]M. Haseeb Javed, Xiaoyi Lu, Dhabaleswar K. Panda:
Characterization of Big Data Stream Processing Pipeline: A Case Study using Flink and Kafka. BDCAT 2017: 1-10 - [c395]Shashank Gugnani, Xiaoyi Lu, Houliang Qi, Li Zha, Dhabaleswar K. Panda:
Characterizing and accelerating indexing techniques on distributed ordered tables. IEEE BigData 2017: 173-182 - [c394]Xiaoyi Lu, Haiyang Shi, Dipti Shankar, Dhabaleswar K. Panda:
Performance characterization and acceleration of big data workloads on OpenPOWER system. IEEE BigData 2017: 213-222 - [c393]Md. Wasi-ur-Rahman, Nusrat Sharmin Islam, Xiaoyi Lu, Dhabaleswar K. Panda:
NVMD: Non-volatile memory assisted design for accelerating MapReduce and DAG execution frameworks on HPC systems. IEEE BigData 2017: 369-374 - [c392]Shashank Gugnani, Xiaoyi Lu, Dhabaleswar K. Panda:
Swift-X: Accelerating OpenStack Swift with RDMA for Building an Efficient HPC Cloud. CCGrid 2017: 238-247 - [c391]Sourav Chakraborty, Hari Subramoni, Dhabaleswar K. Panda:
Contention-Aware Kernel-Assisted MPI Collectives for Multi-/Many-Core Systems. CLUSTER 2017: 13-24 - [c390]Hari Subramoni, Xiaoyi Lu, Dhabaleswar K. Panda:
A Scalable Network-Based Performance Analysis Tool for MPI on Large-Scale HPC Systems. CLUSTER 2017: 354-358 - [c389]Mingzhe Li, Xiaoyi Lu, Hari Subramoni, Dhabaleswar K. Panda:
Designing Registration Caching Free High-Performance MPI Library with Implicit On-Demand Paging (ODP) of InfiniBand. HiPC 2017: 62-71 - [c388]Jahanzeb Maqbool Hashmi, Khaled Hamidouche, Hari Subramoni, Dhabaleswar K. Panda:
Kernel-Assisted Communication Engine for MPI on Emerging Manycore Processors. HiPC 2017: 84-93 - [c387]Shashank Gugnani, Xiaoyi Lu, Franco Pestilli, Cesar F. Caiafa, Dhabaleswar K. Panda:
MPI-LiFE: Designing High-Performance Linear Fascicle Evaluation of Brain Connectome with MPI. HiPC 2017: 213-222 - [c386]Xiaoyi Lu, Haiyang Shi, M. Haseeb Javed, Rajarshi Biswas, Dhabaleswar K. Panda:
Characterizing Deep Learning over Big Data (DLoBD) Stacks on RDMA-Capable Networks. Hot Interconnects 2017: 87-94 - [c385]Dipti Shankar, Xiaoyi Lu, Dhabaleswar K. Panda:
High-Performance and Resilient Key-Value Store with Online Erasure Coding for Big Data Workloads. ICDCS 2017: 527-537 - [c384]Akshay Venkatesh, Khaled Hamidouche, Sreeram Potluri, Davide Rossetti, Ching-Hsiang Chu, Dhabaleswar K. Panda:
MPI-GDS: High Performance MPI Designs with GPUDirect-aSync for CPU-GPU Control Flow Decoupling. ICPP 2017: 151-160 - [c383]Ching-Hsiang Chu, Xiaoyi Lu, Ammar Ahmad Awan, Hari Subramoni, Jahanzeb Maqbool Hashmi, Bracy Elton, Dhabaleswar K. Panda:
Efficient and Scalable Multi-Source Streaming Broadcast on GPU Clusters for Deep Learning. ICPP 2017: 161-170 - [c382]Jie Zhang, Xiaoyi Lu, Dhabaleswar K. Panda:
High-Performance Virtual Machine Migration Framework for MPI Applications on SR-IOV Enabled InfiniBand Clusters. IPDPS 2017: 143-152 - [c381]Xiaoyi Lu, Jianfeng Zhan, Dhabaleswar K. Panda:
Introduction to HPBDC Workshop. IPDPS Workshops 2017: 1020 - [c380]Jahanzeb Maqbool Hashmi, Mingzhe Li, Hari Subramoni, Dhabaleswar K. Panda:
Exploiting and Evaluating OpenSHMEM on KNL Architecture. OpenSHMEM 2017: 143-158 - [c379]Ammar Ahmad Awan, Khaled Hamidouche, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda:
S-Caffe: Co-designing MPI Runtimes and Caffe for Scalable Deep Learning on Modern GPU Clusters. PPoPP 2017: 193-205 - [c378]Srinivasan Ramesh, Aurèle Mahéo, Sameer Shende, Allen D. Malony, Hari Subramoni, Dhabaleswar K. Panda:
MPI performance engineering with the MPI tool interface: the integration of MVAPICH and TAU. EuroMPI/USA 2017: 16:1-16:11 - [c377]Ammar Ahmad Awan, Hari Subramoni, Dhabaleswar K. Panda:
An In-depth Performance Characterization of CPU- and GPU-based DNN Training on Modern Architectures. MLHPC@SC 2017: 8:1-8:8 - [c376]Mohammadreza Bayatpour, Sourav Chakraborty, Hari Subramoni, Xiaoyi Lu, Dhabaleswar K. Panda:
Scalable reduction collectives with data partitioning-based multi-leader design. SC 2017: 64 - [c375]Hari Subramoni, Sourav Chakraborty, Dhabaleswar K. Panda:
Designing Dynamic and Adaptive MPI Point-to-Point Communication Protocols for Efficient Overlap of Computation and Communication. ISC 2017: 334-354 - [c374]Jie Zhang, Xiaoyi Lu, Dhabaleswar K. Panda:
Is Singularity-based Container Technology Ready for Running MPI Applications on HPC Clouds? UCC 2017: 151-160 - [c373]Dhabaleswar K. Panda, Xiaoyi Lu:
HPC Meets Cloud: Building Efficient Clouds for HPC, Big Data, and Deep Learning Middleware and Applications. UCC 2017: 189-190 - [c372]Jie Zhang, Xiaoyi Lu, Dhabaleswar K. Panda:
Designing Locality and NUMA Aware MPI Runtime for Nested Virtualization based HPC Cloud with SR-IOV Enabled InfiniBand. VEE 2017: 187-200 - [c371]Dan Stanzione, Bill Barth, Niall Gaffney, Kelly P. Gaither, Chris Hempel, Tommy Minyard, Susan Mehringer, Eric A. Wernert, H. Tufo, Dhabaleswar K. Panda, Patricia J. Teller:
Stampede 2: The Evolution of an XSEDE Supercomputer. PEARC 2017: 15:1-15:8 - [p1]Xiaoyi Lu, Jie Zhang, Dhabaleswar K. Panda:
Building Efficient HPC Cloud with SR-IOV-Enabled InfiniBand: The MVAPICH2 Approach. Research Advances in Cloud Computing 2017: 115-140 - [i4]Ammar Ahmad Awan, Ching-Hsiang Chu, Hari Subramoni, Dhabaleswar K. Panda:
Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL? CoRR abs/1707.09414 (2017) - 2016
- [j46]Khaled Hamidouche, Akshay Venkatesh, Ammar Ahmad Awan, Hari Subramoni, Ching-Hsiang Chu, Dhabaleswar K. Panda:
CUDA-Aware OpenSHMEM: Extensions and Designs for High Performance OpenSHMEM on GPU Clusters. Parallel Comput. 58: 27-36 (2016) - [j45]Dipti Shankar, Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat S. Islam, Dhabaleswar K. Panda:
Characterizing and benchmarking stand-alone Hadoop MapReduce on modern HPC clusters. J. Supercomput. 72(12): 4573-4600 (2016) - [c370]Shashank Gugnani, Xiaoyi Lu, Dhabaleswar K. Panda:
Performance characterization of hadoop workloads on SR-IOV-enabled virtualized InfiniBand clusters. BDCAT 2016: 36-45 - [c369]Nusrat Sharmin Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dhabaleswar K. Panda:
Efficient data access strategies for Hadoop and Spark on HPC cluster with heterogeneous storage. IEEE BigData 2016: 223-232 - [c368]Xiaoyi Lu, Dipti Shankar, Shashank Gugnani, Dhabaleswar K. Panda:
High-performance design of apache spark with RDMA and its benefits on various workloads. IEEE BigData 2016: 253-262 - [c367]Dipti Shankar, Xiaoyi Lu, Dhabaleswar K. Panda:
Boldio: A hybrid and resilient burst-buffer over lustre for accelerating big data I/O. IEEE BigData 2016: 404-409 - [c366]Sourav Chakraborty, Hari Subramoni, Jonathan L. Perkins, Dhabaleswar K. Panda:
SHMEMPMI - Shared Memory Based PMI for Improved Performance and Scalability. CCGrid 2016: 60-69 - [c365]Ching-Hsiang Chu, Khaled Hamidouche, Akshay Venkatesh, Ammar Ahmad Awan, Dhabaleswar K. Panda:
CUDA Kernel Based Collective Reduction Operations on Large-scale GPU Clusters. CCGrid 2016: 726-735 - [c364]Dip Sankar Banerjee, Khaled Hamidouche, Dhabaleswar K. Panda:
Re-Designing CNTK Deep Learning Framework on Modern GPU Enabled Clusters. CloudCom 2016: 144-151 - [c363]Shashank Gugnani, Xiaoyi Lu, Dhabaleswar K. Panda:
Designing Virtualization-Aware and Automatic Topology Detection Schemes for Accelerating Hadoop on SR-IOV-Enabled Clouds. CloudCom 2016: 152-159 - [c362]Xiaoyi Lu, Dipti Shankar, Shashank Gugnani, Hari Subramoni, Dhabaleswar K. Panda:
Impact of HPC Cloud Networking Technologies on Accelerating Hadoop RPC and HBase. CloudCom 2016: 310-317 - [c361]Mohammadreza Bayatpour, Hari Subramoni, Sourav Chakraborty, Dhabaleswar K. Panda:
Adaptive and Dynamic Design for MPI Tag Matching. CLUSTER 2016: 1-10 - [c360]Jie Zhang, Xiaoyi Lu, Sourav Chakraborty, Dhabaleswar K. Panda:
Slurm-V: Extending Slurm for Building Efficient HPC Cloud with SR-IOV and IVShmem. Euro-Par 2016: 349-362 - [c359]Mingzhe Li, Xiaoyi Lu, Khaled Hamidouche, Jie Zhang, Dhabaleswar K. Panda:
Mizan-RMA: Accelerating Mizan Graph Processing Framework with MPI RMA. HiPC 2016: 42-51 - [c358]Khaled Hamidouche, Ammar Ahmad Awan, Akshay Venkatesh, Dhabaleswar K. Panda:
CUDA M3: Designing Efficient CUDA Managed Memory-Aware MPI by Exploiting GDR and IPC. HiPC 2016: 52-61 - [c357]Jahanzeb Maqbool Hashmi, Khaled Hamidouche, Dhabaleswar K. Panda:
Enabling Performance Efficient Runtime Support for Hybrid MPI+UPC++ Programming Models. HPCC/SmartCity/DSS 2016: 1180-1187 - [c356]Jiajun Cao, Kapil Arya, Rohan Garg, L. Shawn Matott, Dhabaleswar K. Panda, Hari Subramoni, Jérôme Vienne, Gene Cooperman:
System-Level Scalable Checkpoint-Restart for Petascale Computing. ICPADS 2016: 932-941 - [c355]Jie Zhang, Xiaoyi Lu, Dhabaleswar K. Panda:
High Performance MPI Library for Container-Based HPC Cloud on InfiniBand Clusters. ICPP 2016: 268-277 - [c354]Nusrat Sharmin Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dhabaleswar K. Panda:
High Performance Design for HDFS with Byte-Addressability of NVM and RDMA. ICS 2016: 8:1-8:14 - [c353]Dipti Shankar, Xiaoyi Lu, Nusrat S. Islam, Md. Wasi-ur-Rahman, Dhabaleswar K. Panda:
High-Performance Hybrid Key-Value Store on Modern Clusters with RDMA Interconnects and SSDs: Non-blocking Extensions, Designs, and Benefits. IPDPS 2016: 393-402 - [c352]Ching-Hsiang Chu, Khaled Hamidouche, Akshay Venkatesh, Dip Sankar Banerjee, Hari Subramoni, Dhabaleswar K. Panda:
Exploiting Maximal Overlap for Non-Contiguous Data Movement Processing on Modern GPU-Enabled Systems. IPDPS 2016: 983-992 - [c351]Dhabaleswar K. Panda, Jianfeng Zhan, Xiaoyi Lu:
HPBDC Introduction and Committees. IPDPS Workshops 2016: 1596 - [c350]Jie Zhang, Xiaoyi Lu, Dhabaleswar K. Panda:
Performance Characterization of Hypervisor-and Container-Based Virtualization for HPC on SR-IOV Enabled InfiniBand Clusters. IPDPS Workshops 2016: 1777-1784 - [c349]Dip Sankar Banerjee, Khaled Hamidouche, Dhabaleswar K. Panda:
Designing high performance communication runtime for GPU managed memory: early experiences. GPGPU@PPoPP 2016: 82-91 - [c348]A. A. Awan, Khaled Hamidouche, Akshay Venkatesh, Dhabaleswar K. Panda:
Efficient Large Message Broadcast using NCCL and CUDA-Aware MPI for Deep Learning. EuroMPI 2016: 15-22 - [c347]Ching-Hsiang Chu, Khaled Hamidouche, Hari Subramoni, Akshay Venkatesh, Bracy Elton, Dhabaleswar K. Panda:
Designing High Performance Heterogeneous Broadcast for Streaming Applications on GPU Clusters. SBAC-PAD 2016: 59-66 - [c346]Md. Wasi-ur-Rahman, Nusrat Sharmin Islam, Xiaoyi Lu, Dipti Shankar, Dhabaleswar K. Panda:
MR-Advisor: A Comprehensive Tuning Tool for Advising HPC Users to Accelerate MapReduce Applications on Supercomputers. SBAC-PAD 2016: 198-205 - [c345]Khaled Hamidouche, Jie Zhang, Dhabaleswar K. Panda, Karen Tomko:
OpenSHMEM Non-blocking Data Movement Operations with MVAPICH2-X: Early Experiences. PAW@SC 2016: 9-16 - [c344]Md. Wasi-ur-Rahman, Nusrat Sharmin Islam, Xiaoyi Lu, Dhabaleswar K. Panda:
Can Non-volatile Memory Benefit MapReduce Applications on HPC Clusters? PDSW-DISCS@SC 2016: 19-24 - [c343]Ching-Hsiang Chu, Khaled Hamidouche, Hari Subramoni, Akshay Venkatesh, Bracy Elton, Dhabaleswar K. Panda:
Efficient Reliability Support for Hardware Multicast-Based Broadcast in GPU-enabled Streaming Applications. COMHPC@SC 2016: 29-38 - [c342]Mingzhe Li, Khaled Hamidouche, Xiaoyi Lu, Hari Subramoni, Jie Zhang, Dhabaleswar K. Panda:
Designing MPI library with on-demand paging (ODP) of infiniband: challenges and benefits. SC 2016: 433-443 - [c341]Hari Subramoni, Albert Mathews Augustine, Mark Daniel Arnold, Jonathan L. Perkins, Xiaoyi Lu, Khaled Hamidouche, Dhabaleswar K. Panda:
INAM2: InfiniBand Network Analysis and Monitoring with MPI. ISC 2016: 300-320 - [c340]Mahidhar Tatineni, Xiaoyi Lu, Dong Ju Choi, Amitava Majumdar, Dhabaleswar K. Panda:
Experiences and Benefits of Running RDMA Hadoop and Spark on SDSC Comet. XSEDE 2016: 23:1-23:5 - [i3]Jiajun Cao, Kapil Arya, Rohan Garg, L. Shawn Matott, Dhabaleswar K. Panda, Hari Subramoni, Jérôme Vienne, Gene Cooperman:
System-level Scalable Checkpoint-Restart for Petascale Computing. CoRR abs/1607.07995 (2016) - 2015
- [c339]Nusrat Sharmin Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dipti Shankar, Dhabaleswar K. Panda:
Performance characterization and acceleration of in-memory file systems for Hadoop and Spark applications on HPC clusters. IEEE BigData 2015: 243-252 - [c338]Dipti Shankar, Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat S. Islam, Dhabaleswar K. Panda:
Benchmarking key-value stores on high-performance storage and interconnects for web-scale workloads. IEEE BigData 2015: 539-544 - [c337]Adithya Bhat, Nusrat Sharmin Islam, Xiaoyi Lu, Md. Wasi-ur-Rahman, Dipti Shankar, Dhabaleswar K. Panda:
A Plugin-Based Approach to Exploit RDMA Benefits for Apache and Enterprise HDFS. BPOE 2015: 119-132 - [c336]Jie Zhang, Xiaoyi Lu, Mark Daniel Arnold, Dhabaleswar K. Panda:
MVAPICH2 over OpenStack with SR-IOV: An Efficient Approach to Build HPC Clouds. CCGRID 2015: 71-80 - [c335]Nusrat Sharmin Islam, Xiaoyi Lu, Md. Wasi-ur-Rahman, Dipti Shankar, Dhabaleswar K. Panda:
Triple-H: A Hybrid Approach to Accelerate HDFS on HPC Clusters with Heterogeneous Storage Architecture. CCGRID 2015: 101-110 - [c334]Sourav Chakraborty, Hari Subramoni, Adam Moody, Akshay Venkatesh, Jonathan L. Perkins, Dhabaleswar K. Panda:
Non-Blocking PMI Extensions for Fast MPI Startup. CCGRID 2015: 131-140 - [c333]Raghunath Raja Chandrasekar, Akshay Venkatesh, Khaled Hamidouche, Dhabaleswar K. Panda:
Power-Check: An Energy-Efficient Checkpointing Framework for HPC Clusters. CCGRID 2015: 261-270 - [c332]Khaled Hamidouche, Akshay Venkatesh, Ammar Ahmad Awan, Hari Subramoni, Ching-Hsiang Chu, Dhabaleswar K. Panda:
Exploiting GPUDirect RDMA in Designing High Performance OpenSHMEM for NVIDIA GPU Clusters. CLUSTER 2015: 78-87 - [c331]Mingzhe Li, Hari Subramoni, Khaled Hamidouche, Xiaoyi Lu, Dhabaleswar K. Panda:
High Performance MPI Datatype Support with User-Mode Memory Registration: Challenges, Designs, and Benefits. CLUSTER 2015: 226-235 - [c330]Mingzhe Li, Khaled Hamidouche, Xiaoyi Lu, Jian Lin, Dhabaleswar K. Panda:
High-Performance and Scalable Design of MPI-3 RMA on Xeon Phi Clusters. Euro-Par 2015: 625-637 - [c329]Akshay Venkatesh, Khaled Hamidouche, Hari Subramoni, Dhabaleswar K. Panda:
Offloaded GPU Collectives Using CORE-Direct and CUDA Capabilities on InfiniBand Clusters. HiPC 2015: 234-243 - [c328]Mingzhe Li, Khaled Hamidouche, Xiaoyi Lu, Jie Zhang, Jian Lin, Dhabaleswar K. Panda:
High Performance OpenSHMEM Strided Communication Support with InfiniBand UMR. HiPC 2015: 244-253 - [c327]Hari Subramoni, Akshay Venkatesh, Khaled Hamidouche, Karen Tomko, Dhabaleswar K. Panda:
Impact of InfiniBand DC Transport Protocol on Energy Consumption of All-to-All Collective Algorithms. Hot Interconnects 2015: 60-67 - [c326]Nusrat Sharmin Islam, Dipti Shankar, Xiaoyi Lu, Md. Wasi-ur-Rahman, Dhabaleswar K. Panda:
Accelerating I/O Performance of Big Data Analytics on HPC Clusters through RDMA-Based Key-Value Store. ICPP 2015: 280-289 - [c325]Jian Lin, Khaled Hamidouche, Xiaoyi Lu, Mingzhe Li, Dhabaleswar K. Panda:
High-Performance Coarray Fortran Support with MVAPICH2-X: Initial Experience and Evaluation. IPDPS Workshops 2015: 225-234 - [c324]Sourav Chakraborty, Hari Subramoni, Jonathan L. Perkins, Ammar Ahmad Awan, Dhabaleswar K. Panda:
On-demand Connection Management for OpenSHMEM and OpenSHMEM+MPI. IPDPS Workshops 2015: 235-244 - [c323]Md. Wasi-ur-Rahman, Xiaoyi Lu, Nusrat Sharmin Islam, Raghunath Rajachandrasekar, Dhabaleswar K. Panda:
High-Performance Design of YARN MapReduce on Modern HPC Clusters with Lustre and RDMA. IPDPS 2015: 291-300 - [c322]Dipti Shankar, Xiaoyi Lu, Jithin Jose, Md. Wasi-ur-Rahman, Nusrat S. Islam, Dhabaleswar K. Panda:
Can RDMA benefit online data processing workloads on memcached and MySQL? ISPASS 2015: 159-160 - [c321]A. A. Awan, Khaled Hamidouche, Ching-Hsiang Chu, Dhabaleswar K. Panda:
A Case for Non-blocking Collectives in OpenSHMEM: Design, Implementation, and Performance Evaluation using MVAPICH2-X. OpenSHMEM 2015: 69-86 - [c320]Antonio Gómez-Iglesias, Jérôme Vienne, Khaled Hamidouche, Christopher S. Simmons, William L. Barth, Dhabaleswar K. Panda:
Scalable Out-of-core OpenSHMEM Library for HPC. OpenSHMEM 2015: 138-153 - [c319]Jian Lin, Khaled Hamidouche, Jie Zhang, Xiaoyi Lu, Abhinav Vishnu, Dhabaleswar K. Panda:
Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM. OpenSHMEM 2015: 164-177 - [c318]A. A. Awan, Khaled Hamidouche, Akshay Venkatesh, Jonathan L. Perkins, Hari Subramoni, Dhabaleswar K. Panda:
GPU-Aware Design, Implementation, and Evaluation of Non-blocking Collective Benchmarks. EuroMPI 2015: 9:1-9:10 - [c317]Akshay Venkatesh, Abhinav Vishnu, Khaled Hamidouche, Nathan R. Tallent, Dhabaleswar K. Panda, Darren J. Kerbyson, Adolfy Hoisie:
A case for application-oblivious energy-efficient MPI runtime. SC 2015: 29:1-29:12 - [c316]Hari Subramoni, Ammar Ahmad Awan, Khaled Hamidouche, Dmitry Pekurovsky, Akshay Venkatesh, Sourav Chakraborty, Karen Tomko, Dhabaleswar K. Panda:
Designing Non-blocking Personalized Collectives with Near Perfect Overlap for RDMA-Enabled Clusters. ISC 2015: 434-453 - [c315]Dhabaleswar K. Panda:
Accelerating Big Data Processing on Modern Clusters. PABS@ICPE 2015: 1 - [e5]Dhabaleswar K. Panda, Karl W. Schulz, Khaled Hamidouche, Hari Subramoni:
Proceedings of the First International Workshop on Extreme Scale Programming Models and Middleware, ESPM 2015, Austin, Texas, USA, November 15, 2015. ACM 2015, ISBN 978-1-4503-3996-4 - 2014
- [j44]Hao Wang, Sreeram Potluri, Devendar Bureddy, Carlos Rosales, Dhabaleswar K. Panda:
GPU-Aware MPI on RDMA-Enabled Clusters: Design, Implementation and Evaluation. IEEE Trans. Parallel Distributed Syst. 25(10): 2595-2605 (2014) - [c314]Nusrat Sharmin Islam, Xiaoyi Lu, Md. Wasi-ur-Rahman, Raghunath Rajachandrasekar, Dhabaleswar K. Panda:
In-memory I/O and replication for HDFS with Memcached: Early experiences. IEEE BigData 2014: 213-218 - [c313]Jithin Jose, Khaled Hamidouche, Xiaoyi Lu, Sreeram Potluri, Jie Zhang, Karen Tomko, Dhabaleswar K. Panda:
High performance OpenSHMEM for Xeon Phi clusters: Extensions, runtime designs and application co-design. CLUSTER 2014: 10-18 - [c312]Mingzhe Li, Xiaoyi Lu, Sreeram Potluri, Khaled Hamidouche, Jithin Jose, Karen Tomko, Dhabaleswar K. Panda:
Scalable Graph500 design with MPI-3 RMA. CLUSTER 2014: 230-238 - [c311]Jie Zhang, Xiaoyi Lu, Jithin Jose, Rong Shi, Dhabaleswar K. Panda:
Can Inter-VM Shmem Benefit MPI Applications on SR-IOV Based Virtualized Infiniband Clusters? Euro-Par 2014: 342-353 - [c310]Md. Wasi-ur-Rahman, Xiaoyi Lu, Nusrat Sharmin Islam, Raghunath Rajachandrasekar, Dhabaleswar K. Panda:
MapReduce over Lustre: Can RDMA-Based Approach Benefit? Euro-Par 2014: 644-655 - [c309]Rong Shi, Sreeram Potluri, Khaled Hamidouche, Jonathan L. Perkins, Mingzhe Li, Davide Rossetti, Dhabaleswar K. Panda:
Designing efficient small message transfer mechanism for inter-node MPI communication on InfiniBand GPU clusters. HiPC 2014: 1-10 - [c308]Akshay Venkatesh, Hari Subramoni, Khaled Hamidouche, Dhabaleswar K. Panda:
A high performance broadcast design with hardware multicast and GPUDirect RDMA for streaming applications on Infiniband clusters. HiPC 2014: 1-10 - [c307]Jie Zhang, Xiaoyi Lu, Jithin Jose, Mingzhe Li, Rong Shi, Dhabaleswar K. Panda:
High performance MPI library over SR-IOV enabled infiniband clusters. HiPC 2014: 1-10 - [c306]Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat S. Islam, Dipti Shankar, Dhabaleswar K. Panda:
Accelerating Spark with RDMA for Big Data Processing: Early Experiences. Hot Interconnects 2014: 9-16 - [c305]Raghunath Rajachandrasekar, Sreeram Potluri, Akshay Venkatesh, Khaled Hamidouche, Md. Wasi-ur-Rahman, Dhabaleswar K. Panda:
MIC-Check: a distributed check pointing framework for the intel many integrated cores architecture. HPDC 2014: 121-124 - [c304]Nusrat S. Islam, Xiaoyi Lu, Md. Wasi-ur-Rahman, Dhabaleswar K. Panda:
SOR-HDFS: a SEDA-based approach to maximize overlapping in RDMA-enhanced HDFS. HPDC 2014: 261-264 - [c303]Prasad Calyam, Alex Berryman, Erik Saule, Hari Subramoni, Paul Schopis, Gordon Springer, Ümit V. Çatalyürek, Dhabaleswar K. Panda:
Wide-area overlay networking to manage science DMZ accelerated flows. ICNC 2014: 269-275 - [c302]Dhabaleswar K. Panda, Jang-Ping Sheu:
Message from the general co-chairs IEEE ICPADS 2014. ICPADS 2014: xv - [c301]Md. Wasi-ur-Rahman, Xiaoyi Lu, Nusrat Sharmin Islam, Dhabaleswar K. Panda:
Performance Modeling for RDMA-Enhanced Hadoop MapReduce. ICPP 2014: 50-59 - [c300]Rong Shi, Xiaoyi Lu, Sreeram Potluri, Khaled Hamidouche, Jie Zhang, Dhabaleswar K. Panda:
HAND: A Hybrid Approach to Accelerate Non-contiguous Data Movement Using MPI Datatypes on GPU Clusters. ICPP 2014: 221-230 - [c299]Hari Subramoni, Krishna Chaitanya Kandalla, Jithin Jose, Karen Tomko, Karl W. Schulz, Dmitry Pekurovsky, Dhabaleswar K. Panda:
Designing Topology-Aware Communication Schedules for Alltoall Operations in Large InfiniBand Clusters. ICPP 2014: 231-240 - [c298]Md. Wasi-ur-Rahman, Xiaoyi Lu, Nusrat Sharmin Islam, Dhabaleswar K. Panda:
HOMR: a hybrid approach to exploit maximum overlapping in MapReduce over high performance interconnects. ICS 2014: 33-42 - [c297]Jithin Jose, Khaled Hamidouche, Jie Zhang, Akshay Venkatesh, Dhabaleswar K. Panda:
Optimizing Collective Communication in UPC. IPDPS Workshops 2014: 361-370 - [c296]Akshay Venkatesh, Sreeram Potluri, Raghunath Rajachandrasekar, Miao Luo, Khaled Hamidouche, Dhabaleswar K. Panda:
High Performance Alltoall and Allgather Designs for InfiniBand MIC Clusters. IPDPS 2014: 637-646 - [c295]Jithin Jose, Jie Zhang, Akshay Venkatesh, Sreeram Potluri, Dhabaleswar K. Panda:
A Comprehensive Performance Evaluation of OpenSHMEM Libraries on InfiniBand Clusters. OpenSHMEM 2014: 14-28 - [c294]Jithin Jose, Sreeram Potluri, Hari Subramoni, Xiaoyi Lu, Khaled Hamidouche, Karl W. Schulz, Hari Sundar, Dhabaleswar K. Panda:
Designing Scalable Out-of-core Sorting with Hybrid MPI+PGAS Programming Models. PGAS 2014: 7:1-7:9 - [c293]Mingzhe Li, Jian Lin, Xiaoyi Lu, Khaled Hamidouche, Karen Tomko, Dhabaleswar K. Panda:
Scalable MiniMD Design with Hybrid MPI and OpenSHMEM. PGAS 2014: 24:1-24:4 - [c292]Miao Luo, Xiaoyi Lu, Khaled Hamidouche, Krishna Chaitanya Kandalla, Dhabaleswar K. Panda:
Initial study of multi-endpoint runtime for MPI+OpenMP hybrid programming model on multi-core systems. PPoPP 2014: 395-396 - [c291]Sourav Chakraborty, Hari Subramoni, Jonathan L. Perkins, Adam Moody, Mark Daniel Arnold, Dhabaleswar K. Panda:
PMI Extensions for Scalable MPI Startup. EuroMPI/ASIA 2014: 21 - [c290]Raghunath Rajachandrasekar, Jonathan L. Perkins, Khaled Hamidouche, Mark Daniel Arnold, Dhabaleswar K. Panda:
Understanding the Memory-Utilization of MPI Libraries: Challenges and Designs in Implementing the MPI_T Interface. EuroMPI/ASIA 2014: 97 - [c289]Hari Subramoni, Khaled Hamidouche, Akshay Venkatesh, Sourav Chakraborty, Dhabaleswar K. Panda:
Designing MPI Library with Dynamic Connected Transport (DCT) of InfiniBand: Early Experiences. ISC 2014: 278-295 - [c288]Dipti Shankar, Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat S. Islam, Dhabaleswar K. Panda:
A Micro-benchmark Suite for Evaluating Hadoop MapReduce on High-Performance Networks. BPOE@ASPLOS/VLDB 2014: 19-33 - 2013
- [j43]Miao Luo, Hao Wang, Jérôme Vienne, Dhabaleswar K. Panda:
Redesigning MPI shared memory communication for large multi-core architecture. Comput. Sci. Res. Dev. 28(2-3): 137-146 (2013) - [c287]Sreeram Potluri, Akshay Venkatesh, Devendar Bureddy, Krishna Chaitanya Kandalla, Dhabaleswar K. Panda:
Efficient Intra-node Communication on Intel-MIC Clusters. CCGRID 2013: 128-135 - [c286]Jithin Jose, Mingzhe Li, Xiaoyi Lu, Krishna Chaitanya Kandalla, Mark Daniel Arnold, Dhabaleswar K. Panda:
SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience. CCGRID 2013: 385-392 - [c285]Md. Wasi-ur-Rahman, Xiaoyi Lu, Nusrat S. Islam, Dhabaleswar K. Panda:
Does RDMA-based enhanced Hadoop MapReduce need a new performance model? SoCC 2013: 45:1-45:2 - [c284]Rong Shi, Sreeram Potluri, Khaled Hamidouche, Xiaoyi Lu, Karen Tomko, Dhabaleswar K. Panda:
A scalable and portable approach to accelerate hybrid HPL on heterogeneous CPU-GPU clusters. CLUSTER 2013: 1-8 - [c283]Hari Subramoni, Devendar Bureddy, Krishna Chaitanya Kandalla, Karl W. Schulz, Bill Barth, Jonathan L. Perkins, Mark Daniel Arnold, Dhabaleswar K. Panda:
Design of network topology aware scheduling services for large InfiniBand clusters. CLUSTER 2013: 1-8 - [c282]Dhabaleswar K. Panda, Xiaoyi Lu:
Tutorials. Hot Interconnects 2013 - [c281]Krishna Chaitanya Kandalla, Akshay Venkatesh, Khaled Hamidouche, Sreeram Potluri, Devendar Bureddy, Dhabaleswar K. Panda:
Designing Optimized MPI Broadcast and Allreduce for Many Integrated Core (MIC) InfiniBand Clusters. Hot Interconnects 2013: 63-70 - [c280]Nusrat S. Islam, Xiaoyi Lu, Md. Wasi-ur-Rahman, Dhabaleswar K. Panda:
Can Parallel Replication Benefit Hadoop Distributed File System for High Performance Interconnects? Hot Interconnects 2013: 75-78 - [c279]Raghunath Rajachandrasekar, Adam Moody, Kathryn M. Mohror, Dhabaleswar K. Panda:
A 1 PB/s file system to checkpoint three million MPI tasks. HPDC 2013: 143-154 - [c278]Sreeram Potluri, Khaled Hamidouche, Akshay Venkatesh, Devendar Bureddy, Dhabaleswar K. Panda:
Efficient Inter-node MPI Communication Using GPUDirect RDMA for InfiniBand Clusters with NVIDIA GPUs. ICPP 2013: 80-89 - [c277]Krishna Chaitanya Kandalla, Hari Subramoni, Karen Tomko, Dmitry Pekurovsky, Dhabaleswar K. Panda:
A Novel Functional Partitioning Approach to Design High-Performance MPI-3 Non-blocking Alltoallv Collective on Multi-core Systems. ICPP 2013: 611-620 - [c276]Xiaoyi Lu, Nusrat S. Islam, Md. Wasi-ur-Rahman, Jithin Jose, Hari Subramoni, Hao Wang, Dhabaleswar K. Panda:
High-Performance Design of Hadoop RPC with RDMA over InfiniBand. ICPP 2013: 641-650 - [c275]Khaled Hamidouche, Sreeram Potluri, Hari Subramoni, Krishna Chaitanya Kandalla, Dhabaleswar K. Panda:
MIC-RO: enabling efficient remote offload on heterogeneous many integrated core (MIC) clusters with InfiniBand. ICS 2013: 399-408 - [c274]Akshay Venkatesh, Krishna Chaitanya Kandalla, Dhabaleswar K. Panda:
Evaluation of Energy Characteristics of MPI Communication Primitives with RAPL. IPDPS Workshops 2013: 938-945 - [c273]Sreeram Potluri, Devendar Bureddy, Hao Wang, Hari Subramoni, Dhabaleswar K. Panda:
Extending OpenSHMEM for GPU Computing. IPDPS 2013: 1001-1012 - [c272]Md. Wasi-ur-Rahman, Nusrat Sharmin Islam, Xiaoyi Lu, Jithin Jose, Hari Subramoni, Hao Wang, Dhabaleswar K. Panda:
High-Performance RDMA-based Design of Hadoop MapReduce over InfiniBand. IPDPS Workshops 2013: 1908-1917 - [c271]Mingzhe Li, Sreeram Potluri, Khaled Hamidouche, Jithin Jose, Dhabaleswar K. Panda:
Efficient and truly passive MPI-3 RMA using InfiniBand atomics. EuroMPI 2013: 91-96 - [c270]Sreeram Potluri, Devendar Bureddy, Khaled Hamidouche, Akshay Venkatesh, Krishna Chaitanya Kandalla, Hari Subramoni, Dhabaleswar K. Panda:
MVAPICH-PRISM: a proxy-based communication framework using InfiniBand and SCIF for intel MIC clusters. SC 2013: 54:1-54:11 - [c269]Jithin Jose, Mohammad Banikazemi, Wendy Belluomini, Chet Murthy, Dhabaleswar K. Panda:
MetaData persistence using storage class memory: experiences with flash-backed DRAM. INFLOW@SOSP 2013: 3:1-3:7 - [c268]Jithin Jose, Sreeram Potluri, Karen Tomko, Dhabaleswar K. Panda:
Designing Scalable Graph500 Benchmark with Hybrid MPI+OpenSHMEM Programming Models. ISC 2013: 109-124 - [c267]Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat Sharmin Islam, Dhabaleswar K. Panda:
A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks. WBDB 2013: 32-42 - 2012
- [c266]Jithin Jose, Hari Subramoni, Krishna Chaitanya Kandalla, Md. Wasi-ur-Rahman, Hao Wang, Sundeep Narravula, Dhabaleswar K. Panda:
Scalable Memcached Design for InfiniBand Clusters Using Hybrid Transports. CCGRID 2012: 236-243 - [c265]Krishna Chaitanya Kandalla, Aydin Buluç, Hari Subramoni, Karen Tomko, Jérôme Vienne, Leonid Oliker, Dhabaleswar K. Panda:
Can Network-Offload Based Non-blocking Neighborhood MPI Collectives Improve Communication Overheads of Irregular Graph Algorithms? CLUSTER Workshops 2012: 222-230 - [c264]Raghunath Rajachandrasekar, Jai Jaswani, Hari Subramoni, Dhabaleswar K. Panda:
Minimizing Network Contention in InfiniBand Clusters with a QoS-Aware Data-Staging Framework. CLUSTER 2012: 329-336 - [c263]Hari Subramoni, Jérôme Vienne, Dhabaleswar K. Panda:
A Scalable InfiniBand Network Topology-Aware Performance Analysis Tool for MPI. Euro-Par Workshops 2012: 439-450 - [c262]Jérôme Vienne, Jitong Chen, Md. Wasi-ur-Rahman, Nusrat S. Islam, Hari Subramoni, Dhabaleswar K. Panda:
Performance Analysis and Evaluation of InfiniBand FDR and 40GigE RoCE on HPC and Cloud Computing Systems. Hot Interconnects 2012: 48-55 - [c261]Jithin Jose, Krishna Chaitanya Kandalla, Miao Luo, Dhabaleswar K. Panda:
Supporting Hybrid MPI and OpenSHMEM over InfiniBand: Design and Performance Evaluation. ICPP 2012: 219-228 - [c260]Xiangyong Ouyang, Nusrat S. Islam, Raghunath Rajachandrasekar, Jithin Jose, Miao Luo, Hao Wang, Dhabaleswar K. Panda:
SSD-Assisted Hybrid Memory to Accelerate Memcached over High Performance Networks. ICPP 2012: 470-479 - [c259]Miao Luo, Dhabaleswar K. Panda, Khaled Z. Ibrahim, Costin Iancu:
Congestion avoidance on manycore high performance computing systems. ICS 2012: 121-132 - [c258]Jian Huang, Xiangyong Ouyang, Jithin Jose, Md. Wasi-ur-Rahman, Hao Wang, Miao Luo, Hari Subramoni, Chet Murthy, Dhabaleswar K. Panda:
High-Performance Design of HBase with RDMA over InfiniBand. IPDPS 2012: 774-785 - [c257]Raghunath Rajachandrasekar, Xavier Besseron, Dhabaleswar K. Panda:
Monitoring and Predicting Hardware Failures in HPC Clusters with FTB-IPMI. IPDPS Workshops 2012: 1136-1143 - [c256]Krishna Chaitanya Kandalla, Ulrike Meier Yang, Jeff Keasler, Tzanio V. Kolev, Adam Moody, Hari Subramoni, Karen Tomko, Jérôme Vienne, Bronis R. de Supinski, Dhabaleswar K. Panda:
Designing Non-blocking Allreduce with Collective Offload on InfiniBand Clusters: A Case Study with Conjugate Gradient Solvers. IPDPS 2012: 1156-1167 - [c255]S. Pai Raikar, Hari Subramoni, Krishna Chaitanya Kandalla, Jérôme Vienne, Dhabaleswar K. Panda:
Designing Network Failover and Recovery in MPI for Multi-Rail InfiniBand Clusters. IPDPS Workshops 2012: 1160-1167 - [c254]Sreeram Potluri, Hao Wang, Devendar Bureddy, Ashish Kumar Singh, Carlos Rosales, Dhabaleswar K. Panda:
Optimizing MPI Communication on Multi-GPU Systems Using CUDA Inter-Process Communication. IPDPS Workshops 2012: 1848-1857 - [c253]Md. Wasi-ur-Rahman, Jian Huang, Jithin Jose, Xiangyong Ouyang, Hao Wang, Nusrat S. Islam, Hari Subramoni, Chet Murthy, Dhabaleswar K. Panda:
Understanding the communication characteristics in HBase: What are the fundamental bottlenecks? ISPASS 2012: 122-123 - [c252]Devendar Bureddy, Hao Wang, Akshay Venkatesh, Sreeram Potluri, Dhabaleswar K. Panda:
OMB-GPU: A Micro-Benchmark Suite for Evaluating MPI Libraries on GPU Clusters. EuroMPI 2012: 110-120 - [c251]Nusrat S. Islam, Md. Wasi-ur-Rahman, Jithin Jose, Raghunath Rajachandrasekar, Hao Wang, Hari Subramoni, Chet Murthy, Dhabaleswar K. Panda:
High performance RDMA-based design of HDFS over InfiniBand. SC 2012: 35 - [c250]Hari Subramoni, Sreeram Potluri, Krishna Chaitanya Kandalla, Bill Barth, Jérôme Vienne, Jeff Keasler, Karen A. Tomko, Karl W. Schulz, Adam Moody, Dhabaleswar K. Panda:
Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes. SC 2012: 70 - [c249]Nusrat Sharmin Islam, Xiaoyi Lu, Md. Wasi-ur-Rahman, Jithin Jose, Dhabaleswar K. Panda:
A Micro-benchmark Suite for Evaluating HDFS Operations on Modern Clusters. WBDB 2012: 129-147 - 2011
- [j42]Sayantan Sur, Sreeram Potluri, Krishna Chaitanya Kandalla, Hari Subramoni, Dhabaleswar K. Panda, Karen Tomko:
Codesign for InfiniBand Clusters. Computer 44(11): 31-36 (2011) - [j41]Krishna Chaitanya Kandalla, Hari Subramoni, Karen A. Tomko, Dmitry Pekurovsky, Sayantan Sur, Dhabaleswar K. Panda:
High-performance and scalable non-blocking all-to-all with collective offload on InfiniBand clusters: a study with parallel 3D FFT. Comput. Sci. Res. Dev. 26(3-4): 237-246 (2011) - [j40]Hao Wang, Sreeram Potluri, Miao Luo, Ashish Kumar Singh, Sayantan Sur, Dhabaleswar K. Panda:
MVAPICH2-GPU: optimized GPU to GPU communication for InfiniBand clusters. Comput. Sci. Res. Dev. 26(3-4): 257-266 (2011) - [c248]Xiangyong Ouyang, Raghunath Rajachandrasekar, Xavier Besseron, Dhabaleswar K. Panda:
High Performance Pipelined Process Migration with RDMA. CCGRID 2011: 314-323 - [c247]Hao Wang, Sreeram Potluri, Miao Luo, Ashish Kumar Singh, Xiangyong Ouyang, Sayantan Sur, Dhabaleswar K. Panda:
Optimized Non-contiguous MPI Datatype Communication for GPU Clusters: Design, Implementation and Evaluation with MVAPICH2. CLUSTER 2011: 308-316 - [c246]Hari Subramoni, Krishna Chaitanya Kandalla, Jérôme Vienne, Sayantan Sur, Bill Barth, Karen A. Tomko, Robert T. McLay, Karl W. Schulz, Dhabaleswar K. Panda:
Design and Evaluation of Network Topology-/Speed- Aware Broadcast Algorithms for InfiniBand Clusters. CLUSTER 2011: 317-325 - [c245]Ashish Kumar Singh, Sreeram Potluri, Hao Wang, Krishna Chaitanya Kandalla, Sayantan Sur, Dhabaleswar K. Panda:
MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefit. CLUSTER 2011: 420-427 - [c244]Vilobh Meshram, Xavier Besseron, Xiangyong Ouyang, Raghunath Rajachandrasekar, Ravi Prakash, Dhabaleswar K. Panda:
Can a Decentralized Metadata Service Layer Benefit Parallel Filesystems? CLUSTER 2011: 484-493 - [c243]N. Dandapanthula, Hari Subramoni, Jérôme Vienne, Krishna Chaitanya Kandalla, Sayantan Sur, Dhabaleswar K. Panda, Ron Brightwell:
INAM - A Scalable InfiniBand Network Analysis and Monitoring Tool. Euro-Par Workshops (2) 2011: 166-177 - [c242]Raghunath Rajachandrasekar, Xiangyong Ouyang, Xavier Besseron, Vilobh Meshram, Dhabaleswar K. Panda:
Can Checkpoint/Restart Mechanisms Benefit from Hierarchical Data Staging? Euro-Par Workshops (2) 2011: 312-321 - [c241]Miao Luo, Jithin Jose, Sayantan Sur, Dhabaleswar K. Panda:
Multi-threaded UPC runtime with network endpoints: Design alternatives and evaluation on multi-core architectures. HiPC 2011: 1-10 - [c240]Krishna Chaitanya Kandalla, Hari Subramoni, Jérôme Vienne, S. Pai Raikar, Karen Tomko, Sayantan Sur, Dhabaleswar K. Panda:
Designing Non-blocking Broadcast with Collective Offload on InfiniBand Clusters: A Case Study with HPL. Hot Interconnects 2011: 27-34 - [c239]Xiangyong Ouyang, David W. Nellans, Robert Wipfel, David Flynn, Dhabaleswar K. Panda:
Beyond block I/O: Rethinking traditional storage primitives. HPCA 2011: 301-311 - [c238]Xiangyong Ouyang, Raghunath Rajachandrasekar, Xavier Besseron, Hao Wang, Jian Huang, Dhabaleswar K. Panda:
CRFS: A Lightweight User-Level Filesystem for Generic Checkpoint/Restart. ICPP 2011: 375-384 - [c237]Jithin Jose, Hari Subramoni, Miao Luo, Minjia Zhang, Jian Huang, Md. Wasi-ur-Rahman, Nusrat S. Islam, Xiangyong Ouyang, Hao Wang, Sayantan Sur, Dhabaleswar K. Panda:
Memcached Design on High Performance RDMA Capable Interconnects. ICPP 2011: 743-752 - [c236]Sreeram Potluri, Hao Wang, Vijay Dhanraj, Sayantan Sur, Dhabaleswar K. Panda:
Optimizing MPI One Sided Communication on Multi-core InfiniBand Clusters Using Shared Memory Backed Windows. EuroMPI 2011: 99-109 - [c235]Sreeram Potluri, Sayantan Sur, Devendar Bureddy, Dhabaleswar K. Panda:
Design and Implementation of Key Proposed MPI-3 One-Sided Communication Semantics on InfiniBand. EuroMPI 2011: 321-324 - [r2]Dhabaleswar K. Panda, Sayantan Sur, Hari Subramoni, Krishna Chaitanya Kandalla:
Collective Communication, Network Support For. Encyclopedia of Parallel Computing 2011: 327-334 - [r1]Dhabaleswar K. Panda, Sayantan Sur:
InfiniBand. Encyclopedia of Parallel Computing 2011: 927-935 - 2010
- [j39]Ping Lai, Sayantan Sur, Dhabaleswar K. Panda:
Designing truly one-sided MPI-2 RMA intra-node communication on multi-core systems. Comput. Sci. Res. Dev. 25(1-2): 3-14 (2010) - [c234]Emilio Pasquale Mancini, Gregory Marsh, Dhabaleswar K. Panda:
An MPI-Stream Hybrid Programming Model for Computational Clusters. CCGRID 2010: 323-330 - [c233]Hari Subramoni, Ping Lai, Rajkumar Kettimuthu, Dhabaleswar K. Panda:
High Performance Data Transfer in Grid Environment Using GridFTP over InfiniBand. CCGRID 2010: 557-564 - [c232]Xiangyong Ouyang, Sonya Marcarelli, Raghunath Rajachandrasekar, Dhabaleswar K. Panda:
RDMA-Based Job Migration Framework for MPI over InfiniBand. CLUSTER 2010: 116-125 - [c231]Hari Subramoni, Krishna Chaitanya Kandalla, Sayantan Sur, Dhabaleswar K. Panda:
Design and Evaluation of Generalized Collective Communication Primitives with Overlap Using ConnectX-2 Offload Engine. Hot Interconnects 2010: 40-49 - [c230]Dhabaleswar K. Panda, Sayantan Sur, Pavan Balaji:
Designing High-End Computing Systems with InfiniBand and High-Speed Ethernet. Hot Interconnects 2010: 125-127 - [c229]Krishna Chaitanya Kandalla, Emilio Pasquale Mancini, Sayantan Sur, Dhabaleswar K. Panda:
Designing Power-Aware Collective Communication Algorithms for InfiniBand Clusters. ICPP 2010: 218-227 - [c228]Hari Subramoni, Ping Lai, Sayantan Sur, Dhabaleswar K. Panda:
Improving Application Performance and Predictability Using Multiple Virtual Lanes in Modern Multi-core InfiniBand Clusters. ICPP 2010: 462-471 - [c227]Miao Luo, Sreeram Potluri, Ping Lai, Emilio Pasquale Mancini, Hari Subramoni, Krishna Chaitanya Kandalla, Sayantan Sur, Dhabaleswar K. Panda:
High Performance Design and Implementation of Nemesis Communication Layer for Two-Sided and One-Sided MPI Semantics in MVAPICH2. ICPP Workshops 2010: 377-386 - [c226]Sreeram Potluri, Ping Lai, Karen A. Tomko, Sayantan Sur, Yifeng Cui, Mahidhar Tatineni, Karl W. Schulz, William L. Barth, Amitava Majumdar, Dhabaleswar K. Panda:
Quantifying performance benefits of overlap using MPI-2 in a seismic modeling application. ICS 2010: 17-25 - [c225]Krishna Chaitanya Kandalla, Hari Subramoni, Abhinav Vishnu, Dhabaleswar K. Panda:
Designing topology-aware collective communication algorithms for large scale InfiniBand clusters: Case studies with Scatter and Gather. IPDPS Workshops 2010: 1-8 - [c224]Matthew J. Koop, Pavel Shamis, Ishai Rabinovitz, Dhabaleswar K. Panda:
Designing high-performance and resilient message passing on InfiniBand. IPDPS Workshops 2010: 1-7 - [c223]Jithin Jose, Miao Luo, Sayantan Sur, Dhabaleswar K. Panda:
Unifying UPC and MPI runtimes: experience with MVAPICH. PGAS 2010: 5 - [c222]Yifeng Cui, Kim B. Olsen, Thomas H. Jordan, Kwangyoon Lee, Jun Zhou, Patrick Small, Daniel Roten, Geoffrey Ely, Dhabaleswar K. Panda, Amit Chourasia, John M. Levesque, Steven M. Day, Philip Maechling:
Scalable Earthquake Simulation on Petascale Supercomputers. SC 2010: 1-20
2000 – 2009
- 2009
- [j38]Abhinav Vishnu, Matthew J. Koop, Adam Moody, Amith R. Mamidala, Sundeep Narravula, Dhabaleswar K. Panda:
Topology agnostic hot-spot avoidance with InfiniBand. Concurr. Comput. Pract. Exp. 21(3): 301-319 (2009) - [j37]Ping Lai, Pavan Balaji, Rajeev Thakur, Dhabaleswar K. Panda:
ProOnE: a general-purpose protocol onload engine for multi- and many-core architectures. Comput. Sci. Res. Dev. 23(3-4): 133-142 (2009) - [j36]Dhabaleswar K. Panda:
IPDPS 2007: Comments from the Guest Editor. J. Parallel Distributed Comput. 69(8): 679 (2009) - [c221]Gopalakrishnan Santhanaraman, Pavan Balaji, K. Gopalakrishnan, Rajeev Thakur, William Gropp, Dhabaleswar K. Panda:
Natively Supporting True One-Sided Communication in. CCGRID 2009: 380-387 - [c220]Matthew J. Koop, Miao Luo, Dhabaleswar K. Panda:
Reducing network contention with mixed workloads on modern multicore clusters. CLUSTER 2009: 1-10 - [c219]Gopalakrishnan Santhanaraman, Tejus Gangadharappa, Sundeep Narravula, Amith R. Mamidala, Dhabaleswar K. Panda:
Design alternatives for implementing fence synchronization in MPI-2 one-sided communication for InfiniBand clusters. CLUSTER 2009: 1-9 - [c218]Hari Subramoni, Ping Lai, Miao Luo, Dhabaleswar K. Panda:
RDMA over Ethernet - A preliminary study. CLUSTER 2009: 1-9 - [c217]Abhinav Vishnu, Manojkumar Krishnan, Dhabaleswar K. Panda:
An efficient hardware-software approach to network fault tolerance with InfiniBand. CLUSTER 2009: 1-9 - [c216]Xiangyong Ouyang, Karthik Gopalakrishnan, Tejus Gangadharappa, Dhabaleswar K. Panda:
Fast checkpointing by Write Aggregation with Dynamic Buffer and Interleaving on multicore architecture. HiPC 2009: 99-108 - [c215]Dhabaleswar K. Panda, Matthew J. Koop, Pavan Balaji:
Tutorial: Infiniband and 10-Gigabit Ethernet for Dummies. Hot Interconnects 2009 - [c214]Dhabaleswar K. Panda, Matthew J. Koop, Pavan Balaji:
Tutorial: Designing High-End Computing Systems with Infiniband and 10-Gigabit Ethernet. Hot Interconnects 2009 - [c213]Hari Subramoni, Matthew J. Koop, Dhabaleswar K. Panda:
Designing Next Generation Clusters: Evaluation of InfiniBand DDR/QDR on Intel Computing Platforms. Hot Interconnects 2009: 112-120 - [c212]Xiangyong Ouyang, Karthik Gopalakrishnan, Dhabaleswar K. Panda:
Accelerating Checkpoint Operation by Node-Level Write Aggregation on Multicore Systems. ICPP 2009: 34-41 - [c211]Ping Lai, Hari Subramoni, Sundeep Narravula, Amith R. Mamidala, Dhabaleswar K. Panda:
Designing Efficient FTP Mechanisms for High Performance Data-Transfer over InfiniBand. ICPP 2009: 156-163 - [c210]Rinku Gupta, Peter H. Beckman, Byung-Hoon Park, Ewing L. Lusk, Paul Hargrove, Al Geist, Dhabaleswar K. Panda, Andrew Lumsdaine, Jack J. Dongarra:
CIFTS: A Coordinated Infrastructure for Fault-Tolerant Systems. ICPP 2009: 237-245 - [c209]Tejus Gangadharappa, Matthew J. Koop, Dhabaleswar K. Panda:
Designing and Evaluating MPI-2 Dynamic Process Management Support for InfiniBand. ICPP Workshops 2009: 89-96 - [c208]Krishna Chaitanya Kandalla, Hari Subramoni, Gopalakrishnan Santhanaraman, Matthew J. Koop, Dhabaleswar K. Panda:
Designing multi-leader-based Allgather algorithms for multi-core clusters. IPDPS 2009: 1-8 - [c207]Matthew J. Koop, Jaidev K. Sridhar, Dhabaleswar K. Panda:
TupleQ: Fully-asynchronous and zero-copy MPI over InfiniBand. IPDPS 2009: 1-8 - [c206]Jaidev K. Sridhar, Dhabaleswar K. Panda:
Impact of Node Level Caching in MPI Job Launch Mechanisms. PVM/MPI 2009: 230-239 - 2008
- [c205]Amith R. Mamidala, Rahul Kumar, Debraj De, Dhabaleswar K. Panda:
MPI Collectives on Modern Multicore Clusters: Performance Optimizations and Communication Characteristics. CCGRID 2008: 130-137 - [c204]Karthikeyan Vaidyanathan, Ping Lai, Sundeep Narravula, Dhabaleswar K. Panda:
Optimized Distributed Data Sharing Substrate in Multi-core Commodity Clusters: A Comprehensive Study with Applications. CCGRID 2008: 138-145 - [c203]Ping Lai, Sundeep Narravula, Karthikeyan Vaidyanathan, Dhabaleswar K. Panda:
Advanced RDMA-Based Admission Control for Modern Data-Centers. CCGRID 2008: 384-391 - [c202]Wei Huang, Matthew J. Koop, Dhabaleswar K. Panda:
Efficient one-copy MPI shared memory communication in Virtual Machines. CLUSTER 2008: 107-115 - [c201]Dhabaleswar K. Panda:
Designing next generation clusters with InfiniBand and 10GE/iWARP: Opportunities and challenges. CLUSTER 2008: 202 - [c200]Matthew J. Koop, Jaidev K. Sridhar, Dhabaleswar K. Panda:
Scalable MPI design over InfiniBand using eXtended Reliable Connection. CLUSTER 2008: 203-212 - [c199]Jaidev K. Sridhar, Matthew J. Koop, Jonathan L. Perkins, Dhabaleswar K. Panda:
ScELA: Scalable and Extensible Launching Architecture for Clusters. HiPC 2008: 323-335 - [c198]Ranjit Noronha, Xiangyong Ouyang, Dhabaleswar K. Panda:
Designing a High-Performance Clustered NAS: A Case Study with pNFS over RDMA on InfiniBand. HiPC 2008: 465-477 - [c197]Pavan Balaji, Sitha Bhagvat, Rajeev Thakur, Dhabaleswar K. Panda:
Sockets Direct Protocol for Hybrid Network Stacks: A Case Study with iWARP over 10G Ethernet. HiPC 2008: 478-490 - [c196]Matthew J. Koop, Wei Huang, Karthik Gopalakrishnan, Dhabaleswar K. Panda:
Performance Analysis and Evaluation of PCIe 2.0 and Quad-Data Rate InfiniBand. Hot Interconnects 2008: 85-92 - [c195]Lei Chai, Ping Lai, Hyun-Wook Jin, Dhabaleswar K. Panda:
Designing an Efficient Kernel-Level and User-Level Hybrid Approach for MPI Intra-Node Communication on Multi-Core Systems. ICPP 2008: 222-229 - [c194]Sundeep Narravula, Hari Subramoni, Ping Lai, Ranjit Noronha, Dhabaleswar K. Panda:
Performance of HPC Middleware over InfiniBand WAN. ICPP 2008: 304-311 - [c193]Ranjit Noronha, Dhabaleswar K. Panda:
IMCa: A High Performance Caching Front-End for GlusterFS on InfiniBand. ICPP 2008: 462-469 - [c192]Matthew J. Koop, Rahul Kumar, Dhabaleswar K. Panda:
Can software reliability outperform hardware reliability on high performance interconnects?: a case study with MPI over infiniband. ICS 2008: 145-154 - [c191]Matthew J. Koop, Terry R. Jones, Dhabaleswar K. Panda:
MVAPICH-Aptus: Scalable high-performance multi-transport MPI over InfiniBand. IPDPS 2008: 1-12 - [c190]Rahul Kumar, Amith R. Mamidala, Dhabaleswar K. Panda:
Scaling alltoall collective on multi-core systems. IPDPS 2008: 1-8 - [c189]Gopalakrishnan Santhanaraman, Sundeep Narravula, Dhabaleswar K. Panda:
Designing passive synchronization for MPI-2 one-sided communication to maximize overlap. IPDPS 2008: 1-11 - [c188]Rahul Kumar, Amith R. Mamidala, Matthew J. Koop, Gopalakrishnan Santhanaraman, Dhabaleswar K. Panda:
Lock-Free Asynchronous Rendezvous Design for MPI Point-to-Point Communication. PVM/MPI 2008: 185-193 - [e4]Mark A. Franklin, Dhabaleswar K. Panda, Dimitrios Stiliadis:
Proceedings of the 2008 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, ANCS 2008, San Jose, California, USA, November 6-7, 2008. ACM 2008, ISBN 978-1-60558-346-4 - 2007
- [c187]Lei Chai, Qi Gao, Dhabaleswar K. Panda:
Understanding the Impact of Multi-Core Architecture in Cluster Computing: A Case Study with Intel Dual-Core System. CCGRID 2007: 471-478 - [c186]Abhinav Vishnu, Matthew J. Koop, Adam Moody, Amith R. Mamidala, Sundeep Narravula, Dhabaleswar K. Panda:
Hot-Spot Avoidance With Multi-Pathing Over InfiniBand: An MPI Perspective. CCGRID 2007: 479-486 - [c185]Matthew J. Koop, Terry R. Jones, Dhabaleswar K. Panda:
Reducing Connection Memory Requirements of MPI for InfiniBand Clusters: A Message Coalescing Approach. CCGRID 2007: 495-504 - [c184]Sundeep Narravula, A. Mamidala, Abhinav Vishnu, Karthikeyan Vaidyanathan, Dhabaleswar K. Panda:
High Performance Distributed Lock Management Services using Network-based Remote Atomic Operations. CCGRID 2007: 583-590 - [c183]Wei Huang, Qi Gao, Jiuxing Liu, Dhabaleswar K. Panda:
High performance virtual machine migration with RDMA over modern interconnects. CLUSTER 2007: 11-20 - [c182]Karthikeyan Vaidyanathan, Lei Chai, Wei Huang, Dhabaleswar K. Panda:
Efficient asynchronous memory copy operations on multi-core systems and I/OAT. CLUSTER 2007: 159-168 - [c181]Matthew J. Koop, Sayantan Sur, Dhabaleswar K. Panda:
Zero-copy protocol for MPI using infiniband unreliable datagram. CLUSTER 2007: 179-186 - [c180]Hyun-Wook Jin, Sayantan Sur, Lei Chai, Dhabaleswar K. Panda:
Lightweight kernel-level primitives for high-performance MPI intra-node communication over multi-core systems. CLUSTER 2007: 446-451 - [c179]Dhabaleswar K. Panda, Pavan Balaji:
Designing high-end computing systems with InfiniBand and 10-Gigabit Ethernet iWARP. CLUSTER 2007 - [c178]Sayantan Sur, Matthew J. Koop, Lei Chai, Dhabaleswar K. Panda:
Performance Analysis and Evaluation of Mellanox ConnectX InfiniBand Architecture with Multi-Core Platforms. Hot Interconnects 2007: 125-134 - [c177]Sundeep Narravula, Amith R. Mamidala, Abhinav Vishnu, Gopalakrishnan Santhanaraman, Dhabaleswar K. Panda:
High Performance MPI over iWARP: Early Experiences. ICPP 2007: 46 - [c176]Qi Gao, Wei Huang, Matthew J. Koop, Dhabaleswar K. Panda:
Group-based Coordinated Checkpointing for MPI: A Case Study on InfiniBand. ICPP 2007: 47 - [c175]Ranjit Noronha, Lei Chai, Thomas Talpey, Dhabaleswar K. Panda:
Designing NFS with RDMA for Security, Performance and Scalability. ICPP 2007: 49 - [c174]Pavan Balaji, Sitha Bhagvat, Dhabaleswar K. Panda, Rajeev Thakur, William Gropp:
Advanced Flow-control Mechanisms for the Sockets Direct Protocol over InfiniBand. ICPP 2007: 73 - [c173]Matthew J. Koop, Sayantan Sur, Qi Gao, Dhabaleswar K. Panda:
High performance MPI design using unreliable datagram for ultra-scale InfiniBand clusters. ICS 2007: 180-189 - [c172]Ranjit Noronha, Dhabaleswar K. Panda:
Improving Scalability of OpenMP Applications on Multi-core Systems Using Large Page Support. IPDPS 2007: 1-8 - [c171]Karthikeyan Vaidyanathan, Wei Huang, Lei Chai, Dhabaleswar K. Panda:
Designing Efficient Asynchronous Memory Operations Using Hardware Copy Engine: A Case Study with I/OAT. IPDPS 2007: 1-8 - [c170]Karthikeyan Vaidyanathan, Sundeep Narravula, Pavan Balaji, Dhabaleswar K. Panda:
Designing Efficient Systems Services and Primitives for Next-Generation Data-Centers. IPDPS 2007: 1-6 - [c169]Abhinav Vishnu, Brad Benton, Dhabaleswar K. Panda:
High Performance MPI on IBM 12x InfiniBand Architecture. IPDPS 2007: 1-8 - [c168]Abhinav Vishnu, Amith R. Mamidala, Sundeep Narravula, Dhabaleswar K. Panda:
Automatic Path Migration over InfiniBand: Early Experiences. IPDPS 2007: 1-8 - [c167]Karthikeyan Vaidyanathan, Dhabaleswar K. Panda:
Benefits of I/O Acceleration Technology (I/OAT) in Clusters. ISPASS 2007: 220-229 - [c166]Amith R. Mamidala, Sundeep Narravula, Abhinav Vishnu, Gopalakrishnan Santhanaraman, Dhabaleswar K. Panda:
On using connection-oriented vs. connection-less transport for performance and scalability of collective and one-sided operations: trade-offs and impact. PPoPP 2007: 46-54 - [c165]Gopalakrishnan Santhanaraman, Sundeep Narravula, Amith R. Mamidala, Dhabaleswar K. Panda:
MPI-2 One-Sided Usage and Implementation for Read Modify Write Operations: A Case Study with HPCC. PVM/MPI 2007: 251-259 - [c164]Lei Chai, Xiangyong Ouyang, Ranjit Noronha, Dhabaleswar K. Panda:
pNFS/PVFS2 over InfiniBand: early experiences. PDSW 2007: 5-11 - [c163]Wei Huang, Matthew J. Koop, Qi Gao, Dhabaleswar K. Panda:
Virtual machine aware communication libraries for high performance computing. SC 2007: 9 - [c162]Qi Gao, Feng Qin, Dhabaleswar K. Panda:
DMTracker: finding bugs in large-scale parallel programs by detecting anomaly in data movements. SC 2007: 15 - [c161]Pavan Balaji, Wu-chun Feng, Sitha Bhagvat, Dhabaleswar K. Panda, Rajeev Thakur, William Gropp:
Analyzing the impact of supporting out-of-order communication on in-order performance with iWARP. SC 2007: 35 - [c160]Wei Huang, Jiuxing Liu, Matthew J. Koop, Bülent Abali, Dhabaleswar K. Panda:
Nomad: migrating OS-bypass networks in virtual machines. VEE 2007: 158-168 - [e3]John W. Lockwood, Fabrizio Petrini, Ron Brightwell, Dhabaleswar K. Panda:
15th Annual IEEE Symposium on High-Performance Interconnects, HOTI 2007, Stanford, CA, USA, August 22-24, 2007. IEEE Computer Society 2007, ISBN 978-0-7695-2979-0 - 2006
- [j35]Jarek Nieplocha, Vinod Tipparaju, Manojkumar Krishnan, Dhabaleswar K. Panda:
High Performance Remote Memory Access Communication: The Armci Approach. Int. J. High Perform. Comput. Appl. 20(2): 233-253 (2006) - [j34]Fabrizio Petrini, Adam Moody, Juan Fernández Peinador, Eitan Frachtenberg, Dhabaleswar K. Panda:
NIC-based reduction algorithms for large-scale clusters. Int. J. High Perform. Comput. Netw. 4(3/4): 122-136 (2006) - [j33]Pavan Balaji, Wu-chun Feng, Dhabaleswar K. Panda:
Bridging the Ethernet-Ethernot Performance Gap. IEEE Micro 26(3): 24-40 (2006) - [c159]Lei Chai, Ranjit Noronha, Dhabaleswar K. Panda:
MPI over uDAPL: Can High Performance and Portability Exist Across Architectures? CCGRID 2006: 19-26 - [c158]Wei Huang, Gopalakrishnan Santhanaraman, Hyun-Wook Jin, Qi Gao, Dhabaleswar K. Panda:
Design of High Performance MVAPICH2: MPI2 over InfiniBand. CCGRID 2006: 43-48 - [c157]Sundeep Narravula, Hyun-Wook Jin, Karthikeyan Vaidyanathan, Dhabaleswar K. Panda:
Designing Efficient Cooperative Caching Schemes for Multi-Tier Data-Centers over RDMA-enabled Networks. CCGRID 2006: 401-408 - [c156]Lei Chai, Albert Hartono, Dhabaleswar K. Panda:
Designing High Performance and Scalable MPI Intra-node Communication Support for Clusters. CLUSTER 2006 - [c155]Karthikeyan Vaidyanathan, Hyun-Wook Jin, Dhabaleswar K. Panda:
Exploiting RDMA operations for Providing Efficient Fine-Grained Resource Monitoring in Cluster-based Servers. CLUSTER 2006 - [c154]Karthikeyan Vaidyanathan, Sundeep Narravula, Dhabaleswar K. Panda:
DDSS: A Low-Overhead Distributed Data Sharing Substrate for Cluster-Based Data-Centers over Modern Interconnects. HiPC 2006: 472-484 - [c153]Matthew J. Koop, Wei Huang, Abhinav Vishnu, Dhabaleswar K. Panda:
Memory Scalability Evaluation of the Next-Generation Intel Bensley Platform with InfiniBand. Hot Interconnects 2006: 52-60 - [c152]Hyun-Wook Jin, Sundeep Narravula, Karthikeyan Vaidyanathan, Dhabaleswar K. Panda:
NemC: A Network Emulator for Cluster-of-Clusters. ICCCN 2006: 177-182 - [c151]Shuang Liang, Weikuan Yu, Dhabaleswar K. Panda:
High Performance Block I/O for Global File System (GFS) with InfiniBand RDMA. ICPP 2006: 391-398 - [c150]Qi Gao, Weikuan Yu, Wei Huang, Dhabaleswar K. Panda:
Application-Transparent Checkpoint/Restart for MPI Programs over InfiniBand. ICPP 2006: 471-478 - [c149]Wei Huang, Jiuxing Liu, Bülent Abali, Dhabaleswar K. Panda:
A case for high performance computing with virtual machines. ICS 2006: 125-134 - [c148]Pavan Balaji, Sitha Bhagvat, Hyun-Wook Jin, Dhabaleswar K. Panda:
Asynchronous zero-copy communication for synchronous sockets in the sockets direct protocol (SDP) over InfiniBand. IPDPS 2006 - [c147]Pavan Balaji, Karthikeyan Vaidyanathan, Sundeep Narravula, Hyun-Wook Jin, Dhabaleswar K. Panda:
Designing next generation data-centers with advanced communication protocols and systems services. IPDPS 2006 - [c146]Amith R. Mamidala, Lei Chai, Hyun-Wook Jin, Dhabaleswar K. Panda:
Efficient SMP-aware MPI-level broadcast over InfiniBand's hardware multicast. IPDPS 2006 - [c145]Sayantan Sur, Lei Chai, Hyun-Wook Jin, Dhabaleswar K. Panda:
Shared receive queue based scalable MPI design for InfiniBand clusters. IPDPS 2006 - [c144]Weikuan Yu, Qi Gao, Dhabaleswar K. Panda:
Adaptive connection management for scalable MPI over InfiniBand. IPDPS 2006 - [c143]Weikuan Yu, Ranjit Noronha, Shuang Liang, Dhabaleswar K. Panda:
Benefits of high speed interconnects to cluster file systems: a case study with Lustre. IPDPS 2006 - [c142]Sayantan Sur, Hyun-Wook Jin, Lei Chai, Dhabaleswar K. Panda:
RDMA read based rendezvous protocol for MPI over InfiniBand: design alternatives and benefits. PPoPP 2006: 32-39 - [c141]Amith R. Mamidala, Abhinav Vishnu, Dhabaleswar K. Panda:
Efficient Shared Memory and RDMA Based Design for MPI_Allgather over InfiniBand. PVM/MPI 2006: 66-75 - [c140]Leslie S. Perkins, Phil Andrews, Dhabaleswar K. Panda, Dave Morton, Ron Bonica, Nick Henry Werstiuk, Randy Kreiser:
Panel: Data intensive computing. SC 2006: 69 - [c139]Abhinav Vishnu, Prachi Gupta, Amith R. Mamidala, Dhabaleswar K. Panda:
Scalable systems software - A software based approach for providing network fault tolerance in clusters with uDAPL interface: MPI level design and performance evaluation. SC 2006: 85 - [c138]Sayantan Sur, Matthew J. Koop, Dhabaleswar K. Panda:
MPI and communication - High-performance and scalable MPI over InfiniBand with reduced memory usage: an in-depth performance analysis. SC 2006: 105 - [c137]Jiuxing Liu, Wei Huang, Bülent Abali, Dhabaleswar K. Panda:
High Performance VMM-Bypass I/O in Virtual Machines. USENIX ATC, General Track 2006: 29-42 - 2005
- [j32]Gopalakrishnan Santhanaraman, Jiesheng Wu, Wei Huang, Dhabaleswar K. Panda:
Designing Zero-Copy Message Passing Interface Derived Datatype Communication Over Infiniband: Alternative Approaches and Performance Evaluation. Int. J. High Perform. Comput. Appl. 19(2): 129-142 (2005) - [j31]Weikuan Yu, Sayantan Sur, Dhabaleswar K. Panda, Rob T. Aulwes, Richard L. Graham:
High Performance Broadcast Support in La-Mpi Over Quadrics. Int. J. High Perform. Comput. Appl. 19(4): 453-463 (2005) - [j30]Rajkumar Kettimuthu, Vijay Subramani, Srividya Srinivasan, Thiagaraja Gopalsamy, Dhabaleswar K. Panda, P. Sadayappan:
Selective preemption strategies for parallel job scheduling. Int. J. High Perform. Comput. Netw. 3(2/3): 122-152 (2005) - [j29]Hyun-Wook Jin, Pavan Balaji, Chuck Yoo, Jin-Young Choi, Dhabaleswar K. Panda:
Exploiting NIC architectural support for enhancing IP-based protocols on high-performance networks. J. Parallel Distributed Comput. 65(11): 1348-1365 (2005) - [j28]Jiuxing Liu, Amith R. Mamidala, Abhinav Vishnu, Dhabaleswar K. Panda:
Evaluating InfiniBand Performance with PCI Express. IEEE Micro 25(1): 20-29 (2005) - [c136]Sundeep Narravula, Pavan Balaji, Karthikeyan Vaidyanathan, Hyun-Wook Jin, Dhabaleswar K. Panda:
Architecture for caching responses with multiple dynamic dependencies in multi-tier data-centers over InfiniBand. CCGRID 2005: 374-381 - [c135]Ranjit Noronha, Dhabaleswar K. Panda:
Can high performance software DSM systems designed with InfiniBand features benefit from PCI-Express? CCGRID 2005: 945-952 - [c134]Pavan Balaji, Wu-chun Feng, Qi Gao, Ranjit Noronha, Weikuan Yu, Dhabaleswar K. Panda:
Head-to-TOE Evaluation of High-Performance Sockets over Protocol Offload Engines. CLUSTER 2005: 1-10 - [c133]Pavan Balaji, Hyun-Wook Jin, Karthikeyan Vaidyanathan, Dhabaleswar K. Panda:
Supporting iWARP Compatibility and Features for Regular Network Adapters. CLUSTER 2005: 1-10 - [c132]Shuang Liang, Ranjit Noronha, Dhabaleswar K. Panda:
Swapping to Remote Memory over InfiniBand: An Approach using a High Performance Network Block Device. CLUSTER 2005: 1-10 - [c131]Ranjit Noronha, Dhabaleswar K. Panda:
Performance Evaluation of MM5 on Clusters with Modern Interconnects: Scalability and Impact. Euro-Par 2005: 134-145 - [c130]Abhinav Vishnu, Gopalakrishnan Santhanaraman, Wei Huang, Hyun-Wook Jin, Dhabaleswar K. Panda:
Supporting MPI-2 One Sided Communication on Multi-rail InfiniBand Clusters: Design Challenges and Performance Benefits. HiPC 2005: 137-147 - [c129]Sayantan Sur, Uday Bondhugula, Amith R. Mamidala, Hyun-Wook Jin, Dhabaleswar K. Panda:
High Performance RDMA Based All-to-All Broadcast for InfiniBand Clusters. HiPC 2005: 148-157 - [c128]Sayantan Sur, Abhinav Vishnu, Hyun-Wook Jin, Wei Huang, Dhabaleswar K. Panda:
Can Memory-Less Network Adapters Benefit Next-Generation InfiniBand Systems? Hot Interconnects 2005: 45-50 - [c127]Wu-chun Feng, Pavan Balaji, Christopher Baron, Laxmi N. Bhuyan, Dhabaleswar K. Panda:
Performance Characterization of a 10-Gigabit Ethernet TOE. Hot Interconnects 2005: 58-63 - [c126]Hyun-Wook Jin, Sayantan Sur, Lei Chai, Dhabaleswar K. Panda:
LiMIC: Support for High-Performance MPI Intra-node Communication on Linux Cluster. ICPP 2005: 184-191 - [c125]Weikuan Yu, Shuang Liang, Dhabaleswar K. Panda:
High performance support of parallel virtual file system (PVFS2) over Quadrics. ICS 2005: 323-331 - [c124]Lei Chai, Sayantan Sur, Hyun-Wook Jin, Dhabaleswar K. Panda:
Analysis of Design Considerations for Optimizing Multi-Channel MPI over InfiniBand. IPDPS 2005 - [c123]Wei Huang, Gopalakrishnan Santhanaraman, Hyun-Wook Jin, Dhabaleswar K. Panda:
Scheduling of MPI-2 One Sided Operations over InfiniBand. IPDPS 2005 - [c122]Abhinav Vishnu, Amith R. Mamidala, Hyun-Wook Jin, Dhabaleswar K. Panda:
Performance Modeling of Subnet Management on Fat Tree InfiniBand Networks using OpenSM. IPDPS 2005 - [c121]Weikuan Yu, Timothy S. Woodall, Richard L. Graham, Dhabaleswar K. Panda:
Design and Implementation of Open MPI over Quadrics/Elan4. IPDPS 2005 - [c120]Pavan Balaji, Sundeep Narravula, Karthikeyan Vaidyanathan, Hyun-Wook Jin, Dhabaleswar K. Panda:
On the provision of prioritization and soft qos in dynamically reconfigurable shared data-centers over infiniband. ISPASS 2005: 280-289 - [c119]Wei Huang, Gopalakrishnan Santhanaraman, Hyun-Wook Jin, Dhabaleswar K. Panda:
Design Alternatives and Performance Trade-Offs for Implementing MPI-2 over InfiniBand. PVM/MPI 2005: 191-199 - [c118]Lei Chai, Ranjit Noronha, Prachi Gupta, G. Brown, Dhabaleswar K. Panda:
Designing a Portable MPI-2 over Modern Interconnects Using uDAPL Interface. PVM/MPI 2005: 200-208 - [c117]Amith R. Mamidala, Hyun-Wook Jin, Dhabaleswar K. Panda:
Efficient Hardware Multicast Group Management for Multiple MPI Communicators over InfiniBand. PVM/MPI 2005: 388-398 - 2004
- [j27]Adam Wagner, Darius Buntinas, Ron Brightwell, Dhabaleswar K. Panda:
Application-bypass reduction for large-scale clusters. Int. J. High Perform. Comput. Netw. 2(2/3/4): 99-109 (2004) - [j26]Jarek Nieplocha, Vinod Tipparaju, Manojkumar Krishnan, Gopalakrishnan Santhanaraman, Dhabaleswar K. Panda:
Optimisation and performance evaluation of mechanisms for latency tolerance in remote memory access communication on clusters. Int. J. High Perform. Comput. Netw. 2(2/3/4): 198-209 (2004) - [j25]Jiuxing Liu, Jiesheng Wu, Dhabaleswar K. Panda:
High Performance RDMA-Based MPI Implementation over InfiniBand. Int. J. Parallel Program. 32(3): 167-198 (2004) - [j24]Jiuxing Liu, B. Chandrasekaran, Weikuan Yu, Jiesheng Wu, Darius Buntinas, Sushmitha P. Kini, Dhabaleswar K. Panda, Pete Wyckoff:
Microbenchmark Performance Comparison of High-Speed Cluster Interconnects. IEEE Micro 24(1): 42-51 (2004) - [c116]Ranjit Noronha, Dhabaleswar K. Panda:
Designing high performance DSM systems using InfiniBand features. CCGRID 2004: 467-474 - [c115]Jiesheng Wu, Pete Wyckoff, Dhabaleswar K. Panda, Robert B. Ross:
Unifier: unifying cache management and communication buffer management for PVFS over InfiniBand. CCGRID 2004: 523-530 - [c114]Weihang Jiang, Jiuxing Liu, Hyun-Wook Jin, Dhabaleswar K. Panda, William Gropp, Rajeev Thakur:
High performance MPI-2 one-sided communication over InfiniBand. CCGRID 2004: 531-538 - [c113]Dhabaleswar K. Panda:
State of InfiniBand in designing HPC clusters, storage/file systems, and datacenters. CLUSTER 2004: 3 - [c112]Weikuan Yu, Dhabaleswar K. Panda, Darius Buntinas:
Scalable, high-performance NIC-based all-to-all broadcast over Myrinet/GM. CLUSTER 2004: 125-134 - [c111]Amith R. Mamidala, Jiuxing Liu, Dhabaleswar K. Panda:
Efficient Barrier and Allreduce on Infiniband clusters using multicast and adaptive algorithms. CLUSTER 2004: 135-144 - [c110]Adam Wagner, Hyun-Wook Jin, Dhabaleswar K. Panda, Rolf Riesen:
NIC-based offload of dynamic user-defined modules for Myrinet clusters. CLUSTER 2004: 205-214 - [c109]Mohammad Islam, Pavan Balaji, P. Sadayappan, Dhabaleswar K. Panda:
Towards provision of quality of service guarantees in job scheduling. CLUSTER 2004: 245-254 - [c108]Weikuan Yu, Jiesheng Wu, Dhabaleswar K. Panda:
Fast and Scalable Startup of MPI Programs in InfiniBand Clusters. HiPC 2004: 440-449 - [c107]Jiuxing Liu, Amith R. Mamidala, Abhinav Vishnu, Dhabaleswar K. Panda:
Performance evaluation of InfiniBand with PCI Express. Hot Interconnects 2004: 13-19 - [c106]Sayantan Sur, Hyun-Wook Jin, Dhabaleswar K. Panda:
Efficient and Scalable All-to-All Personalized Exchange for InfiniBand-Based Clusters. ICPP 2004: 275-282 - [c105]Qingda Lu, Jiesheng Wu, Dhabaleswar K. Panda, P. Sadayappan:
Applying MPI Derived Datatypes to the NAS Benchmarks: A Case Study. ICPP Workshops 2004: 538-545 - [c104]Jiuxing Liu, Weihang Jiang, Pete Wyckoff, Dhabaleswar K. Panda, David Ashton, Darius Buntinas, William D. Gropp, Brian R. Toonen:
Design and Implementation of MPICH2 over InfiniBand with RDMA Support. IPDPS 2004 - [c103]Jiuxing Liu, Amith R. Mamidala, Dhabaleswar K. Panda:
Fast and Scalable MPI-Level Broadcast Using InfiniBand's Hardware Multicast Support. IPDPS 2004 - [c102]Jiuxing Liu, Dhabaleswar K. Panda:
Implementing Efficient and Scalable Flow Control Schemes in MPI over InfiniBand. IPDPS 2004 - [c101]Vinod Tipparaju, Gopalakrishnan Santhanaraman, Jarek Nieplocha, Dhabaleswar K. Panda:
Host-Assisted Zero-Copy Remote Memory Access Communication on InfiniBand. IPDPS 2004 - [c100]Jiesheng Wu, Pete Wyckoff, Dhabaleswar K. Panda:
High Performance Implementation of MPI Derived Datatype Communication over InfiniBand. IPDPS 2004 - [c99]Weikuan Yu, Darius Buntinas, Richard L. Graham, Dhabaleswar K. Panda:
Efficient and Scalable Barrier over Quadrics and Myrinet with a New NIC-Based Collective Message Passing Protocol. IPDPS 2004 - [c98]Pavan Balaji, Sundeep Narravula, Karthikeyan Vaidyanathan, Savitha Krishnamoorthy, Jiesheng Wu, Dhabaleswar K. Panda:
Sockets Direct Protocol over InfiniBand in clusters: is it beneficial? ISPASS 2004: 28-35 - [c97]Gopalakrishnan Santhanaraman, Jiesheng Wu, Dhabaleswar K. Panda:
Zero-Copy MPI Derived Datatype Communication over InfiniBand. PVM/MPI 2004: 47-56 - [c96]Weihang Jiang, Jiuxing Liu, Hyun-Wook Jin, Dhabaleswar K. Panda, Darius Buntinas, Rajeev Thakur, William D. Gropp:
Efficient Implementation of MPI-2 Passive One-Sided Communication on InfiniBand Clusters. PVM/MPI 2004: 68-76 - [c95]Jiuxing Liu, Abhinav Vishnu, Dhabaleswar K. Panda:
Building Multirail InfiniBand Clusters: MPI-Level Design and Performance Evaluation. SC 2004: 33 - [i2]Weikuan Yu, Darius Buntinas, Richard L. Graham, Dhabaleswar K. Panda:
Efficient and Scalable Barrier over Quadrics and Myrinet with a New NIC-Based Collective Message Passing Protocol. CoRR cs.DC/0402027 (2004) - 2003
- [c94]Darius Buntinas, Dhabaleswar K. Panda, Ron Brightwell:
Application-Bypass Broadcast in MPICH over GM. CCGRID 2003: 2-9 - [c93]Jarek Nieplocha, Vinod Tipparaju, Manojkumar Krishnan, Gopalakrishnan Santhanaraman, Dhabaleswar K. Panda:
Optimizing Mechanisms for Latency Tolerance in Remote Memory Access Communication on Clusters. CLUSTER 2003: 138-147 - [c92]Dhabaleswar K. Panda:
Designing Next Generation Clusters with Infiniband: Opportunities and Challenges. CLUSTER 2003 - [c91]Jiesheng Wu, Pete Wyckoff, Dhabaleswar K. Panda:
Supporting Efficient Noncontiguous Access in PVFS over InfiniBand. CLUSTER 2003: 344- - [c90]Adam Wagner, Darius Buntinas, Dhabaleswar K. Panda, Ron Brightwell:
Application-Bypass Reduction for Large-Scale Clusters. CLUSTER 2003: 404-411 - [c89]B. Chandrasekaran, Pete Wyckoff, Dhabaleswar K. Panda:
MIBA: A Micro-Benchmark Suite for Evaluating InfiniBand Architecture Implementations. Computer Performance Evaluation / TOOLS 2003: 29-46 - [c88]Vinod Tipparaju, Manojkumar Krishnan, Jarek Nieplocha, Gopalakrishnan Santhanaraman, Dhabaleswar K. Panda:
Exploiting Non-blocking Remote Memory Access Communication in Scientific Benchmarks. HiPC 2003: 248-258 - [c87]Jiuxing Liu, Balasubramanian Chandrasekaran, Weikuan Yu, Jiesheng Wu, Darius Buntinas, Sushmitha P. Kini, Pete Wyckoff, Dhabaleswar K. Panda:
Micro-benchmark level performance comparison of high-speed cluster interconnects. Hot Interconnects 2003: 60-65 - [c86]Pavan Balaji, Jiesheng Wu, Tahsin M. Kurç, Ümit V. Çatalyürek, Dhabaleswar K. Panda, Joel H. Saltz:
Impact of High Performance Sockets on Data Intensive Applications. HPDC 2003: 24-33 - [c85]S. Senapathi, B. Chandrasekaran, Don Stredney, Han-Wei Shen, Dhabaleswar K. Panda:
QoS-Aware Middleware for Cluster-Based Servers to support Interactive and Resource-Adaptive Applications. HPDC 2003: 205-215 - [c84]Jiesheng Wu, Pete Wyckoff, Dhabaleswar K. Panda:
PVFS over InfiniBand: Design and Performance Evaluation. ICPP 2003: 125-132 - [c83]Weikuan Yu, Darius Buntinas, Dhabaleswar K. Panda:
High Performance and Reliable NIC-Based Multicast over Myrinet/GM-2. ICPP 2003: 197-204 - [c82]Jiuxing Liu, Jiesheng Wu, Sushmitha P. Kini, Pete Wyckoff, Dhabaleswar K. Panda:
High performance RDMA-based MPI implementation over InfiniBand. ICS 2003: 295-304 - [c81]Rinku Gupta, Pavan Balaji, Dhabaleswar K. Panda, Jarek Nieplocha:
Efficient Collective Operations Using Remote Memory Operations on VIA-Based Clusters. IPDPS 2003: 46 - [c80]Vinod Tipparaju, Jarek Nieplocha, Dhabaleswar K. Panda:
Fast Collective Operations Using Shared and Remote Memory Access Protocols on Clusters. IPDPS 2003: 84 - [c79]Darius Buntinas, Amina Saify, Dhabaleswar K. Panda, Jarek Nieplocha:
Optimizing Synchronization Operations for Remote Memory Communication Systems. IPDPS 2003: 199 - [c78]Ranjit Noronha, Dhabaleswar K. Panda:
Implementing TreadMarks over GM on Myrinet: Challenges, Design Experience, and Performance Evaluation. IPDPS 2003: 200 - [c77]Mohammad Islam, Pavan Balaji, P. Sadayappan, Dhabaleswar K. Panda:
QoPS: A QoS Based Scheme for Parallel Job Scheduling. JSSPP 2003: 252-268 - [c76]Matthew Eric Otey, Srinivasan Parthasarathy, Amol Ghoting, G. Li, Sundeep Narravula, Dhabaleswar K. Panda:
Towards NIC-based intrusion detection. KDD 2003: 723-728 - [c75]Sushmitha P. Kini, Jiuxing Liu, Jiesheng Wu, Pete Wyckoff, Dhabaleswar K. Panda:
Fast and Scalable Barrier Using RDMA and Multicast Mechanisms for InfiniBand-Based Clusters. PVM/MPI 2003: 369-378 - [c74]Jiuxing Liu, B. Chandrasekaran, Jiesheng Wu, Weihang Jiang, Sushmitha P. Kini, Weikuan Yu, Darius Buntinas, Pete Wyckoff, Dhabaleswar K. Panda:
Performance Comparison of MPI Implementations over InfiniBand, Myrinet and Quadrics. SC 2003: 58 - [c73]Adam Moody, Juan Fernández, Fabrizio Petrini, Dhabaleswar K. Panda:
Scalable NIC-based Reduction on Large-scale Clusters. SC 2003: 59 - [c72]Jiesheng Wu, Pete Wyckoff, Dhabaleswar K. Panda:
Demotion-based exclusive caching through demote buffering: design and evaluations over different networks. SNAPI@PACT 2003: 73-80 - [i1]Jiuxing Liu, Weihang Jiang, Pete Wyckoff, Dhabaleswar K. Panda, David Ashton, Darius Buntinas, William Gropp, Brian R. Toonen:
Design and Implementation of MPICH2 over InfiniBand with RDMA Support. CoRR cs.AR/0310059 (2003) - 2002
- [j23]Rajeev Sivaram, Craig B. Stunkel, Dhabaleswar K. Panda:
HIPIQS: A High-Performance Switch Architecture Using Input Queuing. IEEE Trans. Parallel Distributed Syst. 13(3): 275-289 (2002) - [c71]Rinku Gupta, Vinod Tipparaju, Jarek Nieplocha, Dhabaleswar K. Panda:
Efficient Barrier Using Remote Memory Operations on VIA-Based Clusters. CLUSTER 2002: 83- - [c70]Jiesheng Wu, Jiuxing Liu, Pete Wyckoff, Dhabaleswar K. Panda:
Impact of On-Demand Connection Management in MPI over VIA. CLUSTER 2002: 152-159 - [c69]Pavan Balaji, Piyush Shivam, Pete Wyckoff, Dhabaleswar K. Panda:
High Performance User Level Sockets over Gigabit Ethernet. CLUSTER 2002: 179-186 - [c68]Dhabaleswar K. Panda:
Tutorial 2: InfiniBand Architecture and Where it is Headed. Hot Interconnects 2002: 157-158 - [c67]Thiagaraja Gopalsamy, Mukesh Singhal, Dhabaleswar K. Panda, P. Sadayappan:
A Reliable Multicast Algorithm for Mobile Ad Hoc Networks. ICDCS 2002: 563-570 - [c66]Jarek Nieplocha, Vinod Tipparaju, Amina Saify, Dhabaleswar K. Panda:
Protocols and Strategies for Optimizing Performance of Remote Memory Operations on Clusters. IPDPS 2002 - [c65]Dhabaleswar K. Panda, José Duato, Craig B. Stunkel:
Workshop Introduction. IPDPS 2002 - [c64]Piyush Shivam, Pete Wyckoff, Dhabaleswar K. Panda:
Can User-Level Protocols Take Advantage of Multi-CPU NICs?. IPDPS 2002 - [c63]Jiesheng Wu, Dhabaleswar K. Panda:
MPI/IO on DAFS over VIA: Implementation and Performance Evaluation. IPDPS 2002 - [c62]Dhabaleswar K. Panda:
Active Network Interface: Opportunities and Challenges. LCN 2002: 605 - [c61]Naveen Kumar Polapally, Raghu Machiraju, Dhabaleswar K. Panda:
Feature estimation for efficient streaming. VolVis 2002: 107-114 - 2001
- [j22]Bülent Abali, Craig B. Stunkel, Jay Herring, Mohammad Banikazemi, Dhabaleswar K. Panda, Cevdet Aykanat, Yucel Aydogan:
Adaptive Routing on the New Switch Chip for IBM SP Systems. J. Parallel Distributed Comput. 61(9): 1148-1179 (2001) - [j21]Mohammad Banikazemi, Bülent Abali, Lorraine Herger, Dhabaleswar K. Panda:
Design Alternatives for Virtual Interface Architecture and an Implementation on IBM Netfinity NT Cluster. J. Parallel Distributed Comput. 61(11): 1512-1545 (2001) - [j20]Rajeev Sivaram, Ram Kesavan, Dhabaleswar K. Panda, Craig B. Stunkel:
Architectural Support for Efficient Multicasting in Irregular Networks. IEEE Trans. Parallel Distributed Syst. 12(5): 489-513 (2001) - [j19]Ram Kesavan, Dhabaleswar K. Panda:
Efficient Multicast on Irregular Switch-Based Cut-Through Networks with Up-Down Routing. IEEE Trans. Parallel Distributed Syst. 12(8): 808-828 (2001) - [j18]Mohammad Banikazemi, Rama Govindaraju, Robert Blackmore, Dhabaleswar K. Panda:
MPI-LAPI: An Efficient Implementation of MPI for IBM RS/6000 SP Systems. IEEE Trans. Parallel Distributed Syst. 12(10): 1081-1093 (2001) - [j17]N. S. Sundar, Doddaballapur Narasimha-Murthy Jayasimha, Dhabaleswar K. Panda:
Hybrid Algorithms for Complete Exchange in 2D Meshes. IEEE Trans. Parallel Distributed Syst. 12(12): 1201-1218 (2001) - [c60]Mohammad Banikazemi, Jiuxing Liu, Dhabaleswar K. Panda, P. Sadayappan:
Implementing TreadMarks over VIA on Myrinet and Gigabit Ethernet: Challenges, Design Experience, and Performance Evaluation. ICPP 2001: 167-174 - [c59]Abhishek Gulati, Dhabaleswar K. Panda, P. Sadayappan, Pete Wyckoff:
NIC-Based Rate Control for Proportional Bandwidth Allocation in Myrinet Clusters. ICPP 2001: 305-312 - [c58]Mohammad Banikazemi, Jiuxing Liu, S. Kutlug, P. Sadayappan, H. Shah, Dhabaleswar K. Panda:
VIBe: A Micro-benchmark Suite for Evaluating Virtual Interface Architecture (VIA) Implementations. IPDPS 2001: 24 - [c57]Darius Buntinas, Dhabaleswar K. Panda, P. Sadayappan:
Fast NIC-Based Barrier over Myrinet/GM. IPDPS 2001: 52 - [c56]Amit Singhal, Mohammad Banikazemi, P. Sadayappan, Dhabaleswar K. Panda:
Efficient Multicast Algorithms for Heterogeneous Switch-based Irregular Networks of Workstations. IPDPS 2001: 71 - [c55]Darius Buntinas, Dhabaleswar K. Panda, P. Sadayappan:
Performance Benefits of NIC-Based Barrier on Myrinet/GM. IPDPS 2001: 166 - [c54]Piyush Shivam, Pete Wyckoff, Dhabaleswar K. Panda:
EMP: zero-copy OS-bypass NIC-driven gigabit ethernet message passing. SC 2001: 57 - 2000
- [j16]Rajeev Sivaram, Craig B. Stunkel, Dhabaleswar K. Panda:
Implementing Multidestination Worms in Switch-Based Parallel Systems: Architectural Alternatives and Their Impact. IEEE Trans. Parallel Distributed Syst. 11(8): 794-812 (2000) - [c53]Vijay Moorthy, Dhabaleswar K. Panda, P. Sadayappan:
Fast Collective Communication Algorithms for Reflective Memory Network Clusters. CANPC 2000: 100-114 - [c52]Darius Buntinas, Dhabaleswar K. Panda, José Duato, P. Sadayappan:
Broadcast/Multicast over Myrinet Using NIC-Assisted Multidestination Messages. CANPC 2000: 115-129 - [c51]Mohammad Banikazemi, Bülent Abali, Dhabaleswar K. Panda:
Comparison and Evaluation of Design Choices for Implementing the Virtual Interface Architecture (VIA). CANPC 2000: 145-161 - [c50]Praveen Holenarsipur, Vladimir Yarmolenko, José Duato, Dhabaleswar K. Panda, P. Sadayappan:
Characterization and enhancement of Static Mapping Heuristics for Heterogeneous Systems. HiPC 2000: 37-48 - [c49]Mohammad Banikazemi, Dhabaleswar K. Panda:
Can Scatter Communication Take Advantage of Multidestination Message Passing? HiPC 2000: 204-211 - [c48]Vladimir Yarmolenko, José Duato, Dhabaleswar K. Panda, P. Sadayappan:
Characterization and Enhancement of Dynamic Mapping Heuristics for Heterogeneous Systems. ICPP Workshops 2000: 437-446 - [c47]Arindam Paul, Wu-chi Feng, Dhabaleswar K. Panda, P. Sadayappan:
Balancing Web Server Load for Adaptable Video Distribution. ICPP Workshops 2000: 469-478 - [c46]Mohammad Banikazemi, Vijay Moorthy, Dhabaleswar K. Panda, Lorraine Herger, Bülent Abali:
Efficient Virtual Interface Architecture (VIA) Support for the IBM SP Switch-Connected NT Clusters. IPDPS 2000: 33-42 - [c45]Mohammad Banikazemi, Dhabaleswar K. Panda, Craig B. Stunkel, Bülent Abali:
Adaptive Routing in RS/6000 SP-Like Bidirectional Multistage Interconnection Networks. IPDPS 2000: 43-52
1990 – 1999
- 1999
- [j15]Donglai Dai, Dhabaleswar K. Panda:
Exploiting the Benefits of Multiple-Path Network DSM Systems: Architectural Alternatives and Performance Evaluation. IEEE Trans. Computers 48(2): 236-244 (1999) - [j14]Dhabaleswar K. Panda, Sanjay Singal, Ram Kesavan:
Multidestination Message Passing in Wormhole k-ary n-cube Networks with Base Routing Conformed Paths. IEEE Trans. Parallel Distributed Syst. 10(1): 76-96 (1999) - [j13]Ram Kesavan, Dhabaleswar K. Panda:
Multiple Multicast with Minimized Node Contention on Wormhole k-ary n-cube Networks. IEEE Trans. Parallel Distributed Syst. 10(4): 371-393 (1999) - [c44]Matthew G. Jacunski, Vijay Moorthy, Peter P. Ware, Manoj Pillai, Dhabaleswar K. Panda, P. Sadayappan:
Low Latency Message-Passing for Reflective Memory Networks. CANPC 1999: 211-224 - [c43]Mohammad Banikazemi, Jayanthi Sampathkumar, Sandeep Prabhu, Dhabaleswar K. Panda, P. Sadayappan:
Communication Modeling of Heterogeneous Networks of Workstations for Performance Characterization of Collective Operations. Heterogeneous Computing Workshop 1999: 125- - [c42]Vijay Moorthy, Matthew G. Jacunski, Manoj Pillai, Peter P. Ware, Dhabaleswar K. Panda, Thomas W. Page Jr., P. Sadayappan, V. Nagarajan, Johns Daniel:
Low-Latency Message Passing on Workstation Clusters using SCRAMNet. IPPS/SPDP 1999: 148-152 - [c41]Mohammad Banikazemi, Rama Govindaraju, Robert Blackmore, Dhabaleswar K. Panda:
Implementing Efficient MPI on LAPI for IBM RS/6000 SP Systems: Experiences and Performance Evaluation. IPPS/SPDP 1999: 183-190 - [c40]Matthew G. Jacunski, P. Sadayappan, Dhabaleswar K. Panda:
All-to-All Broadcast on Switch-Based Clusters of Workstations. IPPS/SPDP 1999: 325-329 - 1998
- [j12]Ravi Prakash, Dhabaleswar K. Panda:
Designing communication strategies for heterogeneous parallel systems. Parallel Comput. 24(14): 2035-2052 (1998) - [j11]Debashis Basak, Dhabaleswar K. Panda:
Alleviating Consumption Channel Bottleneck in Wormhole-Routed k-ary n-Cube Systems. IEEE Trans. Parallel Distributed Syst. 9(5): 481-496 (1998) - [j10]Rajeev Sivaram, Dhabaleswar K. Panda, Craig B. Stunkel:
Efficient Broadcast and Multicast on Multistage Interconnection Networks Using Multiport Encoding. IEEE Trans. Parallel Distributed Syst. 9(10): 1004-1028 (1998) - [c39]Federico Silla, Manuel P. Malumbres, José Duato, Donglai Dai, Dhabaleswar K. Panda:
Impact of Adaptivity on the Behaviour of Networks of Workstations under Bursty Traffic. ICPP 1998: 88-95 - [c38]Rajeev Sivaram, Ram Kesavan, Dhabaleswar K. Panda, Craig B. Stunkel:
Where to Provide Support for Efficient Multicasting in Irregular Networks: Network Interface or Switch? ICPP 1998: 452-459 - [c37]Mohammad Banikazemi, Vijay Moorthy, Dhabaleswar K. Panda:
Efficient Collective Communication on Heterogeneous Networks of Workstations. ICPP 1998: 460-467 - [c36]Aravind Bala, Darshat Shah, Wu-chi Feng, Dhabaleswar K. Panda:
Experiences with Software MPEG-2 Video Decompression on an SMP PC. ICPP Workshops 1998: 29-37 - [c35]Rajeev Sivaram, Craig B. Stunkel, Dhabaleswar K. Panda:
HIPIQS: A High-Performance Switch Architecture Using Input Queuing. IPPS/SPDP 1998: 134-143 - [e2]Dhabaleswar K. Panda, Craig B. Stunkel:
Network-Based Parallel Computing: Communication, Architecture, and Applications, Second International Workshop, CANPC '98, Las Vegas, Nevada, USA, January 31 - February 1, 1998, Proceedings. Lecture Notes in Computer Science 1362, Springer 1998, ISBN 3-540-64140-8 - 1997
- [j9]Dhabaleswar K. Panda, Lionel M. Ni:
Special Issue on Workstation Clusters and Network-Based Computing: Guest Editors' Introduction. J. Parallel Distributed Comput. 40(1): 1-3 (1997) - [j8]Dhabaleswar K. Panda, Lionel M. Ni:
Special Issue on Workstation Clusters and Network-Based Computing: Guest Editors' Introduction. J. Parallel Distributed Comput. 43(2): 63-64 (1997) - [j7]Yu-Chee Tseng, Ting-Hsien Lin, Sandeep K. S. Gupta, Dhabaleswar K. Panda:
Bandwidth-Optimal Complete Exchange on Wormhole-Routed 2D/3D Torus Networks: A Diagonal-Propagation Approach. IEEE Trans. Parallel Distributed Syst. 8(4): 380-396 (1997) - [c34]Abdel-Halim Smai, Dhabaleswar K. Panda, Lars-Erik Thorelli:
Prioritized demand multiplexing (PDM): a low-latency virtual channel flow control framework for prioritized traffic. HiPC 1997: 449-454 - [c33]Ram Kesavan, Kiran Bondalapati, Dhabaleswar K. Panda:
Multicast on Irregular Switch-Based Networks with Wormhole Routing. HPCA 1997: 48-57 - [c32]Ram Kesavan, Dhabaleswar K. Panda:
Optimal Multicast with Packetization and Network Interface Support. ICPP 1997: 370-377 - [c31]Donglai Dai, Dhabaleswar K. Panda:
How Much Does Network Contention Affect Distributed Shared Memory Performance? ICPP 1997: 454-461 - [c30]Rajeev Sivaram, Craig B. Stunkel, Dhabaleswar K. Panda:
A Reliable Hardware Barrier Synchronization Scheme. IPPS 1997: 274-280 - [c29]Craig B. Stunkel, Rajeev Sivaram, Dhabaleswar K. Panda:
Implementing Multidestination Worms in Switch-Based Parallel Systems: Architectural Alternatives and their Impact. ISCA 1997: 50-61 - [c28]Rajeev Sivaram, Dhabaleswar K. Panda, Craig B. Stunkel:
Multicasting in Irregular Networks with Cut-Through Switches Using Tree-Based Multidestination Worms. PCRCW 1997: 39-54 - [c27]Dhabaleswar K. Panda:
Designing High-Performance Communication Subsystems: Top Five Problems to Solve and Five Problems Not to Solve During the Next Five Years (Panel). PCRCW 1997: 153-158 - [c26]Donglai Dai, Dhabaleswar K. Panda:
How Can We Design Better Networks for DSM Systems? PCRCW 1997: 171-184 - [c25]Ram Kesavan, Dhabaleswar K. Panda:
Multicasting on Switch-Based Irregular Networks Using Multi-drop Path-Based Multidestination Worms. PCRCW 1997: 217-230 - [c24]Dhabaleswar K. Panda, Debashis Basak, Donglai Dai, Ram Kesavan, Rajeev Sivaram, Mohammad Banikazemi, Vijay Moorthy:
Simulation of Modern Parallel Systems: A CSIM-based Approach. WSC 1997: 1013-1020 - [e1]Dhabaleswar K. Panda, Craig B. Stunkel:
Communication and Architectural Support for Network-Based Parallel Computing, First International Workshop, CANPC '97, San Antonio, Texas, USA, February 1-2, 1997, Proceedings. Lecture Notes in Computer Science 1199, Springer 1997, ISBN 3-540-62573-9 - 1996
- [j6]Yu-Chee Tseng, Dhabaleswar K. Panda, Ten-Hwang Lai:
A Trip-Based Multicasting Model in Wormhole-Routed Networks with Virtual Channels. IEEE Trans. Parallel Distributed Syst. 7(2): 138-150 (1996) - [j5]Debashis Basak, Dhabaleswar K. Panda:
Designing Clustered Multiprocessor Systems under Packaging and Technological Advancements. IEEE Trans. Parallel Distributed Syst. 7(9): 962-978 (1996) - [c23]Donglai Dai, Dhabaleswar K. Panda:
Reducing Cache Invalidation Overheads in Wormhole Routed DSMs Using Multidestination Message Passing. ICPP, Vol. 1 1996: 138-145 - [c22]Ram Kesavan, Dhabaleswar K. Panda:
Minimizing Node Contention in Multiple Multicast on Wormhole k-ary N-Cube Networks. ICPP, Vol. 1 1996: 188-195 - [c21]Debashis Basak, Dhabaleswar K. Panda:
Designing Processor-Cluster Based Systems: Interplay Between Organizations and Broadcasting Algorithms. ICPP, Vol. 1 1996: 271-274 - [c20]N. S. Sundar, Doddaballapur Narasimha-Murthy Jayasimha, Dhabaleswar K. Panda, P. Sadayappan:
Hybrid Algorithms for Complete Exchange in 2D Meshes. International Conference on Supercomputing 1996: 181-188 - [c19]Debashis Basak, Dhabaleswar K. Panda, Mohammad Banikazemi:
Benefits of Processor Clustering in Designing Large Parallel Systems: When and How? IPPS 1996: 286-290 - [c18]Rajeev Sivaram, Dhabaleswar K. Panda, Craig B. Stunkel:
Efficient broadcast and multicast on multistage interconnection networks using multiport encoding. SPDP 1996: 36-45 - 1995
- [j4]Dhabaleswar K. Panda:
Fast barrier synchronization in wormhole k-ary n-cube networks with multidestination worms. Future Gener. Comput. Syst. 11(6): 585-602 (1995) - [c17]Dhabaleswar K. Panda:
Fast Barrier Synchronization in Wormhole k-ary n-cube Networks with Multidestination Worms. HPCA 1995: 200-209 - [c16]Yu-Chee Tseng, Sandeep K. S. Gupta, Dhabaleswar K. Panda:
An efficient scheme for complete exchange in 2D tori. IPPS 1995: 532-536 - [c15]Dhabaleswar K. Panda:
Global reduction in wormhole k-ary n-cube networks with multidestination exchange worms. IPPS 1995: 652-659 - 1994
- [c14]Debashis Basak, Dhabaleswar K. Panda:
Designing Large Hierarchical Multiprocessor Systems under Processor, Interconnection, and Packaging Advancements. ICPP (1) 1994: 63-66 - [c13]Vibha A. Dixit-Radiya, Dhabaleswar K. Panda:
Clustering and Intra-Processor Scheduling for Explicitly-Parallel Programs on Distributed-Memory Systems. IPPS 1994: 609-616 - [c12]Ravi Prakash, Dhabaleswar K. Panda:
Architectural issues in designing heterogeneous parallel systems with passive star-coupled optical interconnection. ISPAN 1994: 246-253 - [c11]Dhabaleswar K. Panda, Sanjay Singal, Pradeep Prabhakaran:
Multidestination Message Passing Mechanism Conforming to Base Wormhole Routing Scheme. PCRCW 1994: 131-145 - 1993
- [c10]Shobana Balakrishnan, Dhabaleswar K. Panda:
Impact of Multiple Consumption Channels on Wormhole Routed k-ary n-cube Networks. IPPS 1993: 163-167 - [c9]Yu-Chee Tseng, Dhabaleswar K. Panda:
A Trip-Based Multicasting Model for Wormhole-Routed Networks with Virtual Channels. IPPS 1993: 276-283 - [c8]Sandeep K. S. Gupta, Dhabaleswar K. Panda:
Barrier Synchronization in Distributed-Memory Multiprocessing Using Rendezvous Primitives. IPPS 1993: 501-505 - [c7]Vibha A. Dixit-Radiya, Dhabaleswar K. Panda:
Task Assignment on Distributed-Memory Systems with Adaptive Wormhole Routing. SPDP 1993: 674-681 - [c6]Debashis Basak, Dhabaleswar K. Panda:
Scalable Architectures with k-ary n-Cube Cluster-c organization. SPDP 1993: 780-787 - 1991
- [j3]Kai Hwang, Dhabaleswar K. Panda:
Architectural Design of Orthogonal Multiprocessor for Multidimensional Information Processing. J. Inf. Sci. Eng. 7(4): 459-485 (1991) - [j2]Dhabaleswar K. Panda, Kai Hwang:
Fast Data Manipulation in Multiprocessors Using Parallel Pipelined Memories. J. Parallel Distributed Comput. 12(2): 130-145 (1991) - [c5]Dhabaleswar K. Panda, Kai Hwang:
Message Vectorization for Converting Multicomputer Programs to Shared-Memory Multiprocessors. ICPP (1) 1991: 204-211 - 1990
- [c4]Dhabaleswar Kumar Panda, Kai Hwang:
Reconfigurable vector register windows for fast matrix computation on the orthogonal multiprocessor. ASAP 1990: 202-213 - [c3]Sharad Mehrotra, Chien-Ming Cheng, Kai Hwang, Michel Dubois, Dhabaleswar K. Panda:
Algorithm-Driven Simulation and Performance Projection of a RISC-based Orthogonal Multiprocessor. ICPP (3) 1990: 244-253 - [c2]Kai Hwang, Michel Dubois, Dhabaleswar K. Panda, S. Rao, Shisheng Shang, Aydin Üresin, W. Mao, H. Nair, M. Lytwyn, F. Hsieh, J. Liu, Sharad Mehrotra, Chien-Ming Cheng:
OMP: a RISC-based multiprocessor using orthogonal-access memories and multiple spanning buses. ICS 1990: 7-22
1980 – 1989
- 1989
- [c1]Kai Hwang, Dhabaleswar K. Panda:
Optical arithmetic using high-radix symbolic substitution rules. IEEE Symposium on Computer Arithmetic 1989: 226-232 - 1988
- [j1]Dhabaleswar K. Panda, T. Viswanathan:
A Parallel-Serial Binary Arbitration Scheme for Collision-Free Multi-Access Techniques. Comput. Networks 15: 217-223 (1988)
last updated on 2025-02-20 01:46 CET by the dblp team
all metadata released as open data under CC0 1.0 license