Stanimire Tomov
Person information
- affiliation: University of Tennessee, Knoxville, TN, USA
2020 – today
- 2024
- [j55]Piotr Luszczek, Ahmad Abdelfattah, Hartwig Anzt, Atsushi Suzuki, Stanimire Tomov:
Batched sparse and mixed-precision linear algebra interface for efficient use of GPU hardware accelerators in scientific applications. Future Gener. Comput. Syst. 160: 359-374 (2024) - [c115]Julian Halloy, Stephen Qiu, Stanimire Tomov, Kwai Wong:
PyMAGMA: A Python Interface for MAGMA. PEARC 2024: 40:1-40:4 - [c114]Noah Dahle, Meghan Kwon, Kwai Wong, Stanimire Tomov:
Using Graph Neural Networks to Predict Gene-Autoimmune Disease Associations. PEARC 2024: 93:1-93:4 - [c113]Kristina Wilson, Clifford Li, Hon Man Lau, Kwai Wong, Stanimire Tomov:
Implementing Single-precision and Half-precision Tensor Operations. PEARC 2024: 109:1-109:4 - 2023
- [c112]Wissam M. Sid-Lakhdar, Sébastien Cayrols, Daniel Bielich, Ahmad Abdelfattah, Piotr Luszczek, Mark Gates, Stanimire Tomov, Hans Johansen, David B. Williams-Young, Timothy A. Davis, Jack J. Dongarra, Hartwig Anzt:
PAQR: Pivoting Avoiding QR factorization. IPDPS 2023: 322-332 - [c111]Ahmad Abdelfattah, Stanimire Tomov, Piotr Luszczek, Hartwig Anzt, Jack J. Dongarra:
GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure. SC Workshops 2023: 1670-1679 - [d5]Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Ratnayaka, Rezgar Shakeri, Jeremy L. Thompson, Stanimire Tomov, James Wright:
libCEED: Efficient Extensible Discretization. Version v0.12.0. Zenodo, 2023 [all versions] - 2022
- [c110]Sébastien Cayrols, Jiali Li, George Bosilca, Stanimire Tomov, Alan Ayala, Jack J. Dongarra:
Lossy all-to-all exchange for accelerating parallel 3-D FFTs on hybrid architectures with GPUs. CLUSTER 2022: 152-160 - [c109]Chiang-Heng Chien, Hongyi Fan, Ahmad Abdelfattah, Elias P. Tsigaridas, Stanimire Tomov, Benjamin B. Kimia:
GPU-Based Homotopy Continuation for Minimal Problems in Computer Vision. CVPR 2022: 15744-15755 - [c108]Ahmad Abdelfattah, Stan Tomov, Jack J. Dongarra:
Batch QR Factorization on GPUs: Design, Optimization, and Tuning. ICCS (1) 2022: 60-74 - [c107]Alan Ayala, Stan Tomov, Miroslav Stoyanov, Azzam Haidar, Jack J. Dongarra:
Performance Analysis of Parallel FFT on Large Multi-GPU Systems. IPDPS Workshops 2022: 372-381 - [c106]Ahmad Abdelfattah, Pieter Ghysels, Wajih Boukaram, Stanimire Tomov, Xiaoye Sherry Li, Jack J. Dongarra:
Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers. SC 2022: 26:1-26:14 - [c105]Anna Fortenberry, Stanimire Tomov:
Extending MAGMA Portability with OneAPI. WACCPD@SC 2022: 22-31 - [d4]Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Ratnayaka, Rezgar Shakeri, Jeremy L. Thompson, Stanimire Tomov, James Wright:
libCEED: Efficient Extensible Discretization. Version 0.11.0. Zenodo, 2022 [all versions] - 2021
- [j54]Zafar Iqbal, Saeid Nooshabadi, Ichitaro Yamazaki, Stanimire Tomov, Jack J. Dongarra:
Exploiting Block Structures of KKT Matrices for Efficient Solution of Convex Optimization Problems. IEEE Access 9: 116604-116611 (2021) - [j53]Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin C. Carson, Terry Cojean, Jack J. Dongarra, Alyson Fox, Mark Gates, Nicholas J. Higham, Xiaoye S. Li, Jennifer A. Loe, Piotr Luszczek, Srikara Pranesh, Siva Rajamanickam, Tobias Ribizel, Barry F. Smith, Kasia Swirydowicz, Stephen J. Thomas, Stanimire Tomov, Yaohung M. Tsai, Ulrike Meier Yang:
A survey of numerical linear algebra methods utilizing mixed-precision arithmetic. Int. J. High Perform. Comput. Appl. 35(4) (2021) - [j52]Tzanio V. Kolev, Paul F. Fischer, Misun Min, Jack J. Dongarra, Jed Brown, Veselin Dobrev, Tim Warburton, Stanimire Tomov, Mark S. Shephard, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Noel Chalmers, Yohann Dudouit, Ali Karakus, Ian Karlin, Stefan Kerkemeier, Yu-Hsiang Lan, David S. Medina, Elia Merzari, Aleksandr Obabko, Will Pazner, Thilina Rathnayake, Cameron W. Smith, Lukas Spies, Kasia Swirydowicz, Jeremy L. Thompson, Ananias Tomboulides, Vladimir Z. Tomov:
Efficient exascale discretizations: High-order finite element methods. Int. J. High Perform. Comput. Appl. 35(6): 527-552 (2021) - [j51]Jack J. Dongarra, Mark Gates, Piotr Luszczek, Stanimire Tomov:
Translational process: Mathematical software perspective. J. Comput. Sci. 52: 101216 (2021) - [j50]Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie N. Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Rathnayake, Jeremy L. Thompson, Stan Tomov:
libCEED: Fast algebra for high-order element-based discretizations. J. Open Source Softw. 6(63): 2945 (2021) - [j49]Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Ryan Bleile, Jed Brown, Jean-Sylvain Camier, Robert Carson, Noel Chalmers, Veselin Dobrev, Yohann Dudouit, Paul F. Fischer, Ali Karakus, Stefan Kerkemeier, Tzanio V. Kolev, Yu-Hsiang Lan, Elia Merzari, Misun Min, Malachi Phillips, Thilina Rathnayake, Robert N. Rieben, Thomas Stitt, Ananias Tomboulides, Stanimire Tomov, Vladimir Z. Tomov, Arturo Vargas, Tim Warburton, Kenneth Weiss:
GPU algorithms for Efficient Exascale Discretizations. Parallel Comput. 108: 102841 (2021) - [j48]Ahmad Abdelfattah, Timothy B. Costa, Jack J. Dongarra, Mark Gates, Azzam Haidar, Sven Hammarling, Nicholas J. Higham, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Mawussi Zounon:
A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines. ACM Trans. Math. Softw. 47(3): 21:1-21:23 (2021) - [c104]Alan Ayala, Stan Tomov, Miroslav Stoyanov, Azzam Haidar, Jack J. Dongarra:
Accelerating Multi-Process Communication for Parallel 3-D FFT. ExaMPI@SC 2021: 46-53 - [c103]Daniel Sharp, Miroslav Stoyanov, Stanimire Tomov, Jack J. Dongarra:
A More Portable HeFFTe: Implementing a Fallback Algorithm for Scalable Fourier Transforms. HPEC 2021: 1-5 - [c102]Alan Ayala, Stanimire Tomov, Miroslav Stoyanov, Jack J. Dongarra:
Scalability Issues in FFT Computation. PaCT 2021: 279-287 - [d3]Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Ratnayaka, Jeremy L. Thompson, Stan Tomov:
CEED/libCEED: v0.9.0. Version v0.9.0. Zenodo, 2021 [all versions] - [d2]Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Ratnayaka, Jeremy L. Thompson, Stanimire Tomov:
libCEED: Efficient Extensible Discretization. Version 0.10.0. Zenodo, 2021 [all versions] - [d1]Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Ratnayaka, Jeremy L. Thompson, Stanimire Tomov:
libCEED: Efficient Extensible Discretization. Version 0.10.1. Zenodo, 2021 [all versions] - [i12]Tzanio V. Kolev, Paul F. Fischer, Misun Min, Jack J. Dongarra, Jed Brown, Veselin Dobrev, Tim Warburton, Stanimire Tomov, Mark S. Shephard, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Noel Chalmers, Yohann Dudouit, Ali Karakus, Ian Karlin, Stefan Kerkemeier, Yu-Hsiang Lan, David S. Medina, Elia Merzari, Aleksandr Obabko, Will Pazner, Thilina Rathnayake, Cameron W. Smith, Lukas Spies, Kasia Swirydowicz, Jeremy L. Thompson, Ananias Tomboulides, Vladimir Z. Tomov:
Efficient Exascale Discretizations: High-Order Finite Element Methods. CoRR abs/2109.04996 (2021) - [i11]Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Ryan Bleile, Jed Brown, Jean-Sylvain Camier, Robert Carson, Noel Chalmers, Veselin Dobrev, Yohann Dudouit, Paul F. Fischer, Ali Karakus, Stefan Kerkemeier, Tzanio V. Kolev, Yu-Hsiang Lan, Elia Merzari, Misun Min, Malachi Phillips, Thilina Rathnayake, Robert N. Rieben, Thomas Stitt, Ananias Tomboulides, Stanimire Tomov, Vladimir Z. Tomov, Arturo Vargas, Tim Warburton, Kenneth Weiss:
GPU Algorithms for Efficient Exascale Discretizations. CoRR abs/2109.05072 (2021) - [i10]Chiang-Heng Chien, Hongyi Fan, Ahmad Abdelfattah, Elias P. Tsigaridas, Stanimire Tomov, Benjamin B. Kimia:
GPU-Based Homotopy Continuation for Minimal Problems in Computer Vision. CoRR abs/2112.03444 (2021) - 2020
- [j47]Yuechao Lu, Ichitaro Yamazaki, Fumihiko Ino, Yasuyuki Matsushita, Stanimire Tomov, Jack J. Dongarra:
Reducing the amount of out-of-core data access for GPU-accelerated randomized SVD. Concurr. Comput. Pract. Exp. 32(19) (2020) - [j46]Mohammed A. Al Farhan, Ahmad Abdelfattah, Stanimire Tomov, Mark Gates, Dalal Sukkari, Azzam Haidar, Robert Rosenberg, Jack J. Dongarra:
MAGMA templates for scalable linear algebra on emerging architectures. Int. J. High Perform. Comput. Appl. 34(6) (2020) - [j45]Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Matrix multiplication on batches of small matrices in half and half-complex precisions. J. Parallel Distributed Comput. 145: 188-201 (2020) - [j44]Hartwig Anzt, Terry Cojean, Chen Yen-Chen, Jack J. Dongarra, Goran Flegar, Pratik Nayak, Stanimire Tomov, Yuhsiang M. Tsai, Weichung Wang:
Load-balancing Sparse Matrix Vector Product Kernels on GPUs. ACM Trans. Parallel Comput. 7(1): 2:1-2:26 (2020) - [c101]Cade Brown, Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs. HPEC 2020: 1-7 - [c100]Ahmad Abdelfattah, Stan Tomov, Jack J. Dongarra:
Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices Using GPUs. ICCS (2) 2020: 237-250 - [c99]Alan Ayala, Stanimire Tomov, Azzam Haidar, Jack J. Dongarra:
heFFTe: Highly Efficient FFT for Exascale. ICCS (1) 2020: 262-275 - [c98]Florent Lopez, Edmond Chow, Stanimire Tomov, Jack J. Dongarra:
Asynchronous SGD for DNN training on Shared-memory Parallel Architectures. IPDPS Workshops 2020: 995-998 - [c97]Natalie Beams, Ahmad Abdelfattah, Stan Tomov, Jack J. Dongarra, Tzanio V. Kolev, Yohann Dudouit:
High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs. ScalA@SC 2020: 53-60 - [c96]Rick Archibald, Edmond Chow, Eduardo F. D'Azevedo, Jack J. Dongarra, Markus Eisenbach, Rocco Febbo, Florent Lopez, Daniel Nichols, Stanimire Tomov, Kwai Wong, Junqi Yin:
Integrating Deep Learning in Domain Sciences at Exascale. SMC 2020: 35-50 - [i9]Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin C. Carson, Terry Cojean, Jack J. Dongarra, Mark Gates, Thomas Grützmacher, Nicholas J. Higham, Xiaoye Sherry Li, Neil Lindquist, Yang Liu, Jennifer A. Loe, Piotr Luszczek, Pratik Nayak, Srikara Pranesh, Sivasankaran Rajamanickam, Tobias Ribizel, Barry Smith, Kasia Swirydowicz, Stephen J. Thomas, Stanimire Tomov, Yaohung M. Tsai, Ichitaro Yamazaki, Ulrike Meier Yang:
A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic. CoRR abs/2007.06674 (2020) - [i8]Rick Archibald, Edmond Chow, Eduardo F. D'Azevedo, Jack J. Dongarra, Markus Eisenbach, Rocco Febbo, Florent Lopez, Daniel Nichols, Stanimire Tomov, Kwai Wong, Junqi Yin:
Integrating Deep Learning in Domain Sciences at Exascale. CoRR abs/2011.11188 (2020)
2010 – 2019
- 2019
- [j43]Azzam Haidar, Heike Jagode, Phil Vaccaro, Asim YarKhan, Stanimire Tomov, Jack J. Dongarra:
Investigating power capping toward energy-efficient scientific applications. Concurr. Comput. Pract. Exp. 31(6) (2019) - [j42]M. Graham Lopez, Wayne Joubert, Verónica G. Vergara Larrea, Oscar R. Hernandez, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Evaluation of directive-based performance portable programming models. Int. J. High Perform. Comput. Netw. 14(2): 165-182 (2019) - [j41]Ian Masliah, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Marc Baboulin, Joël Falcou, Jack J. Dongarra:
Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices. Parallel Comput. 81: 1-21 (2019) - [j40]Dmitry Zaitsev, Stanimire Tomov, Jack J. Dongarra:
Solving Linear Diophantine Systems on Parallel Architectures. IEEE Trans. Parallel Distributed Syst. 30(5): 1158-1169 (2019) - [c95]Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Progressive Optimization of Batched LU Factorization on GPUs. HPEC 2019: 1-6 - [c94]Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs. IPDPS 2019: 111-122 - [c93]Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Towards Half-Precision Computation for Complex Matrices: A Case Study for Mixed Precision Solvers on GPUs. ScalA@SC 2019: 17-24 - [c92]Daniel Nichols, Nathalie-Sofia Tomov, Frank Betancourt, Stanimire Tomov, Kwai Wong, Jack J. Dongarra:
MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing. ISC Workshops 2019: 490-503 - [c91]Kwai Wong, Stanimire Tomov, Jack J. Dongarra:
Hands-On Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning for Emerging Environments. ISC Workshops 2019: 643-655 - [c90]Frank Betancourt, Kwai Wong, Efosa Asemota, Quindell Marshall, Daniel Nichols, Stanimire Tomov:
openDIEL: A Parallel Workflow Engine and Data Analytics Framework. PEARC 2019: 20:1-20:7 - [c89]Daniel Nichols, Kwai Wong, Stanimire Tomov, Lucien Ng, Sihan Chen, Alex Gessinger:
MagmaDNN: Accelerated Deep Learning Using MAGMA. PEARC 2019: 71:1-71:6 - 2018
- [j39]Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Batched one-sided factorizations of tiny matrices using GPUs: Challenges and countermeasures. J. Comput. Sci. 26: 226-236 (2018) - [j38]Tingxing Dong, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Accelerating the SVD bi-diagonalization of a batch of small matrices using GPUs. J. Comput. Sci. 26: 237-245 (2018) - [j37]Mark Gates, Stanimire Tomov, Jack J. Dongarra:
Accelerating the SVD two stage bidiagonal reduction and divide and conquer using GPUs. Parallel Comput. 74: 3-18 (2018) - [j36]Jack J. Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki:
The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale. SIAM Rev. 60(4): 808-865 (2018) - [j35]Azzam Haidar, Ahmad Abdelfattah, Mawussi Zounon, Stanimire Tomov, Jack J. Dongarra:
A Guide for Achieving High Performance with Very Small Matrices on GPU: A Case Study of Batched LU and Cholesky Factorizations. IEEE Trans. Parallel Distributed Syst. 29(5): 973-984 (2018) - [j34]Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs. IEEE Trans. Parallel Distributed Syst. 29(12): 2700-2712 (2018) - [c88]Anumeena Sorna, Xiaohe Cheng, Eduardo F. D'Azevedo, Kwai Wong, Stanimire Tomov:
Optimizing the Fast Fourier Transform Using Mixed Precision on Tensor Core Hardware. HiPC Workshops 2018: 3-7 - [c87]Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Optimizing GPU Kernels for Irregular Batch Workloads: A Case Study for Cholesky Factorization. HPEC 2018: 1-7 - [c86]Azzam Haidar, Ahmad Abdelfattah, Mawussi Zounon, Panruo Wu, Srikara Pranesh, Stanimire Tomov, Jack J. Dongarra:
The Design of Fast and Energy-Efficient Linear Solvers: On the Potential of Half-Precision Arithmetic and Iterative Refinement Techniques. ICCS (1) 2018: 586-600 - [c85]Ichitaro Yamazaki, Ahmad Abdelfattah, Akihiro Ida, Satoshi Ohshima, Stanimire Tomov, Rio Yokota, Jack J. Dongarra:
Performance of Hierarchical-matrix BiCGStab Solver on GPU Clusters. IPDPS 2018: 930-939 - [c84]Azzam Haidar, Stanimire Tomov, Jack J. Dongarra, Nicholas J. Higham:
Harnessing GPU tensor cores for fast FP16 arithmetic to speed up mixed-precision iterative refinement solvers. SC 2018: 47:1-47:11 - [i7]Nathalie-Sofia Tomov, Stanimire Tomov:
On Deep Neural Networks for Detecting Heart Disease. CoRR abs/1808.07168 (2018) - 2017
- [j33]Ichitaro Yamazaki, Stanimire Tomov, Jack J. Dongarra:
Non-GPU-resident symmetric indefinite factorization. Concurr. Comput. Pract. Exp. 29(5) (2017) - [j32]Marc Baboulin, Jack J. Dongarra, Adrien Rémy, Stanimire Tomov, Ichitaro Yamazaki:
Solving dense symmetric indefinite systems using GPUs. Concurr. Comput. Pract. Exp. 29(9) (2017) - [j31]Jack J. Dongarra, Stanimire Tomov, Piotr Luszczek, Jakub Kurzak, Mark Gates, Ichitaro Yamazaki, Hartwig Anzt, Azzam Haidar, Ahmad Abdelfattah:
With Extreme Computing, the Rules Have Changed. Comput. Sci. Eng. 19(3): 52-62 (2017) - [j30]Ichitaro Yamazaki, Saeid Nooshabadi, Stanimire Tomov, Jack J. Dongarra:
Structure-Aware Linear Solver for Realtime Convex Optimization for Embedded Systems. IEEE Embed. Syst. Lett. 9(3): 61-64 (2017) - [j29]Hartwig Anzt, Stanimire Tomov, Jack J. Dongarra:
On the performance and energy efficiency of sparse linear algebra on GPUs. Int. J. High Perform. Comput. Appl. 31(5): 375-390 (2017) - [j28]Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Fast Cholesky factorization on GPUs for batch and native modes in MAGMA. J. Comput. Sci. 20: 85-93 (2017) - [c83]Ichitaro Yamazaki, Stanimire Tomov, Jack J. Dongarra:
Sampling algorithms to update truncated SVD. IEEE BigData 2017: 817-826 - [c82]Azzam Haidar, Heike Jagode, Asim YarKhan, Phil Vaccaro, Stanimire Tomov, Jack J. Dongarra:
Power-aware computing: Measurement, control, and performance analysis for Intel Xeon Phi. HPEC 2017: 1-7 - [c81]Azzam Haidar, Khairul Kabir, Diana Fayad, Stanimire Tomov, Jack J. Dongarra:
Out of memory SVD solver for big data. HPEC 2017: 1-7 - [c80]Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures. ICCS 2017: 606-615 - [c79]Tingxing Dong, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices. ICCS 2017: 1008-1018 - [c78]Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Novel HPC techniques to batch execution of many variable size BLAS computations on GPUs. ICS 2017: 5:1-5:10 - [c77]Azzam Haidar, Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
High-performance Cholesky factorization for GPU-only execution. GPGPU@PPoPP 2017: 42-52 - [c76]Azzam Haidar, Panruo Wu, Stanimire Tomov, Jack J. Dongarra:
Investigating half precision arithmetic to accelerate dense linear system solvers. ScalA@SC 2017: 10:1-10:8 - [c75]Khairul Kabir, Azzam Haidar, Stanimire Tomov, Aurélien Bouteiller, Jack J. Dongarra:
A Framework for Out of Memory SVD Algorithms. ISC 2017: 158-178 - [p5]Hartwig Anzt, Jack J. Dongarra, Mark Gates, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki:
Bringing High Performance Computing to Big Data Algorithms. Handbook of Big Data Technologies 2017: 777-806 - 2016
- [j27]Ahmad Abdelfattah, Hartwig Anzt, Jack J. Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Asim YarKhan:
Linear algebra software for large-scale accelerated multicore computing. Acta Numer. 25: 1-160 (2016) - [j26]Ichitaro Yamazaki, Stanimire Tomov, Jack J. Dongarra:
Stability and Performance of Various Singular Value QR Implementations on Multicore CPU with a GPU. ACM Trans. Math. Softw. 43(2): 10:1-10:18 (2016) - [c74]Ian Masliah, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Marc Baboulin, Joël Falcou, Jack J. Dongarra:
High-Performance Matrix-Matrix Multiplications of Very Small Matrices. Euro-Par 2016: 659-671 - [c73]Azzam Haidar, Benjamin Brock, Stanimire Tomov, Michael Guidry, Jay Jay Billings, Daniel Shyles, Jack J. Dongarra:
Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations. HPEC 2016: 1-7 - [c72]Azzam Haidar, Stanimire Tomov, Konstantin Arturov, Murat Efe Guney, Shane Story, Jack J. Dongarra:
LU, QR, and Cholesky factorizations: Programming model, performance analysis and optimization techniques for the Intel Knights Landing Xeon Phi. HPEC 2016: 1-7 - [c71]Ahmad Abdelfattah, Marc Baboulin, Veselin Dobrev, Jack J. Dongarra, Christopher W. Earl, Joel Falcou, Azzam Haidar, Ian Karlin, Tzanio V. Kolev, Ian Masliah, Stanimire Tomov:
High-Performance Tensor Contractions for GPUs. ICCS 2016: 108-118 - [c70]Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Performance Tuning and Optimization Techniques of Fixed and Variable Size Batched Cholesky Factorization on GPUs. ICCS 2016: 119-130 - [c69]Chris J. Newburn, Gaurav Bansal, Michael Wood, Luis Crivelli, Judit Planas, Alejandro Duran, Paulo Souza, Leonardo Borges, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra, Hartwig Anzt, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Ichitaro Yamazaki, Jesús Labarta:
Heterogeneous Streaming. IPDPS Workshops 2016: 611-620 - [c68]Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures. IPDPS Workshops 2016: 1249-1258 - [c67]M. Graham Lopez, Verónica G. Vergara Larrea, Wayne Joubert, Oscar R. Hernandez, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Towards Achieving Performance Portability Using Directives for Accelerators. WACCPD@SC 2016: 13-24 - [c66]Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Performance, Design, and Autotuning of Batched GEMM for GPUs. ISC 2016: 21-38 - 2015
- [j25]Hartwig Anzt, Stanimire Tomov, Piotr Luszczek, William B. Sawyer, Jack J. Dongarra:
Acceleration of GPU-based Krylov solvers via data transfer reduction. Int. J. High Perform. Comput. Appl. 29(3): 366-383 (2015) - [j24]Ichitaro Yamazaki, Stanimire Tomov, Jack J. Dongarra:
Mixed-Precision Cholesky QR Factorization and Its Case Studies on Multicore CPU with Multiple GPUs. SIAM J. Sci. Comput. 37(3) (2015) - [j23]Ichitaro Yamazaki, Stanimire Tomov, Jack J. Dongarra:
Computing Low-Rank Approximation of a Dense Matrix on Multicore CPUs with a GPU and Its Application to Solving a Hierarchically Semiseparable Linear System of Equations. Sci. Program. 2015: 246019:1-246019:17 (2015) - [j22]Jack J. Dongarra, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Piotr Luszczek, Stanimire Tomov:
HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi. Sci. Program. 2015: 502593:1-502593:11 (2015) - [j21]Jack J. Dongarra, Maksims Abalenkovs, Ahmad Abdelfattah, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Asim YarKhan:
Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems. Supercomput. Front. Innov. 2(4): 67-86 (2015) - [c65]Azzam Haidar, Asim YarKhan, Chongxiao Cao, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Flexible Linear Algebra Development and Scheduling with Cholesky Factorization. HPCC/CSS/ICESS 2015: 861-864 - [c64]Azzam Haidar, Stanimire Tomov, Piotr Luszczek, Jack J. Dongarra:
MAGMA embedded: Towards a dense linear algebra library for energy efficient extreme computing. HPEC 2015: 1-6 - [c63]Khairul Kabir, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Performance Analysis and Optimisation of Two-sided Factorization Algorithms for Heterogeneous Platform. ICCS 2015: 180-190 - [c62]Marc Baboulin, Jack J. Dongarra, Adrien Rémy, Stanimire Tomov, Ichitaro Yamazaki:
Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures. PPAM (1) 2015: 86-95 - [c61]Hartwig Anzt, Stanimire Tomov, Jack J. Dongarra:
Energy efficiency and performance frontiers for sparse computations on GPU supercomputers. PMAM@PPoPP 2015: 1-10 - [c60]Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Optimization for performance and energy for batched matrix computations on GPUs. GPGPU@PPoPP 2015: 59-69 - [c59]Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Towards batched linear solvers on accelerated hardware platforms. PPoPP 2015: 261-262 - [c58]Ichitaro Yamazaki, Stanimire Tomov, Jakub Kurzak, Jack J. Dongarra, Jesse L. Barlow:
Mixed-precision block gram Schmidt orthogonalization. ScalA@SC 2015: 2:1-2:8 - [c57]Azzam Haidar, Yulu Jia, Piotr Luszczek, Stanimire Tomov, Asim YarKhan, Jack J. Dongarra:
Weighted dynamic scheduling with many parallelism grains for offloading of numerical workloads to multiple varied accelerators. ScalA@SC 2015: 5:1-5:8 - [c56]Raffaele Solcà, Anton Kozhevnikov, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra, Thomas C. Schulthess:
Efficient implementation of quantum materials simulations on distributed CPU-GPU systems. SC 2015: 10:1-10:12 - [c55]Théo Mary, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Performance of random sampling for computing low-rank approximations of a dense matrix on GPUs. SC 2015: 60:1-60:11 - [c54]Hartwig Anzt, Stanimire Tomov, Jack J. Dongarra:
Accelerating the LOBPCG method on GPUs using a blocked sparse matrix vector product. SpringSim (HPS) 2015: 75-82 - [c53]Khairul Kabir, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Performance analysis and design of a hessenberg reduction using stabilized blocked elementary transformations for new architectures. SpringSim (HPS) 2015: 135-142 - [c52]Azzam Haidar, Tingxing Tim Dong, Stanimire Tomov, Piotr Luszczek, Jack J. Dongarra:
A Framework for Batched and GPU-Resident Factorization Algorithms Applied to Block Householder Transformations. ISC 2015: 31-47 - [c51]Khairul Kabir, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
On the Design, Development, and Analysis of Optimized Matrix-Vector Multiplication Routines for Coprocessors. ISC 2015: 58-73 - 2014
- [j20]Ichitaro Yamazaki, Tingxing Dong, Raffaele Solcà, Stanimire Tomov, Jack J. Dongarra, Thomas C. Schulthess:
Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems. Concurr. Comput. Pract. Exp. 26(16): 2652-2666 (2014) - [j19]Azzam Haidar, Stanimire Tomov, Jack J. Dongarra, Raffaele Solcà, Thomas C. Schulthess:
A novel hybrid CPU-GPU generalized eigensolver for electronic structure calculations based on fine-grained memory aware tasks. Int. J. High Perform. Comput. Appl. 28(2): 196-209 (2014) - [j18]Jack J. Dongarra, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Asim YarKhan:
Model-Driven One-Sided Factorizations on Multicore Accelerated Systems. Supercomput. Front. Innov. 1(1): 85-115 (2014) - [c50]Ichitaro Yamazaki, Théo Mary, Jakub Kurzak, Stanimire Tomov, Jack J. Dongarra:
Access-averse framework for computing low-rank matrix approximations. IEEE BigData 2014: 70-77 - [c49]Tingxing Dong, Azzam Haidar, Piotr Luszczek, James Austin Harris, Stanimire Tomov, Jack J. Dongarra:
LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU. HPCC/CSS/ICESS 2014: 157-160 - [c48]Tingxing Dong, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
A Fast Batched Cholesky Factorization on a GPU. ICPP 2014: 432-440 - [c47]Dimitar Lukarski, Hartwig Anzt, Stanimire Tomov, Jack J. Dongarra:
Hybrid Multi-elimination ILU Preconditioners on GPUs. IPDPS Workshops 2014: 7-16 - [c46]Ichitaro Yamazaki, Hartwig Anzt, Stanimire Tomov, Mark Hoemmen, Jack J. Dongarra:
Improving the Performance of CA-GMRES on Multicores with Multiple GPUs. IPDPS 2014: 382-391 - [c45]Azzam Haidar, Chongxiao Cao, Asim YarKhan, Piotr Luszczek, Stanimire Tomov, Khairul Kabir, Jack J. Dongarra:
Unified Development for Mixed Multi-GPU and Multi-coprocessor Environments Using a Lightweight Runtime Environment. IPDPS 2014: 491-500 - [c44]Hartwig Anzt, William B. Sawyer, Stanimire Tomov, Piotr Luszczek, Ichitaro Yamazaki, Jack J. Dongarra:
Optimizing Krylov Subspace Solvers on Graphics Processing Units. IPDPS Workshops 2014: 941-949 - [c43]Simplice Donfack, Stanimire Tomov, Jack J. Dongarra:
Dynamically Balanced Synchronization-Avoiding LU Factorization with Multicore and GPUs. IPDPS Workshops 2014: 958-965 - [c42]Tingxing Dong, Veselin Dobrev, Tzanio V. Kolev, Robert N. Rieben, Stanimire Tomov, Jack J. Dongarra:
A Step towards Energy Efficient Computing: Redesigning a Hydrodynamic Application on CPU-GPU. IPDPS 2014: 972-981 - [c41]Chongxiao Cao, Jack J. Dongarra, Peng Du, Mark Gates, Piotr Luszczek, Stanimire Tomov:
clMAGMA: high performance dense linear algebra with OpenCL. IWOCL 2014: 1:1-1:9 - [c40]Ichitaro Yamazaki, Stanimire Tomov, Jack J. Dongarra:
Deflation strategies to improve the convergence of communication-avoiding GMRES. ScalA@SC 2014: 39-46 - [c39]Chongxiao Cao, Mark Gates, Azzam Haidar, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Jack J. Dongarra:
Performance and portability with OpenCL for throughput-oriented HPC workloads across accelerators, coprocessors, and multicore processors. ScalA@SC 2014: 61-68 - [c38]Ichitaro Yamazaki, Sivasankaran Rajamanickam, Erik G. Boman, Mark Hoemmen, Michael A. Heroux, Stanimire Tomov:
Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster. SC 2014: 933-944 - [c37]Ichitaro Yamazaki, Stanimire Tomov, Tingxing Dong, Jack J. Dongarra:
Mixed-Precision Orthogonalization Scheme and Adaptive Step Size for Improving the Stability and Performance of CA-GMRES on GPUs. VECPAR 2014: 17-30 - [c36]Azzam Haidar, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Heterogenous Acceleration for Linear Algebra in Multi-coprocessor Environments. VECPAR 2014: 31-42 - [c35]Hartwig Anzt, Dimitar Lukarski, Stanimire Tomov, Jack J. Dongarra:
Self-adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures. VECPAR 2014: 115-123 - [p4]Jack J. Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki:
Accelerating Numerical Dense Linear Algebra Calculations with GPUs. Numerical Computations with GPUs 2014: 3-28 - 2013
- [j17]Peng Du, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Soft error resilient QR factorization for hybrid system with GPGPU. J. Comput. Sci. 4(6): 457-464 (2013) - [j16]Hartwig Anzt, Stanimire Tomov, Jack J. Dongarra, Vincent Heuveline:
A block-asynchronous relaxation method for graphics processing units. J. Parallel Distributed Comput. 73(12): 1613-1626 (2013) - [j15]Marc Baboulin, Jack J. Dongarra, Julien Herrmann, Stanimire Tomov:
Accelerating Linear System Solutions Using Randomization Techniques. ACM Trans. Math. Softw. 39(2): 8:1-8:13 (2013) - [c34]Azzam Haidar, Mark Gates, Stanimire Tomov, Jack J. Dongarra:
Toward a scalable multi-GPU eigensolver via compute-intensive kernels and efficient communication. ICS 2013: 223-232 - [c33]Ichitaro Yamazaki, Tingxing Dong, Stanimire Tomov, Jack J. Dongarra:
Tridiagonalization of a Symmetric Dense Matrix on a GPU Cluster. IPDPS Workshops 2013: 1070-1079 - [c32]Jack J. Dongarra, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Piotr Luszczek, Stanimire Tomov:
Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi. PPAM (1) 2013: 571-581 - [c31]Azzam Haidar, Raffaele Solcà, Mark Gates, Stanimire Tomov, Thomas C. Schulthess, Jack J. Dongarra:
Leading Edge Hybrid Multi-GPU Algorithms for Generalized Eigenproblems in Electronic Structure Calculations. ISC 2013: 67-80 - 2012
- [j14]Peng Du, Rick Weber, Piotr Luszczek, Stanimire Tomov, Gregory D. Peterson, Jack J. Dongarra:
From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming. Parallel Comput. 38(8): 391-407 (2012) - [j13]Christof Vömel, Stanimire Tomov, Jack J. Dongarra:
Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems. SIAM J. Sci. Comput. 34(2) (2012) - [j12]Jakub Kurzak, Stanimire Tomov, Jack J. Dongarra:
Autotuning GEMM Kernels for the Fermi GPU. IEEE Trans. Parallel Distributed Syst. 23(11): 2045-2057 (2012) - [c30]Hartwig Anzt, Stanimire Tomov, Jack J. Dongarra, Vincent Heuveline:
Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems. Euro-Par Workshops 2012: 145-154 - [c29]George Bosilca, Aurélien Bouteiller, Anthony Danalis, Thomas Hérault, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Scalable Dense Linear Algebra on Heterogeneous Hardware. High Performance Computing Workshop (2) 2012: 65-103 - [c28]Fengguang Song, Stanimire Tomov, Jack J. Dongarra:
Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems. ICS 2012: 365-376 - [c27]Hartwig Anzt, Stanimire Tomov, Jack J. Dongarra, Vincent Heuveline:
A Block-Asynchronous Relaxation Method for Graphics Processing Units. IPDPS Workshops 2012: 113-124 - [c26]Emmanuel Agullo, George Bosilca, Bérenger Bramas, Cedric Castagnede, Olivier Coulaud, Eric Darve, Jack J. Dongarra, Mathieu Faverge, Nathalie Furmento, Luc Giraud, Xavier Lacoste, Julien Langou, Hatem Ltaief, Matthias Messner, Raymond Namyst, Pierre Ramet, Toru Takahashi, Samuel Thibault, Stanimire Tomov, Ichitaro Yamazaki:
Abstract: Matrices Over Runtime Systems at Exascale. SC Companion 2012: 1330-1331 - [c25]Emmanuel Agullo, George Bosilca, Bérenger Bramas, Cedric Castagnede, Olivier Coulaud, Eric Darve, Jack J. Dongarra, Mathieu Faverge, Nathalie Furmento, Luc Giraud, Xavier Lacoste, Julien Langou, Hatem Ltaief, Matthias Messner, Raymond Namyst, Pierre Ramet, Toru Takahashi, Samuel Thibault, Stanimire Tomov, Ichitaro Yamazaki:
Poster: Matrices over Runtime Systems at Exascale. SC Companion 2012: 1332 - [c24]Raffaele Solcà, Azzam Haidar, Stanimire Tomov, Thomas C. Schulthess, Jack J. Dongarra:
Abstract: A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks. SC Companion 2012: 1338-1339 - [c23]Raffaele Solcà, Azzam Haidar, Stanimire Tomov, Thomas C. Schulthess, Jack J. Dongarra:
Poster: A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks. SC Companion 2012: 1340 - [c22]Hartwig Anzt, Stanimire Tomov, Mark Gates, Jack J. Dongarra, Vincent Heuveline:
Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems. ICCS 2012: 7-16 - [c21]Marc Baboulin, Simplice Donfack, Jack J. Dongarra, Laura Grigori, Adrien Rémy, Stanimire Tomov:
A Class of Communication-avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines. ICCS 2012: 17-26 - [c20]Ichitaro Yamazaki, Stanimire Tomov:
One-sided Dense Matrix Factorizations on a Multicore with Multiple GPU Accelerators. ICCS 2012: 37-46 - [p3]Jack J. Dongarra, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov:
Dense Linear Algebra on Accelerated Multicore Hardware. High-Performance Scientific Computing 2012: 123-146 - [i6]Raffaele Solcà, Thomas C. Schulthess, Azzam Haidar, Stanimire Tomov, Ichitaro Yamazaki, Jack J. Dongarra:
A hybrid Hermitian general eigenvalue solver. CoRR abs/1207.1773 (2012) - 2011
- [c19]Emmanuel Agullo, Cédric Augonnet, Jack J. Dongarra, Mathieu Faverge, Julien Langou, Hatem Ltaief, Stanimire Tomov:
LU factorization for accelerator-based systems. AICCSA 2011: 217-224 - [c18]George Bosilca, Aurélien Bouteiller, Thomas Hérault, Pierre Lemarinier, Narapat Ohm Saengpatsa, Stanimire Tomov, Jack J. Dongarra:
Performance Portability of a GPU Enabled Factorization with the DAGuE Framework. CLUSTER 2011: 395-402 - [c17]Emmanuel Agullo, Jack J. Dongarra, Rajib Nath, Stanimire Tomov:
A Fully Empirical Autotuned Dense QR Factorization for Multicore Architectures. Euro-Par (2) 2011: 194-205 - [c16]Wolfgang Karl, Samuel Thibault, Stanimire Tomov, Taisuke Boku:
Introduction. Euro-Par (2) 2011: 399-400 - [c15]Allen D. Malony, Scott Biersdorff, Sameer Shende, Heike Jagode, Stanimire Tomov, Guido Juckeland, Robert Dietrich, Duncan Poole, Christopher Lamb:
Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs. ICPP 2011: 176-185 - [c14]Emmanuel Agullo, Cédric Augonnet, Jack J. Dongarra, Mathieu Faverge, Hatem Ltaief, Samuel Thibault, Stanimire Tomov:
QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators. IPDPS 2011: 932-943 - [c13]Rajib Nath, Stanimire Tomov, Tingxing Dong, Jack J. Dongarra:
Optimizing symmetric dense matrix-vector multiplication on GPUs. SC 2011: 6:1-6:10 - [c12]Peng Du, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Soft error resilient QR factorization for hybrid system with GPGPU. ScalA@SC 2011: 11-14 - [i5]Emmanuel Agullo, Jack J. Dongarra, Rajib Nath, Stanimire Tomov:
Fully Empirical Autotuned QR Factorization For Multicore Architectures. CoRR abs/1102.5328 (2011) - 2010
- [j11]Rajib Nath, Stanimire Tomov, Jack J. Dongarra:
An Improved Magma Gemm For Fermi Graphics Processing Units. Int. J. High Perform. Comput. Appl. 24(4): 511-515 (2010) - [j10]Stanimire Tomov, Jack J. Dongarra, Marc Baboulin:
Towards dense linear algebra for hybrid GPU accelerated manycore systems. Parallel Comput. 36(5-6): 232-240 (2010) - [j9]Stanimire Tomov, Rajib Nath, Jack J. Dongarra:
Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing. Parallel Comput. 36(12): 645-654 (2010) - [c11]Peng Du, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:
Mixed-Tool Performance Analysis on Hybrid Multicore Architectures. ICPP Workshops 2010: 236-244 - [c10]Stanimire Tomov, Rajib Nath, Hatem Ltaief, Jack J. Dongarra:
Dense linear algebra solvers for multicore with GPU accelerators. IPDPS Workshops 2010: 1-8 - [c9]Rajib Nath, Stanimire Tomov, Jack J. Dongarra:
Accelerating GPU Kernels for Dense Linear Algebra. VECPAR 2010: 83-92 - [c8]Hatem Ltaief, Stanimire Tomov, Rajib Nath, Peng Du, Jack J. Dongarra:
A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators. VECPAR 2010: 93-101 - [p2]Stanimire Tomov, Jack J. Dongarra:
Dense Linear Algebra for Hybrid GPU-Based Systems. Scientific Computing with Multicore and Accelerators 2010: 37-55 - [p1]Rajib Nath, Stanimire Tomov, Jack J. Dongarra:
BLAS for GPUs. Scientific Computing with Multicore and Accelerators 2010: 57-80
2000 – 2009
- 2009
- [j8]Marc Baboulin, Alfredo Buttari, Jack J. Dongarra, Jakub Kurzak, Julie Langou, Julien Langou, Piotr Luszczek, Stanimire Tomov:
Accelerating scientific computations with mixed precision algorithms. Comput. Phys. Commun. 180(12): 2526-2533 (2009) - [c7]Yinan Li, Jack J. Dongarra, Stanimire Tomov:
A Note on Auto-tuning GEMM for GPUs. ICCS (1) 2009: 884-892 - [c6]Christof Vömel, Stanimire Tomov, Osni Marques:
Bulk based preconditioning for quantum dot computations. SAC 2009: 961-965 - 2008
- [j7]Christof Vömel, Stanimire Tomov, Osni A. Marques, Andrew Canning, Lin-Wang Wang, Jack J. Dongarra:
State-of-the-art eigensolvers for electronic structure calculations of large scale nano-systems. J. Comput. Phys. 227(15): 7113-7124 (2008) - [j6]Alfredo Buttari, Jack J. Dongarra, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov:
Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy. ACM Trans. Math. Softw. 34(4): 17:1-17:22 (2008) - [i4]Marc Baboulin, Alfredo Buttari, Jack J. Dongarra, Jakub Kurzak, Julie Langou, Julien Langou, Piotr Luszczek, Stanimire Tomov:
Accelerating Scientific Computations with Mixed Precision Algorithms. CoRR abs/0808.2794 (2008) - 2007
- [j5]Christof Vömel, Stanimire Tomov, Lin-Wang Wang, Osni A. Marques, Jack J. Dongarra:
The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot. J. Comput. Phys. 223(2): 774-782 (2007) - [r1]Yozo Hida, James Demmel, Julien Langou, Jakub Kurzak, Ming Gu, Alfredo Buttari, Stanimire Tomov, Piotr Luszczek, Julie Langou, Osni Marques, Christof Vömel, Xiaoye S. Li, E. Jason Riedy, Jack J. Dongarra, William Kahan, Beresford N. Parlett, David Bindel:
Prospectus for a Dense Linear Algebra Software Library. Handbook of Parallel Computing 2007 - 2006
- [j4]Stanimire Tomov, Julien Langou, Jack J. Dongarra, Andrew Canning, Lin-Wang Wang:
Conjugate-gradient eigenvalue solvers in computing electronic properties of nanostructure architectures. Int. J. Comput. Sci. Eng. 2(3/4): 205-212 (2006) - [c5]Alfredo Buttari, Jack J. Dongarra, Jakub Kurzak, Julie Langou, Julien Langou, Piotr Luszczek, Stanimire Tomov:
Exploiting Mixed Precision Floating Point Hardware in Scientific Computations. High Performance Computing Workshop 2006: 19-36 - [c4]Alfredo Buttari, Jack J. Dongarra, Jakub Kurzak, Julien Langou, Piotr Luszczek, Stanimire Tomov:
The Impact of Multicore on Math Software. PARA 2006: 1-10 - [c3]James Demmel, Jack J. Dongarra, Beresford N. Parlett, William Kahan, Ming Gu, David Bindel, Yozo Hida, Xiaoye S. Li, Osni Marques, E. Jason Riedy, Christof Vömel, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo Buttari, Julie Langou, Stanimire Tomov:
Prospectus for the Next LAPACK and ScaLAPACK Libraries. PARA 2006: 11-23 - 2005
- [j3]Stanimire Tomov, Michael D. McGuigan, Robert Bennett, Gordon Smith, John Spiletic:
Benchmarking and implementation of probability-based simulations on programmable graphics cards. Comput. Graph. 29(1): 71-80 (2005) - [j2]Carsten Carstensen, Raytcho D. Lazarov, Stanimire Tomov:
Explicit and Averaging A Posteriori Error Estimates for Adaptive Finite Volume Methods. SIAM J. Numer. Anal. 42(6): 2496-2521 (2005) - [c2]Stanimire Tomov, Julien Langou, Andrew Canning, Lin-Wang Wang, Jack J. Dongarra:
Comparison of Nonlinear Conjugate-Gradient Methods for Computing the Electronic Properties of Nanostructure Architectures. International Conference on Computational Science (3) 2005: 317-325 - 2004
- [j1]Stanimire Tomov, Robert Bennett, Michael D. McGuigan, Arnold M. Peskin, Gordon Smith, John Spiletic:
Application of interactive parallel visualization for commodity-based clusters using visualization APIs. Comput. Graph. 28(2): 273-278 (2004) - [c1]Donald J. Johann, Michael D. McGuigan, Stanimire Tomov, Eric Blum, Gordon R. Whiteley, Emanuel F. Petricoin, Lance A. Liotta:
Toward a Systems Biology Software Toolkit. CBMS 2004: 500-505 - [i3]Stanimire Tomov, Michael D. McGuigan:
Interactive visualization of higher dimensional data in a multiview environment. CoRR cs.GR/0405048 (2004) - 2003
- [i2]Stanimire Tomov, Robert Bennett, Michael D. McGuigan, Arnold M. Peskin, Gordon Smith, John Spiletic:
Application of interactive parallel visualization for commodity-based clusters using visualization APIs. CoRR cs.GR/0307065 (2003) - [i1]Stanimire Tomov, Michael D. McGuigan, Robert Bennett, Gordon Smith, John Spiletic:
Benchmarking and Implementation of Probability-Based Simulations on Programmable Graphics Cards. CoRR cs.GR/0312006 (2003)
last updated on 2024-12-13 20:06 CET by the dblp team
all metadata released as open data under CC0 1.0 license