James Demmel
Person information
- affiliation: University of California, Berkeley, USA
- award (2014): Paris Kanellakis Award
Books and Theses
- 1999
- [b3]Edward C. Anderson, Zhaojun Bai, Christian H. Bischof, L. Susan Blackford, James Demmel, Jack J. Dongarra, Jeremy Du Croz, Anne Greenbaum, Sven Hammarling, A. McKenney, Danny C. Sorensen:
LAPACK Users' Guide, Third Edition. Software, Environments and Tools, SIAM 1999, ISBN 978-0-89871-447-0, pp. 1-404 - 1997
- [b2]James Demmel:
Applied Numerical Linear Algebra. SIAM 1997, ISBN 978-0-898713-89-3, pp. I-XI, 1-419 - 1994
- [b1]Richard F. Barrett, Michael W. Berry, Tony F. Chan, James Demmel, June M. Donato, Jack J. Dongarra, Victor Eijkhout, Roldan Pozo, Charles H. Romine, Henk A. van der Vorst:
Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. Other Titles in Applied Mathematics, SIAM 1994, ISBN 978-0-89871-328-2, pp. 1-118
Journal Articles
- 2024
- [j94]Hengrui Luo, Younghyun Cho, James Weldon Demmel, Igor Kozachenko, Xiaoye S. Li, Yang Liu:
Non-smooth Bayesian optimization in tuning scientific applications. Int. J. High Perform. Comput. Appl. 38(6): 633-657 (2024) - [j93]Michael Christ, James Demmel, Nicholas Knight, Thomas Scanlon, Katherine A. Yelick:
On Multilinear Inequalities of Hölder-Brascamp-Lieb Type for Torsion-Free Discrete Abelian Groups. J. Log. Anal. 16 (2024) - [j92]Chaoyu Gong, Jim Demmel, Yang You:
Scalable Evidential K-Nearest Neighbor Classification on Big Data. IEEE Trans. Big Data 10(3): 226-237 (2024) - [j91]Chaoyu Gong, Jim Demmel, Yang You:
Distributed and Joint Evidential K-Nearest Neighbor Classification. IEEE Trans. Knowl. Data Eng. 36(11): 5972-5985 (2024) - 2023
- [j90]James Demmel:
Nearly Optimal Block-Jacobi Preconditioning. SIAM J. Matrix Anal. Appl. 44(1): 408-413 (2023) - [j89]James Demmel, Laura Grigori, Alexander Rusciano:
An Improved Analysis and Unified Perspective on Deterministic and Randomized Low-Rank Matrix Approximation. SIAM J. Matrix Anal. Appl. 44(2): 559-591 (2023) - 2021
- [j88]Yang You, Jingyue Huang, Cho-Jui Hsieh, Richard W. Vuduc, James Demmel:
Communication-avoiding kernel ridge regression on parallel and distributed systems. CCF Trans. High Perform. Comput. 3(3): 252-270 (2021) - [j87]Edgar Solomonik, James Demmel:
Fast Bilinear Algorithms for Symmetric Tensor Contractions. Comput. Methods Appl. Math. 21(1): 211-231 (2021) - [j86]Swapnil Das, James Demmel, Kimon Fountoulakis, Laura Grigori, Michael W. Mahoney, Shenghao Yang:
Parallel and Communication Avoiding Least Angle Regression. SIAM J. Sci. Comput. 43(2): C154-C176 (2021) - [j85]Edgar Solomonik, James Demmel, Torsten Hoefler:
Communication Lower Bounds of Bilinear Algorithms for Symmetric Tensor Contractions. SIAM J. Sci. Comput. 43(5): A3328-A3356 (2021) - 2020
- [j84]Yang You, Yuxiong He, Samyam Rajbhandari, Wenhan Wang, Cho-Jui Hsieh, Kurt Keutzer, James Demmel:
Fast LSTM by dynamic decomposition on cloud and distributed systems. Knowl. Inf. Syst. 62(11): 4169-4197 (2020) - [j83]Osni Marques, James Demmel, Paulo B. Vasconcelos:
Bidiagonal SVD Computation via an Associated Tridiagonal Eigenproblem. ACM Trans. Math. Softw. 46(2): 14:1-14:25 (2020) - [j82]Willow Ahrens, James Demmel, Hong Diep Nguyen:
Algorithms for Efficient Reproducible Floating Point Summation. ACM Trans. Math. Softw. 46(3): 22:1-22:49 (2020) - 2019
- [j81]Aditya Devarakonda, Kimon Fountoulakis, James Demmel, Michael W. Mahoney:
Avoiding Communication in Primal and Dual Block Coordinate Descent Methods. SIAM J. Sci. Comput. 41(1): C1-C27 (2019) - [j80]Yang You, Zhao Zhang, Cho-Jui Hsieh, James Demmel, Kurt Keutzer:
Fast Deep Neural Network Training on Distributed Systems and Cloud TPUs. IEEE Trans. Parallel Distributed Syst. 30(11): 2449-2462 (2019) - 2018
- [j79]Laura Grigori, Sébastien Cayrols, James Weldon Demmel:
Low Rank Approximation of a Sparse Matrix Based on LU Factorization with Column and Row Tournament Pivoting. SIAM J. Sci. Comput. 40(2) (2018) - 2017
- [j78]Yang You, James Demmel, Kent Czechowski, Le Song, Rich Vuduc:
Design and Implementation of a Communication-Optimal Classifier for Distributed Kernel Support Vector Machines. IEEE Trans. Parallel Distributed Syst. 28(4): 974-988 (2017) - 2016
- [j77]Steven I. Gordon, James Demmel, Lizanne DeStefano, Lorna Rivera:
Implementing a Collaborative Online Course to Extend Access to HPC Skills. Comput. Sci. Eng. 18(1): 73-79 (2016) - [j76]Ariful Azad, Grey Ballard, Aydin Buluç, James Demmel, Laura Grigori, Oded Schwartz, Sivan Toledo, Samuel Williams:
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication. SIAM J. Sci. Comput. 38(6) (2016) - 2015
- [j75]Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Nicholas Knight, Hong Diep Nguyen:
Reconstructing Householder vectors from Tall-Skinny QR. J. Parallel Distributed Comput. 85: 3-31 (2015) - [j74]James Demmel, Laura Grigori, Ming Gu, Hua Xiang:
Communication Avoiding Rank Revealing QR Factorization with Column Pivoting. SIAM J. Matrix Anal. Appl. 36(1): 55-89 (2015) - [j73]Erin C. Carson, James Weldon Demmel:
Accuracy of the s-Step Lanczos Method for the Symmetric Eigenproblem in Finite Precision. SIAM J. Matrix Anal. Appl. 36(2): 793-819 (2015) - [j72]James Demmel, Hong Diep Nguyen:
Parallel Reproducible Summation. IEEE Trans. Computers 64(7): 2060-2070 (2015) - [j71]Grey Ballard, James Demmel, Nicholas Knight:
Avoiding Communication in Successive Band Reduction. ACM Trans. Parallel Comput. 1(2): 11:1-11:37 (2015) - 2014
- [j70]Grey Ballard, Erin C. Carson, James Demmel, Mark Hoemmen, Nicholas Knight, Oded Schwartz:
Communication lower bounds and optimal algorithms for numerical linear algebra. Acta Numer. 23: 1-155 (2014) - [j69]Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz:
Communication costs of Strassen's matrix multiplication. Commun. ACM 57(2): 107-114 (2014) - [j68]Edgar Solomonik, Devin Matthews, Jeff R. Hammond, John F. Stanton, James Demmel:
A massively parallel tensor contraction framework for coupled-cluster computations. J. Parallel Distributed Comput. 74(12): 3176-3190 (2014) - [j67]Erin C. Carson, James Demmel:
A Residual Replacement Strategy for Improving the Maximum Attainable Accuracy of s-Step Krylov Subspace Methods. SIAM J. Matrix Anal. Appl. 35(1): 22-43 (2014) - [j66]Grey Ballard, Dulceneia Becker, James Demmel, Jack J. Dongarra, Alex Druinsky, Inon Peled, Oded Schwartz, Sivan Toledo, Ichitaro Yamazaki:
Communication-Avoiding Symmetric-Indefinite Factorization. SIAM J. Matrix Anal. Appl. 35(4): 1364-1406 (2014) - 2013
- [j65]Amal Khabou, James Demmel, Laura Grigori, Ming Gu:
LU Factorization with Panel Rank Revealing Pivoting and Its Communication Avoiding Version. SIAM J. Matrix Anal. Appl. 34(3): 1401-1429 (2013) - [j64]Erin C. Carson, Nicholas Knight, James Demmel:
Avoiding Communication in Nonsymmetric Lanczos-Based Krylov Subspace Methods. SIAM J. Sci. Comput. 35(5) (2013) - 2012
- [j63]Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz:
Graph expansion and communication costs of fast matrix multiplication. J. ACM 59(6): 32:1-32:23 (2012) - [j62]James Demmel, Laura Grigori, Mark Hoemmen, Julien Langou:
Communication-optimal Parallel and Sequential QR and LU Factorizations. SIAM J. Sci. Comput. 34(1) (2012) - [j61]Mark Murphy, Marcus T. Alley, James Demmel, Kurt Keutzer, Shreyas Vasanawala, Michael Lustig:
Fast ℓ1-SPIRiT Compressed Sensing Parallel Imaging MRI: Scalable Parallel Implementation and Clinically Feasible Runtime. IEEE Trans. Medical Imaging 31(6): 1250-1262 (2012) - 2011
- [j60]Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz:
Minimizing Communication in Numerical Linear Algebra. SIAM J. Matrix Anal. Appl. 32(3): 866-901 (2011) - [j59]Laura Grigori, James Demmel, Hua Xiang:
CALU: A Communication Optimal LU Factorization Algorithm. SIAM J. Matrix Anal. Appl. 32(4): 1317-1350 (2011) - 2010
- [j58]Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz:
Communication-optimal Parallel and Sequential Cholesky Decomposition. SIAM J. Sci. Comput. 32(6): 3495-3523 (2010) - 2009
- [j57]Krste Asanovic, Rastislav Bodík, James Demmel, Tony M. Keaveny, Kurt Keutzer, John Kubiatowicz, Nelson Morgan, David A. Patterson, Koushik Sen, John Wawrzynek, David Wessel, Katherine A. Yelick:
A view of the parallel computing landscape. Commun. ACM 52(10): 56-67 (2009) - [j56]Samuel Williams, Leonid Oliker, Richard W. Vuduc, John Shalf, Katherine A. Yelick, James Demmel:
Optimization of sparse matrix-vector multiplication on emerging multicore platforms. Parallel Comput. 35(3): 178-194 (2009) - [j55]James Demmel, Mark Hoemmen, Yozo Hida, E. Jason Riedy:
Nonnegative Diagonals and High Performance on Low-Profile Matrices from Householder QR. SIAM J. Sci. Comput. 31(4): 2832-2841 (2009) - [j54]James Demmel, Yozo Hida, E. Jason Riedy, Xiaoye S. Li:
Extra-Precise Iterative Refinement for Overdetermined Least Squares Problems. ACM Trans. Math. Softw. 35(4): 28:1-28:32 (2009) - 2008
- [j53]Jiawang Nie, James Demmel, Ming Gu:
Global minimization of rational functions and the nearest GCDs. J. Glob. Optim. 40(4): 697-718 (2008) - [j52]Jiawang Nie, James Demmel:
Sparse SOS Relaxations for Minimizing Functions that are Summations of Small Polynomials. SIAM J. Optim. 19(4): 1534-1558 (2008) - [j51]David Bindel, James Demmel, Mark J. Friedman:
Continuation of Invariant Subspaces in Large Bifurcation Problems. SIAM J. Sci. Comput. 30(2): 637-656 (2008) - [j50]James Demmel, Osni Marques, Beresford N. Parlett, Christof Vömel:
Performance and Accuracy of LAPACK's Symmetric Tridiagonal Eigensolvers. SIAM J. Sci. Comput. 30(3): 1508-1526 (2008) - [j49]Gary W. Howell, James Demmel, Charles T. Fulton, Sven Hammarling, Karen Marmol:
Cache efficient bidiagonalization using BLAS 2.5 operators. ACM Trans. Math. Softw. 34(3): 14:1-14:33 (2008) - [j48]Osni Marques, Christof Vömel, James Demmel, Beresford N. Parlett:
Algorithm 880: A testing infrastructure for symmetric tridiagonal eigensolvers. ACM Trans. Math. Softw. 35(1): 8:1-8:13 (2008) - 2007
- [j47]Rajesh Nishtala, Richard W. Vuduc, James Demmel, Katherine A. Yelick:
When cache blocking of sparse matrix vector multiply works and why. Appl. Algebra Eng. Commun. Comput. 18(3): 297-311 (2007) - [j46]James Demmel, Ioana Dumitriu, Olga Holtz, Robert Kleinberg:
Fast matrix multiplication is stable. Numerische Mathematik 106(2): 199-224 (2007) - [j45]James Demmel, Ioana Dumitriu, Olga Holtz:
Fast linear algebra is stable. Numerische Mathematik 108(1): 59-91 (2007) - [j44]Laura Grigori, James Demmel, Xiaoye S. Li:
Parallel Symbolic Factorization for Sparse LU with Static Pivoting. SIAM J. Sci. Comput. 29(3): 1289-1314 (2007) - 2006
- [j43]James Demmel, Plamen Koev:
Accurate and efficient evaluation of Schur and Jack functions. Math. Comput. 75(253): 223-239 (2006) - [j42]Jiawang Nie, James Demmel, Bernd Sturmfels:
Minimizing Polynomials via Sum of Squares over the Gradient Ideal. Math. Program. 106(3): 587-606 (2006) - [j41]James Demmel, Yozo Hida, William Kahan, Xiaoye S. Li, Sonil Mukherjee, E. Jason Riedy:
Error bounds from extra-precise iterative refinement. ACM Trans. Math. Softw. 32(2): 325-351 (2006) - 2005
- [j40]Jiawang Nie, James Demmel:
Minimum Ellipsoid Bounds for Solutions of Polynomial Systems via Sum of Squares. J. Glob. Optim. 33(4): 511-525 (2005) - [j39]James Demmel, Plamen Koev:
The Accurate and Efficient Solution of a Totally Positive Generalized Vandermonde Linear System. SIAM J. Matrix Anal. Appl. 27(1): 142-152 (2005) - 2004
- [j38]Richard W. Vuduc, James Demmel, Jeff A. Bilmes:
Statistical Models for Empirical Search-Based Performance Tuning. Int. J. High Perform. Comput. Appl. 18(1): 65-94 (2004) - [j37]James Demmel, Yozo Hida:
Fast and Accurate Floating Point Summation with Application to Computational Geometry. Numer. Algorithms 37(1-4): 101-112 (2004) - [j36]James Demmel, Plamen Koev:
Accurate SVDs of weakly diagonally dominant M-matrices. Numerische Mathematik 98(1): 99-104 (2004) - [j35]James Demmel, Yozo Hida:
Accurate and Efficient Floating Point Summation. SIAM J. Sci. Comput. 25(4): 1214-1248 (2004) - 2003
- [j34]Eiji Mizutani, James Demmel:
On structure-exploiting trust-region regularized nonlinear least squares algorithms for neural-network learning. Neural Networks 16(5-6): 745-753 (2003) - [j33]Xiaoye S. Li, James Demmel:
SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems. ACM Trans. Math. Softw. 29(2): 110-140 (2003) - 2002
- [j32]Xiaoye S. Li, James Demmel, David H. Bailey, Greg Henry, Yozo Hida, Jimmy Iskandar, William Kahan, Suh Y. Kang, Anil Kapur, Michael C. Martin, Brandon Thompson, Teresa Tung, Daniel J. Yoo:
Design, implementation and testing of extended and mixed precision BLAS. ACM Trans. Math. Softw. 28(2): 152-205 (2002) - [j31]David Bindel, James Demmel, William Kahan, Osni Marques:
On computing Givens rotations reliably and efficiently. ACM Trans. Math. Softw. 28(2): 206-238 (2002) - 2001
- [j30]James Demmel, Benjamin Diament, Gregorio Malajovich:
On the Complexity of Computing Error Bounds. Found. Comput. Math. 1(1): 101-125 (2001) - 2000
- [j29]James Demmel:
Accurate Singular Value Decompositions of Structured Matrices. SIAM J. Matrix Anal. Appl. 21(2): 562-580 (2000) - [j28]James Weldon Demmel, Luca Dieci, Mark J. Friedman:
Computing Connecting Orbits via an Improved Algorithm for Continuing Invariant Subspaces. SIAM J. Sci. Comput. 22(1): 81-94 (2000) - 1999
- [j27]James Weldon Demmel, Stanley C. Eisenstat, John R. Gilbert, Xiaoye S. Li, Joseph W. H. Liu:
A Supernodal Approach to Sparse Partial Pivoting. SIAM J. Matrix Anal. Appl. 20(3): 720-755 (1999) - [j26]James Weldon Demmel, John R. Gilbert, Xiaoye S. Li:
An Asynchronous Parallel Supernodal Algorithm for Sparse Gaussian Elimination. SIAM J. Matrix Anal. Appl. 20(4): 915-952 (1999) - 1998
- [j25]Joel H. Saltz, Alan Sussman, Susan L. Graham, James Demmel, Scott B. Baden, Jack J. Dongarra:
Programming Tools and Environments. Commun. ACM 41(11): 64-73 (1998) - [j24]Zhaojun Bai, James Demmel:
Using the Matrix Sign Function to Compute Invariant Subspaces. SIAM J. Matrix Anal. Appl. 19(1): 205-225 (1998) - 1997
- [j23]Soumen Chakrabarti, James Demmel, Katherine A. Yelick:
Models and Scheduling Algorithms for Mixed Data and Task Parallel Programs. J. Parallel Distributed Comput. 47(2): 168-184 (1997) - [j22]Zhaojun Bai, James Demmel, Jack J. Dongarra, Antoine Petitet, Howard Robinson, Ken Stanley:
The Spectral Decomposition of Nonsymmetric Matrices on Distributed Memory Parallel Computers. SIAM J. Sci. Comput. 18(5): 1446-1461 (1997) - [j21]L. Susan Blackford, Andrew J. Cleary, Antoine Petitet, R. Clinton Whaley, James Demmel, Inderjit S. Dhillon, H. Ren, Ken Stanley, Jack J. Dongarra, Sven Hammarling:
Practical Experience in the Numerical Dangers of Heterogeneous Computing. ACM Trans. Math. Softw. 23(2): 133-147 (1997) - 1995
- [j20]Dinesh Manocha, James Demmel:
Algorithms for Intersecting Parametric and Algebraic Curves II: Multiple Intersections. CVGIP Graph. Model. Image Process. 57(2): 81-100 (1995) - [j19]James Weldon Demmel, Nicholas J. Higham, Robert S. Schreiber:
Stability of block LU factorization. Numer. Linear Algebra Appl. 2(2): 173-190 (1995) - 1994
- [j18]James Demmel, Xiaoye S. Li:
Faster Numerical Algorithms via Exception Handling. IEEE Trans. Computers 43(8): 983-992 (1994) - [j17]Dinesh Manocha, James Demmel:
Algorithms for intersecting parametric and algebraic curves I: simple intersections. ACM Trans. Graph. 13(1): 73-100 (1994) - 1993
- [j16]Victor Y. Pan, James Demmel:
A New Algorithm for the Symmetric Tridiagonal Eigenvalue Problem. J. Complex. 9(3): 387-405 (1993) - [j15]James Weldon Demmel, Nicholas J. Higham:
Improved Error Bounds for Underdetermined System Solvers. SIAM J. Matrix Anal. Appl. 14(1): 1-14 (1993) - [j14]Zhaojun Bai, James Weldon Demmel:
Computing the Generalized Singular Value Decomposition. SIAM J. Sci. Comput. 14(6): 1464-1486 (1993) - [j13]James Demmel, Bo Kågström:
The generalized Schur decomposition of an arbitrary pencil A-λB - robust software with error bounds and applications. Part I: theory and algorithms. ACM Trans. Math. Softw. 19(2): 160-174 (1993) - [j12]James Demmel, Bo Kågström:
The generalized Schur decomposition of an arbitrary pencil A-λB - robust software with error bounds and applications. Part II: software and applications. ACM Trans. Math. Softw. 19(2): 175-201 (1993) - [j11]Zhaojun Bai, James Demmel, A. McKenney:
On computing condition numbers for the nonsymmetric eigenproblem. ACM Trans. Math. Softw. 19(2): 202-223 (1993) - 1992
- [j10]James Demmel:
The Componentwise Distance to the Nearest Singular Matrix. SIAM J. Matrix Anal. Appl. 13(1): 10-19 (1992) - [j9]James Demmel, Kresimir Veselic:
Jacobi's Method is More Accurate than QR. SIAM J. Matrix Anal. Appl. 13(4): 1204-1245 (1992) - [j8]James Demmel, Nicholas J. Higham:
Stability of block algorithms with fast level-3 BLAS. ACM Trans. Math. Softw. 18(3): 274-291 (1992) - 1991
- [j7]James Demmel:
LAPACK: A portable linear algebra library for high-performance computers. Concurr. Pract. Exp. 3(6): 655-666 (1991) - 1990
- [j6]James Weldon Demmel:
Matrix Computations; Second Edition (Gene Golub and Charles F. Van Loan). SIAM Rev. 32(4): 690-691 (1990) - [j5]James Demmel, William Kahan:
Accurate Singular Values of Bidiagonal Matrices. SIAM J. Sci. Comput. 11(5): 873-912 (1990) - 1989
- [j4]Zhaojun Bai, James Demmel:
On a Block Implementation of Hessenberg Multishift QR Iteration. Int. J. High Speed Comput. 1(1): 97-112 (1989) - 1987
- [j3]James Weldon Demmel:
Three methods for refining estimates of invariant subspaces. Computing 38(1): 43-57 (1987) - [j2]James Demmel:
The geometry of Ill-conditioning. J. Complex. 3(2): 201-229 (1987) - 1985
- [j1]James Weldon Demmel, Fritz Krückeberg:
An interval algorithm for solving systems of linear equations to prespecified accuracy. Computing 34(2): 117-129 (1985)
Conference and Workshop Papers
- 2024
- [c115]Tianyu Liang, Riley Murray, Aydin Buluç, James Demmel:
Fast multiplication of random dense matrices with sparse matrices. IPDPS 2024: 52-62 - [c114]Vivek Bharadwaj, Osman Asif Malik, Riley Murray, Aydin Buluç, James Demmel:
Distributed-Memory Randomized Algorithms for Sparse Tensor CP Decomposition. SPAA 2024: 155-168 - 2023
- [c113]Younghyun Cho, James Weldon Demmel, Jacob King, Xiaoye S. Li, Yang Liu, Hengrui Luo:
Harnessing the Crowd for Autotuning High-Performance Computing Applications. IPDPS 2023: 635-645 - [c112]Vivek Bharadwaj, Osman Asif Malik, Riley Murray, Laura Grigori, Aydin Buluç, James Demmel:
Fast Exact Leverage Score Sampling from Khatri-Rao Products with Applications to Tensor Decomposition. NeurIPS 2023 - 2022
- [c111]James Demmel, Jack J. Dongarra, Mark Gates, Greg Henry, Julien Langou, Xiaoye S. Li, Piotr Luszczek, Weslley S. Pereira, E. Jason Riedy, Cindy Rubio-González:
Proposed Consistent Exception Handling for the BLAS and LAPACK. Correctness@SC 2022: 1-9 - [c110]Vivek Bharadwaj, Aydin Buluç, James Demmel:
Distributed-Memory Sparse Kernels for Machine Learning. IPDPS 2022: 47-58 - [c109]Anthony Chen, James Demmel, Grace Dinh, Mason Haberle, Olga Holtz:
Communication bounds for convolutional neural networks. PASC 2022: 1:1-1:10 - 2021
- [c108]Arissa Wongpanich, Hieu Pham, James Demmel, Mingxing Tan, Quoc V. Le, Yang You, Sameer Kumar:
Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour. IPDPS Workshops 2021: 947-950 - [c107]Qijing Huang, Aravind Kalaiah, Minwoo Kang, James Demmel, Grace Dinh, John Wawrzynek, Thomas Norell, Yakun Sophia Shao:
CoSA: Scheduling by Constrained Optimization for Spatial Accelerators. ISCA 2021: 554-566 - [c106]Younghyun Cho, James Demmel, Xiaoye S. Li, Yang Liu, Hengrui Luo:
Enhancing Autotuning Capability with a History Database. MCSoC 2021: 249-257 - [c105]Yang Liu, Wissam M. Sid-Lakhdar, Osni Marques, Xinran Zhu, Chang Meng, James Weldon Demmel, Xiaoye S. Li:
GPTune: multitask learning for autotuning exascale applications. PPoPP 2021: 234-246 - [c104]Ruobing Han, Min Si, James Demmel, Yang You:
Dynamic scaling for low-precision learning. PPoPP 2021: 480-482 - [c103]Ruobing Han, James Demmel, Yang You:
Auto-Precision Scaling for Distributed Deep Learning. ISC 2021: 79-97 - 2020
- [c102]Aditya Devarakonda, James Demmel:
Avoiding Communication in Logistic Regression. HiPC 2020: 91-100 - [c101]Arissa Wongpanich, Yang You, James Demmel:
Rethinking the Value of Asynchronous Solvers for Distributed Deep Learning. HPC Asia 2020: 52-60 - [c100]Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, Cho-Jui Hsieh:
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes. ICLR 2020 - [c99]Grace Dinh, James Demmel:
Communication-Optimal Tilings for Projective Nested Loops with Arbitrary Bounds. SPAA 2020: 523-525 - 2019
- [c98]Yang You, Yuxiong He, Samyam Rajbhandari, Wenhan Wang, Cho-Jui Hsieh, Kurt Keutzer, James Demmel:
Fast LSTM Inference by Dynamic Decomposition on Cloud Systems. ICDM 2019: 748-757 - [c97]Yang You, Jonathan Hseu, Chris Ying, James Demmel, Kurt Keutzer, Cho-Jui Hsieh:
Large-batch training for LSTM and beyond. SC 2019: 9:1-9:16 - 2018
- [c96]E. Jason Riedy, James Demmel:
Augmented Arithmetic Operations Proposed for IEEE-754 2018. ARITH 2018: 45-52 - [c95]Yang You, Zhao Zhang, Cho-Jui Hsieh, James Demmel, Kurt Keutzer:
ImageNet Training in Minutes. ICPP 2018: 1:1-1:10 - [c94]Saeed Soori, Aditya Devarakonda, Zachary Blanco, James Demmel, Mert Gürbüzbalaban, Maryam Mehri Dehnavi:
Reducing Communication in Proximal Newton Methods for Sparse Least Squares Problems. ICPP 2018: 22:1-22:10 - [c93]Yang You, James Demmel, Cho-Jui Hsieh, Richard W. Vuduc:
Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems. ICS 2018: 307-317 - [c92]Aditya Devarakonda, Kimon Fountoulakis, James Demmel, Michael W. Mahoney:
Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization. IPDPS 2018: 409-418 - [c91]Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Nicholas Knight:
A 3D Parallel Algorithm for QR Decomposition. SPAA 2018: 55-65 - 2017
- [c90]Yang You, James Demmel:
Runtime Data Layout Scheduling for Machine Learning Dataset. ICPP 2017: 452-461 - [c89]Yang You, Aydin Buluç, James Demmel:
Scaling deep learning on GPU and Knights Landing clusters. SC 2017: 9 - [c88]Edgar Solomonik, Grey Ballard, James Demmel, Torsten Hoefler:
A Communication-Avoiding Parallel Algorithm for the Symmetric Eigenvalue Problem. SPAA 2017: 111-121 - 2016
- [c87]Alex Gittens, Aditya Devarakonda, Evan Racah, Michael F. Ringenburg, Lisa Gerhardt, Jey Kottalam, Jialin Liu, Kristyn J. Maschhoff, Shane Canon, Jatin Chhugani, Pramod Sharma, Jiyan Yang, James Demmel, Jim Harrell, Venkat Krishnamurthy, Michael W. Mahoney, Prabhat:
Matrix factorizations at scale: A comparison of scientific data analytics in Spark and C+MPI using three case studies. IEEE BigData 2016: 204-213 - [c86]Cindy Rubio-González, Cuong Nguyen, Benjamin Mehne, Koushik Sen, James Demmel, William Kahan, Costin Iancu, Wim Lavrijsen, David H. Bailey, David Hough:
Floating-point precision tuning using blame analysis. ICSE 2016: 1074-1085 - [c85]Erin C. Carson, James Demmel, Laura Grigori, Nicholas Knight, Penporn Koanantakool, Oded Schwartz, Harsha Vardhan Simhadri:
Write-Avoiding Algorithms. IPDPS 2016: 648-658 - [c84]Yang You, Xiangru Lian, Ji Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, James Demmel, Cho-Jui Hsieh:
Asynchronous Parallel Greedy Coordinate Descent. NIPS 2016: 4682-4690 - [c83]Grey Ballard, James Demmel, Andrew Gearhart, Benjamin Lipshitz, Yishai Oltchik, Oded Schwartz, Sivan Toledo:
Network Topologies and Inevitable Contention. COMHPC@SC 2016: 39-52 - 2015
- [c82]Hong Diep Nguyen, James Demmel:
Reproducible Tall-Skinny QR. ARITH 2015: 152-159 - [c81]Yang You, James Demmel, Kenneth Czechowski, Le Song, Richard W. Vuduc:
CA-SVM: Communication-Avoiding Support Vector Machines on Distributed Systems. IPDPS 2015: 847-859 - [c80]Steven I. Gordon, James Demmel, Lizanne DeStefano, Lorna Rivera:
Extending access to HPC skills through a blended online course. XSEDE 2015: 15:1-15:5 - 2014
- [c79]Jeff A. Bilmes, Krste Asanovic, Chee-Whye Chin, Jim Demmel:
Author retrospective for optimizing matrix multiply using PHiPAC: a portable high-performance ANSI C coding methodology. ICS 25th Anniversary 2014: 42-44 - [c78]Samuel Williams, Mike Lijewski, Ann S. Almgren, Brian van Straalen, Erin C. Carson, Nicholas Knight, James Demmel:
s-Step Krylov Subspace Methods as Bottom Solvers for Geometric Multigrid. IPDPS 2014: 1149-1158 - [c77]Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Hong Diep Nguyen, Edgar Solomonik:
Reconstructing Householder Vectors from Tall-Skinny QR. IPDPS 2014: 1159-1170 - [c76]Edgar Solomonik, Erin C. Carson, Nicholas Knight, James Demmel:
Tradeoffs between synchronization, communication, and computation in parallel linear algebra computations. SPAA 2014: 307-318 - [c75]Razvan Carbunescu, Aditya Devarakonda, James Demmel, Steven I. Gordon, Jay Alameda, Susan Mehringer:
Architecting an autograder for parallel code. XSEDE 2014: 68:1-68:8 - 2013
- [c74]James Demmel, Hong Diep Nguyen:
Fast Reproducible Floating-Point Summation. IEEE Symposium on Computer Arithmetic 2013: 163-172 - [c73]James Demmel, Hong Diep Nguyen:
Numerical Reproducibility and Accuracy at ExaScale. IEEE Symposium on Computer Arithmetic 2013: 235-237 - [c72]Austin R. Benson, David F. Gleich, James Demmel:
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. IEEE BigData 2013: 264-272 - [c71]James Demmel, David Eliahu, Armando Fox, Shoaib Kamil, Benjamin Lipshitz, Oded Schwartz, Omer Spillinger:
Communication-Optimal Parallel Recursive Rectangular Matrix Multiplication. IPDPS 2013: 261-272 - [c70]Edgar Solomonik, Aydin Buluç, James Demmel:
Minimizing Communication in All-Pairs Shortest Paths. IPDPS 2013: 548-559 - [c69]James Demmel:
Communication-Avoiding Algorithms for Linear Algebra and Beyond. IPDPS 2013: 585 - [c68]James Demmel, Andrew Gearhart, Benjamin Lipshitz, Oded Schwartz:
Perfect Strong Scaling Using No Additional Energy. IPDPS 2013: 649-660 - [c67]Edgar Solomonik, Devin Matthews, Jeff R. Hammond, James Demmel:
Cyclops Tensor Framework: Reducing Communication and Eliminating Load Imbalance in Massively Parallel Contractions. IPDPS 2013: 813-824 - [c66]Grey Ballard, Dulceneia Becker, James Demmel, Jack J. Dongarra, Alex Druinsky, Inon Peled, Oded Schwartz, Sivan Toledo, Ichitaro Yamazaki:
Implementing a Blocked Aasen's Algorithm with a Dynamic Scheduler on Multicore Architectures. IPDPS 2013: 895-907 - [c65]Nicholas Knight, Erin C. Carson, James Demmel:
Exploiting Data Sparsity in Parallel Matrix Powers Computations. PPAM (1) 2013: 15-25 - [c64]Cindy Rubio-González, Cuong Nguyen, Hong Diep Nguyen, James Demmel, William Kahan, Koushik Sen, David H. Bailey, Costin Iancu, David Hough:
Precimonious: tuning assistant for floating-point precision. SC 2013: 27:1-27:12 - [c63]Grey Ballard, Aydin Buluç, James Demmel, Laura Grigori, Benjamin Lipshitz, Oded Schwartz, Sivan Toledo:
Communication optimal parallel multiplication of sparse random matrices. SPAA 2013: 222-231 - [c62]Grey Ballard, James Demmel, Benjamin Lipshitz, Oded Schwartz, Sivan Toledo:
Communication efficient Gaussian elimination with partial pivoting using a shape morphing data layout. SPAA 2013: 232-240 - [c61]Steven I. Gordon, Jay Alameda, James Demmel, Razvan Carbunescu, Susan Mehringer:
Providing a supported online course on parallel computing. XSEDE 2013: 60:1-60:4 - 2012
- [c60]Grey Ballard, James Demmel, Olga Holtz, Benjamin Lipshitz, Oded Schwartz:
Graph Expansion Analysis for Communication Costs of Fast Rectangular Matrix Multiplication. MedAlg 2012: 13-36 - [c59]Grey Ballard, James Demmel, Nicholas Knight:
Communication avoiding successive band reduction. PPoPP 2012: 35-44 - [c58]Benjamin Lipshitz, Grey Ballard, James Demmel, Oded Schwartz:
Communication-avoiding parallel Strassen: implementation and performance. SC 2012: 101 - [c57]James Demmel, David Eliahu, Armando Fox, Shoaib Kamil, Benjamin Lipshitz, Oded Schwartz, Omer Spillinger:
Poster: Beating MKL and ScaLAPACK at Rectangular Matrix Multiplication Using the BFS/DFS Approach. SC Companion 2012: 1370 - [c56]Jim Demmel:
Communication avoiding algorithms. SC Companion 2012: 1942-2000 - [c55]Grey Ballard, James Demmel, Olga Holtz, Benjamin Lipshitz, Oded Schwartz:
Brief announcement: strong scaling of matrix multiplication algorithms and memory-independent communication lower bounds. SPAA 2012: 77-79 - [c54]Grey Ballard, James Demmel, Olga Holtz, Benjamin Lipshitz, Oded Schwartz:
Communication-optimal parallel algorithm for Strassen's matrix multiplication. SPAA 2012: 193-204 - [c53]Edgar Solomonik, James Demmel:
Matrix Multiplication on Multidimensional Torus Networks. VECPAR 2012: 201-215 - 2011
- [c52]James Demmel:
Avoiding Communication in Numerical Linear Algebra. ALENEX 2011: 59 - [c51]Edgar Solomonik, James Demmel:
Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms. Euro-Par (2) 2011: 90-109 - [c50]Jim Demmel:
Rethinking algorithms for future architectures: Communication-avoiding algorithms. Hot Chips Symposium 2011: 1-63 - [c49]Eiji Mizutani, James Demmel:
On improving trust-region variable projection algorithms for separable nonlinear least squares learning. IJCNN 2011: 397-404 - [c48]Michael J. Anderson, Grey Ballard, James Demmel, Kurt Keutzer:
Communication-Avoiding QR Decomposition for GPUs. IPDPS 2011: 48-58 - [c47]Aydin Buluç, Samuel Williams, Leonid Oliker, James Demmel:
Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication. IPDPS 2011: 721-733 - [c46]Edgar Solomonik, Abhinav Bhatele, James Demmel:
Improving communication performance in dense linear algebra via topology aware collectives. SC 2011: 77:1-77:11 - [c45]James Demmel:
Accurate and efficient expression evaluation and linear algebra, or why it can be easier to compute accurate eigenvalues of a Vandermonde matrix than the accurate sum of 3 numbers. SNC 2011: 2 - [c44]Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz:
Graph expansion and communication costs of fast matrix multiplication: regular submission. SPAA 2011: 1-12 - [c43]Grey Ballard, James Demmel, Andrew Gearhart:
Brief announcement: communication bounds for heterogeneous architectures. SPAA 2011: 257-258 - 2010
- [c42]Laura Grigori, Pierre-Yves David, James Demmel, Sylvain Peyronnet:
Brief announcement: Lower bounds on communication for sparse Cholesky factorization of a model problem. SPAA 2010: 79-81 - 2009
- [c41]Marghoob Mohiyuddin, Mark Hoemmen, James Demmel, Katherine A. Yelick:
Minimizing communication in sparse matrix solvers. SC 2009 - [c40]Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz:
Communication-optimal parallel and sequential Cholesky decomposition: extended abstract. SPAA 2009: 245-252 - 2008
- [c39]James Demmel, Mark Hoemmen, Marghoob Mohiyuddin, Katherine A. Yelick:
Avoiding communication in sparse matrix computations. IPDPS 2008: 1-12 - [c38]Laura Grigori, James Demmel, Hua Xiang:
Communication avoiding Gaussian elimination. SC 2008: 29 - [c37]Vasily Volkov, James Demmel:
Benchmarking GPUs to tune dense linear algebra. SC 2008: 31 - 2007
- [c36]Sukun Kim, Shamim Pakzad, David E. Culler, James Demmel, Gregory Fenves, Steven D. Glaser, Martin Turon:
Health monitoring of civil infrastructures using wireless sensor networks. IPSN 2007: 254-263 - [c35]Samuel Williams, Leonid Oliker, Richard W. Vuduc, John Shalf, Katherine A. Yelick, James Demmel:
Optimization of sparse matrix-vector multiplication on emerging multicore platforms. SC 2007: 38 - 2006
- [c34]James Demmel, Jack J. Dongarra, Beresford N. Parlett, William Kahan, Ming Gu, David Bindel, Yozo Hida, Xiaoye S. Li, Osni Marques, E. Jason Riedy, Christof Vömel, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo Buttari, Julie Langou, Stanimire Tomov:
Prospectus for the Next LAPACK and ScaLAPACK Libraries. PARA 2006: 11-23 - [c33]Takahiro Katagiri, Christof Vömel, James Demmel:
Automatic Performance Tuning for the Multi-section with Multiple Eigenvalues Method for Symmetric Tridiagonal Eigenproblems. PARA 2006: 938-948 - [c32]Sukun Kim, Shamim Pakzad, David E. Culler, James Demmel, Gregory Fenves, Steven D. Glaser, Martin Turon:
Wireless sensor networks for structural health monitoring. SenSys 2006: 427-428 - 2005
- [c31]David Bindel, James Demmel, Mark J. Friedman, Willy Govaerts, Yuri A. Kuznetsov:
Bifurcation Analysis of Large Equilibrium Systems in Matlab. International Conference on Computational Science (1) 2005: 50-57 - 2004
- [c30]Benjamin C. Lee, Richard W. Vuduc, James Demmel, Katherine A. Yelick:
Performance Models for Evaluation and Automatic Tuning of Symmetric Sparse Matrix-Vector Multiply. ICPP 2004: 169-176 - [c29]David Bindel, Zhaojun Bai, James Demmel:
Model Reduction for RF MEMS Simulation. PARA 2004: 286-295 - [c28]Eun-Jin Im, Ismail Bustany, Cleve Ashcraft, James Demmel, Katherine A. Yelick:
Performance Tuning of Matrix Triple Products Based on Matrix Structure. PARA 2004: 740-746 - 2003
- [c27]Rich Vuduc, Attila Gyulassy, James Demmel, Katherine A. Yelick:
Memory Hierarchy Optimizations and Performance Bounds for Sparse A^T Ax. International Conference on Computational Science 2003: 705-714 - [c26]Eiji Mizutani, James Demmel:
Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian-Vector Multiply. NIPS 2003: 209-216 - 2002
- [c25]Rich Vuduc, James Demmel, Katherine A. Yelick, Shoaib Kamil, Rajesh Nishtala, Benjamin C. Lee:
Performance optimizations and bounds for sparse matrix-vector multiply. SC 2002: 35:1-35:35 - 2001
- [c24]Leroy Anthony Drummond, James Demmel, Carlos R. Mechoso, Howard Robinson, Keith Sklower, Joseph A. Spahr:
A Data Broker for Distributed Computing Environments. International Conference on Computational Science (1) 2001: 31-40 - [c23]Rich Vuduc, James Demmel, Jeff A. Bilmes:
Statistical Models for Automatic Performance Tuning. International Conference on Computational Science (1) 2001: 117-126 - 2000
- [c22]Eiji Mizutani, James Demmel:
On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems. NIPS 2000: 605-611 - [c21]Rich Vuduc, James Demmel:
Code Generators for Automatic Tuning of Numerical Kernels: Experiences with FFTW. SAIG 2000: 190-211 - 1999
- [c20]Xiaoye S. Li, James Demmel:
A Scalable Sparse Direct Solver Using Static Pivoting. PP 1999 - [c19]Mark Adams, Jim Demmel:
Parallel Multigrid Solver for 3D Unstructured Finite Element Problems. SC 1999: 27 - [c18]James Demmel:
Making Sparse Matrix Computations Scalable (Invited Talk Abstract). SPAA 1999: 43 - 1998
- [c17]Xiaoye S. Li, James Demmel:
Making Sparse Gaussian Elimination Scalable by Static Pivoting. SC 1998: 34 - 1997
- [c16]Jeff A. Bilmes, Krste Asanovic, Chee-Whye Chin, James Demmel:
Using PHiPAC to speed error back-propagation learning. ICASSP 1997: 4153-4156 - [c15]Jeff A. Bilmes, Krste Asanovic, Chee-Whye Chin, James Demmel:
Optimizing Matrix Multiply Using PHiPAC: A Portable, High-Performance, ANSI C Coding Methodology. International Conference on Supercomputing 1997: 340-347 - [c14]L. Susan Blackford, Jaeyoung Choi, Andrew J. Cleary, Eduardo F. D'Azevedo, James Demmel, Inderjit S. Dhillon, Jack J. Dongarra, Sven Hammarling, Greg Henry, Antoine Petitet, Ken Stanley, David W. Walker, R. Clinton Whaley:
ScaLAPACK: A Linear Algebra Library for Message-Passing Computers. PP 1997 - 1996
- [c13]Andrew J. Cleary, James Demmel, Inderjit S. Dhillon, Jack J. Dongarra, Sven Hammarling, Antoine Petitet, H. Ren, Ken Stanley, R. Clinton Whaley:
Practical Experience in the Dangers of Heterogeneous Computing. PARA 1996: 57-64 - [c12]L. Susan Blackford, Jaeyoung Choi, Andrew J. Cleary, James Demmel, Inderjit S. Dhillon, Jack J. Dongarra, Sven Hammarling, Greg Henry, Antoine Petitet, Ken Stanley, David W. Walker, R. Clinton Whaley:
ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance. SC 1996: 5 - 1995
- [c11]Jaeyoung Choi, James Demmel, Inderjit S. Dhillon, Jack J. Dongarra, Susan Ostrouchov, Antoine Petitet, Ken Stanley, David W. Walker, R. Clinton Whaley:
ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance. PARA 1995: 95-106 - [c10]James Demmel, Ken Stanley:
The Performance of Finding Eigenvalues and Eigenvectors of Dense Symmetric Matrices on Distributed Memory Computers. PP 1995: 528-533 - [c9]James Demmel, Sharon Smith:
Performance of a Parallel Global Atmospheric Chemical Tracer Model. SC 1995: 80 - [c8]Soumen Chakrabarti, James Demmel, Katherine A. Yelick:
Modeling the Benefits of Mixed Data and Task Parallelism. SPAA 1995: 74-83 - 1993
- [c7]James Demmel, Xiaoye S. Li:
Faster numerical algorithms via exception handling. IEEE Symposium on Computer Arithmetic 1993: 234-241 - [c6]James Demmel, Jack J. Dongarra, Robert A. van de Geijn, David W. Walker:
LAPACK for Distributed Memory Architectures: The Next Generation. PPSC 1993: 323-329 - [c5]Zhaojun Bai, James Demmel:
Design of a Parallel Nonsymmetric Eigenroutine Toolbox, Part I. PPSC 1993: 391-398 - 1990
- [c4]Edward C. Anderson, Zhaojun Bai, Jack J. Dongarra, Anne Greenbaum, A. McKenney, Jeremy Du Croz, Sven Hammarling, James Demmel, Christian H. Bischof, Danny C. Sorensen:
LAPACK: a portable linear algebra library for high-performance computers. SC 1990: 2-11 - 1989
- [c3]James Demmel, Gerardo Lafferriere:
Optimal three finger grasps. ICRA 1989: 936-942 - 1988
- [c2]James Demmel, Gerardo Lafferriere, Jacob T. Schwartz, Micha Sharir:
Theoretical and experimental studies using a multifinger planar manipulator. ICRA 1988: 390-395 - 1987
- [c1]James Demmel:
On error analysis in arithmetic with varying relative precision. IEEE Symposium on Computer Arithmetic 1987: 148-152
Parts in Books or Collections
- 2000
- [p5]James Demmel:
A Brief Tour of Eigenproblems. Templates for the Solution of Algebraic Eigenvalue Problems 2000: 7-36 - [p4]James Demmel:
Singular Value Decomposition. Templates for the Solution of Algebraic Eigenvalue Problems 2000: 135-147 - [p3]T. Chen, James Demmel, Ming Gu, Yousef Saad, Richard B. Lehoucq, Danny C. Sorensen, Kristyn J. Maschhoff, Zhaojun Bai, David Day, Roland W. Freund, Gerard L. G. Sleijpen, Henk A. van der Vorst, Ruipeng Li:
Non-Hermitian Eigenvalue Problems. Templates for the Solution of Algebraic Eigenvalue Problems 2000: 149-231 - [p2]Jack J. Dongarra, Plamen Koev, Xiaoye S. Li, James Demmel, Henk A. van der Vorst:
Common Issues. Templates for the Solution of Algebraic Eigenvalue Problems 2000: 315-336 - 1995
- [p1]Zhaojun Bai, David Day, James Demmel, Jack J. Dongarra, Ming Gu, Axel Ruhe, Henk A. van der Vorst:
Templates for Linear Algebra Problems. Computer Science Today 1995: 115-140
Editorship
- 2000
- [e1]Zhaojun Bai, James Demmel, Jack J. Dongarra, Axel Ruhe, Henk A. van der Vorst:
Templates for the Solution of Algebraic Eigenvalue Problems. Software, environments, tools 11, SIAM 2000, ISBN 978-0-89871-471-5
Reference Works
- 2011
- [r2]Xiaoye Sherry Li, James Demmel, John R. Gilbert, Laura Grigori, Meiyue Shao:
SuperLU. Encyclopedia of Parallel Computing 2011: 1955-1962 - 2007
- [r1]Yozo Hida, James Demmel, Julien Langou, Jakub Kurzak, Ming Gu, Alfredo Buttari, Stanimire Tomov, Piotr Luszczek, Julie Langou, Osni Marques, Christof Vömel, Xiaoye S. Li, E. Jason Riedy, Jack J. Dongarra, William Kahan, Beresford N. Parlett, David Bindel:
Prospectus for a Dense Linear Algebra Software Library. Handbook of Parallel Computing 2007
Informal and Other Publications
- 2024
- [i57]Xuan Jiang, Raja Sengupta, James Demmel, Samuel Williams:
LPSim: Large Scale Multi-GPU Parallel Computing based Regional Scale Traffic Simulation Framework. CoRR abs/2406.08496 (2024) - [i56]Ziming Liu, Shaoyu Wang, Shenggan Cheng, Zhongkai Zhao, Xuanlei Zhao, James Demmel, Yang You:
WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem. CoRR abs/2407.00611 (2024) - [i55]Ahmad Abdelfattah, Willow Ahrens, Hartwig Anzt, Chris Armstrong, Ben Brock, Aydin Buluç, Federico Busato, Terry Cojean, Timothy A. Davis, Jim Demmel, Grace Dinh, David Gardener, Jan Fiala, Mark Gates, Azzam Haider, Toshiyuki Imamura, Pedro Valero-Lara, José E. Moreira, Xiaoye Sherry Li, Piotr Luszczek, Max Melichenko, Jose Moeira, Yvan Mokwinski, Riley Murray, Spencer Patty, Slaven Peles, Tobias Ribizel, E. Jason Riedy, Siva Rajamanickam, Piyush Sao, Manu Shantharam, Keita Teranishi, Stan Tomov, Yu-Hsiang Tsai, Heiko K. Weichelt:
Interface for Sparse Linear Algebra Operations. CoRR abs/2411.13259 (2024) - 2023
- [i54]Vivek Bharadwaj, Osman Asif Malik, Riley Murray, Laura Grigori, Aydin Buluç, James Demmel:
Fast Exact Leverage Score Sampling from Khatri-Rao Products with Applications to Tensor Decomposition. CoRR abs/2301.12584 (2023) - [i53]Riley Murray, James Demmel, Michael W. Mahoney, N. Benjamin Erichson, Maksim Melnichenko, Osman Asif Malik, Laura Grigori, Piotr Luszczek, Michal Derezinski, Miles E. Lopes, Tianyu Liang, Hengrui Luo, Jack J. Dongarra:
Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software. CoRR abs/2302.11474 (2023) - [i52]James Demmel, Ioana Dumitriu, Ryan Schneider:
Generalized Pseudospectral Shattering and Inverse-Free Matrix Pencil Diagonalization. CoRR abs/2306.03700 (2023) - [i51]Daniel Zou, Xinchen Jin, Xueyang Yu, Hao Zhang, James Demmel:
Computron: Serving Distributed Deep Learning Models with Model Parallel Swapping. CoRR abs/2306.13835 (2023) - [i50]Younghyun Cho, James Weldon Demmel, Michal Derezinski, Haoyun Li, Hengrui Luo, Michael W. Mahoney, Riley J. Murray:
Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems. CoRR abs/2308.15720 (2023) - [i49]Tianyu Liang, Riley Murray, Aydin Buluç, James Demmel:
Fast multiplication of random dense matrices with fixed sparse matrices. CoRR abs/2310.15419 (2023) - [i48]Maksim Melnichenko, Oleg Balabanov, Riley Murray, James Demmel, Michael W. Mahoney, Piotr Luszczek:
CholeskyQR with Randomization and Pivoting for Tall Matrices (CQRRPT). CoRR abs/2311.08316 (2023) - 2022
- [i47]Vivek Bharadwaj, Aydin Buluç, James Demmel:
Distributed-Memory Sparse Kernels for Machine Learning. CoRR abs/2203.07673 (2022) - [i46]Anthony Chen, James Demmel, Grace Dinh, Mason Haberle, Olga Holtz:
Communication Bounds for Convolutional Neural Networks. CoRR abs/2204.08279 (2022) - [i45]Hengrui Luo, Younghyun Cho, James Weldon Demmel, Xiaoye S. Li, Yang Liu:
Hybrid Models for Mixed Variables in Bayesian Optimization. CoRR abs/2206.01409 (2022) - [i44]James Demmel, Jack J. Dongarra, Mark Gates, Greg Henry, Julien Langou, Xiaoye S. Li, Piotr Luszczek, Weslley da Silva Pereira, E. Jason Riedy, Cindy Rubio-González:
Proposed Consistent Exception Handling for the BLAS and LAPACK. CoRR abs/2207.09281 (2022) - [i43]Vivek Bharadwaj, Osman Asif Malik, Riley Murray, Aydin Buluç, James Demmel:
Distributed-Memory Randomized Algorithms for Sparse Tensor CP Decomposition. CoRR abs/2210.05105 (2022) - 2021
- [i42]Qijing Huang, Minwoo Kang, Grace Dinh, Thomas Norell, Aravind Kalaiah, James Demmel, John Wawrzynek, Yakun Sophia Shao:
CoSA: Scheduling by Constrained Optimization for Spatial Accelerators. CoRR abs/2105.01898 (2021) - [i41]Hengrui Luo, James Weldon Demmel, Younghyun Cho, Xiaoye S. Li, Yang Liu:
Non-smooth Bayesian Optimization in Tuning Problems. CoRR abs/2109.07563 (2021) - 2020
- [i40]Grace Dinh, James Demmel:
Communication-Optimal Tilings for Projective Nested Loops with Arbitrary Bounds. CoRR abs/2003.00119 (2020) - [i39]Yang You, Yuhui Wang, Huan Zhang, Zhao Zhang, James Demmel, Cho-Jui Hsieh:
The Limit of the Batch Size. CoRR abs/2006.08517 (2020) - [i38]Arissa Wongpanich, Hieu Pham, James Demmel, Mingxing Tan, Quoc V. Le, Yang You, Sameer Kumar:
Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour. CoRR abs/2011.00071 (2020) - [i37]Aditya Devarakonda, James Demmel:
Avoiding Communication in Logistic Regression. CoRR abs/2011.08281 (2020) - 2019
- [i36]Yang You, Jonathan Hseu, Chris Ying, James Demmel, Kurt Keutzer, Cho-Jui Hsieh:
Large-Batch Training for LSTM and Beyond. CoRR abs/1901.08256 (2019) - [i35]Yang You, Jing Li, Jonathan Hseu, Xiaodan Song, James Demmel, Cho-Jui Hsieh:
Reducing BERT Pre-Training Time from 3 Days to 76 Minutes. CoRR abs/1904.00962 (2019) - [i34]Swapnil Das, Jim Demmel, Kimon Fountoulakis, Laura Grigori, Michael W. Mahoney:
Parallel and Communication Avoiding Least Angle Regression. CoRR abs/1905.11340 (2019) - [i33]Wissam M. Sid-Lakhdar, Mohsen Mahmoudi Aznaveh, Xiaoye S. Li, James Weldon Demmel:
Multitask and Transfer Learning for Autotuning Exascale Applications. CoRR abs/1908.05792 (2019) - [i32]Grey Ballard, James Demmel, Ioana Dumitriu, Alexander Rusciano:
A Generalized Randomized Rank-Revealing Factorization. CoRR abs/1909.06524 (2019) - [i31]James Demmel, Laura Grigori, Alexander Rusciano:
An improved analysis and unified perspective on deterministic and randomized low rank matrix approximations. CoRR abs/1910.00223 (2019) - [i30]Ruobing Han, Yang You, James Demmel:
Auto-Precision Scaling for Distributed Deep Learning. CoRR abs/1911.08907 (2019) - 2018
- [i29]James Demmel, Grace Dinh:
Communication-Optimal Convolutional Neural Nets. CoRR abs/1802.06905 (2018) - [i28]Yang You, James Demmel, Cho-Jui Hsieh, Richard W. Vuduc:
Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems. CoRR abs/1805.00569 (2018) - [i27]Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Nicholas Knight:
A 3D Parallel Algorithm for QR Decomposition. CoRR abs/1805.05278 (2018) - 2017
- [i26]Edgar Solomonik, James Demmel, Torsten Hoefler:
Communication Lower Bounds of Bilinear Algorithms for Symmetric Tensor Contractions. CoRR abs/1707.04618 (2017) - [i25]Yang You, Aydin Buluç, James Demmel:
Scaling Deep Learning on GPU and Knights Landing clusters. CoRR abs/1708.02983 (2017) - [i24]Yang You, Zhao Zhang, Cho-Jui Hsieh, James Demmel:
100-epoch ImageNet Training with AlexNet in 24 Minutes. CoRR abs/1709.05011 (2017) - [i23]Saeed Soori, Aditya Devarakonda, James Demmel, Mert Gürbüzbalaban, Maryam Mehri Dehnavi:
Avoiding Communication in Proximal Methods for Convex Optimization Problems. CoRR abs/1710.08883 (2017) - [i22]Aditya Devarakonda, Kimon Fountoulakis, James Demmel, Michael W. Mahoney:
Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization. CoRR abs/1712.06047 (2017) - 2016
- [i21]Edgar Solomonik, Grey Ballard, James Demmel, Torsten Hoefler:
A communication-avoiding parallel algorithm for the symmetric eigenvalue problem. CoRR abs/1604.03703 (2016) - [i20]Alex Gittens, Aditya Devarakonda, Evan Racah, Michael F. Ringenburg, Lisa Gerhardt, Jey Kottalam, Jialin Liu, Kristyn J. Maschhoff, Shane Canon, Jatin Chhugani, Pramod Sharma, Jiyan Yang, James Demmel, Jim Harrell, Venkat Krishnamurthy, Michael W. Mahoney, Prabhat:
Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies. CoRR abs/1607.01335 (2016) - [i19]James Demmel, Alex Rusciano:
Parallelepipeds obtaining HBL lower bounds. CoRR abs/1611.05944 (2016) - [i18]Aditya Devarakonda, Kimon Fountoulakis, James Demmel, Michael W. Mahoney:
Avoiding communication in primal and dual block coordinate descent methods. CoRR abs/1612.04003 (2016) - 2015
- [i17]Ariful Azad, Grey Ballard, Aydin Buluç, James Demmel, Laura Grigori, Oded Schwartz, Sivan Toledo, Samuel Williams:
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication. CoRR abs/1510.00844 (2015) - 2013
- [i16]Austin R. Benson, David F. Gleich, James Demmel:
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. CoRR abs/1301.1071 (2013) - [i15]Michael Christ, James Demmel, Nicholas Knight, Thomas Scanlon, Katherine A. Yelick:
Communication lower bounds and optimal algorithms for programs that reference arrays - Part 1. CoRR abs/1308.0068 (2013) - 2012
- [i14]Grey Ballard, James Demmel, Olga Holtz, Benjamin Lipshitz, Oded Schwartz:
Communication-Optimal Parallel Algorithm for Strassen's Matrix Multiplication. CoRR abs/1202.3173 (2012) - [i13]Grey Ballard, James Demmel, Olga Holtz, Benjamin Lipshitz, Oded Schwartz:
Strong Scaling of Matrix Multiplication Algorithms and Memory-Independent Communication Lower Bounds. CoRR abs/1202.3177 (2012) - [i12]Amal Khabou, James Demmel, Laura Grigori, Ming Gu:
LU factorization with panel rank revealing pivoting and its communication avoiding version. CoRR abs/1208.2451 (2012) - [i11]Grey Ballard, James Demmel, Olga Holtz, Benjamin Lipshitz, Oded Schwartz:
Graph Expansion Analysis for Communication Costs of Fast Rectangular Matrix Multiplication. CoRR abs/1209.2184 (2012) - 2011
- [i10]Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz:
Graph Expansion and Communication Costs of Fast Matrix Multiplication. CoRR abs/1109.1693 (2011) - 2010
- [i9]Grey Ballard, James Demmel, Ioana Dumitriu:
Minimizing Communication for Eigenproblems and the Singular Value Decomposition. CoRR abs/1011.3077 (2010) - 2009
- [i8]Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz:
Communication-optimal Parallel and Sequential Cholesky Decomposition. CoRR abs/0902.2537 (2009) - [i7]Grey Ballard, James Demmel, Olga Holtz, Oded Schwartz:
Minimizing Communication in Linear Algebra. CoRR abs/0905.2485 (2009) - 2008
- [i6]James Demmel, Laura Grigori, Mark Hoemmen, Julien Langou:
Communication-avoiding parallel and sequential QR factorizations. CoRR abs/0806.2159 (2008) - 2007
- [i5]James Demmel, Ioana Dumitriu, Olga Holtz, Plamen Koev:
Accurate and Efficient Expression Evaluation and Linear Algebra. CoRR abs/0712.4027 (2007) - 2006
- [i4]James Demmel, Ioana Dumitriu, Olga Holtz, Robert Kleinberg:
Fast matrix multiplication is stable. CoRR abs/math/0603207 (2006) - [i3]James Demmel, Ioana Dumitriu, Olga Holtz:
Fast linear algebra is stable. CoRR abs/math/0612264 (2006) - 2005
- [i2]James Demmel, Ioana Dumitriu, Olga Holtz:
Toward accurate polynomial evaluation in rounded arithmetic (short report). Algebraic and Numerical Algorithms and Computer-assisted Proofs 2005 - [i1]James Demmel, Ioana Dumitriu, Olga Holtz:
Toward accurate polynomial evaluation in rounded arithmetic. CoRR abs/math/0508350 (2005)
last updated on 2025-01-13 01:59 CET by the dblp team
all metadata released as open data under CC0 1.0 license