Jeffrey Pennington
Journal Articles
- 2024
- [j2]Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron T. Parisi, Abhishek Kumar, Alexander A. Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Fathy Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura Culp, Lechao Xiao, Maxwell L. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yundi Qian, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel:
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models. Trans. Mach. Learn. Res. 2024 (2024)
- 2023
- [j1]Atish Agarwala, Samuel Stern Schoenholz, Jeffrey Pennington, Yann N. Dauphin:
Temperature check: theory and practice for training models with softmax-cross-entropy losses. Trans. Mach. Learn. Res. 2023 (2023)
Conference and Workshop Papers
- 2024
- [c36]Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie E. Everett, Alexander A. Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith:
Small-scale proxies for large-scale Transformer training instabilities. ICLR 2024
- [c35]Katie E. Everett, Lechao Xiao, Mitchell Wortsman, Alexander A. Alemi, Roman Novak, Peter J. Liu, Izzeddin Gur, Jascha Sohl-Dickstein, Leslie Pack Kaelbling, Jaehoon Lee, Jeffrey Pennington:
Scaling Exponents Across Parameterizations and Optimizers. ICML 2024
- 2023
- [c34]Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington:
Second-order regression models exhibit progressive sharpening to the edge of stability. ICML 2023: 169-195
- 2022
- [c33]Ben Adlam, Jake A. Levinson, Jeffrey Pennington:
A Random Matrix Perspective on Mixtures of Nonlinearities in High Dimensions. AISTATS 2022: 3434-3457
- [c32]Jeffrey Pennington, Rose Hartman, Ashwini Davison, Ali Shokoufandeh, Joy Payton, Daniel Chen, André Dietrich:
Online education for data science: Opportunities and challenges. AMIA 2022
- [c31]Gabriel Mel, Jeffrey Pennington:
Anisotropic Random Feature Regression in High Dimensions. ICLR 2022
- [c30]Jiri Hron, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein:
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling. ICML 2022: 8926-8945
- [c29]Lechao Xiao, Jeffrey Pennington:
Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm. ICML 2022: 24347-24369
- [c28]Courtney Paquette, Elliot Paquette, Ben Adlam, Jeffrey Pennington:
Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions. NeurIPS 2022
- [c27]Lechao Xiao, Hong Hu, Theodor Misiakiewicz, Yue Lu, Jeffrey Pennington:
Precise Learning Curves and Higher-Order Scalings for Dot-product Kernel Regression. NeurIPS 2022
- 2021
- [c26]Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek:
Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit. ICLR 2021
- [c25]Nilesh Tripuraneni, Ben Adlam, Jeffrey Pennington:
Overparameterization Improves Robustness to Covariate Shift in High Dimensions. NeurIPS 2021: 13883-13897
- 2020
- [c24]Wei Hu, Lechao Xiao, Jeffrey Pennington:
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks. ICLR 2020
- [c23]Ben Adlam, Jeffrey Pennington:
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization. ICML 2020: 74-84
- [c22]Lechao Xiao, Jeffrey Pennington, Samuel Stern Schoenholz:
Disentangling Trainability and Generalization in Deep Neural Networks. ICML 2020: 10462-10472
- [c21]Ben Adlam, Jeffrey Pennington:
Understanding Double Descent Requires A Fine-Grained Bias-Variance Decomposition. NeurIPS 2020
- [c20]Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington:
The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks. NeurIPS 2020
- [c19]Jaehoon Lee, Samuel S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein:
Finite Versus Infinite Neural Networks: an Empirical Study. NeurIPS 2020
- 2019
- [c18]Krzysztof Choromanski, Aldo Pacchiano, Jeffrey Pennington, Yunhao Tang:
KAMA-NNs: Low-dimensional Rotation Based Neural Networks. AISTATS 2019: 236-245
- [c17]Roman Novak, Lechao Xiao, Yasaman Bahri, Jaehoon Lee, Greg Yang, Jiri Hron, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein:
Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes. ICLR (Poster) 2019
- [c16]Greg Yang, Jeffrey Pennington, Vinay Rao, Jascha Sohl-Dickstein, Samuel S. Schoenholz:
A Mean Field Theory of Batch Normalization. ICLR (Poster) 2019
- [c15]Jaehoon Lee, Lechao Xiao, Samuel S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Sohl-Dickstein, Jeffrey Pennington:
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent. NeurIPS 2019: 8570-8581
- 2018
- [c14]Jeffrey Pennington, Samuel S. Schoenholz, Surya Ganguli:
The emergence of spectral universality in deep networks. AISTATS 2018: 1924-1932
- [c13]Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein:
Deep Neural Networks as Gaussian Processes. ICLR (Poster) 2018
- [c12]Roman Novak, Yasaman Bahri, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein:
Sensitivity and Generalization in Neural Networks: an Empirical Study. ICLR (Poster) 2018
- [c11]Minmin Chen, Jeffrey Pennington, Samuel S. Schoenholz:
Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks. ICML 2018: 872-881
- [c10]Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington:
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks. ICML 2018: 5389-5398
- [c9]Jeffrey Pennington, Pratik Worah:
The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network. NeurIPS 2018: 5415-5424
- 2017
- [c8]Jeffrey Pennington, Yasaman Bahri:
Geometry of Neural Network Loss Surfaces via Random Matrix Theory. ICML 2017: 2798-2806
- [c7]Jeffrey Pennington, Pratik Worah:
Nonlinear random matrix theory for deep learning. NIPS 2017: 2637-2646
- [c6]Jeffrey Pennington, Samuel S. Schoenholz, Surya Ganguli:
Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice. NIPS 2017: 4785-4795
- 2016
- [c5]Kevin Murphy, Jeffrey Pennington, Aaron N. Browne, Byron Ruth, Ritu Khare, LeMar Davidson, Patricia K. Morris, Levon H. Utidjian, Charles Bailey:
Clinical Data Research Network Lessons Learned. CRI 2016
- 2015
- [c4]Jeffrey Pennington, Felix X. Yu, Sanjiv Kumar:
Spherical Random Features for Polynomial Kernels. NIPS 2015: 1846-1854
- 2014
- [c3]Jeffrey Pennington, Richard Socher, Christopher D. Manning:
GloVe: Global Vectors for Word Representation. EMNLP 2014: 1532-1543
- 2011
- [c2]Richard Socher, Jeffrey Pennington, Eric H. Huang, Andrew Y. Ng, Christopher D. Manning:
Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions. EMNLP 2011: 151-161
- [c1]Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, Christopher D. Manning:
Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection. NIPS 2011: 801-809
Informal and Other Publications
- 2024
- [i35]Brian Lester, Jaehoon Lee, Alex Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein, Noah Constant:
Training LLMs over Neurally Compressed Text. CoRR abs/2404.03626 (2024)
- [i34]Atish Agarwala, Jeffrey Pennington:
High dimensional analysis reveals conservative sharpening and a stochastic edge of stability. CoRR abs/2404.19261 (2024)
- [i33]Elliot Paquette, Courtney Paquette, Lechao Xiao, Jeffrey Pennington:
4+3 Phases of Compute-Optimal Neural Scaling Laws. CoRR abs/2405.15074 (2024)
- [i32]Katie Everett, Lechao Xiao, Mitchell Wortsman, Alexander A. Alemi, Roman Novak, Peter J. Liu, Izzeddin Gur, Jascha Sohl-Dickstein, Leslie Pack Kaelbling, Jaehoon Lee, Jeffrey Pennington:
Scaling Exponents Across Parameterizations and Optimizers. CoRR abs/2407.05872 (2024)
- [i31]Jiri Hron, Laura Culp, Gamaleldin F. Elsayed, Rosanne Liu, Ben Adlam, Maxwell L. Bileschi, Bernd Bohnet, JD Co-Reyes, Noah Fiedel, C. Daniel Freeman, Izzeddin Gur, Kathleen Kenealy, Jaehoon Lee, Peter J. Liu, Gaurav Mishra, Igor Mordatch, Azade Nova, Roman Novak, Aaron Parisi, Jeffrey Pennington, Alex Rizkowsky, Isabelle Simpson, Hanie Sedghi, Jascha Sohl-Dickstein, Kevin Swersky, Sharad Vikram, Tris Warkentin, Lechao Xiao, Kelvin Xu, Jasper Snoek, Simon Kornblith:
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability. CoRR abs/2408.07852 (2024)
- 2023
- [i30]Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, Alex Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith:
Small-scale proxies for large-scale Transformer training instabilities. CoRR abs/2309.14322 (2023)
- [i29]C. Daniel Freeman, Laura Culp, Aaron Parisi, Maxwell L. Bileschi, Gamaleldin F. Elsayed, Alex Rizkowsky, Isabelle Simpson, Alex Alemi, Azade Nova, Ben Adlam, Bernd Bohnet, Gaurav Mishra, Hanie Sedghi, Igor Mordatch, Izzeddin Gur, Jaehoon Lee, John D. Co-Reyes, Jeffrey Pennington, Kelvin Xu, Kevin Swersky, Kshiteej Mahajan, Lechao Xiao, Rosanne Liu, Simon Kornblith, Noah Constant, Peter J. Liu, Roman Novak, Yundi Qian, Noah Fiedel, Jascha Sohl-Dickstein:
Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?" CoRR abs/2311.07587 (2023)
- [i28]Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin F. Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura Culp, Lechao Xiao, Maxwell L. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yundi Qian, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel:
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models. CoRR abs/2312.06585 (2023)
- 2022
- [i27]Lechao Xiao, Jeffrey Pennington:
Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression. CoRR abs/2205.14846 (2022)
- [i26]Courtney Paquette, Elliot Paquette, Ben Adlam, Jeffrey Pennington:
Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions. CoRR abs/2206.07252 (2022)
- [i25]Jiri Hron, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein:
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling. CoRR abs/2206.07673 (2022)
- [i24]Lechao Xiao, Jeffrey Pennington:
Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm. CoRR abs/2207.04612 (2022)
- [i23]Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington:
Second-order regression models exhibit progressive sharpening to the edge of stability. CoRR abs/2210.04860 (2022)
- 2021
- [i22]Nilesh Tripuraneni, Ben Adlam, Jeffrey Pennington:
Covariate Shift in High-Dimensional Random Feature Regression. CoRR abs/2111.08234 (2021)
- 2020
- [i21]Wei Hu, Lechao Xiao, Jeffrey Pennington:
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks. CoRR abs/2001.05992 (2020)
- [i20]Jiri Hron, Yasaman Bahri, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein:
Exact posterior distributions of wide Bayesian neural networks. CoRR abs/2006.10541 (2020)
- [i19]Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington:
The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks. CoRR abs/2006.14599 (2020)
- [i18]Jaehoon Lee, Samuel S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein:
Finite Versus Infinite Neural Networks: an Empirical Study. CoRR abs/2007.15801 (2020)
- [i17]Ben Adlam, Jeffrey Pennington:
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization. CoRR abs/2008.06786 (2020)
- [i16]Atish Agarwala, Jeffrey Pennington, Yann N. Dauphin, Samuel S. Schoenholz:
Temperature check: theory and practice for training models with softmax-cross-entropy losses. CoRR abs/2010.07344 (2020)
- [i15]Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek:
Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit. CoRR abs/2010.07355 (2020)
- [i14]Ben Adlam, Jeffrey Pennington:
Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition. CoRR abs/2011.03321 (2020)
- 2019
- [i13]Dar Gilboa, Bo Chang, Minmin Chen, Greg Yang, Samuel S. Schoenholz, Ed H. Chi, Jeffrey Pennington:
Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs. CoRR abs/1901.08987 (2019)
- [i12]Jaehoon Lee, Lechao Xiao, Samuel S. Schoenholz, Yasaman Bahri, Jascha Sohl-Dickstein, Jeffrey Pennington:
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent. CoRR abs/1902.06720 (2019)
- [i11]Greg Yang, Jeffrey Pennington, Vinay Rao, Jascha Sohl-Dickstein, Samuel S. Schoenholz:
A Mean Field Theory of Batch Normalization. CoRR abs/1902.08129 (2019)
- [i10]Ben Adlam, Jake Levinson, Jeffrey Pennington:
A Random Matrix Perspective on Mixtures of Nonlinearities for Deep Learning. CoRR abs/1912.00827 (2019)
- [i9]Lechao Xiao, Jeffrey Pennington, Samuel S. Schoenholz:
Disentangling trainability and generalization in deep learning. CoRR abs/1912.13053 (2019)
- 2018
- [i8]Roman Novak, Yasaman Bahri, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein:
Sensitivity and Generalization in Neural Networks: an Empirical Study. CoRR abs/1802.08760 (2018)
- [i7]Jeffrey Pennington, Samuel S. Schoenholz, Surya Ganguli:
The Emergence of Spectral Universality in Deep Networks. CoRR abs/1802.09979 (2018)
- [i6]Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington:
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks. CoRR abs/1806.05393 (2018)
- [i5]Minmin Chen, Jeffrey Pennington, Samuel S. Schoenholz:
Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks. CoRR abs/1806.05394 (2018)
- [i4]Roman Novak, Lechao Xiao, Jaehoon Lee, Yasaman Bahri, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein:
Bayesian Convolutional Neural Networks with Many Channels are Gaussian Processes. CoRR abs/1810.05148 (2018)
- 2017
- [i3]Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein:
A Correspondence Between Random Neural Networks and Statistical Field Theory. CoRR abs/1710.06570 (2017)
- [i2]Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein:
Deep Neural Networks as Gaussian Processes. CoRR abs/1711.00165 (2017)
- [i1]Jeffrey Pennington, Samuel S. Schoenholz, Surya Ganguli:
Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice. CoRR abs/1711.04735 (2017)