Daniel P. W. Ellis
Person information
- affiliation: Columbia University, New York City, USA
2020 – today
- 2023
- [c147]R. Channing Moore, Daniel P. W. Ellis, Eduardo Fonseca, Shawn Hershey, Aren Jansen, Manoj Plakal:
Dataset Balancing Can Hurt Model Performance. ICASSP 2023: 1-5
- [i18]R. Channing Moore, Daniel P. W. Ellis, Eduardo Fonseca, Shawn Hershey, Aren Jansen, Manoj Plakal:
Dataset balancing can hurt model performance. CoRR abs/2307.00079 (2023)
- 2022
- [c146]Francesca Ronchini, Samuele Cornell, Romain Serizel, Nicolas Turpault, Eduardo Fonseca, Daniel P. W. Ellis:
Description and Analysis of Novelties Introduced in DCASE Task 4 2022 on the Baseline System. DCASE 2022
- [c145]Qingqing Huang, Aren Jansen, Joonseok Lee, Ravi Ganti, Judith Yue Li, Daniel P. W. Ellis:
MuLan: A Joint Embedding of Music Audio and Natural Language. ISMIR 2022: 559-566
- [i17]Qingqing Huang, Aren Jansen, Joonseok Lee, Ravi Ganti, Judith Yue Li, Daniel P. W. Ellis:
MuLan: A Joint Embedding of Music Audio and Natural Language. CoRR abs/2208.12415 (2022)
- [i16]Francesca Ronchini, Samuele Cornell, Romain Serizel, Nicolas Turpault, Eduardo Fonseca, Daniel P. W. Ellis:
Description and analysis of novelties introduced in DCASE Task 4 2022 on the baseline system. CoRR abs/2210.07856 (2022)
- 2021
- [c144]Scott Wisdom, Hakan Erdogan, Daniel P. W. Ellis, Romain Serizel, Nicolas Turpault, Eduardo Fonseca, Justin Salamon, Prem Seetharaman, John R. Hershey:
What's all the Fuss about Free Universal Sound Separation Data? ICASSP 2021: 186-190
- [c143]Shawn Hershey, Daniel P. W. Ellis, Eduardo Fonseca, Aren Jansen, Caroline Liu, R. Channing Moore, Manoj Plakal:
The Benefit of Temporally-Strong Labels in Audio Event Classification. ICASSP 2021: 366-370
- [c142]Efthymios Tzinis, Scott Wisdom, Aren Jansen, Shawn Hershey, Tal Remez, Dan Ellis, John R. Hershey:
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds. ICLR 2021
- [c141]Eduardo Fonseca, Aren Jansen, Daniel P. W. Ellis, Scott Wisdom, Marco Tagliasacchi, John R. Hershey, Manoj Plakal, Shawn Hershey, R. Channing Moore, Xavier Serra:
Self-Supervised Learning from Automatically Separated Sound Scenes. WASPAA 2021: 251-255
- [e3]Frederic Font, Annamaria Mesaros, Daniel P. W. Ellis, Eduardo Fonseca, Magdalena Fuentes, Benjamin Elizalde:
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), Online, November 15-19, 2021. 2021, ISBN 978-84-09-36072-7
- [i15]Eduardo Fonseca, Aren Jansen, Daniel P. W. Ellis, Scott Wisdom, Marco Tagliasacchi, John R. Hershey, Manoj Plakal, Shawn Hershey, R. Channing Moore, Xavier Serra:
Self-Supervised Learning from Automatically Separated Sound Scenes. CoRR abs/2105.02132 (2021)
- [i14]Shawn Hershey, Daniel P. W. Ellis, Eduardo Fonseca, Aren Jansen, Caroline Liu, R. Channing Moore, Manoj Plakal:
The Benefit Of Temporally-Strong Labels In Audio Event Classification. CoRR abs/2105.07031 (2021)
- 2020
- [j32]Eduardo Fonseca, Shawn Hershey, Manoj Plakal, Daniel P. W. Ellis, Aren Jansen, R. Channing Moore:
Addressing Missing Labels in Large-Scale Sound Event Recognition Using a Teacher-Student Framework With Loss Masking. IEEE Signal Process. Lett. 27: 1235-1239 (2020)
- [c140]Efthymios Tzinis, Scott Wisdom, John R. Hershey, Aren Jansen, Daniel P. W. Ellis:
Improving Universal Sound Separation Using Sound Classification. ICASSP 2020: 96-100
- [c139]Aren Jansen, Daniel P. W. Ellis, Shawn Hershey, R. Channing Moore, Manoj Plakal, Ashok C. Popat, Rif A. Saurous:
Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision. ICASSP 2020: 121-125
- [c138]Qingqing Huang, Aren Jansen, Li Zhang, Daniel P. W. Ellis, Rif A. Saurous, John R. Anderson:
Large-Scale Weakly-Supervised Content Embeddings for Music Recommendation and Tagging. ICASSP 2020: 8364-8368
- [d3]Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Serra:
FSDKaggle2019. Zenodo, 2020
- [i13]Eduardo Fonseca, Shawn Hershey, Manoj Plakal, Daniel P. W. Ellis, Aren Jansen, R. Channing Moore, Xavier Serra:
Addressing Missing Labels in Large-scale Sound Event Recognition using a Teacher-student Framework with Loss Masking. CoRR abs/2005.00878 (2020)
- [i12]Scott Wisdom, Hakan Erdogan, Daniel P. W. Ellis, Romain Serizel, Nicolas Turpault, Eduardo Fonseca, Justin Salamon, Prem Seetharaman, John R. Hershey:
What's All the FUSS About Free Universal Sound Separation Data? CoRR abs/2011.00803 (2020)
- [i11]Efthymios Tzinis, Scott Wisdom, Aren Jansen, Shawn Hershey, Tal Remez, Daniel P. W. Ellis, John R. Hershey:
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds. CoRR abs/2011.01143 (2020)
2010 – 2019
- 2019
- [c137]Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Serra:
Audio Tagging with Noisy Labels and Minimal Supervision. DCASE 2019: 69-73
- [c136]Eduardo Fonseca, Manoj Plakal, Daniel P. W. Ellis, Frederic Font, Xavier Favory, Xavier Serra:
Learning Sound Event Classifiers from Web Audio with Noisy Labels. ICASSP 2019: 21-25
- [e2]Michael I. Mandel, Justin Salamon, Daniel P. W. Ellis:
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), New York University, NY, USA, October 2019. 2019, ISBN 978-0-578-59596-2
- [d2]Eduardo Fonseca, Mercedes Collado, Manoj Plakal, Daniel P. W. Ellis, Frederic Font, Xavier Favory, Xavier Serra:
FSDnoisy18k. Zenodo, 2019
- [d1]Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, Manoj Plakal, Daniel P. W. Ellis, Xavier Serra:
FSDKaggle2018. Zenodo, 2019
- [i10]Eduardo Fonseca, Manoj Plakal, Daniel P. W. Ellis, Frederic Font, Xavier Favory, Xavier Serra:
Learning Sound Event Classifiers from Web Audio with Noisy Labels. CoRR abs/1901.01189 (2019)
- [i9]Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Serra:
Audio tagging with noisy labels and minimal supervision. CoRR abs/1906.02975 (2019)
- [i8]Aren Jansen, Daniel P. W. Ellis, Shawn Hershey, R. Channing Moore, Manoj Plakal, Ashok C. Popat, Rif A. Saurous:
Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision. CoRR abs/1911.05894 (2019)
- [i7]Efthymios Tzinis, Scott Wisdom, John R. Hershey, Aren Jansen, Daniel P. W. Ellis:
Improving Universal Sound Separation Using Sound Classification. CoRR abs/1911.07951 (2019)
- [i6]Ahmed Hussen Abdelaziz, Shuo-Yiin Chang, Nelson Morgan, Erik Edwards, Dorothea Kolossa, Dan Ellis, David A. Moses, Edward F. Chang:
On Neural Phone Recognition of Mixed-Source ECoG Signals. CoRR abs/1912.05869 (2019)
- 2018
- [c135]Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Favory, Jordi Pons, Xavier Serra:
General-purpose tagging of Freesound audio with AudioSet labels: task description, dataset, and baseline. DCASE 2018: 69-73
- [c134]Aren Jansen, Manoj Plakal, Ratheet Pandya, Daniel P. W. Ellis, Shawn Hershey, Jiayang Liu, R. Channing Moore, Rif A. Saurous:
Unsupervised Learning of Semantic Audio Representations. ICASSP 2018: 126-130
- [c133]Sourish Chaudhuri, Joseph Roth, Daniel P. W. Ellis, Andrew C. Gallagher, Liat Kaver, Radhika Marvin, Caroline Pantofaru, Nathan Reale, Loretta Guarino Reid, Kevin W. Wilson, Zhonghua Xi:
AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies. INTERSPEECH 2018: 1239-1243
- [e1]Mark D. Plumbley, Christian Kroos, Juan Pablo Bello, Gaël Richard, Daniel P. W. Ellis, Annamaria Mesaros:
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, DCASE 2018, Surrey, UK, November 19-20, 2018. 2018, ISBN 978-952-15-4262-6
- [i5]Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Favory, Jordi Pons, Xavier Serra:
General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline. CoRR abs/1807.09902 (2018)
- [i4]Sourish Chaudhuri, Joseph Roth, Daniel P. W. Ellis, Andrew C. Gallagher, Liat Kaver, Radhika Marvin, Caroline Pantofaru, Nathan Reale, Loretta Guarino Reid, Kevin W. Wilson, Zhonghua Xi:
AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies. CoRR abs/1808.00606 (2018)
- 2017
- [c132]Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, R. Channing Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold, Malcolm Slaney, Ron J. Weiss, Kevin W. Wilson:
CNN architectures for large-scale audio classification. ICASSP 2017: 131-135
- [c131]Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, Marvin Ritter:
Audio Set: An ontology and human-labeled dataset for audio events. ICASSP 2017: 776-780
- [c130]Aren Jansen, Jort F. Gemmeke, Daniel P. W. Ellis, Xiaofeng Liu, Wade Lawrence, Dylan Freedman:
Large-scale audio event discovery in one million YouTube videos. ICASSP 2017: 786-790
- [i3]Aren Jansen, Manoj Plakal, Ratheet Pandya, Daniel P. W. Ellis, Shawn Hershey, Jiayang Liu, R. Channing Moore, Rif A. Saurous:
Unsupervised Learning of Semantic Audio Representations. CoRR abs/1711.02209 (2017)
- 2016
- [c129]Colin Raffel, Daniel P. W. Ellis:
Optimizing DTW-based audio-to-MIDI alignment and matching. ICASSP 2016: 81-85
- [c128]Colin Raffel, Daniel P. W. Ellis:
Pruning subsequence search with attention-based embedding. ICASSP 2016: 554-558
- [c127]Colin Raffel, Daniel P. W. Ellis:
Extracting Ground-Truth Information from MIDI Files: A MIDIfesto. ISMIR 2016: 796-802
- [i2]Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, R. Channing Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold, Malcolm Slaney, Ron J. Weiss, Kevin W. Wilson:
CNN Architectures for Large-Scale Audio Classification. CoRR abs/1609.09430 (2016)
- 2015
- [j31]Diego Furtado Silva, Vinícius M. A. de Souza, Daniel P. W. Ellis, Eamonn J. Keogh, Gustavo E. A. P. A. Batista:
Exploring Low Cost Laser Sensors to Identify Flying Insect Species - Evaluation of Machine Learning and Signal Processing Methods. J. Intell. Robotic Syst. 80(Supplement-1): 313-330 (2015)
- [c126]Jonathan Le Roux, Emmanuel Vincent, John R. Hershey, Daniel P. W. Ellis:
Micbots: Collecting large realistic datasets for speech and audio research using mobile robots. ICASSP 2015: 5635-5639
- [c125]Colin Raffel, Daniel P. W. Ellis:
Large-Scale Content-Based Matching of MIDI and Audio Files. ISMIR 2015: 234-240
- [c124]Dawen Liang, Minshu Zhan, Daniel P. W. Ellis:
Content-Aware Collaborative Music Recommendation Using Pre-trained Neural Networks. ISMIR 2015: 295-301
- [c123]Brian McFee, Colin Raffel, Dawen Liang, Daniel P. W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto:
librosa: Audio and Music Signal Analysis in Python. SciPy 2015: 18-24
- [i1]Colin Raffel, Daniel P. W. Ellis:
Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems. CoRR abs/1512.08756 (2015)
- 2014
- [j30]Justin Salamon, Emilia Gómez, Daniel P. W. Ellis, Gaël Richard:
Melody Extraction from Polyphonic Music Signals: Approaches, applications, and challenges. IEEE Signal Process. Mag. 31(2): 118-134 (2014)
- [c122]Hélène Papadopoulos, Daniel P. W. Ellis:
Music-Content-Adaptive Robust Principal Component Analysis for a Semantically Consistent Separation of Foreground and Background in Music Audio Signals. DAFx 2014: 279-286
- [c121]Zhuo Chen, Hélène Papadopoulos, Daniel P. W. Ellis:
Content-adaptive speech enhancement by a sparsely-activated dictionary plus low rank decomposition. HSCMA 2014: 16-20
- [c120]Colin Raffel, Daniel P. W. Ellis:
Estimating timing and channel distortion across related signals. ICASSP 2014: 654-658
- [c119]Brian McFee, Daniel P. W. Ellis:
Better beat tracking through robust onset aggregation. ICASSP 2014: 2154-2158
- [c118]Dawen Liang, Daniel P. W. Ellis, Matthew D. Hoffman, Gautham J. Mysore:
Speech decoloration based on the product-of-filters model. ICASSP 2014: 2400-2404
- [c117]Matt McVicar, Daniel P. W. Ellis, Masataka Goto:
Leveraging repetition for improved automatic lyric transcription in popular music. ICASSP 2014: 3117-3121
- [c116]Brian McFee, Daniel P. W. Ellis:
Learning to segment songs with ordinal linear discriminant analysis. ICASSP 2014: 5197-5201
- [c115]Daniel P. W. Ellis, Hiroyuki Satoh, Zhuo Chen:
Detecting proximity from personal audio recordings. INTERSPEECH 2014: 2519-2523
- [c114]Zhuo Chen, Brian McFee, Daniel P. W. Ellis:
Speech enhancement by low-rank and convolutive dictionary spectrogram decomposition. INTERSPEECH 2014: 2833-2837
- [c113]Dawen Liang, John W. Paisley, Dan Ellis:
Codebook-based Scalable Music Tagging with Poisson Matrix Factorization. ISMIR 2014: 167-172
- [c112]Colin Raffel, Brian McFee, Eric J. Humphrey, Justin Salamon, Oriol Nieto, Dawen Liang, Daniel P. W. Ellis:
MIR_EVAL: A Transparent Implementation of Common MIR Metrics. ISMIR 2014: 367-372
- [c111]Brian McFee, Dan Ellis:
Analyzing Song Structure with Spectral Clustering. ISMIR 2014: 405-410
- 2013
- [c110]Courtenay V. Cotton, Daniel P. W. Ellis:
Subband autocorrelation features for video soundtrack classification. ICASSP 2013: 8663-8666
- [c109]Diego Furtado Silva, Vinícius M. A. de Souza, Gustavo E. A. P. A. Batista, Eamonn J. Keogh, Daniel P. W. Ellis:
Applying Machine Learning and Audio Analysis Techniques to Insect Recognition in Intelligent Traps. ICMLA (1) 2013: 99-104
- [c108]Martin Graciarena, Abeer Alwan, Dan Ellis, Horacio Franco, Luciana Ferrer, John H. L. Hansen, Adam Janin, Byung Suk Lee, Yun Lei, Vikramjit Mitra, Nelson Morgan, Seyed Omid Sadjadi, T. J. Tsai, Nicolas Scheffer, Lee Ngee Tan, Benjamin Williams:
All for one: feature combination for highly channel-degraded speech activity detection. INTERSPEECH 2013: 709-713
- [c107]Diego Furtado Silva, Hélène Papadopoulos, Gustavo Enrique De Almeida Prado Alves Batista, Daniel P. W. Ellis:
A Video Compression-Based Approach to Measure Music Structural Similarity. ISMIR 2013: 95-100
- [c106]Dawen Liang, Matthew D. Hoffman, Daniel P. W. Ellis:
Beta Process Sparse Nonnegative Matrix Factorization for Music. ISMIR 2013: 375-380
- [c105]Lisa M. Brown, Liangliang Cao, Chih-Fu Chang, Yu Cheng, Alok N. Choudhary, Noel Codella, Courtenay V. Cotton, Dan Ellis, Quanfu Fan, Rogério Schmidt Feris, Leiguang Gong, Matthew L. Hill, Gang Hua, John R. Kender, Michele Merler, Yadong Mu, Sharath Pankanti, John R. Smith, Felix X. Yu:
IBM Research and Columbia University TRECVID-2013 Multimedia Event Detection (MED), Multimedia Event Recounting (MER), Surveillance Event Detection (SED), and Semantic Indexing (SIN) Systems. TRECVID 2013
- [c104]Zhuo Chen, Daniel P. W. Ellis:
Speech enhancement by sparse, low-rank, and dictionary spectrogram decomposition. WASPAA 2013: 1-4
- [c103]Daniel J. Gillespie, Daniel P. W. Ellis:
Modeling nonlinear circuits with linearized dynamical models via kernel regression. WASPAA 2013: 1-4
- 2012
- [j29]Jón Guðnason, Mark R. P. Thomas, Daniel P. W. Ellis, Patrick A. Naylor:
Data-driven voice source waveform analysis and synthesis. Speech Commun. 54(2): 199-211 (2012)
- [c102]Josh H. McDermott, Daniel P. W. Ellis, Hideki Kawahara:
Inharmonic speech: a tool for the study of speech perception and separation. SAPA@INTERSPEECH 2012: 114-117
- [c101]Byung Suk Lee, Daniel P. W. Ellis:
Noise Robust Pitch Tracking by Subband Autocorrelation Classification. INTERSPEECH 2012: 707-710
- [c100]Thierry Bertin-Mahieux, Daniel P. W. Ellis:
Large-Scale Cover Song Recognition Using the 2D Fourier Transform Magnitude. ISMIR 2012: 241-246
- [c99]Kai Su, Mor Naaman, Avadhut Gurjar, Mohsin Patel, Daniel P. W. Ellis:
Making a scene: alignment of complete sets of clips based on pairwise audio match. ICMR 2012: 26
- [c98]Gerald Friedland, Daniel P. W. Ellis, Florian Metze:
AMVA'12: ACM international workshop on audio and multimedia methods for large-scale video analysis. ACM Multimedia 2012: 1513-1514
- [c97]Liangliang Cao, Shih-Fu Chang, Noel Codella, Courtenay V. Cotton, Dan Ellis, Leiguang Gong, Matthew L. Hill, Gang Hua, John R. Kender, Michele Merler, Yadong Mu, John R. Smith, Felix X. Yu:
IBM Research and Columbia University TRECVID-2012 Multimedia Event Detection (MED), Multimedia Event Recounting (MER), and Semantic Indexing (SIN) Systems. TRECVID 2012
- [c96]Brian McFee, Thierry Bertin-Mahieux, Daniel P. W. Ellis, Gert R. G. Lanckriet:
The million song dataset challenge. WWW (Companion Volume) 2012: 909-916
- 2011
- [b2]Ben Gold, Nelson Morgan, Dan Ellis:
Speech and Audio Signal Processing - Processing and Perception of Speech and Music, Second Edition. Wiley 2011, ISBN 978-0-470-19536-9, pp. I-XXII, 1-661
- [j28]Meinard Müller, Daniel P. W. Ellis, Anssi Klapuri, Gaël Richard, Shigeki Sagayama:
Introduction to the Special Issue on Music Signal Processing. IEEE J. Sel. Top. Signal Process. 5(6): 1085-1087 (2011)
- [j27]Meinard Müller, Daniel P. W. Ellis, Anssi Klapuri, Gaël Richard:
Signal Processing for Music Analysis. IEEE J. Sel. Top. Signal Process. 5(6): 1088-1110 (2011)
- [j26]Graham Grindlay, Daniel P. W. Ellis:
Transcribing Multi-Instrument Polyphonic Music With Hierarchical Eigeninstruments. IEEE J. Sel. Top. Signal Process. 5(6): 1159-1169 (2011)
- [j25]Ron J. Weiss, Michael I. Mandel, Daniel P. W. Ellis:
Combining localization cues and source model constraints for binaural source separation. Speech Commun. 53(5): 606-621 (2011)
- [c95]Thierry Bertin-Mahieux, Graham Grindlay, Ron J. Weiss, Daniel P. W. Ellis:
Evaluating music sequence models through missing data. ICASSP 2011: 177-180
- [c94]Christos Vezyrtzis, Aaron E. Klein, Dan Ellis, Yannis P. Tsividis:
Direct processing of mpeg audio using companding and BFP techniques. ICASSP 2011: 361-364
- [c93]Courtenay V. Cotton, Daniel P. W. Ellis, Alexander C. Loui:
Soundtrack classification by transient events. ICASSP 2011: 473-476
- [c92]Daniel P. W. Ellis, Xiaohong Zeng, Josh H. McDermott:
Classifying soundtracks with audio texture features. ICASSP 2011: 5880-5883
- [c91]Fadi Biadsy, Julia Hirschberg, Daniel P. W. Ellis:
Dialect and Accent Recognition Using Phonetic-Segmentation Supervectors. INTERSPEECH 2011: 745-748
- [c90]Thierry Bertin-Mahieux, Daniel P. W. Ellis, Brian Whitman, Paul Lamere:
The Million Song Dataset. ISMIR 2011: 591-596
- [c89]Yu-Gang Jiang, Guangnan Ye, Shih-Fu Chang, Daniel P. W. Ellis, Alexander C. Loui:
Consumer video understanding: a benchmark database and an evaluation of human and machine performance. ICMR 2011: 29
- [c88]Liangliang Cao, Shih-Fu Chang, Noel Codella, Courtenay V. Cotton, Dan Ellis, Leiguang Gong, Matthew L. Hill, Gang Hua, John R. Kender, Michele Merler, Yadong Mu, Apostol Natsev, John R. Smith:
IBM Research and Columbia University TRECVID-2011 Multimedia Event Detection (MED) System. TRECVID 2011
- [c87]Courtenay V. Cotton, Daniel P. W. Ellis:
Spectral vs. spectro-temporal features for acoustic event detection. WASPAA 2011: 69-72
- [c86]Thierry Bertin-Mahieux, Daniel P. W. Ellis:
Large-scale cover song recognition using hashed chroma landmarks. WASPAA 2011: 117-120
- [c85]Dan Ellis:
General chair's introduction. WASPAA 2011
- 2010
- [j24]Ron J. Weiss, Daniel P. W. Ellis:
Speech separation using speaker-adapted eigenvoice speech models. Comput. Speech Lang. 24(1): 16-29 (2010)
- [j23]Michael I. Mandel, Ron J. Weiss, Daniel P. W. Ellis:
Model-Based Expectation-Maximization Source Separation and Localization. IEEE Trans. Speech Audio Process. 18(2): 382-394 (2010)
- [j22]Keansub Lee, Daniel P. W. Ellis:
Audio-Based Semantic Concept Classification for Consumer Video. IEEE Trans. Speech Audio Process. 18(6): 1406-1416 (2010)
- [j21]Michael I. Mandel, Scott Bressler, Barbara G. Shinn-Cunningham, Daniel P. W. Ellis:
Evaluating Source Separation Algorithms With Reverberant Speech. IEEE Trans. Speech Audio Process. 18(7): 1872-1883 (2010)
- [j20]Wei Jiang, Courtenay V. Cotton, Shih-Fu Chang, Dan Ellis, Alexander C. Loui:
Audio-visual atoms for generic video concept classification. ACM Trans. Multim. Comput. Commun. Appl. 6(3): 14:1-14:19 (2010)
- [c84]Suman V. Ravuri, Daniel P. W. Ellis:
Cover song detection: From high scores to general classification. ICASSP 2010: 65-68
- [c83]Keansub Lee, Daniel P. W. Ellis, Alexander C. Loui:
Detecting local semantic concepts in environmental sounds using Markov model based clustering. ICASSP 2010: 2278-2281
- [c82]Courtenay V. Cotton, Daniel P. W. Ellis:
Audio fingerprinting to identify multiple videos of an event. ICASSP 2010: 2386-2389
- [c81]Graham Grindlay, Daniel P. W. Ellis:
A Probabilistic Subspace Model for Multi-instrument Polyphonic Transcription. ISMIR 2010: 21-26
- [c80]Thierry Bertin-Mahieux, Ron J. Weiss, Daniel P. W. Ellis:
Clustering Beat-Chroma Patterns in a Large Music Database. ISMIR 2010: 111-116
- [c79]Yu-Gang Jiang, Xiaohong Zeng, Guangnan Ye, Dan Ellis, Shih-Fu Chang, Subhabrata Bhattacharya, Mubarak Shah:
Columbia-UCF TRECVID2010 Multimedia Event Detection: Combining Multiple Modalities, Contextual Concepts, and Temporal Matching. TRECVID 2010
2000 – 2009
- 2009
- [j19]Jesper Højvang Jensen, Mads Græsbøll Christensen, Daniel P. W. Ellis, Søren Holdt Jensen:
Quantitative Analysis of a Common Audio Similarity Measure. IEEE Trans. Speech Audio Process. 17(4): 693-703 (2009)
- [c78]Jesper Bünsow Boldt, Daniel P. W. Ellis:
A simple correlation-based model of intelligibility for nonlinear speech enhancement and separation. EUSIPCO 2009: 1849-1853
- [c77]Ron J. Weiss, Daniel P. W. Ellis:
A variational EM algorithm for learning eigenvoice parameters in mixed signals. ICASSP 2009: 113-116
- [c76]Johanna Devaney, Daniel P. W. Ellis:
Handling Asynchrony in Audio-Score Alignment. ICMC 2009
- [c75]Douglas Eck, Dan Ellis, Philippe Hamel:
Workshop summary: Sparse methods for music audio. ICML 2009: 11
- [c74]Adrian Weller, Daniel P. W. Ellis, Tony Jebara:
Structured Prediction Models for Chord Transcription of Music Audio. ICMLA 2009: 590-595
- [c73]Jón Guðnason, Mark R. P. Thomas, Patrick A. Naylor, Daniel P. W. Ellis:
Voice source waveform analysis and synthesis using principal component analysis and Gaussian mixture modelling. INTERSPEECH 2009: 108-111
- [c72]Wei Jiang, Courtenay V. Cotton, Shih-Fu Chang, Dan Ellis, Alexander C. Loui:
Short-term audio-visual atoms for generic video concept classification. ACM Multimedia 2009: 5-14
- [c71]Christine Smit, Daniel P. W. Ellis:
Guided harmonic sinusoid estimation in a multi-pitch environment. WASPAA 2009: 41-44
- [c70]Johanna Devaney, Michael I. Mandel, Daniel P. W. Ellis:
Improving MIDI-audio alignment with acoustic features. WASPAA 2009: 45-48
- [c69]Graham Grindlay, Daniel P. W. Ellis:
Multi-voice polyphonic music transcription using eigeninstruments. WASPAA 2009: 53-56
- [c68]Michael I. Mandel, Daniel P. W. Ellis:
The Ideal Interaural Parameter Mask: A bound on binaural separation systems. WASPAA 2009: 85-88
- [c67]Courtenay V. Cotton, Daniel P. W. Ellis:
Finding similar acoustic events using matching pursuit and locality-sensitive hashing. WASPAA 2009: 125-128
- 2008
- [j18]Thomas S. Huang, Charlie K. Dagli, Shyamsundar Rajaram, Edward Y. Chang, Michael I. Mandel, Graham E. Poliner, Daniel P. W. Ellis:
Active Learning for Interactive Multimedia Retrieval. Proc. IEEE 96(4): 648-667 (2008)
- [c66]Keansub Lee, Daniel P. W. Ellis:
Detecting music in ambient audio by long-window autocorrelation. ICASSP 2008: 9-12
- [c65]Daniel P. W. Ellis, Courtenay V. Cotton, Michael I. Mandel:
Cross-correlation of beat-synchronous representations for music similarity. ICASSP 2008: 57-60
- [c64]Jesper Højvang Jensen, Mads Græsbøll Christensen, Daniel P. W. Ellis, Søren Holdt Jensen:
A tempo-insensitive distance measure for cover song identification based on chroma features. ICASSP 2008: 2209-2212
- [c63]Suman V. Ravuri, Daniel P. W. Ellis:
Stylization of pitch with syllable-based linear segments. ICASSP 2008: 3985-3988
- [c62]Ke Hu, Pierre L. Divenyi, Daniel P. W. Ellis, Zhaozhang Jin, Barbara G. Shinn-Cunningham, DeLiang Wang:
Preliminary intelligibility tests of a monaural speech segregation system. SAPA@INTERSPEECH 2008: 11-16
- [c61]Adam C. Lammert, Daniel P. W. Ellis, Pierre L. Divenyi:
Data-driven articulatory inversion incorporating articulator priors. SAPA@INTERSPEECH 2008: 29-34
- [c60]Ron J. Weiss, Michael I. Mandel, Daniel P. W. Ellis:
Source separation based on binaural cues and source model constraints. INTERSPEECH 2008: 419-422
- [c59]Michael I. Mandel, Daniel P. W. Ellis:
Multiple-Instance Learning for Music Information Retrieval. ISMIR 2008: 577-582
- 2007
- [j17]Graham E. Poliner, Daniel P. W. Ellis:
A Discriminative Model for Polyphonic Piano Transcription. EURASIP J. Adv. Signal Process. 2007 (2007)
- [j16]Patricia Scanlon, Daniel P. W. Ellis, Richard B. Reilly:
Using Broad Phonetic Group Experts for Improved Speech Recognition. IEEE Trans. Speech Audio Process. 15(3): 803-812 (2007)
- [j15]Graham E. Poliner, Daniel P. W. Ellis, Andreas F. Ehmann, Emilia Gómez, Sebastian Streich, Beesuan Ong:
Melody Transcription From Music Audio: Approaches and Evaluation. IEEE Trans. Speech Audio Process. 15(4): 1247-1256 (2007)
- [j14]Marios Athineos, Daniel P. W. Ellis:
Autoregressive Modeling of Temporal Envelopes. IEEE Trans. Signal Process. 55(11): 5237-5245 (2007)
- [c58]James P. Ogle, Daniel P. W. Ellis:
Fingerprinting to Identify Repeated Sound Events in Long-Duration Personal Audio Recordings. ICASSP (1) 2007: 233-236
- [c57]Daniel P. W. Ellis, Graham E. Poliner:
Identifying 'Cover Songs' with Chroma Features and Dynamic Programming Beat Tracking. ICASSP (4) 2007: 1429-1432
- [c56]Jesper Højvang Jensen, Daniel P. W. Ellis, Mads Græsbøll Christensen, Søren Holdt Jensen:
Evaluation of Distance Measures Between Gaussian Mixture Models of MFCCs. ISMIR 2007: 107-108
- [c55]Daniel P. W. Ellis:
Classifying Music Audio with Timbral and Chroma Features. ISMIR 2007: 339-340
- [c54]Michael I. Mandel, Daniel P. W. Ellis:
A Web-Based Game for Collecting Music Metadata. ISMIR 2007: 365-366
- [c53]Alexander C. Loui, Jiebo Luo, Shih-Fu Chang, Dan Ellis, Wei Jiang, Lyndon S. Kennedy, Keansub Lee, Akira Yanagawa:
Kodak's consumer video benchmark data set: concept definition and annotation. Multimedia Information Retrieval 2007: 245-254
- [c52]Shih-Fu Chang, Dan Ellis, Wei Jiang, Keansub Lee, Akira Yanagawa, Alexander C. Loui, Jiebo Luo:
Large-scale multimodal semantic concept detection for consumer video. Multimedia Information Retrieval 2007: 255-264
- [c51]Aiden R. Doherty, Alan F. Smeaton, Keansub Lee, Daniel P. W. Ellis:
Multimodal Segmentation of Lifelog Data. RIAO 2007: 21-38
- 2006
- [j13]Daniel P. W. Ellis:
Extracting information from music audio. Commun. ACM 49(8): 32-37 (2006)
- [j12]Daniel P. W. Ellis, Keansub Lee:
Accessing Minimal-Impact Personal Audio Archives. IEEE Multim. 13(4): 30-38 (2006)
- [j11]Daniel P. W. Ellis, Graham E. Poliner:
Classification-based melody transcription. Mach. Learn. 65(2-3): 439-456 (2006)
- [j10]Michael I. Mandel, Graham E. Poliner, Daniel P. W. Ellis:
Support vector machine active learning for music retrieval. Multim. Syst. 12(1): 3-13 (2006)
- [j9]Nicholas Weaver, Dan Ellis:
White Worms Don't Work. login Usenix Mag. 31(6) (2006)
- [c50]Xanadu Halkias, Daniel P. W. Ellis:
Estimating the Number of Marine Mammals Using Recordings of Clicks from One Microphone. ICASSP (5) 2006: 769-772
- [c49]Daniel P. W. Ellis, Ron J. Weiss:
Model-Based Monaural Source Separation Using a Vector-Quantized Phase-Vocoder Representation. ICASSP (5) 2006: 957-960
- [c48]Keansub Lee, Daniel P. W. Ellis:
Voice activity detection in personal audio recordings using autocorrelogram compensation. INTERSPEECH 2006
- [c47]Michael I. Mandel, Daniel P. W. Ellis:
A probability model for interaural phase difference. SAPA@INTERSPEECH 2006: 1-6
- [c46]Ron J. Weiss, Daniel P. W. Ellis:
Estimating single-channel source separation masks: relevance vector machine classifiers vs. pitch-based masking. SAPA@INTERSPEECH 2006: 31-36
- [c45]Michael I. Mandel, Daniel P. W. Ellis, Tony Jebara:
An EM Algorithm for Localizing Multiple Sound Sources in Reverberant Environments. NIPS 2006: 953-960
- 2005
- [j8]Jon P. Barker, Martin P. Cooke, Daniel P. W. Ellis:
Decoding speech in the presence of other sources. Speech Commun. 45(1): 5-25 (2005)
- [j7]Nelson Morgan, Qifeng Zhu, Andreas Stolcke, M. Kemal Sönmez, Sunil Sivadas, Takahiro Shinozaki, Mari Ostendorf, Pratibha Jain, Hynek Hermansky, Dan Ellis, George R. Doddington, Barry Y. Chen, Özgür Çetin, Hervé Bourlard, Marios Athineos:
Pushing the envelope - aside [speech recognition]. IEEE Signal Process. Mag. 22(5): 81-88 (2005)
- [c44]Manuel Reyes-Gomez, Nebojsa Jojic, Daniel P. W. Ellis:
Deformable Spectrograms. AISTATS 2005: 285-292
- [c43]Nathan Lesser, Daniel P. W. Ellis:
Clap detection and discrimination for rhythm therapy. ICASSP (3) 2005: 37-40
- [c42]Chia-Ping Chen, Jeff A. Bilmes, Daniel P. W. Ellis:
Speech Feature Smoothing for Robust ASR. ICASSP (1) 2005: 525-528
- [c41]Graham E. Poliner, Daniel P. W. Ellis:
A Classification Approach to Melody Transcription. ISMIR 2005: 161-166
- [c40]Michael I. Mandel, Dan Ellis:
Song-Level Features and Support Vector Machines for Music Classification. ISMIR 2005: 594-599
- [p1]Daniel P. W. Ellis:
Evaluating Speech Separation Systems. Speech Separation by Humans and Machines 2005: 295-304
- 2004
- [j6]Adam Berenzweig, Beth Logan, Daniel P. W. Ellis, Brian Whitman:
A Large-Scale Evaluation of Acoustic and Subjective Music-Similarity Measures. Comput. Music. J. 28(2): 63-76 (2004)
- [j5]Martin P. Cooke, Daniel P. W. Ellis:
Introduction to the special issue on the recognition and organization of real-world sound. Speech Commun. 43(4): 273-274 (2004)
- [j4]Nicholas Weaver, Dan Ellis:
Reflections on Witty. login Usenix Mag. 29(3) (2004)
- [c39]Nicholas Weaver, Dan Ellis, Stuart Staniford, Vern Paxson:
Worms vs. perimeters: the case for hard-LANs. Hot Interconnects 2004: 70-76
- [c38]Manuel J. Reyes Gomez, Daniel P. W. Ellis, Nebojsa Jojic:
Multiband audio modeling for single-channel acoustic source separation. ICASSP (5) 2004: 641-644
- [c37]Daniel P. W. Ellis, Keansub Lee:
Features for segmenting and classifying long-duration recordings of "personal" audio. SAPA@INTERSPEECH 2004: 106
- [c36]Marios Athineos, Hynek Hermansky, Daniel P. W. Ellis:
PLP-squared: autoregressive modeling of auditory-like 2-d spectro-temporal patterns. SAPA@INTERSPEECH 2004: 129
- [c35]Manuel Reyes-Gomez, Nebojsa Jojic, Daniel P. W. Ellis:
Towards single-channel unsupervised source separation of speech mixtures: the layered harmonics/formants separation-tracking model. SAPA@INTERSPEECH 2004: 137
- [c34]Marios Athineos, Hynek Hermansky, Daniel P. W. Ellis:
LP-TRAP: linear predictive temporal patterns. INTERSPEECH 2004: 949-952
- [c33]Dan Ellis, John Arroyo:
Eigenrhythms: Drum pattern basis sets for classification and generation. ISMIR 2004
- [c32]Brian Whitman, Dan Ellis:
Automatic Record Reviews. ISMIR 2004
- 2003
- [c31]Adam Janin, Don Baron, Jane Edwards, Dan Ellis, David Gelbart, Nelson Morgan, Barbara Peskin, Thilo Pfau, Elizabeth Shriberg, Andreas Stolcke, Chuck Wooters:
The ICSI Meeting Corpus. ICASSP (1) 2003: 364-367 - [c30]Marios Athineos, Daniel P. W. Ellis:
Sound texture modelling with linear prediction in both time and frequency domains. ICASSP (5) 2003: 648-651 - [c29]Manuel J. Reyes Gomez, Bhiksha Raj, Dan Ellis:
Multi-channel source separation by factorial HMMs. ICASSP (1) 2003: 664-667 - [c28]Steve Renals, Dan Ellis:
Audio information access from meeting rooms. ICASSP (4) 2003: 744-747 - [c27]Adam Berenzweig, Daniel P. W. Ellis, Steve Lawrence:
Anchor space for classification and similarity measurement of music. ICME 2003: 29-32 - [c26]Manuel J. Reyes Gomez, Daniel P. W. Ellis:
Selection, parameter estimation, and discriminative training of hidden Markov models for general audio modeling. ICME 2003: 73-76 - [c25]Patricia Scanlon, Daniel P. W. Ellis, Richard B. Reilly:
Using mutual information to design class-specific phone recognizers. INTERSPEECH 2003: 857-860 - [c24]Adam Berenzweig, Beth Logan, Daniel P. W. Ellis, Brian Whitman:
A large-scale evaluation of acoustic and subjective music similarity measures. ISMIR 2003 - [c23]Alexander Sheh, Daniel P. W. Ellis:
Chord segmentation and recognition using EM-trained hidden Markov models. ISMIR 2003 - [c22]Robert J. Turetsky, Daniel P. W. Ellis:
Ground-truth transcriptions of real music from force-aligned MIDI syntheses. ISMIR 2003 - [c21]Dan Ellis:
Worm anatomy and model. WORM 2003: 42-50 - 2002
- [j3]Anthony J. Robinson, Gary D. Cook, Daniel P. W. Ellis, Eric Fosler-Lussier, Steve Renals, D. A. G. Williams:
Connectionist speech recognition of Broadcast News. Speech Commun. 37(1-2): 27-45 (2002) - [c20]Manuel J. Reyes Gomez, Daniel P. W. Ellis:
Error visualization for tandem acoustic modeling on the Aurora task. ICASSP 2002: 4176 - [c19]Daniel P. W. Ellis, Brian Whitman, Adam Berenzweig, Steve Lawrence:
The Quest for Ground Truth in Musical Artist Similarity. ISMIR 2002 - 2001
- [j2]Martin Cooke, Daniel P. W. Ellis:
The auditory organization of speech and other sources in listeners and computational models. Speech Commun. 35(3-4): 141-177 (2001) - [c18]Daniel P. W. Ellis, Rita Singh, Sunil Sivadas:
Tandem acoustic modeling in large-vocabulary recognition. ICASSP 2001: 517-520 - [c17]Daniel P. W. Ellis, Manuel J. Reyes Gomez:
Investigations into tandem acoustic modeling for the Aurora task. INTERSPEECH 2001: 189-192 - [c16]Nelson Morgan, Don Baron, Jane Edwards, Daniel P. W. Ellis, David Gelbart, Adam Janin, Thilo Pfau, Elizabeth Shriberg, Andreas Stolcke:
The Meeting Project at ICSI. HLT 2001 - 2000
- [c15]Sangita Sharma, Dan Ellis, Sachin S. Kajarekar, Pratibha Jain, Hynek Hermansky:
Feature extraction using non-linear transformation for robust speech recognition on the Aurora database. ICASSP 2000: 1117-1120 - [c14]Hynek Hermansky, Daniel P. W. Ellis, Sangita Sharma:
Tandem connectionist feature extraction for conventional HMM systems. ICASSP 2000: 1635-1638 - [c13]Daniel P. W. Ellis, Jeff A. Bilmes:
Using mutual information to design feature combinations. INTERSPEECH 2000: 79-82 - [c12]Jon Barker, Martin Cooke, Daniel P. W. Ellis:
Decoding speech in the presence of other sound sources. INTERSPEECH 2000: 270-273 - [c11]Javier Ferreiros López, Daniel P. W. Ellis:
Using acoustic condition clustering to improve acoustic change detection on broadcast news. INTERSPEECH 2000: 568-571
1990 – 1999
- 1999
- [j1]Daniel P. W. Ellis:
Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis and its application to speech/nonspeech mixtures. Speech Commun. 27(3-4): 281-298 (1999) - [c10]Dan Ellis, Nelson Morgan:
Size matters: an empirical study of neural network training for large vocabulary continuous speech recognition. ICASSP 1999: 1013-1016 - [c9]Adam Janin, Dan Ellis, Nelson Morgan:
Multi-stream speech recognition: ready for prime time? EUROSPEECH 1999: 591-594 - [c8]Gethin Williams, Daniel P. W. Ellis:
Speech/music discrimination based on posterior probability features. EUROSPEECH 1999 - [c7]Dave Abberley, Steve Renals, Dan Ellis, Anthony J. Robinson:
The THISL SDR System At TREC-8. TREC 1999 - 1997
- [c6]Dan Ellis:
The weft: a representation for periodic sounds. ICASSP 1997: 1307-1310 - 1996
- [b1]Daniel Patrick Whittlesey Ellis:
Prediction-driven computational auditory scene analysis. Massachusetts Institute of Technology, Cambridge, MA, USA, 1996 - 1994
- [c5]Daniel P. W. Ellis:
A computer implementation of psychoacoustic grouping rules. ICPR (3) 1994: 108-112 - [c4]Dan Ellis:
Barefoot multimedia, or, All is not what it seems, Moriarty. Interactive Multimedia in University Education 1994: 151-154 - 1992
- [c3]Daniel P. W. Ellis:
Timescale Modification and Wavelet Representations. ICMC 1992 - 1991
- [c2]Daniel P. W. Ellis, Barry Vercoe:
A Wavelet Based Sinusoid Model of Sound for Auditory Signal Separation. ICMC 1991 - 1990
- [c1]Barry Vercoe, Dan Ellis:
Real-time CSound: Software Synthesis with Sensing and Control. ICMC 1990
last updated on 2024-12-05 21:43 CET by the dblp team
all metadata released as open data under CC0 1.0 license