IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 22
Volume 22, Number 1, January 2014
- Li Deng, Steve Renals, Marcello Federico, Mari Ostendorf:
Editorial: Expanding the Technical Reach of our Transactions. 5 - Jalal Taghia, Rainer Martin:
Objective Intelligibility Measures Based on Mutual Information for Speech Subjected to Speech Enhancement Processing. 6-16 - Liang Lu, Arnab Ghoshal, Steve Renals:
Cross-Lingual Subspace Gaussian Mixture Models for Low-Resource Speech Recognition. 17-27 - Milica Gasic, Steve J. Young:
Gaussian Processes for POMDP-Based Dialogue Manager Optimization. 28-40 - Imen Marrakchi-Mezghani, Gaël Mahé, Sonia Djaziri Larbi, Meriem Jaïdane, Monia Turki-Hadj Alouane:
Nonlinear Audio Systems Identification Through Audio Input Gaussianization. 41-53 - Joao B. Crespo, Richard C. Hendriks:
Multizone Speech Reinforcement. 54-66 - Chao Pan, Jingdong Chen, Jacob Benesty:
Performance Study of the MVDR Beamformer as a Function of the Source Incidence Angle. 67-79 - Hung-yi Lee, Lin-Shan Lee:
Improved Semantic Retrieval of Spoken Content by Document/Query Expansion with Random Walk Over Acoustic Similarity Graphs. 80-94 - Volker Leutnant, Alexander Krueger, Reinhold Haeb-Umbach:
A New Observation Model in the Logarithmic Mel Power Spectral Domain for the Automatic Recognition of Noisy Reverberant Speech. 95-109 - Nancy F. Chen, Sharon W. Tam, Wade Shen, Joseph P. Campbell:
Characterizing Phonetic Transformations and Acoustic Differences Across English Dialects. 110-124 - Dejan Markovic, Konrad Kowalczyk, Fabio Antonacci, Christian Hofmann, Augusto Sarti, Walter Kellermann:
Estimation of Acoustic Reflection Coefficients Through Pseudospectrum Matching. 125-137 - Zhiyao Duan, Jinyu Han, Bryan Pardo:
Multi-pitch Streaming of Harmonic Sound Mixtures. 138-150 - Shilin Liu, Khe Chai Sim:
Temporally Varying Weight Regression: A Semi-Parametric Trajectory Model for Automatic Speech Recognition. 151-160 - Vikrant Singh Tomar, Richard C. Rose:
A Family of Discriminative Manifold Learning Algorithms and Their Application to Speech Recognition. 161-171 - Hironori Doi, Tomoki Toda, Keigo Nakamura, Hiroshi Saruwatari, Kiyohiro Shikano:
Alaryngeal Speech Enhancement Based on One-to-Many Eigenvoice Conversion. 172-183 - Ebru Arisoy, Stanley F. Chen, Bhuvana Ramabhadran, Abhinav Sethy:
Converting Neural Network Language Models into Back-off Language Models for Efficient Decoding in Automatic Speech Recognition. 184-192 - Craig T. Jin, Nicolas Epain, Abhaya Parthy:
Design, Optimization and Evaluation of a Dual-Radius Spherical Microphone Array. 193-204 - Rémi Mignot, Gilles Chardon, Laurent Daudet:
Low Frequency Interpolation of Room Impulse Responses Using Compressed Sensing. 205-216 - Mohammed Senoussaoui, Patrick Kenny, Themos Stafylakis, Pierre Dumouchel:
A Study of the Cosine Distance-Based Mean Shift for Telephone Speech Diarization. 217-227 - Hideyuki Tachibana, Nobutaka Ono, Shigeki Sagayama:
Singing Voice Enhancement in Monaural Music Signals Based on Two-stage Harmonic/Percussive Sound Separation on Multiple Resolution Spectrograms. 228-237 - Noam R. Shabtai, Boaz Rafaely:
Generalized Spherical Array Beamforming for Binaural Speech Reproduction. 238-247 - Sandro Cumani, Pietro Laface:
Factorized Sub-Space Estimation for Fast and Memory Effective I-vector Extraction. 248-259 - Yuan Zeng, Richard C. Hendriks:
Distributed Delay and Sum Beamformer for Speech Enhancement via Randomized Gossip. 260-273 - Zhenghua Li, Min Zhang, Wanxiang Che, Ting Liu, Wenliang Chen:
Joint Optimization for Chinese POS Tagging and Dependency Parsing. 274-286
Volume 22, Number 2, February 2014
- Dehong Gao, Wenjie Li, Xiaoyan Cai, Renxian Zhang, Ouyang You:
Sequential Summarization: A Full View of Twitter Trending Topics. 293-302 - Peter W. J. van Hengel, Johannes D. Krijnders:
A Comparison of Spectro-Temporal Representations of Audio Signals. 303-313 - Imed Zitouni, Yassine Benajiba:
Aligned-Parallel-Corpora Based Semi-Supervised Learning for Arabic Mention Detection. 314-324 - Emilio Molina, Ana M. Barbancho, Lorenzo J. Tardón, Isabel Barbancho:
Dissonance Reduction In Polyphonic Audio Using Harmonic Reorganization. 325-334 - Daniel Pak-Kong Lun, Tak-Wai Shen, K. C. Ho:
A Novel Expectation-Maximization Framework for Speech Enhancement in Non-Stationary Noise Environments. 335-346 - Stefano Cosentino, Tiago H. Falk, David McAlpine, Torsten Marquardt:
Cochlear Implant Filterbank Design and Optimization: A Simulation Study. 347-353 - Mehrez Souden, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani:
Location Feature Integration for Clustering-Based Speech Separation in Distributed Microphone Arrays. 354-367 - Heikki Kallasjoki, Jort F. Gemmeke, Kalle J. Palomäki:
Estimating Uncertainty to Improve Exemplar-Based Feature Enhancement for Noise Robust Speech Recognition. 368-380 - Taufiq Hasan, John H. L. Hansen:
Maximum Likelihood Acoustic Factor Analysis Models for Robust Speaker Verification in Noise. 381-391 - Ofer Schwartz, Sharon Gannot:
Speaker Tracking Using Recursive EM Algorithms. 392-402 - Yu Tsao, Shigeki Matsuda, Chiori Hori, Hideki Kashioka, Chin-Hui Lee:
A MAP-based Online Estimation Approach to Ensemble Speaker and Speaking Environment Modeling. 403-416 - Pui-Yu Hui, Helen Meng:
Latent Semantic Analysis for Multimodal User Input With Speech and Gestures. 417-429 - Jesper Jensen, Cees H. Taal:
Speech Intelligibility Prediction Based on Mutual Information. 430-440 - Andrea Primavera, Stefania Cecchi, Junfeng Li, Francesco Piazza:
Objective and Subjective Investigation on a Novel Method for Digital Reverberator Parameters Estimation. 441-452 - Matt Speed, Damian T. Murphy, David M. Howard:
Modeling the Vocal Tract Transfer Function Using a 3D Digital Waveguide Mesh. 453-464 - Hüseyin Hacihabiboglu:
Theoretical Analysis of Open Spherical Microphone Arrays for Acoustic Intensity Measurements. 465-476 - Taemin Cho, Juan Pablo Bello:
On the Relative Importance of Individual Components of Chord Recognition Systems. 477-492 - Takuma Otsuka, Katsuhiko Ishiguro, Hiroshi Sawada, Hiroshi G. Okuno:
Bayesian Nonparametrics for Microphone Array Processing. 493-504 - Jianjun He, Ee-Leng Tan, Woon-Seng Gan:
Linear Estimation Based Primary-Ambient Extraction for Stereo Audio Signals. 505-517 - Sira Gonzalez, Mike Brookes:
PEFAC - A Pitch Estimation Algorithm Robust to High Levels of Noise. 518-530 - Min Zhang, Xiangyu Duan, Wenliang Chen:
Bayesian Constituent Context Model for Grammar Induction. 531-541 - Dah-Chung Chang, Fei-Tao Chu:
Feedforward Active Noise Control With a New Variable Tap-Length and Step-Size Filtered-X LMS Algorithm. 542-555 - Matt McVicar, Raúl Santos-Rodriguez, Yizhao Ni, Tijl De Bie:
Automatic Chord Estimation from Audio: A Review of the State of the Art. 556-575
Volume 22, Number 3, March 2014
- Chung-Hsien Wu, Yi-Chin Huang, Chung-Han Lee, Jun-Cheng Guo:
Synthesis of Spontaneous Speech With Syllable Contraction Using State-Based Context-Dependent Voice Transformation. 585-595 - Manu Airaksinen, Tuomo Raitio, Brad H. Story, Paavo Alku:
Quasi Closed Phase Glottal Inverse Filtering Analysis With Weighted Linear Prediction. 596-607 - Jae-Mo Yang, Hong-Goo Kang:
Online Speech Dereverberation Algorithm Based on Adaptive Multichannel Linear Prediction. 608-619 - Afsaneh Asaei, Mohammad Golbabaee, Hervé Bourlard, Volkan Cevher:
Structured Sparsity Models for Reverberant Speech Separation. 620-633 - Rajan S. Rashobh, Andy W. H. Khong, Di Liu:
Multichannel Equalization in the KLT and Frequency Domains With Application to Speech Dereverberation. 634-646 - Prasanga N. Samarasinghe, Thushara D. Abhayapala, Mark A. Poletti:
Wavefield Analysis Over Large Areas Using Distributed Higher Order Microphones. 647-658 - Wen-Li Wei, Chung-Hsien Wu, Jen-Chun Lin, Han Li:
Exploiting Psychological Factors for Interaction Style Recognition in Spoken Conversation. 659-671 - Stanislaw Andrzej Raczynski, Emmanuel Vincent:
Genre-Based Music Language Modeling with Latent Hierarchical Pitman-Yor Process Allocation. 672-681 - Dalei Wu, Wei-Ping Zhu, M. N. S. Swamy:
The Theory of Compressive Sensing Matching Pursuit Considering Time-domain Noise with Application to Speech Enhancement. 682-696 - Tejaswi Nanjundaswamy, Kenneth Rose:
Cascaded Long Term Prediction for Enhanced Compression of Polyphonic Audio Signals. 697-710 - Kartik Audhkhasi, Andreas M. Zavou, Panayiotis G. Georgiou, Shrikanth S. Narayanan:
Theoretical Analysis of Diversity in an Ensemble of Automatic Speech Recognition Systems. 711-726 - Joonas Nikunen, Tuomas Virtanen:
Direction of Arrival Based Spatial Covariance Model for Blind Sound Source Separation. 727-739
Volume 22, Number 4, April 2014
- Jinyu Li, Li Deng, Yifan Gong, Reinhold Haeb-Umbach:
An Overview of Noise-Robust Automatic Speech Recognition. 745-777 - Ruhi Sarikaya, Geoffrey E. Hinton, Anoop Deoras:
Application of Deep Belief Networks for Natural Language Understanding. 778-784 - Romain Serizel, Marc Moonen, Bas van Dijk, Jan Wouters:
Low-rank Approximation Based Multichannel Wiener Filter Algorithms for Noise Reduction with Application in Cochlear Implants. 785-799 - Marco Crocco, Andrea Trucco:
Design of Superdirective Planar Arrays With Sparse Aperiodic Layouts for Processing Broadband Signals via 3-D Beamforming. 800-815 - José Ricardo Zapata, Matthew E. P. Davies, Emilia Gómez:
Multi-Feature Beat Tracking. 816-825 - Arun Narayanan, DeLiang Wang:
Investigation of Speech Separation as a Front-End for Noise Robust Speech Recognition. 826-835 - Xiaojia Zhao, Yuxuan Wang, DeLiang Wang:
Robust Speaker Identification in Noisy and Reverberant Conditions. 836-845 - Sandro Cumani, Oldrich Plchot, Pietro Laface:
On the use of i-vector posterior distributions in Probabilistic Linear Discriminant Analysis. 846-857 - Chung-Hsien Wu, Han-Ping Shen, Yan-Ting Yang:
Chinese-English Phone Set Construction for Code-Switching ASR Using Acoustic and DNN-Extracted Articulatory Features. 858-862
Volume 22, Number 5, May 2014
- Weibin Zhang, Pascale Fung:
Discriminatively Trained Sparse Inverse Covariance Matrices for Speech Recognition. 871-880 - Hung-yi Lee, Sz-Rung Shiang, Ching-feng Yeh, Yun-Nung Chen, Yu Huang, Sheng-yi Kong, Lin-Shan Lee:
Spoken Knowledge Organization by Semantic Structuring and a Prototype Course Lecture System for Personalized Learning. 881-896 - Leonardo Zão, Rosângela Coelho, Patrick Flandrin:
Speech Enhancement with EMD and Hurst-Based Mode Selection. 897-909 - Daniele Giacobello, Mads Græsbøll Christensen, Tobias Lindstrøm Jensen, Manohar N. Murthi, Søren Holdt Jensen, Marc Moonen:
Stable 1-Norm Error Minimization Based Linear Predictors for Speech Modeling. 910-920 - Yesenia Lacouture-Parodi, Emanuël A. P. Habets, Jingdong Chen, Jacob Benesty:
Multichannel Noise Reduction in the Karhunen-Loève Expansion Domain. 921-934 - Seyed Omid Sadjadi, John H. L. Hansen:
Blind Spectral Weighting for Robust Speaker Identification under Reverberation Mismatch. 935-943 - Gautam Varma Mantena, Sivanand Achanta, Kishore Prahallad:
Query-by-Example Spoken Term Detection using Frequency Domain Linear Prediction and Non-Segmental Dynamic Time Warping. 944-953 - Christopher Osterwise, Steven L. Grant:
On Over-Determined Frequency Domain BSS. 954-964 - Daniel P. Jarrett, Maja Taseska, Emanuël A. P. Habets, Patrick A. Naylor:
Noise Reduction in the Spherical Harmonic Domain Using a Tradeoff Beamformer and Narrowband DOA Estimates. 965-976 - Verena Rieser, Oliver Lemon, Simon Keizer:
Natural Language Generation as Incremental Planning Under Uncertainty: Adaptive Information Presentation for Statistical Dialogue Systems. 979-993 - Jordan Cheer, Stephen J. Elliott:
Comments on "Complete Parallel Narrowband Active Noise Control Systems". 993-994
Volume 22, Number 6, June 2014
- Vipul Arora, Laxmidhar Behera:
Musical Source Clustering and Identification in Polyphonic Audio. 1003-1012 - Rajeev C. Nongpiur:
Design of Minimax Broadband Beamformers that are Robust to Microphone Gain, Phase, and Position Errors. 1013-1022 - Arun Venkitaraman, Chandra Sekhar Seelamantula:
Binaural Signal Processing Motivated Generalized Analytic Signal Construction and AM-FM Demodulation. 1023-1036 - Jürgen T. Geiger, Felix Weninger, Jort F. Gemmeke, Martin Wöllmer, Björn W. Schuller, Gerhard Rigoll:
Memory-Enhanced Neural Networks and NMF for Robust ASR. 1037-1046 - Haiquan Zhao, Yi Yu, Shibin Gao, Xiangping Zeng, Zhengyou He:
Memory Proportionate APA with Individual Activation Factors for Acoustic Echo Cancellation. 1047-1055 - Mehrdad J. Gangeh, Pouria Fewzee, Ali Ghodsi, Mohamed S. Kamel, Fakhri Karray:
Multiview Supervised Dictionary Learning in Speech Emotion Recognition. 1056-1068 - Jae-Hun Choi, Joon-Hyuk Chang:
Dual-Microphone Voice Activity Detection Technique Based on Two-Step Power Level Difference Ratio. 1069-1081 - Xavier Alameda-Pineda, Radu Horaud:
A Geometric Approach to Sound Source Localization from Time-Delay Estimates. 1082-1095 - Klaus Reindl, Stefan Meier, Hendrik Barfuss, Walter Kellermann:
Minimum Mutual Information-Based Linearly Constrained Broadband Signal Extraction. 1096-1108
Volume 22, Number 7, July 2014
- Mohamad Hasan Bahari, Najim Dehak, Hugo Van hamme, Lukás Burget, Ahmed Ali, Jim Glass:
Non-Negative Factor Analysis of Gaussian Mixture Model Weight Adaptation for Language and Dialect Recognition. 1117-1129 - Guangzhao Bao, Yangfei Xu, Zhongfu Ye:
Learning a Discriminative Dictionary for Single-Channel Speech Separation. 1130-1138 - Ian J. Kelly, Francis M. Boland:
Detecting Arrivals in Room Impulse Responses With Dynamic Time Warping. 1139-1147 - Markus Guldenschuh, Raymond A. de Callafon:
Detection of Secondary-Path Irregularities in Active Noise Control Headphones. 1148-1157 - Sin-Horng Chen, Chiao-Hua Hsieh, Chen-Yu Chiang, Hsi-Chun Hsiao, Yih-Ru Wang, Yuan-Fu Liao, Hsiu-Min Yu:
Modeling of Speaking Rate Influences on Mandarin Speech Prosody and Its Application to Speaking Rate-controlled TTS. 1158-1171 - Danilo Comminiello, Michele Scarpiniti, Luis Antonio Azpicueta-Ruiz, Jerónimo Arenas-García, Aurelio Uncini:
Nonlinear Acoustic Echo Cancellation Based on Sparse Functional Link Representations. 1172-1183 - Wen Zhang, Thushara D. Abhayapala:
Three Dimensional Sound Field Reproduction using Multiple Circular Loudspeaker Arrays: Functional Analysis Guided Approach. 1184-1194 - Maja Taseska, Emanuël A. P. Habets:
Informed Spatial Filtering for Sound Extraction Using Distributed Microphone Arrays. 1195-1207 - Mo Shen, Daisuke Kawahara, Sadao Kurohashi:
Dependency Parse Reranking with Rich Subtree Features. 1208-1218
Volume 22, Number 8, August 2014
- Zhibao Li, Ka Fai Cedric Yiu, Sven Nordholm:
On the Indoor Beamformer Design With Reverberation. 1225-1235 - Matthew B. Hawes, Wei Liu:
Sparse Array Design for Wideband Beamforming With Reduced Complexity in Tapped Delay-Lines. 1236-1247 - Yi FanChiang, Cheng-Wen Wei, Yi-Le Meng, Yu-Wen Lin, Shyh-Jye Jou, Tian-Sheuan Chang:
Low Complexity Formant Estimation Adaptive Feedback Cancellation for Hearing Aids Using Pitch Based Processing. 1248-1259 - Simon Conan, Olivier Derrien, Mitsuko Aramaki, Sølvi Ystad, Richard Kronland-Martinet:
A Synthesis Model With Intuitive Control Capabilities for Rolling Sounds. 1260-1273 - Christian Schüldt, Peter Händel:
Decay Rate Estimators and Their Performance for Blind Reverberation Time Estimation. 1274-1284 - Sriram Ganapathy, Sri Harish Reddy Mallidi, Hynek Hermansky:
Robust Feature Extraction Using Modulation Filtering of Autoregressive Models. 1285-1295 - Bo Li, Khe Chai Sim:
A Spectral Masking Approach to Noise-Robust Speech Recognition Using Deep Neural Networks. 1296-1305 - Emre Yilmaz, Jort Florent Gemmeke, Hugo Van hamme:
Noise Robust Exemplar Matching Using Sparse Representations of Speech. 1306-1319 - Dominic Schmid, Gerald Enzner, Sarmad Malik, Dorothea Kolossa, Rainer Martin:
Variational Bayesian Inference for Multichannel Dereverberation and Noise Reduction. 1320-1335
Volume 22, Number 9, September 2014
- Bruno S. Masiero, Michael Vorländer:
A Framework for the Calculation of Dynamic Crosstalk Cancellation Filters. 1345-1354 - Alexander Schasse, Rainer Martin:
Estimation of Subband Speech Correlations for Noise Reduction via MVDR Processing. 1355-1365 - Michal Novotny, Jan Rusz, Roman Cmejla, Evzen Ruzicka:
Automatic Evaluation of Articulatory Disorders in Parkinson's Disease. 1366-1378 - Felicia Lim, Wancheng Zhang, Emanuël A. P. Habets, Patrick A. Naylor:
Robust Multichannel Dereverberation using Relaxed Multichannel Least Squares. 1379-1390 - Sina Hamidi Ghalehjegh, Richard C. Rose:
Linear Regression Based Acoustic Adaptation for the Subspace Gaussian Mixture Model. 1391-1402 - Jonathan Botts, Lauri Savioja:
Spectral and Pseudospectral Properties of Finite Difference Models Used in Audio and Room Acoustics. 1403-1412 - Yong Xiang, Iynkaran Natgunanathan, Song Guo, Wanlei Zhou, Saeid Nahavandi:
Patchwork-Based Audio Watermarking Method Robust to De-synchronization Attacks. 1413-1423 - Ian Vince McLoughlin:
Super-Audible Voice Activity Detection. 1424-1433 - Atiyeh Alinaghi, Philip J. B. Jackson, Qingju Liu, Wenwu Wang:
Joint Mixing Vector and Binaural Model Based Stereo Source Separation. 1434-1448
Volume 22, Number 10, October 2014
- Liheng Zhao, Jacob Benesty, Jingdong Chen:
Design of Robust Differential Microphone Arrays. 1455-1466 - Pooja Jain, Ram Bilas Pachori:
Event-Based Method for Instantaneous Fundamental Frequency Estimation from Voiced Speech Based on Eigenvalue Decomposition of the Hankel Matrix. 1467-1482 - Yonatan Vaizman, Brian McFee, Gert R. G. Lanckriet:
Codebook-Based Audio Feature Representation for Music Information Retrieval. 1483-1493 - O. Nadiri, Boaz Rafaely:
Localization of Multiple Speakers under High Reverberation using a Spherical Microphone Array and the Direct-Path Dominance Test. 1494-1505 - Zhizheng Wu, Tuomas Virtanen, Engsiong Chng, Haizhou Li:
Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion. 1506-1521 - Dumidu S. Talagala, Wen Zhang, Thushara D. Abhayapala:
Efficient Multi-Channel Adaptive Room Compensation for Spatial Soundfield Reproduction Using a Modal Decomposition. 1522-1532 - Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn, Dong Yu:
Convolutional Neural Networks for Speech Recognition. 1533-1545 - Shoichi Koyama, Ken'ichi Furuya, Yusuke Hiwasaki, Yoichi Haneda, Yôiti Suzuki:
Wave Field Reconstruction Filtering in Cylindrical Harmonic Domain for With-Height Recording and Reproduction. 1546-1557 - Chia-Ping Chen, Yi-Chin Huang, Chung-Hsien Wu, Kuan-De Lee:
Polyglot Speech Synthesis Based on Cross-Lingual Frame Selection Using Auditory and Articulatory Features. 1558-1570
Volume 22, Number 11, November 2014
- Jian Xu, Zhi-Jie Yan, Qiang Huo:
An Unsupervised Adaptation Approach to Leveraging Feedback Loop Data by Using i-Vector for Data Clustering and Selection. 1581-1589 - Sandro Cumani, Pietro Laface:
Large-Scale Training of Pairwise Support Vector Machines for Speaker Recognition. 1590-1600 - Jun Du, Qiang Huo:
An Improved VTS Feature Compensation using Mixture Models of Distortion and IVN Training for Noisy Speech Recognition. 1601-1611 - Masahito Togami, Yohei Kawaguchi:
Simultaneous Optimization of Acoustic Echo Reduction, Speech Dereverberation, and Noise Reduction against Mutual Interference. 1612-1623 - Jorge Lorente, Miguel Ferrer, Maria de Diego, Alberto González:
GPU Implementation of Multichannel Adaptive Algorithms for Local Active Noise Control. 1624-1635 - Thomas Hélie:
Simulation of Fractional-Order Low-Pass Filters. 1636-1647 - Bruno Defraene, Toon van Waterschoot, Moritz Diehl, Marc Moonen:
Embedded-Optimization-Based Loudspeaker Precompensation Using a Hammerstein Loudspeaker Model. 1648-1659 - Guangsen Wang, Khe Chai Sim:
Regression-Based Context-Dependent Modeling of Deep Neural Networks for Speech Recognition. 1660-1669 - Roland Badeau, Mark D. Plumbley:
Multichannel High-Resolution NMF for Modeling Convolutive Mixtures of Non-Stationary Signals in the Time-Frequency Domain. 1670-1680
Volume 22, Number 12, December 2014
- Li Deng:
Farewell editorial: keeping up the momentum of innovations. 1687 - Sree Harsha Yella, Hervé Bourlard:
Overlapping speech detection using long-term conversational features for speaker diarization in meeting room conversations. 1688-1700 - Ravi K. Chivukula, Yuriy A. Reznik, Yanyan Hu, Venkat Devarajan, Mythreya Jayendra-Lakshman:
Fast algorithms for low-delay TDAC filterbanks in MPEG-4 AAC-ELD. 1701-1712 - Shaofei Xue, Ossama Abdel-Hamid, Hui Jiang, Li-Rong Dai, Qingfeng Liu:
Fast adaptation of deep neural network based on discriminant codes for speech recognition. 1713-1725 - Matthew E. P. Davies, Philippe Hamel, Kazuyoshi Yoshii, Masataka Goto:
AutoMashUpper: automatic creation of multi-song music mashups. 1726-1737 - Chao Weng, David L. Thomson, Patrick Haffner, Biing-Hwang Juang:
Latent semantic rational kernels for topic spotting on conversational speech. 1738-1749 - Neil Wachowski, Mahmood R. Azimi-Sadjadi:
Detection and classification of nonstationary transient signals using sparse approximations and Bayesian networks. 1750-1764 - Graham Percival, George Tzanetakis:
Streamlined tempo estimation based on autocorrelation and cross-correlation with pulses. 1765-1776 - Annea Barkefors, Mikael Sternad, Lars-Johan Brännmark:
Design and analysis of linear quadratic Gaussian feedforward controllers for active noise control. 1777-1791 - Maximo Cobos, Juan José Pérez Solano, Santiago Felici-Castell, Jaume Segura, Juan M. Navarro:
Cumulative-sum-based localization of sound events in low-cost wireless acoustic sensor networks. 1792-1802 - Vladimir Tourbabin, Boaz Rafaely:
Theoretical framework for the optimization of microphone array configuration for humanoid robot audition. 1803-1814 - Yuriy V. Zakharov, Vítor H. Nascimento:
Sliding-window RLS low-cost implementation of proportionate affine projection algorithms. 1815-1824 - Stefano D'Angelo, Vesa Välimäki:
Generalized Moog ladder filter: part I-linear analysis and parameterization. 1825-1832 - Na Yang, He Ba, Weiyang Cai, Ilker Demirkol, Wendi B. Heinzelman:
BaNa: a noise resilient fundamental frequency detection algorithm for speech and music. 1833-1848 - Yuxuan Wang, Arun Narayanan, DeLiang Wang:
On training targets for supervised speech separation. 1849-1858 - Ling-Hui Chen, Zhen-Hua Ling, Li-Juan Liu, Li-Rong Dai:
Voice conversion using deep neural networks with layer-wise generative training. 1859-1872 - Stefano D'Angelo, Vesa Välimäki:
Generalized Moog ladder filter: part II-explicit nonlinear model through a novel delay-free loop implementation method. 1873-1883 - Zafar Rafii, Zhiyao Duan, Bryan Pardo:
Combining rhythm-based and pitch-based methods for background and melody separation. 1884-1893 - Jussi Rämö, Vesa Välimäki, Balázs Bank:
High-precision parallel graphic equalizer. 1894-1904 - Yannis Panagakis, Constantine Kotropoulos, Gonzalo R. Arce:
Music genre classification via joint sparse low-rank representation of audio features. 1905-1917 - Akira Maezawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno:
Nonparametric Bayesian dereverberation of power spectrograms based on infinite-order autoregressive processes. 1918-1930 - Martin Krawczyk, Timo Gerkmann:
STFT phase reconstruction in voiced speech for an improved single-channel speech enhancement. 1931-1940 - Vahid Khanagha, Khalid Daoudi, Hussein M. Yahia:
Detection of glottal closure instants based on the microcanonical multiscale formalism. 1941-1950 - A. Venturini, Leonardo Zão, Rosângela Coelho:
On speech features fusion, α-integration Gaussian modeling and multi-style training for noise robust speaker classification. 1951-1964 - Peter Foster, Matthias Mauch, Simon Dixon:
Sequential complexity as a descriptor for musical similarity. 1965-1977 - Gang Liu, John H. L. Hansen:
An investigation into back-end advancements for speaker recognition in multi-session and noisy enrollment scenarios. 1978-1992 - Jitong Chen, Yuxuan Wang, DeLiang Wang:
A feature study for classification-based speech separation at low signal-to-noise ratios. 1993-2002 - Jelle Van Mourik, Damian T. Murphy:
Explicit higher-order FDTD schemes for 3D room acoustic simulation. 2003-2011 - Pei Chee Yong, Sven Nordholm, Hai Huyen Dam:
Effective binaural multi-channel processing algorithm for improved environmental presence. 2012-2024 - Austin Chen, Mark A. Hasegawa-Johnson:
Mixed stereo audio classification using a stereo-input mixed-to-panned level feature. 2025-2033 - Gongping Huang, Jacob Benesty, Tao Long, Jingdong Chen:
A family of maximum SNR filters for noise reduction. 2034-2047 - Su Yan, Xiaojun Wan:
SRRank: leveraging semantic roles for extractive multi-document summarization. 2048-2058 - Hideyuki Tachibana, Nobutaka Ono, Hirokazu Kameoka, Shigeki Sagayama:
Harmonic/percussive sound separation based on anisotropic smoothness of spectrograms. 2059-2073 - Jose Manuel Gil-Cacho, Toon van Waterschoot, Marc Moonen, Søren Holdt Jensen:
A frequency-domain adaptive filter (FDAF) prediction error method (PEM) framework for double-talk-robust acoustic echo cancellation. 2074-2086 - Qi Wang, Wai Lok Woo, Satnam Singh Dlay:
Informed single-channel speech separation using HMM-GMM user-generated exemplar source. 2087-2100 - Daniel Erro, Tudor-Catalin Zorila, Yannis Stylianou:
Enhancing the intelligibility of statistically generated synthetic speech by means of noise-independent modifications. 2101-2111 - Yi Jiang, DeLiang Wang, Runsheng Liu, Zhenming Feng:
Binaural classification for reverberant speech segregation using deep neural networks. 2112-2121 - Li Su, Hsin-Ming Lin, Yi-Hsuan Yang:
Sparse modeling of magnitude and phase-derived spectra for playing technique classification. 2122-2132 - Vinod Veera Reddy, Andy W. H. Khong, Boon Poh Ng:
Unambiguous speech DOA estimation under spatial aliasing conditions. 2133-2145 - Amir Mohammadi, Seyyed Saeed Sarfjoo, Cenk Demiroglu:
Eigenvoice speaker adaptation with minimal data for statistical speech synthesis systems using a MAP approach and nearest-neighbors. 2146-2157 - Kun Han, DeLiang Wang:
Neural network based pitch tracking in very noisy speech. 2158-2168 - Yongsheng Mu, Peifeng Ji, Wei Ji, Ming Wu, Jun Yang:
Modeling and compensation for the distortion of parametric loudspeakers using a one-dimension Volterra filter. 2169-2181 - Oliver Thiergart, Maja Taseska, Emanuël A. P. Habets:
An informed parametric spatial filter based on instantaneous direction-of-arrival estimates. 2182-2196 - João Felipe Santos, Tiago H. Falk:
Updating the SRMR-CI metric for improved intelligibility prediction for cochlear implant users. 2197-2206 - Seon Man Kim, Hong Kook Kim:
Direction-of-arrival based SNR estimation for dual-microphone speech enhancement. 2207-2217 - Takuma Otsuka, Katsuhiko Ishiguro, Takuya Yoshioka, Hiroshi Sawada, Hiroshi G. Okuno:
Multichannel sound source dereverberation and separation for arbitrary number of sources based on Bayesian nonparametrics. 2218-2232 - Johannes Traa, Paris Smaragdis:
Multichannel source separation and tracking with RANSAC and directional statistics. 2233-2243 - Weifeng Li, Longbiao Wang, Yicong Zhou, John Dines, Mathew Magimai-Doss, Hervé Bourlard, Qingmin Liao:
Feature mapping of multiple beamformed sources for robust overlapping speech recognition using a microphone array. 2244-2255 - Yi FanChiang, Cheng-Wen Wei, Yi-Le Meng, Yu-Wen Lin, Shyh-Jye Jou, Tian-Sheuan Chang:
Correction to "Low complexity formant estimation adaptive feedback cancellation for hearing aids using pitch based processing". 2256