Srinivasan Umesh
2020 – today
- 2024
- [c87]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
Stable Distillation: Regularizing Continued Pre-Training for Low-Resource Automatic Speech Recognition. ICASSP 2024: 10821-10825
- [c86]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
FusDom: Combining in-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning. ICASSP 2024: 12572-12576
- [c85]Advait Joglekar, Hamees Ul Hasan Sayed, Srinivasan Umesh:
SPRING Lab IITM's Submission to Low Resource Indic Language Translation Shared Task. WMT 2024: 770-774
- [i24]Srinivasan Umesh, Leon Cohen, Douglas J. Nelson:
On the relationship between speech and hearing. CoRR abs/2402.12094 (2024)
- 2023
- [c84]Anusha Prakash, Srinivasan Umesh, Hema A. Murthy:
Towards Developing State-of-The-Art TTS Synthesisers for 13 Indian Languages with Signal Processing Aided Alignments. ASRU 2023: 1-8
- [c83]Sreyan Ghosh, Ashish Seth, Srinivasan Umesh, Dinesh Manocha:
MAST: Multiscale Audio Spectrogram Transformers. ICASSP 2023: 1-5
- [c82]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
Data2vec-Aqc: Search for the Right Teaching Assistant in the Teacher-Student Training Setup. ICASSP 2023: 1-5
- [c81]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
Unfused: Unsupervised Finetuning Using Self Supervised Distillation. ICASSP Workshops 2023: 1-5
- [c80]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
SLICER: Learning Universal Audio Representations Using Low-Resource Self-Supervised Pre-Training. ICASSP 2023: 1-5
- [c79]Vrunda N. Sukhadia, Srinivasan Umesh:
Channel-Aware Pretraining Of Joint Encoder-Decoder Self-Supervised Model For Telephonic-Speech ASR. ICASSP Workshops 2023: 1-5
- [c78]Ramanan Sivaguru, Vasista Sai Lodagala, Srinivasan Umesh:
SALTTS: Leveraging Self-Supervised Speech Representations for improved Text-to-Speech Synthesis. INTERSPEECH 2023: 3033-3037
- [c77]Anusha Prakash, Arun Kumar A, Ashish Seth, Bhagyashree Mukherjee, Ishika Gupta, Jom Kuriakose, Jordan Fernandes, K. V. Vikram, Mano Ranjith Kumar M., Metilda Sagaya Mary, Mohammad Wajahat, Mohana N, Mudit Batra, Navina K, Nihal John George, Nithya Ravi, Pruthwik Mishra, Sudhanshu Srivastava, Vasista Sai Lodagala, Vandan Mujadia, Kada Sai Venkata Vineeth, Vrunda N. Sukhadia, Dipti Misra Sharma, Hema A. Murthy, Pushpak Bhattacharyya, Srinivasan Umesh, Rajeev Sangal:
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages. INTERSPEECH 2023: 3683-3684
- [c76]Kaousheik Jayakumar, Vrunda N. Sukhadia, Arun Kumar A, Srinivasan Umesh:
The Tag-Team Approach: Leveraging CLS and Language Tagging for Enhancing Multilingual ASR. INTERSPEECH 2023: 4414-4418
- [i23]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation. CoRR abs/2303.05668 (2023)
- [i22]Kaousheik Jayakumar, Vrunda N. Sukhadia, A. Arunkumar, Srinivasan Umesh:
The Tag-Team Approach: Leveraging CLS and Language Tagging for Enhancing Multilingual ASR. CoRR abs/2305.19584 (2023)
- [i21]Ramanan Sivaguru, Vasista Sai Lodagala, Srinivasan Umesh:
SALTTS: Leveraging Self-Supervised Speech Representations for improved Text-to-Speech Synthesis. CoRR abs/2308.01018 (2023)
- [i20]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition. CoRR abs/2312.12783 (2023)
- [i19]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
FusDom: Combining In-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning. CoRR abs/2312.13026 (2023)
- 2022
- [j15]Sreyan Ghosh, Ashish Seth, Srinivasan Umesh:
Decorrelating Feature Spaces for Learning General-Purpose Audio Representations. IEEE J. Sel. Top. Signal Process. 16(6): 1402-1414 (2022)
- [j14]Narla John Metilda Sagaya Mary, Srinivasan Umesh, Sandesh Varadaraju Katta:
S-Vectors and TESA: Speaker Embeddings and a Speaker Authenticator Based on Transformer Encoder. IEEE ACM Trans. Audio Speech Lang. Process. 30: 404-413 (2022)
- [c75]Pratik Kumar, Vrunda N. Sukhadia, Srinivasan Umesh:
Investigation of Robustness of Hubert Features from Different Layers to Domain, Accent and Language Variations. ICASSP 2022: 6887-6891
- [c74]A. Arunkumar, Srinivasan Umesh:
Joint Encoder-Decoder Self-Supervised Pre-training for ASR. INTERSPEECH 2022: 3418-3422
- [c73]Anish Bhanushali, Grant Bridgman, Deekshitha G, Prasanta Kumar Ghosh, Pratik Kumar, Saurabh Kumar, Adithya Raj Kolladath, Nithya Ravi, Aaditeshwar Seth, Ashish Seth, Abhayjeet Singh, Vrunda N. Sukhadia, Srinivasan Umesh, Sathvik Udupa, Lodagala V. S. V. Durga Prasad:
Gram Vaani ASR Challenge on spontaneous telephone speech recordings in regional variations of Hindi. INTERSPEECH 2022: 3548-3552
- [c72]Sreyan Ghosh, Sonal Kumar, Yaman Kumar, Rajiv Ratn Shah, Srinivasan Umesh:
Span Classification with Structured Information for Disfluency Detection in Spoken Utterances. INTERSPEECH 2022: 3998-4002
- [c71]A. Arunkumar, Vrunda Nileshkumar Sukhadia, Srinivasan Umesh:
Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition. INTERSPEECH 2022: 5145-5149
- [c70]Sreyan Ghosh, Samden Lepcha, Sakshi Singh, Rajiv Ratn Shah, Srinivasan Umesh:
DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances. INTERSPEECH 2022: 5185-5189
- [c69]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
CCC-WAV2VEC 2.0: Clustering AIDED Cross Contrastive Self-Supervised Learning of Speech Representations. SLT 2022: 1-8
- [c68]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations. SLT 2022: 136-143
- [c67]Vrunda N. Sukhadia, Srinivasan Umesh:
Domain Adaptation of Low-Resource Target-Domain Models Using Well-Trained ASR Conformer Models. SLT 2022: 295-301
- [i18]Vrunda N. Sukhadia, Srinivasan Umesh:
Domain Adaptation of low-resource Target-Domain models using well-trained ASR Conformer Models. CoRR abs/2202.09167 (2022)
- [i17]Sreyan Ghosh, Ashish Seth, Srinivasan Umesh:
DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning. CoRR abs/2203.13628 (2022)
- [i16]Sreyan Ghosh, Sonal Kumar, Yaman Kumar Singla, Rajiv Ratn Shah, Srinivasan Umesh:
Span Classification with Structured Information for Disfluency Detection in Spoken Utterances. CoRR abs/2203.16028 (2022)
- [i15]Harshvardhan Srivastava, Sreyan Ghosh, Srinivasan Umesh:
MMER: Multimodal Multi-task learning for Emotion Recognition in Spoken Utterances. CoRR abs/2203.16794 (2022)
- [i14]Sreyan Ghosh, Harshvardhan Srivastava, Srinivasan Umesh:
A Discourse Aware Sequence Learning Approach for Emotion Recognition in Conversations. CoRR abs/2203.16799 (2022)
- [i13]Lodagala V. S. V. Durga Prasad, Sreyan Ghosh, Srinivasan Umesh:
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations. CoRR abs/2203.16965 (2022)
- [i12]Lodagala Durga Prasad, Ashish Seth, Sreyan Ghosh, Srinivasan Umesh:
Analyzing the factors affecting usefulness of Self-Supervised Pre-trained Representations for Speech Recognition. CoRR abs/2203.16973 (2022)
- [i11]A. Arunkumar, Vrunda N. Sukhadia, Srinivasan Umesh:
Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition. CoRR abs/2206.05518 (2022)
- [i10]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
CCC-wav2vec 2.0: Clustering aided Cross Contrastive Self-supervised learning of speech representations. CoRR abs/2210.02592 (2022)
- [i9]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup. CoRR abs/2211.01246 (2022)
- [i8]Anusha Prakash, Arun Kumar A, Ashish Seth, Bhagyashree Mukherjee, Ishika Gupta, Jom Kuriakose, Jordan Fernandes, K. V. Vikram, Mano Ranjith Kumar M., Metilda Sagaya Mary, Mohammad Wajahat, Mohana N, Mudit Batra, Navina K, Nihal John George, Nithya Ravi, Pruthwik Mishra, Sudhanshu Srivastava, Vasista Sai Lodagala, Vandan Mujadia, Kada Sai Venkata Vineeth, Vrunda N. Sukhadia, Dipti Misra Sharma, Hema A. Murthy, Pushpak Bhattacharya, Srinivasan Umesh, Rajeev Sangal:
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages. CoRR abs/2211.01338 (2022)
- [i7]Sreyan Ghosh, Ashish Seth, Srinivasan Umesh, Dinesh Manocha:
MAST: Multiscale Audio Spectrogram Transformers. CoRR abs/2211.01515 (2022)
- [i6]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
SLICER: Learning universal audio representations using low-resource self-supervised pre-training. CoRR abs/2211.01519 (2022)
- [i5]Vrunda N. Sukhadia, A. Arunkumar, Srinivasan Umesh:
Channel-Aware Pretraining of Joint Encoder-Decoder Self-Supervised Model for Telephonic-Speech ASR. CoRR abs/2211.01669 (2022)
- 2021
- [c66]Vishwas M. Shetty, Srinivasan Umesh:
Exploring the use of Common Label Set to Improve Speech Recognition of Low Resource Indian Languages. ICASSP 2021: 7228-7232
- [i4]Sreyan Ghosh, Sandesh V. Katta, Ashish Seth, Srinivasan Umesh:
Deep Clustering For General-Purpose Audio Representations. CoRR abs/2110.08895 (2021)
- 2020
- [c65]Metilda Sagaya Mary N. J, Vishwas M. Shetty, Srinivasan Umesh:
Investigation of Methods to Improve the Recognition Performance of Tamil-English Code-Switched Data in Transformer Framework. ICASSP 2020: 7889-7893
- [c64]Vishwas M. Shetty, Metilda Sagaya Mary N. J, Srinivasan Umesh:
Improving the Performance of Transformer Based Low Resource Speech Recognition for Indian Languages. ICASSP 2020: 8279-8283
- [i3]Vishwas M. Shetty, Metilda Sagaya Mary N. J, Srinivasan Umesh:
Investigation of Speaker-adaptation methods in Transformer based ASR. CoRR abs/2008.03247 (2020)
- [i2]Metilda Sagaya Mary N. J, Sandesh V. Katta, Srinivasan Umesh:
S-vectors: Speaker Embeddings based on Transformer's Encoder for Text-Independent Speaker Verification. CoRR abs/2008.04659 (2020)
2010 – 2019
- 2019
- [c63]Anusha Prakash, Anju Leela Thomas, Srinivasan Umesh, Hema A. Murthy:
Building Multilingual End-to-End Speech Synthesisers for Indian Languages. SSW 2019: 194-199
- 2018
- [j13]Neethu Mariam Joy, Sandeep Reddy Kothinti, Srinivasan Umesh:
FMLLR Speaker Normalization With i-Vector: In Pseudo-FMLLR and Distillation Framework. IEEE ACM Trans. Audio Speech Lang. Process. 26(4): 797-805 (2018)
- [c62]Rini A. Sharon, Sandeep Reddy Kothinti, Srinivasan Umesh:
Correlational Networks for Speaker Normalization in Automatic Speech Recognition. INTERSPEECH 2018: 882-886
- [c61]Jochen Weiner, Miguel Angrick, Srinivasan Umesh, Tanja Schultz:
Investigating the Effect of Audio Duration on Dementia Detection Using Acoustic Features. INTERSPEECH 2018: 2324-2328
- [c60]Vishwas M. Shetty, Rini A. Sharon, Basil Abraham, Tejaswi Seeram, Anusha Prakash, Nithya Ravi, Srinivasan Umesh:
Articulatory and Stacked Bottleneck Features for Low Resource Speech Recognition. INTERSPEECH 2018: 3202-3206
- 2017
- [j12]Basil Abraham, Srinivasan Umesh:
An automated technique to generate phone-to-articulatory label mapping. Speech Commun. 86: 107-120 (2017)
- [j11]Neethu Mariam Joy, Murali Karthick Baskar, Srinivasan Umesh:
DNNs for unsupervised extraction of pseudo speaker-normalized features without explicit adaptation data. Speech Commun. 92: 64-76 (2017)
- [c59]Neethu Mariam Joy, Sandeep Reddy Kothinti, Srinivasan Umesh, Basil Abraham:
Generalized Distillation Framework for Speaker Normalization. INTERSPEECH 2017: 739-743
- [c58]Basil Abraham, Srinivasan Umesh, Neethu Mariam Joy:
Joint Estimation of Articulatory Features and Acoustic Models for Low-Resource Languages. INTERSPEECH 2017: 2153-2157
- [c57]Basil Abraham, Tejaswi Seeram, Srinivasan Umesh:
Transfer Learning and Distillation Techniques to Improve the Acoustic Modeling of Low Resource Languages. INTERSPEECH 2017: 2158-2162
- [c56]Neethu Mariam Joy, Srinivasan Umesh, Basil Abraham:
On Improving Acoustic Models for TORGO Dysarthric Speech Database. INTERSPEECH 2017: 2695-2699
- [c55]Seeram Tejaswi, Srinivasan Umesh:
DNN acoustic models for dysarthric speech. NCC 2017: 1-4
- [c54]Seeram Tejaswi, Srinivasan Umesh:
Addressing data sparsity in DNN acoustic modeling. NCC 2017: 1-5
- 2016
- [j10]Vikas Joshi, N. Vishnu Prasad, Srinivasan Umesh:
Modified Mean and Variance Normalization: Transforming to Utterance-Specific Estimates. Circuits Syst. Signal Process. 35(5): 1593-1609 (2016)
- [c53]Basil Abraham, Srinivasan Umesh, Neethu Mariam Joy:
Articulatory Feature Extraction Using CTC to Build Articulatory Classifiers Without Forced Frame Alignments for Speech Recognition. INTERSPEECH 2016: 798-802
- [c52]Basil Abraham, Srinivasan Umesh, Neethu Mariam Joy:
Overcoming Data Sparsity in Acoustic Modeling of Low-Resource Language by Borrowing Data and Model Parameters from High-Resource Languages. INTERSPEECH 2016: 3037-3041
- [c51]Neethu Mariam Joy, Murali Karthick Baskar, Srinivasan Umesh, Basil Abraham:
DNNs for Unsupervised Extraction of Pseudo FMLLR Features Without Explicit Adaptation Data. INTERSPEECH 2016: 3479-3483
- [c50]Neethu Mariam Joy, Srinivasan Umesh, Basil Abraham, K. Navneeth:
Improved phone-cluster adaptive training acoustic model. SPCOM 2016: 1-5
- 2015
- [j9]Vikas Joshi, Raghavendra Bilgi, Srinivasan Umesh, Luz García, M. Carmen Benítez:
Sub-band based histogram equalization in cepstral domain for speech recognition. Speech Commun. 69: 46-65 (2015)
- [c49]Murali Karthick B, Prateek Kolhar, Srinivasan Umesh:
Speaker adaptation of convolutional neural network using speaker specific subspace vectors of SGMM. INTERSPEECH 2015: 1096-1100
- [c48]R. Sriranjani, Murali Karthick B, Srinivasan Umesh:
Investigation of different acoustic modeling techniques for low resource Indian language data. NCC 2015: 1-5
- [c47]R. Sriranjani, M. Ramasubba Reddy, Srinivasan Umesh:
Improved acoustic modeling for automatic dysarthric speech recognition. NCC 2015: 1-6
- [c46]R. Sriranjani, Srinivasan Umesh, M. Ramasubba Reddy:
Pronunciation Adaptation For Disordered Speech Recognition Using State-Specific Vectors of Phone-Cluster Adaptive Training. SLPAT@Interspeech 2015: 72-78
- 2014
- [j8]Aanchan Mohan, Richard C. Rose, Sina Hamidi Ghalehjegh, Srinivasan Umesh:
Acoustic modelling for speech recognition in Indian languages in an agricultural commodities task domain. Speech Commun. 56: 167-180 (2014)
- [c45]Neethu Mariam Joy, Basil Abraham, K. Navneeth, Srinivasan Umesh:
Cross-lingual acoustic modeling for Indian languages based on Subspace Gaussian Mixture Models. NCC 2014: 1-5
- [c44]R. Sriranjani, Murali Karthick Baskar, Srinivasan Umesh:
Experiments on front-end techniques and segmentation model for robust Indian Language speech recognizer. NCC 2014: 1-6
- [c43]Murali Karthick B, Srinivasan Umesh:
Improving deep neural networks using state projection vectors of subspace Gaussian mixture model as features. SLT 2014: 129-134
- 2013
- [c42]Vimal Manohar, Srinivas C. Bhargav, Srinivasan Umesh:
Acoustic modeling using transform-based phone-cluster adaptive training. ASRU 2013: 49-54
- [c41]N. Vishnu Prasad, Srinivasan Umesh:
Improved cepstral mean and variance normalization using Bayesian framework. ASRU 2013: 156-161
- [c40]D. S. Pavan Kumar, N. Vishnu Prasad, Vikas Joshi, Srinivasan Umesh:
Modified splice and its extension to non-stereo data for noise robust speech recognition. ASRU 2013: 174-179
- [c39]Vikas Joshi, N. Vishnu Prasad, Srinivasan Umesh:
Modified cepstral mean normalization - transforming to utterance specific non-zero mean. INTERSPEECH 2013: 881-885
- [i1]D. S. Pavan Kumar, N. Vishnu Prasad, Vikas Joshi, Srinivasan Umesh:
Modified SPLICE and its Extension to Non-Stereo Data for Noise Robust Speech Recognition. CoRR abs/1307.4048 (2013)
- 2012
- [j7]Achintya Kumar Sarkar, Srinivasan Umesh:
Multiple background models for speaker verification using the concept of vocal tract length and MLLR super-vector. Int. J. Speech Technol. 15(3): 351-364 (2012)
- [j6]D. Rama Sanand, Srinivasan Umesh:
VTLN Using Analytically Determined Linear-Transformation on Conventional MFCC. IEEE Trans. Speech Audio Process. 20(5): 1573-1584 (2012)
- [c38]Raghavendra Bilgi, Vikas Joshi, Srinivasan Umesh, Luz García Martínez, M. Carmen Benítez Ortúzar:
Robust speech recognition through selection of speaker and environment transforms. ICASSP 2012: 4333-4336
- [c37]Achintya Kumar Sarkar, Srinivasan Umesh, Jean-François Bonastre:
Computationally efficient speaker identification using fast-MLLR based anchor modeling. ICASSP 2012: 4357-4360
- [c36]Vikas Joshi, Raghavendra Bilgi, Srinivasan Umesh, Luz García Martínez, M. Carmen Benítez Ortúzar:
Noise and speaker compensation in the Log filter bank domain. ICASSP 2012: 4709-4712
- [c35]Aanchan Mohan, Srinivasan Umesh, Richard C. Rose:
Subspace based for Indian languages. ISSPA 2012: 35-39
- 2011
- [c34]Achintya Kumar Sarkar, Srinivasan Umesh:
Use of VTL-wise models in feature-mapping framework to achieve performance of multiple-background models in speaker verification. ICASSP 2011: 4552-4555
- [c33]Vikas Joshi, Raghavendra Bilgi, Srinivasan Umesh, Luz García, M. Carmen Benítez:
Sub-Band Level Histogram Equalization for Robust Speech Recognition. INTERSPEECH 2011: 1661-1664
- [c32]Achintya Kumar Sarkar, Srinivasan Umesh:
Eigen-Voice Based Anchor Modeling System for Speaker Identification Using MLLR Super-Vector. INTERSPEECH 2011: 2357-2360
- [c31]Vikas Joshi, Raghavendra Bilgi, Srinivasan Umesh, M. Carmen Benítez, Luz García:
Efficient Speaker and Noise Normalization for Robust Speech Recognition. INTERSPEECH 2011: 2601-2604
- 2010
- [c30]Achintya Kumar Sarkar, Srinivasan Umesh:
Fast computation of speaker characterization vector using MLLR and sufficient statistics in anchor model framework. INTERSPEECH 2010: 2738-2741
- [c29]Achintya Kumar Sarkar, Srinivasan Umesh, Shakti Prasad Rath:
Computationally Efficient Speaker Identification for Large Population Tasks using MLLR and Sufficient Statistics. Odyssey 2010: 3
- [c28]Achintya Kumar Sarkar, Srinivasan Umesh:
Investigation of Speaker-Clustered UBMs based on Vocal Tract Lengths and MLLR matrices for Speaker Verification. Odyssey 2010: 13
2000 – 2009
- 2009
- [c27]D. Rama Sanand, Shakti Prasad Rath, Srinivasan Umesh:
Improving the performance of VTLN under mismatched speaker conditions and making it approach that of matched speaker conditions. ICASSP 2009: 4397-4400
- [c26]Shakti Prasad Rath, Srinivasan Umesh:
Acoustic class specific VTLN-warping using regression class trees. INTERSPEECH 2009: 556-559
- [c25]Shakti Prasad Rath, Srinivasan Umesh, Achintya Kumar Sarkar:
Using VTLN matrices for rapid and computationally-efficient speaker adaptation with robustness to first-pass transcription errors. INTERSPEECH 2009: 572-575
- [c24]D. Rama Sanand, Shakti Prasad Rath, Srinivasan Umesh:
A study on the influence of covariance adaptation on jacobian compensation in vocal tract length normalization. INTERSPEECH 2009: 584-587
- [c23]A. N. Harish, D. Rama Sanand, Srinivasan Umesh:
Characterizing speaker variability using spectral envelopes of vowel sounds. INTERSPEECH 2009: 1107-1110
- [c22]Achintya Kumar Sarkar, Srinivasan Umesh, Shakti Prasad Rath:
Text-independent speaker identification using vocal tract length normalization for building universal background model. INTERSPEECH 2009: 2331-2334
- 2008
- [j5]Rohit Sinha, Srinivasan Umesh:
A shift-based approach to speaker normalization using non-linear frequency-scaling model. Speech Commun. 50(3): 191-202 (2008)
- [c21]D. Rama Sanand, Srinivasan Umesh:
Study of jacobian compensation using linear transformation of conventional MFCC for VTLN. INTERSPEECH 2008: 1233-1236
- [c20]P. T. Akhil, Shakti Prasad Rath, Srinivasan Umesh, D. Rama Sanand:
A computationally efficient approach to warp factor estimation in VTLN using EM algorithm and sufficient statistics. INTERSPEECH 2008: 1713-1716
- [c19]D. Rama Sanand, V. Balaji, Rani R. Sandhya, Srinivasan Umesh:
Use of spectral centre of gravity for generating speaker invariant features for automatic speech recognition. INTERSPEECH 2008: 2258-2261
- 2007
- [j4]Srinivasan Umesh, Rohit Sinha:
A Study of Filter Bank Smoothing in MFCC Features for Recognition of Children's Speech. IEEE Trans. Speech Audio Process. 15(8): 2418-2430 (2007)
- [c18]Srinivasan Umesh, D. Rama Sanand, G. Praveen:
Speaker-Invariant Features for Automatic Speech Recognition. IJCAI 2007: 1738-1743
- [c17]D. Rama Sanand, D. Dinesh Kumar, Srinivasan Umesh:
Linear transformation approach to VTLN using dynamic frequency warping. INTERSPEECH 2007: 1138-1141
- 2006
- [c16]Jonas Lööf, Hermann Ney, Srinivasan Umesh:
VTLN Warping Factor Estimation Using Accumulation of Sufficient Statistics. ICASSP (1) 2006: 1201-1204
- [c15]S. V. Bharath Kumar, Srinivasan Umesh, Rohit Sinha:
Study Of Non-Linear Frequency Warping Functions For Speaker Normalization. ICASSP (1) 2006: 1245-1248
- 2005
- [c14]Srinivasan Umesh, András Zolnay, Hermann Ney:
Implementing frequency-warping and VTLN through linear transformation of conventional MFCC. INTERSPEECH 2005: 269-272
- 2004
- [c13]S. V. Bharath Kumar, Srinivasan Umesh, Rohit Sinha:
Non-uniform speaker normalization using affine-transformation. ICASSP (1) 2004: 121-124
- [c12]Srinivasan Umesh, Rohit Sinha, S. V. Bharath Kumar:
An investigation into front-end signal processing for speaker normalization. ICASSP (1) 2004: 345-348
- [c11]Do Yeong Kim, Srinivasan Umesh, Mark J. F. Gales, Thomas Hain, Philip C. Woodland:
Using VTLN for broadcast news transcription. INTERSPEECH 2004: 1953-1956
- 2003
- [c10]Rohit Sinha, Srinivasan Umesh:
A method for compensation of Jacobian in speaker normalization. ICASSP (1) 2003: 560-563
- 2002
- [j3]Srinivasan Umesh, Leon Cohen, Douglas J. Nelson:
Frequency warping and the Mel scale. IEEE Signal Process. Lett. 9(3): 104-107 (2002)
- [c9]Srinivasan Umesh, S. V. Bharath Kumar, M. K. Vinay, Rajesh Sharma, Rohit Sinha:
A simple approach to non-uniform vowel normalization. ICASSP 2002: 517-520
- [c8]Rohit Sinha, Srinivasan Umesh:
Non-uniform scaling based speaker normalization. ICASSP 2002: 589-592
- 2000
- [c7]Srinivasan Umesh, Richard C. Rose, Sarangarajan Parthasarathy:
Exploiting frequency-scaling invariance properties of the scale transform for automatic speech recognition. INTERSPEECH 2000: 301-304
1990 – 1999
- 1999
- [j2]Srinivasan Umesh, Leon Cohen, Nenad Marinovic, Douglas J. Nelson:
Scale transform in speech analysis. IEEE Trans. Speech Audio Process. 7(1): 40-45 (1999)
- [c6]Srinivasan Umesh, Leon Cohen, Douglas J. Nelson:
Fitting the Mel scale. ICASSP 1999: 217-220
- 1998
- [c5]Srinivasan Umesh, Leon Cohen, Douglas J. Nelson:
Improved scale-cepstral analysis in speech. ICASSP 1998: 637-640
- 1997
- [c4]Srinivasan Umesh, Leon Cohen, Douglas J. Nelson:
Frequency-warping and speaker-normalization. ICASSP 1997: 983-986
- 1996
- [j1]Srinivasan Umesh, Donald W. Tufts:
Estimation of parameters of exponentially damped sinusoids using fast maximum likelihood estimation with application to NMR spectroscopy data. IEEE Trans. Signal Process. 44(9): 2245-2259 (1996)
- [c3]Srinivasan Umesh, Douglas J. Nelson:
Computationally efficient estimation of sinusoidal frequency at low SNR. ICASSP 1996: 2797-2800
- [c2]Srinivasan Umesh, Leon Cohen, Nenad Marinovic, Douglas J. Nelson:
Frequency-warping in speech. ICSLP 1996: 414-417
- 1992
- [c1]Srinivasan Umesh, Donald W. Tufts:
Resolving the components of transient signals by a multistage procedure. ICASSP 1992: 553-556
last updated on 2024-11-22 20:41 CET by the dblp team
all metadata released as open data under CC0 1.0 license