Speech and Text Generative Models for Automatic Dubbing of Videos
Master’s Thesis-II (Nationwide project - BharatGen) at the Computational Speech and Language Technologies Lab, IIT Bombay, Guides: Prof. Preethi Jyothi, Prof. Ganesh Ramakrishnan
Summary: This project aims to generate natural-sounding speech in low-resource languages for agricultural education videos.
- Using non-autoregressive continuous normalizing flows trained with flow matching for text-guided multilingual speech generation
- Adapting neural codec language models such as VALL-E and SpeechX for voice and emotion transfer in low-resource dialects