I am a Senior Applied Scientist on the Microsoft Speech team and a PhD candidate at Carnegie Mellon University. Previously, I received my master's degree from Carnegie Mellon University and my B.Tech from VJTI.
My broad research interests include Audio/Speech Processing and Multimodal Learning. My research is deployed in products such as Teams, Edge, and Outlook. Recent work includes Video Translation, Pengi, and CLAP.
Academic service:
Links: Google Scholar • GitHub • Twitter • LinkedIn • CV