Sanket Vaibhav Mehta [SVM]

I am a Research Scientist at Google DeepMind, focusing on continual (lifelong) learning and unlearning in non-stationary environments.

I received my Ph.D. from the Language Technologies Institute (LTI) at the School of Computer Science, Carnegie Mellon University, where I was advised by Emma Strubell. Before that, I obtained a Master's degree (2019) from the LTI, where I was advised by Jaime Carbonell and Barnabás Póczos.

Before joining CMU, I was a member of the research staff at Big Data Research Lab, Adobe Research (2015-17), where I designed algorithms for identifying data-driven geo-fences to assist Adobe’s digital marketing offering.

I graduated from the Indian Institute of Technology Roorkee with a B.Tech in Computer Science (2011-15) and a President's Gold Medal.

Email  /  CV  /  Google Scholar  /  X  /  GitHub

Research

I'm interested in machine learning, natural language processing, and optimization, with a specific focus on learning from limited labeled data, multiple tasks, and non-stationary data distributions (Continual/Lifelong Learning, Transfer Learning, Meta Learning, Multi-Task Learning, Modular Learning).

My doctoral thesis focuses on designing efficient lifelong learning systems that alleviate catastrophic forgetting of previously learned knowledge and facilitate continual learning of new tasks. Inspired by biological learning processes and progress in deep learning, my work injects appropriate inductive biases into the three main components of data-driven machine learning: model (architecture & initialization), training (objective & optimization), and data (limited labeled & unlabeled data).

Publications
Efficient Lifelong Learning in Deep Neural Networks: Optimizing Architecture, Training, and Data
Sanket Vaibhav Mehta
PhD thesis, Carnegie Mellon University, 2023
bibtex / tweet
DSI++: Updating Transformer Memory with New Documents
Sanket Vaibhav Mehta, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, Donald Metzler
EMNLP, 2023
bibtex / tweet
An Empirical Investigation of the Role of Pre-training in Lifelong Learning
Sanket Vaibhav Mehta, Darshan Patil, Sarath Chandar, Emma Strubell
Journal of Machine Learning Research, 2023
bibtex / code / tweet
Making Scalable Meta Learning Practical
Sang Keun Choe, Sanket Vaibhav Mehta, Hwijeen Ahn, Willie Neiswanger, Pengtao Xie, Emma Strubell, Eric Xing
NeurIPS, 2023
bibtex / code / tweet
Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na, Sanket Vaibhav Mehta, Emma Strubell
EMNLP Findings, 2022
bibtex / code / tweet
An Introduction to Lifelong Supervised Learning
Shagun Sodhani, Mojtaba Faramarzi, Sanket Vaibhav Mehta, Pranshu Malviya, Mohamed Abdelsalam, Janarthanan Janarthanan, Sarath Chandar
arXiv, 2022
bibtex / tweet
Improving Compositional Generalization with Self-Training for Data-to-Text Generation
Sanket Vaibhav Mehta, Jinfeng Rao, Yi Tay, Mihir Kale, Ankur Parikh, Emma Strubell
ACL, 2022
bibtex / code / poster
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler
ICLR, 2022
bibtex / press / tweet
An Empirical Investigation of the Role of Pre-training in Lifelong Learning
Sanket Vaibhav Mehta, Darshan Patil, Sarath Chandar, Emma Strubell
ICML Theory and Foundation of Continual Learning Workshop, 2021 (Spotlight)
bibtex / code
Efficient Meta Lifelong-Learning with Limited Memory
Sanket Vaibhav Mehta*, Zirui Wang*, Barnabás Póczos, Jaime Carbonell
EMNLP, 2020
bibtex
Learning Rhyming Constraints using Structured Adversaries
Harsh Jhamtani, Sanket Vaibhav Mehta, Jaime Carbonell, Taylor Berg-Kirkpatrick
EMNLP, 2019
bibtex / code / poster
Gradient-Based Inference for Networks with Output Constraints
Jay-Yoon Lee, Sanket Vaibhav Mehta, Michael Wick, Jean-Baptiste Tristan, Jaime Carbonell
AAAI, 2019
bibtex / code
Towards Semi-Supervised Learning for Deep Semantic Role Labeling
Sanket Vaibhav Mehta*, Jay-Yoon Lee*, Jaime Carbonell
EMNLP, 2018
bibtex / code / poster
An LSTM Based System for Prediction of Human Activities with Durations
Kundan Krishna, Deepali Jain, Sanket Vaibhav Mehta, Sunav Choudhary
IMWUT, 2017
bibtex
Preventing Inadvertent Information Disclosures via Automatic Security Policies
Tanya Goyal, Sanket Vaibhav Mehta, Balaji Vasan Srinivasan
PAKDD, 2017
bibtex
Issued Patents
1. Generating data-driven geo-fences (US 9,838,843)
2. Propagation of changes in master content to variant content (US 10,102,191)
3. Digital document update (US 10,489,498)
4. Tagging documents with security policies (US 10,783,262)
5. Digital document update using static and transient tags (US 10,846,466)
6. Tenant-side detection, classification, and mitigation of noisy-neighbor-induced performance degradation (US 11,086,646)
7. Intelligent customer journey mining and mapping (US 11,756,058)