I’m doing my PhD in Statistics at the University of Waterloo. I’m fascinated by deep learning and its ability to reach superhuman performance on so many different tasks. I want to better understand how neural networks achieve such impressive results… and why sometimes they don’t. To do that, I’m exploring what kind of information or knowledge is contained in the datasets we train our models on and how much of this knowledge is actually needed for our models.
In the beginning, I used deep learning to restore lost data from cars in order to improve anomaly detection algorithms and make cars safer. The ability to restore lost data suggests that knowledge is duplicated across a dataset.
Then, I worked on improving dataset distillation, the process of learning tiny synthetic datasets that contain all the knowledge of much larger datasets. If knowledge is duplicated across a dataset, then it should be possible to represent that knowledge using fewer samples.
Now, I work on “less than one”-shot learning, an extreme form of few-shot learning where the goal is for models to learn N new classes using M < N training samples. If models can generalize from a small number of synthetic samples, can they also generalize from a small number of real samples?
Check out my publications to see my progress so far. If you find something you’re interested in discussing then shoot me an email and I’d be happy to chat. The best place to reach me is at firstname.lastname@example.org.
Apr 2021: Two of our papers were accepted to IJCNN 2021: “Soft-Label Dataset Distillation and Text Dataset Distillation” (preprint) and “One Line To Rule Them All: Generating LO-Shot Soft-Label Prototypes” (preprint).
Mar 2021: Our paper on “Optimal 1-NN Prototypes for Pathological Geometries” was accepted for publication in PeerJ Computer Science.
Jan 2021: Our research was profiled in Scientific American as a pathway towards democratizing AI.
Dec 2020: Our paper on ‘Less Than One’-Shot Learning was accepted to AAAI-21!
Nov 2020: Our extended abstract on privacy-preserving dataset distillation was accepted for poster presentation in the AAAI-21 Student Abstract and Poster Program.
Aug 2020: I’m joining Stratum AI as VP of Research and will be leading the development of ML/DL methods to make mining more efficient and environmentally sustainable.