Portfolio item number 1
Short description of portfolio item number 1
Short description of portfolio item number 2
Published in The Stata Journal, 2017
Use Google Scholar for full citation
Recommended citation: Matthias Schonlau, Nick Guenther, Ilia Sucholutsky, "Text mining with n-gram variables." The Stata Journal, 2017.
Published in Journal of Computational Vision and Imaging Systems, 2018
Use Google Scholar for full citation
Recommended citation: Ilia Sucholutsky, Matthias Schonlau, "ConvART: Improving Adaptive Resonance Theory for Unsupervised Image Clustering." Journal of Computational Vision and Imaging Systems, 2018.
Published in 2021 International Joint Conference on Neural Networks (IJCNN), 2021
Forthcoming. Pre-print available at arXiv:1910.02551
Recommended citation: Ilia Sucholutsky, Matthias Schonlau, "Soft-Label Dataset Distillation and Text Dataset Distillation." 2021 International Joint Conference on Neural Networks (IJCNN), 2019. https://arxiv.org/abs/1910.02551
Published in Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), 2019
Pre-print available at arXiv:1904.05411
Recommended citation: Ilia Sucholutsky, Apurva Narayan, Matthias Schonlau, Sebastian Fischmeister, "Deep Learning for System Trace Restoration." The proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), 2019. https://arxiv.org/abs/1904.05411
Published in PeerJ Computer Science, 2019
Recommended citation: Ilia Sucholutsky, Apurva Narayan, Matthias Schonlau, Sebastian Fischmeister, "Pay attention and you won’t lose it: a deep learning approach to sequence imputation." PeerJ Computer Science, 2019. https://peerj.com/articles/cs-210/
Published in Proceedings of the AAAI Conference on Artificial Intelligence, 2021
Forthcoming. Pre-print available at arXiv:2009.08449
Recommended citation: Ilia Sucholutsky, Matthias Schonlau, "`Less Than One'-Shot Learning: Learning N Classes From M$<$ N Samples." The proceedings of the Proceedings of the AAAI Conference on Artificial Intelligence, 2021. https://arxiv.org/abs/2009.08449
Published in 2021 International Joint Conference on Neural Networks (IJCNN), 2021
Forthcoming. Pre-print available at arXiv:2102.07834
Recommended citation: Ilia Sucholutsky, Nam-Hwui Kim, Ryan Browne, Matthias Schonlau, "One Line To Rule Them All: Generating LO-Shot Soft-Label Prototypes." 2021 International Joint Conference on Neural Networks (IJCNN), 2021. https://arxiv.org/abs/2102.07834
Published in PeerJ Computer Science, 2021
Recommended citation: Ilia Sucholutsky, Matthias Schonlau, "Optimal 1-NN prototypes for pathological geometries." PeerJ Computer Science, 2021. https://peerj.com/articles/cs-464/
Published in Proceedings of the AAAI Conference on Artificial Intelligence, 2021
Forthcoming. Pre-print available at arXiv:2009.09155
Recommended citation: Ilia Sucholutsky, Matthias Schonlau, "SecDD: Efficient and Secure Method for Remotely Training Neural Networks (Student Abstract)." The proceedings of the Proceedings of the AAAI Conference on Artificial Intelligence, 2021. https://arxiv.org/abs/2009.09155
Published:
An introduction to Generative Adversarial Networks intended for a technical audience with little to no background knowledge in neural networks.
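For attendees who want a concrete starting point, below is a minimal GAN training loop in PyTorch. It is only an illustrative sketch: the ring-shaped toy dataset, network sizes, and hyperparameters are placeholder choices rather than material from the talk itself.

```python
# Minimal GAN sketch (PyTorch): a generator learns to map noise to samples
# while a discriminator learns to tell real samples from generated ones.
# All data, dimensions, and hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2  # toy 2-D data for easy visualization

G = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 64), nn.ReLU(), nn.Linear(64, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def real_batch(n=64):
    # Stand-in for a real dataset: points on the unit circle.
    theta = torch.rand(n, 1) * 6.28318
    return torch.cat([theta.cos(), theta.sin()], dim=1)

for step in range(2000):
    real = real_batch()
    fake = G(torch.randn(len(real), latent_dim))

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    d_loss = bce(D(real), torch.ones(len(real), 1)) + \
             bce(D(fake.detach()), torch.zeros(len(real), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to fool the (just-updated) discriminator.
    g_loss = bce(D(fake), torch.ones(len(real), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```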
Published:
Lossy, noisy, or missing data are common phenomena in many areas of statistics, ranging from sampling to statistical learning. Instead of simply ignoring missing values, it can be useful to attempt to recover or impute them. Meanwhile, deep learning is increasingly shown to be adept at learning latent representations or distributions of data, patterns that are often too complex to be recognized manually or through classical statistical techniques. We will discuss practical deep learning approaches to the problem of lossy data restoration or imputation, with examples from several different types of datasets. We will compare the results to classical techniques to see whether deep learning can really be used to perform higher-quality imputation.
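As a toy illustration of the kind of comparison discussed in this talk, the sketch below trains a small network to impute masked values in a synthetic signal and scores it against simple mean imputation. The data, architecture, masking rate, and hyperparameters are all illustrative assumptions, not results from the talk.

```python
# Toy sketch of deep vs. classical imputation: mask entries of a sine wave,
# train a small denoising network to fill them in, and compare against
# mean imputation. Everything here is illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
T = 32  # window length

def batch(n=128):
    t0 = torch.rand(n, 1) * 6.28
    t = t0 + torch.linspace(0, 6.28, T)
    x = torch.sin(t)
    mask = (torch.rand(n, T) > 0.3).float()  # 1 = observed, 0 = missing
    return x, mask

net = nn.Sequential(nn.Linear(2 * T, 128), nn.ReLU(), nn.Linear(128, T))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x, mask = batch()
    x_in = torch.cat([x * mask, mask], dim=1)  # zero-filled values + mask
    x_hat = net(x_in)
    loss = ((x_hat - x) ** 2 * (1 - mask)).mean()  # score missing entries only
    opt.zero_grad(); loss.backward(); opt.step()

# Compare against mean imputation on fresh data.
x, mask = batch()
x_hat = net(torch.cat([x * mask, mask], dim=1))
mean_fill = (x * mask).sum(1, keepdim=True) / mask.sum(1, keepdim=True)
mse = lambda pred: (((pred - x) ** 2) * (1 - mask)).sum() / (1 - mask).sum()
print(f"network MSE: {mse(x_hat):.4f}  mean-imputation MSE: {mse(mean_fill.expand_as(x)):.4f}")
```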
Published:
We will go over five exciting projects from very different areas, and examine the deep learning algorithms underlying them, as inspiration for how you can enter the field regardless of where your interests or expertise currently lie.
Published:
Should you pursue graduate research in AI? What should you expect if you do? Most importantly, how do you ensure that it is a beneficial and positive experience for you? I hope to help you answer some of these questions by sharing my own experiences and introducing you to the very diverse set of AI projects happening at the University of Waterloo that you could work on as a graduate student.
Published:
While supervised learning techniques have become increasingly adept at separating images into different classes, these techniques require large amounts of labelled data which may not always be available. We propose a novel neuro-dynamic method for unsupervised image clustering by combining two biologically-motivated models: Adaptive Resonance Theory (ART) and Convolutional Neural Networks (CNN). ART networks are unsupervised clustering algorithms that have high stability in preserving learned information while quickly learning new information. Meanwhile, a major property of CNNs is their translation and distortion invariance, which has led to their success in the domain of vision problems. By embedding convolutional layers into an ART network, the useful properties of both networks can be leveraged to identify different clusters within unlabelled image datasets and classify images into these clusters. In exploratory experiments, we demonstrate that this method greatly increases the performance of unsupervised ART networks on a benchmark image dataset.
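The paper's exact architecture is not reproduced here, but the following simplified sketch conveys the core idea: extract features with a convolutional network, then cluster them with a Fuzzy-ART-style routine (complement coding, a choice function, and a vigilance test). The toy CNN, vigilance value, and learning rate are placeholder assumptions.

```python
# Simplified illustration of the ConvART idea: extract features with a small
# (here untrained) CNN, then cluster them with a Fuzzy-ART-style routine.
# This is a sketch of the general approach, not the paper's implementation.
import torch
import torch.nn as nn

cnn = nn.Sequential(  # toy feature extractor for 1x28x28 images
    nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

def art_cluster(features, rho=0.75, alpha=0.001, beta=1.0):
    """Fuzzy-ART-style clustering: assign each input to the best-matching
    category that passes the vigilance test, else create a new category."""
    x = (features - features.min()) / (features.max() - features.min() + 1e-8)
    x = torch.cat([x, 1 - x], dim=1)  # complement coding
    weights, labels = [], []
    for xi in x:
        # Choice function, highest score first.
        scores = [torch.minimum(xi, w).sum() / (alpha + w.sum()) for w in weights]
        order = sorted(range(len(weights)), key=lambda j: -scores[j])
        for j in order:
            match = torch.minimum(xi, weights[j]).sum() / xi.sum()
            if match >= rho:  # vigilance test passed: resonate and learn
                weights[j] = beta * torch.minimum(xi, weights[j]) + (1 - beta) * weights[j]
                labels.append(j)
                break
        else:  # no existing category matched: commit a new one
            weights.append(xi.clone())
            labels.append(len(weights) - 1)
    return labels

images = torch.randn(100, 1, 28, 28)        # stand-in for an image dataset
labels = art_cluster(cnn(images).detach())  # cluster the CNN embeddings
print(f"found {len(set(labels))} clusters")
```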
Published:
Most real-world datasets, and particularly those collected from physical systems, are full of noise, packet loss, and other imperfections. However, most specification mining, anomaly detection, and other such algorithms assume, or even require, perfect data quality to function properly. Such algorithms may work in lab conditions when given clean, controlled data, but will fail in the field when given imperfect data. We propose a method for accurately reconstructing discrete temporal or sequential system traces affected by data loss, using Long Short-Term Memory networks (LSTMs). The model works by learning to predict the next event in a sequence of events, and uses its own output as an input to continue predicting future events. As a result, this method can be used for data restoration even with streamed data. Such a method can reconstruct even long sequences of missing events, and can also help validate and improve data quality for noisy data. The output of the model will be a close reconstruction of the true data, and can be fed to algorithms that rely on clean data. We demonstrate our method by reconstructing automotive CAN traces consisting of long sequences of discrete events. We show that given even small parts of a CAN trace, our LSTM model can predict future events with an accuracy of almost 90%, and can successfully reconstruct large portions of the original trace, greatly outperforming a Markov Model benchmark. We separately feed the original, lossy, and reconstructed traces into a specification mining framework to perform downstream analysis of the effect of our method on state-of-the-art models that use these traces to understand the behavior of complex systems.
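As a rough sketch of the next-event-prediction idea described above (assuming PyTorch, with a toy periodic trace standing in for a real CAN log), the model below is trained with teacher forcing and then run autoregressively, feeding its own predictions back in to fill a missing span. Vocabulary size, model size, and training budget are illustrative only.

```python
# Sketch of next-event prediction for trace restoration: an LSTM learns to
# predict the next discrete event, then runs on its own outputs to
# reconstruct a missing stretch of the trace.
import torch
import torch.nn as nn

n_events, embed, hidden = 16, 32, 64

class NextEvent(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(n_events, embed)
        self.lstm = nn.LSTM(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_events)

    def forward(self, seq, state=None):
        h, state = self.lstm(self.emb(seq), state)
        return self.out(h), state

model = NextEvent()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy periodic trace standing in for a CAN log: 0, 1, ..., 15, 0, 1, ...
trace = torch.arange(1024) % n_events

for step in range(300):  # teacher-forced training on (prefix -> next event)
    logits, _ = model(trace[:-1].unsqueeze(0))
    loss = loss_fn(logits.squeeze(0), trace[1:])
    opt.zero_grad(); loss.backward(); opt.step()

@torch.no_grad()
def reconstruct(prefix, n_missing):
    """Warm up on the observed prefix, then feed predictions back in."""
    logits, state = model(prefix.unsqueeze(0))
    event = logits[0, -1].argmax().reshape(1, 1)
    filled = [event.item()]
    for _ in range(n_missing - 1):
        logits, state = model(event, state)
        event = logits[0, -1].argmax().reshape(1, 1)
        filled.append(event.item())
    return filled

print(reconstruct(trace[:100], 10))  # should continue the periodic pattern
```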
Undergraduate course, University of Waterloo, Department of Statistics and Actuarial Science, 2020
I offered a section of STAT 231 in Winter 2020.
Undergraduate course, Princeton University, Computer Science, 2022
I offered COS IW 10: Deep learning with small data in Spring 2022.
Graduate course, NYU, Center for Data Science, 2024
I’m offering DS-GA 3001.011: Special Topics in Data Science - Learning from small data in Fall 2024.