Posts by Collection



Text mining with n-gram variables

Published in The Stata Journal, 2017

Use Google Scholar for full citation

Recommended citation: Matthias Schonlau, Nick Guenther, Ilia Sucholutsky, "Text mining with n-gram variables." The Stata Journal, 2017.

Deep Learning for System Trace Restoration

Published in The proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), 2019

Pre-print at arXiv:1904.05411

Recommended citation: Ilia Sucholutsky, Apurva Narayan, Matthias Schonlau, Sebastian Fischmeister, "Deep Learning for System Trace Restoration." The proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), 2019.

SecDD: Efficient and Secure Method for Remotely Training Neural Networks (Student Abstract)

Published in Proceedings of the AAAI Conference on Artificial Intelligence, 2021

Forthcoming. Pre-print available at arXiv:2009.09155

Recommended citation: Ilia Sucholutsky, Matthias Schonlau, "SecDD: Efficient and Secure Method for Remotely Training Neural Networks (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence, 2021.


Deep Learning for Lost Data Restoration and Imputation


Lossy, noisy, or missing data are common phenomena in many areas of statistics, ranging from sampling to statistical learning. Instead of simply ignoring missing values, it can be useful to attempt to recover or impute them. Meanwhile, deep learning is increasingly shown to be adept at learning latent representations or distributions of data. These patterns or representations can often be too complex to be recognized manually or through classical statistical techniques. We will discuss practical deep learning approaches to the problem of lossy data restoration or imputation, with examples drawn from several different types of datasets. We will compare the results to classical techniques to see whether deep learning can really be used to perform higher quality imputation.

Breaking Into Deep Learning: Five Projects to get you Inspired


We will go over five exciting projects from very different areas, and examine the deep learning algorithms underlying them, as inspiration for how you can enter the field regardless of where your interests or expertise currently lie.

Making the Most of Graduate Research in AI


Should you pursue graduate research in AI? What should you expect if you do? Most importantly, how do you ensure that it is a beneficial and positive experience for you? I hope to help you answer some of these questions by sharing my own experiences and introducing you to the very diverse set of AI projects happening at the University of Waterloo that you could work on as a graduate student.

ConvART: Improving Adaptive Resonance Theory for Unsupervised Image Clustering


While supervised learning techniques have become increasingly adept at separating images into different classes, these techniques require large amounts of labelled data which may not always be available. We propose a novel neuro-dynamic method for unsupervised image clustering by combining two biologically motivated models: Adaptive Resonance Theory (ART) and Convolutional Neural Networks (CNN). ART networks are unsupervised clustering algorithms that have high stability in preserving learned information while quickly learning new information. Meanwhile, a major property of CNNs is their translation and distortion invariance, which has led to their success in the domain of vision problems. By embedding convolutional layers into an ART network, the useful properties of both networks can be leveraged to identify different clusters within unlabelled image datasets and classify images into these clusters. In exploratory experiments, we demonstrate that this method greatly increases the performance of unsupervised ART networks on a benchmark image dataset.

Deep Learning for System Trace Restoration


Most real-world datasets, and particularly those collected from physical systems, are full of noise, packet loss, and other imperfections. However, most specification mining, anomaly detection, and other such algorithms assume, or even require, perfect data quality to function properly. Such algorithms may work in lab conditions when given clean, controlled data, but will fail in the field when given imperfect data. We propose a method for accurately reconstructing discrete temporal or sequential system traces affected by data loss, using Long Short-Term Memory Networks (LSTMs). The model works by learning to predict the next event in a sequence of events, and uses its own output as an input to continue predicting future events. As a result, this method can be used for data restoration even with streamed data. Such a method can reconstruct even long sequences of missing events, and can also help validate and improve data quality for noisy data. The output of the model will be a close reconstruction of the true data, and can be fed to algorithms that rely on clean data. We demonstrate our method by reconstructing automotive CAN traces consisting of long sequences of discrete events. We show that given even small parts of a CAN trace, our LSTM model can predict future events with an accuracy of almost 90%, and can successfully reconstruct large portions of the original trace, greatly outperforming a Markov Model benchmark. We separately feed the original, lossy, and reconstructed traces into a specification mining framework to perform downstream analysis of the effect of our method on state-of-the-art models that use these traces for understanding the behavior of complex systems.
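
As a rough illustration of the feed-the-prediction-back-in idea described in the abstract above, here is a minimal sketch. It is not the LSTM from the talk: to stay dependency-free it uses a first-order Markov predictor, the kind of baseline the abstract compares against. The function names (`fit_transitions`, `predict_next`, `restore`), the use of `None` as the marker for a lost event, and the toy trace are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def fit_transitions(traces):
    """Count first-order transitions (event -> next event) in complete traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, event):
    """Return the most frequent successor of `event`, or None if it was never seen."""
    if event not in counts:
        return None
    return counts[event].most_common(1)[0][0]


def restore(trace, counts, missing=None):
    """Fill gaps left to right, feeding each prediction back in as the next input.

    A gap at the very start of the trace is left as-is, since there is no
    preceding event to predict from.
    """
    restored = []
    for event in trace:
        if event is missing and restored:
            # The previous entry may itself be a prediction: this is the
            # "uses its own output as an input" step, with a Markov model
            # standing in for the LSTM.
            event = predict_next(counts, restored[-1])
        restored.append(event)
    return restored


# Toy data (hypothetical): a clean periodic trace and a lossy copy of it.
clean = [["A", "B", "C", "A", "B", "C"]]
lossy = ["A", None, None, "A", None, "C"]

model = fit_transitions(clean)
print(restore(lossy, model))
```

Because each restored event becomes the input for the next prediction, a run of consecutive gaps can be filled in a single pass, which is also why the approach fits streamed data. A first-order model only sees one event of context, so it would be expected to do worse than a sequence model on traces with longer-range structure.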


STAT 231

Undergraduate course, University of Waterloo, Department of Statistics and Actuarial Science, 2020

I offered a section of STAT 231 in Winter 2020.