The doctoral dissertation in the field of Computer Science will be examined at the Faculty of Science and Forestry online.
What is the topic of your doctoral research?
Machine learning refers to computer algorithms that capture patterns from digital signals to produce meaningful predictions or analyses. To generalize well, traditional machine learning algorithms rely on handcrafted features, which require careful engineering and considerable domain expertise. This approach involves a complicated pipeline, which often results in intractable errors, poor scalability to massive datasets, and difficulties in practical deployment. Recent advances in deep artificial neural networks (DNN) have enabled a powerful framework for end-to-end learning. That is, the algorithms learn directly from what we see (e.g. an image of a cat), what we hear (e.g. audio of your speech), or what we cannot easily visualize (e.g. the transcriptomic sequence of a single cell).
Despite remarkable performance in various domains, the application of DNN is hindered by a lack of representative data, which is caused, for example, by mismatches between the training conditions of the algorithm and real-life situations. Critically, this problem is attributed to the limitations of two prominent learning paradigms: supervised learning, where the learning data must be mapped to meaningful categories such as dog and cat, and unsupervised learning, where the learning data are unlabelled and the algorithm must therefore discover the relevant patterns for a given objective. A sensible approach would be to incorporate these two learning paradigms into a unified framework and utilize both supervised and unsupervised data to leverage their merits and mutually address their drawbacks. This is known as semi-supervised learning (SSL).
As a result, my research proposes the integration of SSL and DNN in solving complex signal processing tasks that include the identification of under-resourced languages or dialects, imputing missing gene counts in single-cell RNA sequence expression profiles, and meaningful generation and manipulation of images.
What are the key findings or observations of your doctoral research?
In one study, we demonstrated that recognition of the North Sami regional dialects is confounded by multiple factors and by the lack of training data due to the small number of North Sami speakers, all of which have negative impacts on the performance of the system. Our proposed semi-supervised method could nevertheless efficiently utilize all the available data to outperform the most advanced dialect recognition system to date by a significant margin.
We also found that the algorithm can cope with uncertainty and unreliable data, especially when learning from corrupted datasets. The particular task tested was recovering the missing gene counts in single-cell RNA sequence expression profiles, where more than 95% of the data are zeros due to technical limitations. Moreover, many labeled cells are biased observations because the scientist has to rely on unreliable sources of information for classification. Since our algorithm could learn from both supervised and unsupervised data, these biases are often partially mitigated, and the imputed expressions exceeded those of baseline methods on both quantitative and qualitative metrics.
More interestingly, our final study discovered that semi-supervised learning could be used to guide unsupervised algorithms in discovering meaningful intrinsic properties of objects in images. For example, the algorithm can move an object around in a room, change its color, or generate an image with the desired shape, color, and/or rotation from a seemingly random vector. The ability to control the meaningful generation of images is crucial for understanding how an image was created in the first place.
How can the results of your doctoral research be utilised in practice?
When labeled examples are scarce and unlabeled examples are abundant, an efficient learning algorithm must utilize data from all possible sources to optimize the learnable function; semi-supervised learning (SSL) is therefore a potential solution for answering numerous scientific questions. The potential of SSL is broad, since the framework is flexible and existing learning algorithms can be extended in multiple ways. Applying semi-supervised algorithms to industrial scenarios would address practical issues in the collection and annotation of enormous amounts of data, thus enabling more reliable and adaptable virtual assistants, self-driving vehicles, and medical diagnoses. In academia, the learning of structured information from both supervised and unsupervised data facilitates deeper insights into the causality and connections among variables, and it has become a toolkit for intervention and counterfactual studies.
The fact that this learning paradigm can be applied to completely different types of signals, including speech, images, and genetic sequences, is a strong proof of concept. In summary, deep semi-supervised learning is the key to robust machine learning algorithms that adapt well to the unknown conditions of realistic environments.
What are the key research methods and materials used in your doctoral research?
In this research, we utilize the most recent advances in deep artificial neural networks (DNN) to learn meaningful patterns from complex digital signals. These include the convolutional neural architecture, in which the connections between neurons resemble the organization of the animal visual cortex in order to exploit locally invariant patterns, and recurrent neural architectures, which use feedback signals between neurons to capture the temporal patterns that are the essence of human speech.
The two semi-supervised techniques in the focus of this thesis are consistency regularization and generative modeling. In practice, consistency regularization is both flexible and versatile, because it incorporates an extra objective that regulates the consistency of data points from the same semantic group without explicitly changing the model design.
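As a rough illustration of the idea of an extra consistency objective, the following minimal sketch adds a consistency term to an ordinary supervised loss. It is not the method used in the thesis: the toy linear classifier, the Gaussian-noise perturbation, the squared-difference consistency measure, and the names `predict`, `perturb`, `ssl_loss`, and `lam` are all assumptions chosen for brevity.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Toy linear classifier (assumption): one weight vector per class."""
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    return softmax(logits)


def perturb(x, rng, scale=0.1):
    """Hypothetical augmentation: add small Gaussian noise to each feature."""
    return [x_i + rng.gauss(0.0, scale) for x_i in x]


def cross_entropy(probs, label):
    """Supervised loss for one labeled example."""
    return -math.log(probs[label] + 1e-12)


def consistency(p, q):
    """Mean squared difference between two predicted distributions."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)


def ssl_loss(weights, labeled, unlabeled, rng, lam=1.0):
    """Supervised loss plus a weighted consistency term on unlabeled data.

    The model itself is unchanged; only the objective gains an extra term.
    """
    sup = sum(cross_entropy(predict(weights, x), y) for x, y in labeled)
    sup /= len(labeled)
    cons = 0.0
    for x in unlabeled:
        # Two noisy views of the same point should receive similar predictions.
        p = predict(weights, perturb(x, rng))
        q = predict(weights, perturb(x, rng))
        cons += consistency(p, q)
    cons /= len(unlabeled)
    return sup + lam * cons, sup, cons


rng = random.Random(0)
weights = [[1.0, -1.0], [-1.0, 1.0]]            # two classes, two features
labeled = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]    # (features, label) pairs
unlabeled = [[0.8, 0.1], [0.2, 0.9], [0.5, 0.5]]
total, sup, cons = ssl_loss(weights, labeled, unlabeled, rng, lam=1.0)
print(f"supervised={sup:.4f} consistency={cons:.6f} total={total:.4f}")
```

Setting `lam` to zero recovers the purely supervised objective, which is one way to see that the unlabeled data enter only through the added term.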
On the other hand, generative modeling learns the underlying data structure and addresses the uncertainty in the input data. Due to the computational complexity of DNNs, our work would not have been feasible without the advances in Graphics Processing Units for computational tasks. The research was also only possible thanks to the efforts of researchers from the DigiSami project, University of Helsinki, and the Institute of Biomedicine, University of Eastern Finland, in collecting, annotating, and cleaning large amounts of data.
The doctoral dissertation of Trung Ngo, MSc, entitled Deep semi-supervised learning for signal processing will be examined at the Faculty of Science and Forestry on 27 May 2022 at 10 am online. The Opponent will be Associate Professor Arto Klami, Helsinki Institute for Information Technology, and the Custos will be Doctor Ville Hautamäki, University of Eastern Finland. The language of the public defence is English.