Date of Award
Spring 5-9-2025
Level of Access Assigned by Author
Open-Access Thesis
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science
First Committee Advisor
Chaofan Chen
Second Committee Member
Salimeh Yasaei Sekeh
Third Committee Member
Gregory Nelson
Additional Committee Members
Phillip Dickens
Paul Hand
Ali Payani
Abstract
While significant progress has been made in producing more complex and performant deep neural networks (DNNs) and larger datasets on which to train them, these advances also raise the need for better training methods. Traditional approaches either train individual networks or jointly train a single network on multiple datasets; the former prevents knowledge from being shared between datasets, while the latter requires simultaneous access to all datasets. Sequential training alleviates these issues while allowing knowledge learned on earlier tasks to inform the learning of later tasks. Naive approaches to sequential training, however, do not preserve learned knowledge when retraining, motivating research in the domain of Continual Learning (CL). Continual learning is the sequential training of a DNN on several tasks while avoiding the forgetting of learned information. This dissertation investigates how the behavior of a network and the connections within it change over the course of CL. We develop Information Flow (IF) measures of network behavior using measures such as the Pearson correlation and mutual information. These measures are used to monitor the activity of the network and each constituent subnetwork. We seek to understand how this information can provide usable insights that may be leveraged to improve the efficiency, accuracy, or adversarial robustness of the network. We demonstrate that, given sufficient dependency between layers, a robust subnetwork can help confer robustness on the full network. We consider the impact of substituting early CL tasks with images synthesized by generative models, so that the knowledge learned on these tasks can be shared with later, natural-image tasks. The IF metric is then extended to the subnetworks produced by pruning-based CL in order to compare similar subnetworks and inform decisions about which subset of previous knowledge to share when learning a new task. We extend this comparison of similar subnetworks to Online CL (OCL), showing how subnetworks may be leveraged for efficient online learning by enabling more systematic training. We show that the IF within these subnetworks can be used to cluster and merge similar subnetworks to free additional capacity within the network for future learning.
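As an illustration of the kind of layer-wise dependency measure the abstract describes, the sketch below computes a simple Pearson-correlation-based score between the activations of two layers. It is a minimal sketch only: the function name, shapes, and averaging choice are assumptions for illustration and do not reproduce the dissertation's exact Information Flow formulation.

# Illustrative sketch (hypothetical names; not the dissertation's implementation):
# a layer-wise dependency score based on the Pearson correlation between the
# activations of two layers, averaged over unit pairs.

import numpy as np


def if_score(activations_a: np.ndarray, activations_b: np.ndarray) -> float:
    """Mean absolute Pearson correlation between two layers' unit activations.

    activations_a: array of shape (n_samples, n_units_a)
    activations_b: array of shape (n_samples, n_units_b)
    """
    # Standardize each unit's activations across the batch of samples.
    a = (activations_a - activations_a.mean(0)) / (activations_a.std(0) + 1e-8)
    b = (activations_b - activations_b.mean(0)) / (activations_b.std(0) + 1e-8)
    # Cross-correlation matrix between the units of the two layers.
    corr = a.T @ b / a.shape[0]
    # Summarize inter-layer dependency as the mean absolute correlation.
    return float(np.abs(corr).mean())


# Example usage with random activations standing in for real layer outputs.
rng = np.random.default_rng(0)
acts_a = rng.standard_normal((256, 64))                                  # layer l
acts_b = acts_a @ rng.standard_normal((64, 32)) + 0.1 * rng.standard_normal((256, 32))  # layer l+1
print(f"IF score between layers: {if_score(acts_a, acts_b):.3f}")

A mutual-information-based variant would replace the correlation matrix with an estimate of the mutual information between unit activations; the same monitoring idea applies.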
Recommended Citation
Andle, Joshua, "Investigating Deep Neural Network Behavior During Continual Learning" (2025). Electronic Theses and Dissertations. 4190.
https://digitalcommons.library.umaine.edu/etd/4190