Date of Award

Spring 5-9-2025

Level of Access Assigned by Author

Open-Access Thesis

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Committee Advisor

Chaofan Chen

Second Committee Member

Salimeh Yasaei Sekeh

Third Committee Member

Gregory Nelson

Additional Committee Members

Phillip Dickens

Paul Hand

Ali Payani

Abstract

While significant progress has been made in producing more complex and higher-performing deep neural networks (DNNs), along with larger datasets on which to train them, these advances also raise the need for research into better training methods. Traditional approaches either train individual networks separately or jointly train a single network on multiple datasets; the former prevents knowledge from being shared between datasets, while the latter requires simultaneous access to all datasets. Sequential training alleviates these issues while allowing knowledge learned in earlier tasks to inform the learning of later tasks. However, naive approaches to sequential training do not preserve learned knowledge when retraining, motivating research in the domain of Continual Learning (CL). Continual learning is the sequential training of a DNN on several tasks while avoiding the forgetting of previously learned information. This dissertation investigates how the behavior of a network, and the connections within it, change over the course of CL. We produce Information Flow (IF) measures of network behavior built on statistical and information-theoretic concepts, including the Pearson correlation and mutual information. These measures are used to monitor the activity of the network and of each constituent subnetwork. We seek to understand how this information can provide usable insights that may be leveraged to improve the efficiency, accuracy, or adversarial robustness of the network. We demonstrate that, given sufficient dependency between layers, a robust subnetwork can help confer robustness on the full network. We consider the impact of substituting early CL tasks with images synthesized by generative models, so that the knowledge learned on these tasks can be shared with later, natural-image tasks. The IF metric is then extended to the subnetworks produced by pruning-based CL, comparing similar subnetworks to inform decisions about which subset of previous knowledge to share when learning a new task. We extend this comparison of similar subnetworks to Online CL (OCL), showing how subnetworks may be leveraged for efficient online learning by enabling more systematic training. Finally, we show that the IF within these subnetworks can be used to cluster and merge similar subnetworks, freeing additional capacity within the network for future learning.
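
As a purely illustrative sketch (not code from the dissertation), the snippet below shows one way a Pearson-correlation-based information-flow score between the activations of two layers could be computed over a batch of inputs. The function name, the activation shapes, and the choice to summarize dependency as the mean absolute correlation over unit pairs are assumptions made here for illustration only.

    # Illustrative sketch: a Pearson-correlation-based "information flow" score
    # between two layers' activations, in the spirit of the IF measures described
    # in the abstract. Names, shapes, and the averaging scheme are assumptions.
    import numpy as np

    def pearson_information_flow(acts_a: np.ndarray, acts_b: np.ndarray) -> float:
        """Mean absolute Pearson correlation between units of two layers.

        acts_a: activations of layer A, shape (n_samples, n_units_a)
        acts_b: activations of layer B, shape (n_samples, n_units_b)
        """
        # Standardize each unit's activations across the batch.
        za = (acts_a - acts_a.mean(axis=0)) / (acts_a.std(axis=0) + 1e-8)
        zb = (acts_b - acts_b.mean(axis=0)) / (acts_b.std(axis=0) + 1e-8)
        # Cross-correlation matrix: one entry per (unit in A, unit in B) pair.
        corr = za.T @ zb / acts_a.shape[0]  # shape (n_units_a, n_units_b)
        # Summarize the dependency between the two layers as a single scalar.
        return float(np.abs(corr).mean())

    # Example usage with synthetic, partially dependent "layer" activations.
    rng = np.random.default_rng(0)
    layer_a = rng.normal(size=(256, 64))
    layer_b = layer_a[:, :32] @ rng.normal(size=(32, 48))
    print(pearson_information_flow(layer_a, layer_b))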
