Date of Award

Summer 8-18-2023

Level of Access Assigned by Author

Open-Access Thesis

Degree Name

Master of Science (MS)


Biological Engineering


Andre Khalil

Second Committee Member

Michael Mason

Third Committee Member

David Bradley


In recent years, breast cancer has become the most commonly diagnosed cancer worldwide. One of the most common forms of screening is digital x-ray screening mammography. Risk assessment models, which help predict a patient's risk of developing breast cancer, rely mainly on patient history and qualitative breast density assessment from screening. The 2D wavelet transform modulus maxima (2D WTMM) method uses a sliding-window approach to quantify the spatial organization of the underlying mammographic tissue according to ranges of the Hurst exponent (H): fatty (H ≤ 0.45), risky dense (0.45 < H < 0.55), and healthy dense (H ≥ 0.55). The result is a grey-scale map of H values in the shape of the mammogram.
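The three H ranges can be illustrated with a short sketch. The thresholds (0.45 and 0.55) come from the abstract; the function names, the use of NumPy, and the assumption that every pixel in the map is valid tissue (no background or missing values) are illustrative choices, not the thesis implementation.

```python
import numpy as np

# Thresholds quoted in the abstract; everything else here is an assumption.
FATTY_MAX = 0.45
HEALTHY_DENSE_MIN = 0.55
CATEGORIES = ("fatty", "risky dense", "healthy dense")


def classify_h_map(h_map):
    """Label each Hurst-exponent value by tissue category."""
    h = np.asarray(h_map, dtype=float)
    # H <= 0.45 is fatty, H >= 0.55 is healthy dense, anything between is risky dense.
    return np.where(
        h <= FATTY_MAX,
        "fatty",
        np.where(h >= HEALTHY_DENSE_MIN, "healthy dense", "risky dense"),
    )


def tissue_fractions(h_map):
    """Return the fraction of pixels falling in each category."""
    labels = classify_h_map(h_map)
    return {name: float(np.mean(labels == name)) for name in CATEGORIES}
```

For a hypothetical 2×2 map such as `[[0.30, 0.45], [0.50, 0.55]]`, `tissue_fractions` would report half the pixels as fatty and a quarter each as risky dense and healthy dense.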

The metric space technique (MST) is a method for quantifying 2D maps as a 1D output function, in which characteristics of an image are measured across threshold values and plotted. The MST was run on 89 patients with tumors (71 cancer, 18 benign) from the Perm data set, which comprises H-value maps of mediolateral oblique (MLO) and craniocaudal (CC) views. Of thirty possible metrics, six were found to differ with statistical significance between the cancer and benign categories. These six metrics were used to train univariate and multivariate general linear models (GLM) and k-nearest neighbor (KNN) models. The univariate KNN models outperformed the univariate GLM models, which resulted in acceptable discrimination and high specificity but low sensitivity and balanced accuracy. The multivariate KNN model achieved the highest area under the receiver operating characteristic curve (ROC AUC), 0.71, indicating acceptable discriminatory capacity. A modification to the MST is suggested that would address a dilution effect introduced through the examination of fatty, risky dense, and healthy dense AUC as discrete regions. Further work on feature selection and work with a larger, more balanced data set are necessary to validate these results.
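The idea of measuring an image characteristic across thresholds can be sketched as follows. This is a minimal illustration only: the abstract does not list the thirty metrics, so the characteristic used here (the fraction of pixels at or above each threshold) is an assumed stand-in, and all function names are hypothetical. The regional areas use the H ranges quoted in the abstract.

```python
import numpy as np


def output_function(h_map, thresholds):
    """Fraction of pixels with H >= t, evaluated at each threshold t.

    The measured characteristic is an assumption made for illustration.
    """
    h = np.asarray(h_map, dtype=float).ravel()
    return np.array([np.mean(h >= t) for t in thresholds])


def area_under(x, y):
    """Trapezoidal area under y(x); returns 0.0 for fewer than two points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))


def regional_areas(thresholds, values, fatty_max=0.45, healthy_dense_min=0.55):
    """Area under the output function within each H range, taken separately."""
    t = np.asarray(thresholds, dtype=float)
    v = np.asarray(values, dtype=float)
    regions = {
        "fatty": t <= fatty_max,
        "risky dense": (t > fatty_max) & (t < healthy_dense_min),
        "healthy dense": t >= healthy_dense_min,
    }
    return {name: area_under(t[mask], v[mask]) for name, mask in regions.items()}
```

With a threshold grid such as `np.linspace(0.0, 1.0, 101)`, `output_function` yields one curve per H map, and `regional_areas` summarizes that curve over the fatty, risky dense, and healthy dense ranges. Scalar summaries of this kind are the sort of quantity that could then be passed to a classifier.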