Automatic Classification of Mammography Images Using a Convolutional Neural Network

Lisa Pitzl
Master Digital Healthcare, St. Pölten University of Applied Sciences 2024

Aim and Research Question(s)

Breast cancer represents a significant health burden worldwide. Mammography screening programs play a crucial role in early detection, highlighting the need for accurate and efficient diagnostic tools such as Convolutional Neural Networks (CNN). This thesis aims to develop and evaluate a CNN-based solution for the automatic classification of mammography images. RQ1: How can a CNN be developed and evaluated? RQ2: How can a CNN, with automated classification of mammography images, help improve breast cancer diagnoses?


About 5,000 women in Austria are diagnosed with breast cancer every year, which corresponds to roughly one in eight women in the country being affected over the course of her life [1]. Organized mammography screening programs are intended to reduce breast cancer mortality. Because widespread screening generates a growing volume of mammography images and data, radiologists are often challenged to perform accurate assessments within a reasonable time frame [2].


This thesis addresses the need to develop and evaluate a CNN for classifying mammography images. The dataset, comprising mammography images of 500 patients from the radiology department of the Ordensklinikum Linz GmbH Barmherzige Schwestern, was filtered and pseudonymized. The data was initially categorized according to the Breast Imaging-Reporting and Data System (BI-RADS®) classification. The models were optimized for binary and multi-class classification tasks. Performance was evaluated using accuracy, loss, the receiver operating characteristic (ROC) and the confusion matrix.
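The evaluation measures named above can be illustrated with a minimal sketch. It is not the thesis code: the labels and scores are made-up toy values, the helper names are hypothetical, and the baseline is assumed here to be the majority-class accuracy, which the text does not confirm.

```python
import numpy as np

def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def majority_baseline(y_true):
    # Accuracy of always predicting the most frequent class
    # (assumption: the thesis may define its baseline differently).
    counts = np.bincount(np.asarray(y_true))
    return float(counts.max() / counts.sum())

def confusion_matrix(y_true, y_pred, n_classes):
    # Rows are true classes, columns are predicted classes.
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def roc_auc(y_true, scores):
    # Area under the ROC curve for a binary task, computed as the share of
    # (positive, negative) pairs ranked correctly; ties count as one half.
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / (len(pos) * len(neg)))

# Toy data: 0 = benign, 1 = malignant (illustrative values only).
y_true = np.array([0, 0, 0, 0, 1, 1, 0, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.2, 0.3])
y_pred = (scores >= 0.5).astype(int)  # threshold of 0.5 is an assumption

acc = accuracy(y_true, y_pred)
base = majority_baseline(y_true)
cm = confusion_matrix(y_true, y_pred, n_classes=2)
auc = roc_auc(y_true, scores)

print("accuracy:", acc)
print("majority baseline:", base)
print("confusion matrix:\n", cm)
print("ROC AUC:", auc)
```

Reporting accuracy next to such a baseline shows how much of a score comes from class imbalance alone rather than from what the model has learned.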

Results and Discussion

The results show that the CNN model can distinguish between benign and malignant mammographic images to a limited extent, with a validation accuracy of 69.49% and a test accuracy of 70.13% against a baseline accuracy of 66.33%. However, challenges were observed when extending the model to three and five classes. For the three-class model, the test accuracy was 47.65% against a baseline accuracy of 43.82%. For the five-class model, the test accuracy reached 27.18% against a baseline accuracy of 24.70%. Artefacts in the mammography images may have affected model performance.
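To put these figures side by side, the following short sketch computes the margin of each model over its baseline in percentage points. It uses only the accuracies reported above; the dictionary and variable names are illustrative.

```python
# Reported (baseline accuracy, test accuracy) in percent, taken from the text.
reported = {
    "binary (2 classes)": (66.33, 70.13),
    "3 classes": (43.82, 47.65),
    "5 classes": (24.70, 27.18),
}

# Gain over baseline in percentage points, rounded to two decimals.
gains = {name: round(test - base, 2) for name, (base, test) in reported.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} percentage points over baseline")
```

In all three settings the test accuracy exceeds the baseline by fewer than four percentage points.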

Figure 1: Data categorization and class division. Own illustration.


While the models delivered satisfactory performance on the training data, their effectiveness on unseen data was limited, indicating overfitting and poor generalization. The thesis emphasizes the importance of continued research and innovation in artificial intelligence-based diagnostic tools in medicine, especially in radiology for the recognition and classification of mammography images.


[1] Österreichische Gesundheitskasse. (2023, October 3). Früh erkennen, Österreichisches Brustkrebs-Früherkennungsprogramm.
[2] Ragab, D. A., Attallah, O., Sharkas, M., Ren, J., & Marshall, S. (2021). A framework for breast cancer classification using Multi-DCNNs. Computers in Biology and Medicine, 131, 104245.