Journal article

Improving Breast Cancer Prediction Using Adaptive Synthetic Sampling: A Study on the Coimbra Dataset

Arabic title (translated): Improving Breast Cancer Prediction Using Adaptive Synthetic Sampling: A Study on the Coimbra Dataset

Ayman Alsabry, Abeer A Shujaaddeen, Mogeeb AA Mosleh

2025 5th International Conference on Emerging Smart Technologies and Applications (eSmarTA) · 2025 · pp. 1–8

IEEE

Abstract

Class imbalance remains a significant challenge in breast cancer classification, leading to biased predictive models that favor the majority class. Addressing this issue is crucial for improving early detection and diagnostic accuracy. This study investigates the impact of Adaptive Synthetic Sampling (ADASYN) on the performance of various machine learning models for breast cancer prediction using the Breast Cancer Coimbra Dataset (BCCD). A total of 36 machine learning models, including decision trees, support vector machines (SVMs), k-Nearest Neighbors (KNNs), neural networks, and ensemble-based methods, were trained and evaluated both before and after applying ADASYN. Performance was assessed using accuracy as the primary metric. The findings demonstrate that balancing the dataset significantly enhances classification performance, with Subspace KNN achieving the highest accuracy (91.7%) after ADASYN. However, some models, such as Linear SVM and certain neural networks, exhibited performance declines, highlighting the varying impact of synthetic oversampling across different algorithms. This study underscores the importance of data preprocessing techniques in medical diagnostics, demonstrating that adaptive oversampling can improve predictive accuracy but requires careful model selection. Future research should explore hybrid balancing techniques and feature selection methods to further enhance classification robustness.
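The oversampling step described in the abstract can be illustrated with a minimal, standard-library-only sketch of the ADASYN idea: minority samples that have more majority-class neighbours receive a larger share of the synthetic points, and each synthetic point is interpolated between a minority sample and one of its minority neighbours. This is not the authors' implementation. The function name `adasyn_sketch`, its parameters (`k`, `beta`, `seed`), and the toy data are all assumptions made for illustration, and the toy data are not taken from the Coimbra dataset.

```python
import math
import random


def adasyn_sketch(X, y, minority_label, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples generated ADASYN-style.

    X: list of feature vectors (lists of numbers); y: list of class labels.
    beta: desired balance level (1.0 aims for a fully balanced set).
    All names here are hypothetical; this is a sketch, not a library API.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority_label]
    n_major = len(y) - len(min_idx)
    # Total number of synthetic samples to create.
    total_new = int(round((n_major - len(min_idx)) * beta))
    if total_new <= 0 or len(min_idx) < 2:
        return []

    def nearest(i, pool):
        """Indices of up to k points in pool (excluding i) closest to X[i]."""
        others = [j for j in pool if j != i]
        return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]

    # Hardness ratio: share of majority points among the k nearest neighbours.
    ratios = []
    for i in min_idx:
        neigh = nearest(i, range(len(X)))
        ratios.append(sum(y[j] != minority_label for j in neigh) / len(neigh))

    total = sum(ratios)
    if total == 0:
        # No minority point borders the majority class: spread evenly.
        weights = [1 / len(min_idx)] * len(min_idx)
    else:
        weights = [r / total for r in ratios]

    synthetic = []
    for i, w in zip(min_idx, weights):
        minority_neigh = nearest(i, min_idx)
        # Rounding per point means the final count can differ slightly
        # from total_new.
        for _ in range(int(round(w * total_new))):
            j = rng.choice(minority_neigh)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data (illustrative only): 8 majority points, 3 minority points.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 0], [2, 1], [0, 2], [1, 2],
     [5, 5], [6, 5], [1.5, 0.5]]
y = [0] * 8 + [1] * 3
synthetic = adasyn_sketch(X, y, minority_label=1, k=3)
print(len(synthetic), "synthetic minority samples generated")
```

In a before/after comparison like the one the abstract describes, the synthetic samples would be appended to the training data only, so that the held-out evaluation data contain no synthetic points.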

Keywords

Classification; Machine Learning; Imbalanced Data; Predictive Models; Imbalance Dataset; Oversampling

Journal & article details

Journal: 2025 5th International Conference on Emerging Smart Technologies and Applications (eSmarTA)

Publisher: IEEE

Volume / Issue: -

Pages: 1–8

Year: 2025

DOI: -

Author: Ayman Alsabry, Abeer A Shujaaddeen, Mogeeb AA Mosleh

Author email: aymanalsabry@iutt.edu.ye


Article ID: ART-994675


