A privacy-preserving approach to effectively utilise distributed data for medical disease detection
Authors
Kareem, Amer
Issue Date
2024
Subjects
privacy preserving
image detection
medical disease detection
machine learning
federated learning
Subject Categories::G760 Machine Learning
Abstract
Pneumonia is a potentially fatal disease that causes the death of around 4 million people yearly. Several previous studies have sought to detect pneumonia using state-of-the-art machine learning methods. However, medical images are challenging to analyse: they are high in spatial resolution, visually heterogeneous, and complex in pattern. Overcoming this challenge requires large volumes of data, which can be obtained by pooling data from hospitals and medical institutes through collaborative sharing platforms. But the General Data Protection Regulation (GDPR) and the Data Protection Act (DPA) 2018 do not allow institutions to share customer data with third-party companies. Given the restrictions imposed by UK (EU) rules and regulations, the major challenge for researchers is access to data. As a result, a method is needed that gives machine learning models access to enough data to make accurate predictions while maintaining privacy. In this research, a hybrid approach combining machine learning models with a federated learning framework is proposed to use distributed data in a privacy-preserving manner. In the experiment, chest radiographs are used to detect pneumonia by distributing the data across a varying number of simulated clients and training the model on each client individually. Models are trained locally on each client's data within the distributed federated learning framework, and the trained model is shared with the central federated learning server. The best-performing models were then benchmarked on malaria and brain tumour datasets. The research also tests the significance of the differences in the models' performance within the federated learning framework. The research contribution is a hybrid framework of federated learning and CNN-based pre-trained models that allows access to distributed data in a privacy-preserving manner.
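The training loop described above (simulated clients train locally, then share only model parameters with a central server) can be illustrated with a minimal sketch. It is not the thesis implementation: a tiny one-dimensional linear model stands in for the CNN-based pre-trained models, the server is assumed to use sample-size-weighted averaging of client parameters (the abstract does not state the aggregation rule), and all function names are hypothetical.

```python
import random


def local_train(weights, data, lr=0.1, epochs=5):
    """Local update on one client: gradient descent on y = w*x + b.

    Only the resulting parameters leave the client, never the data.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step (assumed): average parameters weighted by sample count."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def split_among_clients(samples, num_clients):
    """Simulate distributed data by dealing samples out to clients."""
    return [samples[i::num_clients] for i in range(num_clients)]


def run_federated(samples, num_clients=3, rounds=50):
    """Run several rounds of local training followed by server averaging."""
    clients = split_among_clients(samples, num_clients)
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_train(global_weights, c) for c in clients]
        sizes = [len(c) for c in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Synthetic, noise-free data from y = 2x + 1 (illustrative only).
rng = random.Random(0)
samples = [(x, 2.0 * x + 1.0) for x in (rng.uniform(-1, 1) for _ in range(60))]
w, b = run_federated(samples)
print(f"w={w:.3f}, b={b:.3f}")
```

In this sketch the server sees only the `(w, b)` pairs returned by each client. Changing `num_clients` mirrors the experiment's idea of distributing the same dataset across a different number of simulated clients.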
The analysis was performed on the pneumonia dataset using convolutional neural network (CNN) based pre-trained models: AlexNet, DenseNet, Residual Network (ResNet50), Inception, and Visual Geometry Group 19 (VGG19). This research will allow hospitals and medical institutes to collaborate while using data mutually. This thesis sets out a clear pathway of effective approaches that can be adopted to enhance diagnosis and improve healthcare. It also reviews state-of-the-art methods for different kinds of medical image detection, together with their limitations and future potential. The benchmark analysis reflects the potential effectiveness of the findings and their future scope. I selected these algorithms through experimental analysis, as they are state-of-the-art methods for effective classification. Due to the complexity of medical images (especially X-ray images), a vast number of images is needed to train the model correctly and precisely. The novel aspect of the research is a hybrid framework that combines individual algorithms with federated learning while ensuring data privacy through a secure aggregation encryption method. Preliminary results showed that ResNet50 and DenseNet perform well compared with the other models in the federated learning framework. The thesis answers the research questions of how to collaborate on data mutually while keeping privacy intact, and which machine learning models can be used in medical image detection. I have also outlined the future scope of the research, which would allow hospitals and medical institutes (including National Health Service (NHS) bodies) to share live-stream data for effective machine learning modelling in a privacy-preserving manner. This thesis presents a hybrid approach of using CNN-based pre-trained models in a federated learning framework for medical image detection and is, to the best of my knowledge, a novel contribution to scientific knowledge.
Citation
Kareem, A. (2024) 'A Privacy-preserving Approach to Effectively Utilise Distributed Data for Medical Disease Detection'. PhD thesis. University of Bedfordshire.
Publisher
University of Bedfordshire
Type
Thesis or dissertation
Language
en
Description
A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy
The following license files are associated with this item:
- Creative Commons
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International