Classification of emails for Internet fraud detection and prevention through the application of artificial intelligence techniques.
Abstract
Internet fraud, a social engineering threat, is one of the forms of cybercrime that causes huge financial losses to the global economy. Recent data and statistics reveal that Internet fraud continued to surge even during the global Covid-19 pandemic, with many cases reported. Beyond financial losses, victims of Internet fraud in many instances suffer from other problems such as psychological trauma and job or business loss. Internet fraud is committed by cyber criminals from different countries across the globe, but some forms of Internet fraud have been identified and linked directly to specific countries. Advance fee fraud has been particularly linked to Nigeria, owing to the participation of Nigerians in such fraud and the tracing of many such frauds to the country. Tackling Internet fraud was among the reasons that led the Nigerian government to establish the Economic and Financial Crimes Commission (EFCC), an agency responsible for fighting Internet fraud, corruption and other financial crimes. This research addresses the detection and classification of Internet fraud through the application of artificial intelligence. A classifier is designed and implemented that classifies incoming email, based on its content, as either fraudulent or non-fraudulent, applying the natural language processing model of Bag of Words. The research focuses on Advance fee fraud that originates from Nigeria, with part of the research data collected from Nigeria’s law enforcement agency, the Economic and Financial Crimes Commission (EFCC). To the best of my knowledge, this is the first published research work to collect such a dataset from the commission. The classifier is designed and implemented in the English language, and all the datasets used for training and testing are in English; therefore the classifier is applicable to and can be used by English-speaking countries other than Nigeria.
For non-English-speaking countries, the Bag of Words can be translated from English into their language and still be used for Internet fraud detection and classification. The classifier was implemented and tested using six machine learning algorithms (Decision Tree, Discriminant Analysis, Ensemble, Logistic Regression, Nearest Neighbour and Support Vector Machine) to identify the one(s) that produce the highest classification/detection rate compared to other similar published work. This research makes four main contributions to the existing pool of academic research: (1) It focuses on and identifies a specific Internet fraud directly linked to Nigeria. (2) It identifies and generates unique features (Bag of Words) of Advance fee fraud that originates from Nigeria. This is a unique contribution to research on Internet fraud. (3) Using artificial intelligence techniques, a classifier is designed and implemented that successfully detects fraudulent emails and distinguishes them from non-fraudulent ones based on their content. (4) The results of the classifier are presented: the classifier produced its best result with the Decision Tree algorithm, achieving 100% detection and classification accuracy on 7500 emails, consisting of 4500 fraudulent and 3000 non-fraudulent emails.
Citation
Hamisu, M. (2021) 'Classification of emails for Internet fraud detection and prevention through the application of artificial intelligence techniques'. PhD thesis. University of Bedfordshire.
Publisher
University of Bedfordshire
Type
Thesis or dissertation
Language
en
Description
A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy.
Collections
The following license files are associated with this item:
- Creative Commons
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International