Abstract:
Anomaly detection is the process of finding outlying records in a given data set. The problem has grown in importance as data sets have grown in size, together with the need to efficiently extract outlying records that can carry important indications in real-life problems. Anomaly detection is applied in many different sectors, and there are many approaches to solving it. The most widely applicable are unsupervised approaches, as they do not require labeled data. The aim of this thesis is to study a well-known anomaly detection technique on a specific application, and to find ways to tune and optimize it for better precision and results. The server chosen in this thesis is the Short Message Service Centre server, which is used in the telecommunications field to handle and store messages (SMSs) so that they can be properly delivered to the appropriate destination. This server was studied in detail to decide which data should be taken into consideration. Once that was done, a script was written to gather these data, which went through cleaning (preprocessing) and then, after being analyzed in depth and divided into many subsets, through a labeling process. After extensive research, the decision tree algorithm was chosen for implementation and applied to the labeled data. The original tree gave a precision of 98.82%. Different types of tuning were then performed to increase the precision, which reached 99.38% with an effective accuracy of 99.98% (relative to our case). Our approach showed that applying this technique to this type of server is efficient and leads to very good results, which can be improved further in future studies.
Description:
M.S. -- Faculty of Natural and Applied Sciences, Notre Dame University, Louaize, 2015; "Thesis submitted in partial fulfillment of the requirements for the joint degree of Master in Computer Science in the Department of Computer Science in the Faculty of Natural and Applied Sciences of Notre Dame University."; Includes bibliographical references (leaves 77-80).