MOTOR INSURANCE FRAUD DETECTION USING TEXT ANALYSIS AND MACHINE LEARNING

Please use this identifier to cite or link to this item: http://ir-ithesis.swu.ac.th/dspace/handle/123456789/1700

Full metadata record

DC Field	Value	Language
dc.contributor	PHURIT AMNUAYCHAI	en
dc.contributor	ภูริต อำนวยชัย	th
dc.contributor.advisor	Subhorn Khonthapagdee	en
dc.contributor.advisor	ศุภร คนธภักดี	th
dc.contributor.other	Srinakharinwirot University	en
dc.date.accessioned	2023-02-08T05:47:50Z	-
dc.date.available	2023-02-08T05:47:50Z	-
dc.date.created	2022
dc.date.issued	2022-05-27
dc.identifier.uri	http://ir-ithesis.swu.ac.th/dspace/handle/123456789/1700	-
dc.description.abstract	The purpose of this research was to analyze text attributes together with categorical attributes in order to build a model using machine learning techniques. The dataset consisted of motor insurance claims from the Asia Insurance Company 1950 (Public) filed from January 2020 to December 2020, with fraudulent-claim data collected from January 2020 to April 2021, for a total of 58,579 records. Machine learning (ML) algorithms including the Naive Bayes classifier, logistic regression, Random Forest, and support vector machine were applied to the dataset. Two methods for handling the imbalanced dataset were compared: random oversampling and SMOTE. The models were evaluated using Accuracy, Precision, Recall, and F1-Score. Random Forest with SMOTE achieved the best results: Accuracy=0.99, Precision=0.803, Recall=0.241, and F1-Score=0.371.	en
dc.description.abstract	The objective of this research was to analyze text data together with other attributes and to apply machine learning techniques to build a model that predicts the probability that a claim is fraudulent, and to compare the performance of classification models in experiments on handling data imbalance. The dataset consisted of motor insurance claims from the Asia Insurance Company 1950 (Public) for claims occurring from January 2020 to December 2020, with claim-fraud data collected from January 2020 to April 2021, for a total of 58,579 rows. Four main experiments were conducted: 1. building models on the imbalanced data; 2. building models on data balanced with Random Oversampling; 3. building models on data balanced with SMOTE; 4. tuning the parameters of the selected model and imbalance-handling method. The approaches were compared using Accuracy, Precision, Recall, and F1-Score. The model with the best results was Random Forest, with SMOTE as the imbalance-handling method, giving Accuracy=0.99, Precision=0.803, Recall=0.241, and F1-Score=0.371 with a training time of only 12 minutes. In the experiments, Random Forest combined with SMOTE gave better results while requiring little training time. Regarding text versus non-text attributes, the model still gave more importance to the non-text attributes.	th
dc.language.iso	th
dc.publisher	Srinakharinwirot University
dc.rights	Srinakharinwirot University
dc.subject	ทุจริตเคลมรถยนต์	th
dc.subject	การวิเคราะห์ข้อความ	th
dc.subject	การเรียนรู้ของเครื่อง	th
dc.subject	ความไม่สมดุลกันของข้อมูล	th
dc.subject	เทคนิคป่าแบบสุ่ม	th
dc.subject	Motor Claim Fraud	en
dc.subject	Text Analytics	en
dc.subject	Machine Learning	en
dc.subject	Imbalance Data	en
dc.subject	Random Forest Technique	en
dc.subject.classification	Computer Science	en
dc.subject.classification	Education	en
dc.title	MOTOR INSURANCE FRAUD DETECTION USING TEXT ANALYSIS AND MACHINE LEARNING	en
dc.title	การตรวจจับการฉ้อโกงประกันภัยรถยนต์โดยใช้การวิเคราะห์ข้อความและการเรียนรู้ของเครื่อง	th
dc.type	Master’s Project	en
dc.type	สารนิพนธ์	th
dc.contributor.coadvisor	Subhorn Khonthapagdee	en
dc.contributor.coadvisor	ศุภร คนธภักดี	th
dc.contributor.emailadvisor	subhorn@swu.ac.th
dc.contributor.emailcoadvisor	subhorn@swu.ac.th
dc.description.degreename	MASTER OF SCIENCE (M.Sc.)	en
dc.description.degreename	วิทยาศาสตรมหาบัณฑิต (วท.ม.)	th
dc.description.degreelevel	-	en
dc.description.degreelevel	-	th
dc.description.degreediscipline	Department Of Computer Science	en
dc.description.degreediscipline	ภาควิชาวิทยาการคอมพิวเตอร์	th
Appears in Collections:	Faculty of Science

Files in This Item:

File	Description	Size	Format
gs631130117.pdf		4.48 MB	Adobe PDF
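
The abstract reports Precision=0.803, Recall=0.241, and F1-Score=0.371 for Random Forest with SMOTE. As a rough consistency check, the sketch below computes the F1-Score as the harmonic mean of precision and recall, and derives all four metrics from confusion-matrix counts. The function names are illustrative and are not taken from the thesis; the only figures used from the record are the reported precision and recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Precision and recall as reported in the abstract.
print(round(f1_score(0.803, 0.241), 3))  # 0.371
```

The result agrees with the reported F1-Score. Passing counts for a heavily imbalanced test set to `classification_metrics` also shows how a very high accuracy can coexist with a low recall when the fraud class is rare, which is why the abstract reports all four metrics rather than accuracy alone.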
