PREDICTION OF POVERTY LEVEL ON CENSUS DATA USING MACHINE LEARNING

Please use this identifier to cite or link to this item: http://ir-ithesis.swu.ac.th/dspace/handle/123456789/1239

Full metadata record

DC Field	Value	Language
dc.contributor	SORNRAM HONGPROM	en
dc.contributor	ศรราม หงษ์พรหม	th
dc.contributor.advisor	Chantri Polprasert	en
dc.contributor.advisor	จันตรี ผลประเสริฐ	th
dc.contributor.other	Srinakharinwirot University. Faculty of Science	en
dc.date.accessioned	2021-09-08T11:43:24Z	-
dc.date.available	2021-09-08T11:43:24Z	-
dc.date.issued	16/8/2021
dc.identifier.uri	http://ir-ithesis.swu.ac.th/dspace/handle/123456789/1239	-
dc.description	MASTER OF SCIENCE (M.Sc.)	en
dc.description	วิทยาศาสตรมหาบัณฑิต (วท.ม.)	th
dc.description.abstract	The purpose of this research is to apply machine learning to the analysis of census data by proposing a feature engineering process that creates household-level characteristics, used in conjunction with the Synthetic Minority Over-sampling Technique (SMOTE), to predict population poverty. Poverty is divided into four levels: Extreme Poverty, Moderate Poverty, Vulnerable Households, and Non-Vulnerable Households. The machine learning models used to predict poverty from the census data are Multilayer Perceptron, Linear Discriminant Analysis, K-Nearest Neighbors, Random Forest, and Extra Trees. After tuning the hyperparameters of each model to improve predictive performance, the experimental results showed that the Random Forest model performed best, with an accuracy of 0.63, a precision of 0.43, a recall of 0.42, and a macro-averaged F1 score of 0.43. The results also showed that SMOTE played a significant role in improving the model's ability to identify poverty. In the proposed model, the three most important features affecting performance were age of the population, years in school, and average years of education for adults, with feature importances of 0.066, 0.065, and 0.059, respectively.	en
dc.description.abstract	งานวิจัยนี้นำเสนอการใช้การเรียนรู้ของเครื่องจักรในการวิเคราะห์ข้อมูลสำมะโนประชากร โดยนำเสนอการใช้กระบวนการปรับแต่งคุณลักษณะเฉพาะของข้อมูล (Feature Engineering) เพื่อสร้างคุณลักษณะเฉพาะของครัวเรือน ร่วมกับวิธีการสุ่มเพิ่มตัวอย่างกลุ่มน้อย (Synthetic Minority Over-sampling Technique) เพื่อใช้ในการทำนายความยากจนของประชากร ซึ่งความยากจนถูกแบ่งออกเป็น 4 ระดับคือ ขั้นรุนแรง, ปานกลาง, มีความเสี่ยงจะยากจน, ไม่มีความเสี่ยงจะยากจน โดยโมเดลการเรียนรู้ของเครื่องจักรที่งานวิจัยนี้นำมาใช้ในการทำนายความยากจนจากข้อมูลสำมะโนประชากรประกอบไปด้วย โครงข่ายประสาทเทียมแบบ Multilayer Perceptron ,การวิเคราะห์การจำแนกประเภทเชิงเส้น ,วิธีการเพื่อนบ้านใกล้ที่สุด ,โมเดลป่าสุ่ม ,ต้นไม้ตัดสินใจจำนวนมาก หลังจากที่ได้ทดลองปรับไฮเปอร์พารามิเตอร์ (hyperparameter) ของแต่ละแบบจำลองเพื่อเพิ่มประสิทธิภาพในการทำนายแล้ว จากการทดลองพบว่าโมเดลการเรียนรู้ของเครื่องจักรแบบป่าสุ่มมีประสิทธิภาพดีที่สุดในการทำนายความยากจนจากข้อมูลสำมะโนประชากรโดยให้ค่าความถูกต้อง (Accuracy) เท่ากับ 0.63 , ความแม่นยำ (Precision) เท่ากับ 0.43 , ความครบถ้วน (Recall) เท่ากับ 0.42 และคะแนน F1 (macro F1) เฉลี่ยเท่ากับ 0.43 โดยจากการทดลองพบว่าเทคนิคการสุ่มเพิ่มตัวอย่างกลุ่มน้อยมีส่วนสำคัญในการเพิ่มประสิทธิภาพของโมเดลในการระบุความยากจน โมเดลที่นำเสนอนี้มีคุณสมบัติที่สำคัญที่สุดสามประการที่มีผลต่อประสิทธิภาพของแบบจำลองอันประกอบไปด้วยอายุของประชากร,จำนวนปีในสถานศึกษาและจำนวนปีการศึกษาโดยเฉลี่ยสำหรับผู้ใหญ่ โดยมีค่าความสำคัญ (Feature Importance) เท่ากับ 0.066, 0.065 และ 0.059 ตามลำดับ	th
dc.language.iso	th
dc.publisher	Srinakharinwirot University
dc.rights	Srinakharinwirot University
dc.subject	การเรียนรู้ของเครื่องจักร	th
dc.subject	การทำนายความยากจน	th
dc.subject	การปรับแต่งคุณลักษณะเฉพาะของข้อมูล	th
dc.subject	การสุ่มเพิ่มตัวอย่างกลุ่มน้อย	th
dc.subject	Machine learning	en
dc.subject	Predicting poverty	en
dc.subject	Feature engineering	en
dc.subject	Synthetic Minority Over-sampling Technique	en
dc.subject.classification	Computer Science	en
dc.title	PREDICTION OF POVERTY LEVEL ON CENSUS DATA USING MACHINE LEARNING	en
dc.title	การทำนายระดับความยากจนจากของข้อมูลสำมะโนประชากรด้วยการเรียนรู้ของเครื่อง	th
dc.type	Master’s Project	en
dc.type	สารนิพนธ์	th
Appears in Collections:	Faculty of Science

Files in This Item:

File	Description	Size	Format
gs601130298.pdf		3.24 MB	Adobe PDF
