PREDICTION OF 72-HOUR RE-VISIT TO THE EMERGENCY DEPARTMENT USING MACHINE LEARNING

Please use this identifier to cite or link to this item: http://ir-ithesis.swu.ac.th/dspace/handle/123456789/2749

Title:	PREDICTION OF 72-HOUR RE-VISIT TO THE EMERGENCY DEPARTMENT USING MACHINE LEARNING (การทำนายการเข้ารับบริการซ้ำภายใน 72 ชั่วโมงของผู้ป่วยที่แผนกฉุกเฉิน โดยใช้การเรียนรู้ของเครื่อง)
Authors:	PEACHAPONG POOLPOL (พิชญ์พงศ์ พูลผล); Sirisup Laohakiat (ศิริสรรพ เหล่าหะเกียรติ), sirisup@swu.ac.th; Srinakharinwirot University
Keywords:	แผนกฉุกเฉิน; การกลับเข้ามารับบริการซ้ำ; 72 ชั่วโมง; Emergency Department; Re-visit; 72 hours
Issue Date:	24
Publisher:	Srinakharinwirot University
Abstract:	Patients who are discharged from the emergency department (ED) with a physician's approval but whose symptoms then worsen, requiring a re-visit to the ED within 72 hours, may face complications arising from a misjudgment in the initial assessment, which affects patient outcomes relative to hospitalization or referral. This thesis aims to develop machine learning models that predict re-visits to the ED within 72 hours among patients discharged from the ED, supporting emergency physicians in evaluating a patient's risk of re-visit after discharge. The study uses the MIMIC-IV-ED dataset from Beth Israel Deaconess Medical Center in Boston, Massachusetts, USA, which spans 2011 to 2019 and is available on Physionet. The dataset contains 220,378 ED visits, of which 10,090 (4.58%) resulted in a re-visit within 72 hours, and a total of 22 variables, including gender, race, age, mode of arrival, acuity level, body temperature, heart rate, respiratory rate, oxygen saturation, systolic blood pressure, diastolic blood pressure, pain score, length of stay in the ED, and diagnosis. After data preparation and cleaning, the study investigates four techniques for handling data imbalance (Random Oversampling, SMOTE, Random Undersampling, and Class Weight) and uses them to train four types of model (Logistic Regression, KNN Classifier, Random Forest Classifier, and XGBoost Classifier), for a total of 29 models. The findings indicate that data imbalance significantly affects model learning: models trained without imbalance handling reach accuracies of up to 0.95 but a recall of zero, giving an AUC of 0.5, which is equivalent to random prediction. Models trained with imbalance handling performed better, but the imbalance management methods did not differ significantly from one another, and the Logistic Regression models achieved the highest AUC, at 0.61. This suggests that complex models may not be necessary and that simple statistical models, such as Logistic Regression, could be sufficient.
Nevertheless, the moderate AUC values suggest there is room for improvement, possibly because of the limited number of variables and the noisy nature of the data, which were not designed for standardized collection. The key predictive variables identified include diagnosis, gender, age, race, pain score, length of stay in the ED, and heart rate.
Patients who attend the emergency department and are permitted by a physician to go home, but whose symptoms then worsen so that they must return to the emergency department within 72 hours, may reflect an inaccurate physician assessment, which affects treatment outcomes compared with admission to hospital or referral. This research applies machine learning to help predict re-visits to the emergency department within 72 hours, in order to support emergency physicians in assessing patients after they have been permitted to go home. It uses the MIMIC-IV-ED dataset, drawn from the emergency department medical record database at Beth Israel Deaconess Medical Center in Boston, USA, covering 2011 to 2019 and obtained from the Physionet website. After data preparation and cleaning, there were 220,378 emergency department visits in total, of which 10,172 (4.61%) were followed by a re-visit to the emergency department within 72 hours. A total of 22 variables were used, comprising gender, race, age, mode of arrival at the emergency department, severity and urgency (acuity) level, vital signs during the emergency department stay (body temperature, heart rate, respiratory rate, oxygen saturation, systolic blood pressure, diastolic blood pressure, and pain score), length of stay in the emergency department, and diagnosis. Models were trained both on the imbalanced data and on data processed with four imbalance-handling methods (Random oversampling, SMOTE, Random undersampling, and Class weight) in order to compare the performance of four main model types (Logistic regression, KNN classifier, Random forest classifier, and XGBoost classifier), producing 29 models in total. The results show that data without imbalance handling gave models with an Accuracy as high as 0.95 but a Recall of 0 and an AUC of 0.5, no different from random prediction. Once the imbalance was handled, every method gave better results, although the methods did not differ from one another, and the Logistic regression models achieved the highest AUC, 0.61. This indicates that a complex model may not be necessary for this prediction and that a simple statistical model may be sufficient. However, the AUC obtained in this research is not yet high enough for further use, which may be due to the small number of predictor variables and to considerable noise in the data, which were not designed to be collected in a standardized format. Model building identified the important variables as diagnosis, gender, age, race, pain score, length of stay in the emergency department, and heart rate.
URI:	http://ir-ithesis.swu.ac.th/dspace/handle/123456789/2749
Appears in Collections:	Faculty of Science
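
The abstract's observation that accuracy can reach 0.95 while recall is zero and AUC is 0.5 follows directly from the class proportions it reports. The sketch below is illustrative only and is not taken from the thesis: it computes the metrics of a hypothetical classifier that always predicts "no re-visit", using the counts from the English abstract, and then derives inverse-frequency class weights of the form n_samples / (n_classes * class_count). That weighting formula is an assumption about how a "Class Weight" approach might be configured; the thesis does not state which formula was used, and in practice the weights would be computed on the training split rather than on the full dataset. The names Counts, majority_baseline, and balanced_class_weights are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    total: int     # all ED visits in the cleaned dataset
    positive: int  # visits followed by a re-visit within 72 hours


def majority_baseline(c: Counts) -> dict:
    """Metrics for a classifier that always predicts 'no re-visit'."""
    negative = c.total - c.positive
    tp, fn = 0, c.positive  # every re-visit is missed
    tn, fp = negative, 0    # every non-re-visit is predicted correctly
    accuracy = (tp + tn) / c.total
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # With hard 0/1 predictions the ROC curve has a single operating point,
    # so its area reduces to the mean of recall and specificity.
    auc = (recall + specificity) / 2
    return {"accuracy": accuracy, "recall": recall, "auc": auc}


def balanced_class_weights(c: Counts) -> dict:
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    negative = c.total - c.positive
    return {0: c.total / (2 * negative), 1: c.total / (2 * c.positive)}


if __name__ == "__main__":
    # Counts as reported in the English abstract.
    counts = Counts(total=220_378, positive=10_090)
    print(majority_baseline(counts))
    print(balanced_class_weights(counts))
```

With these counts the always-negative baseline scores an accuracy of roughly 0.954 with a recall of 0 and an AUC of 0.5, which is why accuracy alone is uninformative here. Under the assumed weighting, an error on a re-visit case would be weighted about 10.9 against about 0.52 for a non-re-visit case.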

Files in This Item:

File	Description	Size	Format
gs631130346.pdf		4.44 MB	Adobe PDF
