THE IMPACT OF FEATURE EXTRACTION AND DATA IMPUTATION ON PM2.5 FORECASTING MODEL FOR BANGKOK AREA


Files

gs581130331.pdf (3.58 MB)

Date

30/8/2020

Publisher

Srinakharinwirot University

Abstract

The last few years have seen a dramatic increase in PM2.5 air pollution in Thailand’s major cities. Various works have tried to develop efficient Long Short-Term Memory (LSTM) deep neural network models for PM2.5 concentration forecasting. However, the impact of data imputation and feature extraction on model performance in this context has received little study. In this research, we imputed missing values using Kalman Smoothing and Linearly Weighted Moving Average, and we used an LSTM Autoencoder (LSTM AE) for feature extraction. Using the Chokchai Police Station in Bangkok as a case study for predicting PM2.5 over the next 24 hours, we demonstrated that training LSTM models on imputed data improves performance by more than 7 percent overall in root mean square error (RMSE) and by more than 10 percent overall in mean absolute error (MAE). The improvement from LSTM AE varies with the forecast time step; forecasts 22 to 24 hours ahead tend to favor the use of LSTM AE.
In the past few years, PM2.5 fine-particulate air pollution has increased rapidly in Thailand’s major cities. Several studies have tried to develop Long Short-Term Memory (LSTM) techniques, a type of deep neural network model, for forecasting PM2.5 concentration. However, relatively little work has examined how missing-data imputation and feature extraction affect LSTM performance in PM2.5 forecasting. In this research, the researcher imputed missing data with Kalman Smoothing and Linearly Weighted Moving Average and used an LSTM Autoencoder (LSTM AE) for feature extraction. The dataset used to forecast PM2.5 came from measurements at the Chokchai Metropolitan Police Station, and the research forecasts PM2.5 values 24 hours ahead. The researcher shows that the performance of the LSTM model increases by about 7 percent overall when evaluated with Root Mean Square Error (RMSE) and by more than 10 percent when evaluated with Mean Absolute Error (MAE). The improvement from LSTM AE differs across forecast horizons depending on the number of hours ahead; in particular, forecasts 22 to 24 hours ahead tend to achieve good performance.
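
As an illustration of the imputation and evaluation steps named in the abstract, the following is a minimal Python sketch and not the thesis code. The window size `k`, the `1 / (d + 1)` weighting by distance `d`, and the function names `lwma_impute`, `rmse`, and `mae` are assumptions chosen for this example; the record does not state the exact settings the author used.

```python
import math


def lwma_impute(series, k=4):
    """Fill None entries with a linearly weighted moving average.

    Assumption: an observation at distance d from the gap gets weight
    1 / (d + 1), and only original (non-imputed) values inside the
    window of k steps on each side are used.
    """
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        weighted_sum = 0.0
        weight_total = 0.0
        for d in range(1, k + 1):
            weight = 1.0 / (d + 1)
            for j in (i - d, i + d):
                if 0 <= j < len(series) and series[j] is not None:
                    weighted_sum += weight * series[j]
                    weight_total += weight
        # Leave the gap unfilled if the window holds no observations.
        if weight_total > 0:
            filled[i] = weighted_sum / weight_total
    return filled


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)
```

A call such as `lwma_impute(hourly_pm25, k=4)` would fill isolated gaps in a hypothetical list of hourly readings `hourly_pm25`, after which `rmse` and `mae` can compare forecasts from models trained with and without imputation.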

Description

MASTER OF SCIENCE (M.Sc.)

Keywords

Data imputation, LSTM, LSTM autoencoder, PM2.5 forecasting, Feature extraction

URI

https://ir-ithesis.swu.ac.th/handle/123456789/953

Collections

Faculty of Science
