Abstract
Road traffic emissions are generally believed to contribute immensely to air pollution, but the effect of road traffic datasets on air quality predictions has not been clearly investigated. This research investigates the effect that traffic datasets have on the performance of Machine Learning (ML) predictive models in air quality prediction. To achieve this, we set up an experiment in which the control dataset comprises only the Air Quality (AQ) dataset and the Meteorological (Met) dataset, while the experimental dataset comprises the AQ dataset, the Met dataset and the Traffic dataset. Several ML models (such as Extra Trees Regressor, eXtreme Gradient Boosting Regressor, Random Forest Regressor, K-Neighbors Regressor, and five others) were trained, tested, and compared on each of these dataset combinations to predict the concentrations of PM2.5, PM10, NO2, and O3 in the atmosphere at various times of the day. The results showed that the ML algorithms react differently to the traffic dataset, although it generally improved the performance of all the ML algorithms considered in this study by at least 20%, with an error reduction of at least 18.97%. This research is limited in terms of the study area, and the results cannot be generalized outside of the UK, as many conditions may not be similar elsewhere. Additionally, only the ML algorithms commonly used in the literature are considered in this research, leaving out a few other ML algorithms. This study reinforces the belief that the traffic dataset has a significant effect on improving the performance of air pollution ML prediction models. There is also an indication that ML algorithms behave differently when trained with a form of traffic dataset in the development of an air quality prediction model. This implies that developers and researchers in air quality prediction need to identify the ML algorithms that behave in their best interest before implementation. This will enable researchers to focus more on the algorithms that benefit from traffic datasets in air quality prediction.
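The control-versus-experimental design described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the relationship between the features and PM2.5 is invented, and the pure-Python k-nearest-neighbours regressor and the helper names (`knn_predict`, `rmse`, `run_experiment`) are hypothetical stand-ins for the models used in the study. It only shows how the same model can be scored on a feature set without traffic (control) and with traffic (experimental), and how a percentage error reduction is computed from the two scores.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=5):
    """Average the targets of the k training rows closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda row: math.dist(row[0], x))[:k]
    return sum(y for _, y in nearest) / k


def rmse(train_X, train_y, test_X, test_y, k=5):
    """Root mean squared error of the k-NN predictions on the test rows."""
    squared = [(knn_predict(train_X, train_y, x, k) - y) ** 2
               for x, y in zip(test_X, test_y)]
    return math.sqrt(sum(squared) / len(squared))


def run_experiment(n=300, seed=0):
    """Score the same model on control (Met only) and experimental (Met + traffic) features."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        temp, wind, traffic = rng.random(), rng.random(), rng.random()
        # Invented relationship for illustration only; not taken from the study.
        pm25 = 10 + 5 * temp - 4 * wind + 20 * traffic + rng.gauss(0, 1)
        rows.append((temp, wind, traffic, pm25))

    split = int(0.7 * n)  # 70% of rows for training, the rest for testing
    train, test = rows[:split], rows[split:]
    y_train = [r[3] for r in train]
    y_test = [r[3] for r in test]

    # Control: meteorological features only (temp, wind).
    control = rmse([r[:2] for r in train], y_train, [r[:2] for r in test], y_test)
    # Experimental: meteorological features plus traffic.
    experimental = rmse([r[:3] for r in train], y_train, [r[:3] for r in test], y_test)

    # Percentage error reduction relative to the control score.
    reduction = 100 * (control - experimental) / control
    return control, experimental, reduction


if __name__ == "__main__":
    control_rmse, experimental_rmse, error_reduction = run_experiment()
    print(f"control RMSE:      {control_rmse:.2f}")
    print(f"experimental RMSE: {experimental_rmse:.2f}")
    print(f"error reduction:   {error_reduction:.1f}%")
```

Because the synthetic target is constructed to depend on the traffic feature, the experimental score is better by design; the sketch says nothing about the size of the effect on real data.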
More Information
Identification Number: | https://doi.org/10.1108/JEDT-10-2021-0554 |
---|---|
Status: | Published |
Refereed: | Yes |
Publisher: | Emerald |
Additional Information: | Copyright © 2022, Emerald Publishing Limited |
Uncontrolled Keywords: | 09 Engineering, 12 Built Environment and Design |
Depositing User (symplectic) | Deposited by Bento, Thalita |
Date Deposited: | 09 Dec 2022 12:05 |
Last Modified: | 12 Jul 2024 05:43 |
Item Type: | Article |
Download
Note: this is the author's final manuscript and may differ from the published version which should be used for citation purposes.
License: Creative Commons Attribution Non-commercial