Forecasting Air Quality in Taiwan by Using Machine Learning

Scientific Reports (nature.com)

Subjects: Environmental impact, Environmental monitoring

Abstract

This study proposes a gradient-boosting-based machine learning approach for predicting the PM2.5 concentration in Taiwan. The proposed mechanism is evaluated on a large-scale database built by the Environmental Protection Administration and the Central Weather Bureau, Taiwan, which includes data from 77 air monitoring stations and 580 weather stations performing hourly measurements over 1 year. By learning from past records of PM2.5 and neighboring weather stations' climatic information, the forecasting model works well for 24-h prediction at most air stations. This study also investigates the geographical and meteorological divergence for the forecasting results of seven regional monitoring areas. We also compare the prediction performance between Taiwan, Taipei, and London; analyze the impact of industrial pollution; and propose an enhanced version of the prediction model to improve the prediction accuracy. The results indicate that Taipei and London have similar prediction results because these two cities have similar topography (basin) and are financial centers without domestic pollution sources. The results also suggest that after considering industrial impacts by incorporating additional features from the Taichung and Thong-Siau power plants, the proposed method achieves significant improvement in the coefficient of determination (R²) from 0.58 to 0.71. Moreover, for Taichung City the root-mean-square error decreases from 8.56 for the conventional approach to 7.06 for the proposed method.
Introduction

The importance of air quality prediction has increased rapidly for providing citizens with superior quality of life and for enabling effective environmental sensing research. Numerous studies have linked PM2.5 (particulate matter) to health problems. For example, previous studies have indicated that poor air quality potentially causes various health problems, including stroke, ischemic heart disease, lung cancer, respiratory infection, and chronic obstructive pulmonary disease [1, 2]. Thus, air quality has recently become one of the most important targets of national environmental monitoring and forecasting tasks for many countries in Europe, North America, and Asia [3]. For example, authorities in London cooperate with an environmental research group that provides data for independent scientific studies. The United States Environmental Protection Agency provides raw data, statistics, and visualization for the same purpose [4, 5]. In Asia, Taiwan's Environmental Protection Administration (EPA) set up 19 weather stations and began to monitor the air quality in major cities in 1980. In 1993, the Taiwanese government developed the Taiwan Air Quality Monitoring Network, which includes a large-scale deployment of air monitoring stations, to record and forecast the air quality [6]. This network has been operating for over 25 years and its data are open to the public. Apart from the Taiwanese government, the Taiwanese national research institution Academia Sinica partnered with private industry and civil communities to initiate the AirBox collaborative project to monitor the PM2.5 concentration [3, 7, 8, 9] by using a large number of small devices. This project has made collecting PM2.5 data more convenient and made independent research easier. However, PM2.5 values from the AirBox project have low correlation coefficients with the EPA data. Only a moderate positive relationship is observed between them [10, 11], and the reliability of these data is challenged owing to the lack of hardware calibration. Consequently, this study uses the reliable large-scale data collected from the EPA to develop an accurate air quality forecast model for citizens in Taiwan.

Physical models and machine learning models are two types of techniques to forecast the air quality. In the 1990s, various atmospheric dynamics methods were applied to build a physical model with complicated equations [12–17]. These methods required high-speed computers with large memories to calculate a large number of iterations. They had limited accuracy and could not identify the importance of new and old data [18]. Artificial intelligence has recently attracted considerable attention, and various machine learning approaches have been extensively implemented to model data in numerous applications. Air quality forecasting is one promising application. For example, an artificial neural network was applied to forecast air pollutants in Athens, Greece, in 1999 [19]. Another study applied a genetic algorithm to air quality prediction for extracting sufficient features and successfully predicted the air quality in Tianjin, China, in 2011 [20]. Previous studies have indicated that with the massive collection of data and development of machine learning techniques, a detailed air quality prediction model is worth further study [7, 8].

Among various machine learning algorithms, tree-based methods have attracted interest for air quality prediction. A study conducted in 2010 indicated that a decision tree exhibits the best performance when predicting CO2 concentrations in Japan [21]. In 2016, a random forest model was applied to predict the air quality in Shenyang, China. A recent study successfully predicted the air quality index with high precision by combining urban sensing data including meteorology, road information, the real-time traffic status, and the points of interest distribution [22]. Specifically, the tree-based learning models generally provide the highest accuracy [23, 24]. Besides, in recent years, tree-based models have also been applied to the prediction of nitrogen oxides and ozone with high accuracy [25, 26]. Therefore, this study selected a gradient boosting model (GBM) on decision trees to generate an air quality forecasting model in Taiwan.

This paper has three main contributions. First, this study extracted the temporal sequences of highly dimensional and heterogeneous sensor data from both air monitoring and weather stations into a feature domain. The proposed mechanism based on the gradient boosting algorithm and extracted features efficiently predicted the air quality PM2.5 for the next 24 h at individual air monitoring stations. Through experiments conducted over 1 year, this study investigated the geographical and meteorological divergence of forecasting results and performed measurements at seven regional monitoring areas in Taiwan. Second, the prediction performance between Taiwan, Taipei, and London was compared under the same methodology. The results indicate that Taiwan has a more complex environment than London does due to domestic and overseas (mainly from China) sources of PM2.5 and the geographic topography of Taiwan. An interesting observation is that the performance for Taipei is similar to that for London, with a stationary prediction performance of approximately 6 in terms of the root-mean-square error (RMSE) and 0.75 in terms of the coefficient of determination (R²). These results may be explained by the fact that these two cities have similar topography (basin) and are financial centers without domestic pollution sources. The results from an alternative London database successfully verified the proposed prediction mechanism. Third, we focused on central Taiwan, especially Taichung City, because air pollution has worsened there in recent years. The main reason for this change is the fact that the second-largest coal-fired power station in the world (i.e., the Taichung Power Plant) is located in this region. This domestic industrial source makes prediction difficult. Thus, we propose an enhanced fusion version of the prediction model by incorporating additional features from Continuous Emission Monitoring Systems (CEMS) to improve the prediction accuracy. After considering the industrial impact, the proposed method achieved significant improvement on average for R² (0.58 to 0.71) and reduced the RMSE (8.56 to 7.06) compared with the conventional approach followed in Taichung City. To the best of our knowledge, this study is the first to combine the use of both air station and industry records for air quality prediction in complex environments.

Geographical and Meteorological Divergence of PM2.5 Concentration Between Regional Monitoring Areas

In Taiwan, the PM2.5 concentration has been a severe environmental problem and has attracted considerable attention from researchers and the general public. Previous studies have focused on the source contributions of air pollution. Air pollution can be categorized as locally produced (mainly from motor vehicles and power plants) or long-range transported (LRT) pollution (mainly from China) [27, 28]. Some studies have focused on the influence of meteorological conditions on the PM2.5 concentration and have indicated that the PM2.5 concentration is strongly influenced by the southwesterly monsoonal flow (SWM) and northeasterly monsoonal flow (NEM). Geographical divergence is an important factor when discussing the effects of meteorological conditions and synoptic weather on the air quality in Taiwan [29]. Thus, the following section discusses the geographical divergence and temporal variations of the measured PM2.5 concentration in Taiwan.

The large-scale database used in this study was built by the official EPA and Central Weather Bureau (CWB). The database contains more than 260,000 samples taken in 2017 from 77 air monitoring stations (from the EPA) and 580 weather stations (from the CWB) in Taiwan. Fig. 1(a) displays the geographical positions of the 77 air stations. The map indicates that Taiwan island is surrounded by the ocean and has the Central Mountain Range (CMR) running from the north to the south. The complex terrain and different meteorological conditions create spatial and seasonal variations in the PM2.5 concentration in Taiwan. Considering the different meteorological and geographical conditions, Taiwan has been divided into seven regions by the EPA for regional air quality monitoring. As illustrated in Fig. 1(a), the seven regions are northern Taiwan (NT), Chu-Miao (CM) area, central Taiwan (CT), Yun-Chia-Nan (YCN) area, Kao-Ping (KP) area, Hua-Dong (HD) area, and Yilan (YI) [30]. In addition to the geography, PM2.5 is also heavily influenced by the weather conditions. To offer accurate weather conditions and predictions, the CWB has set up 580 weather stations in Taiwan, as depicted in Fig.
1(b).

Figure 1. The (a) 77 air quality monitoring stations and (b) 580 weather stations set up across the seven air quality regions in Taiwan.

Figure 2 illustrates the monthly temporal variation of the mean PM2.5 concentration in the seven regions, where YI has missing data except for December. This figure indicates that the PM2.5 concentration reaches a peak in most regions during spring (February, March, and April), drops dramatically during summer (May, June, and July), and starts to go upwards during autumn (September, October, and November) and winter (November, December, and January).

Figure 2. (a) Monthly variations of the mean PM2.5 concentration in the seven air monitoring regions and (b) their topography.

Starting from autumn (September, October, and November), the prevailing wind (NEM) affects the northern part of the island, thereby easily dispersing and diluting air pollutants. The NEM becomes strong during the winter; however, it also brings LRT pollutants from China, which affect the air quality. Thus, air pollution is severe because the NEM is blocked by CMR; therefore, the southern part of Taiwan becomes a leeside region and forms stagnant wind conditions that increase the accumulation of air pollutants [30, 31]. By contrast, during spring a stable weather system without strong winds is observed, especially in the northern part of the island [29]. Besides, the boundary layer height is also one of the factors for decreasing PM2.5 during summer. On the contrary, during autumn and winter, temperature inversion limits the boundary layer height and results in increasing PM2.5 concentration. Figure 2 also indicates that Taiwan has superior air quality during the summer because the prevailing wind (SWM) easily disperses and dilutes air pollutants. Moreover, the subtropical high pressure over the Pacific Ocean in the summer causes high convection of air mass, which leads to the vertical dispersion of pollutants.

After examining the temporal variations, we investigate the geographical divergence. Fig. 2 indicates that YCN and KP have the worst air quality. According to the EPA standards, when the PM2.5 concentration is larger than 35.4 μg/m³, the air quality could cause health problems to citizens. From Fig.
2, the YCN and KP regions exceed the threshold values for 2 and 5 months, respectively. By contrast, NT and HD have the best air quality because NT is a financial and business center and HD is an undeveloped green space. As expected, these two regions have only minor industrial pollution. In Taiwan, HD is the most livable place with good air quality because the mean PM2.5 concentration is below 15.4 μg/m³ for several months. An interesting region is CT, where the mean PM2.5 concentration is lower than that in only KP and YCN. The central part of Taiwan is subject to both meteorological conditions and topographic effects because most residents live in the basin area surrounded by hills and mountains. Furthermore, the second-largest coal-fired power station in the world, that is, the Taichung Power Plant, is located in this region. The power plant accounts for 15% of Taichung City's PM2.5 concentration [32].

Figure 3 shows the estimate of the PM2.5 concentration of Taiwan in December using the IDW (inverse distance weighted) technique, which is an interpolation method via spatial average for unseen locations [33, 34]. This figure shows that the NEM is strong in December, resulting in better air quality in northern Taiwan. However, the NEM is blocked by CMR. This results in weak wind in southern Taiwan and a bad atmospheric diffusion condition.

Figure 3. Estimate of the PM2.5 concentration of Taiwan in December.

Results

The data provided by the EPA and CWB for 2017 were collected and processed. These data contained more than 260,000 samples covering 264,799 h and 36 attributes. In this study, 81 independent parameters were selected from each air station and its four nearest weather stations as features, and gradient boosting with regression was used as the learning model to forecast the PM2.5 concentration for the next 24 h. The feature extraction process and learning algorithm are described in detail in the following sections. This study used five-fold cross validation to justify the prediction results. The entire data were randomly split into five folds, of which four were used for training and one was used for testing.

PM2.5 Prediction Results

The RMSE, normalized RMSE (NRMSE), and R² were used as prediction performance metrics in the experiments. The RMSE indicates the mean fluctuation between the observed and predicted values. A smaller RMSE and NRMSE indicate a superior forecasting performance. R² measures the proportion of the variance of observed values that is predictable in the case of multiple regression. When R² = 1, the observed values are perfectly predicted. Fig. 4 displays the annual average results of forecasting performance of 77 air stations in the seven regions of Taiwan. The results in Fig. 4(a) generally agree with the geographical divergence. Most stations in the YCN, KP, and CT regions are distributed in the upper left. Furthermore, we can observe that the similarity between Fig. 4(a,b) lies in some boundary stations including stations 34, 68, and 76. Although station 61 shows the highest NRMSE instead of station 69, both stations show similar RMSE, as shown in Fig. 4(a). In addition, some trends are consistent in Fig. 4(a,b), such as the distribution of HD regions (the lower right), CT regions (the upper left), and NT regions (middle). Next, the separation between regions is clearly reduced by the normalization. For example, YCN and NT are clearly separated in Fig. 4(a) whereas they are mixed in Fig. 4(b). Using NRMSE, it is also difficult to distinguish the KP regions among the middle ranges. Thus, each performance metric has its own advantages and disadvantages. NRMSE provides a fair prediction performance comparison among monitoring stations, while RMSE makes the comparison of air quality regions easier owing to its higher separation ability. In addition to the large RMSE and NRMSE, the small R² values imply that these regions may contain an increased number of hidden factors that may influence the prediction model. Thus, an interesting observation is that although YCN and KP have worse air quality than CT, the R² value of CT is lower than that of YCN and KP. The R² value for station 34 (Fengyuan station in CT) is only 0.41, and is the worst among all stations. This issue is discussed in detail in the following paragraph. By contrast, most stations in the HD, YI, and NT regions are distributed in the lower right. For stations 20, 21, 76, and 77, R² is approximately equal to 0.9, which indicates that the forecasting model works nearly perfectly for these stations. At some stations, the RMSE values are smaller than 4 and the NRMSE values are smaller than 0.2, which indicates that the prediction value approaches the ground truth. Among these stations, we observe that although station 68 (Hengchuen) belongs to the KP region, its performance is completely different from that of other stations in this region. The reason for this observation may be that station 68 is located in a national park in south Taiwan, where few anthropic activities occur, which results in a low RMSE and NRMSE.

Figure 4. Annual average results of PM2.5 prediction performance for the 77 air stations in Taiwan.

Figure 5 shows the standard deviations (STD) of RMSE and R² across the 5-fold validation. We can observe that the maximum STD of RMSE is less than 1 and that of R² is less than 0.1. In addition, this figure also indicates that almost all monitoring stations show similar and stable STD, while only a few stations in the NT region show relatively higher STD. Thus, the forecast models are not overfitting for the air monitoring stations in Taiwan during the experiments.

Figure 5. STD of (a) RMSE and (b) R² across the 5-fold validation in seven regions.

Figure 6 displays calendar heatmaps of predicted, observed, and residual values at station 68 (Hengchuen) and 69 (Chaozhou). In Fig. 6(a), Hengchuen exhibits small residuals in the PM2.5 concentration over the year, a low PM2.5 concentration (maximum is only approximately 30 μg/m³), and minor differences between the observed and predicted PM2.5 values. By contrast, as displayed in Fig. 6(b), Chaozhou exhibits large residuals in the PM2.5 concentration over the year, a high PM2.5 concentration (maximum is approximately 100 μg/m³), and significant differences between the observed and predicted PM2.5 values. Besides, Fig. 6(b) illustrates the higher observed values in winter (November, December, and January) and spring (February, March, and April) at Chaozhou. The results indicate that although both Hengchuen and Chaozhou are in the KP region, their forecasting performance is significantly different. Hengchuen is located in a national park, whereas Chaozhou is very close to the Linyuan industrial area. Therefore, in addition to the between-region divergence, the in-region difference is also an important issue when considering the prediction performance of individual air monitoring stations.

Figure 6. Comparison between the observed and predicted PM2.5 values for 2017 between (a) station 68 (Hengchuen) and (b) station 69 (Chaozhou).

Comparison Between Taiwan and London

We applied another public air quality database to validate the aforementioned study results. Recently, the air quality data for London and southeast England were made available for independent scientific measurements and assessment [35, 36]. Therefore, we selected London as a benchmark region for comparison. By using the same analysis procedure as that for Taiwan, experiments were conducted to collect hourly data from air and weather stations for 6 years (2012–2018) in London to provide a fair comparison of air quality prediction between different countries [37].
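
The Taiwan and London comparison rests on the three metrics introduced in the Results: RMSE, NRMSE, and R². The sketch below shows one way to compute them. It is an illustration under stated assumptions rather than the authors' code: the paper does not say how the NRMSE is normalized, so division by the mean observed value is assumed here, and the sample values are made up.

```python
from math import sqrt
from statistics import fmean


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    return sqrt(fmean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def nrmse(observed, predicted):
    """RMSE divided by the mean observed value (this normalization is an assumption)."""
    return rmse(observed, predicted) / fmean(observed)


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up hourly PM2.5 values in ug/m3, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
scores = (rmse(observed, predicted), nrmse(observed, predicted), r2(observed, predicted))
```

Because the NRMSE divides by a station-level quantity, it puts clean and polluted stations on a comparable scale, which matches the observation that regions are less separated under NRMSE than under RMSE.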
Figure 7 illustrates the comparison results, where the empty blue circles and solid gray circles represent air stations in Taiwan and London, respectively. The subfigure depicts the index abbreviations of air stations in London [35]. Taiwan has 77 air stations for detecting PM2.5 concentration, whereas London has only 11. The full names, acronyms, and environments of London stations are shown in Table 1. The R² in London is …, whereas that in Taiwan is 0.4–0.9. The RMSE results exhibit the same tendency. The RMSE in London is 5–7, whereas that in Taiwan is 3–13. These results indicate that the air quality forecasting performance in London is significantly superior to that in Taiwan even when using the same algorithm. This may be attributed to three main factors. First, from the perspective of geography, London is located in a basin, whereas Taiwan has considerably complicated terrain. Second, London is a financial center without domestic pollution sources. By contrast, most pollution in Taiwan originates from domestic sources, such as in KP and YCN. Third, Taiwan has transboundary issues because air pollution may be transported from China. These factors explain why London significantly outperforms Taiwan and also indicate that dedicated air quality forecasting in Taiwan is challenging.

Figure 7. Comparison of the forecasting performance between Taiwan, Taipei, and London.

Table 1. The acronyms for the London stations.

Next, we selected stations in the Taipei region in Taiwan (solid blue circles in Fig. 7) for comparing with stations in London for two reasons. First, most northern stations are located in Taipei, which lies in a basin. In addition, Taipei is a financial center without domestic pollution sources. Thus, Taipei is similar to London in terms of the topography, pollution sources, and number of air stations. The comparison in Fig. 7 indicates that Taipei has similar prediction accuracy compared with that of London from the viewpoint of both the RMSE and R² values. Fig. 7 indicates that the RMSE in Taiwan is 3–13 and that in Taipei is 5–8. In other words, PM2.5 prediction in Taipei is more accurate than that at other stations in Taiwan. The results obtained from an alternative database for London verified the advantages of the proposed prediction mechanism. If a city has a pure topography (basin) and zero domestic pollution sources (e.g., Taipei and London), the PM2.5 prediction performance is expected to be stationary and reliable with an RMSE of approximately 6 and an R² value of 0.75.

Effect of Adding Industry-Related CEMS Features

In recent years, air pollution has worsened in CT, especially in Taichung City, for reasons that remain unclear. Domestic pollution could be one of the reasons because the coal-fired power station in Taichung increased power generation in 2017. The power generation was increased at the directive of the Taiwan government to satisfy future energy requirements and to promote the nuclear-free homeland policy. The results indicate that the forecasting performance is poor in CT, as displayed in Fig. 4. In particular, station 34 has the lowest R² value. This result is consistent with the coal-fired power station's policy for increasing power generation. We explored the possibility of using additional industry-related features to achieve an enhanced prediction performance. The effect of adding industry-related features may also provide some scientific evidence for the distribution of pollution sources. The question is whether improving the prediction accuracy is possible when considering industry-related information. To answer this question, we used the alternative CEMS database to investigate the effect of adding industry-related features on the PM2.5 prediction. The emission of factory gas through chimneys has been managed and monitored by the CEMS of the EPA since 1993. CEMS detects several parameters, including the CO, SO2, O2, nitrogen oxides, and hydrogen chloride concentrations; temperature; opacity; and emission flow rate [38]. Besides, wind speed and direction are applied for considering diffusion of the detected items. CEMS has collaborated with the EPA since 2003, and the data have recently been made publicly available for academic research.
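
The question posed above amounts to widening each input vector with plant-emission attributes and refitting the same gradient-boosted regressor. The sketch below illustrates that idea and is not the authors' implementation. It assumes squared-error loss, depth-1 regression trees (stumps), and a fixed shrinkage factor in place of the per-round line search for the step size described in the Methods. The data are synthetic: column 0 stands in for a lagged PM2.5 feature and column 1 for a hypothetical emission feature. All function names are illustrative.

```python
from math import sqrt
from statistics import fmean


def fit_stump(rows, residuals):
    """Least-squares depth-1 tree: (feature, threshold, left_mean, right_mean)."""
    best = None
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(rows, residuals) if row[j] <= thr]
            right = [r for row, r in zip(rows, residuals) if row[j] > thr]
            lm, rm = fmean(left), fmean(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    if best is None:  # no feature has two distinct values: constant fallback
        m = fmean(residuals)
        return (0, float("inf"), m, m)
    return best[1:]


def stump_value(stump, row):
    j, thr, lm, rm = stump
    return lm if row[j] <= thr else rm


def fit_gbm(rows, y, rounds=200, shrinkage=0.1):
    """Gradient boosting for squared-error loss; returns (F0, shrinkage, stumps)."""
    f0 = fmean(y)  # F0: the best constant prediction under squared error
    pred = [f0] * len(y)
    stumps = []
    for _ in range(rounds):
        # The negative gradient of 0.5 * (y - F)^2 with respect to F is the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        pred = [p + shrinkage * stump_value(stump, row) for p, row in zip(pred, rows)]
    return f0, shrinkage, stumps


def predict(model, row):
    f0, shrinkage, stumps = model
    return f0 + shrinkage * sum(stump_value(s, row) for s in stumps)


def train_rmse(model, rows, y):
    return sqrt(fmean([(yi - predict(model, row)) ** 2 for row, yi in zip(rows, y)]))


# Synthetic toy data (not measurements).
rows_full = [(a, b) for a in range(5) for b in range(4)]
y = [2.0 * a + 3.0 * b for a, b in rows_full]
rows_base = [(a,) for a, _ in rows_full]  # same samples without the extra column

rmse_base = train_rmse(fit_gbm(rows_base, y), rows_base, y)
rmse_full = train_rmse(fit_gbm(rows_full, y), rows_full, y)
```

On this toy data `rmse_full` comes out lower than `rmse_base`, because the target depends on the second column and the base model cannot see it. That is the mechanism by which an informative emission feature can help. The example reports training error only and says nothing about the size of the effect on real station data.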
Figure 8 displays the geographical positions of Taichung's coal-fired power plants and air stations 30–34 in Taichung City, which are indicated by the yellow area in the map. In addition to the large coal-fired power station, Fig. 8 also shows the geographical position of the Thong-Siau gas-fired power plant, which is the second-largest station in CT and is close to the air stations in Taichung City. Therefore, considering these two thermal power plants when making PM2.5 predictions in Taichung is reasonable.

Figure 8. Geographical positions of the air stations and power plants in Taichung.

Figure 9 illustrates the effect of adding industry-related features in Taichung, in which all the CEMS data were included as features in the training model. The figure indicates that after considering the CEMS, the forecast performance significantly improves in terms of the RMSE and R² values. In general, the highest improvement is observed when adding Taichung Power Plant's attributes. Upon also adding Thong-Siau power plant's features, stations 34 and 32 exhibit a similar performance, stations 31 and 33 exhibit marginal improvement in the performance, and station 30 exhibits a marginal decrease in the performance.

Figure 9. CEMS impact of the Taichung and Thong-Siau power plants on five air monitoring stations in Taichung City, where the circles represent the performance without CEMS, triangles represent the impact of adding Taichung Power Plant's attributes, and squares represent the impact of adding features from both the Taichung and Thong-Siau power plants.

An interesting observation is that the improvement in the forecast performance is highly related to the distance between the power plants and the air stations. For example, stations 30–33 are closer to Taichung Power Plant than station 34, as shown in Fig. 8. This may explain why the improvement upon adding Taichung Power Plant is the least significant at station 34 compared with the other air stations. Besides, station 34 is located near CMR and has a higher altitude than nearby stations. It has been influenced by both the Taichung and Thong-Siau power plants and suffers from high PM2.5 values due to bad atmospheric diffusion conditions. Furthermore, station 34 is close to the Thong-Siau power plant, and the improvement upon adding Thong-Siau power plant's features is the largest among all the air stations. Similarly, station 30 is the farthest from the Thong-Siau power plant. This may explain why the forecast performance decreases marginally after considering the features of the Thong-Siau power plant. In summary, the R² values increase by 0.13, whereas the RMSE decreases by 1.50 on average (Fig. 9). These results indicate that the PM2.5 prediction performance in Taichung improves significantly on considering the neighboring coal-fired power plants.

Next, we follow the same procedure to determine whether the same trend can be obtained in Taipei City. Although no industrial power plants exist in Taipei, the CEMS data, including data from garbage incinerators at Mucha, Neihu, and Beitou, were applied as alternative industry-related features to determine the performance improvement for air stations in Taipei. Figure 10 depicts the comparison between Taichung and Taipei. This figure indicates that although adding CEMS features still provides a marginal performance improvement in Taipei, the degree of improvement is not comparable to that in Taichung. Industrial pollution appears to have a stronger impact in Taichung than in Taipei, especially in terms of the PM2.5 prediction. This observation fits recent developments of Taipei and Taichung. Taipei City is a financial and business center in Taiwan with minor industrial impact, whereas Taichung City experiences increased air pollution due to coal-fired power plants.

Figure 10. Comparison of the CEMS impact between Taichung and Taipei. The prediction results of Taipei with and without CEMS are represented by blue solid circles and diamonds, respectively. The prediction results of Taichung (shown in Fig.
9) with and without CEMS are represented by green solid circles and squares, respectively, for a comparison.

Discussion

The air quality forecast has aroused attention from governments and scientists for improving environmental quality of citizens and upgrading environmental sensing studies. Because numerous scientific studies have linked PM2.5 (particulate matter) to health problems, this study proposes a machine learning approach based on GBM to predict the PM2.5 concentration in Taiwan. By extracting temporal sequences from air monitoring and weather stations into features, the proposed mechanism efficiently predicts the air quality in terms of the PM2.5 concentration for the next 24 h at individual air monitoring stations. By performing experiments over 1 year, this study investigated the geographical and meteorological divergence of forecasting results and measurements at seven regional monitoring areas in Taiwan. The results from the alternative London database verified the proposed prediction mechanism. We observe that Taiwan has a more complex environment than London due to domestic pollution sources, overseas pollution sources (mainly from China), and geographic topography; however, Taipei exhibits a similar prediction performance to London. This may be because these two cities have similar topography (basin) and are financial centers without domestic pollution sources. Finally, because domestic industrial pollution makes prediction difficult, this study proposes an enhanced fusion version of the prediction model by incorporating additional features from the CEMS to improve the prediction accuracy. After considering the industrial impact, the proposed method achieves significant improvement in the R² values (0.58 to 0.71 on average) and considerably decreases the RMSE (8.56 to 7.06 on average) compared with the conventional approach for Taichung City. In the future, we will try to use recent advanced dynamic neural networks to improve the performance. We will also investigate deeper and more complex model structures when more training data become available for air quality prediction.

Methods

Data collection

The EPA, CWB, and CEMS databases are the main data sources used to forecast air quality. EPA uses an automatic continuous monitoring system to immediately identify and respond to uncommon emerging signals. This system collects air quality data every hour [39, 40]. EPA provided more than 260,000 samples from 77 air stations in 2017. The measured attributes include the index, city, county, station name, date, detected items, and time in hours. The detected items include PM2.5, NO2, PM10, NO, NOX, SO2, CO, O3, THC, NMHC, and CH4. Specifically, PM2.5 represents particulate matter with a diameter less than 2.5 μm, and NO2 is one of a group of highly reactive gases. These substances are primarily released into the air from the burning of fuel in cars, trucks, buses, and power plants.

Taiwan's EPA has strict regulations and guidelines in regards to quality assurance operations for air quality monitoring to achieve the EPA's data quality objectives (DQO). Every year the EPA has to perform quality assurance operations for the air quality monitoring system in order to evaluate the accuracy of air quality measurements. Those data were recorded in EPA's quality assurance (QA) operations annual report of air quality monitoring system. For our PM2.5 data (2017), the accuracy was 96.3%. The annual report from 2001 to 2017 can be downloaded at the EPA website [41].

In Taiwan, the stations are categorized by the EPA as 6 types, which may give a general idea about the location of the stations:

60 general stations. Most of the stations are located in urban and suburban areas.
5 industrial stations. All of the stations are close to industrial factories.
5 traffic stations. These stations are located beside the main roads and have lower sampling altitude.
4 background stations (2 stations simultaneously serve as general stations). These stations are away from pollution sources and provide the baseline monitoring data.
2 national park stations (1 station simultaneously serves as a general station).
2 other air quality monitoring stations. These two stations are located in rural areas.
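
As stated in the Results, each air station is paired with its four nearest weather stations. The sketch below shows one way to make that selection. The paper does not state which distance measure was used, so great-circle (haversine) distance is assumed here. The coordinates are hypothetical and the function names are illustrative.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_weather_stations(air_station, weather_stations, k=4):
    """Return the names of the k weather stations closest to the air station."""
    lat, lon = air_station
    ranked = sorted(weather_stations, key=lambda ws: haversine_km(lat, lon, ws[1], ws[2]))
    return [name for name, _, _ in ranked[:k]]


# Hypothetical coordinates, for illustration only (not real station positions).
air = (24.00, 121.00)
weather = [("A", 24.01, 121.00), ("B", 24.50, 121.00), ("C", 24.00, 121.02),
           ("D", 23.90, 121.00), ("E", 25.00, 122.00), ("F", 24.00, 120.95)]
chosen = nearest_weather_stations(air, weather)
```

Fixing `k` at 4 for every air station mirrors the paper's choice of a constant number of neighbors, which keeps the feature vector the same length at all stations.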
CWB has set up 580 weather stations to monitor and report weather conditions in Taiwan [42]. The attributes of weather information include longitude, latitude, station name, city, county, wind speed, wind direction, temperature, and pressure. The number and date of datasets from the CWB and EPA must be synchronized for model training.

The CEMS was established by the EPA to monitor the gases emitted from chimneys of industries since 1993. The detected items include the CO, SO2, O2, nitrogen oxides, and hydrogen chloride concentrations; temperature; opacity; and total emission flow rate. Specifically, emission flow rate indicates total emission materials from a chimney and its unit is cubic meter per hour [38]. The data have recently been made publicly available for academic research, and we used these data for training to study the influence of the CEMS on Taichung and Taipei.

Feature extraction

We derived and extracted features from the EPA and CWB hourly data for model training. There exist 81 features from each air station and its four nearest weather stations. Specifically, 21 features were identified from individual air stations and 15 were identified from a weather station (21 + 4 × 15 = 81). Tables 2 and 3 list the details of the 81 features for generating the GBM-based PM2.5 prediction model.

Table 2. Input features obtained from an air monitoring station.

Table 3. Input features from a weather station. The features from four weather stations and one air station are required for model training.

When we forecast air pollution on a given day, the previous two days are also considered due to the memory effect. Furthermore, the difference between these two days' PM2.5 concentration is defined as "concentration difference". In addition, because traffic flow strongly influences the air quality, the features consider whether a given day is a regular day, weekend, or holiday. An air station's weather conditions are represented by the mean pressure and mean temperature of nearby weather stations. Features in the training model also include the hour of day, day of week, and year to learn the trend and period of the temporal index. In summary, 21 features are extracted from an air station, as presented in Table
2.Wederivedweatherstations’pressure,temperature,andwindspeedfromthefeaturestorepresenttheweatherconditionsatnearbyairstations.Becausewindcanblowfromanydirection,thewindwassplitintothenorthernandeasterndirectionsforsimplicity.Insummary,15featureswereinferredfromeachweatherstation,aspresentedinTable 3.Becausethe580weatherstationsareuniformdistributedinTaiwan,neighboringweatherstationsshouldbeenoughforrepresentingweatherconditionofeachairqualitystation.However,numberofweatherstationsmaybevaryingaccordingtotheenvironmentsandvalidationtests.Toprovideafaircomparisonamongstations,wefixedthenumberas4intheexperimentalsetup.Therefore,weusedoneairstationandfourneighboringweatherstationstogenerate81-dimensionalfeaturevectorsformodellearningandPM2.5forecasting.GBMGBMcombinesfittingfunctions,lossfunctions,adecisiontree,andgradientdescentanalysis24.Thedecisiontreeproducesinitialvaluesforthefittingfunctionwithmultipleregression,whichdealswiththemanyinputvariablesconsideredinthisstudy.Then,errorsbetweentheobserveddatasetsandoutputvaluesarecalculatedusingalossfunction.Thefrequentlyusedlossfunctionsincludesquare-error,absolute-error,andnegativebinomiallog-likelihoodfunctions43,44.Thereafter,gradientdescentanalysisisappliedtofindthefittingfunctionwhoseexpectedvalueoflossfunctionisminimized.Theaforementionedprocedureisrepeatedtoacquiretheoptimizedfittingfunction.Theone-yeardatawererandomlysplitintofivefolds,ofwhichfourwereusedfortrainingmodelsandonewasusedfortestingmodels.Besides,theperiodoftrainingdataischosentocoverawholeyeartoreduceseasonalchangeswhentrainingpredictionmodels.Fortrainingmodels,31featuresofthepredictingday,25featuresofonedaybeforepredictingday,and25featuresoftwodaysbeforepredictingdaycomprise81-dimensionalinputvectorxt,asshowninTables 
2and3,wheretisthetimeindexinhour.Specifically,thepredictingday’smeteorologicalfeaturesarecollectedformweatherforecastoftheCWB.Becausethetargetistopredictthenext-24hoursPM2.5,theoutputisone-dimensionalvariabledenotedasyt+24.AfterNpairsoftheinputvectorxtandtheoutputvariableyt+24aregiven,afittingfunctionF(xt)isselectedfromunknownfunctions\(F({{\boldsymbol{x}}}_{t},\beta{\prime})\)producedbythedecisiontree.Inaddition,\(\beta{\prime}\)isagradientdecentstepsizeand\(({{\boldsymbol{x}}}_{t}^{i},{y}_{t+24}^{i})\)isthei-thtrainingsamplepair.Whenthevalueofthelossfunction\(L({y}_{t+24},F({{\boldsymbol{x}}}_{t},\beta{\prime}))\)isminimizedas$$\beta=\arg\{\min}_{\beta{\prime}}{\sum}_{i=1}^{N}L\left({y}_{t+24}^{i},F({{\boldsymbol{x}}}_{t}^{i},\beta{\prime})\right),$$ (1) thetargetfunctionF(xt)ischosentobeF(xt, β).Notethatinthisstudy,Nis211840sincewecollectedone-yeardataand80%ofthemwereselectedfortrainingmodel.Besides,thegradientdescentanalysisisappliedforoptimizedfittingfunctionF(xt).Thedetailedprocessisdescribedasfollows.Atthefirststep,initialguessfunction\({F}_{0}({{\boldsymbol{x}}}_{t},\beta{\prime})\)isgeneratedandinitialgradientdescentstepsizeβ0is$${\beta}_{0}=\arg\{\min}_{\beta{\prime}}\mathop{\sum}\limits_{i=1}^{N}L({y}_{t+24}^{i},{F}_{0}({{\boldsymbol{x}}}_{t}^{i},\beta{\prime}))$$ (2) $${F}_{0}({{\boldsymbol{x}}}_{t})={F}_{0}({{\boldsymbol{x}}}_{t},{\beta}_{0}).$$ (3) Then,wetakegradientoflossfunctionasafirst-stepbaselearnerfunctionf1(xt)as$${f}_{1}({{\boldsymbol{x}}}_{t})=-{\nabla}_{{F}_{0}}L({y}_{t+24},{F}_{0}({{\boldsymbol{x}}}_{t})),$$ (4) $${\beta}_{1}=\arg\,{\min}_{\beta{\prime}}\mathop{\sum}\limits_{i=1}^{N}L({{y}^{i}}_{t+24},[{F}_{0}({{\boldsymbol{x}}}_{t}^{i})+\beta{\prime}{f}_{1}({{\boldsymbol{x}}}_{t}^{i})]).$$ (5) AfterMiterations,thetargetfunctionF(xt)isexpressedas$$F({{\boldsymbol{x}}}_{t})={F}_{0}({{\boldsymbol{x}}}_{t})+\mathop{\sum}\limits_{m=1}^{M}{\beta}_{m}{f}_{m}({{\boldsymbol{x}}}_{t}),$$ (6) 
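The iteration in Eqs. (4) to (6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the squared-error loss L = (y − F)²/2, for which the negative gradient is simply the residual y − F, and it uses depth-1 regression trees (stumps) fitted to that gradient as base learners. The function names fit_stump, gbm_fit, and gbm_predict are hypothetical.

```python
import numpy as np

def fit_stump(X, r):
    """Fit a depth-1 regression tree (stump) to targets r by least squares."""
    best = None
    for j in range(X.shape[1]):
        # Candidate thresholds: every unique value except the largest,
        # so that both sides of the split are non-empty.
        for thr in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= thr
            lv, rv = r[left].mean(), r[~left].mean()
            sse = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, thr, lv, rv)
    if best is None:  # every feature is constant: fall back to the mean
        c = r.mean()
        return lambda Z: np.full(len(Z), c)
    _, j, thr, lv, rv = best
    return lambda Z: np.where(Z[:, j] <= thr, lv, rv)

def gbm_fit(X, y, n_rounds=50):
    """Gradient boosting with squared-error loss (assumed for this sketch)."""
    f0 = y.mean()                      # F_0: constant minimizing squared error
    pred = np.full(len(y), f0)
    learners = []
    for _ in range(n_rounds):
        residual = y - pred            # negative gradient of (y - F)^2 / 2
        f_m = fit_stump(X, residual)   # base learner fitted to the gradient
        h = f_m(X)
        denom = h @ h
        # Closed-form line search for beta_m under squared-error loss.
        beta = (residual @ h) / denom if denom > 0 else 0.0
        pred = pred + beta * h         # F_m = F_{m-1} + beta_m * f_m
        learners.append((beta, f_m))
    return f0, learners

def gbm_predict(model, X):
    """Evaluate F(x) = F_0 + sum_m beta_m * f_m(x)."""
    f0, learners = model
    out = np.full(len(X), f0)
    for beta, f_m in learners:
        out = out + beta * f_m(X)
    return out
```

With an input array X of shape (N, 81) and a vector y of PM2.5 values 24 hours ahead, gbm_fit(X, y) would return a model that gbm_predict can evaluate on test samples. A library implementation with deeper trees and shrinkage would normally be preferred for real data; the sketch only makes the update rule concrete.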
In Eq. (6), the base learner \(f_{m}(\boldsymbol{x}_{t})\) and the step size \(\beta_{m}\) for m = 1, …, M are expressed as follows:

$$f_{m}(\boldsymbol{x}_{t})=-\nabla_{F_{m-1}}L\left(y_{t+24},F_{m-1}(\boldsymbol{x}_{t})\right),$$ (7)

$$\beta_{m}=\arg\min_{\beta'}\sum_{i=1}^{N}L\left(y_{t+24}^{i},\left[F_{m-1}(\boldsymbol{x}_{t}^{i})+\beta' f_{m}(\boldsymbol{x}_{t}^{i})\right]\right).$$ (8)

Finally, \(F(\boldsymbol{x}_{t})\) is the GBM-based prediction model. During the online stage, the testing samples are fed into the model to calculate the forecasting results. The entire GBM procedure is described in Algorithm 1.

Algorithm 1 GBM.

Data availability

The air quality data and the weather condition data are available on the websites of the EPA (https://taqm.epa.gov.tw/taqm/tw/YearlyDataDownload.aspx/) and the CWB (http://farmer.iyard.org/cwb/cwb.htm).

References

1. Apte, J. S., Marshall, J. D., Cohen, A. J. & Brauer, M. Addressing global mortality from ambient PM2.5. Environ. Sci. Technol. 49, 8057–8066 (2015).
2. Conibear, L., Butt, E. W., Knote, C., Arnold, S. R. & Spracklen, D. V. Residential energy use emissions dominate health impacts from exposure to ambient particulate matter in India. Nat. Commun. 9, 617 (2018).
3. Chen, L. et al. ADF: An anomaly detection framework for large-scale PM2.5 sensing systems. IEEE Internet Things J. 5, 559–570 (2018).
4. US Environmental Protection Agency. Air Quality System Data Mart [internet database] available via https://www.epa.gov/airdata. Accessed August 01, 2018.
5. Hu, X. et al. Estimating PM2.5 concentrations in the conterminous United States using the random forest approach. Environ. Sci. & Technol. 51, 6936–6944 (2017).
6. Chu, Y.-C. A multi-layered system architecture for environmental monitoring data management - Taiwan's experience. In EnviroInfo 2007 Conference (2007).
7. Huang, G., Chen, L.-J., Hwang, W.-H., Tzeng, S. & Huang, H.-C. Real-time PM2.5 mapping and anomaly detection from AirBoxes in Taiwan. Environmetrics e2537 (2018).
8. Mahajan, S., Chen, L.-J. & Tsai, T.-C. Short-term PM2.5 forecasting using exponential smoothing method: A comparative analysis. Sensors
18 (2018).
9. Mahajan, S., Liu, H., Tsai, T. & Chen, L. Improving the accuracy and efficiency of PM2.5 forecast service using cluster-based hybrid neural network model. IEEE Access 6, 19193–19204 (2018).
10. Chang, Y. S., Lin, K.-M., Tsai, Y.-T., Zeng, Y.-R. & Hung, C.-X. Big data platform for air quality analysis and prediction. In 2018 27th Wireless and Optical Communication Conference (WOCC), 1–3 (IEEE, 2018).
11. Environmental Protection Administration Executive in Taiwan. Questions and answers of airbox. Retrieved from https://taqm.epa.gov.tw/taqm/tw/b0905.aspx. Accessed August 01, 2018.
12. Zannetti, P. Numerical simulation modeling of air pollution: An overview. Transactions on Ecol. Environ. 1 (1993).
13. M. Lee, A., Carver, G., Chipperfield, M. & McQuaid, J. Three-dimensional chemical forecasting: A methodology. J. Geophys. Res. 102, 3905–3919 (1997).
14. Dabberdt, W. F. et al. Meteorological research needs for improved air quality forecasting: Report of the 11th prospectus development team of the U.S. Weather Research Program*. Bull. Am. Meteorol. Soc. 85, 563–586 (2004).
15. Carmichael, G. R. et al. Predicting air quality: Improvements through advanced methods to integrate models and measurements. J. Comput. Phys. (Predicting weather, climate and extreme events) 227, 3540–3571 (2008).
16. Makar, P. A. et al. Dynamic adjustment of climatological ozone boundary conditions for air-quality forecasts. Atmospheric Chem. Phys. 10, 8997–9015 (2010).
17. Sportisse, B. A review of current issues in air pollution modeling and simulation. Comput. Geosci. 11, 159–181 (2007).
18. Niharika, Venkatadri & Rao, P. S. B. A survey on air quality forecasting techniques. International Journal of Computer Science and Information Technologies 5(1), 103–107 (2014).
19. Kalapanidas, E. & Avouris, N. Applying machine learning techniques in air quality prediction. Res. (1999).
20. Zhao, H., Zhang, J., Wang, K., Bai, Z. & Liu, A. A GA-ANN model for air quality predicting. In 2010 International Computer Symposium (ICS2010), 693–699 (2010).
21. Deleawe, S., Kusznir, J., Lamb, B. & Cook, D. Predicting air quality in smart environments. J. Ambient Intelligence Smart Environments 2, 145–152 (2010).
22. Yu, R., Yang, Y., Yang, L., Han, G. & Move, O. A. RAQ - a random forest approach for predicting air quality in urban sensing systems. Sensors 16 (2016).
23. Kaur, G., Gao, J., Chiao, S., Lu, S. & Xie, G. Air quality prediction: Big data and machine learning approaches. Int. J. Environ. Sci. Dev. 9, 8–16 (2018).
24. Friedman, J. H. Greedy function approximation: A gradient boosting machine. The Annals Stat. 29, 1189–1232 (2001).
25. Nowack, P. et al. Using machine learning to build temperature-based ozone parameterizations for climate sensitivity simulations. Environ. Res. Lett. 13, 104016 (2018).
26. Keller, C. A. & Evans, M. J. Application of random forest regression to the calculation of gas-phase chemistry within the GEOS-Chem chemistry model v10. Geosci. Model. Dev. 12, 1209–1225 (2019).
27. Chuang, M.-T. et al. Simulation of long-range transport aerosols from the Asian continent to Taiwan by a southward Asian high-pressure system. Sci. Total Environment 406, 168–179 (2008).
28. Keoni Everington. China is single largest source of PM2.5 pollution in Taiwan. Taiwan News https://www.taiwannews.com.tw/en/news/3342478 (2018).
29. Yang, K.-L. Spatial and seasonal variation of PM10 mass concentrations in Taiwan. Atmospheric Environ. 36, 3403–3411 (2002).
30. Cheng, F.-Y. & Hsu, C.-H. Long-term variations in PM2.5 concentrations under changing meteorological conditions in Taiwan. Sci. Reports 9, 6635 (2019).
31. Hsu, C.-H. & Cheng, F.-Y. Classification of weather patterns to study the influence of meteorological characteristics on PM2.5 concentrations in Yunlin County, Taiwan. Atmospheric Environ. 144, 397–408 (2016).
32. Chen, T.-L. Air pollution caused by coal-fired power plant in middle Taiwan. Int. J. Energy Power Eng. 6, 121 (2017).
33. Lu, G. Y. & Wong, D. W. An adaptive inverse-distance weighting spatial interpolation technique. Comput. Geosci. 34, 1044–1055 (2008).
34. Tai-Yi, Y. Characterization of ambient air quality during a rice straw burning episode. J. Environ. Monit. 14, 817–829 (2012).
35. King's College London. Data Feeds for London air quality [web APIs] available via https://www.londonair.org.uk/Londonair/API/. Accessed August 01, 2018.
36. Buccolieri, R., Jeanjean, A. P., Gatto, E. & Leigh, R. J. The impact of trees on street ventilation, NOx and PM2.5 concentrations across heights in Marylebone Rd street canyon, central London. Sustain. Cities Soc. 41, 227–241 (2018).
37. Raspisaniye Pogodi Ltd. Weather archive in London [data files] available via http://rp5.co.uk/Weather_archive_in_London. Accessed August 01, 2018.
38. Environmental Protection Administration in Taiwan. Datasets of continuous emission monitoring system [internet database] available via https://erdb.epa.gov.tw/DataRepository/Air/Flue_CEMS_DATA.aspx. Accessed August 01, 2018.
39. Environmental Protection Administration in Taiwan. Datasets of air quality monitoring stations [data files] available via https://taqm.epa.gov.tw/taqm/tw/YearlyDataDownload.aspx/. Accessed August 01, 2018.
40. Jung, C.-R., Hwang, B.-F. & Chen, W.-T. Incorporating long-term satellite-based aerosol optical depth, localized land use data, and meteorological variables to estimate ground-level PM2.5 concentrations in Taiwan from 2005 to 2015. Environ. Pollut. 237, 1000–1010 (2018).
41. Environmental Protection Administration in Taiwan. Quality assurance of air quality monitoring. Retrieved from https://taqm.epa.gov.tw/taqm/en/b0801.aspx. Accessed August 01, 2018.
42. Central Weather Bureau in Taiwan. Datasets of weather station [data files] available via http://farmer.iyard.org/cwb/cwb.htm. Accessed August 01, 2018.
43. Matheson, I. A critical comparison of least absolute deviation fitting (robust) and least squares fitting: The importance of error distributions. Comput. Chem. 14, 49–57 (1990).
44. White, G. C. & Bennetts, R. E. Analysis of frequency count data using the negative binomial distribution. Ecol. 77, 2549–2557 (1996).

Acknowledgements

The authors would like to acknowledge the financial support provided by the Ministry of Science and Technology, Taiwan (MOST 109-2634-F-155-001).

Author information

Affiliations

Department of Electrical Engineering, Yuan Ze University, Taoyuan City, Taiwan: Larry Lin, Ting-Hsuan Yao, Min-Han Fei & Shih-Hau Fang
MOST Joint Research Center for AI Technology and All Vista Healthcare, Taipei, Taiwan: Larry Lin, Ting-Hsuan Yao, Min-Han Fei & Shih-Hau Fang
Far Eastern Group, Taipei, Taiwan: Mike Lee
Department of Geography, Chinese Culture University, Taipei, Taiwan: Chih-Yuan Chen
Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan: Yu Tsao
Contributions

Mike Lee conceived and conducted the experiments and analyzed the results. Larry Lin organized the layout of the article, figures, and tables. Chih-Yuan Chen was responsible for the geographical and meteorological elucidation and figures. Ting-Hsuan Yao and Min-Han Fei made the figures clear and understandable. Yu Tsao and Shih-Hau Fang wrote and corrected the article and figures. All the authors reviewed the manuscript.

Corresponding author

Correspondence to Shih-Hau Fang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Lee, M., Lin, L., Chen, C.-Y. et al. Forecasting Air Quality in Taiwan by Using Machine Learning.
Sci Rep 10, 4153 (2020). https://doi.org/10.1038/s41598-020-61151-7

Received: 09 August 2019
Accepted: 20 February 2020
Published: 05 March 2020


