Intelligent prediction of air quality index based on the transformer-BiLSTM model – Nature
Executive Summary
Air quality is a critical determinant for achieving multiple Sustainable Development Goals (SDGs), particularly SDG 3 (Good Health and Well-being) and SDG 11 (Sustainable Cities and Communities). Accurate forecasting of the Air Quality Index (AQI) is essential for effective environmental management, public health protection, and industrial stability. This report details the development and validation of a hybrid deep learning model, integrating a Transformer encoder with a Bidirectional Long Short-Term Memory (BiLSTM) network, to advance data-driven air pollution control strategies. This technological innovation directly supports SDG 9 (Industry, Innovation, and Infrastructure) by providing an advanced tool for environmental monitoring.
The model was trained and validated using daily air quality data from Shijiazhuang, Beijing, and Tianjin (November 2013 – February 2025). The proposed Transformer-BiLSTM model demonstrated stable and reliable predictive performance, achieving:
- Root Mean Squared Error (RMSE): 3.0012 µg/m³
- Mean Absolute Error (MAE): 1.7928 µg/m³
- Mean Absolute Percentage Error (MAPE): 3.3646%
The model’s enhanced accuracy and generalization capability offer a reliable tool for AQI forecasting, providing quantitative support for policies aimed at creating healthier and more sustainable urban environments.
1. Introduction: Air Quality Management as a Cornerstone of Sustainable Development
1.1. The Global Impact of Air Pollution on Health and Sustainability
Air pollution is a major global challenge that directly undermines progress toward the 2030 Agenda for Sustainable Development. The World Health Organization (WHO) attributes approximately 7 million premature deaths annually to the combined effects of ambient and household air pollution, which makes air quality management a priority for achieving SDG 3 (Good Health and Well-being). Rapid urbanization and industrialization have increased emissions of pollutants such as PM2.5 and O3, threatening public health and restricting the sustainable development of cities, a core focus of SDG 11.
1.2. Air Quality Challenges in the Beijing-Tianjin-Hebei Region
The Beijing-Tianjin-Hebei region serves as a critical case study. Between 2013 and 2020, the annual average PM2.5 concentration in the region frequently exceeded twice the National Ambient Air Quality Standard. Such elevated pollution levels are a significant environmental risk factor: they contribute to cardiovascular and respiratory diseases and impede the creation of safe, resilient, and sustainable urban communities as envisioned by SDG 11.
1.3. Evolution of Forecasting Methodologies for Sustainable Governance
The evolution of AQI prediction methodologies reflects a shift towards more sophisticated, data-driven approaches necessary for modern environmental governance. While traditional statistical and machine learning models provided foundational insights, they often struggled to capture the complex, nonlinear dynamics of atmospheric processes. The advent of deep learning and hybrid frameworks represents a significant innovation (SDG 9), offering more powerful tools to address these challenges and support proactive policy-making.
2. A Hybrid Deep Learning Framework for Enhanced AQI Prediction
2.1. Model Architecture: Integrating Transformer and BiLSTM
To address the limitations of existing models, this study proposes a hybrid Transformer-BiLSTM model. This architecture is designed to capture both long-range and short-term temporal dependencies in air quality data, providing a more comprehensive and accurate forecasting tool. The model’s structure consists of four primary components:
- Positional Encoding Module
- Transformer Encoder Layer (for global temporal dependencies)
- BiLSTM Decoder Layer (for local bidirectional patterns)
- Fully Connected Output Layer
This innovative integration provides a robust framework for generating reliable data to inform public health advisories (SDG 3) and urban planning (SDG 11).
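The first component in the list above, the positional encoding module, can be illustrated with a minimal sketch. The article does not say which encoding is used; the version below assumes the fixed sinusoidal scheme from the original Transformer formulation, and the function name and dimensions are hypothetical.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position codes.

    Assumption: the standard scheme PE(pos, 2k) = sin(pos / 10000^(2k/d)),
    PE(pos, 2k+1) = cos(pos / 10000^(2k/d)). The article does not confirm
    that the model uses this particular encoding.
    """
    table = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


# Hypothetical setup: a 7-day input window embedded in 8 dimensions.
encoding = sinusoidal_positional_encoding(seq_len=7, d_model=8)
for day, row in enumerate(encoding):
    print(day, [round(v, 3) for v in row])
```

Each row would be added to the embedded features of the corresponding day so that the attention layers, which are otherwise order-agnostic, can tell the days of the window apart.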
2.2. Data and Experimental Design
The model was developed using publicly available data from the China National Environmental Monitoring Center (CNEMC) and a historical weather database, covering Beijing, Tianjin, and Shijiazhuang from November 2013 to February 2025. The dataset included daily average concentrations of six major pollutants (PM2.5, PM10, SO2, NO2, CO, O3) and the corresponding AQI. A rigorous preprocessing pipeline involving noise reduction, normalization, and data augmentation was implemented to ensure model robustness.
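As a rough illustration of the normalization step and of how a daily series can be turned into supervised samples, consider the sketch below. The article does not give the scaling method, window length, or forecast horizon; min-max scaling, a 3-day window, and a next-day target are assumptions, and the sample values are made up.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; also return the bounds for later inversion."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / span for v in values], lo, hi


def make_windows(rows, targets, window):
    """Pair each run of `window` consecutive days with the next day's AQI."""
    inputs, labels = [], []
    for start in range(len(rows) - window):
        inputs.append(rows[start:start + window])
        labels.append(targets[start + window])
    return inputs, labels


# Made-up daily values for illustration only (not from the dataset).
pm25 = [30.0, 60.0, 90.0, 120.0, 150.0]
aqi = [45.0, 80.0, 118.0, 158.0, 200.0]

scaled_pm25, pm25_min, pm25_max = min_max_scale(pm25)
rows = [[value] for value in scaled_pm25]  # one feature per day in this toy case
inputs, labels = make_windows(rows, aqi, window=3)

print(scaled_pm25)
print(inputs)
print(labels)
```

In practice the scaling bounds would be computed on the training split only and reused for validation and test data, so that no information from later dates leaks into training.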
2.3. Evaluation Metrics
Model performance was quantitatively assessed using the following standard metrics to ensure a comprehensive and objective evaluation:
- Root Mean Squared Error (RMSE)
- Mean Absolute Error (MAE)
- Mean Absolute Percentage Error (MAPE)
- Coefficient of Determination (R²)
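The four metrics listed above follow their standard definitions, which can be written out directly. The observed and predicted values below are made up for illustration and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up AQI values for illustration only.
observed = [50.0, 80.0, 100.0, 120.0]
predicted = [52.0, 78.0, 101.0, 117.0]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("MAPE (%):", mape(observed, predicted))
print("R2:", r2(observed, predicted))
```

RMSE penalizes large errors more heavily than MAE, MAPE expresses error relative to the observed level, and R² measures the share of variance in the observations that the predictions explain, which is why the four are usually reported together.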
3. Performance Analysis and Implications for Policy
3.1. Superior Predictive Performance
The Transformer-BiLSTM model outperformed the baseline models across all three cities. In Beijing, it achieved an RMSE of 3.0012 µg/m³, an MAE of 1.7928 µg/m³, and an R² of 0.9694. Similarly high accuracy was observed in Tianjin and Shijiazhuang. Statistical significance tests (p < […]) […] SDG 11.
3.2. Model Interpretability for Targeted Interventions (SHAP Analysis)
To ensure the model’s utility for policy-making, SHapley Additive exPlanations (SHAP) analysis was conducted to identify the most influential pollutants. The analysis revealed that PM10, PM2.5, and O3 were consistently the most dominant factors driving AQI predictions across all three cities. This interpretability allows policymakers to develop targeted, evidence-based pollution control strategies, directly contributing to public health protection (SDG 3) and the creation of healthier urban environments (SDG 11).
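The idea behind this attribution can be shown with exact Shapley values on a toy predictor. This is a sketch of the underlying definition, not the SHAP library and not the study's model: the linear predictor, its weights, and the concentration values are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values by enumerating all feature subsets.

    Features outside a subset are set to their baseline value. The cost
    grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(instance)

    def value(subset):
        mixed = [instance[j] if j in subset else baseline[j] for j in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        contributions.append(total)
    return contributions


def toy_aqi_model(features):
    """Hypothetical linear stand-in for a trained AQI predictor."""
    pm10, pm25, o3 = features
    return 0.5 * pm10 + 0.8 * pm25 + 0.2 * o3


names = ["PM10", "PM2.5", "O3"]
today = [120.0, 90.0, 60.0]     # hypothetical day to explain
reference = [80.0, 50.0, 70.0]  # hypothetical baseline (e.g. a typical day)

phi = shapley_values(toy_aqi_model, today, reference)
for name, contribution in zip(names, phi):
    print(name, round(contribution, 3))
```

The contributions sum to the difference between the prediction for the explained day and the prediction for the baseline, which is the additivity property that makes a ranking of pollutants by influence meaningful.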
3.3. Generalization Capability and Robustness
The model’s generalization capability was tested across six additional Chinese cities with diverse geographical and climatic characteristics (Chengdu, Xi’an, Shenyang, Wulumuqi, Shanghai, and Guangzhou). The model maintained strong predictive performance (R² > 0.92 in five of the six cities), demonstrating its robustness and potential for wide-scale deployment. This scalability is crucial for developing regional and national air quality management systems aligned with the ambitions of SDG 11 and SDG 3.
4. Discussion: Advancing Data-Driven Sustainable Development
The performance of the Transformer-BiLSTM model stems from the complementary roles of its two components. The Transformer encoder captures long-term, global trends (e.g., seasonal patterns), while the BiLSTM decoder models short-term, local dynamics (e.g., day-to-day fluctuations in the daily series). Together they represent air quality dynamics more accurately than single-architecture models.
By delivering accurate and interpretable AQI forecasts, this research provides a practical tool for data-driven environmental governance. It enables authorities to issue timely public health warnings, implement short-term pollution mitigation measures, and inform long-term urban and industrial planning. This supports the creation of sustainable, resilient, and healthy cities in line with SDG 3 and SDG 11.
5. Conclusion and Future Directions
This study successfully developed and validated a Transformer-BiLSTM model that significantly improves the accuracy of AQI prediction. The model’s high performance, interpretability, and generalization capability make it a valuable asset for environmental agencies and policymakers working to achieve key Sustainable Development Goals.
Future research will focus on enhancing the model’s contribution to sustainable development by:
- Integrating Multi-Source Data: Incorporating meteorological, satellite, and socio-economic data to create a more holistic forecasting system that can better inform integrated urban planning (SDG 11).
- Improving Computational Efficiency: Developing lightweight versions of the model to enable real-time deployment in resource-constrained environments, broadening its global applicability.
- Enhancing Explainability: Extending the use of explainable AI to provide deeper, causal insights for crafting more effective and equitable environmental policies.
These advancements will further establish this framework as a deployable and powerful tool for intelligent air quality management, contributing to a healthier and more sustainable future for all.
Analysis of Sustainable Development Goals in the Article
1. Which SDGs are addressed or connected to the issues highlighted in the article?
- SDG 3: Good Health and Well-being: The article directly links air pollution to severe public health issues. It states that “Air quality significantly impacts public health” and cites the World Health Organization (WHO) estimate that “approximately 7 million premature deaths annually are attributable to the combined effects of ambient and household air pollution.” It also mentions that elevated pollution levels contribute to “increased incidence and severity of cardiovascular diseases, respiratory disorders, and various other health complications.”
- SDG 11: Sustainable Cities and Communities: The research focuses on air quality in major urban areas (Beijing, Tianjin, and Shijiazhuang), which is a core component of urban sustainability. The article notes that air pollution “restricts the sustainable development of the city” and that rapid urbanization is a key contributor to the problem. The development of an accurate Air Quality Index (AQI) forecasting tool is presented as a method for “effective environmental monitoring and management” in cities.
- SDG 9: Industry, Innovation, and Infrastructure: The article identifies “rapid advancement of urbanization and industrialization” as a primary cause of increased air pollutant emissions. The core contribution of the study is the development of an innovative technological solution—a “hybrid deep learning model that integrates a Transformer encoder with a Bidirectional Long Short-Term Memory (BiLSTM) network”—to address this environmental challenge. This aligns with the goal of fostering innovation to make infrastructure and industries more sustainable.
2. What specific targets under those SDGs can be identified based on the article’s content?
- SDG 3: Good Health and Well-being
  - Target 3.9: By 2030, substantially reduce the number of deaths and illnesses from hazardous chemicals and air, water and soil pollution and contamination. The article’s entire premise is built on mitigating the health impacts of air pollution. It explicitly mentions that PM2.5 “damages the respiratory system and cardiovascular function” and references the “7 million premature deaths annually” linked to air pollution, directly addressing the core concern of this target.
- SDG 11: Sustainable Cities and Communities
  - Target 11.6: By 2030, reduce the adverse per capita environmental impact of cities, including by paying special attention to air quality. The study is centered on forecasting the Air Quality Index (AQI) in the Beijing–Tianjin–Hebei urban region. It highlights that the “annual average PM2.5 concentration in this region frequently exceeded twice the National Ambient Air Quality Standard,” demonstrating a clear focus on reducing the adverse environmental impact of cities by improving air quality management.
- SDG 9: Industry, Innovation, and Infrastructure
  - Target 9.4: By 2030, upgrade infrastructure and retrofit industries to make them sustainable…and greater adoption of clean and environmentally sound technologies and industrial processes. While the article identifies industrialization as a source of pollution, its main contribution is the development of an advanced technological tool (the Transformer-BiLSTM model). This model provides “quantitative support for data-driven air pollution control strategies,” which is a critical innovation for managing and ultimately reducing industrial pollution, thereby supporting the transition to more sustainable industrial practices.
3. Are there any indicators mentioned or implied in the article that can be used to measure progress towards the identified targets?
- Target 3.9
  - Indicator 3.9.1 (Mortality rate attributed to household and ambient air pollution): The article directly references this indicator by citing the WHO’s estimate of “7 million premature deaths annually” due to air pollution. This statistic serves as a baseline measure of the problem that the research aims to help mitigate.
- Target 11.6
  - Indicator 11.6.2 (Annual mean levels of fine particulate matter (e.g. PM2.5 and PM10) in cities): The article is fundamentally based on measuring and forecasting pollutants that define urban air quality. It explicitly uses “daily average concentrations of six major atmospheric pollutants,” including PM2.5 and PM10, as the core data for its model. The study notes that in the Beijing-Tianjin-Hebei region, the “annual average PM2.5 concentration…frequently exceeded” standards, making this a central indicator for the research. The Air Quality Index (AQI) itself is a composite measure derived from these pollutant levels.
- Target 9.4
  - Concentrations of industrial pollutants (SO2, NO2): The article includes Sulphur Dioxide (SO2) and Nitrogen Dioxide (NO2) in its dataset of key pollutants. These are often used as proxies for industrial emissions and the environmental performance of industries. Monitoring and forecasting these pollutants, as done in the study, provides data to measure the effectiveness of pollution control strategies aimed at making industries cleaner.
  - Development of advanced environmental monitoring technology: The creation and successful validation of the Transformer-BiLSTM model itself serves as an indicator of progress in innovation. The model’s high accuracy (RMSE of 3.0012 µg/m³) and efficiency (“lightweight model (14.537MB, 1.361ms inference)”) demonstrate the development of an advanced, “environmentally sound” technology for pollution management.
4. SDGs, Targets, and Indicators Summary
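| SDGs | Targets | Indicators |
|---|---|---|
| SDG 3: Good Health and Well-being | 3.9: Substantially reduce deaths and illnesses from air pollution. | 3.9.1: Mortality rate attributed to household and ambient air pollution. |
| SDG 11: Sustainable Cities and Communities | 11.6: Reduce the adverse per capita environmental impact of cities, paying special attention to air quality. | 11.6.2: Annual mean levels of fine particulate matter (e.g. PM2.5 and PM10) in cities. |
| SDG 9: Industry, Innovation, and Infrastructure | 9.4: Upgrade infrastructure and retrofit industries to make them sustainable and adopt clean and environmentally sound technologies. | Implied: concentrations of industrial pollutants (SO2, NO2); development of advanced environmental monitoring technology. |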
Source: nature.com