An inventory of industrial solid waste in 337 cities of China: Applying machine learning for data completion – Nature


Report on Industrial Solid Waste in China (1990-2022): A Machine Learning Approach for Sustainable Development Goal Monitoring

Executive Summary

Rapid industrialization in China has generated approximately 60 gigatonnes (Gt) of industrial solid waste (ISW) over the past two decades, posing significant challenges to environmental sustainability and to the achievement of the Sustainable Development Goals (SDGs). Gaps in the temporal and spatial record of ISW generation have hindered effective policymaking. This report describes the development of a complete tempo-spatial dataset of ISW generation for all 337 cities in China from 1990 to 2022. The research collected available data from over 500 sources and filled the remaining gaps with a machine learning framework. The methodology used six machine learning models, with Bayesian optimization applied to tune each model so that the best-performing one could be selected for each city. The resulting dataset provides aggregate ISW amounts and, for 2022, a breakdown into six major subcategories. These data help researchers and policymakers address challenges related to responsible production (SDG 12), sustainable cities (SDG 11), and the protection of terrestrial ecosystems (SDG 15).

1. Background: Industrialization’s Impact on Sustainable Development

China’s economic expansion has generated an immense volume of solid waste, with ISW accounting for approximately 4 Gt annually. This massive waste stream directly impacts several SDGs. The cumulative stock of 60-70 Gt of ISW has led to widespread environmental degradation, including soil erosion, groundwater contamination, and habitat destruction, which directly contravenes the objectives of SDG 15 (Life on Land) and SDG 14 (Life Below Water). Despite national goals, such as the 73% comprehensive utilization rate targeted in the 13th Five Year Plan, the actual rate was only 57.7% by 2022. This shortfall highlights a critical challenge in achieving SDG 12 (Responsible Consumption and Production), which calls for the environmentally sound management of waste and a substantial reduction in waste generation through prevention, reduction, recycling, and reuse. The absence of a complete, city-level dataset has been a primary obstacle to creating targeted policies that support SDG 11 (Sustainable Cities and Communities) by improving urban waste management infrastructure and practices.
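The size of the utilization shortfall can be illustrated with a short calculation. This is only a rough sketch: it applies the 73% target and the 57.7% rate for 2022 to the approximate annual figure of 4 Gt, which is an assumption made here for illustration and not a calculation taken from the article. The function and constant names are hypothetical.

```python
# Figures quoted in the text; the annual total is approximate.
TARGET_RATE = 0.73        # utilization goal cited from the 13th Five Year Plan
ACTUAL_RATE_2022 = 0.577  # comprehensive utilization rate reported for 2022
ANNUAL_ISW_GT = 4.0       # assumed round figure for annual ISW generation, in Gt


def utilization_gap_points(target: float, actual: float) -> float:
    """Shortfall between the target and actual rates, in percentage points."""
    return (target - actual) * 100.0


def extra_utilization_gt(target: float, actual: float, annual_gt: float) -> float:
    """Additional ISW (Gt per year) that would have to be utilized to reach the target."""
    return (target - actual) * annual_gt


gap = utilization_gap_points(TARGET_RATE, ACTUAL_RATE_2022)
extra = extra_utilization_gt(TARGET_RATE, ACTUAL_RATE_2022, ANNUAL_ISW_GT)
print(f"Shortfall: {gap:.1f} percentage points")
print(f"Extra utilization needed: about {extra:.2f} Gt per year")
```

Under these assumptions the shortfall is 15.3 percentage points, or roughly 0.6 Gt of additional ISW utilization per year.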

2. Methodology: An Innovative Framework for SDG-Aligned Data

To address the data deficiencies and support evidence-based policymaking, a data-driven, non-interpolation machine learning framework was developed. This approach aligns with SDG 9 (Industry, Innovation, and Infrastructure) by applying innovative technology to build resilient environmental monitoring infrastructure.

2.1. Data Collection and Preparation

The initial phase involved collecting existing ISW data from over 500 sources, including national, provincial, and municipal statistical yearbooks and bulletins. This process yielded an original dataset covering 337 administrative divisions from 1990 to 2022, in which 33% of the values were missing, primarily for less developed regions or earlier years.
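The missing-value share of a city-by-year panel can be measured with a few lines of code. The sketch below uses a hypothetical toy panel (the city names and values are invented) and assumes the original dataset can be treated as a full grid of 337 cities by 33 years; the article does not describe its storage format.

```python
from typing import Optional

# Hypothetical toy panel: city -> {year: reported ISW, or None when no value was found}.
# City names and numbers are invented for illustration; they are not from the dataset.
Panel = dict[str, dict[int, Optional[float]]]

toy_panel: Panel = {
    "city_a": {1990: None, 1991: None, 1992: 120.0},
    "city_b": {1990: 300.0, 1991: 310.0, 1992: 325.0},
}


def missing_share(panel: Panel) -> float:
    """Fraction of city-year cells that have no reported value."""
    cells = [value for series in panel.values() for value in series.values()]
    if not cells:
        raise ValueError("panel has no city-year cells")
    return sum(value is None for value in cells) / len(cells)


def missing_share_by_year(panel: Panel) -> dict[int, float]:
    """Fraction of cities with no reported value, for each year in the panel."""
    years = sorted({year for series in panel.values() for year in series})
    return {
        year: sum(series.get(year) is None for series in panel.values()) / len(panel)
        for year in years
    }


# Assumption: a full grid of 337 cities x 33 years (1990 to 2022 inclusive).
FULL_GRID_CELLS = 337 * (2022 - 1990 + 1)

print(f"Toy panel missing share: {missing_share(toy_panel):.1%}")
print(f"Missing share by year: {missing_share_by_year(toy_panel)}")
print(f"33% of {FULL_GRID_CELLS} cells is about {round(0.33 * FULL_GRID_CELLS)} city-years")
```

Breaking the share down by year, as in `missing_share_by_year`, is one way to check the observation that gaps concentrate in earlier years.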

2.2. Machine Learning Model Implementation

To complete the dataset, six machine learning models were employed, chosen for their complementary strengths in handling complex, non-linear environmental data:

  • KNeighborsRegressor (KNN)
  • LGBMRegressor (Light Gradient Boosting Machine)
  • RandomForestRegressor (RF)
  • MLPRegressor (Multilayer Perceptron)
  • Extreme Gradient Boosting (XGB) Regressor
  • Decision Trees (DT)

2.3. Bayesian Optimization and Model Selection

Bayesian optimization was used to tune each model’s hyperparameters systematically, improving predictive accuracy and robustness to noise in the incomplete data. This automated process identified the best model configuration for each city. Each model was evaluated using the coefficient of determination (R²) and the mean squared error (MSE), and the best-performing model was then used to fill the missing data points for that city.
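The per-city selection step can be sketched as follows. This is a minimal illustration, not the authors’ implementation: it assumes each candidate model has already been tuned and has produced predictions for a set of held-out observed values (the article does not describe the validation split), and it computes R² as the coefficient of determination. The Bayesian optimization itself is not shown. The function names and the numbers are hypothetical.

```python
def mse(y_true: list[float], y_pred: list[float]) -> float:
    """Mean squared error: the average squared residual."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = mse(y_true, y_pred) * len(y_true)  # also validates the inputs
    mean_true = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


def select_best(
    y_true: list[float], candidates: dict[str, list[float]]
) -> tuple[str, float, float]:
    """Return (name, R^2, MSE) for the candidate with the lowest held-out MSE."""
    best = min(candidates, key=lambda name: mse(y_true, candidates[name]))
    return best, r2(y_true, candidates[best]), mse(y_true, candidates[best])


# Invented held-out values for one city and predictions from two of the six models.
observed = [10.0, 12.0, 14.0, 16.0]
candidates = {
    "KNN": [11.0, 12.0, 13.0, 16.0],
    "RF": [10.0, 12.5, 14.0, 15.5],
}

name, best_r2, best_mse = select_best(observed, candidates)
print(f"Selected {name}: R^2 = {best_r2:.3f}, MSE = {best_mse:.3f}")
```

On a shared set of held-out values the two metrics rank candidates identically, since R² is a rescaling of MSE by the variance of the observations; the sketch therefore sorts on MSE and reports R² alongside it.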

3. Data Records and Key Findings

The complete dataset, “Industrial solid waste dataset in China 1990–2022,” provides a comprehensive inventory essential for tracking progress towards the SDGs. The data reveals that ISW generation is strongly correlated with industrial development and urbanization, with resource-rich and economically advanced areas showing the largest increases.

3.1. Aggregate ISW Generation (1990-2022)

The completed dataset offers a continuous time-series of ISW generation for all 337 Chinese cities. This record is foundational for analyzing regional disparities, evaluating the effectiveness of waste reduction programs, and modeling future waste streams under different economic and policy scenarios, thereby supporting strategic planning for SDG 11 and SDG 12.

3.2. Major ISW Subcategories (2022)

For the year 2022, the dataset provides city-level generation data for the six largest ISW subcategories, which are critical for targeted recycling and circular economy initiatives:

  1. Metallurgical slags
  2. Fly ash
  3. Furnace slags
  4. Coal gangue
  5. Tailings
  6. Desulfurization gypsum

This detailed breakdown enables a more granular approach to waste management, promoting the circular economy principles embedded in SDG 12.

4. Implications for Sustainable Development

This dataset is a powerful tool for advancing China’s sustainable development agenda. Its applications are directly relevant to multiple SDGs.

  • SDG 12 (Responsible Consumption and Production): The dataset enables precise tracking of industrial waste, which is the first step toward its environmentally sound management. It allows policymakers to identify hotspots of waste generation, assess the potential for secondary material utilization, and design effective policies to promote a circular economy.
  • SDG 11 (Sustainable Cities and Communities): By providing city-level data, the inventory helps urban planners and environmental agencies design targeted interventions, optimize waste management infrastructure, and reduce the adverse per capita environmental impact of cities.
  • SDG 9 (Industry, Innovation, and Infrastructure): The methodology itself represents an innovation in environmental data science. The dataset can help industries identify opportunities for resource efficiency and cleaner production, fostering sustainable industrialization.
  • SDG 15 (Life on Land): By facilitating better management and reduction of ISW, the data contributes to mitigating land degradation, soil pollution, and habitat loss associated with improper waste disposal.

5. Limitations and Future Directions

While this study provides a foundational dataset, it has limitations, including potential inconsistencies in the source data and the difficulty of predicting irregular changes in business activity. Future work should focus on integrating real-time data from IoT sensors and remote sensing to improve accuracy and timeliness. Advanced deep learning frameworks could be employed to better capture spatial and temporal complexities. The dataset serves as a cornerstone for future research on more effective waste management strategies and sustainable, low-carbon development pathways, and it can support the partnerships needed to achieve the goals (SDG 17).

Analysis of Sustainable Development Goals in the Article

1. Which SDGs are addressed or connected to the issues highlighted in the article?

The article on China’s industrial solid waste (ISW) connects to several Sustainable Development Goals (SDGs) by addressing the environmental and developmental challenges posed by rapid industrialization.

  1. SDG 9: Industry, Innovation, and Infrastructure

    The article’s core subject is the massive quantity of waste generated by the “rapid industrialization of China.” This directly relates to SDG 9, which aims to build resilient infrastructure, promote inclusive and sustainable industrialization, and foster innovation. The challenge of managing industrial waste is a key aspect of making industrial processes more sustainable.

  2. SDG 11: Sustainable Cities and Communities

    The research focuses on creating a “complete tempo-spatial dataset of ISW generation in China at city-level,” covering all 337 cities. It links ISW generation to “rapid urbanization” and the expansion of industrial zones in urban centers. This aligns with SDG 11, which seeks to make cities and human settlements inclusive, safe, resilient, and sustainable, particularly concerning waste management within urban environments.

  3. SDG 12: Responsible Consumption and Production

    This is the most directly relevant SDG. The article details the generation of “a colossal amount of solid waste” and discusses its management, including utilization, landfilling, and recycling. It explicitly mentions China’s goal for a “73% comprehensive utilization rate” for ISW. This directly addresses the core principles of SDG 12, which promotes sustainable consumption and production patterns, including the environmentally sound management of waste.

  4. SDG 15: Life on Land

    The article states that the accumulation of ISW has “led to a variety of environmental degradations from soil erosion, groundwater contamination, to habitat destruction and biodiversity loss across the country.” This directly connects the issue of industrial waste to the degradation of terrestrial ecosystems, which SDG 15 aims to protect, restore, and use sustainably.

2. What specific targets under those SDGs can be identified based on the article’s content?

Based on the issues discussed, several specific SDG targets can be identified:

  • Target 9.4: Upgrade infrastructure and retrofit industries to make them sustainable

    The article’s focus on managing the by-products of industrial activity implies a need to retrofit industries for sustainability. The creation of a dataset to “help researchers and policymakers recognize and address challenges brought by industrial waste” is a foundational step toward achieving this target by enabling better planning for cleaner production technologies and waste management infrastructure.

  • Target 11.6: Reduce the adverse per capita environmental impact of cities, including by paying special attention to… waste management

    The research directly supports this target by developing a city-level ISW dataset. The article notes that “ISW generation was largely driven by industrial development coupled with rapid urbanization.” By providing detailed data for 337 cities, the study enables the monitoring and management of waste, which is a critical component of reducing the environmental footprint of cities.

  • Target 12.4: Achieve the environmentally sound management of chemicals and all wastes throughout their life cycle

    The article is centered on the management of industrial solid waste. It discusses the massive accumulation of ISW stocks (60-70 Gt) that need to be “environmentally safely treated” and the environmental degradation resulting from improper management. The entire study is an effort to improve the “environmentally sound management” of this waste stream.

  • Target 12.5: Substantially reduce waste generation through prevention, reduction, recycling and reuse

    This target is explicitly addressed when the article mentions that “only 57.7% of ISW was reused by 2022,” falling short of the 73% goal. The discussion of utilizing ISW as “secondary materials” and the development of a dataset to “evaluate waste reduction programs” and “promote circular economy initiatives” directly align with the principles of reduction, recycling, and reuse.

  • Target 15.1: Ensure the conservation, restoration and sustainable use of terrestrial and inland freshwater ecosystems

    The article links improper ISW management to severe environmental consequences like “soil erosion, groundwater contamination, to habitat destruction and biodiversity loss.” By providing data to better manage this waste, the study contributes indirectly to mitigating these pressures and protecting terrestrial ecosystems, which is the goal of Target 15.1.

3. Are there any indicators mentioned or implied in the article that can be used to measure progress towards the identified targets?

Yes, the article mentions and implies several quantitative and qualitative indicators that can be used to measure progress.

  • Indicators for Targets 12.5 and 11.6: National recycling rate, tons of material recycled (Indicator 12.5.1); proportion of solid waste regularly collected and with adequate final discharge out of total waste generated, by cities (related to Indicator 11.6.1)

    • Total amount of ISW generated: The article provides specific figures, such as “some 4 Gt/a” and a cumulative stock of “60–70 Gt.” The dataset itself is designed to provide this indicator at a city level from 1990-2022.
    • Comprehensive utilization rate of ISW: The article explicitly states this indicator, noting that “only 57.7% of ISW was reused by 2022,” against a target of 73%. This is a direct measure of recycling and reuse efforts.
    • Generation of specific ISW subcategories: The dataset includes data on six major subcategories (metallurgical slags, fly ash, etc.), allowing for more detailed tracking of waste streams, which is crucial for targeted recycling and management strategies.
  • Indicator for Target 9.4: Material intensity/footprint of industrial production

    • ISW generation per unit of industrial output: Although the article does not calculate this indicator, it provides the necessary data points by stating that “the industrial value-added in China had increased nearly 200 times until 2022,” while “ISW generation, nonetheless, grew by a factor of 8.4.” These figures allow an analysis of the material efficiency of China’s industrialization, a key aspect of sustainable industry.
  • Indicator for Target 12.4: Environmentally sound management of waste

    • Proportion of ISW data missing from official statistics: The article mentions that in the original dataset, “33% are missing.” Reducing this data gap through methods like machine learning is an indicator of improved monitoring and management capacity, which is a prerequisite for environmentally sound management.
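
The comparison listed under Target 9.4 above can be made concrete with a short calculation. The sketch below treats “nearly 200 times” as exactly 200, which is an assumption, and the text quoted here does not state the base year for either growth factor; the result is therefore indicative only. The function name is hypothetical.

```python
ISW_GROWTH = 8.4            # growth factor of ISW generation quoted in the article
VALUE_ADDED_GROWTH = 200.0  # "nearly 200 times"; assumed to be exactly 200 here


def relative_intensity(waste_growth: float, output_growth: float) -> float:
    """ISW per unit of industrial value-added, relative to the base period (base = 1.0)."""
    if output_growth <= 0:
        raise ValueError("output growth factor must be positive")
    return waste_growth / output_growth


ratio = relative_intensity(ISW_GROWTH, VALUE_ADDED_GROWTH)
decline_pct = (1.0 - ratio) * 100.0
print(f"ISW intensity relative to base period: {ratio:.3f}")
print(f"Implied decline in ISW per unit of value-added: {decline_pct:.1f}%")
```

Under these assumptions, ISW generated per unit of industrial value-added in 2022 is about 4% of its base-period level, a decline of roughly 96%.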

4. Table of SDGs, Targets, and Indicators

SDG 9: Industry, Innovation, and Infrastructure
Target 9.4: By 2030, upgrade infrastructure and retrofit industries to make them sustainable, with all countries taking action in accordance with their respective capabilities.
Indicators identified in the article:
  • Ratio of ISW generation growth (factor of 8.4) to industrial value-added growth (factor of 200), implying a measure of material efficiency.
  • The dataset itself, which enables tracking of industrial waste as a measure of industrial environmental performance.

SDG 11: Sustainable Cities and Communities
Target 11.6: By 2030, reduce the adverse per capita environmental impact of cities, including by paying special attention to air quality and municipal and other waste management.
Indicators identified in the article:
  • City-level ISW generation data for 337 cities (1990-2022).
  • Data on the increase rate of ISW in cities, linked to urbanization.

SDG 12: Responsible Consumption and Production
Target 12.4: By 2020, achieve the environmentally sound management of chemicals and all wastes throughout their life cycle… and significantly reduce their release to air, water and soil.
Target 12.5: By 2030, substantially reduce waste generation through prevention, reduction, recycling and reuse.
Indicators identified in the article:
  • Total annual generation of ISW (approx. 4 Gt/a).
  • Cumulative ISW stocks requiring treatment (60-70 Gt).
  • Comprehensive utilization (recycling/reuse) rate of ISW (57.7% in 2022).
  • National goal for utilization rate (73%).
  • Generation data for six major subcategories of ISW (metallurgical slags, fly ash, etc.).
  • Proportion of missing data in official statistics (33%), indicating monitoring capacity.

SDG 15: Life on Land
Target 15.1: By 2020, ensure the conservation, restoration and sustainable use of terrestrial and inland freshwater ecosystems and their services.
Indicators identified in the article:
  • Qualitative mention of environmental degradation caused by ISW: “soil erosion, groundwater contamination, to habitat destruction and biodiversity loss.” The ISW generation data serves as an indirect indicator of pressure on these ecosystems.

Source: nature.com