To push green AI forward, the industry must start at source

Urtė Karklienė, Sustainability Manager at Oxylabs, explores how AI’s environmental impact can be minimised without sacrificing model performance.

It’s hard to point to any human advance that didn’t have a cost to nature. From the smoke-filled streets of the first industrial cities to the Great Pacific Garbage Patch, history has many examples of ecological catastrophes. While a few years ago everyone’s eyes were on energy-hungry blockchain mining rigs, today no one can ignore the growing impact AI has on the planet.

Deloitte predicts that the amount of electricity consumed by data centres worldwide will reach over 1,000 terawatt-hours (TWh) by 2030 and 2,000 TWh by 2050, making up nearly 3% of the world’s total electricity consumption.

The culprit? As you can probably guess, Large Language Models (LLMs) require enormous computing power for both training and inference. And while the tools for reducing this impact exist, implementing them is less than straightforward.

The root cause(s) of AI-led environmental impact

While the general public is probably troubled by headlines like “one ChatGPT request consumes 10 times more energy than one Google search” or “generating one AI image costs as much in energy as half a smartphone charge”, the consumer-facing side of AI models is just the tip of the iceberg. If reducing AI-related energy consumption is a pressing issue, we should start at the source: the training stage and the data it consumes.

The massive energy cost stems largely from the fact that the majority of AI models on the market today (including those by OpenAI and DeepMind) are stochastic in nature. In other words, the results they generate are based on complex algorithmic calculations that require a lot of power to run. Couple that with these models’ high appetite for data during training, and you can see how the toll on the environment compounds. To put things in perspective, training GPT-3 produced 552 metric tons of CO₂, equivalent to driving a car for over 2 million kilometres.

The winding road to sustainable AI

Environmental shifts that go beyond buying carbon credits or planting a few trees are hard for any industry. In the case of AI developers, the external pressure to prioritise green solutions comes from investors and governments.

For example, a 2023 survey by McKinsey shows that 35% of investors would consider decreasing their exposure to tech companies due to Environmental, Social, and Governance (ESG) concerns, with the E part playing an important role. Does this mean that performance must be sacrificed in return? Not necessarily, as some viable solutions offer a solid compromise between the two.

First, GPU efficiency must be considered a hygiene factor in the AI industry. This does not mean that companies need to get rid of their existing rigs. A research paper presented at the USENIX Symposium on Networked Systems Design and Implementation explored how more deliberate use of GPU power-limit caps can reduce energy consumption by up to 75%.
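The intuition behind power capping can be sketched with some back-of-the-envelope arithmetic (the figures below are illustrative assumptions, not measurements from the paper): capping a GPU’s power draw typically slows training less than proportionally, so total energy, which is power multiplied by time, drops.

```python
def energy_ratio(power_frac: float, slowdown: float) -> float:
    """Energy of a power-capped run relative to an uncapped one.

    power_frac: capped power as a fraction of the default limit
    slowdown:   capped runtime relative to the uncapped runtime
    """
    return power_frac * slowdown

# Hypothetical example: capping power to 60% of the default limit
# while training slows down by only 10%.
saving = 1 - energy_ratio(0.6, 1.1)
print(f"{saving:.0%} less energy")  # prints "34% less energy"
```

The same arithmetic also shows when capping stops paying off: if the slowdown grows faster than the power saving, the capped run can end up consuming more energy overall.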

Alongside GPU optimisation, there is another low-hanging fruit that too few AI companies consider: a more efficient approach to data collection and management. Parsing terabytes of redundant, low-quality data takes its own toll on energy consumption.

Unfortunately, ‘redundant, low-quality data’ describes much of the poorly sourced data pulled from the web. When done right, web scraping can provide clean and efficient datasets, but not all providers of such services prioritise quality over sheer quantity. As the digital economy expands its data footprint by 40% every year, this problem will only become more acute.

Given that the data generated by today’s increasingly digitised economy is predicted to reach a truly unfathomable 163 trillion gigabytes in 2025, optimising datasets coming from the web will be more important than ever. On the positive side, new algorithmic approaches, such as probabilistic programming, allow for the use of smaller datasets instead of relying on massive-scale data processing.
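A minimal illustration of the probabilistic idea (a conjugate Beta–Binomial model, chosen here purely for simplicity rather than taken from any particular production system): a Bayesian update turns a small sample into a point estimate with quantified uncertainty, rather than demanding a massive dataset before the estimate can be trusted.

```python
def beta_posterior(successes: int, trials: int,
                   alpha: float = 1.0, beta: float = 1.0):
    """Conjugate update of a Beta(alpha, beta) prior after observing
    `successes` out of `trials` Bernoulli outcomes."""
    return alpha + successes, beta + (trials - successes)

# Hypothetical scenario: 18 successes in just 25 trials already
# yields a usable estimate plus an honest measure of uncertainty.
a, b = beta_posterior(18, 25)
posterior_mean = a / (a + b)                       # about 0.70
posterior_var = a * b / ((a + b) ** 2 * (a + b + 1))
```

The variance shrinks as more observations arrive, making explicit how much (or how little) extra data is actually needed, which is exactly the lever that lets probabilistic methods avoid massive-scale data processing.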

Just as data quality varies, so does the environmental impact of the different energy sources powering AI businesses. While we are yet to see the consequences of the US withdrawing from the Paris Agreement in 2025, plenty of countries on the other side of the pond prioritise green energy. Relocating data centres to places with abundant (and potentially cheaper) green energy can make a massive difference. Take Google, whose Finland-based data centre runs on 97% carbon-free energy, a feat that would be impossible in countries where fossil fuels are still king.

Finally, federated learning emerges as a viable alternative to centralised AI training. Unlike more traditional machine learning, federated learning distributes the computational burden across multiple sources — from servers to smartphones, desktop computers, and so on. A study by the University of Cambridge showed that a federated learning setup, if configured properly, can emit significantly less CO₂ than centralised training methods.

Good intentions & declarations are not enough

Several technical solutions for greener AI development have been outlined above, but the path to implementing them is anything but straightforward. In an industry where the fear of a “bursting AI bubble” is on everyone’s mind, going green might not be seen as a priority.

First, there’s the harsh reality of diminishing returns. As AI systems grow more complex, their ROI is, unfortunately, not keeping pace with expectations, and most AI companies are still in the red. For instance, OpenAI accumulated approximately $8 billion in expenses in 2024, reportedly putting it on the brink of bankruptcy that July. Recent projections suggest that its losses could reach $14 billion by 2026. For reference, the company’s revenue for 2024 stood at just $3.7 billion. Looking at other major players, the situation is not much brighter.

Another strong barrier is the fear of falling behind, which is felt acutely in the West due to China’s increasingly aggressive push into AI development. Thus, prioritising environmental concerns might be seen as a business disadvantage.

Going from performance-first “Red AI” to Green AI might also be technically challenging. Balancing performance and energy efficiency takes more than just moving servers to Finland. Ensuring high performance is still key, unless you want a model that hallucinates or works slower than its competitors. Besides, the technical talent needed to implement probabilistic algorithms and optimise data management is hard to find and costly.

Finally, the industry has yet to adopt a standardised way of defining and measuring AI sustainability. While initiatives like MLPerf Green have made some progress in setting common benchmarks, without consensus among the major players, there is still a lot of room for greenwashing.
