The dirty secret of Generative AI – massive carbon footprint

Every new chatbot and image generator requires a lot of electricity, which means the technology may be responsible for a growing share of planet-warming emissions

The race to build high-performance, AI-powered search engines is likely to require a dramatic rise in computing power, and with it a massive increase in the amount of energy that tech companies require and the amount of carbon they emit. The creation of every new chatbot and image generator requires a lot of electricity, which means the technology may be responsible for a massive and growing amount of planet-warming carbon emissions.

Researchers estimate that the training of GPT-3, which ChatGPT is partly based on, consumed 1,287 MWh and led to emissions of more than 550 tons of carbon dioxide equivalent, the same amount as a single person taking 550 round trips between New York and San Francisco. Data centres already account for around one percent of the world’s greenhouse gas emissions, according to the International Energy Agency. That share is expected to rise as demand for Cloud computing increases.
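As a rough sanity check on the GPT-3 figures above, the short Python sketch below derives the carbon intensity they imply and the emissions per round trip. It assumes that "tons" means metric tonnes and treats "more than 550 tons" as exactly 550, so the result is a lower bound. The function and variable names are illustrative only.

```python
# Figures quoted in the article for GPT-3 training.
TRAINING_ENERGY_MWH = 1_287
TRAINING_EMISSIONS_TONS = 550  # "more than 550 tons", so treat as a lower bound
ROUND_TRIPS = 550              # New York <-> San Francisco

# Assumption: "tons" are metric tonnes (1,000 kg each).
KG_PER_TON = 1_000
KWH_PER_MWH = 1_000


def implied_carbon_intensity(emissions_tons: float, energy_mwh: float) -> float:
    """Return the implied emissions in kg CO2e per kWh of electricity."""
    return (emissions_tons * KG_PER_TON) / (energy_mwh * KWH_PER_MWH)


intensity = implied_carbon_intensity(TRAINING_EMISSIONS_TONS, TRAINING_ENERGY_MWH)
tons_per_round_trip = TRAINING_EMISSIONS_TONS / ROUND_TRIPS

print(f"Implied carbon intensity: {intensity:.3f} kg CO2e per kWh")
print(f"Implied emissions per round trip: {tons_per_round_trip:.1f} tons CO2e")
```

Under these assumptions the quoted numbers work out to roughly 0.43 kg of CO2e per kWh and about one ton of CO2e per round trip.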

Microsoft, Alphabet’s Google and ChatGPT maker OpenAI use Cloud computing that relies on thousands of chips inside servers in massive data centres across the globe to train AI algorithms called models, analysing data to help them “learn” to perform tasks. The success of ChatGPT has other companies racing to release their own rival AI systems and chatbots, or building products that use large AI models to deliver features to everyone from Instacart shoppers to Snap users.

The compute demand of neural networks is insatiable. The larger the network, the better the results and the more problems you can solve. Energy usage is proportional to the size of the network. Energy-efficient inference is therefore essential to enable the adoption of more sophisticated neural networks and enhanced use cases, such as real-time voice and vision applications.

Power hungry data centres

Hyperscaler companies are trying to get better and more accurate voice recognition, speech recognition and recommendation engines. The higher the accuracy they can achieve, the more clients they can serve and the more profit they can generate. Look at data centre training and inference of these very large NLP models: that is where a lot of power is consumed.

A growing number of smart edge devices is contributing to the problem as well. Billions of devices make up the IoT and, at some point in the not-too-distant future, they are going to use more power than we generate in the world.

In the past, the tech world relied on semiconductor scaling to make things more energy-efficient. But process technology is approaching the limits of physics. Transistor width is somewhere between 10 and 20 lattice constants of silicon dioxide. We have more wires with stray capacitance, and a lot of energy is lost charging and discharging these wires.

Generative AI energy requirements increased 18,000X

Generative AI, or for that matter any AI, involves a lot of training, which consumes a lot of power. A lot of energy is consumed when you’re iterating over the same dataset multiple times, and the amount is increasing rapidly. Two years back, training some of the transformer models took in the range of 27 kilowatt-hours. For today’s transformers, it is more than half a million kilowatt-hours. The number of parameters went from maybe 50 million to 200 million, a fourfold increase, but the amount of energy went up over 18,000X. At the end of the day, what it boils down to is the carbon footprint and how many pounds of CO₂ this creates.
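The arithmetic behind the 18,000X figure can be checked in a few lines of Python. This is only a sketch using the round numbers quoted above: "more than half a million" is taken as exactly 500,000 kWh, and "maybe 50 million" as exactly 50 million parameters. The variable names are illustrative.

```python
# Round figures quoted in the article (approximations, not measurements).
ENERGY_THEN_KWH = 27        # training energy "two years back"
ENERGY_NOW_KWH = 500_000    # "more than half a million", so a lower bound
PARAMS_THEN = 50_000_000    # "maybe 50 million"
PARAMS_NOW = 200_000_000

# How much did each quantity grow?
energy_growth = ENERGY_NOW_KWH / ENERGY_THEN_KWH
param_growth = PARAMS_NOW / PARAMS_THEN

# If energy scaled linearly with parameter count, this ratio would be 1.
energy_per_param_growth = energy_growth / param_growth

print(f"Energy growth:            {energy_growth:,.0f}x")
print(f"Parameter growth:         {param_growth:.0f}x")
print(f"Energy per parameter:     {energy_per_param_growth:,.0f}x")
```

With these inputs, energy grows by about 18,500 times while parameters grow four times, so the energy spent per parameter rises by more than 4,600 times. That gap reflects the repeated passes over the data described above, not model size alone.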

Moving an application from the Cloud to the edge may be done for many different reasons. We are seeing a desire for more AI at the edge, and for more mission-critical applications at the edge, rather than AI as a stamp on the outside of the box. The AI is actually doing something useful in the device, rather than just being there.

Models are getting larger in an attempt to gain more accuracy, but that trend must stop because the amount of power that it is consuming is going up disproportionately. While the Cloud can afford that today because of its business model, the edge cannot. And as more companies invest in edge applications, we can expect to see a greater regard for energy optimisation. Some companies are looking at reductions of 100x in the next 5 years, but that is nowhere near enough to stop this trend.
