Interview with Dr Sasha Luccioni and Irene Solaiman
Open source helps with the recycling of models. Instead of every team training a transformer model from scratch, a model can be trained once and reused: all the pretrained models on Hugging Face can be fine-tuned for specific use cases. That’s definitely more environmentally friendly than creating a model from scratch. Several years ago, the main approach was to accumulate as much data as possible to train a model, which would then not be shared. Now, data-intensive models are shared after training, and people can reuse and fine-tune them for their particular use cases.
With the size of transformer and AI models growing bigger and bigger, the entry barrier for joining the AI community is becoming correspondingly high, especially for countries that don’t have access to the extremely powerful computers being used to create these models. Hugging Face has several offerings for such cases – for example, the ability to query a large language model through an API, so you don’t need to run it on your own computer. This makes such models more accessible.
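The hosted-inference idea described above amounts to a plain HTTP call: the model runs on remote hardware and you only send a prompt. A minimal sketch follows; the model name and token are placeholders, and the request shape follows the Hugging Face Inference API's text-generation endpoint.

```python
# Minimal sketch of querying a hosted model instead of running it locally.
# The model name and token below are placeholders; substitute your own.
API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"

def build_query(prompt: str, max_new_tokens: int = 50):
    """Assemble the headers and JSON payload for a text-generation request."""
    headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # placeholder token
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return headers, payload

# Actually sending the request needs network access and a valid token:
# import requests
# headers, payload = build_query("The most widely spoken language is")
# print(requests.post(API_URL, headers=headers, json=payload).json())
```

Because all the heavy computation happens server-side, this runs from any machine with an internet connection – which is exactly what lowers the entry barrier.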
The regulations pertaining to AI that have been issued in recent years haven’t focused particularly on sustainability. Measuring carbon emissions has likewise not been prioritized, and there aren’t many tools available to measure them adequately. We find ourselves in a dilemma: There is an urgent need for policymakers to up the pressure, but to do so, they need emissions data. Current regulations, however, do not require the deployment of tools for measuring emissions – which means that policymakers don’t have the data they need.
The European Union’s Artificial Intelligence Act is one of the most robust and prominent approaches to regulating AI in the public’s interest. Many of these policies and regulations necessarily come from countries with higher gross domestic products, such as the Algorithmic Accountability Act in the United States and the AI and Data Act in Canada. The Algorithmic Accountability Act does not explicitly include sustainability, but I appreciate the emphasis it places on impact assessments. Decision-makers need more guidance on the impact of AI systems, including CO2 emissions. Such information will give them a greater understanding of the importance of developing appropriate tools.
These models are trained on data scraped from all over the internet. By avoiding a specific and limited data source, they’re supposed to be relatively impartial. But when you use them in a downstream AI application, outputs are generated that you may not have expected. To figure out where potential biases could emerge, you have to observe AI applications making decisions or predictions in different situations. We’ve been working on ways of prompting the models with bits of text that they have to complete – based on a pronoun, for example, as in “She should work as” and “He should work as.” If a model continues, “She should work as a nurse” and “He should work as a computer scientist,” you can immediately see how biased, how toxic, it is. Such negative stereotypes are one example of system bias, which we can document for every AI model by creating a report card.
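The pronoun-prompt probe described above can be sketched as a small harness. The `complete` callable below is a stand-in for a real model call (in practice, a text-generation pipeline); the stub reproduces the stereotyped completions mentioned in the interview purely for illustration.

```python
def probe_occupation_bias(complete, template="{} should work as"):
    """Compare a model's completions of the same template across pronouns.

    `complete` is any callable mapping a prompt string to a completion
    string -- in practice a language-model call, here just a stub.
    """
    prompts = {pronoun: template.format(pronoun) for pronoun in ("She", "He")}
    return {pronoun: complete(prompt) for pronoun, prompt in prompts.items()}

# Stub standing in for a real model, reproducing the stereotyped
# completions described in the interview:
def stub_model(prompt):
    return "a nurse" if prompt.startswith("She") else "a computer scientist"

completions = probe_occupation_bias(stub_model)
# Diverging completions of an otherwise identical prompt signal bias:
biased = completions["She"] != completions["He"]
```

Running many such templates and aggregating the divergences is what turns individual probes into the kind of per-model report card mentioned above.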
Most of the emissions numbers we have are from training. We don’t have many numbers from deployment. A lot of people are interested in how much CO2 will be emitted through deployment, but that’s extremely complicated, because it depends on a number of factors, including the hardware you’re using and where the computing is being done. Without knowing those factors, it’s impossible to provide meaningful emissions figures. To do so, you would need to evaluate different architectures, different models, different GPUs and so on. Still, a lot of people would find such information extremely useful.
If people start using tools to measure the emissions of their ML models and disclose that information, we can start making decisions about AI models based on facts and figures. Tools like CodeCarbon calculate a model’s carbon footprint in real time. It’s a program that runs in parallel to any code and estimates the carbon emissions at the end. We also run a website that lets you enter information such as training hours and the type of hardware used; it then provides an estimate of the system’s carbon footprint. It is less precise than CodeCarbon, but it still gives you an idea.
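The website-style estimate described above boils down to simple arithmetic: the energy drawn by the hardware over the training time, multiplied by the carbon intensity of the local grid. A minimal sketch of that calculation follows; the numbers in the example are illustrative placeholders, not measured values.

```python
def estimate_co2_kg(training_hours, hardware_power_w,
                    carbon_intensity_g_per_kwh, pue=1.0):
    """Rough carbon-footprint estimate for a training run.

    training_hours             -- wall-clock hours of training
    hardware_power_w           -- average power draw of the hardware, in watts
    carbon_intensity_g_per_kwh -- grams of CO2 per kWh of the local grid
    pue                        -- data-center power usage effectiveness (>= 1)
    """
    energy_kwh = hardware_power_w * training_hours / 1000 * pue
    return energy_kwh * carbon_intensity_g_per_kwh / 1000  # kilograms of CO2

# Illustrative example: 100 hours on a ~300 W GPU, a grid at ~400 gCO2/kWh,
# and a data center with PUE 1.5:
print(estimate_co2_kg(100, 300, 400, pue=1.5))  # → 18.0 kg of CO2
```

This also makes concrete why deployment emissions are hard to state in advance: the same model yields very different figures depending on the hardware's power draw and the grid it runs on.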
I think that bottom-up approaches work, especially in terms of research. At conferences, we are constantly asked for more information. But there’s the issue of reproducibility: A lot of research can’t be reproduced because it is highly contingent upon specific factors. This is something the AI community has been trying to tackle by implementing certain guidelines. If you submit a paper, you have to disclose parameters X, Y and Z, and you have to make your code and data freely available. For sustainability, there need to be similar measures in place pertaining to efficiency and accuracy. Only then can we compare different models. We have to provide a technical procedure that a broader community can adopt.
A lot of policy conversations I’ve been involved in have focused on lowering the regulatory burden on small and medium-sized enterprises, since these companies have fewer resources than Big Tech. Since smaller companies are less likely to have the infrastructure for analyzing their carbon emissions, we can’t expect them to be responsible for monitoring them.
Something we’ve been working on is documentation. We need more guidance from policy institutions on what, specifically, would be helpful to report over and above the information included on model cards. A lot of governments have been asking the industry for more information about models without specifying what aspects of AI sustainability the industry and developers should report. We also need to know how to report that information in ways that are understandable to high-level policymakers, who may not have a technical background. Developers definitely need more information if we want them to think about how their systems can become more sustainable.
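Model cards on the Hugging Face Hub already support a structured emissions field in their YAML metadata, which hints at what standardized reporting could look like. A minimal, hypothetical entry might read as follows – all values are placeholders:

```yaml
# Hypothetical model-card metadata block (all values are placeholders)
co2_eq_emissions:
  emissions: 12000            # grams of CO2-equivalent
  source: CodeCarbon
  training_type: fine-tuning
  geographical_location: Frankfurt, Germany
  hardware_used: 1 x NVIDIA A100
```

Because the field is machine-readable, emissions figures reported this way can be aggregated and compared across models – exactly the kind of data policymakers currently lack.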
When we look at the infrastructure, there are both positive and negative developments. Hardware development is making rapid progress when it comes to computing efficiency. If you compare a GPU from this year to one built two or three years ago, there’s a significant difference. It’s literally ten times faster. But with this positive development comes a negative one, because that efficiency leap means that people are doing more computing. It’s a classic rebound effect. If we kept the size of our models and the amount of computation needed at a constant level, we would definitely be going in the right direction. But since both are growing so fast, it’s hard to say where we might end up. I do see cloud providers making use of carbon offsets, and some are switching to renewable energy sources. On the other hand, though, the concept of “the bigger the better” in AI modeling is getting out of hand.