Computational Power and AI

A longread from the AI Now Institute looks at the sustainability of the sudden ‘AI chatbotification’ of the entire world of technology:

Large-scale compute is also environmentally unsustainable: chips are highly toxic to produce and require an enormous amount of energy to manufacture: for example, TSMC on its own accounts for 4.8 percent of Taiwan’s national energy consumption, more than the entire capital city of Taipei. Running data centers is likewise environmentally very costly: estimates equate every prompt run on ChatGPT to pouring out an entire bottle of water.

Could future research directions lead to smaller models? To answer this question, it is helpful to look at why larger models took hold in the first place—and who benefits from perpetuating them. Sara Hooker’s concept of the hardware lottery describes the phenomenon where a research idea wins because it is the most suited to the available hardware and software. In this sense, the hardware and software determine the research direction, not the other way around. Deep neural networks at first represented an idea that was too far ahead of its time in hardware terms, and was thus long ignored. It was only when the research on neural networks was combined with massive datasets scraped from the web, the computational resources accrued by Big Tech firms, and the incentive structures introduced by commercial surveillance that we saw the explosion of interest in building artificial intelligence systems. Hooker predicts that due to increasing specialization in computing, the cost of straying from the mainstream, hardware-compatible set of ideas will only increase over time.

It’s well written, well researched – and definitely worth the read.