Cloud AI's energy problem has a solution and it's closer than you think
As AI adoption accelerates, organizations face growing challenges around energy consumption and sustainability. Today's dominant deployment model relies on cloud computing, and it places enormous stress on electrical grids at exactly the moment we should be "electrifying everything" to move off fossil fuels. Technical leaders are responding by evaluating alternative architectures that might better contain AI's growing energy appetite.
Data centers currently consume an estimated 1 to 2 percent of global electricity, and according to recent Goldman Sachs Research, that demand will increase by a staggering 165 percent by 2030. Much of this growth will be driven by AI workloads, in particular by inference on large neural networks, the machine-learning workhorses that do most of the heavy lifting. Meanwhile, organizations racing to implement AI often have little visibility into the environmental costs of their new capabilities.
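To put those percentages in rough absolute terms, here is a back-of-envelope calculation. The global consumption figure of about 30,000 TWh per year and the 1.5 percent midpoint are assumptions for illustration only:

```python
# Back-of-envelope: what a 165% rise in data center demand implies.
# ASSUMPTION: ~30,000 TWh/year of global electricity consumption and a
# current data center share of 1.5% (midpoint of the 1-2% estimate).
GLOBAL_TWH = 30_000
CURRENT_SHARE = 0.015
GROWTH = 1.65  # the projected 165% increase by 2030

current_demand = GLOBAL_TWH * CURRENT_SHARE       # ~450 TWh/year today
projected_demand = current_demand * (1 + GROWTH)  # ~1,190 TWh/year by 2030

print(f"Data centers today:  ~{current_demand:,.0f} TWh/year")
print(f"Projected for 2030:  ~{projected_demand:,.0f} TWh/year")
```

Even as a rough approximation, the jump from hundreds of terawatt-hours to well over a thousand per year shows why deployment architecture now matters.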
The hidden energy cost of cloud-based AI
Cloud-based AI offers real advantages in scalability and resource pooling, but it also carries efficiency penalties in power consumption and data transfer that grow as deployments scale. Several inefficiencies stand out:
- Data movement costs: Moving data to the cloud and back consumes substantial power. Simply transmitting data to the cloud can take roughly three times as much energy as performing the computation near where the data was generated, for example on the edge (see the sketch after this list).
- Always-on infrastructure: Cloud-based AI draws significant power even when it is doing little work, like a light that is never switched off. And because the cloud-first model provisions for peak loads, the infrastructure must stay ready to handle far more traffic than it typically sees.
- GPUs versus edge inference devices: Cloud inference typically runs on 300-700W GPUs, while much of the AI work we do on the edge can run on devices drawing 5-50W, an order-of-magnitude difference before the cloud's data-transfer and idle overheads are even counted (again, see the sketch below).
- Cooling and power consumption: Data center energy consumption is not just the result of power-hungry CPUs and GPUs. Concentrating compute at this density requires large volumes of chilled air, banks of fans, and refrigerant-based cooling systems that push the overall power numbers higher still.
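The sketch below, referenced in the bullets above, turns those figures into a rough per-request comparison. Every constant is an illustrative assumption chosen to mirror the claims above (a roughly 3x transfer penalty, 300-700W cloud GPUs versus 5-50W edge devices), not a measurement:

```python
# Illustrative per-request energy comparison. All constants below are
# assumptions that encode the article's qualitative claims, not data.
CLOUD_GPU_WATTS = 400      # mid-range of the 300-700W cloud GPU figure
EDGE_DEVICE_WATTS = 15     # mid-range of the 5-50W edge device figure
INFERENCE_SECONDS = 0.05   # hypothetical compute time per request
TRANSFER_MULTIPLIER = 3.0  # "~3x the energy of computing at the edge"

edge_joules = EDGE_DEVICE_WATTS * INFERENCE_SECONDS  # ~0.75 J on-device
transfer_joules = TRANSFER_MULTIPLIER * edge_joules  # ~2.25 J to move the data
cloud_joules = CLOUD_GPU_WATTS * INFERENCE_SECONDS + transfer_joules  # ~22 J

print(f"edge:  {edge_joules:.2f} J/request")
print(f"cloud: {cloud_joules:.2f} J/request "
      f"(~{cloud_joules / edge_joules:.0f}x more)")
```

Under these assumed numbers the cloud path costs roughly thirty times the energy per request; the real ratio depends entirely on the workload, but the structure of the overhead is the same.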
The edge computing alternative
One alternative, edge computing, refers to processing data on or near the device where it is created, and existing research indicates it can lower energy consumption in certain deployment scenarios. Other approaches, such as maximizing data center efficiency, designing algorithms for lower power draw, and hybrid architectures, have also been explored.
By analyzing data at the collection point, energy-intensive transfers are largely eliminated, while modern frameworks enable dynamic resource management that powers down processing units when they are not actively generating outputs. This distributed approach, of which edge AI is but one example, spreads computation across many smaller devices rather than concentrating it in a few massive data centers. It does not even require all devices to be on the same network all the time; edge AI devices can serve as network nodes, communicating with other nearby networked devices. Up through 2020, being "in the cloud" primarily meant having access to the vast resources of a few enormous data centers. But as noted above, that is neither the only way to do distributed computing nor, through an energy-efficiency lens, necessarily the best one.
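As a sketch of that power-down-when-idle pattern: an event-driven loop that wakes the accelerator only when a sample arrives. The `wake_accelerator`, `sleep_accelerator`, and `run_inference` functions are hypothetical placeholders, since real power-management hooks are platform-specific:

```python
import queue

# Hypothetical event-driven edge loop: the accelerator draws power only
# while a sample is actually being processed. The three helpers below
# are placeholder names; real power-management APIs vary by platform.

samples: queue.Queue = queue.Queue()

def wake_accelerator() -> None:
    ...  # platform-specific: bring the NPU/GPU out of its low-power state

def sleep_accelerator() -> None:
    ...  # platform-specific: return the accelerator to low-power idle

def run_inference(sample: bytes) -> object:
    ...  # run the on-device model on the sample

def serve_forever() -> None:
    while True:
        sample = samples.get()      # block while idle; near-zero draw
        wake_accelerator()
        try:
            result = run_inference(sample)  # ship `result` downstream
        finally:
            sleep_accelerator()     # drop back to idle immediately
```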
Edge computing promises ecological advantages, but deployment has its drawbacks: hardware costs can be substantial, complex models may exceed on-device processing limits, and a larger fleet of networked devices expands the security surface. Businesses need to weigh their use cases and decide where edge computing offers genuine benefits over centralized processing.
The distributed processing continuum
Industry experts now advocate a strategic, multi-tiered approach to AI processing, viewing computing as a continuum rather than a binary cloud-or-edge choice. This approach positions computational resources all along the spectrum from the data source to centralized infrastructure, optimizing for energy efficiency, latency, and performance requirements. The placement can even improve over time, because the AI models being deployed can themselves inform the optimization and suggest better placement strategies in the future. A simple placement heuristic is sketched after the list below.
The layered continuum of computational resources usually comprises four key levels:
- Device Edge: Direct on-device processing for immediate analysis
- Local Edge: Processing at localized aggregation points for multi-sensor fusion
- Regional Edge: Mid-level analysis and aggregation at regional centers
- Cloud: Reserved for complex training, large-scale analysis, and model development
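To make the continuum concrete, here is a toy placement heuristic in Python. The tier names follow the list above, but every threshold is an illustrative assumption rather than a recommendation:

```python
from dataclasses import dataclass

# Toy placement heuristic for the four-tier continuum above.
# All thresholds are made-up illustrative values, not guidance.

@dataclass
class Workload:
    latency_budget_ms: float  # how quickly a result is needed
    model_size_mb: float      # rough proxy for compute demand
    is_training: bool         # training vs. inference

def place(w: Workload) -> str:
    if w.is_training or w.model_size_mb > 10_000:
        return "cloud"          # complex training, large-scale analysis
    if w.latency_budget_ms < 10 and w.model_size_mb < 50:
        return "device edge"    # immediate on-device analysis
    if w.latency_budget_ms < 50:
        return "local edge"     # multi-sensor fusion at aggregation points
    return "regional edge"      # mid-level analysis and aggregation

print(place(Workload(latency_budget_ms=5, model_size_mb=20, is_training=False)))
# -> "device edge"
```

A production version would also weigh privacy, bandwidth cost, and device availability, but even this crude routing shows how the tiers divide the work.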
Implementation considerations
When organizations aim to achieve energy efficiency via edge AI, they must address several strategic elements:
- Choose the Right Hardware. Hardware selection directly shapes efficiency, and energy profiles vary widely from platform to platform.
- Model Optimization Is Key. This means lowering computational demand by optimizing models to make maximum use of the target hardware; a minimal quantization sketch follows this list.
- Don't Overlook Workload Distribution. Thoughtful workload distribution is just as important as choosing the right models and hardware. Not every AI task is suitable for the edge; in fact, most are not.
- Implement Comprehensive Energy Monitoring Systems. Organizations need robust systems that provide a clear view of what is happening across the architecture, so they can identify and home in on optimization opportunities.
- Promote a Hardware-Aware Development Culture. Finally, organizations must shift their development approaches; cloud-trained models cannot simply be retrofitted to run efficiently on the edge.
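As one concrete example of the model-optimization point above, post-training dynamic quantization in PyTorch converts linear-layer weights to 8-bit integers, shrinking the model and typically cutting the energy spent per inference. This is a minimal sketch using a toy model, not a full optimization pipeline:

```python
import torch
import torch.nn as nn

# Minimal sketch: post-training dynamic quantization with PyTorch.
# A toy model stands in for a real cloud-trained network.
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()

# Convert Linear weights to int8; activations are quantized on the fly.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x).shape)  # same interface, smaller and cheaper to run
```

Static quantization, pruning, and distillation go further, but dynamic quantization is often the lowest-effort first step because it requires no retraining.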
The business case for edge AI efficiency
The business case for edge AI extends well beyond environmental considerations, offering several strategic benefits. Lower energy consumption means lower operating expenses, which matters more as energy costs keep rising. From a system design perspective, reduced dependence on network connectivity makes the whole system more resilient, able to keep functioning through network congestion or outages. The business case also points to performance gains: removing the network round-trip removes the network as a bottleneck, and it is in this performance arena where edge AI can compete most profitably with cloud AI.
Looking forward
As artificial intelligence becomes more pervasive, its energy consumption will be a central concern for technical leaders. Edge AI offers a sustainable alternative for deploying and scaling large systems. The transition is not just a matter of technological preference; it is a choice organizations make when they want to gain a competitive advantage through large-scale AI deployments.
As the AI space continues to evolve, organizations will gain value by considering where the various processing models along the cloud-to-edge continuum make sense for their particular sustainability, performance, and business needs. An astute, use-case-specific viewpoint can help technical leaders make well-considered AI infrastructure investment choices.
While many still use the edge for simple workloads and the cloud for everything else, more sophisticated reasons for running AI workloads on edge devices are emerging: greater energy efficiency and faster, more responsive systems.