Deploying the future
The dual facets of AI – training and inference – are driving different demands on data center architecture.
Training AI, particularly developing Large Language Models (LLMs), requires substantial GPU clusters and immense computing power.
This means that data centers involved in AI training need to be strategically located in areas with abundant power and cooling resources, ideally from sustainable sources.
On the other hand, AI inference, which involves processing user input using the patterns learned during training, requires minimal latency to ensure a smooth user experience. As a result, compute resources for inference are typically positioned in urban areas, closer to end users, to serve proximity-driven applications.
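To see why proximity matters, consider propagation delay alone. Light in optical fiber travels at roughly 200,000 km/s (about two-thirds of its speed in a vacuum), so every 100 km of fiber adds about 1 ms of round-trip delay before any queuing or processing. A quick back-of-the-envelope sketch (the site distances are illustrative assumptions, not figures from this article):

```python
# Fiber propagation delay only - ignores queuing, processing and inference time.
# Light in fiber travels at roughly 200,000 km/s (~2/3 of c).
FIBER_KM_PER_S = 200_000

def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay over a fiber path of the given length."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000

# Illustrative distances from an end user to each candidate site.
for site, km in [("metro edge DC", 50), ("regional DC", 500), ("remote hyperscale DC", 2500)]:
    print(f"{site:>20}: {round_trip_ms(km):5.1f} ms round trip")
```

Even before a model does any work, a distant site spends a meaningful slice of an interactive latency budget on the speed of light alone.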
Ensuring AI can be effective
Training and inference workloads impact networks differently. Training is centralized in areas with low-cost energy; inference pushes greater traffic demands to the edge of the network. For AI to be most effective, network designs must account for both.
Increasingly, enterprises require more, smaller data centers at the edge – as close to the end user as possible to reduce latency. Here are just a few examples of why:

Manufacturing
Microsecond-level decision-making in a manufacturing environment can help avoid machine downtime and revenue loss.

Smart cities
Modern cities leverage AI and 4K cameras to monitor traffic patterns and to detect suspicious activity and crime.

Automated cars
This workload will not run in a data center, but on a chip within the vehicle itself. However, the vehicle will store sensor data and send it back to the data center for training – enabling the development of better models, which are then pushed back out to the vehicle. These updates might happen as often as once a day.
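The backhaul side of this example is easy to underestimate. Assuming, purely for illustration, that a vehicle logs 1 TB of sensor data per day (that volume and the link rates below are assumptions, not numbers from this article), uploading it looks like this:

```python
# Illustrative only: time to upload a day's sensor logs from one vehicle.
# The 1 TB/day volume and the link rates are assumptions for the arithmetic.
DAILY_LOG_BITS = 1e12 * 8  # 1 TB expressed in bits

uplinks_mbps = {
    "cellular (50 Mb/s)": 50,
    "5G (500 Mb/s)": 500,
    "depot fiber (5 Gb/s)": 5_000,
}

for name, mbps in uplinks_mbps.items():
    hours = DAILY_LOG_BITS / (mbps * 1e6) / 3600
    print(f"{name:>20}: {hours:6.1f} h to upload")
```

At cellular rates the upload cannot keep pace with a single day's logging, which is why bulk sensor data tends to move opportunistically, for example while the vehicle sits at a depot.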
Shaping tomorrow’s networks
AI’s use in training is less driven by latency and more by the need for large compute capacity to process massive data sets, along with higher bandwidth to handle increased traffic. These factors put growing pressure on networks to deliver that capacity at lower power per bit.
In addition to this, rising costs and environmental impact are pushing data center operators to favor locations far away from the edge, where they can utilize renewable energy and buy cheaper land. However, this presents the challenge of transporting data traffic across larger distances in a reliable and secure way – putting even more pressure on bandwidth requirements.
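To put that bandwidth pressure in perspective (the dataset size and per-wavelength rates below are illustrative assumptions): replicating a multi-petabyte training corpus between distant facilities occupies even very fast optical links for hours.

```python
# Illustrative only: time to replicate a training dataset between two sites.
# The 2 PB size and the per-wavelength rates are assumed for the arithmetic.
DATASET_BITS = 2e15 * 8  # 2 PB in bits

for label, gbps in [("100 Gb/s wavelength", 100),
                    ("400 Gb/s wavelength", 400),
                    ("8 x 400G (3.2 Tb/s)", 3_200)]:
    hours = DATASET_BITS / (gbps * 1e9) / 3600
    print(f"{label:>20}: {hours:6.1f} h")
```

Multiply that by regular dataset refreshes and model distribution in the other direction, and long-haul capacity becomes a first-order design constraint rather than an afterthought.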
New types of traffic with distinct requirements
Training traffic

Ingress to data center
- Huge volume
- Spiky
- Usually not time sensitive

Egress from data center
- Small to medium size
- Distributed to thousands of sites

Inter-data center
- Deterministic
- Maximum throughput
- Minimal latency
Inference traffic

Chat/text
- Low bandwidth
- Not latency sensitive

Image
- Low/medium bandwidth
- Not latency sensitive

Voice
- Low bandwidth
- Moderately latency sensitive

Video
- High bandwidth
- Highly latency sensitive

Multimodal
- Highest bandwidth requirements
- Extremely latency sensitive
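One way to read the lists above is as a placement policy: the more latency-sensitive the modality, the closer to the user its serving capacity needs to sit. A minimal sketch of that idea (the latency budgets and tier names are illustrative assumptions, not figures from this article):

```python
# Minimal sketch: map inference traffic classes to a serving location.
# The latency budgets and the tier thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class TrafficClass:
    name: str
    latency_budget_ms: float  # assumed end-to-end response budget

CLASSES = [
    TrafficClass("chat/text", 2_000),
    TrafficClass("image", 2_000),
    TrafficClass("voice", 300),
    TrafficClass("video", 100),
    TrafficClass("multimodal", 50),
]

def serving_tier(budget_ms: float) -> str:
    """Pick the farthest (cheapest) tier whose delay still fits the budget."""
    if budget_ms < 100:
        return "metro edge DC"
    if budget_ms < 500:
        return "regional DC"
    return "remote hyperscale DC"

for tc in CLASSES:
    print(f"{tc.name:>10} -> {serving_tier(tc.latency_budget_ms)}")
```

Under these assumptions, chat and image traffic can comfortably ride to a remote, energy-optimized site, while video and multimodal sessions justify the premium of metro edge capacity.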