Samsung today became the latest large vendor to update on its AI-RAN work. In what was essentially an update on the demos it showed at MWC, some of which built on work it had already publicised in December 2024, the vendor said it is exploring the use of GPU AI acceleration alongside its vRAN.
Interestingly for those following the debate about integrating vRAN software with the GPU platform, Samsung’s work has gone pretty far down the RAN stack, using the Nvidia Grace Hopper 200 platform to accelerate channel estimation for the PUSCH (Physical Uplink Shared Channel, the physical channel that carries data from the device to the network).
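To give a flavour of the kind of maths being offloaded, here is a minimal, hypothetical sketch of least-squares channel estimation on DMRS pilots. It is not Samsung’s or Nvidia’s code, and the function names and dimensions are invented for illustration; the point is that this sort of array arithmetic maps naturally onto a GPU (for instance by swapping NumPy for a CUDA-backed array library), which is the essence of the acceleration being described.

```python
# Illustrative sketch only (not Samsung's or Nvidia's actual code) of the kind of
# PUSCH channel-estimation step that can be offloaded to a GPU: a least-squares
# estimate on known DMRS pilots, interpolated across the full allocation.
import numpy as np

def ls_channel_estimate(rx_pilots, dmrs_symbols, dmrs_subcarriers, num_subcarriers):
    """Least-squares channel estimate from DMRS pilots, interpolated to all subcarriers.

    rx_pilots        -- received complex samples at the DMRS subcarriers
    dmrs_symbols     -- known transmitted DMRS pilot symbols (same length)
    dmrs_subcarriers -- indices of the pilot subcarriers
    num_subcarriers  -- total subcarriers to interpolate the estimate over
    """
    h_ls = rx_pilots / dmrs_symbols                    # LS estimate at pilot positions
    all_sc = np.arange(num_subcarriers)
    # np.interp works on real values, so interpolate real and imaginary parts separately
    h_real = np.interp(all_sc, dmrs_subcarriers, h_ls.real)
    h_imag = np.interp(all_sc, dmrs_subcarriers, h_ls.imag)
    return h_real + 1j * h_imag

# Hypothetical usage: 12 QPSK pilots spread over a 48-subcarrier allocation
rng = np.random.default_rng(0)
pilots = np.exp(1j * np.pi / 2 * rng.integers(0, 4, 12))   # invented QPSK pilot symbols
true_h = 0.8 * np.exp(1j * 0.3)                             # flat channel for illustration
rx = true_h * pilots + 0.01 * (rng.standard_normal(12) + 1j * rng.standard_normal(12))
h_est = ls_channel_estimate(rx, pilots, np.arange(0, 48, 4), 48)
print(h_est[:4])
```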
The companies said they will continue to explore best-of-breed combinations of AI-RAN options, leveraging Samsung vRAN with Nvidia’s Grace CPU and/or GPU-based AI platform using Compute Unified Device Architecture (CUDA) technologies.
At MWC, Alok Shah, Vice President, Networks Strategy, said that the GPU has been slotted in to support the Samsung vDU, for example to carry out those channel estimation algorithms. But the company is just beginning its evaluation of the best ways to combine the GPU capability with the vRAN platform.
How the major RAN vendors react to the rise of AI-RAN as a concept is creating an interesting dynamic in the industry. In many ways it is not dissimilar to Open RAN.
The AI-RAN vision creates a potential point of disruption. It posits that vRAN software could run on the same compute architecture as AI workloads. Depending on the deployment architecture, this compute could be orchestrated so that utilisation is as high as possible. Resources could be used only when needed by telco applications, with “spare” capacity made available to the cloud or for enterprise AI workloads.
That, says Nvidia, would be a much more efficient way to build networks. To prove its point it created a RAN stack and got it working on its hardware. It doesn’t want to sell vRAN software, but it does want to act as a platform for those that do.
What if operators are convinced? T-Mobile is exploring it, as is Softbank in Japan. Indosat, which is investing in GPUs for AI factories in Indonesia, has also signed up.
Nvidia has deep pockets and a lot of developers. Arm is keen to expand its presence within the telco architecture. Telcos themselves, even though they appreciate the performance of Intel’s latest Xeon SoCs, would like to keep their options open at the hardware layer.
Stack it all up and there’s a path where AI-RAN could become a point of disruption for the RAN vendors. How to react? Well, you move to cover the threat, and by doing so you also put yourself in a position to be the one that benefits if the technology does go ahead.
What’s the bad outcome for the RAN vendors? It is that Nvidia’s deep pockets and huge developer base open up opportunities for new players to run RAN software on Nvidia platforms. That’s not entirely fanciful. Fujitsu is already developing alongside Softbank and Nvidia, and its technology is already in Softbank’s AITRAS L1, as well as fully forming its L2 and L3. Companies such as Kyocera and SynaXG are creating AI-RAN platforms based on Nvidia GPUs.
Softbank’s Mauro Filho, Director of AI-RAN America, told TMN that one of its goals is to attract new players and innovation, at a speed more in line with the needs of operators than the roadmaps of the telco industry. “We’re always building infrastructure so that innovators can come and improve things,” he said.
“It’s important that our users and partners have the best possible network available – and if innovation can happen in a disaggregated way, and if a big part of the stack is software, then our ability to innovate faster and to track demand more closely is improved as well.”
The good outcome is that the GPU compute platform becomes an economic winner, and something the RAN vendors can piggyback to success.
As we reported last week, Nokia already has a demo of its Cloud RAN plugin accelerator working with Softbank. Explorations with Ericsson are running slightly behind those with Nokia, Filho said.
Nvidia’s Chris Penrose, head of business development for telco, added that the company welcomes innovation wherever it comes from, pointing to Fujitsu’s work: “We build tools and capabilities, but we really want those tools to then be used by the ecosystem and deployed.”
He also highlighted the important role that the major RAN NEPs would play.
“We want Ericsson and Nokia to take pieces of Nvidia and embed it with their solutions. They’re the ones taking this to the market. They’re the trusted partners.”
By committing to AI-RAN development, however speculative they may privately think the business case is, the major players can control development, and either delay it or, when the time is right, fully support it.
So what are their options?
In basic terms, you could put the L1 in a server running on its own SoC – this is what Nokia’s SmartNIC does in the Softbank trial we reported on last week. It’s also the approach behind Indosat’s announcement that it is commencing AI-RAN pilots in Indonesia. This approach does not use the GPUs for RAN L1 acceleration. Instead it leaves the GPUs free for other workloads, which could either be AI workloads that help optimise the RAN, or something entirely separate – say, an inferencing app for a telco enterprise customer. An orchestrator sits over the whole platform, allocating both GPU and CPU resources.
Or you could look for something more integrated, in which you re-work the software so the GPU element does do the necessary L1 acceleration, and the Arm-based CPUs take on other work higher up the stack. This appears to be the focus of Samsung’s recent demos and the release it put out today. Samsung’s pictures show the vDU with the GPU server, which seems a pretty tight integration.
It would also accord with Ericsson’s stated approach. Ericsson said last week it has demonstrated a prototype of its Cloud RAN software running on the Grace Hopper, and is exploring how to orchestrate RAN and AI applications on the same infrastructure.
Both of these companies have, to date, primarily built out using Intel’s SoC-based acceleration for L1. Porting to the GPU and Arm-based CPU platform would be a challenge, although nobody presents it as insurmountable. Nokia too is exploring going further than just its NIC vision to something more integrated, and Ericsson has looked before at GPU-based acceleration.
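Whichever way the split falls, an orchestrator has to decide how GPU and CPU capacity is shared between RAN work and everything else. The sketch below is a purely hypothetical allocation loop – not any vendor’s actual scheduler or API – showing the basic idea of giving RAN-assist AI jobs priority and offering whatever is left to tenant workloads.

```python
# Hypothetical illustration of the orchestration concept described above: the RAN L1
# runs on its own accelerator, while an orchestrator hands GPU capacity first to
# RAN-assist AI jobs and then to external (e.g. enterprise inferencing) workloads.
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    gpu_fraction: float   # share of one GPU the job needs
    is_ran_assist: bool    # True for AI that optimises the RAN itself

def schedule(jobs, gpu_capacity=1.0):
    """Greedy allocation: RAN-assist jobs get priority, leftovers go to tenant workloads."""
    placed, remaining = [], gpu_capacity
    for job in sorted(jobs, key=lambda j: not j.is_ran_assist):
        if job.gpu_fraction <= remaining:
            placed.append(job.name)
            remaining -= job.gpu_fraction
    return placed, remaining

jobs = [
    Job("beam-weight prediction", 0.2, True),       # AI that helps the RAN
    Job("enterprise inferencing app", 0.5, False),  # monetisable spare capacity
    Job("video analytics tenant", 0.4, False),
]
placed, spare = schedule(jobs)
print(placed, f"spare GPU: {spare:.1f}")
```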
Can AI-RAN pay?
But there’s a wider challenge facing the AI-RAN vision, which is the economics of it.
GPU-based compute is not cheap. The economics of the model rely on very efficient orchestration of the resources, but also on selling and monetising the compute capacity left unused by telco workloads.
Nvidia is, of course, bullish on the economics. Penrose said, “The fact they’re reusing the same infrastructure is almost like getting free RAN.” But again, that assumes the re-use is actually happening.
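As a back-of-the-envelope illustration of that dependency – with entirely invented numbers, purely to show the shape of the argument – the premium paid for a GPU-based site only pays back if a large enough share of the idle capacity is actually sold:

```python
# Back-of-the-envelope illustration of the AI-RAN economics argument. Every number is
# an invented placeholder; only the shape of the calculation matters: the GPU site
# beats a conventional build only if enough of the spare capacity is monetised.
gpu_site_cost = 3.0           # hypothetical relative cost of a GPU-based site
conventional_site_cost = 1.0  # hypothetical relative cost of a conventional vRAN site
ran_utilisation = 0.3         # assumed share of GPU time the RAN workload uses
spare_capacity = 1.0 - ran_utilisation

def revenue_needed_to_break_even(sold_fraction_of_spare):
    """Revenue (in conventional-site-cost units) the sold spare capacity must generate."""
    extra_capex = gpu_site_cost - conventional_site_cost
    sold = spare_capacity * sold_fraction_of_spare
    return extra_capex / sold if sold > 0 else float("inf")

for sold in (0.2, 0.5, 0.8):
    print(f"sell {sold:.0%} of spare capacity -> need "
          f"{revenue_needed_to_break_even(sold):.2f}x site-cost in revenue to cover the premium")
```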
KDDI’s CTO Kazuyuki Yoshimura told TMN that the company is already using AI in its RAN, for example on Massive MIMO use cases with Samsung. Moving that to GPU would be possible, but it’s clearly not something on the near horizon.
“GPUs are very expensive. I don’t think it is something we are planning immediately – but as prices and demand shift, that is a possibility for the future,” he said.
For AI on RAN – using AI within the RAN edge – he was also circumspect. “It is a new service that we have to build in an attractive way. We need to take a bit more time to make that visible. Of course, it’s a very important area, so that’s why we need to continue to take time to make sure that we have the right service.”
Samsung’s Shah said that as a vendor, Samsung is primarily focused on evaluating the technical capability and the performance benefits, and will leave the business model development more to its operator partners.
“We have to see how the business model develops. The capex for a GPU is quite substantial, but if the operator is in a position to monetise that then it could be a very powerful model. We’ll have to see how that develops but that won’t really be our domain.”
The primary vRAN chip provider, Intel, is also far from impressed.
Christina Rodriguez, Vice President in Intel’s Network & Edge Group and General Manager of the Communication Solutions Group, said that GPUs are a hammer looking for a nail. Intel, with its Xeon 6 chipset that includes AI and RAN acceleration, is not about to concede the need for GPU acceleration for RAN, or for AI that enhances the RAN.
“You can use AI to optimise the infrastructure or the radio algorithm. But both of these things you can do with a Xeon SoC,” she points out.
“Now the industry is exploring whether it could do something else – running applications that have nothing to do with the RAN. But what you can do with Xeon SoC AI is infrastructure and power management, predictive maintenance, management of resources, everything that has to do with taking automation to the next level. And on the radio side, taking radio algorithms and making them smart; for example, we have an AI-enhanced beamforming algorithm using just one core. You don’t need to have a super power-hungry GPU for the functionality we want to do in the RAN: the AI models running the RAN will be small to medium models – you don’t require some heavy GPU.
“We can do that in the CPU at a fraction, a fraction of the cost and the power consumption. And this is not a hypothesis. We are deploying massive MIMO in the previous generation of our CPU.”
On the other hand, operators – the RAN vendors’ customers – are taking a serious look at AI-RAN. They see something there.
For Rodriguez that just means the industry needs to make its assessments wisely.
“We know AI is coming our way and is going to be everywhere. So the whole industry is figuring out what it means for us. But at the end of the day you want to be very careful about how much to spend and how much power to consume – and if you don’t need it you’re not going to have it.”