Nvidia adds AI vision to evolving vRAN ambition

Nvidia has hit the industry with a new AI-RAN platform that extends its stated ambition to enable a joint vRAN-AI platform.

On Wednesday, Nvidia launched AI Aerial, the latest version of its suite of software and hardware to support vRAN development and deployment.

The addition of AI to the Aerial platform is an indication that Nvidia is leaning further into the AI-RAN concept, as previously outlined and most recently formalised in February this year with the launch of the AI-RAN Alliance.

The new platform includes three elements:

First is the Aerial CUDA-Accelerated RAN – software libraries for building GPU-accelerated, cloud-native networks. The CUDA baseband platform supports inline GPU acceleration of Layer 1 PHY and Layer 2 MAC. That fulfils Nvidia’s previous strategic direction to move up the RAN stack within a converged accelerator.

Nvidia’s DOCA GPUNetIO enables the removal of the CPU from the critical path. Used in Aerial, it provides IO and packet processing by exchanging packets directly between GPU memory and the DPU. Nvidia said this enables fast IO processing and direct-memory access (DMA) technology to “unleash the full potential of inline acceleration”.

Its cloud-native architecture allows RAN functions to be realised as microservices in containers, orchestrated and managed by Kubernetes.

The second and third elements that Nvidia announced were its Aerial AI Radio Frameworks, software to develop and train models that improve RAN performance, and the Aerial Omniverse Digital Twin – a simulated environment incorporating the Aerial CUDA-Accelerated RAN that mirrors real-world deployments.

The compute for the platform is ARC-1, built on an Nvidia Grace Blackwell MGX server, with CUDA-accelerated runtime libraries upon which it can place the RAN software. The server pairs a Grace CPU and a Blackwell GPU over NVLink, with a fronthaul NIC that is O-RAN compliant and can connect to an O-RU over traditional eCPRI.

The ARC-1 can also be connected via Nvidia’s cloud orchestration layer, enabling its GPUs to be used by every node in the network.

Ronnie Vasishta, SVP Telecom at NVIDIA, said, “Think of this as the DU, CU, Distributed UPF and orchestration, all on a computer that can connect not just in the traditional method of a RAN, but also enabling GPUs to be connected.”

Vasishta said that the platform can also be deployed at different points of the operator network topology.

“Because of the capabilities of the fronthaul, and the unified memory architecture of the Grace-Blackwell NVLink, you can now run high performance RAN algorithms across the NVLink with very high availability.

“It’s the compute platform that allows high performance RAN to be implemented in a form factor at the cell site as well as at a distributed rack system or centralised site.”

Vasishta said that despite the expense of high-end GPU-based platforms, and the fact that many operators have already invested billions in 5G, the Nvidia platform would make CFOs “very happy” by changing the dynamics of telecom in terms of TCO and return on capital.

That’s because the same infrastructure can be used to support AI traffic, enabling monetisation of the network platform, something operators have not been able to achieve before.
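That return-on-capital argument can be put into rough numbers. The sketch below is purely illustrative: the capex figure, RAN utilisation and AI pricing are hypothetical assumptions, not Nvidia or operator data.

```python
# Back-of-envelope sketch of the shared-infrastructure TCO argument.
# Every figure here is a hypothetical assumption, not Nvidia or operator data.

capex = 100.0          # cost of one GPU-based cell site (arbitrary units)
ran_utilisation = 0.3  # average fraction of compute the RAN itself consumes
ai_yield = 0.5         # annual revenue per unit of capacity sold for AI workloads

# A dedicated RAN box earns nothing on its idle cycles; a converged AI-RAN
# node can monetise the spare (1 - ran_utilisation) share of its capacity.
spare_capacity = 1.0 - ran_utilisation
ai_revenue_per_year = spare_capacity * ai_yield * capex

payback_years = capex / ai_revenue_per_year
print(f"spare capacity monetised: {spare_capacity:.0%}")            # 70%
print(f"payback from AI revenue alone: {payback_years:.1f} years")  # 2.9 years
```

On these made-up numbers, the AI side alone pays back the site in under three years – which is the shape of the pitch, even if the real figures are anyone’s guess.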

“There’s been RAN and there’s been compute infrastructure. We see the essential need for these to be converged and combined in one, called AI RAN,” he added.

The underlying driver for this investment, Vasishta said, would be the growth of AI and AI apps that require strict throughput, latency and jitter KPIs. Intelligent connectivity that can apply real time decision making will need to be distributed across the network topology, from a base station to a local site to a centralised cloud.

It’s also worth noting that Nvidia takes care to position its software-defined RAN as 6G capable – putting it into an investment framework to support advanced 5G, vRAN refresh projects and then 6G.

The technology will be tested out by T-Mobile, Ericsson and Nokia with the foundation of a new AI-RAN Innovation Centre. Along with Softbank, an existing Nvidia partner for vRAN, T-Mobile was one of the founding operators named at the launch of the AI-RAN Alliance.

OEMs like Ericsson and Nokia could build their own stack on top of Aerial, Vasishta said, or they can use Aerial PHY as their own software PHY. This, of course, is a critical question. What is the route to market for Nvidia’s vRAN-supporting hardware, and its own vRAN software? Will the major vendors want to re-build around the cuBB, or rather view it as just another potential means of accelerating their own vRAN workloads?

To date, most of Nvidia’s development of its vRAN platform has been with O-RAN new entrants such as Mavenir, which has worked with Softbank, and Fujitsu, which actually has a commercial deployment with NTT DoCoMo. Nokia said in February this year that it would not, for now, be adopting Nvidia for L1 acceleration. But it did say it saw potential for using Nvidia’s Grace CPU “superchip” for Layer 2 and above in Cloud RAN deployments.

This week’s product announcement is new, but it is very much in line with Nvidia’s stated ambition, tracked over the past few years on TMN.

NVIDIA ON TMN:

In 2019 Ericsson and Nvidia said they would explore the potential use of Nvidia GPUs for vRAN processing (“Nvidia aims guns at Intel with GPU for vRAN play”)

As we know – those were initial conversations as Ericsson assessed options for a potential virtualisation of RAN workloads. And they didn’t seem to go very far as Ericsson continued to design its own (Intel-based) silicon solutions for its RAN platforms, and then integrated its Cloud RAN with Intel-based solutions from Dell (“Ericsson firms up Cloud RAN partnership with Dell”) and with HPE.

But Ericsson has popped up again this week as a named partner with Nvidia for the innovation lab at T-Mobile, following its signup for the AI-RAN Alliance earlier in the year. What is in doubt is the extent to which Ericsson will commit to adopting Nvidia’s built in vRAN libraries, or whether it will see the Nvidia hardware as just another potential accelerator option, albeit a fully inline version. Certainly Ericsson’s public strategy is to be agnostic to the underlying hardware platform.

It’s worth bearing in mind (again) what Mårten Lerner, Head of Cloud RAN at Ericsson, told TMN at MWC 2023.

“I think the difference is that we want to become fully portable between different hardware platforms. If we look at the alternatives out on the market, and I think [Nokia’s] SmartNIC is one way of expressing it, there is also an acceleration component, which means that you connect again the software to the hardware in a way that we have initially taken a step to go away.

“So our solution is completely disaggregated, while some of the other companies are integrating their L1 acceleration with their L1 software. And I think this has not been fully understood in the industry about the portability of the solution, because if you then go with smartNIC or their acceleration, you’re still becoming proprietary to that L1, versus us being agnostic across the different hardware platforms that we support.”

The new platform is different again. Could Ericsson achieve its ambition of being hardware independent whilst hosting its own software on a platform which Nvidia has developed with its own L1 and L2 libraries?

Coming forward a little: in 2021 Nvidia announced extended support for Arm-based CPUs – with the NVIDIA Aerial A100 AI-on-5G platform.

Nvidia said this would “create a simplified path to building and deploying self-hosted vRAN that converges AI and 5G capabilities across private enterprises, network equipment companies, software makers and telecommunications services providers.”

The platform incorporated 16 Arm Cortex-A78 processors into the NVIDIA BlueField-3 A100, a converged DPU and GPU card, supported by Nvidia’s Aerial AI software libraries.

Fast forward to March 2023 and Nvidia and Softbank established a partnership to explore a platform that would host vRAN as well as AI apps.

TMN reported on work Softbank was doing with Nvidia on the use of Nvidia GPUs to support a Mavenir vRAN, MEC workloads and core network software from within the same server and Nvidia GPU. (Sound familiar?)

SoftBank said its “AI-on-5G Lab” research facility consisted of NVIDIA’s hardware, vRAN and AI processing middleware, vRAN software from Mavenir, core network software, AI image processing application software on MEC, and physical Radio Units (RU) provided by Foxconn Technology Group. As we said at the time, “This is a key vision for Nvidia, which says it can enable much higher utilisation of resources by dynamically allocating GPU resources on demand, so that a telco could run AI and other apps on the same GPUs and servers as it runs its vRAN.”
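The dynamic-allocation idea itself is simple to sketch. The toy allocator below is our own illustration of the concept, not Nvidia’s orchestration code: the RAN, with its hard real-time KPIs, is granted capacity first, and AI jobs soak up whatever is left rather than leaving GPUs idle.

```python
# Toy illustration of sharing one GPU pool between vRAN and best-effort AI.
# Not Nvidia's orchestration software; purely a sketch of the concept.

def allocate(total_gpus: int, ran_demand: int, ai_demand: int) -> dict:
    """Grant the RAN its demand first (it has hard real-time KPIs),
    then hand any remaining GPUs to AI workloads."""
    ran_alloc = min(ran_demand, total_gpus)
    ai_alloc = min(ai_demand, total_gpus - ran_alloc)
    return {"ran": ran_alloc, "ai": ai_alloc,
            "idle": total_gpus - ran_alloc - ai_alloc}

# Busy hour: the RAN takes most of the pool.
print(allocate(total_gpus=8, ran_demand=6, ai_demand=8))  # {'ran': 6, 'ai': 2, 'idle': 0}
# Quiet hour: the same servers are mostly running AI.
print(allocate(total_gpus=8, ran_demand=2, ai_demand=8))  # {'ran': 2, 'ai': 6, 'idle': 0}
```

The commercial point is in the second call: hardware sized for the busy hour earns AI revenue in the quiet hours instead of depreciating unused.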

A new platform announced in May 2023 built on that multi-tenant vision, using Nvidia’s GH200 Grace Hopper Superchip, with SoftBank saying it planned a rollout at new, distributed AI data centres across Japan. Most of the focus was on the benefits to SoftBank of running vRAN workloads on the platform, with passing mention that these workloads might, themselves, benefit from AI applications.

That solution was based on Nvidia’s Grace Hopper chip with its BlueField-3 data processing units to accelerate software-defined 5G vRAN workloads on a 1U MGX-based server design. As we’ve seen, the MGX is a modular reference architecture that also underpins the new ARC-1 launch.

In February 2023 Fujitsu launched a vDU and vCU based on Nvidia technology, developed as part of NTT DoCoMo’s “5G Open RAN Ecosystem” (OREC) – now renamed OREX.

The solution applies NVIDIA’s GPU processing engine “NVIDIA A100X” to the physical layer processing at the base station, enabling parallel processing of vRAN and edge applications on GPU hardware resources in an all-in-one configuration that allows each function to be built on the same server. Fujitsu and NVIDIA’s integration formed Type 1 of OREC’s then blueprints, based on a Wind River virtualisation platform on Intel-based hardware from Fujitsu, as well as Fujitsu’s vRAN software.

The companies said in September 2023 that they had commercialised the solution, stating, “The 5G Open RAN solution is the first 5G vRAN for telecom commercial deployment using the NVIDIA Aerial platform. The platform brings together the NVIDIA Aerial vRAN stack for 5G, AI frameworks, accelerated compute infrastructure, and long-term software support and maintenance. It delivers innovative and transformational new capabilities for telco operators.”

Beyond Layer 1

Also in March 2023 we outlined how Nvidia’s ambitions extended beyond virtualising L1 RAN workloads. At that point, Nvidia’s product platform was its A100X, a GPU+DPU converged card that supported inline L1 processing, providing the hardware layer for Nvidia’s Aerial SDK – a containerised application framework for vRAN and a vDU. Nvidia also had a dedicated fronthaul switch and said it was working with partners to design GPU-based orchestration and RAN automation apps (RIC).

Nvidia was promising that the solution used a standard 2U COTS node to support a 100MHz 4T4R product. A 32T32R version was said to be due later in 2023, with a 64T64R version also following on a quarterly release cycle.

Rajesh Gadiyar, Vice President, Telco and Edge Architecture, NVIDIA, said then that what was needed was a “paradigm” change to deliver innovations for the vRAN infrastructure of tomorrow. First, that meant bringing AI to the wireless network, enabling solutions like RIC and SMO. Secondly, Nvidia would push beyond Layer 1 in the RAN stack, targeting multi-cell scheduling, channel estimation and beamforming within a converged accelerator. This it seems to have done with its L2 cuMAC support in AI Aerial.

“Now, that does not mean that we are running the whole layer two on the GPU, but it’s primarily focusing on the scheduling problem and accelerating the scheduling. So that the layer two stack and the scheduler can utilise the radio resources more efficiently,” Gadiyar said at the time.

Thirdly, it was pushing for RAN in a multi-tenant cloud. This, it said, would help with the capital outlays required, and allow operators to scale up over time. At times of under-utilisation, the GPU infrastructure could be used to run other apps.
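As an illustration of the kind of Layer 2 scheduling problem Gadiyar described, here is a toy proportional-fair scheduler. This is the generic textbook algorithm, not Nvidia’s cuMAC code, and the rate figures are made up; the point is that each cell’s decision is independent, which is what makes multi-cell scheduling a good fit for massively parallel hardware.

```python
# Toy proportional-fair (PF) scheduler across multiple cells.
# Generic textbook algorithm, not Nvidia's cuMAC; all rates are invented.

def pf_schedule(inst_rates, avg_rates):
    """Per cell, pick the user with the highest ratio of instantaneous
    to long-run average rate. Each cell's choice is independent, so the
    loop below could run as one parallel task per cell on a GPU."""
    chosen = []
    for cell_inst, cell_avg in zip(inst_rates, avg_rates):
        metrics = [r / a for r, a in zip(cell_inst, cell_avg)]
        chosen.append(metrics.index(max(metrics)))
    return chosen

# Two cells, three users each: instantaneous vs average rates (Mbps).
inst = [[10.0, 4.0, 6.0], [3.0, 9.0, 9.0]]
avg = [[8.0, 1.0, 6.0], [2.0, 3.0, 9.0]]
print(pf_schedule(inst, avg))  # [1, 1]: each cell picks its user 1
```

PF favours users whose channel is momentarily good relative to what they usually get, which is why accelerating it lets “the scheduler utilise the radio resources more efficiently”.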

Nvidia vRAN in 2024

Bring it forward to 2024, and at March’s GTC Nvidia kicked off with the capabilities of its new Blackwell GPU. One of three key use cases for Blackwell that Nvidia highlighted within telco was vRAN (the other two were the use of Gen AI for internal ops, and the creation of AI factories).

Again, Nvidia had its story worked out: use the infrastructure from potentially revenue-raising AI factories to support vRAN workloads. The same story as this week’s messaging.

As Vasishta said at the time, “Just think about the billions of dollars in a RAN. Now your investment in your AI factory enables that workload to run in that factory. This we call AI and RAN. We’re working with partners in the AI-RAN Alliance to flesh out how this infrastructure is deployed and how the two workloads of AI and RAN can be orchestrated to work together.”

And yes, 2024 also saw the formation of the AI-RAN Alliance, in which Nvidia is taking a lead to integrate AI with cellular technology. Announced partners included Nokia and Ericsson, as well as T-Mobile, and perhaps it is in this light that we should view the latest announcement with T-Mobile and those two vendors.

While Nokia was interested in using Nvidia’s Grace CPU for Layer 2 and above in Cloud RAN deployments, the vendor also pointed out, as did Marvell, that Marvell already has an ML/AI accelerator capability built in to its Octeon 10 chip (used by Nokia for L1 acceleration). Marvell said that it was open-sourcing this ML/AI accelerator software.

Through contributions accepted to the Apache TVM (Tensor Virtual Machine) project, developers can use open-source tools to build machine learning (ML) models that can be executed on the ML/AI acceleration engine.

A Marvell spokesperson told TMN that vendors had to this point not done a lot with the AI/ML capability built into the Octeon 10 CN106 chipset – the first SKU in the Octeon 10 family, which contains up to 24 Arm Neoverse N2 server processor cores, inline AI/ML acceleration, an integrated 1-terabit switch and vector packet processing (VPP) hardware accelerators.

“It’s been there since 2021,” the spokesperson said, “so that if a carrier or OEM had an AI application to run it wouldn’t need a GPU.”

Marvell’s announcement was also welcomed by Nokia, which said it was “the first vendor to incorporate AI/ML into our ReefShark SoCs and AirScale portfolio” and that this would “put it at the forefront of bringing machine learning technology to our customers and exploring its enormous possibilities ahead of the 6G era.”

You can see our breakdown of AI-RAN Alliance demos, including participation from Softbank, Fujitsu, Supermicro and Radisys.