Nvidia this week upped its ambitions for a presence in mobile networks. It said that its GPU-based accelerators can prevent L1 acceleration from creating a new hardware dependency in vRAN. Not only that, it thinks it can resolve the under-utilisation of resources in telco networks by developing RAN in the cloud. And it wants to accelerate more than just L1 RAN workloads, seeing a role for itself across the network stack and, crucially, in supporting non-network-function workloads such as edge AI.
At its GTC conference, held this week, many sessions followed up on CEO Jensen Huang's repeated assertion that AI is at an inflection point and is having its "iPhone moment".
One area where Nvidia thinks its fast processors can help support new AI-based models is in the mobile network.
The most obvious use case here is to use Nvidia's GPU as an inline processor for L1 processing of RAN workloads. This isn't new: Ericsson, for example, said back in 2019 that it was assessing GPU-based acceleration. But things have progressed. Nvidia's A100X is a converged GPU+DPU card that supports inline L1 processing, providing the hardware layer for Nvidia's Aerial SDK, a containerised application framework for vRAN and a vDU. Nvidia also has a dedicated fronthaul switch and is working with partners to design GPU-based orchestration and RAN automation apps (RIC).
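What "inline L1 processing" means in practice is that the per-slot signal-processing maths runs on the accelerator itself rather than being handed back to the CPU. The short NumPy sketch below only illustrates the shape of that workload, a whole slot of OFDM demodulation and single-tap equalisation expressed as batched array operations; the dimensions are illustrative assumptions and none of it is an Aerial SDK call.

```python
import numpy as np

# Illustrative dimensions only: 4 RX antennas, 14 OFDM symbols per slot,
# 4096-point FFT (roughly a 100 MHz carrier). Not Aerial SDK calls.
N_ANT, N_SYM, N_FFT = 4, 14, 4096

rng = np.random.default_rng(0)
# Time-domain IQ samples as they would arrive over fronthaul (CP already removed).
rx_time = (rng.standard_normal((N_ANT, N_SYM, N_FFT))
           + 1j * rng.standard_normal((N_ANT, N_SYM, N_FFT)))

# The whole slot's demodulation is one batched FFT: embarrassingly parallel,
# which is why this layer maps naturally onto a GPU.
rx_freq = np.fft.fft(rx_time, axis=-1)

# Single-tap per-subcarrier equalisation with a known (here: random) channel estimate.
h_est = (rng.standard_normal((N_ANT, 1, N_FFT))
         + 1j * rng.standard_normal((N_ANT, 1, N_FFT)))
equalised = rx_freq * np.conj(h_est) / (np.abs(h_est) ** 2 + 1e-3)

print(equalised.shape)  # (4, 14, 4096)
```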
This year NTT DoCoMo will start rolling out vRAN software from Fujitsu that uses Nvidia acceleration for L1.
Sadayuki Abeta, Head of Open RAN at NTT DoCoMo, said the company has had a multi-vendor, O-RAN-compliant network since 2019. It has 20,000 gNodeBs (5G base stations) and 16 million 5G customers, and can choose vendors based on best performance, with interoperability across its network.
It is now taking that a step further to create a vRAN, using a solution from Fujitsu based on Nvidia's GPU acceleration.
“In this we use Nvidia acceleration to improve performance – and we believe we have already achieved the performance of incumbent vendors. It is ready for commercial deployment this year,” Abeta said.
And this week another Japanese operator, SoftBank, said it had proved the use of Nvidia GPUs to support a Mavenir vRAN, MEC workloads and core network software from within the same server and Nvidia GPU. Its “AI-on-5G Lab” research facility consists of Nvidia hardware, vRAN and AI processing middleware, vRAN software from Mavenir, core network software, AI image-processing application software on MEC, and physical Radio Units (RUs) provided by Foxconn Technology Group. The Mavenir Distributed Units (DUs) and the MEC AI applications run on the same server architecture. This is a key vision for Nvidia, which says it can enable much higher utilisation of resources by dynamically allocating GPU resources on demand, so that a telco could run AI and other apps on the same GPUs and servers as its vRAN.
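The mechanics behind “dynamically allocating GPU resources on demand” amount to a policy that shifts capacity between the DU and everything else as traffic swings. The toy Python below is only a sketch of that idea under assumed numbers; a real deployment would partition the GPU with mechanisms such as MIG or MPS and act on live scheduler telemetry, not a single load figure.

```python
def split_gpu_share(ran_load: float, ran_floor: float = 0.3) -> dict:
    """Split one GPU's capacity between the vRAN DU and MEC AI workloads.

    ran_load  -- current fraction of peak RAN traffic, 0.0..1.0
    ran_floor -- capacity always reserved for the DU so cells stay up

    Purely illustrative policy, not Nvidia's orchestration logic.
    """
    ran_share = min(1.0, max(ran_floor, ran_load))
    return {"vran_du": round(ran_share, 2), "mec_ai": round(1.0 - ran_share, 2)}

# Quiet overnight cell: most of the GPU goes to AI inference or video analytics.
print(split_gpu_share(0.1))   # {'vran_du': 0.3, 'mec_ai': 0.7}
# Busy hour: the DU takes nearly the whole device.
print(split_gpu_share(0.95))  # {'vran_du': 0.95, 'mec_ai': 0.05}
```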
The solution relies on one architecture to address low-band, mid-band and mmWave, and uses a standard 2U COTS node to support a 100MHz 4T4R product. A 32T32R version is due later this year, with a 64T64R version to follow. Vikram Ditya, Director of Software, 5G RAN Compute, NVIDIA, said that Nvidia's Aerial platform can now deliver 4T4R, 10-cell capacity from one GPU accelerator.
“We started working on this effort a few years ago, but last year we knew the GPU capacity it can reach with respect to cell capacity, and for us the main effort was to make sure it can work in the converged platform. That means the 5G PHY signal-processing workload is working with the fronthaul IO and also with the L2 stack, and end-to-end it is reaching multiple cells. So we were able to go from one cell to 10 cells on the same converged platform just via software upgrades and achieve 4T4R, 100MHz, 4DL/2UL TDD. The software can run up to 64TR; internally we have a quarterly SDK release cycle, new software with improved performance and features.”
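To put that 10-cell figure in context, here is a hedged back-of-envelope calculation of what ten 4T4R, 100MHz TDD cells imply in aggregate downlink throughput. Every number in it (256QAM, a 0.75 coding rate, a 14 per cent overhead allowance and a rough 4DL/2UL split) is a textbook assumption, not a figure from Nvidia or Fujitsu.

```python
# Back-of-envelope check, not an Nvidia figure. All values are common assumptions.
prbs          = 273        # PRBs in a 100 MHz carrier at 30 kHz SCS
sc_per_prb    = 12
symbols_per_s = 14 * 2000  # 14 symbols/slot, 2000 slots/s at 30 kHz SCS
layers        = 4          # 4T4R spatial layers
bits_per_sym  = 8          # 256QAM
code_rate     = 0.75
dl_fraction   = 4 / 6      # rough downlink share of a 4DL/2UL TDD pattern
overhead      = 0.14       # assumed control/reference-signal overhead

per_cell_bps = (prbs * sc_per_prb * symbols_per_s * layers
                * bits_per_sym * code_rate * dl_fraction * (1 - overhead))
print(f"~{per_cell_bps / 1e9:.1f} Gbit/s per cell, "
      f"~{10 * per_cell_bps / 1e9:.0f} Gbit/s across 10 cells")
```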
Rajesh Gadiyar, Vice President, Telco and Edge Architecture, NVIDIA, said, “Today the industry is hitting the roadblock of the end of Moore's Law and x86 CPUs. Fixed-function accelerators are required to supplement CPU performance. That takes us back in time to the era of fixed-function appliances. It doesn't meet performance and TCO requirements, and therefore it needs a new approach.”
What Nvidia sees as the way forward is “a paradigm change to deliver innovations for the vRAN infrastructure of tomorrow.”
Gadiyar sees this happening in three dimensions. The first is to bring AI to the wireless network “like never before”. “We are enabling AI solutions based on O-RAN standards such as the RIC and SMO for use cases such as auto-scaling, energy management, spectral efficiency and more. We have enabled an AI-driven platform for creating full digital twins of networks. AI will play a critical role if we are to deliver an order-of-magnitude improvement in bandwidth and latency.”
Secondly, “Our flagship Aerial platform with A100X converged accelerators is a full inline offload for vRAN L1, and has already delivered industry-leading cell density and throughput per Watt. We are now looking to accelerate the vRAN stack beyond Layer 1, and are targeting the multi-cell scheduler, channel estimation and beamforming.”
“The third dimension we are pushing is RAN in the cloud, which improves utilisation and multi-tenancy by running RAN on demand as a service, fully in the cloud. By doing this we are addressing the biggest pain point of telcos, who see low utilisation of RAN infrastructure. RAN in the cloud provides significant flexibility to telcos and enterprises. They don't need large capital outlays, and can scale over time as needed. Also, when the RAN is underutilised they can use the GPU infrastructure for running other apps, such as edge AI and video, and monetise the infrastructure.”
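The beamforming part of that second dimension is essentially dense, batched linear algebra, which is the kind of work a GPU is built for. The sketch below is a minimal zero-forcing precoder in NumPy, computed for all cells in one batched operation; the shapes, and the choice of zero-forcing, are illustrative assumptions rather than a description of what Aerial actually implements.

```python
import numpy as np

# Illustrative only: 10 cells, a 64T panel serving 8 spatial layers each.
N_CELLS, N_TX, N_UE = 10, 64, 8

rng = np.random.default_rng(1)
# Estimated downlink channel per cell: UEs x TX antennas.
H = (rng.standard_normal((N_CELLS, N_UE, N_TX))
     + 1j * rng.standard_normal((N_CELLS, N_UE, N_TX)))

# Zero-forcing precoder W = H^H (H H^H)^-1, computed for all cells in one batch.
HHh = H @ H.conj().transpose(0, 2, 1)                  # (cells, UE, UE)
W = H.conj().transpose(0, 2, 1) @ np.linalg.inv(HHh)   # (cells, TX, UE)

# Sanity check: the effective channel H W should be close to identity per cell.
eff = H @ W
print(np.allclose(eff, np.eye(N_UE), atol=1e-6))  # True
```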
Addressing higher performance
Vikram Ditya leads engineering for the Aerial platform at Nvidia. He said that by using the GPU for parallel processing, the company has been able to improve MIMO channel estimation for cell-edge users, resulting in a 14-17dB gain.
Ditya said that one problem with MIMO is that spectral-efficiency gains become constrained for cell-edge users, where the gNodeB receives a low-quality SRS (Sounding Reference Signal) and its channel estimates become outdated. So it needed to find SRS receive-signal functions that can work on a very small occupied bandwidth. This leads to an optimisation problem, which Nvidia converts using RKHS (reproducing kernel Hilbert space) methods, but that in turn creates a very compute-intensive problem, which is where the GPU comes in.
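As a rough illustration of the kind of maths involved, the sketch below interpolates a channel across a band from a handful of noisy, narrowband SRS observations using Gaussian-kernel ridge regression, one of the simplest RKHS methods. It is an assumption-laden toy, not Nvidia's algorithm: the channel model, kernel, bandwidth and regularisation are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sc = 256                  # subcarriers we want the channel estimate on
freqs = np.arange(n_sc)

# "True" frequency-selective channel (a few delay taps), for simulation only.
delays = np.array([0.0, 0.01, 0.025])
gains = rng.standard_normal(3) + 1j * rng.standard_normal(3)
h_true = (gains[None, :] * np.exp(-2j * np.pi * freqs[:, None] * delays[None, :])).sum(axis=1)

# SRS occupies only a narrow slice of the band, and the measurement is noisy.
srs_idx = np.arange(40, 80, 2)
noise = 0.05 * (rng.standard_normal(srs_idx.size) + 1j * rng.standard_normal(srs_idx.size))
y = h_true[srs_idx] + noise

def gauss_kernel(a, b, width=20.0):
    # Gaussian kernel over subcarrier index; width is an assumed tuning parameter.
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * width ** 2))

# Kernel ridge regression: alpha = (K + lam*I)^-1 y, then h_hat = K_cross @ alpha.
K = gauss_kernel(freqs[srs_idx].astype(float), freqs[srs_idx].astype(float))
alpha = np.linalg.solve(K + 0.05 * np.eye(srs_idx.size), y)
h_hat = gauss_kernel(freqs.astype(float), freqs[srs_idx].astype(float)) @ alpha

# Report the residual on the measured subcarriers (extrapolation quality will
# depend entirely on the assumed kernel and channel model).
print(f"mean residual on SRS subcarriers: {np.abs(h_hat[srs_idx] - h_true[srs_idx]).mean():.3f}")
```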
Ditya said the second problem Nvidia has been looking at, again to increase spectral efficiency, is accelerating the Layer 2 scheduler on the GPU.
“Now, that does not mean that we are running the whole Layer 2 on the GPU, but it's primarily focusing on the scheduling problem and accelerating the scheduling, so that the Layer 2 stack and the scheduler can utilise the radio resources more efficiently.”
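A hedged sketch of what “accelerating the scheduling” can look like: if the per-TTI scheduling metric is expressed as one batched array operation across every cell, user and resource-block group, it becomes exactly the sort of workload a parallel processor handles well. The proportional-fair metric below is a stand-in chosen for illustration, not Nvidia's scheduler, and the dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
N_CELLS, N_UE, N_PRB_GROUPS = 10, 32, 17   # illustrative dimensions only

inst_rate = rng.uniform(1, 100, size=(N_CELLS, N_UE, N_PRB_GROUPS))  # CQI-derived rate estimate
avg_rate = rng.uniform(1, 50, size=(N_CELLS, N_UE, 1))               # long-term served rate

# Proportional-fair metric for every (cell, UE, PRB group) in one batched op --
# the per-TTI inner loop that could be evaluated for all cells at once on a GPU.
pf_metric = inst_rate / avg_rate
winners = pf_metric.argmax(axis=1)   # chosen UE per cell and PRB group

print(winners.shape)  # (10, 17): one scheduling decision per cell per PRB group
```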