Mobile operators planning network cloud deployments based on the offerings of players like VMware and Red Hat are doomed to difficulties and failure, according to cloud software provider Platform9.
Platform9 was founded by a group of mainly ex-VMware engineers, and has raised over $100 million for its vision of an as-a-service centralised cloud management platform. In recent years it has developed its technology to provide a common management plane for telco network functions running in distributed cloud environments.
Last week Platform9 announced that Mavenir had chosen it as the key partner underpinning Mavenir’s Webscale Platform. Mavenir describes the MWP as “a next generation cloud-native solution that includes Kubernetes based CaaS (Containers as a Service), PaaS (Platform as a Service) and MTCIL (Telecom PaaS layer)”. Mavenir intends to leverage Platform9’s Kubernetes solution to deliver a web-scale platform that runs containerised cloud-native network functions. It has integrated the Telco PaaS that it contributed to the open source XGVela project on top of the Platform9 Managed Kubernetes (PMK) solution.
So why has Mavenir chosen to work with Platform9 for its platform, and not an established player like Red Hat, which markets Kubernetes management via its OpenShift platform, or a virtual infrastructure provider such as VMware?
According to Platform9 CEO Sirish Raghuram, it comes down to the limitations of those two platforms. He says that VMware is just not designed to support a production telecoms network made up of container-based and virtualised functions from the packet core to the RAN.
“VMware is pretty complex, pretty heavy. It is not designed for this type of world. Certainly, it is trying to talk a good game and talk a good market story but the reality is it was never envisaged and designed for an edge world. It was always designed for a Windows admin running the traditional back office IT workload in a datacentre.
“VMware was just never designed for a distributed network topology, that kind of architecture. It’s not production workload-orientated, it doesn’t do well with managing distributed sites. The management overhead of the full stack means that by the time you’ve deployed a VMware containerised solution you’ve deployed like 30 servers. And when you are trying to manage a rack or half rack of gear in 10, 50, a hundred installations of those, you can’t afford it. It doesn’t make any sense.”
Raghuram admits that OpenShift shares Kubernetes DNA with Platform9, but again he questions its ability to provide operators with a central management plane.
“The simple answer is that without that central as-a-Service management architecture – that OpenShift doesn’t have – you end up having to deal with a lot of professional services: every site is a silo, constantly debugging why this site works and that one doesn’t, why this site didn’t come up the way you expected it to. We’ve heard so many horror stories from customers who have had that experience.
“It’s pretty incredible. I’m kind of saying the emperor has no clothes. That’s the reality. Running OpenShift in one or two enterprise data centres, you can get away with having to deal with a lot of professional services and constant scripting. But in the network world, it completely falls apart. And I think those customers that are betting on OpenShift are about to find this out, in a very painful way.”
“When you use something like AWS or Google Cloud or Azure, you don’t have to deal with deploying and managing n different sites and regions all on ad hoc scripting, potentially diverging over time. Amazon and Azure and Google operate that as a central, globally-available SaaS fabric and they fully manage it for you, and it’s all very consistent.”
What are the technologies that can give you that?
“You need some kind of shared management layer that can assure that,” Raghuram says. “And it’s the operating architecture. The public cloud companies understand this but for all the work that has happened in telcos and in the open source ecosystem there hasn’t been anyone outside of Platform9 who has really brought this SaaS architecture – outside of the public cloud companies.”
A blog post from the company spells it out like this. “While Kubernetes is great for orchestrating microservices in a cluster, managing thousands of such clusters requires another layer of management and DevOps style API-driven automation.
“This management plane is where DevOps engineers manage the entire operation. There, they store container images and inventory caches of remote locations. Synchronisation ensures eventual consistency to regional and edge locations automatically, regardless of the number of locations.”
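The synchronisation idea in the blog post can be illustrated with a toy example. The sketch below is not Platform9's implementation: the `EdgeCache` class, the `sync_round` function and the image names are all hypothetical, and it only shows how repeated sync rounds from a central inventory let sites that were unreachable catch up later, which is what "eventual consistency" means here.

```python
class EdgeCache:
    """Hypothetical stand-in for the image cache at one remote site."""

    def __init__(self, name):
        self.name = name
        self.images = {}    # image name -> tag held at this site
        self.online = True  # whether the central plane can reach the site


def sync_round(central, caches):
    """Copy the central inventory to every reachable cache.

    Returns the names of sites that are still out of date after this round.
    """
    stale = []
    for cache in caches:
        if cache.online:
            cache.images = dict(central)
        if cache.images != central:
            stale.append(cache.name)
    return stale


# Illustrative inventory; the image names and tags are made up.
central = {"upf": "1.2", "du": "3.0"}
tower = EdgeCache("cell-tower-1")
office = EdgeCache("central-office-1")
office.online = False  # unreachable during the first round

first = sync_round(central, [tower, office])
office.online = True   # the site comes back
second = sync_round(central, [tower, office])
```

After the first round only the unreachable site is reported as stale; once it is reachable again, the next round brings it in line without any per-site scripting.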
Raghuram takes up the story again. “What that means is that you get the simplicity and consistency of cloud, but unlike the cloud you can run this in radio networks and cell towers and central offices, wherever you want it to be. That’s how we are different from the cloud. We don’t care where the infrastructure is, our clusters run wherever the servers are. But they’re all being centrally managed and operated with a high degree of automation and consistency.”
Why not the public cloud?
The obvious counter-question, then, is why the telcos don’t simply move to the public cloud.
Raghuram adds that what customers are looking for, in many ways, is a technology that looks a lot like the public cloud architecture “but they want that to be applicable to the telecom workload.”
“Wouldn’t it be nice if you could get the public cloud experience in the site where the radio network is running. In the site where the packet core is running?”
Platform9 argues in its literature that the public cloud giants are finding out that distributed telco network function workloads are not well-matched to their capabilities.
A pitch document from the company says, “The hyperscale advantage of the giants does not apply to telco. For the first time, the public cloud providers are finding out that this is a new battleground that does not play very well into their hyper-scale advantage. Public cloud providers are attempting to treat network service providers as just another enterprise vertical, and asking them to adopt the public cloud for the same reasons that everyone else has. It won’t work.
“5G requires a distributed architecture, with a greater number of components running at the network edge. This is in contrast to the traditional cloud computing architecture used from 2010-2020, when the cloud was largely used as the powerful central hub in a ‘hub and spoke’ architecture. At the distributed edge, the hyperscale economic advantage of the public cloud providers simply disappears.”
We should note that the focus here is on the network functions themselves, not on the supporting “back office” IT functions such as billing, charging and some OSS functions, which Raghuram says may be well-suited to public cloud deployments.
So what are Platform9’s credentials, and why is Raghuram confident he can provide operators and vendors with the functionality they need?
First, Raghuram says, the company “knows the pain” of achieving its goal. It has spent seven years engineering its centralised, as-a-service based management architecture. It has around 90 staff on the technology, and has “spent nearly 100 million in venture funding to go and build this”. Just this year it expanded its Series D round to $37.5 million. “With the engineers you need to hire and the automation technology you need to build, it is very hard for individual customers to invest at that kind of scale,” the CEO says.
A centralised, Kubernetes as-a-service management plane means, “you can manage a hundred sites as easily as you can manage one site.”
“Your management complexity shouldn’t scale with the scale of your network. The point is that you deploy the management plane once, and it should scale seamlessly all the way from one site to hundreds of sites. Once you have deployed you can add sites painlessly.
“We have an advanced stack that includes everything from bare metal orchestration, to the virtualised VM-based workloads, to the container-based workloads, all three of which you can bring up remotely with the central management plane. You can guarantee that your sites look exactly the same because the central SaaS management fabric is ensuring that everything is conformant to the specification of what you want your platform infrastructure to look like. You can then do that at hundreds of sites, and be assured they are all conformant and consistent with your policy.
“And we have a sophisticated configuration capability to optimise this platform for network function workloads. So that’s everything from advanced networking configurations to enable high performance, low latency network function workloads, as well as advanced resource scheduling capabilities like CPU pinning capabilities, NUMA awareness, topology management: making sure this is a PaaS layer that really works for the network function workload.”
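The conformance claim can be made concrete with a small sketch. It assumes a much simplified model in which each site's platform configuration is a flat dictionary and a central process compares it with one desired specification; the `Site` class, the `find_drift` and `reconcile` functions and the configuration keys are illustrative assumptions, not Platform9 APIs. The policy values are borrowed from the Kubernetes kubelet's CPU Manager and Topology Manager options purely as realistic examples.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Site:
    """Hypothetical record of one remote site's platform configuration."""
    name: str
    config: Dict[str, str] = field(default_factory=dict)


def find_drift(desired: Dict[str, str], site: Site) -> Dict[str, str]:
    """Return the desired settings that the site does not currently match."""
    return {k: v for k, v in desired.items() if site.config.get(k) != v}


def reconcile(desired: Dict[str, str], sites: List[Site]) -> Dict[str, List[str]]:
    """Apply the desired spec to every site; report which keys were corrected."""
    report = {}
    for site in sites:
        drift = find_drift(desired, site)
        site.config.update(drift)
        if drift:
            report[site.name] = sorted(drift)
    return report


# Illustrative spec: keys are made up, values mirror real kubelet policy names.
desired = {
    "cpu_manager_policy": "static",                  # CPU pinning
    "topology_manager_policy": "single-numa-node",   # NUMA alignment
}
sites = [
    Site("cell-tower-1", {"cpu_manager_policy": "none",
                          "topology_manager_policy": "single-numa-node"}),
    Site("central-office-1", dict(desired)),
]

report = reconcile(desired, sites)
second_pass = reconcile(desired, sites)
```

The first pass reports and corrects only the drifted setting at the one non-conformant site; a second pass finds nothing to change. The same loop runs unchanged whether the list holds two sites or several hundred.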
Raghuram says the platform has been designed to take advantage of open source software innovation in the domain to reduce the management overhead and services requirement.
“The platform is very much derived from the Linux Foundation, from the CNCF and all the open source innovations that are happening in the space. So we are very much drawing on everything from Metal Kubed, and Ironic, to KubeVirt and Kubernetes itself, Docker, SR-IOV, DPDK, Prometheus, all of these are coming from the open source realm. What we are saying is that operationally, for this to be something that you could run in confidence, in production, at any kind of scale, you need to adopt the operating model of the cloud. And that means you need to adopt a high degree of automation, you cannot run this with professional services.”
There is one example of a telco managing network function workloads on cloud servers with a distributed architecture and that, as you might expect, is Rakuten. The Japanese challenger operator has planned around four thousand edge sites hosting vRAN and other software, and uses Robin.io to provide automated management of its distributed cloud. Robin is also a partner of Parallel Wireless, a Mavenir vRAN and Open RAN competitor, for whom it is addressing the same sort of issues that Mavenir is addressing with Platform9.
Rakuten CTO Tareq Amin has previously said that there was so little public information about the sort of capability that he needed, that he found Robin.io via a late night Google search. The way in which the telco has structured its automated operations is a big part of its differentiation, and is a key aspect of its commercialised Rakuten Communications Platform.
Raghuram says that the Rakuten deployment came a bit early for Platform9. However, he adds, “We like to believe if we’d participated in the selection then the outcome would have been different.” And he thinks that things may still get trickier for Rakuten and Robin.io.
“Ultimately they will have to figure out if operationally they can run something like this without having a shared management plane. And I’m telling you on the record you cannot run distributed Kubernetes, distributed cloud, without a shared management plane. You can’t do it. So they are probably having to roll that on top of Robin. So they are using Robin for some technology pieces, but Rakuten is probably finding out that they need to invest a lot of their own time and effort in building out a management plane.
“Can they match the focus and quality that an entirely dedicated company like Platform9 can provide? I don’t know, I think the jury’s out. They may very well find that they need to revisit that down the road.”