The benefits of running in the cloud are obvious in so many industries: you can focus your business on what brings value to it (rather than managing hardware racks and applying Linux patches); it’s also a proven way of improving flexibility and reducing the time to market for your new services.
So, why is it taking so long for our industry to embrace the cloud?
Well, live video encoding has its own characteristics that may be challenging. Firstly, it is 24/7, and not a sporadic service that is called on from time to time. Secondly, video encoding is CPU intensive and used to require specific hardware families. Thirdly, video is a high bitrate data stream and you need the correct bandwidth to ingest your mezzanine feed into the cloud. Last, but not least, the cloud has always been seen as expensive (hindering its total adoption), for four reasons:
- When comparing cloud and on-premise, apples are often not compared to apples. Computing the true Total Cost of Ownership (TCO) of an on-premise channel is a difficult exercise (more on that later) that spans multiple teams and a mix of shared and dedicated equipment.
- Because of the misconception described above, the comparison is usually made with the same software running either on an on-premise server or in the cloud. But such a comparison is of limited interest, as software needs to be cloud-native to leverage the benefits of the cloud.
- Companies are not leveraging Just-In-Time (JiT) technology, which reduces the TCO dramatically. JiT functions are instantiated in a snap, anywhere in the world and on any cloud provider. When you need more CPU because more viewers are watching more profiles and more channels, you simply increase the number of functions. Because JiT functions are stateless, high availability is built in: the workload is rebalanced to another unit in case of failure. This further reduces the overall hardware cost, as you need fewer CPUs to achieve seamless redundancy.
- Cloud costs are usually analyzed using the “on-demand” engagement model, whilst other models can be up to three times less expensive. “Spot” is the most interesting, as it combines the best price with a zero-commitment model: you basically use spare resources that no one else is using. Although these instances can be interrupted at any time by the cloud provider, JiT functions make it possible to run a continuous 24/7 live stream on hardware that is continually preempted.
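To make the last two points concrete, here is a minimal Python sketch of why statelessness makes preemption survivable. It is an illustration under stated assumptions, not an actual JiT implementation: `encode_segment`, `run_stream`, the worker names and the preemption schedule are all hypothetical. Because each call depends only on its inputs, a segment assigned to a reclaimed worker is simply dispatched to the next available one.

```python
from collections import deque


def encode_segment(segment_id: int, profile: str) -> str:
    """Hypothetical stateless function: the result depends only on the inputs."""
    return f"segment-{segment_id}-{profile}.ts"


def run_stream(segment_ids, profile, workers, preempted_at):
    """Dispatch segments round-robin and re-dispatch when a worker is gone.

    preempted_at maps a worker name to the first segment id at which the
    (hypothetical) cloud provider has reclaimed that worker.
    """
    alive = deque(workers)
    outputs = []
    for seg in segment_ids:
        while True:
            if not alive:
                raise RuntimeError("no capacity left to encode the stream")
            worker = alive[0]
            alive.rotate(-1)  # round-robin: the next attempt uses the next worker
            if preempted_at.get(worker, float("inf")) <= seg:
                alive.remove(worker)  # worker reclaimed: drop it, retry elsewhere
                continue
            outputs.append((seg, worker, encode_segment(seg, profile)))
            break
    return outputs


# Three hypothetical spot workers; "spot-b" is reclaimed from segment 1 onwards.
outputs = run_stream(range(4), "720p", ["spot-a", "spot-b", "spot-c"], {"spot-b": 1})
for seg, worker, name in outputs:
    print(seg, worker, name)
```

Every segment is still produced, in order, even though one worker disappears mid-stream; no state has to be migrated because none is kept on the worker.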
Computing On-Premise Costs vs Cloud Using Just-In-Time
Let’s say that you want to compute the cost price of a pizza at your favourite restaurant, or even the cost of a slice. Why pizza? Well, a pizzeria, like every other restaurant or food outlet around the world, works on the same Just-in-Time principle. You want the pizza to be served fresh and piping hot, with your preferred toppings: mushrooms, cheese, pepperoni, etc. (let’s call these the pizza’s profiles for the moment). Of course, if restaurants didn’t employ a JiT workflow, your meal would be sitting at your table, waiting for you, before you even arrived.
To calculate the cost, you would need to take several factors into consideration: the price of the ingredients, the wages of the chefs and servers, the rent, the electricity, the furniture, the equipment and the wood for the oven. When you sum it all up, it’s very likely that the pizza’s profiles will only amount to a small percentage of the pizza price, right?
Well, the same applies to computing the TCO of a live channel: if you only take the hardware price into account, you will end up with a very inaccurate view of your costs. At the end of the day, computing a TCO per channel (or per slice of pizza) for on-premise equipment is so complex that it ends up diluted in other costs, with no one having a clear view of the final price. In the figure above, we provide an estimate (built with real customer data) of the cost of all the different pieces of a video streaming datacenter, for 30 channels with fairly standard assumptions (7 profiles, handover to a public CDN, seamless redundancy, 15% SLA, …). It’s worth noting that the transcoder hardware represents less than a third of the overall TCO.
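The arithmetic behind a per-channel TCO can be sketched in a few lines of Python. The cost categories and yearly amounts below are purely hypothetical placeholders (they are not the figures from the customer data mentioned above); only the channel count of 30 comes from the text. The point is the method: sum every cost item, not just the hardware, then divide by the number of channels.

```python
def tco_per_channel(cost_items: dict, channels: int) -> float:
    """Total cost of ownership divided evenly across channels."""
    return sum(cost_items.values()) / channels


def share_of_total(cost_items: dict, item: str) -> float:
    """Fraction of the total TCO represented by a single cost item."""
    return cost_items[item] / sum(cost_items.values())


# Hypothetical yearly costs, for illustration only.
on_prem_costs = {
    "transcoder_hardware": 300_000,
    "network_and_storage": 200_000,
    "datacenter_space_and_power": 150_000,
    "software_licences_and_support": 200_000,
    "operations_staff": 250_000,
}

CHANNELS = 30  # channel count used in the example above

per_channel = tco_per_channel(on_prem_costs, CHANNELS)
hardware_share = share_of_total(on_prem_costs, "transcoder_hardware")
print(f"TCO per channel: {per_channel:,.2f}")
print(f"Transcoder hardware share: {hardware_share:.0%}")
```

With these made-up numbers, looking at the transcoder hardware alone would account for only about a quarter of what a channel really costs.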
Computing a cloud TCO with Just-In-Time
The cloud brings other challenges when it comes to computing a TCO. Despite a common misconception (and unlike on-premise), there are no hidden costs in the cloud, but costs can be a bit of a challenge to consolidate. You can basically split your costs into three categories: ingress cost, compute cost and egress cost. Most importantly, you need to make sure that the software you will run in the cloud is “cloud-native”:
Building Cloud-Native Software
Just-In-Time functions are cloud-native because, by nature, they are spun up only when required and consume CPU resources only when required. On top of that, software will only be efficient in the cloud if:
- It is fully hardware independent, because CPU and RAM are not consistent among cloud providers, and may not even be consistent among different regions of the same cloud provider (as depicted in this diagram). Hardware independence also lets you benefit from Moore’s law (popularly stated as twice the computing power for the same price every 18 months).
- It is scalable: because of the previous point, your software needs to be able to scale both vertically (more CPUs on the same machine) and horizontally (more machines in your cluster). Indeed, the same workload may require significantly different amounts of CPU depending on the cloud provider, and the software needs to accommodate that automatically and seamlessly.
- It is natively distributed, so that any available unit can do the required job. In other words, it has to run as serverless functions (such as AWS Lambda or Google Cloud Run) and behave statelessly. This enables unprecedented robustness, but also the ability to use spot instances.
The added advantage when calculating the TCO of true cloud-based services is that far fewer elements need to be considered than for on-premise services. Even though this example uses the same configuration (channel line-up, profiles, redundancy, etc.) as the on-premise calculation, the cloud TCO is OPEX-driven and has a far lower drain on CAPEX.
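As a rough illustration of the three requirements above, the Python sketch below sizes a pool of stateless units for the same workload on two providers whose units have different capacities. Everything in it is an assumption made for the example: the abstract "compute units", the per-provider capacities, and the idea of sizing redundancy as one spare unit rather than a full duplicate. Only the 30 channels and 7 profiles come from the text.

```python
import math


def units_needed(channels: int, profiles: int, load_per_profile: float,
                 unit_capacity: float, spare_units: int = 1) -> int:
    """Number of stateless units needed for a workload, plus spares.

    Since any unit can take any job, redundancy is modelled here as a few
    spare units instead of a duplicate of the whole pool (an assumption).
    """
    workload = channels * profiles * load_per_profile
    return math.ceil(workload / unit_capacity) + spare_units


# Hypothetical capacities: the same instance size delivers different
# throughput depending on the provider or region.
capacity_by_provider = {"provider-a": 16.0, "provider-b": 12.0}

pool_sizes = {
    name: units_needed(channels=30, profiles=7, load_per_profile=1.0,
                       unit_capacity=capacity)
    for name, capacity in capacity_by_provider.items()
}
print(pool_sizes)  # {'provider-a': 15, 'provider-b': 19}

# For comparison, duplicating the whole pool on provider-a (1+1 redundancy).
duplicated_a = 2 * units_needed(30, 7, 1.0, 16.0, spare_units=0)
print(duplicated_a)  # 28
```

The same software has to cope with either pool size without manual retuning, which is what hardware independence and automatic scaling amount to in practice.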
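The three-way split of cloud costs (ingress, compute, egress) lends itself to a short worked example. The Python sketch below is only a model: all prices, bitrates and unit counts are hypothetical, and the spot price is set to one third of the on-demand price to mirror the "up to three times less expensive" figure quoted earlier, not any real provider's price list.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # average month length, an approximation


@dataclass
class CloudPrices:
    ingress_per_gb: float
    egress_per_gb: float
    compute_per_unit_hour: float


def stream_gb_per_month(bitrate_mbps: float) -> float:
    """Volume of a 24/7 stream, from Mbit/s to GB per month (decimal units)."""
    seconds = HOURS_PER_MONTH * 3600
    return bitrate_mbps * seconds / 8 / 1000


def monthly_cost(prices: CloudPrices, ingest_mbps: float,
                 egress_mbps: float, compute_units: int) -> dict:
    """Split the monthly bill into the three categories and a total."""
    ingress = stream_gb_per_month(ingest_mbps) * prices.ingress_per_gb
    egress = stream_gb_per_month(egress_mbps) * prices.egress_per_gb
    compute = compute_units * HOURS_PER_MONTH * prices.compute_per_unit_hour
    return {"ingress": ingress, "compute": compute, "egress": egress,
            "total": ingress + compute + egress}


# Hypothetical prices; only the compute price differs between the two models.
on_demand = CloudPrices(ingress_per_gb=0.01, egress_per_gb=0.05,
                        compute_per_unit_hour=0.30)
spot = CloudPrices(ingress_per_gb=0.01, egress_per_gb=0.05,
                   compute_per_unit_hour=0.10)

# One channel: a 20 Mbit/s mezzanine feed in, 15 Mbit/s of profiles out.
bill_on_demand = monthly_cost(on_demand, ingest_mbps=20, egress_mbps=15,
                              compute_units=2)
bill_spot = monthly_cost(spot, ingest_mbps=20, egress_mbps=15, compute_units=2)
for label, bill in (("on-demand", bill_on_demand), ("spot", bill_spot)):
    print(label, {key: round(value, 2) for key, value in bill.items()})
```

Ingress and egress are identical in both bills; only the compute line changes with the engagement model, which is why the choice between on-demand and spot matters so much for a 24/7 workload.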
Comparing On-premise cost versus Just-in-Time Cloud
The world of streaming is not the traditional world ruled by DVB/ATSC/ARIB standards that we were familiar with a few years ago. In streaming, standards emerge from de facto implementations rather than from standardization bodies. Buying a piece of hardware on the assumption that it will still be usable in a couple of years is already a gamble; a five-year amortization period is science fiction. Because of its flexibility, the cloud, when used in conjunction with Just-In-Time functions, offers both cost savings without a commitment model, which lets you adjust your streaming workflow in a snap, and flexible, cutting-edge services for your end users.
Seeing the cloud as simple hardware rental (we’ve all heard the misleading statement “The cloud is just someone else’s computer”!) leads to running the same software on-premise and in the cloud, which is the best way to fail in any cloud transition: software that was never meant to run in the cloud has very little chance (if any) of being efficient there.