WebRTC doesn’t scale, right?
Built as a peer-to-peer communications protocol, it’s true that traditional WebRTC did not lend itself easily to scalability practices. But that was then, this is now.
In this blog, we will see that, by utilizing WebRTC performance testing and using topologies designed to support low latency video streaming, we can scale to achieve truly big, geographically distributed, and fully interactive WebRTC applications. We will also learn that, while it is never too soon to look at scaling, it can sometimes be too late.
So now that we know it is possible, let’s take a look at how it is done.
The right stuff
There is no one perfect fit for scalability. Before you can begin to scale, you must ask yourself how your application uses WebRTC right now:
- What is the maximum number of users I need to support in one session?
- What is the average number of users I can expect to support in one session?
- How many users might send media within a session?
- How many users might receive media within a session?
- Do my users typically connect from private or business networks?
- How many users might connect from high-security networks (e.g. hospitals)?
- What data do I need to record?
- What guarantees do I need to provide around the recordings I capture?
Push it to the limit
Maintaining platform performance and server stability requires an end-to-end scalable WebRTC testing workflow. The next step is to test the scalability of that workflow by looking at the following areas:
For a large number of sessions
- The signaling server
  - How does it cope with increased load?
  - Does it slow down?
  - Does it leak memory?
It is a good idea to add some real users into your WebRTC performance test process using a tool like testRTC alongside one that loads your signaling connections separately. This can help to weed out any service killers waiting to attack when media is added to the mix.
- The media server
  - How does the gateway between users and networks hold up under pressure?
By generating test client connections of all sizes using a tool like testRTC, you can simulate the connectivity and bandwidth scenarios that might present themselves during a live event.
- The TURN server
  - How will your servers manage service growth?
Anywhere from 5% to 20% of calls end up being relayed via a TURN server. By changing the configuration of your client, filtering ICE candidates, and doing SDP munging, you can force browsers to connect via TURN and test that scenario. Better still, enforce some network rules on the machine running the browser and test your service in different network conditions.
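To picture the candidate filtering mentioned above: every ICE candidate line carries a `typ` field (`host`, `srflx`, `prflx`, or `relay`), and keeping only the `relay` candidates leaves the browser nothing but TURN paths. The snippet below is a minimal Python sketch of that filter working on raw candidate strings. The helper names and the sample candidates are invented for illustration. In a browser, setting `iceTransportPolicy: "relay"` in the `RTCPeerConnection` configuration has the same effect without touching the candidates yourself.

```python
from typing import List, Optional


def candidate_type(candidate_line: str) -> Optional[str]:
    """Return the token that follows 'typ' in an ICE candidate line, if any."""
    tokens = candidate_line.split()
    for index, token in enumerate(tokens[:-1]):
        if token == "typ":
            return tokens[index + 1]
    return None


def keep_relay_only(candidate_lines: List[str]) -> List[str]:
    """Drop every candidate that is not a TURN relay candidate."""
    return [line for line in candidate_lines if candidate_type(line) == "relay"]


# Made-up sample candidates using documentation-only IP ranges.
SAMPLE_CANDIDATES = [
    "candidate:1 1 udp 2122260223 192.168.1.10 54321 typ host",
    "candidate:2 1 udp 1686052607 203.0.113.5 54321 typ srflx raddr 192.168.1.10 rport 54321",
    "candidate:3 1 udp 41885439 198.51.100.7 3478 typ relay raddr 203.0.113.5 rport 54321",
]

relay_candidates = keep_relay_only(SAMPLE_CANDIDATES)
print(relay_candidates)
```

Filtering like this only forces the relay path. It does not tell you how the TURN server behaves under load, which is why testing under real network rules is still the better option.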
For large numbers in a single session
- Is latency affected when you expand from 1:1 to 4 to 10 users?
- Do we incur packet loss when we expand?
- How is bitrate affected when more browsers are introduced?
- If there is bitrate loss, is it linear or exponential?
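One rough way to answer the linear-or-exponential question is to compare how well a straight line fits the measured bitrates against how well it fits their logarithms. The sketch below does that with plain Python. The function names and the sample measurements are invented for illustration, and a real analysis would need more data points and some care with noise.

```python
import math
from typing import List


def r_squared(xs: List[float], ys: List[float]) -> float:
    """Squared correlation between xs and ys (fit quality of a straight line)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def classify_bitrate_trend(participants: List[float], bitrates_kbps: List[float]) -> str:
    """Label the trend by whichever fit is better: raw values or their logs."""
    linear_fit = r_squared(participants, bitrates_kbps)
    exponential_fit = r_squared(participants, [math.log(b) for b in bitrates_kbps])
    return "linear" if linear_fit >= exponential_fit else "exponential"


# Hypothetical measurements: average bitrate per browser as a session grows.
session_sizes = [2, 4, 6, 8, 10]
steady_drop = [2000, 1800, 1600, 1400, 1200]
halving_drop = [2000, 1000, 500, 250, 125]

print(classify_bitrate_trend(session_sizes, steady_drop))
print(classify_bitrate_trend(session_sizes, halving_drop))
```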
testRTC can scale both the size and the number of sessions at the same time through the notion of #session. When you indicate #session, the test automatically splits the concurrent users you ask for into sessions at the scale you choose.
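The splitting idea itself is easy to picture. The sketch below is not testRTC's implementation, and it assumes that the value you give is the number of users per session. It simply groups a pool of concurrent users into sessions of a chosen size, with any remainder landing in a smaller final session. The function name is hypothetical.

```python
from typing import List


def split_into_sessions(concurrent_users: int, session_size: int) -> List[List[int]]:
    """Group user indices 0..concurrent_users-1 into sessions of session_size."""
    if concurrent_users < 0 or session_size < 1:
        raise ValueError("need a non-negative user count and a positive session size")
    users = list(range(concurrent_users))
    return [users[start:start + session_size]
            for start in range(0, concurrent_users, session_size)]


# Ten concurrent users in sessions of four: two full sessions and one of two.
sessions = split_into_sessions(10, 4)
print(sessions)
```

Varying the two numbers independently lets you explore many small sessions, a few large ones, or anything in between.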
Now that you have examined your use cases and your capacity to test your infrastructure at scale, you are ready to make some decisions about the available WebRTC scaling architectures:
- Mesh
- Mixing (MCU)
- Forwarding (SFU)
And the new kid on the block
Experience Delivery Network (XDN)
Let’s jump right in.
Mesh
The most cost-effective and simplest to set up, Mesh architecture uses a peer-to-peer process in which each participant sends their stream directly to every other participant without going through a media server. This means, however, that the burden of encoding and decoding streams falls on each user: in a session of N participants, every user uploads N-1 streams. Since most internet connections are asymmetric, with download bandwidth greater than upload bandwidth, upload can become a major issue. This factor alone makes the Mesh topology the least scalable of the four available options.
Pros
- Up and running quickly
- Set up using a basic WebRTC implementation, no change to infrastructure
- Excellent for low participant numbers
- Better privacy, as media does not pass through a media server
- Cost-effective because it doesn’t require a media server
Cons
- Not reliable beyond 10 participants (max)
- Necessary to encrypt each media stream for every connection
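A little arithmetic shows why upload becomes the bottleneck in a Mesh. The sketch below is a back-of-the-envelope calculation only: the function names are hypothetical, and the 500 kbps stream and 5,000 kbps uplink are assumed figures for illustration, not measurements. It also ignores the CPU cost of encoding each outgoing stream, so real limits will usually be lower.

```python
def mesh_connections(participants: int) -> int:
    """Total peer-to-peer connections in a full mesh."""
    return participants * (participants - 1) // 2


def mesh_upload_kbps(participants: int, stream_kbps: int) -> int:
    """Upload bandwidth one participant needs to send to everyone else."""
    return (participants - 1) * stream_kbps


def max_mesh_participants(uplink_kbps: int, stream_kbps: int) -> int:
    """Largest session a given uplink can feed, counting the sender too."""
    return uplink_kbps // stream_kbps + 1


STREAM_KBPS = 500    # assumed bitrate of one outgoing video stream
UPLINK_KBPS = 5000   # assumed upload capacity of a participant

for size in (2, 4, 10):
    print(size, mesh_connections(size), mesh_upload_kbps(size, STREAM_KBPS))
print(max_mesh_participants(UPLINK_KBPS, STREAM_KBPS))
```

Upload per participant grows with every user who joins, while the total number of connections grows quadratically.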
Mixing (MCU)
In Mixing architecture, each participant connects to a centralized server called a Multipoint Conferencing/Control Unit (MCU). The server mixes the media streams and sends a single composite stream to all participants.
In this environment, each stream is uploaded to the server only once, regardless of the number of participants. Although an older architectural model, MCUs continue to be a valuable resource, particularly where there is a need for a beefy media server to interface with older or less compatible technologies and send them a single composite video or audio stream. They work well in low-bandwidth environments and are easily scalable for large numbers of participants.
Pros
- Compared to Mesh, the MCU moves most of the processing from the client to the server
- Reliable streaming
- Works well with low bandwidth
- Scales easily
- Good for custom media features like backgrounds and blur
Cons
- Expensive due to computationally intense server processing demands
- Recomposition of media data from the server may cause latency issues
- Limited choice in media layout
Forwarding (SFU)
A Selective Forwarding Unit (SFU) is a central server that does exactly what it says on the tin: it forwards. The server behaves more like a router than a processing server. Every participant sends their stream to the SFU, which forwards it to all the other participants. For example, when five participants send their streams to the SFU, the SFU returns four streams to each participant.
Unlike the MCU approach, transcoding happens at the edges and not at the server. This is also the case for features like background blur and image layout. Streams are separate, so each can be rendered individually, allowing full control of the layout on the client side. With most networks, particularly residential networks, having weaker up-link bandwidth but higher down-link speed, SFUs have become a very popular scaling solution, and the most cost-effective of the server-based options.
Pros
- Scales well
- Reduced server load, which makes lower latency possible
- Flexibility on the user side: reorganize the specifics and layout of conversations easily, even when live, something that is not possible with MCU topologies
- Reduced computational cost on the server side
In short, participants again connect to a centralized server, but it routes rather than mixes: each participant receives only the streams of the other participants.
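Putting the three topologies side by side, the difference comes down to how many streams each participant has to send and receive. The sketch below simply counts them, following the descriptions above. The function name and the topology labels are hypothetical.

```python
from typing import Tuple


def streams_per_participant(topology: str, participants: int) -> Tuple[int, int]:
    """Return (uploaded, downloaded) stream counts for one participant."""
    others = participants - 1
    if topology == "mesh":
        # Send to and receive from every other participant directly.
        return (others, others)
    if topology == "mcu":
        # Send one stream, receive one mixed stream from the server.
        return (1, 1)
    if topology == "sfu":
        # Send one stream, receive a forwarded stream per other participant.
        return (1, others)
    raise ValueError(f"unknown topology: {topology}")


for topology in ("mesh", "mcu", "sfu"):
    up, down = streams_per_participant(topology, 5)
    print(f"{topology}: {up} up, {down} down")
```

The SFU keeps the upload side as light as the MCU while leaving the heavier download side to the client, which suits connections with more downlink than uplink.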
Top tip: Optimize your software so that the delay it adds stays below the order of the network delay.
Hybrid/Experience Delivery Network (XDN)
The new kid on the block, XDNs deliver a ‘best of all worlds’ approach. Again, the name is a dead giveaway: it is a hybrid architecture mixing Mesh, MCU, and Forwarding topologies. An XDN dynamically selects the underlying architecture to deliver live stream outputs depending on the number of participants, their geographical location, and their networking bandwidth. This is called autoscaling.
So, for example, an XDN may employ a Mesh or P2P topology for conversations with fewer than 4 participants, switching to an SFU topology when the group grows larger than 4.
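As a toy illustration of that switch, the sketch below picks a topology from the participant count alone. It is not how any particular XDN works: a real one would also weigh geography and bandwidth, as described above. The function name is hypothetical, and treating a group of exactly 4 as an SFU case is an assumption, since the example leaves that boundary open.

```python
def choose_topology(participants: int, mesh_limit: int = 4) -> str:
    """Pick 'mesh' for small groups and 'sfu' once the group reaches mesh_limit."""
    if participants < 2:
        raise ValueError("a conversation needs at least two participants")
    # Assumption: a group of exactly mesh_limit participants already uses the SFU.
    return "mesh" if participants < mesh_limit else "sfu"


for size in (2, 3, 4, 12):
    print(size, choose_topology(size))
```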
During the recent health crisis, it has become much easier to leverage private and public cloud resources for real-time communications across any configuration of users at any distance. Keep your eye on XDN as it gains traction in the race to deliver cross-cloud autoscaling for multi-directional communications.
What’s the use?
At the end of the day, WebRTC scaling is a tradeoff between optimization for server efficiency and optimization for client efficiency. So how do you make the call? The answer lies in your use cases. We asked ourselves eight fundamental use-case questions right at the start, the importance of which cannot be overstated. You really do need to understand what it is your users value most and how you intend to use your application in real time. The value inherent in each of the topologies we discussed is relative to your particular needs. Look carefully at all the available topologies before you hop into bed with one in particular.
Who’s your Daddy?
A lot depends on your business model. Who is paying your server costs? Are you servicing a lot of free clients? In this case you are going to want to keep your server costs to a minimum. Or are you aiming for maximum efficiency with an aggressive price point? In this instance, the sky’s the limit for server outlay.
No matter which architecture you choose, however, keep in mind that you will always need:
- A signaling server (for registration and presence)
- A TURN server (for network traversal)
- A WebRTC testing tool
Don’t be a victim of your own success
It is not a conceit to plan for success. It is neither arrogant nor is it presumptuous to have a future scaling schema. If you start looking at scaling only when things are beginning to fail or break, then it may already be too late.
No matter how much you decide to do upfront or how much of the work you put a pin in for a later date, you must start laying the groundwork for your future scaling needs right from the off. Start with the pricing model for the infrastructure you are using now and how it might work for you if you end up with a million clients.
Planning for success means bracing for disaster. For example, if you have your own server, who will come to your aid if something goes wrong? In a real-time environment there is no margin for error: you have a tiny window before your users start to bail on a call. Have a solution to deal with drops, errors, and scaling issues in real time. Reading the documentation for whatever platform you’re using and making sure it has really good bitrate management will help prevent these issues. Best of all, incorporate a WebRTC performance testing tool right from the very beginning. Using a WebRTC performance testing suite like testRTC, you can identify, isolate, and fix problems in real time to prevent disruptions to your service.
Are you using a centralized, off-the-shelf meeting tool? Is this sustainable at scale? Now might be a good time to look at integrating something into your own infrastructure to help you become a first-class citizen in the WebRTC world.
Top tip: Maintain a tight focus on your core feature set right from the very first version of your application. You don’t want to end up with unnecessary features and totally avoidable concerns that prevent you from focussing your efforts on scaling down the line.
Turn the scales
So there you have it. WebRTC scalability is not a myth; it’s a reality, and one that is very much within your reach. Choosing between the four architectures and integrating a scalable WebRTC performance testing tool is just a matter of knowing your audience, your infrastructure, and your financial limitations. Armed with this knowledge, you can confidently plan for all of your future successes. The possibilities are endless!
Spearline is a technology company that proactively tests toll, toll-free and premium-rate numbers for audio quality and connectivity globally. Our latest testRTC product offers WebRTC testing, monitoring and support. We support business sectors, such as contact centers, conferencing services, and other applications, in successfully connecting with their customers.