Audio Fallback: Real-time Traffic Shaping with OpenTok

At TokBox, we aim to push boundaries and deliver the best possible WebRTC-enabled experience for application developers building face-to-face video applications. One of our guiding architectural philosophies has been to provide the right primitives for developers to build rich and powerful applications. In addition, we want to make sure we abstract the underlying nuts and bolts and enable the cloud service to dynamically react to changing environmental conditions (bandwidth, packet loss, etc.) in order to deliver the best possible experience.

The multiparty stream routing component of the OpenTok platform is also capable of shaping traffic in real time. Let’s take a look at how this capability delivers a significantly improved quality of experience for users.

Audio Fallback

One of the core tenets that drive our team is to implement practical features that solve real-world problems. As part of our commitment to continuously improve quality, we have implemented a relatively simple but powerful feature called the “audio-only fallback,” which goes a long way toward optimizing user experience.

To better understand how this works, let’s consider how the OpenTok platform would have dealt with a poor connection before OpenTok added the audio-only fallback:

  • User A is publishing a stream

  • User B joins the session and is subscribing to User A

  • User C joins the session after a bit and also subscribes to User A

Assume that in the above scenario User B has good download bandwidth and User C has limited download bandwidth. Before audio-only fallback was implemented, the OpenTok media server continuously monitored RTCP traffic in real time to build an aggregated model of all subscribers. Having detected that User C’s quality (ability to subscribe) was impacted by their download bandwidth, the OpenTok media server automatically signaled User A to downgrade their publishing bit rate and quality to accommodate User C. This older approach improved the quality of experience of all participants compared to what would otherwise have occurred, but it had the effect of reducing everyone’s call quality to the lowest common denominator.

With the audio-only fallback feature, OpenTok can avoid the lowest-common-denominator syndrome. OpenTok now supports the ability to dynamically fall back to an audio-only stream. Looking at the scenario above with this new capability, User C receives an audio-only version of User A’s stream, whereas User B continues to receive the full audio-video version of User A’s stream. This makes sure User C is able to continue to participate in the session to the fullest extent possible, while not penalizing User B, who continues to have the best possible experience. Our team in Barcelona has implemented some pretty nifty algorithms that allow Mantis, our media server, to handle RTCP traffic from each participant individually and dynamically, and to strip out video for certain participants while delivering the full experience to others.
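To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not the actual media server logic: the `Subscriber` class, the function names, and the 150 kbps threshold are all hypothetical, and the bandwidth estimate is simply assumed to be available (for example, derived from RTCP feedback).

```python
from dataclasses import dataclass

# Hypothetical threshold: below this estimated downlink, video is dropped.
AUDIO_ONLY_THRESHOLD_KBPS = 150


@dataclass
class Subscriber:
    name: str
    estimated_downlink_kbps: int  # assumed to come from RTCP feedback


def lowest_common_denominator(publisher_kbps, subscribers):
    """Older approach: the publisher is throttled to the weakest subscriber."""
    target = min([publisher_kbps] + [s.estimated_downlink_kbps for s in subscribers])
    modes = {s.name: "audio+video" for s in subscribers}
    return target, modes


def audio_only_fallback(publisher_kbps, subscribers,
                        threshold_kbps=AUDIO_ONLY_THRESHOLD_KBPS):
    """Newer approach: constrained subscribers get audio only and no longer
    drag down the publisher's target bit rate."""
    modes = {}
    video_downlinks = []
    for s in subscribers:
        if s.estimated_downlink_kbps < threshold_kbps:
            modes[s.name] = "audio-only"
        else:
            modes[s.name] = "audio+video"
            video_downlinks.append(s.estimated_downlink_kbps)
    # Only subscribers still receiving video constrain the publisher.
    target = min([publisher_kbps] + video_downlinks)
    return target, modes


subscribers = [Subscriber("B", 2000), Subscriber("C", 100)]
old_target, old_modes = lowest_common_denominator(1000, subscribers)
new_target, new_modes = audio_only_fallback(1000, subscribers)
```

With these made-up numbers, the older approach throttles User A to 100 kbps for everyone, while the fallback approach keeps User A at 1000 kbps for User B and switches User C to audio only.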

Here at TokBox, we are diligent about being data-driven when making decisions and building features. So, here is a quick analysis of how this feature has impacted our ecosystem.

Over a two-week period, we analyzed data from multiparty calls in which at least one subscriber experienced poor network conditions (low download bandwidth, packet loss, etc.). We call these “congestion events.” We then aligned the moment of each call’s congestion event to T=0 for better readability.

  • X-axis: Time in seconds with all “congestion events” aligned at T=0

  • Y-axis: Average publishing bit rate of all such streams
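The alignment step can be sketched in a few lines of Python. This is an illustration of the idea only, not our analysis pipeline: the function name, the input layout, and the sample values below are assumptions made for the example.

```python
from collections import defaultdict


def align_and_average(streams):
    """Align each stream's congestion event to T=0 and average the bit rates.

    `streams` is a list of (samples, congestion_time) pairs, where `samples`
    is a list of (timestamp_seconds, bitrate_kbps) tuples. Returns a dict
    mapping relative time in seconds to the mean bit rate at that offset.
    """
    buckets = defaultdict(list)
    for samples, congestion_time in streams:
        for timestamp, bitrate in samples:
            # Shift the time axis so the congestion event sits at T=0.
            buckets[timestamp - congestion_time].append(bitrate)
    return {t: sum(values) / len(values) for t, values in sorted(buckets.items())}


# Made-up data: two streams whose congestion events occurred at t=101 and t=51.
streams = [
    ([(100, 400), (101, 200), (102, 600)], 101),
    ([(50, 200), (51, 100)], 51),
]
averaged = align_and_average(streams)
```

Offsets that only some streams cover (T=1 here) are averaged over the streams that have a sample at that offset.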

In general, OpenTok pushes streams at close to 1 Mbps. If you are wondering why the bit rate shown here is so much lower (~200 kbps), the answer is selection bias: we picked only streams that had subscribers on poor networks, and by default those subscribers would already have dialed down the publisher quality under the covers.

At T = -50, we see that publisher quality is beginning to deteriorate. This is because subscribers who are experiencing bad network conditions are reducing the bit rate of the publishers through RTCP feedback. Normally this would impact all subscribers to the stream and lead to the “lowest common denominator” problem we described earlier. With our audio-only fallback feature, however, when the bit rate falls and hits a threshold (at T = 0), our cloud service reacts by sending audio-only streams to these subscribers. Those subscribers now receive the best experience possible (audio only) given their network conditions, while the publisher bit rate quickly returns to its original level. As a result, every other subscriber of the stream also receives the best possible experience.

Our OpenTok client SDKs also expose events that let you, as a developer, detect the switch to “audio only” and build an elegant UX treatment into your application. You can find them here:

We are committed to continuously enhancing the quality of experience, and this is only the beginning. If you have questions, feel free to reach out to us.