OpenTok on WebRTC: Grab your lab coat!

At TokBox, we spend a lot of time keeping up with what’s going on in the world of face-to-face video, because we’re always looking for the best ways to help move the OpenTok platform forward.

Today, we’re very happy to launch OpenTok support for WebRTC through an early-access build generally available to our developer community. While WebRTC is still a ways away from being ready for end users, last week Google took a big step forward towards their vision of what WebRTC could be with their stable release of Chrome 21. That makes this an opportune time to show you what we’ve been working on behind the scenes.

We’re making this build available through OpenTok Labs, a new part of our site for sharing what’s up-and-coming with developers. Through OpenTok Labs, we hope to give you a clearer idea of what we’re working on and what the future of OpenTok will be. And with your feedback, we hope to make that future even brighter.

For those of you unfamiliar with the WebRTC project, it has the potential to dramatically change the quality of face-to-face video across the web. Combining a set of HTML5 APIs with new codecs, AV processing algorithms and peer networking support, WebRTC has the potential to replace Flash as a higher-quality foundation for browser-based face-to-face video. And for those reasons, we’re very excited about WebRTC.

But WebRTC, like other standards efforts, is not likely to follow the smoothest path to market. Just yesterday, Microsoft indicated that it didn’t see eye-to-eye with Google on some of the decisions that Google is pushing in the standard. And that shouldn’t be too surprising – after all, what other browser-based video standard has benefited from general agreements on codecs, protocols and other key issues?

That’s why we take a pretty pragmatic approach. We’re excited about WebRTC, no doubt. But we also know that it’s going to take a while before your end users all have a consistent implementation in their browsers. Some users will have WebRTC. Some will have different codecs. Some won’t have anything at all.

At TokBox, our job is to sort through fragmented environments like these and make life simpler for you. After all, you just want face-to-face video in your app, without having to worry about standards wars, telling your users to download or install any scary plugins, or giving them crazy configuration instructions.

What you need us to do is easy to describe: detect and assess browser and device environments, figure out how to connect two, three, or three hundred endpoints together, and deliver the best quality we can to your end users. Whether that’s over WebRTC or Flash. Peer-to-peer or cloud-routed. Via an open standard or with our own proprietary stack. Then add additional services and features – archiving, device interoperability, and the like – so you can let your app fly.
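To make that decision process a little more concrete, here is a minimal TypeScript sketch of how an app-level planner might pick a transport and a topology for a call. Everything in it is an assumption for illustration: `Endpoint`, `CallPlan`, `planCall`, the `requireWebRTC` flag and the `MAX_P2P_PARTICIPANTS` threshold are hypothetical names, not part of the OpenTok API, and the fallback rule (one endpoint without WebRTC moves everyone to Flash) is just one possible policy.

```typescript
// Hypothetical types and names for illustration only; not the OpenTok API.
type Transport = "webrtc" | "flash";
type Topology = "peer-to-peer" | "cloud-routed";

interface Endpoint {
  id: string;
  supportsWebRTC: boolean;
  supportsFlash: boolean;
}

interface CallPlan {
  transport: Transport;
  topology: Topology;
  participants: string[]; // ids of endpoints admitted to the call
  excluded: string[];     // ids of endpoints that cannot join under this plan
}

// Assumed threshold: above this many participants, route through the cloud
// instead of opening a direct connection between every pair of endpoints.
const MAX_P2P_PARTICIPANTS = 2;

function planCall(endpoints: Endpoint[], requireWebRTC = false): CallPlan {
  const webrtcCapable = endpoints.filter((e) => e.supportsWebRTC);
  const everyoneHasWebRTC = webrtcCapable.length === endpoints.length;

  let transport: Transport;
  let admitted: Endpoint[];
  if (everyoneHasWebRTC || requireWebRTC) {
    // Either all endpoints can use WebRTC, or the app insists on it
    // and leaves out the endpoints that cannot.
    transport = "webrtc";
    admitted = webrtcCapable;
  } else {
    // Assumed policy: a single endpoint without WebRTC moves the call to Flash.
    transport = "flash";
    admitted = endpoints.filter((e) => e.supportsFlash);
  }

  const admittedIds = new Set(admitted.map((e) => e.id));
  const topology: Topology =
    admitted.length <= MAX_P2P_PARTICIPANTS ? "peer-to-peer" : "cloud-routed";

  return {
    transport,
    topology,
    participants: admitted.map((e) => e.id),
    excluded: endpoints.filter((e) => !admittedIds.has(e.id)).map((e) => e.id),
  };
}
```

With one Chrome user who has WebRTC and one user who only has Flash, `planCall` returns a Flash, peer-to-peer plan that includes both; passing `true` for `requireWebRTC` instead keeps the call on WebRTC and lists the Flash-only user under `excluded`.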

By using WebRTC well, OpenTok can continue to keep life simple for developers while making end users even happier. And that will make us very happy here at TokBox.

Our new Labs build starts to paint this vision. And in that same build, we’re also leveraging native JavaScript WebSockets communications for all internal OpenTok messaging. Which makes everything else about OpenTok – beginning with how long it takes to connect to a session – a whole heckofalot faster…

If you haven’t already experienced it for yourself, you should. Make sure your Chrome release is up to date, and wander over to the Labs to visit the future today.

  • Tsahi Levent-Levi

    This is some interesting progress. You are probably the first ones adding WebRTC on top of, or instead of, Flash.
    What happens if I have multiple clients in a call, where some use WebRTC and some Flash? Do you transcode the video on the server side?

    • http://songz.me/ Song Zheng

      Hi Tsahi! Thanks for your compliment! If there are Flash users in the call, then everyone will be degraded to Flash.

      • Ian Small

        Right now, the alpha on OpenTok Labs is artificially limited to 2-person calls. We put that limit in just prior to release as part of stabilization of the alpha. Our plan is to implement WebRTC for multiparty and large-scale calls as well, but our path to doing that will be via more sensible and scalable mechanisms than simply opening up a lot of P2P connections, which clearly doesn’t scale past a few users and has the potential to soak up bandwidth in both desktop and mobile settings. That kind of intelligent multiparty scaling capability is not available in the alpha (baby steps).

        For the situation you’re talking about, Song is correct – everyone (i.e. the two people on the call) will be automatically degraded to Flash. But you could also choose to insist on WebRTC connectivity and exclude the Flash user from the call. It will soon be easy to implement this kind of “minimum configuration” capability via the OpenTok API so that we take that problem off your hands as well. Transcoding on the server side – as you suggest – is possible, but it introduces latency issues in a real-time conversation and is in general prohibitively expensive for all but a few scenarios (generally those with large-scale audiences).
