Although filters are widely used in non-WebRTC video applications like Photo Booth and Snapchat, we haven’t seen many WebRTC applications using them. This is probably because it hasn’t really been possible… until now.
It has always been possible to apply filters to video streams locally on the OpenTok platform by rendering the video into a Canvas element. The problem with this approach has always been that the person on the other end does not see the filter unless you apply the same filter to both the publisher and subscriber video. That would mean significant CPU load if you are subscribing to multiple participants, and it also means the filters don’t show up in archives.
The good news is that as of Firefox 43, and with the recently released Chrome 51, there is a new captureStream method on the Canvas object which allows you to capture a stream from the Canvas and use it in your PeerConnection. This means that when you apply the filter on the publishing side, the subscriber receives the video already filtered, and any archives will also contain the filtered video.
So how do we use that API in OpenTok?
I wrote a library, opentok-camera-filters, that adds a few different filters to OpenTok Publishers. The filters aren’t as elaborate or entertaining as Snapchat’s, but they serve as a good proof of concept.
You can check out the demo here (you will need the latest Firefox or Chrome browser). There are also instructions in the README for how to use these filters in your own OpenTok project.
Using captureStream in OpenTok Apps
The Canvas element has a captureStream method which gives you a WebRTC stream that you can use in a PeerConnection. It is supported as of Chrome 51 and Firefox 43; however, due to a bug, Firefox does not support adding audio to a created stream until version 49. This means that anything that can be rendered in a Canvas can also be streamed in an OpenTok session.
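Under the hood this is plain WebRTC. Here is a minimal sketch of the underlying API (the helper name is hypothetical, and it assumes you are managing your own RTCPeerConnection, which OpenTok normally does for you):

```javascript
// Hypothetical helper: capture a 30fps stream from a canvas and add its
// tracks to a peer connection. In an OpenTok app the PeerConnection is
// managed for you; this only illustrates the underlying browser API.
function streamCanvasToPeerConnection(canvas, peerConnection) {
  var stream = canvas.captureStream(30); // capture at 30 frames per second
  stream.getTracks().forEach(function (track) {
    peerConnection.addTrack(track, stream);
  });
  return stream;
}
```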
At the moment, OpenTok does not let you provide your own WebRTC stream object, nor does it give you access to the PeerConnection to overwrite the stream there. So I had to do a little hacking. I wrote a mockGetUserMedia function which overwrites the getUserMedia methods used by OpenTok and lets you replace the stream that is returned. Overwriting the getUserMedia function this way is unfortunately error-prone and not officially supported by OpenTok, but it does the trick for this proof of concept. You use mockGetUserMedia by passing it a function that receives the WebRTC getUserMedia stream and returns the stream you want to replace it with. For example:
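A simplified sketch of the idea follows; the real mockGetUserMedia in opentok-camera-filters also handles browser-prefixed variants and other details, so treat this shape as illustrative rather than the library's exact code:

```javascript
// Simplified sketch of a getUserMedia shim: wrap the real getUserMedia so
// that the stream handed back to OpenTok is whatever mockFn returns.
function mockGetUserMedia(mockFn) {
  var realGetUserMedia = navigator.getUserMedia.bind(navigator);
  navigator.getUserMedia = function (constraints, success, failure) {
    realGetUserMedia(constraints, function (stream) {
      success(mockFn(stream)); // hand OpenTok your replacement stream
    }, failure);
  };
}

// Usage: return the stream you want published instead of the camera stream.
// mockGetUserMedia(function (cameraStream) {
//   return myFilteredCanvas.captureStream(30);
// });
```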
Here is the code in opentok-camera-filters that uses it and actually applies real filters.
How the filters work
The code for the filters is in the filters.js file. There is a function, filterTask, which is passed a filter function and uses requestAnimationFrame to repeatedly render the filtered ImageData into a Canvas.
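A filterTask-style loop can be sketched like this (the shape is assumed from the description above; see filters.js for the real implementation):

```javascript
// Draw the current video frame onto the canvas, run the filter over the
// pixels, paint the result back, then schedule the next frame.
function filterTask(filter, videoElement, canvas) {
  var ctx = canvas.getContext('2d');
  function renderFrame() {
    ctx.drawImage(videoElement, 0, 0, canvas.width, canvas.height);
    var frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
    ctx.putImageData(filter(frame), 0, 0);
    requestAnimationFrame(renderFrame);
  }
  requestAnimationFrame(renderFrame);
}
```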
The filter functions all receive the unfiltered ImageData from the VideoElement and return a new filtered ImageData object to render on the Canvas.
For example, here is the filter function for the invert filter:
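Based on the description that follows, the invert filter has this shape (a sketch, not necessarily the library's exact code):

```javascript
// Invert filter: subtract each red, green and blue value from 255,
// leaving the alpha channel untouched.
function invert(imageData) {
  var inverted = new Uint8ClampedArray(imageData.data.length);
  for (var i = 0; i < imageData.data.length; i += 4) {
    inverted[i] = 255 - imageData.data[i];         // red
    inverted[i + 1] = 255 - imageData.data[i + 1]; // green
    inverted[i + 2] = 255 - imageData.data[i + 2]; // blue
    inverted[i + 3] = imageData.data[i + 3];       // alpha
  }
  return new ImageData(inverted, imageData.width, imageData.height);
}
```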
This function simply takes in ImageData and iterates through the pixels, replacing the red, green and blue components with their inverse by subtracting each from 255, and leaving the alpha component alone. The result is an inverted image (which looks a little creepy).
This hilarious effect is achieved by using face detection to detect the positions of faces in the image and then drawing an image on top of the canvas at that position. The face detection library used is tracking.js.
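Given the rectangles tracking.js reports for each detected face (objects with x, y, width and height), the overlay drawing can be sketched like this (the function and image names here are illustrative, not from the library):

```javascript
// Draw an overlay image (say, a hat) above each detected face rectangle.
function drawFaceOverlays(ctx, overlayImage, faces) {
  faces.forEach(function (face) {
    var overlayHeight = face.height / 2; // sized relative to the face
    ctx.drawImage(overlayImage, face.x, face.y - overlayHeight,
                  face.width, overlayHeight);
  });
}
```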
When I first used face detection, the video rendered really slowly because the face detection algorithm is quite CPU-intensive. To solve this I moved the face detection to run as a background task using the Worker API, so that rendering the video on the Canvas was not affected by the algorithm detecting faces.
This was achieved by using a single Worker which grabs the most recent frame and tries to detect any faces in it. When it finishes, it grabs the next most recent frame (probably not the successive frame, because detecting faces takes longer than pulling another frame). The result, unfortunately, is that the face overlay lags a little behind the video, but at least it’s not making the video itself lag.
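The hand-off between the render loop and the Worker can be sketched as a simple "latest frame wins" scheduler (an illustration of the pattern, not the library's exact code; the worker uses the standard postMessage/onmessage pair):

```javascript
// Only one detection runs at a time; frames that arrive while the worker
// is busy are dropped, so each detection starts on a recent frame.
function createDetectionScheduler(worker, onFaces) {
  var busy = false;
  worker.onmessage = function (event) {
    busy = false;          // ready for the next frame
    onFaces(event.data);   // face rectangles detected by the worker
  };
  return function submitFrame(imageData) {
    if (busy) return false; // still detecting: skip this frame
    busy = true;
    worker.postMessage(imageData);
    return true;
  };
}
```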
There are so many possibilities unlocked by the captureStream API, and this area is fertile ground for lots of great WebRTC applications and libraries. Snapchat-style elaborate animated video filters, Photo Booth-style green-screen effects, picture-in-picture displays and subtitles rendered into the video are all made possible by this feature.
At TokBox we look forward to powering more of these great features in your applications. Hopefully the opentok-camera-filters library and this blog post will inspire you to go out and create, and have fun with, the future of communications on the web.