Fun With Core Graphics in iOS

Several partners have been asking us about the options for getting access to media streams as they come and go from an iOS device. While more robust media access features are further off, I wanted to take some time to explore what an iOS developer can play with today.

The UIKit view hierarchy integrates with a fairly simple animation and compositing API. Every instance of UIView is backed by an animation layer (CALayer), which can be accessed (and manipulated) without much complexity. A neat thing about CALayer is that you can render its contents at any time using the renderInContext: method. Most often, your render target is the window, which is managed by the UIKit view hierarchy, so none of this knowledge is particularly compelling. Unless, of course, you want to render the contents of the animation layer to a bitmap in memory to perform, say, facial recognition with the iOS 5 CIDetector.

Clown mode!

Subscriber and publisher instances may be captured for analysis

Having borrowed some code from a great example on facial recognition with iOS, I made a background method that continuously renders video content into a hunk of memory for facial recognition analysis. While the example is a bit silly, it is easy to see what sorts of neat opportunities open up with this functionality. The source code of the application that was used to make these images can be found on GitHub; all of the interesting parts are inside the view controller, in a method called doFacialDetectionOnView:.
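If you are wondering how that repeated analysis might be driven, here is a minimal sketch using an NSTimer on the main thread. The interval, the detectionTimerFired: selector, and the videoContainerView property are illustrative placeholders; only doFacialDetectionOnView: comes from the sample project.

- (void)viewDidAppear:(BOOL)animated
{
    [super viewDidAppear:animated];
    // Poll the video container a few times per second; tune the interval to
    // balance detection latency against CPU cost.
    [NSTimer scheduledTimerWithTimeInterval:0.25
                                     target:self
                                   selector:@selector(detectionTimerFired:)
                                   userInfo:nil
                                    repeats:YES];
}

- (void)detectionTimerFired:(NSTimer *)timer
{
    [self doFacialDetectionOnView:self.videoContainerView];
}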

In order to do work on the CALayer that drives a video container, we must first prepare a context that can act as a render target.

// imageSize is the size of the view (and its layer) being captured,
// e.g. CGSize imageSize = view.bounds.size;
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef context = CGBitmapContextCreate(NULL,
                             imageSize.width,
                             imageSize.height,
                             8,                    // bits per component
                             4 * imageSize.width,  // bytes per row: 4 bytes per RGBA pixel
                             colorSpace,
                             kCGImageAlphaPremultipliedLast);
CGColorSpaceRelease(colorSpace);

Next, we will render the layer contents into our bitmap context and create a CGImage from the context. It is worth noting that the contents property of CALayer is typically backed by a CGImage itself. While you could perform processing directly on the layer contents, doing so would interfere with the timely delivery of new content, especially in the case of our video stream. Also, any background filters and pending animations on a layer are discarded when rendering to the bitmap context.

// Draw the layer (and its sublayers) into the bitmap context, then snapshot
// the context's pixels as a CGImage.
[view.layer renderInContext:context];
CGImageRef myCGImage = CGBitmapContextCreateImage(context);

At this point, the developer is free to do work on the rendered content. In our case, this is the entry point into CIDetector for facial detection. I also added a transform to rotate the image about the origin, since the UIKit view hierarchy uses a different coordinate system than Core Image. This would also be a good entry point for applying Core Image filters to your content, though only a subset of them are available on iOS.

// Wrap the bitmap in a CIImage and rotate it about the origin to account for
// the coordinate system difference between UIKit and Core Image.
CIImage* myCIImage = [CIImage imageWithCGImage:myCGImage];
myCIImage = [myCIImage imageByApplyingTransform:CGAffineTransformMakeRotation(M_PI)];
// Ask the detector for an array containing all the detected faces in the image.
NSArray* features = [detector featuresInImage:myCIImage];
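As a rough idea of what that work might look like, here is a sketch that walks the detected faces and logs where they landed; the logging is just a stand-in for whatever overlay or analysis you have in mind. CIFaceFeature exposes a bounds rectangle along with eye and mouth positions, all in Core Image coordinates.

for (CIFaceFeature* face in features) {
    NSLog(@"Face found at %@", NSStringFromCGRect(face.bounds));
    if (face.hasLeftEyePosition && face.hasRightEyePosition) {
        NSLog(@"Eyes at %@ and %@",
              NSStringFromCGPoint(face.leftEyePosition),
              NSStringFromCGPoint(face.rightEyePosition));
    }
    if (face.hasMouthPosition) {
        NSLog(@"Mouth at %@", NSStringFromCGPoint(face.mouthPosition));
    }
}

One housekeeping note: myCGImage and the bitmap context both come from Core Foundation "Create" functions, so release them with CGImageRelease and CGContextRelease once you are done with a frame.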

That’s all you need to know to get started with processing viewable content in your iOS apps. I hope this inspires some development ideas that are more interesting than what you see here. If you come up with something cool, put it on GitHub and leave a comment with the link here!

Oh! Also: In writing this demo, I noticed that CIDetector can actually handle lots of image features in a timely manner. If you haven’t had a chance to play around with this neat framework in iOS 5, I recommend it.

Multiple features detected in a single image
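If you haven't set one up before, creating the detector is short. Here is a minimal sketch, assuming the lower accuracy setting is acceptable for repeated, near-real-time use (CIDetectorAccuracyHigh is noticeably slower).

// Create the detector once and reuse it across frames.
NSDictionary* options = [NSDictionary dictionaryWithObject:CIDetectorAccuracyLow
                                                    forKey:CIDetectorAccuracy];
CIDetector* detector = [CIDetector detectorOfType:CIDetectorTypeFace
                                          context:nil
                                          options:options];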

TokBox, Inc. is an equal opportunity employer. Clowns welcome.