A Core Animation Manifesto

May 7, 2013

Core Animation is an advanced compositing and animation framework for iOS and OS X. It makes impressive animations possible with just a few lines of code, and it is built around layers: extremely lightweight objects that hold visual content and can be easily manipulated with transforms. This article explores various aspects of Core Animation, beginning with layers.

Core Animation was initially named Layer Kit. What are layers? Layers are lightweight data structures that are similar to views. They can be arranged in a hierarchy to create a user interface. A parent layer (superlayer) can contain a set of layers (sublayers), forming a layer tree. Sublayers have a coordinate system that is relative to their superlayer.

The real power of layers is their ability to be transformed by transform matrices. Although layers are primarily designed for 2D interfaces, they exist in a 3D coordinate system and can therefore be transformed in three dimensions. Each layer has its own coordinate system, and any sublayers added to it are positioned relative to that system. Layer coordinate systems differ between platforms: on iOS the origin is the top-left corner, whereas on OS X it is the bottom-left corner. The remainder of this article uses the OS X coordinate system.

Layers have several properties that affect their geometry. The first is the position, a struct of type CGPoint whose x and y coordinates are relative to the origin of the superlayer. The bounds is a CGRect that combines the size of the layer (a CGSize) with the origin of the layer’s own coordinate space, which is usually {0, 0}. The anchorPoint is a CGPoint expressed in unit coordinates, where the range 0.0 to 1.0 spans the layer from one edge to the opposite edge. The anchor point determines how the position of the layer relates to its bounds. For example, a layer with an anchor point of {0.5, 0.5} has its position set relative to the exact center of the layer. The anchor point also affects transforms, which will be discussed later. Layers also have a frame property, which is derived from the anchor point, position, and bounds.
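
The relationship between these properties can be sketched in a few lines of C. This is only an illustration, assuming an untransformed layer: the SketchPoint, SketchSize, and SketchRect types and the sketch_layer_frame function are hypothetical stand-ins for CGPoint, CGSize, and CGRect, not part of Core Animation.

```c
/* Hypothetical stand-ins for CGPoint, CGSize, and CGRect. */
typedef struct { double x, y; } SketchPoint;
typedef struct { double width, height; } SketchSize;
typedef struct { SketchPoint origin; SketchSize size; } SketchRect;

/* Frame of an untransformed layer in its superlayer's coordinates.
   The anchor point is in unit coordinates: {0, 0} is the corner at the
   bounds origin and {1, 1} is the opposite corner. */
SketchRect sketch_layer_frame(SketchPoint position, SketchPoint anchor,
                              SketchSize bounds_size) {
    SketchRect frame;
    /* The position marks where the anchor point sits in the superlayer,
       so the frame origin is offset back by the anchored fraction. */
    frame.origin.x = position.x - anchor.x * bounds_size.width;
    frame.origin.y = position.y - anchor.y * bounds_size.height;
    frame.size = bounds_size;
    return frame;
}
```

With a position of {100, 100}, an anchor point of {0.5, 0.5}, and a size of 50 by 20, the sketch yields a frame whose origin is {75, 90}. Moving the anchor point to {0, 0} without changing the position would shift the frame origin to {100, 100}.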

Layers can be transformed using transform matrices, which are represented as CATransform3D structs. A CATransform3D simply holds the 16 CGFloat values of a 4x4 transform matrix. By default, a layer has its transform property set to CATransform3DIdentity, which unsurprisingly is the identity matrix. Setting a layer’s transform back to the identity removes any transform previously applied, returning the layer to its untransformed position, scale, rotation, and perspective. There are many ways to transform a layer. Although a transform matrix can be built by hand, it is easier to use the built-in functions that create a transform with a specific purpose. CATransform3DMakeTranslation creates a matrix that translates by x, y, and z. CATransform3DMakeScale creates a matrix that scales by x, y, and z. CATransform3DMakeRotation creates a matrix that rotates by an angle, given in radians, about the axis defined by the vector (x, y, z). A complex transform can be formed by combining these into one matrix with CATransform3DConcat, which takes two transform matrices and concatenates the second onto the first (a * b). Alternatively, variants of these functions apply a transform directly onto an existing one. These are CATransform3DTranslate, CATransform3DScale, and CATransform3DRotate, which take the same arguments as their counterparts plus a pre-existing transform. These transforms can be used to create some stunning effects, and are most useful when combined with animation, which will be discussed in a later section.
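
To see why the order of concatenation matters, the matrix arithmetic can be sketched in plain C. The SketchTransform type and the sketch_ functions below are hypothetical stand-ins, not the Core Animation functions themselves. The sketch assumes a row-vector convention with the translation stored in the last row, and it ignores perspective.

```c
/* Hypothetical stand-in for CATransform3D: a 4x4 matrix stored row by row.
   Points are treated as row vectors (p' = p * M), so translation sits in
   the last row. */
typedef struct { double m[4][4]; } SketchTransform;

SketchTransform sketch_identity(void) {
    SketchTransform t = {{{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};
    return t;
}

SketchTransform sketch_make_translation(double tx, double ty, double tz) {
    SketchTransform t = sketch_identity();
    t.m[3][0] = tx; t.m[3][1] = ty; t.m[3][2] = tz;
    return t;
}

SketchTransform sketch_make_scale(double sx, double sy, double sz) {
    SketchTransform t = sketch_identity();
    t.m[0][0] = sx; t.m[1][1] = sy; t.m[2][2] = sz;
    return t;
}

/* Returns a * b. With row vectors, a acts on a point first, then b. */
SketchTransform sketch_concat(SketchTransform a, SketchTransform b) {
    SketchTransform r;
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            r.m[i][j] = 0.0;
            for (int k = 0; k < 4; k++) {
                r.m[i][j] += a.m[i][k] * b.m[k][j];
            }
        }
    }
    return r;
}

/* Transforms the point (x, y, 0, 1); valid only without perspective. */
void sketch_apply(SketchTransform t, double x, double y,
                  double *out_x, double *out_y) {
    *out_x = x * t.m[0][0] + y * t.m[1][0] + t.m[3][0];
    *out_y = x * t.m[0][1] + y * t.m[1][1] + t.m[3][1];
}
```

Under these assumptions, concatenating a scale of 2 with a translation of 10 along x moves the point (1, 1) to (12, 2), while concatenating them in the opposite order moves it to (22, 2), because the translation itself gets scaled.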

A layer can be provided with content in multiple ways. The first and easiest is to set the contents property of the layer to an image. The image can be an NSImage on the Mac, or a CGImageRef on both iOS and the Mac. The current backing store of the layer is replaced with the content of the image. This behavior can be useful when focusing on optimization, and will be explored in further detail later. Another way to provide content is through drawing. Layers provide access to the drawing context through the delegate method -drawLayer:inContext:, or through an overridden -drawInContext: method in subclasses. In either case, the context is provided as a CGContextRef, which can be used to draw graphics as desired. The drawing is performed on the CPU, and the result is then converted into a texture optimized for the GPU. Behind the scenes, Core Animation ties in directly with OpenGL (OpenGL ES on iOS). Layers are really just wrappers around GPU textures, because the content is rasterized into a texture before being displayed on the screen.

Core Animation has multiple types of layers built in. The main class, CALayer, is the parent class for all other types of layers. It provides all the basic features that layers need. Core Animation also includes specialized subclasses of CALayer that add features, or modify behavior in some cases. CATiledLayer is designed to display content with dimensions so large that loading it all at once would be impractical or impossible. As its name implies, CATiledLayer splits up its content into smaller tiles. As transforms are applied to zoom the layer, tiles are dropped and redrawn asynchronously at the requested level of detail. CAScrollLayer is optimized for scrolling and displaying a portion of a layer. CATextLayer makes it easy to create a layer whose content is text. There are also additional classes that are only available on specific platforms. On OS X there are CAOpenGLLayer, QCCompositionLayer, QTMovieLayer, and QTCaptureLayer, whereas on iOS there is CAEAGLLayer.

Although layers by themselves are incredibly useful for creating performant interfaces, they are complemented with a powerful ability to tie in with the animation aspect of Core Animation. Apple’s implementation of animations is fantastic. On a superficial level, animations can be performed by creating an instance of an animation and adding it to a layer. Once an animation is added, Core Animation handles the rest. A dedicated secondary thread is used by Core Animation to ensure that regardless of whether the GPU can keep up with a full 60 frames per second, the animation will finish precisely on time. If needed, Core Animation will drop frames automatically to guarantee this behavior.
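
One way to picture how an animation can finish on time even when frames are dropped is to derive its progress from the clock rather than from a frame count. The following C function is a hypothetical sketch of that idea, not Core Animation’s actual implementation.

```c
/* Fraction of the animation that should be complete at time `now`.
   Because progress depends only on elapsed time, a frame rendered late
   simply shows a later state, and the animation still ends at
   begin + duration no matter how many frames were skipped. */
double sketch_progress(double begin, double duration, double now) {
    if (duration <= 0.0) {
        return 1.0;              /* a zero-length animation is already done */
    }
    double p = (now - begin) / duration;
    if (p < 0.0) return 0.0;     /* not started yet */
    if (p > 1.0) return 1.0;     /* finished; hold the final state */
    return p;
}
```

For an animation that begins at 10 seconds and lasts 2 seconds, a frame drawn at 11 seconds shows the halfway state whether it is the 60th frame rendered or only the 20th.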

A brief aside must be taken to discuss the basics of an Objective-C technology known as Key-Value Coding, or KVC. KVC is a way of accessing a property on an object by name, using a string, instead of directly accessing the object’s instance variables or accessors. At first this might seem useless, but in reality it can be incredibly useful, especially when combined with Core Animation. Accessing a KVC-compliant property is as simple as calling -valueForKey: and passing in the name of the property as an NSString. This can dramatically simplify code, such as in cases where identifiers are mapped to property names. Another way of accessing values is through key paths. A key path is a string that in essence “drills down” through KVC-compliant properties until the desired value has been found. Specific Core Animation classes, namely CAAnimation and CALayer, extend the NSKeyValueCoding protocol and add support for key paths, even into structs such as CGPoint and CATransform3D. This allows Core Animation to provide an incredibly simple API for animating specific values.

To get started with a basic animation, Core Animation includes a standard class, CABasicAnimation, that provides all the necessary tools to create an animation which interpolates from one value to another. It is a subclass of CAPropertyAnimation, which in turn is a subclass of CAAnimation. The first step is to create an instance of CABasicAnimation, which can be done with the convenience constructor +animationWithKeyPath:. Alternatively, the standard CAAnimation constructor +animation can be called and the keyPath property set afterwards. The key path identifies the property or value to be animated. Not all of CALayer’s properties are animatable; Apple’s documentation marks the ones that are. Next, the animation needs to know what to interpolate from and what to interpolate to. This is done by setting fromValue and toValue, which are both of type id. Because the type is id, scalar types must be wrapped into objects. Floats and other scalar numbers can be wrapped into NSNumbers, and structs such as CGRect can be wrapped into NSValues. For example, suppose you would like to animate the position. The position is a CGPoint, which is a struct, so it needs to be wrapped in an NSValue. This can be done with NSValue’s +valueWithCGPoint: method on iOS, or +valueWithPoint: on OS X. The final property worth mentioning in CABasicAnimation is byValue, which is optional. If set, Core Animation treats it as a relative offset rather than an absolute value: combined with fromValue, for example, the animation interpolates from fromValue to fromValue plus byValue. The other inherited properties (such as duration) are optional as well, and take their default values if omitted.
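
How fromValue, toValue, and byValue combine can be sketched for a scalar property in plain C. The function names are hypothetical, a NULL pointer stands in for a property left unset, and the case where nothing is set is simplified to “no change”, so this is an approximation of the documented behavior rather than the framework’s code.

```c
#include <stddef.h>

/* Works out the two endpoints of the interpolation. `current` is the value
   the layer currently presents. A NULL pointer means "property not set". */
void sketch_resolve_endpoints(const double *from, const double *to,
                              const double *by, double current,
                              double *start, double *end) {
    if (from && to)      { *start = *from;     *end = *to; }
    else if (from && by) { *start = *from;     *end = *from + *by; }
    else if (by && to)   { *start = *to - *by; *end = *to; }
    else if (from)       { *start = *from;     *end = current; }
    else if (to)         { *start = current;   *end = *to; }
    else if (by)         { *start = current;   *end = current + *by; }
    else                 { *start = current;   *end = current; } /* simplified */
}

/* Linear interpolation between the endpoints at progress in [0, 1]. */
double sketch_lerp(double start, double end, double progress) {
    return start + (end - start) * progress;
}
```

With a fromValue of 0 and a byValue of 50, the endpoints resolve to 0 and 50, and a quarter of the way through the animation the interpolated value is 12.5.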

Somewhat more complex animations can be formed by using CAKeyframeAnimation. Keyframe animations in Core Animation seem simple at first glance, but they can be used to create some surprisingly complex animations. Whereas CABasicAnimation interpolates between a single pair of values, CAKeyframeAnimation allows manipulation of intermediate points along the interpolation, which allows for far more control over the animation. Only one property must be set before the animation can be used, namely the values array, which is of type NSArray. The array needs to contain at minimum the ending value, but it may contain as many values as required to create the specific animation. There is also an optional property, keyTimes, which is an array of numbers ranging from 0 to 1, each corresponding to the value at the same index in the values array. If the two arrays do not have the same count, or if invalid numbers are specified in keyTimes, the keyTimes array is ignored. Since the times range from 0 to 1, they can be thought of as fractions of the total duration of the animation. If the animation needs to move along a specific path, CAKeyframeAnimation has an optional path property that accepts a CGPathRef. This path takes precedence over the values array (if provided), and is used along with the key times (if set) to create motion along the path. There are several other properties on CAKeyframeAnimation that can be used to further modify the behavior of the animation, but they are outside the scope of this article. One property does deserve mention, however: timingFunctions, which allows a different timing function for each segment between consecutive values in the animation.
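
The way values and keyTimes work together can be illustrated with a small C sketch that evaluates a scalar keyframe animation using linear interpolation between keyframes. The function name and its argument layout are hypothetical, and evenly spaced keyframes are assumed when no key times are given.

```c
#include <stddef.h>

/* Value of a scalar keyframe animation at overall progress t in [0, 1].
   values[i] is reached at key_times[i]. key_times must be ascending, start
   at 0, and end at 1; pass NULL to space the keyframes evenly.
   count must be at least 1. */
double sketch_keyframe_value(const double *values, const double *key_times,
                             int count, double t) {
    if (count == 1 || t <= 0.0) return values[0];
    if (t >= 1.0) return values[count - 1];
    for (int i = 0; i < count - 1; i++) {
        double t0 = key_times ? key_times[i] : (double)i / (count - 1);
        double t1 = key_times ? key_times[i + 1] : (double)(i + 1) / (count - 1);
        if (t <= t1) {
            /* Progress within this segment, from 0 at t0 to 1 at t1. */
            double local = (t1 > t0) ? (t - t0) / (t1 - t0) : 1.0;
            return values[i] + (values[i + 1] - values[i]) * local;
        }
    }
    return values[count - 1];
}
```

For the values 0, 100, 50 with key times 0, 0.25, 1, the sketch reaches 100 after a quarter of the duration and then spends the remaining three quarters easing back to 50, passing through 75 at a progress of 0.625.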

The ability to set timing functions is also present in CABasicAnimation. By default, animations use a linear curve, meaning the value changes at a constant speed from start to finish. For some purposes this can be useful, but when creating user interface animations this can make the animations appear “dead”. This can be easily remedied by using the timingFunction property declared in CAAnimation, from which CABasicAnimation inherits. In the case of CAKeyframeAnimation, the timing function can be set through the related array of timingFunctions. Both cases require an object of type CAMediaTimingFunction. A timing function object is usually created by using +functionWithName:, passing in one of the timing function constants. These constants are kCAMediaTimingFunctionLinear, kCAMediaTimingFunctionEaseIn, kCAMediaTimingFunctionEaseOut, kCAMediaTimingFunctionEaseInEaseOut, and kCAMediaTimingFunctionDefault. For most purposes kCAMediaTimingFunctionEaseInEaseOut provides the most natural feeling for animations. Custom timing functions can also be created by using the -initWithControlPoints:::: method on CAMediaTimingFunction. Although this offers a good degree of flexibility, the resulting curve is still a single cubic Bézier defined by two control points, so additional points cannot be specified. However, CAKeyframeAnimation was designed with this in mind, and can be an excellent substitute if the timing functions do not fit the needed style of animation.
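
A timing function of this kind can be evaluated with a short C sketch. It treats the curve as a cubic Bézier from (0, 0) to (1, 1) with two inner control points, and uses bisection to find the point on the curve for a given fraction of elapsed time. The function names are hypothetical, and the control points (0.42, 0) and (0.58, 1) used below for an ease-in, ease-out curve are an assumption rather than values taken from this article.

```c
/* One coordinate of a cubic Bezier whose endpoints are fixed at 0 and 1 and
   whose inner control values are c1 and c2, at curve parameter s. */
double sketch_bezier_coord(double c1, double c2, double s) {
    double inv = 1.0 - s;
    return 3.0 * inv * inv * s * c1 + 3.0 * inv * s * s * c2 + s * s * s;
}

/* Maps linear progress t (the x axis) to eased progress (the y axis) for a
   curve with control points (c1x, c1y) and (c2x, c2y). The x coordinate is
   monotonic when c1x and c2x lie in [0, 1], so bisection can find the curve
   parameter s at which x(s) equals t. */
double sketch_timing_value(double c1x, double c1y, double c2x, double c2y,
                           double t) {
    if (t <= 0.0) return 0.0;
    if (t >= 1.0) return 1.0;
    double lo = 0.0, hi = 1.0;
    for (int i = 0; i < 60; i++) {
        double mid = 0.5 * (lo + hi);
        if (sketch_bezier_coord(c1x, c2x, mid) < t) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return sketch_bezier_coord(c1y, c2y, 0.5 * (lo + hi));
}
```

With the assumed ease-in, ease-out control points, the eased progress lags behind the elapsed time during the first half (the animation starts slowly), meets it at the midpoint, and runs ahead during the second half. With control points (0, 0) and (1, 1) the output equals the input, which corresponds to linear pacing.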

Although layers can be extremely performant, there are some bottlenecks that can cause unwanted lag. The first area of optimization is blending. A layer that has no transparent areas is opaque, and opaque layers are not blended with any layers beneath them. When a non-opaque layer is placed above another layer, the GPU is forced to blend the two together, which is quite an expensive operation. Using the Core Animation instrument (currently only available for iOS), detecting blended layers is as easy as checking the box labeled “Color Blended Layers”. The areas shown in red are blended, and should be reduced as much as possible to avoid this extra compositing work.
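
The cost of blending is easier to see from the arithmetic involved. The C sketch below shows standard “source over” compositing for a single color channel with non-premultiplied alpha. It is a general illustration with a hypothetical function name, and makes no claim about how Core Animation performs the operation internally.

```c
/* One color channel of "source over" compositing: the top layer's color is
   weighted by its alpha, and the layer beneath shows through in proportion
   to the remaining transparency. All values are in the range [0, 1]. */
double sketch_blend_channel(double top, double top_alpha, double bottom) {
    return top * top_alpha + bottom * (1.0 - top_alpha);
}
```

When the top layer’s alpha is 1.0, the bottom term is multiplied by zero, so the pixel beneath never needs to be read or combined. For a non-opaque layer, this calculation has to be carried out for every channel of every covered pixel.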

The second area of optimization is pixel alignment. Core Animation, UIKit, and AppKit all use floating-point numbers to store coordinates. When the content is actually drawn on the screen, fractional coordinates may not line up with the pixel grid. The misaligned edges must then be anti-aliased, which can put some degree of strain on the GPU, not to mention the unsightly fuzzy appearance. This issue can be resolved easily by ensuring that coordinate calculations always produce whole numbers. Since non-integral coordinates usually arise during layout calculations, the recommended approach is to pass the results through the floor() or ceil() functions.

Finally, a large source of performance issues can arise from masking to bounds. If a layer has a corner radius set, for example, masking to bounds must be turned on for the layer’s contents and sublayers to be clipped to the rounded corners. Unfortunately, turning on masking causes the layer to be rendered in offscreen passes, which can absolutely destroy performance. There are multiple ways to work around this issue, but the simplest and arguably the best is to avoid masking altogether. The corner radius effect can be emulated by clipping the context to a path before drawing. Another way to work around this issue is to rasterize the layer.

Layer rasterizing can sometimes bring fantastic performance boosts to complex layers. Layer rasterization can be turned on by setting the shouldRasterize property on the CALayer in question to YES. Rasterization works by rendering the entire layer as a bitmap, including effects such as masked bounds and shadows. This can potentially cause a tremendous increase in speed when animating. The GPU excels at moving around textures, and rasterized layers are no exception. However, rasterizing layers does not always work in your favor. If the content of the layer changes at all, the entire bitmap cache of the layer is invalidated, rendered, and cached again. Naturally this is quite inefficient, and as a result layers whose content changes during an animation should not be rasterized. Additionally, rasterizing a layer does not mean that non-opaque layers with blending issues will perform better; the blending still needs to take place. Finally, rasterization uses more memory because of the additional bitmap cache of the layer. These allocations are significantly more expensive because they are effectively mutable surface textures, and in resource-constrained scenarios they are paged to another backing store rather than destroyed. On OS X this is not much of a worry due to the large amount of memory available to applications, but on iOS devices memory usage can be a potential concern.

Decreasing memory usage is a vital aspect of development. Although layers are lightweight, they still must keep their textures in memory. If the same texture is used repeatedly, the same bitmap can be shared across multiple layers. Suppose a table view has cells that all share the same background. One way to optimize this would be to draw the background once, and render it into an image that all the cells have access to. Then, each cell can either set its layer’s contents to this image, or create sublayers that use the same image for their content. The end result is the reuse of a single bitmap, which brings some welcome memory improvements.

Animations are fantastic. However, they must serve a purpose. An application that includes animations just for the sake of having animations should not have them at all. Animations serve multiple purposes, but their most important purpose is to provide context. Suppose you’re using an email client. You are scrolling through your emails, and you decide you no longer need one of them. You tap or click delete. Without animation serving as contextual guidance, the cell that contained the email you had selected suddenly disappears. This happens so quickly that your eye notices a flash, and suddenly the list is updated and the old email is nowhere to be found. At this point, you might second-guess yourself, and start to wonder whether you really deleted the right message. This type of behavior does not happen in real life, and therefore it’s hard to relate to it. Now suppose this whole process is animated. You delete the message, and it gently drifts away and fades out, while the messages beneath it slide up to fill the gap left by the deleted message. This action is much closer to what happens in real life, and an animation in this case reassures the user that the correct message was indeed deleted.

As it unsurprisingly turns out, interfaces that implement these types of natural animations are far more intuitive than those that do not. On the other hand, interfaces that implement animations with no counterpart in real life are quite counterintuitive. Another example of an intuitive animation is on iOS. When navigating to a new level in a view hierarchy, the old view is pushed out to the left while the new view is simultaneously pushed in from the right. A similar situation could be created in the real world if two sheets of paper were taped together and put underneath a frame that only showed a single sheet. If the sheet under the frame were pushed to the left, the sheet attached to its right would slide into view. These types of animations and movements are instantly relatable; no explanation is needed. This is how animation should be.

Core Animation is truly a masterpiece of engineering. A small amount of code can produce stunning animations that enhance your interface. Nearly all the complicated work previously needed to create complex transformations is unnecessary with Core Animation. With some care, proper use of Core Animation can transform your application into something extraordinary.