A system stores images as a series of layers by determining (i) the boundaries of regions of coherent motion over the entire image, or frame, sequence; and (ii) the associated motion parameters, or coefficients of motion equations, that describe the transformations of those regions from frame to frame. The system first estimates motion locally, by determining the movements within small neighborhoods of pixels from one image frame i to the next image frame i+1, to develop an optical flow, or dense motion, model of the image. Next, the system estimates the motion using affine or other low-order, smooth transformations within a set of regions that it has previously identified as having coherent motion, i.e., regions identified by analyzing the motions in frames i-1 and i. It groups, or clusters, similar motion models and iteratively produces an updated set of models for the image. The system then uses the local motion estimates to associate each pixel in the image with the motion model that most closely matches that pixel's movement, thereby updating the regions of coherent motion. Using these updated regions, the system again updates its motion models and, as appropriate, further updates the coherent motion regions, and so forth. The system then repeats this analysis for the remaining frames. Finally, the system segments the image into regions of coherent motion and defines the associated layers in terms of (i) pixel intensity values, (ii) associated motion model parameters, and (iii) order in "depth" within the image.
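The alternation between fitting motion models to regions and reassigning pixels to the best-matching model can be illustrated with a minimal sketch. It is not the system's actual implementation: it assumes the optical flow is already available as a grid of (vx, vy) vectors, uses an affine model of the form vx = a0 + a1*x + a2*y and vy = b0 + b1*x + b2*y, fits it by ordinary least squares, omits the step that clusters similar models, and runs a fixed number of iterations. All function names (solve3, fit_affine, assign_pixels, segment_by_motion) are hypothetical.

```python
def solve3(m, b):
    """Solve the 3x3 system m x = b by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("region too small or degenerate for an affine fit")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(flow, labels, k):
    """Least-squares affine fit to the flow vectors of the pixels labelled k.

    Returns ([a0, a1, a2], [b0, b1, b2]) via the normal equations.
    """
    ata = [[0.0] * 3 for _ in range(3)]
    atvx = [0.0] * 3
    atvy = [0.0] * 3
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab != k:
                continue
            basis = (1.0, float(x), float(y))
            vx, vy = flow[y][x]
            for i in range(3):
                atvx[i] += basis[i] * vx
                atvy[i] += basis[i] * vy
                for j in range(3):
                    ata[i][j] += basis[i] * basis[j]
    return solve3(ata, atvx), solve3(ata, atvy)


def assign_pixels(flow, models):
    """Label each pixel with the index of the model closest to its flow vector."""
    labels = []
    for y, row in enumerate(flow):
        out = []
        for x, (vx, vy) in enumerate(row):
            errs = []
            for (a0, a1, a2), (b0, b1, b2) in models:
                px = a0 + a1 * x + a2 * y
                py = b0 + b1 * x + b2 * y
                errs.append((px - vx) ** 2 + (py - vy) ** 2)
            out.append(errs.index(min(errs)))
        labels.append(out)
    return labels


def segment_by_motion(flow, labels, n_iters=5):
    """Alternate model fitting and pixel reassignment (n_iters is an assumption)."""
    models = []
    for _ in range(n_iters):
        ids = sorted({lab for row in labels for lab in row})
        models = [fit_affine(flow, labels, k) for k in ids]
        labels = assign_pixels(flow, models)
    return labels, models
```

As a usage example, consider an 8x8 flow field whose left half moves by (1, 0) and whose right half moves by (0, 2), with an initial region boundary deliberately placed two columns too far left. Calling segment_by_motion(flow, init) with init[y][x] = 0 for x < 2 and 1 otherwise moves the boundary to the true motion discontinuity at x = 4 and recovers the two translations as the fitted models. A fuller treatment would also merge similar models and handle regions too small to constrain an affine fit, which this sketch simply rejects with an error.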