A video gesture-based three-dimensional computer interface system that uses images of hand gestures to control a computer and that tracks motion of the user's hand or a portion thereof in a three-dimensional coordinate system with ten degrees of freedom. The system includes a computer with image processing capabilities and at least two cameras connected to the computer. During operation of the system, hand images from the cameras are continually converted to a digital format and input to the computer for processing. The results of the processing and attempted recognition of each image are then sent to an application or the like executed by the computer for performing various functions or operations. When the computer recognizes a hand gesture as a "point" gesture with one or two extended fingers, the computer uses information derived from the images to track three-dimensional coordinates of each extended finger of the user's hand with five degrees of freedom. The computer utilizes two-dimensional images obtained by each camera to derive three-dimensional position (in an x, y, z coordinate system) and orientation (azimuth and elevation angles) coordinates of each extended finger.
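The derivation of five-degree-of-freedom finger coordinates from two camera views can be illustrated with a minimal sketch. It is not the method claimed here; it assumes a simplified, hypothetical setup: two ideal pinhole cameras with parallel optical axes along +z, a shared focal length in pixels, pixel coordinates measured from the image center with v increasing along world y, and fingertip and finger-base pixels that have already been located in each image. The names `ray_from_pixel`, `triangulate`, `finger_pose` and `project`, and the azimuth and elevation conventions, are assumptions made for illustration.

```python
import math


def project(cam_center, focal_px, point):
    """Project a 3D point into an ideal pinhole camera looking along +z (assumed model)."""
    x, y, z = (p - c for p, c in zip(point, cam_center))
    return focal_px * x / z, focal_px * y / z


def ray_from_pixel(cam_center, focal_px, u, v):
    """Return (origin, direction) of the viewing ray through pixel (u, v)."""
    return cam_center, (u / focal_px, v / focal_px, 1.0)


def triangulate(ray_a, ray_b):
    """Midpoint of the shortest segment joining two viewing rays."""
    (o1, d1), (o2, d2) = ray_a, ray_b
    dot = lambda p, q: sum(x * y for x, y in zip(p, q))
    w = tuple(p - q for p, q in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom  # parameter along ray_a
    t = (a * e - b * d) / denom  # parameter along ray_b
    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    return tuple((m + n) / 2.0 for m, n in zip(p1, p2))


def finger_pose(tip, base):
    """Return (x, y, z, azimuth_deg, elevation_deg) for one extended finger.

    Assumed convention: azimuth is measured in the x-z plane from +z toward +x,
    and elevation is the angle of the finger axis above the x-z plane.
    """
    dx, dy, dz = (t - b for t, b in zip(tip, base))
    azimuth = math.degrees(math.atan2(dx, dz))
    elevation = math.degrees(math.atan2(dy, math.hypot(dx, dz)))
    return tip[0], tip[1], tip[2], azimuth, elevation


if __name__ == "__main__":
    # Synthetic example: two cameras 0.2 units apart, focal length 500 px (illustrative values).
    cam_left, cam_right, focal = (-0.1, 0.0, 0.0), (0.1, 0.0, 0.0), 500.0
    true_base, true_tip = (0.0, 0.0, 1.0), (0.03, 0.05, 1.04)

    points = []
    for world_point in (true_tip, true_base):
        left_px = project(cam_left, focal, world_point)
        right_px = project(cam_right, focal, world_point)
        points.append(triangulate(ray_from_pixel(cam_left, focal, *left_px),
                                  ray_from_pixel(cam_right, focal, *right_px)))

    print(finger_pose(*points))
```

With two extended fingers, the same computation would run once per finger, giving five values each and ten in total. A real system would also need camera calibration, lens-distortion correction, and a way to match the same finger across the two views, none of which this sketch attempts.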