An input interface system provides gesture-based user control of an application running on a computer by classification of user gestures in image signals. A given one of the image signals is processed to determine if it contains one of a number of designated user gestures, e.g., a point gesture, a reach gesture and a click gesture, each of the gestures being translatable to a particular control signal for controlling the application. If the image signal is determined to contain a point gesture, further processing is performed to determine position and orientation information for a pointing finger of a hand of the user and its corresponding shadow. The position and orientation information for the pointing finger and its shadow is then utilized to generate a three-dimensional pose estimate for the pointing finger in the point gesture. For example, the three-dimensional pose estimate may be in the form of a set of five parameters (X, Y, Z, α, ε), where (X, Y, Z) denotes the position of a tip of the pointing finger in three-dimensional space, and (α, ε) denotes the respective azimuth and elevation angles of an axis of the pointing finger. The point gesture can thus be used to provide user control in virtual flight simulators, graphical editors, video games and other applications.
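
The five-parameter pose estimate described above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the class name `FingerPose`, its methods, the use of radians, and the axis convention (azimuth measured in the X-Y plane from the +X axis, elevation measured up from that plane toward +Z) are all assumptions made for illustration. The sketch only shows how a fingertip position plus azimuth and elevation angles determines a pointing axis in three-dimensional space; it does not show how the pose is estimated from the finger and its shadow.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FingerPose:
    """Hypothetical container for the five pose parameters (X, Y, Z, azimuth, elevation)."""

    x: float          # fingertip position, X coordinate
    y: float          # fingertip position, Y coordinate
    z: float          # fingertip position, Z coordinate
    azimuth: float    # radians, assumed measured in the X-Y plane from the +X axis
    elevation: float  # radians, assumed measured up from the X-Y plane toward +Z

    def axis_direction(self):
        """Return a unit vector along the finger axis under the assumed convention."""
        cos_elev = math.cos(self.elevation)
        return (
            cos_elev * math.cos(self.azimuth),
            cos_elev * math.sin(self.azimuth),
            math.sin(self.elevation),
        )

    def point_along_axis(self, distance):
        """Return the 3-D point reached by moving `distance` from the fingertip along the axis."""
        dx, dy, dz = self.axis_direction()
        return (
            self.x + distance * dx,
            self.y + distance * dy,
            self.z + distance * dz,
        )


if __name__ == "__main__":
    # Example values are arbitrary and chosen only to exercise the methods.
    pose = FingerPose(x=0.10, y=0.25, z=0.05,
                      azimuth=math.radians(30.0), elevation=math.radians(-15.0))
    print(pose.axis_direction())
    print(pose.point_along_axis(0.5))
```

An application could, for instance, extend the axis from the fingertip to find where the user is pointing in a scene; a different coordinate convention would change only the body of `axis_direction`.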

