The VideoHarp is an optical-scanning device that senses and tracks the movement of multiple fingers; the tracked movement is then used to control the generation of light or sound, or the motion of other physical objects. Preferably, the VideoHarp detects the images of a performer's fingertips using a single sensor. From these images, the movement of each fingertip is tracked, and this information is translated into a standard output, which is preferably used to control a device that generates sound or light. The translation of finger motion into control signals is programmable, enabling the VideoHarp to be played using a variety of different types of motions and gestures. For example, the VideoHarp may be played with harp-like or keyboard-like gestures, by bowing or drumming motions, or even by gestures and motions with no analogue in existing instrument techniques.
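
The programmable translation described above can be pictured as a set of interchangeable mapping functions applied to the same stream of tracked fingertip positions. The following is only a minimal sketch under stated assumptions, not the VideoHarp's actual implementation: the `Frame` and `ControlEvent` types, the normalized position range, the zone-crossing rule for "harp-like" play, and the new-finger rule for "keyboard-like" play are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical representation: finger id -> normalized position in [0, 1)
# along the sensed region. The real device's data format is not specified here.
Frame = Dict[int, float]


@dataclass(frozen=True)
class ControlEvent:
    kind: str    # illustrative event name, e.g. "pluck" or "press"
    finger: int  # id of the tracked fingertip
    index: int   # which virtual string or key the event refers to


# A mapping turns (previous frame, current frame) into control events.
Mapping = Callable[[Frame, Frame], List[ControlEvent]]


def harp_mapping(num_strings: int) -> Mapping:
    """Assumed harp-like rule: a 'pluck' fires when a tracked finger
    moves from one virtual string zone into a different one."""
    def translate(prev: Frame, curr: Frame) -> List[ControlEvent]:
        events = []
        for finger, pos in curr.items():
            if finger in prev:
                before = int(prev[finger] * num_strings)
                after = int(pos * num_strings)
                if before != after:
                    events.append(ControlEvent("pluck", finger, after))
        return events
    return translate


def keyboard_mapping(num_keys: int) -> Mapping:
    """Assumed keyboard-like rule: a 'press' fires when a finger
    newly appears in the sensed region."""
    def translate(prev: Frame, curr: Frame) -> List[ControlEvent]:
        return [
            ControlEvent("press", finger, int(pos * num_keys))
            for finger, pos in curr.items()
            if finger not in prev
        ]
    return translate


def run(frames: List[Frame], mapping: Mapping) -> List[ControlEvent]:
    """Feed successive frames through the chosen mapping."""
    events: List[ControlEvent] = []
    prev: Frame = {}
    for curr in frames:
        events.extend(mapping(prev, curr))
        prev = curr
    return events


# The same finger motion, interpreted under two different mappings.
frames = [{0: 0.15}, {0: 0.30}, {0: 0.35, 1: 0.85}]
harp_events = run(frames, harp_mapping(num_strings=4))
key_events = run(frames, keyboard_mapping(num_keys=10))
```

Swapping `harp_mapping` for `keyboard_mapping` changes how the instrument responds without touching the tracking stage, which is the sense in which the translation is programmable. A bowing or drumming mapping could be added in the same way, for instance by looking at fingertip velocity between frames.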