An imaging instrument includes a compact hand-held housing having an electronic imaging element supported within the housing, and a plurality of interchangeable instrument heads separably attachable to the housing. Each of the instrument heads includes an optical system disposed in alignment with the electronic imaging element along an instrument viewing axis. Preferably, the instrument further includes an integral display for displaying at least one captured or real-time video image as viewed through the instrument head of choice. The instrument includes a controller with sufficient programmable logic to capture and store a plurality of images, which can be transferred along with audio and/or annotation data relating to a captured image. Corresponding video and audio data can then be transferred, using a receiving cradle, to a computer containing software that organizes the stored data for further processing. In a preferred example, the audio files can be transcribed through a network utilizing voice recognition software.