A visual display device is provided for delivering a generated image, preferably combinable with environment light, to the eye of a user. The device is lightweight and compact but yields a high-quality image. In one embodiment, a shroud protects against stray light and holds the optical elements in the desired alignment. In one embodiment, an image generator is masked by at least two masks to provide a high-quality image without waste. In one embodiment, a removably mounted shield or an activatable device can convert the apparatus from a see-through device to an immersion device and back again. In one embodiment, the device can be comfortably mounted on the user's head while still allowing for the use of conventional eyeglasses. A tracker for outputting an indication of the orientation, attitude and/or position of a head-mounted display (HMD) may be provided. The tracker can be configured so that it is incorporated in the HMD housing and/or can be easily decoupled from the HMD, so that the HMD can be used without the tracker (e.g., for watching movies). Preferably, decoupling involves unplugging a single electrical connector (such as a cable) and unfastening a mechanical connection (such as a strap). Preferably, the tracker provides pass-through of signals to the HMD and, when the tracker is coupled to the HMD, only a single cable or other data link connects the HMD-tracker combination to the host computer. In one embodiment, the tracker uses magnetic sensors. In another embodiment, one or more inertial sensors, such as a rate gyro and/or accelerometers, are used.
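As an illustration of how readings from a rate gyro and an accelerometer might be combined into a single attitude output, the following is a minimal sketch using a complementary filter. The filter is a common general-purpose technique and is not stated in the text above to be the method used by the tracker. The function names, the axis convention, the sample format, and the blending weight `alpha` are all assumptions made for the example.

```python
import math


def accel_pitch(ax, ay, az):
    """Pitch angle (radians) implied by the gravity vector.

    Assumed axis convention (hypothetical): x forward, z up when level,
    so a level, stationary sensor reads roughly (0, 0, 1) in units of g.
    """
    return math.atan2(-ax, math.sqrt(ay * ay + az * az))


def complementary_filter(pitch, gyro_rate, accel, dt, alpha=0.98):
    """One filter update.

    The gyro rate (rad/s) is integrated for short-term responsiveness,
    and the accelerometer-derived angle is blended in with weight
    (1 - alpha) to limit long-term drift. alpha is an assumed tuning value.
    """
    gyro_estimate = pitch + gyro_rate * dt
    return alpha * gyro_estimate + (1.0 - alpha) * accel_pitch(*accel)


def track_pitch(samples, dt, alpha=0.98):
    """Run the filter over (gyro_rate, (ax, ay, az)) samples, starting at 0."""
    pitch = 0.0
    history = []
    for gyro_rate, accel in samples:
        pitch = complementary_filter(pitch, gyro_rate, accel, dt, alpha)
        history.append(pitch)
    return history
```

In this sketch, a head held still at a fixed tilt produces an estimate that settles near the accelerometer-derived angle, and a small constant gyro bias shifts the estimate by a bounded offset instead of accumulating without limit as it would under pure integration.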