Vector processing in a computer is achieved by means of a plurality of vector registers, a plurality of independent, fully segmented functional units, and means for controlling the operation of the vector registers. Operations are performed on data passing from vector register to functional unit and back to vector register with minimal delay, rather than from memory to functional unit and back to memory, with the much greater start-up delays that memory access entails. Data may be transferred in bulk between memory and some vector registers while other vector registers are involved in vector processing with one or more functional units. In vector processing, elements of one or more vector registers are successively transmitted as operands to a functional unit at a rate of one per clock period, and results are transmitted from the functional unit to a receiving vector register at the same rate. In a chaining mode of operation, each element arriving in a result vector register becomes available for immediate and simultaneous transmission as an operand to another functional unit. In this mode, more than one result can be obtained per clock period.
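The timing benefit of chaining can be illustrated with a minimal sketch. It is a simplified model and not a description of any particular hardware: the vector length of 64 elements and the functional unit latencies of 7 and 6 clock periods are hypothetical values chosen for illustration, and the function names are invented for this example. The model assumes only what is stated above, namely that operands enter a functional unit at one per clock period and results leave at the same rate.

```python
from collections import Counter


def result_times(start, latency, n):
    """Clock periods at which each of n results reaches the receiving register.

    Operands are issued one per clock period beginning at `start`; each
    result emerges `latency` clock periods after its operands were issued.
    """
    return [start + latency + i for i in range(n)]


def unchained(n, lat_a, lat_b):
    """Unit B waits until unit A has delivered its entire result vector."""
    a = result_times(0, lat_a, n)
    start_b = a[-1] + 1  # first clock period after A's last result arrives
    b = result_times(start_b, lat_b, n)
    return a, b


def chained(n, lat_a, lat_b):
    """Each result of unit A is forwarded to unit B in the clock period it arrives."""
    a = result_times(0, lat_a, n)
    b = [t + lat_b for t in a]
    return a, b


def peak_results_per_clock(a, b):
    """Largest number of results delivered in any single clock period."""
    return max(Counter(a + b).values())


if __name__ == "__main__":
    # Hypothetical parameters: 64 elements, latencies of 7 and 6 clock periods.
    n, lat_a, lat_b = 64, 7, 6
    for name, model in (("unchained", unchained), ("chained", chained)):
        a, b = model(n, lat_a, lat_b)
        print(name, "last result at clock", b[-1],
              "peak results per clock", peak_results_per_clock(a, b))
```

Under these assumed values, the last result arrives at clock period 140 without chaining and at clock period 76 with chaining. In the chained case both functional units deliver a result in the same clock period for most of the operation, which is the sense in which more than one result can be obtained per clock period.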