A multiprocessor having improved bus efficiency is shown to include a number of processing units and a memory coupled to a system bus. Also coupled to the system bus are at least one I/O bridge systems. A method for improving partial cache line writes from I/O devices to the central processing units incorporates cache coherency protocol and an enhanced invalidation scheme to ensure atomicity which minimizing the bus utilization. In addition, a method for allowing peer-to-peer communication between I/O devices coupled to the system bus via different I/O bridges includes a command and address space configuration that allows for communication without the involvement of any central processing device. Interrupt performance is improved through the storage of an interrupt data structure in main memory. The I/O bridges maintain the data structure, and when the CPU is available the interrupts can be accessed by a fast memory read; thereby reducing the requirement of I/O reads for interrupt handling.