05513354 is referenced by 63 patents and cites 11 patents.

A method and apparatus are disclosed for managing tasks in a network of processors. After a period of time has elapsed, during which the processors of the network have been executing tasks allocated to them, the processors exchange views as to which pending tasks have or have not been completed. The processors reach a consensus as to the overall state of completion of the pending tasks. In a preferred embodiment, the processors exchange views and update their views based on the views received from the other processors. A predetermined condition determines that a consensus has been reached. The predetermined condition is preferably two sets of exchanges in which a processor has received messages from the same set of other processors. Alternatively, the condition is an exchange which does not result in any updates to a processor's view. A processor which has not sent a view as part of an exchange is deemed to have crashed, and the tasks previously allocated to crashed processors are assumed not to have been completed. All pending tasks, including those previously allocated but not completed, are then allocated. Preferably, allocation is based on an estimation that approximately the same time will be required for each processor to complete its allocated tasks. Based on this estimation, a time is scheduled for the next exchange of views, and the processors then resume executing their allocated tasks.
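The cycle described in the abstract (exchange views, detect consensus, treat silent processors as crashed, reallocate, schedule the next exchange) can be illustrated with a minimal single-process simulation. This is a sketch under stated assumptions, not the patented implementation. The names `reach_consensus`, `allocate`, and `sends_in_round` are hypothetical. Every message from a live processor is assumed to reach all other live processors. The greedy largest-task-first balancing and the use of the largest estimated load as the next exchange time are illustrative stand-ins for the estimation the abstract mentions.

```python
import heapq


def reach_consensus(views, sends_in_round):
    """Merge views round by round until two successive rounds have the same senders.

    views: dict mapping processor -> set of task ids it believes are complete.
    sends_in_round: hypothetical callback (processor, round_no) -> bool that
        models crashes; a processor that stops sending is deemed crashed.
        Assumed crash-only: once it returns False for a processor, it stays False.
    Returns (agreed_completed_tasks, live_processors).
    """
    views = {p: set(v) for p, v in views.items()}
    prev_senders = None
    round_no = 0
    while True:
        round_no += 1
        senders = frozenset(p for p in views if sends_in_round(p, round_no))
        # Each live processor updates its view with everything it received.
        merged = set().union(*(views[p] for p in senders))
        for p in senders:
            views[p] = set(merged)
        # Stated condition: two sets of exchanges with the same set of senders.
        if senders == prev_senders:
            return merged, senders
        prev_senders = senders


def allocate(pending_costs, processors):
    """Give each task, largest estimated cost first, to the least-loaded processor.

    Returns (plan, next_exchange_time), where next_exchange_time is the largest
    estimated load (an assumption made for this sketch).
    """
    heap = [(0.0, p) for p in sorted(processors)]
    heapq.heapify(heap)
    plan = {p: [] for p in processors}
    for task, cost in sorted(pending_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, p = heapq.heappop(heap)
        plan[p].append(task)
        heapq.heappush(heap, (load + cost, p))
    next_exchange_time = max((load for load, _ in heap), default=0.0)
    return plan, next_exchange_time


# Illustrative data: processor C sends in round 1 and then crashes.
owner = {"t1": "A", "t2": "A", "t3": "B", "t4": "C", "t5": "C"}
cost = {"t1": 3.0, "t2": 1.0, "t3": 2.0, "t4": 2.0, "t5": 4.0}
views = {"A": {"t1"}, "B": {"t3"}, "C": {"t4"}}

agreed, live = reach_consensus(views, lambda p, r: p != "C" or r < 2)

# Tasks owned by crashed processors are assumed incomplete, even if reported done.
pending = {t: c for t, c in cost.items() if t not in agreed or owner[t] not in live}
plan, next_exchange_time = allocate(pending, live)
# plan == {"A": ["t5"], "B": ["t4", "t2"]}, next_exchange_time == 4.0
```

In this run the exchange ends after the third round, because rounds two and three both hear from exactly A and B. Task t4 is reallocated although C reported it complete, which mirrors the abstract's rule that work allocated to a crashed processor is assumed not to have been completed.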

Title
Fault tolerant load management system and method
Application Number
7/993183
Publication Number
5513354
Application Date
December 18, 1992
Publication Date
April 30, 1996
Inventor
Hovey R Strong Jr, San Jose, CA, US
Joseph Y Halpern, Cupertino, CA, US
Cynthia Dwork, Palo Alto, CA, US
Agent
James C Pintner
Assignee
International Business Machines Corporation, NY, US
IPC
G06F 15/16