An array processor includes a managing element having a load streaming unit coupled to multiple processing elements. The load streaming unit provides input data portions to each of a first subset of the processing elements and also receives output data from each of a second subset of the processing elements based on a comparatively sorted combination of the input data portions provided to the first subset of processing elements. Furthermore, each of processing elements is configurable by the managing element to compare input data portions received from either the load streaming unit or two or more of the other processing elements, wherein the input data portions are stored for processing in respective queues. Each processing unit is further configurable to select an input data portion to be output data based on the comparison, and in response to selecting the input data portion, remove a queue entry corresponding to the selected input data portion. Each processing element may be further configured to provide the selected output data portion to either the managing element or as an input to one of the processing elements.