A data processing system and method for solving pattern classification and function-fitting problems includes a neural network in which N-dimensional input vectors are augmented with j additional elements (j being at least one) to form an (N+j)-dimensional projected input vector, which is then preferably normalized in magnitude so that it lies on the surface of a hypersphere. The weight vectors of at least the lowest intermediate layer of network nodes are preferably also constrained to lie on this (N+j)-dimensional surface.
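The projection step described above can be illustrated with a minimal sketch. It assumes a single added element (j = 1) with a constant value `h`, followed by scaling to a sphere of radius `radius`; the function name, `h`, and `radius` are illustrative assumptions and are not taken from the source.

```python
import math

def project_to_hypersphere(x, h=1.0, radius=1.0):
    """Append one element h to x, then scale the result to length `radius`.

    h and radius are hypothetical parameters chosen for illustration.
    """
    augmented = list(x) + [h]  # N-dimensional -> (N+1)-dimensional
    norm = math.sqrt(sum(v * v for v in augmented))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector; use a nonzero h")
    # Every projected vector now has the same magnitude, so it lies on the sphere.
    return [radius * v / norm for v in augmented]
```

For example, `project_to_hypersphere([3.0, 4.0], h=12.0)` divides each component of `[3.0, 4.0, 12.0]` by its length, 13. Because the added element is nonzero, input vectors that differ only in magnitude still map to distinct points on the sphere.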
To train the network, the system compares network output values with known goal vectors and then minimizes an error function, which depends on all weights and threshold values of the intermediate and output nodes. To decrease the network's learning time further, the weight vectors of the intermediate nodes are preferably initialized to known prototypes for the various classes of input vectors. The invention also allows the network to be separated into sub-networks, which are trained individually and later recombined. The network can use both hyperspheres and hyperplanes to form decision boundaries and can converge to one even if it initially assumes the other.
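As a rough sketch of prototype initialization and the error comparison, the following assumes a sigmoid node activation and a sum-of-squares error; neither is specified in the source, and all function names and parameters here are hypothetical.

```python
import math

def project(x, h=1.0, radius=1.0):
    """Append an assumed constant element h and scale to length `radius`."""
    augmented = list(x) + [h]
    norm = math.sqrt(sum(v * v for v in augmented))
    return [radius * v / norm for v in augmented]

def init_from_prototypes(prototypes, h=1.0, radius=1.0):
    """Set each intermediate node's weight vector to a projected class prototype."""
    return [project(p, h, radius) for p in prototypes]

def node_activation(weights, x, threshold):
    """Sigmoid of (weights . x - threshold); the sigmoid is an assumption."""
    s = sum(w * v for w, v in zip(weights, x)) - threshold
    return 1.0 / (1.0 + math.exp(-s))

def squared_error(outputs, goals):
    """Sum-of-squares error between outputs and a goal vector (assumed form)."""
    return 0.5 * sum((o - g) ** 2 for o, g in zip(outputs, goals))
```

With weights initialized this way, a projected input that coincides with a class prototype produces the largest dot product at that prototype's node, so the network starts close to a usable solution and has less error to remove during minimization.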