A computer-implemented method, apparatus, and computer-usable program code for determining aggregate values of health data items from heterogeneous databases containing heterogeneously coded medical data. The data in the heterogeneous databases is queried using a series of semantic layers, including (i) cascaded asymmetric association tables and (ii) semantic search. The heterogeneously coded medical data items are translated into conformal dimensions, and denominator files of combinations of disease data are derived. The denominator files are aggregated based on a mapping of the coded medical and demographic conditions. The aggregated data is stored in a target data repository.
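The translation and aggregation steps can be illustrated with a minimal Python sketch. It is not the claimed implementation: the two lookup tables stand in for cascaded association tables, the record layout, field names, and function names are hypothetical, and a "denominator file" is assumed here to be a count of patients per combination of demographic band and disease dimensions. The ICD-9 and ICD-10 codes are used only as examples of heterogeneous coding.

```python
from collections import Counter

# Stage 1 (hypothetical table): source-specific code -> shared intermediate concept.
# Many source codes may map to one concept, so the mapping is one-directional.
CODE_TO_CONCEPT = {
    ("icd9", "250.00"): "type2_diabetes",
    ("icd10", "E11.9"): "type2_diabetes",
    ("icd9", "401.9"): "essential_hypertension",
    ("icd10", "I10"): "essential_hypertension",
}

# Stage 2 (hypothetical table): intermediate concept -> conformal dimension value.
CONCEPT_TO_DIMENSION = {
    "type2_diabetes": "diabetes",
    "essential_hypertension": "hypertension",
}


def to_dimension(system, code):
    """Cascade the two lookups; return None when either stage has no entry."""
    concept = CODE_TO_CONCEPT.get((system, code))
    return CONCEPT_TO_DIMENSION.get(concept)


def aggregate(records):
    """Count patients per (age band, sorted tuple of disease dimensions)."""
    patients = {}
    for rec in records:
        dim = to_dimension(rec["system"], rec["code"])
        if dim is None:
            continue  # unmapped codes are skipped in this sketch
        entry = patients.setdefault(
            rec["patient_id"],
            {"age_band": rec["age_band"], "conditions": set()},
        )
        entry["conditions"].add(dim)

    counts = Counter()
    for entry in patients.values():
        key = (entry["age_band"], tuple(sorted(entry["conditions"])))
        counts[key] += 1
    return dict(counts)


# Illustrative records from two differently coded sources (made-up data).
RECORDS = [
    {"patient_id": 1, "system": "icd9", "code": "250.00", "age_band": "40-49"},
    {"patient_id": 1, "system": "icd9", "code": "401.9", "age_band": "40-49"},
    {"patient_id": 2, "system": "icd10", "code": "E11.9", "age_band": "40-49"},
    {"patient_id": 2, "system": "icd10", "code": "I10", "age_band": "40-49"},
    {"patient_id": 3, "system": "icd10", "code": "I10", "age_band": "50-59"},
    {"patient_id": 4, "system": "local", "code": "X99", "age_band": "50-59"},
]

RESULT = aggregate(RECORDS)
```

In this sketch, patients 1 and 2 fall into the same aggregate bucket even though their diagnoses were recorded under different coding systems, which is the effect that translation into a common dimension is meant to achieve.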