A method and an apparatus for generating and maintaining, automatically or interactively and with the aid of a memory, a data processor and a textual sentence file, a lexicon that stores cooccurrence relation information used to determine whether a sequence of words in a given natural-language sentence is semantically correct. A hypothesized cooccurrence relation table, which stores hypothesized cooccurrence relations each having a high probability of being a valid cooccurrence relation, is prepared by consulting the file. A hypothesis for a cooccurrence relation is first established, by consulting the hypothesized cooccurrence relation table, on the basis of a cooccurrence relation pattern indicating a probably acceptable conjunction. Subsequently, a corresponding actual cooccurrence relation is derived from the textual file by parsing the relevant textual sentence, and is tested against predetermined threshold conditions to determine whether the cooccurrence relation is valid. On the basis of the test results, the cooccurrence relation information in the lexicon is modified accordingly. The method and apparatus can be used in a natural language parsing system and a machine translation system.