Monitoring a spoken-word audio stream for a relevant concept is disclosed. A speech recognition engine may recognize a plurality of words from the audio stream. Function words that do not indicate content may be removed from the plurality of words. A concept may be determined from at least one word recognized from the audio stream. The concept may be determined via a morphological normalization of the plurality of words. The concept may be associated with a time related to when the at least one word was spoken. A relevance metric may be computed for the concept. Computing the relevance metric may include assessing the temporal frequency of the concept within the audio stream. The relevance metric for the concept may be based on respective confidence scores of the at least one word. The concept, time, and relevance metric may be displayed in a graphical display.
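The steps summarized above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `RecognizedWord` type, the `FUNCTION_WORDS` stoplist, the suffix-stripping `normalize` helper, the sliding `window`, and the specific formula (occurrences per second multiplied by mean confidence) are all hypothetical choices made for the example. The speech recognition engine is assumed to have already produced words with times and confidence scores, and a plain text printout stands in for the graphical display.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical stoplist of function words; a real system would use a fuller one.
FUNCTION_WORDS = {"a", "an", "the", "of", "to", "in", "and", "is", "are", "for", "on"}


@dataclass
class RecognizedWord:
    text: str          # word as emitted by the (assumed) speech recognizer
    time: float        # seconds from the start of the stream
    confidence: float  # recognizer confidence in [0, 1]


def normalize(word: str) -> str:
    """Crude suffix stripping, a stand-in for morphological normalization."""
    w = word.lower()
    for suffix in ("ing", "es", "s"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def score_concepts(words, now, window=60.0):
    """Return (concept, last_time, relevance) tuples, most relevant first.

    Assumed metric: temporal frequency (occurrences per second within the
    window ending at `now`) multiplied by the mean confidence of those
    occurrences.
    """
    hits = defaultdict(list)
    for w in words:
        if w.text.lower() in FUNCTION_WORDS:
            continue  # function words carry no content
        if now - window <= w.time <= now:
            hits[normalize(w.text)].append(w)

    results = []
    for concept, occurrences in hits.items():
        frequency = len(occurrences) / window
        mean_conf = sum(o.confidence for o in occurrences) / len(occurrences)
        last_time = max(o.time for o in occurrences)
        results.append((concept, last_time, frequency * mean_conf))

    results.sort(key=lambda r: r[2], reverse=True)
    return results


if __name__ == "__main__":
    stream = [
        RecognizedWord("the", 1.0, 0.9),
        RecognizedWord("budget", 2.0, 0.8),
        RecognizedWord("budgets", 10.0, 0.6),
        RecognizedWord("taxes", 12.0, 0.9),
    ]
    # Text stand-in for the graphical display of concept, time, and relevance.
    for concept, t, relevance in score_concepts(stream, now=20.0, window=20.0):
        print(f"{concept:10s} t={t:6.1f}s relevance={relevance:.3f}")
```

In this sketch, "budget" and "budgets" collapse into one concept, so that concept ranks above "tax" even though its individual confidence scores are lower. Other weightings (for example, decaying older occurrences) would fit the same description equally well.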