In one embodiment, documents accessible via a designated public account are classified as public. In another embodiment, documents accessible according to a designated public access control list are classified as public. In some embodiments, all documents not classified as public are classified as private. Content in the public documents is linguistically analyzed, resulting in a set of keys for use in subsequent full and partial content matching. The keys and associated file names are stored in a public-content identification repository. Similarly, content in the private documents is linguistically analyzed, and the results are stored in a private-content identification repository. Subsequently, full and partial content matching is performed on monitored content according to information in the public and private repositories. In a related aspect, monitored content found to correspond to private content is selectively flagged during electronic transmission or optionally prevented from distribution according to a set of defined monitoring policies.
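By way of illustration only, the classification, key generation, repository storage, matching, and policy steps described above may be sketched as follows. The sketch rests on several assumptions that are not part of the description: the linguistic analysis is replaced by a simple stand-in that hashes every run of four consecutive words; keys that also occur in public content are not counted as private; and a hypothetical `block_threshold` stands in for a monitoring policy. All names (`make_keys`, `Repository`, `build_repositories`, `check`) are hypothetical.

```python
import hashlib
import re
from dataclasses import dataclass, field

WINDOW = 4  # words per key; an illustrative choice, not taken from the description


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def make_keys(text, window=WINDOW):
    """Stand-in for linguistic analysis: hash each run of `window` words."""
    words = tokenize(text)
    if not words:
        return set()
    if len(words) < window:
        spans = [words]  # short text yields a single key
    else:
        spans = [words[i:i + window] for i in range(len(words) - window + 1)]
    return {
        hashlib.sha1(" ".join(span).encode("utf-8")).hexdigest()[:16]
        for span in spans
    }


@dataclass
class Repository:
    """Content identification repository: key -> names of source files."""
    index: dict = field(default_factory=dict)

    def add(self, file_name, text):
        for key in make_keys(text):
            self.index.setdefault(key, set()).add(file_name)

    def match(self, keys):
        """Return {file_name: number of keys shared with that file}."""
        hits = {}
        for key in keys:
            for name in self.index.get(key, ()):
                hits[name] = hits.get(name, 0) + 1
        return hits


def build_repositories(documents, public_names):
    """Classify each document; anything not listed as public is private."""
    public, private = Repository(), Repository()
    for name, text in documents.items():
        target = public if name in public_names else private
        target.add(name, text)
    return public, private


def check(monitored_text, public, private, block_threshold=0.5):
    """Return (action, matching private files) for monitored content.

    Assumed policy: block when at least `block_threshold` of the monitored
    keys are private-only, flag on any smaller overlap, otherwise allow.
    """
    keys = make_keys(monitored_text)
    # Assumption: keys also present in public content are not treated as private.
    private_only = {
        k for k in keys if k in private.index and k not in public.index
    }
    if not private_only:
        return "allow", {}
    sources = private.match(private_only)
    ratio = len(private_only) / len(keys)
    action = "block" if ratio >= block_threshold else "flag"
    return action, sources
```

Under these assumptions, a message that reproduces a private document in full shares all of its keys with the private repository and is blocked, a message that quotes only a phrase shares a few keys and is flagged, and a message that matches only public content is allowed. Storing file names alongside the keys lets a match report which private documents were involved.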