Intelligent Content Based Title and Author Name Extraction from Formatted Documents

Authors: Eric Berkowitz, Mohamed Reda Elkhadiri, Tim Sahouri, Michel Abraham

This paper describes the development of algorithms for
extracting the title and the names of the authors from
documents available on the World Wide Web. The algorithms
are designed not to rely on the specific stylistic dictates of
any document formatting standard. Rather, they rely on a
combination of overt and subtle cues that together form a
generalized, common standard for placing this information
in a document so that readers can extract it easily.

LanguageArchives
2004
Omnipress
Conference Paper