?url_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=Learning+algorithms+for+keyphrase+extraction&rft.creator=Turney%2C+Peter&rft.subject=Language&rft.subject=Machine+Learning&rft.subject=Statistical+Models&rft.description=Many+academic+journals+ask+their+authors+to+provide+a+list+of+about+five+to+fifteen+keywords%2C+to+appear+on+the+first+page+of+each+article.+Since+these+key+words+are+often+phrases+of+two+or+more+words%2C+we+prefer+to+call+them+keyphrases.+There+is+a+wide+variety+of+tasks+for+which+keyphrases+are+useful%2C+as+we+discuss+in+this+paper.+We+approach+the+problem+of+automatically+extracting+keyphrases+from+text+as+a+supervised+learning+task.+We+treat+a+document+as+a+set+of+phrases%2C+which+the+learning+algorithm+must+learn+to+classify+as+positive+or+negative+examples+of+keyphrases.+Our+first+set+of+experiments+applies+the+C4.5+decision+tree+induction+algorithm+to+this+learning+task.+We+evaluate+the+performance+of+nine+different+configurations+of+C4.5.+The+second+set+of+experiments+applies+the+GenEx+algorithm+to+the+task.+We+developed+the+GenEx+algorithm+specifically+for+automatically+extracting+keyphrases+from+text.+The+experimental+results+support+the+claim+that+a+custom-designed+algorithm+(GenEx)%2C+incorporating+specialized+procedural+domain+knowledge%2C+can+generate+better+keyphrases+than+a+general-purpose+algorithm+(C4.5).+Subjective+human+evaluation+of+the+keyphrases+generated+by+GenEx+suggests+that+about+80%25+of+the+keyphrases+are+acceptable+to+human+readers.+This+level+of+performance+should+be+satisfactory+for+a+wide+variety+of+applications.&rft.publisher=Kluwer&rft.date=2000&rft.type=Journal+(Paginated)&rft.type=PeerReviewed&rft.format=application%2Fpostscript&rft.identifier=http%3A%2F%2Fcogprints.org%2F1797%2F1%2FIR2000.ps&rft.format=application%2Fpdf&rft.identifier=http%3A%2F%2Fcogprints.org%2F1797%2F5%2FIR2000.pdf&rft.identifier=Turney%2C+Peter+(2000)+Learning+algorithms+for+keyphrase+extraction.+%5BJournal+(Paginated)%5D&rft.relation=http%3A%2F%2Fcogprints.org%2F1797%2F