Authorship attribution has a range of applications in a growing number of fields, such as forensic evidence, plagiarism detection, email filtering, and web information management. In this study, three attribution techniques are extended, tested on a corpus of English texts, and applied to a book in the New Testament of disputed authorship. The word recurrence interval method takes the number of words between successive occurrences of a keyword and compares the standard deviations of these intervals across texts, both graphically and with chi-squared tests. The trigram Markov method compares the probabilities of the occurrence of words conditional on the preceding two words to determine the similarity between texts. The third method extracts stylometric measures, such as the frequency of occurrence of function words, and from these constructs text classification models using multiple discriminant analysis. The effectiveness of these techniques is compared.
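To make the three kinds of measurement concrete, here is a minimal Python sketch of the raw quantities each method starts from. It is an illustration under stated assumptions, not the study's implementation: the function names, the simple regex tokenizer, the use of the population standard deviation, and the unsmoothed trigram estimate are all choices made here for brevity. The chi-squared comparison and the multiple discriminant analysis step are not shown.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def recurrence_interval_std(tokens, keyword):
    """Standard deviation of the gaps between successive keyword occurrences.

    A gap is the number of words strictly between two successive occurrences.
    Returns None if there are fewer than two gaps to measure.
    The population standard deviation is an assumption of this sketch.
    """
    positions = [i for i, tok in enumerate(tokens) if tok == keyword]
    gaps = [b - a - 1 for a, b in zip(positions, positions[1:])]
    if len(gaps) < 2:
        return None
    mean = sum(gaps) / len(gaps)
    return math.sqrt(sum((g - mean) ** 2 for g in gaps) / len(gaps))


def trigram_model(tokens):
    """Count how often each word follows each pair of preceding words."""
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        counts[(w1, w2)][w3] += 1
    return counts


def conditional_probability(model, w1, w2, w3):
    """Unsmoothed estimate of P(w3 | w1, w2); 0.0 for an unseen context."""
    context = model.get((w1, w2))
    if not context:
        return 0.0
    return context[w3] / sum(context.values())


def function_word_frequencies(tokens, function_words):
    """Relative frequency of each listed function word in the token stream."""
    total = len(tokens)
    counts = Counter(tokens)
    return {w: (counts[w] / total if total else 0.0) for w in function_words}
```

For example, `recurrence_interval_std(tokenize(text), "the")` gives one number per text for the keyword "the", and `function_word_frequencies(tokenize(text), ["the", "and", "of"])` gives a small feature vector that a classifier could consume. A practical trigram comparison would also need some form of smoothing for word sequences that never occur in the reference text.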