Title Textual similarity: Comparing texts in order to discover how closely they discuss the same topics
Author Jensen, Andreas Schmidt; Boss, Niklas Skamriis
Supervisor Sharp, Robin (System Security, Department of Informatics and Mathematical Modeling, Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark)
Institution Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark
Thesis level Bachelor thesis
Year 2008
Abstract This thesis describes the design and implementation of a tool for measuring textual similarity. It examines different aspects of text processing and graph searching in an attempt to define similarity, and it proposes and implements a solution for measuring textual similarity. Word-sense disambiguation, part-of-speech tagging, and several graph-searching algorithms are described and used in the measurements. The tool is tested against human evaluations of textual similarity, and it is concluded that the tool is, to some degree, able to measure textual similarity with the same results as a human being.
Series IMM-B.Sc.-2008-15
Original PDF bac08_15.pdf (1.55 MB)
Admin Creation date: 2008-06-30    Update date: 2009-02-17    Source: dtu    ID: 220969    Original MXD