Title Statistical learning in Search Engine Optimization (SEO)
Author Nøhr, Mathias Puggaard
Tureczek, Alexander
Supervisor Brockhoff, Per B. (Mathematical Statistics, Department of Informatics and Mathematical Modeling, Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark)
Institution Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark
Thesis level Master's thesis
Year 2009
Abstract This thesis investigates whether Google's ranking, with its machine learning heritage, can be influenced through the internal parameters of a web page, specifically whether the choice and number of HTML tags used on a single page can influence its ranking. We identified three concepts that are believed to influence the ranking of a web page: prominence, density, and count. To collect the data needed for this project, we developed a web crawler and a parsing system. The crawled data were analysed using regression techniques, variance inflation factors (VIF), support vector machines (SVM), and trees. Some of our models achieved classifications better than random guessing. However, our aim of minimizing false positives while maximizing true positives was not achieved. A web page ranking prediction tool was nevertheless successfully implemented in Python, enabling a user to select a web page for analysis and get feedback on its performance.
Series IMM-M.Sc.-2009-61
Original PDF ep09_61.pdf (3.52 MB)
Admin Creation date: 2009-10-21    Update date: 2010-08-25    Source: dtu    ID: 251384    Original MXD