Dimensionality Reduction and Representation for Nearest Neighbour Learning

 
Type: Thesis (PhD)
Member Organisation: 145 University of Aberdeen

Payne, T. (1999) Dimensionality Reduction and Representation for Nearest Neighbour Learning. PhD thesis, University of Aberdeen.

Full text not available from this archive.

Abstract

An increasing number of intelligent information agents employ Nearest Neighbour learning algorithms to provide personalised assistance to the user. This assistance may be in the form of recognising or locating documents that the user might find relevant or interesting. To achieve this, documents must be mapped into a representation that can be presented to the learning algorithm. Simple heuristic techniques are generally used to identify relevant terms from the documents. These terms are then used to construct large, sparse training vectors. The work presented here investigates an alternative representation based on sets of terms, called set-valued attributes, and proposes a new family of Nearest Neighbour learning algorithms that utilise this set-based representation. The importance of discarding irrelevant terms from the documents is then addressed, and this is generalised to examine the behaviour of the Nearest Neighbour learning algorithm with high-dimensional data sets containing such values. A variety of selection techniques used by other machine learning and information retrieval systems are presented, and empirically evaluated within the context of a Nearest Neighbour framework. The thesis concludes with a discussion of ways in which attribute selection and dimensionality reduction techniques may be used to improve the selection of relevant attributes, and thus increase the reliability and predictive accuracy of the Nearest Neighbour learning algorithm.
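
The abstract does not give the thesis's algorithms, and the full text is not held in this archive, so the following is only an illustrative sketch of what Nearest Neighbour classification over set-valued attributes could look like. It assumes each document is reduced to a set of terms and that Jaccard distance is used to compare sets; the distance measure, the function names (jaccard_distance, knn_predict) and the toy training data are all assumptions made for illustration, not taken from the thesis.

```python
from collections import Counter


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical.

    Assumption: Jaccard distance stands in for whatever set-based
    measure the thesis actually proposes.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def knn_predict(query, training, k=3):
    """Majority vote among the k training documents closest to the query.

    `training` is a list of (term_set, label) pairs.
    """
    ranked = sorted(training, key=lambda ex: jaccard_distance(query, ex[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each document is just its set of terms.
training = [
    ({"agent", "learning", "neighbour"}, "relevant"),
    ({"agent", "retrieval", "document"}, "relevant"),
    ({"football", "score", "league"}, "irrelevant"),
    ({"recipe", "pasta", "sauce"}, "irrelevant"),
]

prediction = knn_predict({"agent", "document", "learning"}, training, k=3)
print(prediction)
```

Storing only the terms that occur in a document avoids building the large, sparse vectors the abstract mentions, since absent terms take up no space. It also shows why irrelevant terms matter: every extra term enlarges the union and pushes otherwise similar documents further apart.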

Deposited by Dr Terry Payne on 19 January 2005


   
