A probabilistic approach for cluster based polyrepresentative information retrieval

5.00
Hdl Handle:
http://hdl.handle.net/10547/584261
Title:
A probabilistic approach for cluster based polyrepresentative information retrieval
Authors:
Abbasi, Muhammad Kamran
Abstract:
Document clustering in information retrieval (IR) is considered an alternative to rank-based retrieval approaches, because of its potential to support user interactions beyond just typing in queries. Similarly, the Principle of Polyrepresentation (multi-evidence: combining multiple cognitively and/or functionally diff erent information need or information object representations for improving an IR system's performance) is an established approach in cognitive IR with plausible applicability in the domain of information seeking and retrieval. The combination of these two approaches can assimilate their respective individual strengths in order to further improve the performance of IR systems. The main goal of this study is to combine cognitive and cluster-based IR approaches for improving the eff ectiveness of (interactive) information retrieval systems. In order to achieve this goal, polyrepresentative information retrieval strategies for cluster browsing and retrieval have been designed, focusing on the evaluation aspect of such strategies. This thesis addresses the challenge of designing and evaluating an Optimum Clustering Framework (OCF) based model, implementing probabilistic document clustering for interactive IR. Thus, polyrepresentative cluster browsing strategies have been devised. With these strategies a simulated user based method has been adopted for evaluating the polyrepresentative cluster browsing and searching strategies. The proposed approaches are evaluated for information need based polyrepresentative clustering as well as document based polyrepresentation and the combination thereof. For document-based polyrepresentation, the notion of citation context is exploited, which has special applications in scientometrics and bibliometrics for science literature modelling. The information need polyrepresentation, on the other hand, utilizes the various aspects of user information need, which is crucial for enhancing the retrieval performance. Besides describing a probabilistic framework for polyrepresentative document clustering, one of the main fi ndings of this work is that the proposed combination of the Principle of Polyrepresentation with document clustering has the potential of enhancing the user interactions with an IR system, provided that the various representations of information need and information objects are utilized. The thesis also explores interactive IR approaches in the context of polyrepresentative interactive information retrieval when it is combined with document clustering methods. Experiments suggest there is a potential in the proposed cluster-based polyrepresentation approach, since statistically signifi cant improvements were found when comparing the approach to a BM25-based baseline in an ideal scenario. Further marginal improvements were observed when cluster-based re-ranking and cluster-ranking based comparisons were made. The performance of the approach depends on the underlying information object and information need representations used, which confi rms fi ndings of previous studies where the Principle of Polyrepresentation was applied in diff erent ways.
Citation:
Abbasi, M.K. (2015) 'A Probabilistic Approach for Cluster Based Polyrepresentative Information Retrieval'. PhD thesis. University of Bedfordshire.
Publisher:
University of Bedfordshire
Issue Date:
Dec-2015
URI:
http://hdl.handle.net/10547/584261
Type:
Thesis or dissertation
Language:
en
Description:
A thesis submitted to the University of Bedfordshire in partial ful lment of the requirements for the degree of Doctor of Philosophy
Appears in Collections:
PhD e-theses

Full metadata record

DC FieldValue Language
dc.contributor.authorAbbasi, Muhammad Kamranen
dc.date.accessioned2015-12-21T09:32:26Zen
dc.date.available2015-12-21T09:32:26Zen
dc.date.issued2015-12en
dc.identifier.citationAbbasi, M.K. (2015) 'A Probabilistic Approach for Cluster Based Polyrepresentative Information Retrieval'. PhD thesis. University of Bedfordshire.en
dc.identifier.urihttp://hdl.handle.net/10547/584261en
dc.descriptionA thesis submitted to the University of Bedfordshire in partial ful lment of the requirements for the degree of Doctor of Philosophyen
dc.description.abstractDocument clustering in information retrieval (IR) is considered an alternative to rank-based retrieval approaches, because of its potential to support user interactions beyond just typing in queries. Similarly, the Principle of Polyrepresentation (multi-evidence: combining multiple cognitively and/or functionally diff erent information need or information object representations for improving an IR system's performance) is an established approach in cognitive IR with plausible applicability in the domain of information seeking and retrieval. The combination of these two approaches can assimilate their respective individual strengths in order to further improve the performance of IR systems. The main goal of this study is to combine cognitive and cluster-based IR approaches for improving the eff ectiveness of (interactive) information retrieval systems. In order to achieve this goal, polyrepresentative information retrieval strategies for cluster browsing and retrieval have been designed, focusing on the evaluation aspect of such strategies. This thesis addresses the challenge of designing and evaluating an Optimum Clustering Framework (OCF) based model, implementing probabilistic document clustering for interactive IR. Thus, polyrepresentative cluster browsing strategies have been devised. With these strategies a simulated user based method has been adopted for evaluating the polyrepresentative cluster browsing and searching strategies. The proposed approaches are evaluated for information need based polyrepresentative clustering as well as document based polyrepresentation and the combination thereof. For document-based polyrepresentation, the notion of citation context is exploited, which has special applications in scientometrics and bibliometrics for science literature modelling. The information need polyrepresentation, on the other hand, utilizes the various aspects of user information need, which is crucial for enhancing the retrieval performance. Besides describing a probabilistic framework for polyrepresentative document clustering, one of the main fi ndings of this work is that the proposed combination of the Principle of Polyrepresentation with document clustering has the potential of enhancing the user interactions with an IR system, provided that the various representations of information need and information objects are utilized. The thesis also explores interactive IR approaches in the context of polyrepresentative interactive information retrieval when it is combined with document clustering methods. Experiments suggest there is a potential in the proposed cluster-based polyrepresentation approach, since statistically signifi cant improvements were found when comparing the approach to a BM25-based baseline in an ideal scenario. Further marginal improvements were observed when cluster-based re-ranking and cluster-ranking based comparisons were made. The performance of the approach depends on the underlying information object and information need representations used, which confi rms fi ndings of previous studies where the Principle of Polyrepresentation was applied in diff erent ways.en
dc.language.isoenen
dc.publisherUniversity of Bedfordshireen
dc.subjectprobalistic approachen
dc.subjectcluster baseden
dc.subjectinformation retrievalen
dc.subjectpolyrepresentativeen
dc.subjectG500 Information Systemsen
dc.titleA probabilistic approach for cluster based polyrepresentative information retrievalen
dc.typeThesis or dissertationen
dc.type.qualificationnamePhDen_GB
dc.type.qualificationlevelPhDen
dc.publisher.institutionUniversity of Bedfordshireen
This item is licensed under a Creative Commons License
Creative Commons
All Items in UOBREP are protected by copyright, with all rights reserved, unless otherwise indicated.