RMIT University

Effective retrieval to support learning

thesis
posted on 2024-11-23, 16:50 authored by Michael Harris
To use digital resources to support learning, we need to be able to retrieve them. This thesis introduces a new area of research within information retrieval: the retrieval of educational resources from the Web.

Successful retrieval of educational resources requires an understanding of how the resources being searched are managed, how searchers interact with those resources and the systems that manage them, and the needs of the people searching. As such, we began by investigating how resources are managed and reused in a higher education setting. This investigation involved running four focus groups with 23 participants, 26 interviews and a survey.

The second part of this work is motivated by one of our initial findings: when people look for educational resources, they prefer to search the World Wide Web using a public search engine. This finding suggests that users searching for educational resources may be more satisfied with search engine results if only those resources likely to support learning are presented. To provide satisfactory result sets, resources that are unlikely to support learning should not be present. A filter to detect material that is likely to support learning would therefore be useful.

Information retrieval systems are often evaluated using the Cranfield method, which compares system performance with a ground truth provided by human judgments. Building on this method, we propose a way of evaluating systems that filter educational resources. By demonstrating that judges can agree on which resources are educational, we establish that a single human judge for each resource provides a sufficient ground truth.

Machine learning techniques are commonly used to classify resources. We investigate how machine learning can be used to classify resources retrieved from the Web as likely or unlikely to support learning. We found that reasonable classification performance can be achieved using text extracted from resources in conjunction with Naïve Bayes, AdaBoost, and Random Forest classifiers. We also found that attributes developed from structural elements (hyperlinks and headings found in a resource) did not substantially improve classification over simply using the text.
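As a rough illustration of the text-based classification described in the abstract, the following is a minimal sketch of a multinomial Naïve Bayes classifier in plain Python. It is not the thesis's implementation: the abstract does not state which features, tokenisation, or toolkit were used, so the whitespace tokeniser, the Laplace smoothing, the class name `NaiveBayesTextClassifier`, the labels `educational` and `other`, and the six toy training texts are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude, assumed tokeniser)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes over word counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing constant (assumed value)
        self.class_doc_counts = Counter()       # documents seen per label
        self.word_counts = defaultdict(Counter) # per-label word frequencies
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_doc_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocabulary.add(token)
        return self

    def log_scores(self, text):
        """Return the unnormalised log posterior for each label."""
        total_docs = sum(self.class_doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_doc_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy, invented training data; real experiments would use text extracted from Web pages.
train_texts = [
    "lecture notes introduction to sorting algorithms with exercises",
    "tutorial worked examples and quiz on linear algebra",
    "course syllabus learning outcomes and assessment tasks",
    "buy cheap laptops online free shipping today",
    "celebrity gossip and latest movie trailers",
    "hotel deals book flights discount holiday packages",
]
train_labels = ["educational"] * 3 + ["other"] * 3

classifier = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction_a = classifier.predict("lecture exercises and quiz on algorithms")
prediction_b = classifier.predict("cheap flights and hotel deals")
```

A filter of the kind the abstract motivates could then keep only those search results whose predicted label is `educational`. AdaBoost and Random Forest, the other classifiers named in the abstract, are ensemble methods and are not sketched here.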

History

Degree Type

Doctorate by Research

Imprint Date

2010-01-01

School name

School of Science, RMIT University

Former Identifier

9921861605601341

Open access

  • Yes
