RMIT University

Cache-conscious sorting of large sets of strings with dynamic tries

journal contribution
posted on 2024-10-31, 23:58 authored by Ranjan Sinha, Justin Zobel
Ongoing changes in computer architecture are affecting the efficiency of string-sorting algorithms. The size of main memory in typical computers continues to grow but memory accesses require increasing numbers of instruction cycles, which is a problem for the most efficient of the existing string-sorting algorithms as they do not utilize cache well for large data sets. We propose a new sorting algorithm for strings, burstsort, based on dynamic construction of a compact trie in which strings are kept in buckets. It is simple, fast, and efficient. We experimentally explore key implementation options and compare burstsort to existing string-sorting algorithms on large and small sets of strings with a range of characteristics. These experiments show that, for large sets of strings, burstsort is almost twice as fast as any previous algorithm, primarily due to a lower rate of cache misses.
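
The idea in the abstract, a trie built dynamically with strings held in buckets at its leaves, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `TrieNode`, `burstsort`, `_insert`, `_burst`, and `BURST_THRESHOLD` are hypothetical, the threshold value is arbitrary, and Python's built-in `list.sort` stands in for whatever bucket-sorting routine a real implementation would choose. The sketch shows only the structure of the algorithm and says nothing about cache behaviour.

```python
BURST_THRESHOLD = 4  # assumed, illustrative value; not taken from the paper


class TrieNode:
    """Internal trie node; its children are either TrieNodes or buckets (lists)."""

    def __init__(self):
        self.exhausted = []  # strings that end exactly at this node
        self.children = {}   # next character -> TrieNode or bucket (list of str)


def _burst(bucket, depth):
    """Replace a full bucket with a new node, redistributing on character `depth`."""
    node = TrieNode()
    for s in bucket:
        if depth == len(s):
            node.exhausted.append(s)
        else:
            node.children.setdefault(s[depth], []).append(s)
    return node


def _insert(root, s, threshold):
    node, depth = root, 0
    while True:
        if depth == len(s):
            node.exhausted.append(s)
            return
        c = s[depth]
        child = node.children.get(c)
        if child is None:
            node.children[c] = [s]  # start a new bucket
            return
        if isinstance(child, TrieNode):
            node, depth = child, depth + 1  # descend one character
            continue
        if len(child) >= threshold:
            # Bucket is full: burst it, then retry so the loop descends into it.
            node.children[c] = _burst(child, depth + 1)
            continue
        child.append(s)
        return


def burstsort(strings, threshold=BURST_THRESHOLD):
    root = TrieNode()
    for s in strings:
        _insert(root, s, threshold)

    # In-order traversal with an explicit stack: strings ending at a node come
    # first, then each child in character order; buckets are sorted on output.
    out, stack = [], [root]
    while stack:
        item = stack.pop()
        if isinstance(item, TrieNode):
            out.extend(item.exhausted)
            for c in sorted(item.children, reverse=True):
                stack.append(item.children[c])
        else:
            item.sort()  # stand-in for a dedicated small-bucket string sort
            out.extend(item)
    return out
```

Calling `burstsort(words)` on a list of strings returns them in the same order as `sorted(words)`. Because each bucket holds only strings that share a prefix and is capped near the threshold, the final sort touches small, localized groups of strings, which is the property the abstract credits for the lower cache-miss rate.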

Journal

Journal of Experimental Algorithmics

Volume

9

Start page

1

End page

31

Total pages

31

Publisher

Association for Computing Machinery

Place published

USA

Language

English

Copyright

© 2004 ACM

Former Identifier

2004001723

Esploro creation date

2020-06-22

Fedora creation date

2009-02-27
