RMIT University

An Efficient and Dynamic Semantic-Aware Multikeyword Ranked Search Scheme Over Encrypted Cloud Data

journal contribution
posted on 2024-11-02, 10:38 authored by Xuelong Dai, Hua Dai, Geng Yang, Xun Yi, Haiping Huang
Traditional searchable encryption schemes that adopt the bag-of-words model require massive space to store the index of the document set, because the dimension of each document vector equals the size of the dictionary. The bag-of-words model also ignores the semantic information between keywords and documents, so it can return irrelevant search results to users. Doc2Vec, a neural-network-based natural language processing model, uses the context information of words and paragraphs to extract document features. These features contain latent semantic information and can measure the similarity between documents. In this paper, we adopt the Doc2Vec model to achieve a semantic-aware multikeyword ranked search scheme. The Doc2Vec model uses distributed representations of words and documents with vectors of modest dimensionality while being trained on a dataset of a few hundred million words. The distributed representations of documents are extracted by the Doc2Vec model as document feature vectors and used as the search index. The features of the queried keywords are likewise extracted as the query feature vector, and a secure inner product operation over the query feature vector and the index achieves privacy-preserving semantic search. Our scheme supports dynamic updates to the document set with the Doc2Vec model. Experiments on a real-world dataset show that the fixed-length feature vectors improve the time and space efficiency of semantic-aware search.
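
The abstract does not say which secure inner product construction the scheme uses. As an illustration only, the sketch below assumes the secure kNN-style approach common in ranked searchable encryption: a secret bit vector splits each feature vector into two shares, and two secret invertible matrices hide the shares. All names (`keygen`, `encrypt_index`, `trapdoor`, `score`) are hypothetical, and the small integer vectors stand in for Doc2Vec feature vectors. Exact fractions are used so the recovered score equals the plaintext inner product without rounding error.

```python
import random
from fractions import Fraction


def invert(m):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(m)
    # Augment each row with the matching row of the identity matrix.
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def random_invertible(n, rng):
    """Draw random integer matrices until one is invertible."""
    while True:
        m = [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]
        try:
            return m, invert(m)
        except ValueError:
            continue


def mat_vec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def keygen(n, seed=0):
    """Secret key: split indicator s and two invertible matrices (assumed design)."""
    rng = random.Random(seed)
    s = [rng.randint(0, 1) for _ in range(n)]
    m1, m1_inv = random_invertible(n, rng)
    m2, m2_inv = random_invertible(n, rng)
    return {"s": s, "m1": m1, "m2": m2,
            "m1_inv": m1_inv, "m2_inv": m2_inv, "rng": rng}


def split(vec, s, split_when, rng):
    """Split each entry into two random shares where s equals split_when, else copy it."""
    a, b = [], []
    for x, bit in zip(vec, s):
        x = Fraction(x)
        if bit == split_when:
            r = Fraction(rng.randint(-100, 100), 7)
            a.append(r)
            b.append(x - r)
        else:
            a.append(x)
            b.append(x)
    return a, b


def encrypt_index(p, key):
    """Encrypted index for a document feature vector p."""
    a, b = split(p, key["s"], 1, key["rng"])
    return mat_vec(transpose(key["m1"]), a), mat_vec(transpose(key["m2"]), b)


def trapdoor(q, key):
    """Encrypted query for a query feature vector q (split on the opposite bit)."""
    a, b = split(q, key["s"], 0, key["rng"])
    return mat_vec(key["m1_inv"], a), mat_vec(key["m2_inv"], b)


def score(index, td):
    """Computed by the server; equals the plaintext inner product of p and q."""
    return dot(index[0], td[0]) + dot(index[1], td[1])


key = keygen(4, seed=1)
doc_vec, query_vec = [3, -1, 4, 2], [1, 0, 2, -5]
assert score(encrypt_index(doc_vec, key), trapdoor(query_vec, key)) == dot(doc_vec, query_vec)
```

Because the server sees only the transformed shares, it can rank documents by `score` without learning the feature vectors themselves. The vector length is fixed by the embedding dimension rather than the dictionary size, which is the source of the space saving the abstract describes.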

Related Materials

  1. DOI - Is published in 10.1109/ACCESS.2019.2944476
  2. ISSN - Is published in 21693536

Journal

IEEE Access

Start page

142855

End page

142865

Total pages

11

Publisher

IEEE

Place published

United States

Language

English

Copyright

© 2019 IEEE

Former Identifier

2006095912

Esploro creation date

2020-06-22

Fedora creation date

2019-12-18
