Query relaxation across heterogeneous data sources
conference contribution
posted on 2024-10-31, 18:49authored byVerena Kantere, George Orfanoudakis, Anastasios Kementsietsidis, Timoleon Sellis
The fundamental assumption for query rewriting in heterogeneous environments is that the mappings used for the rewriting are complete, i.e., every relation and attribute mentioned in the query is associated, through mappings, to relations and attributes in the schema of the source that the query is rewritten. In reality, it is rarely the case that such complete sets of mappings exist between sources, and the presence of partial mappings is the norm rather than the exception. So, practically, existing query answering algorithms fail to generate any rewriting in the majority of cases. The question is then whether we can somehow relax queries that cannot be rewritten as such (due to insufficient mappings), and whether we can identify the interesting query relaxations, given the mappings at hand. In this paper, we propose a technique to compute query relaxations of an input query that can be rewritten and evaluated in an environment of collaborating autonomous and heterogeneous data sources. We extend traditional techniques for query rewriting, and we propose both an exhaustive and an optimized heuristic algorithm to compute and evaluate these relaxations. Our technique works with input of any query similarity measure. The experimental study proves the effectiveness and efficiency of our technique.
History
Start page
473
End page
482
Total pages
10
Outlet
Proceedings of the 24th ACM International Conference on Conference on Information and Knowledge Management (CIKM 2015)