posted on 2025-06-30, 04:01 authored by Nhat Quang Cao
<p dir="ltr">Verifiable Data Structures (VDSs), such as Merkle trees, Verkle trees, and commitment schemes, are widely used in industry, for example in Bitcoin, Ethereum, Google's Certificate Transparency, and Amazon DynamoDB, to make data integrity efficiently verifiable. These structures allow clients to verify that specific items in a database stored on untrusted servers remain unaltered: the data is committed and made publicly available, so the integrity of each data item can be confirmed efficiently. VDSs thereby transform a system that requires unconditional trust in server honesty into a trustless system in which clients do not have to trust the servers. </p><p dir="ltr">However, privacy is a significant concern for VDSs. For example, with Certificate Transparency (CT) logs, clients inadvertently reveal their browsing behaviours to log servers once they verify certificate membership. In blockchains, once a client verifies their transaction's membership, a full node, which stores all blocks in the system, can identify the client's transaction. </p><p dir="ltr">This thesis proposes several novel solutions that protect clients' privacy and data integrity while keeping data retrieval efficient. TreePIR and qTreePIR enable light clients to securely retrieve membership proofs along any root-to-leaf path in q-ary trees, such as Merkle and Verkle trees. These mechanisms surpass the current leading Probabilistic Batch Codes (PBC) in all metrics, offering significantly lower total storage, reduced communication costs, and faster server computation and client query generation times. The TreePIR approach is designed specifically for the private retrieval of nodes along an arbitrary root-to-leaf path in a Merkle tree and achieves zero storage overhead for tree-shaped databases. The qTreePIR approach generalises this design to the private retrieval of nodes along an arbitrary root-to-leaf path in a q-ary tree. 
</p><p dir="ltr">Furthermore, the thesis introduces a Committed Private Information Retrieval (CPIR) scheme, a generic construction that combines a Linear Map Commitment (LMC) with an arbitrary linear Private Information Retrieval (PIR) scheme to create a k-verifiable PIR scheme. This scheme ensures the client will not be deceived into accepting incorrect data, even if all servers collude.</p>
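<p dir="ltr">As a rough illustration of the membership proofs discussed above, the following sketch builds a binary Merkle tree and checks one leaf against the root. It is a plain, non-private proof and is not the TreePIR or qTreePIR construction. The SHA-256 hash, the power-of-two leaf count, and the helper names (<code>build_levels</code>, <code>membership_proof</code>, <code>verify</code>) are assumptions made for this example only.</p>

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every tree level, leaf hashes first and the root level last.

    Assumes len(leaves) is a power of two to keep the sketch short.
    """
    level = [h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its left child followed by its right child.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def membership_proof(levels, index):
    """Collect the sibling hash at each level on the leaf-to-root path."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling shares the same parent
        index //= 2
    return proof


def verify(root, leaf, index, proof):
    """Recompute the root from the leaf and its siblings, then compare."""
    node = h(leaf)
    for sibling in proof:
        if index % 2 == 0:
            node = h(node + sibling)  # current node is a left child
        else:
            node = h(sibling + node)  # current node is a right child
        index //= 2
    return node == root


# Hypothetical data: eight transactions stored by an untrusted server.
leaves = [f"tx{i}".encode() for i in range(8)]
levels = build_levels(leaves)
root = levels[-1][0]          # the public commitment
proof = membership_proof(levels, 5)
ok = verify(root, leaves[5], 5, proof)
forged = verify(root, b"tx-forged", 5, proof)
```

<p dir="ltr">In this sketch the client must tell the server which index it wants a proof for, which is the kind of leakage described above; the private retrieval schemes in the thesis are aimed at hiding that index.</p>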