Hashed Trie
Contents
Hashed Trie
I think this is novel, but if I am wrong, please let me know.
Definition
A Hashed Trie is a structure that is a Trie where each node has a hash value based on its contents. The tree state is saved as a hash, which simply points to the root node for that state. The children of each node are referenced by their hash. No node is every overwritten, if it is change the hash changes and it is saved under the new hash. This makes it a space efficient Copy-on-Write (CoW) structure.
Each operation is given the root hash as a parameter. Any modify operation returns a new root hash for the new tree.
The root hash represents the contents of the tree. Since there are fixed rules for the structure of the tree, the same keys and values will always result in the same root hash.
Advantages
- Space efficient. Even though we end up storing many variations of the same tree, we only duplicate the nodes that are different.
- Previous versions of the tree are readable as long as you have the previous root hash.
- Any mutation can be done from any previous root node. No need to ever back out any changes, just use a previous root hash.
- No need to lock for writes. Reads and writes can happen simultaneously.
Disadvantages
- Any write will involve rewriting all the nodes from the root on down to the leaf node in question log(n) operations. Can be helped by batching updates.
- Reads involve reading all the nodes along the path, log(n) operations (likely helped greatly with cache).
Why it is awesome for blockchains
- The root hash can be shared in consensus to insure that nodes have the same data in the tree. For example, in Snowblossom the UXTO root hash is from the UTXO hashed trie.
- Then the shared root hash and the intermediate nodes can be used to prove the existence and completeness of any data in the tree to light clients or header only nodes.