This page describes the technical design of snowblossom and how it is implemented.
The core of this is the snowblossom.proto file which lays it all out.
Transactions consist of a transaction inner part and a list of signatures.
The inner part contains all the inputs, outputs, address claims, fee. Basically, everything about the transaction. Protobuf gives us a fast and terse binary serialization of messages. However it does not guarantee that a message will serialize the same way, so we keep the transaction inner as bytes rather than sending it over the network as a proto. That way, those bytes can be hashed to get us our consistent transaction id. The signatures are outside of that transaction inner so don't effect its hashing, which is good since the signatures are signing the hash of the transaction inner.
The transaction inputs and outputs are what you would expect. The address claims are a bit different.
All addresses are 20-bytes which is the hash of an AddressSpec, called an AddressSpecHash. An address spec is basically a mutlisig definition. It defines a number of required signers and a number of public keys (along with their signing algorithms). A simple one key address then is a 1of1 multisig with a single public key.
Anyways, as above, we can't just serialize the AddressSpec and expect that to always give the same bytes. Also, we want to be able to recreate the address spec from just the public keys so the address spec has a defined hashing method which is described in the comments of snowblossom.proto.
Getting back to transaction, a transaction output is spendable by a specific address spec hash. So to spend that output a transaction has to 'claim' it by including the AddressSpec that hashes to that AddressSpecHash. If there are multiple inputs that spend from the same AddressSpecHash, a single claim covers them all. Then the transaction signatures will have to have sufficient signatures to satisfy the terms of the claim address specs.
The public keys used in the AddressSpec are x.509 encoded public keys with the exception of SIG_TYPE_ECDSA_COMPRESSED, which is the byte 03 followed by 32-bytes of the EC point X (just like bitcoin). This keeps the standard form small, just like bitcoin but also allows for other signature algorithms.
Internally addresses are 160-bit (20-byte) AddressSpecHashes. For ease of use and error detection, they are translated into duck32 form for users. Duck32 is very similar to bech32. It uses the same base 32 character encoding as bech32 with 5 bytes of checksum. It also uses the human readable part of the address as a hash component for the checksum. That way a testnet address would fail checksum if attempting to use it in mainnet.
Can be used without the label on the front:
However, it is encouraged to have the label as it makes things a bit clearer.
For details see Duck32.java
Snowblossom calculates a new difficulty not only with each block, but as time goes on from the last block the difficulty adjusts. This means it should quickly and smoothly adjust for changes in mining power.
Internally, difficulty is expressed as a 8-byte target, which the first 8 bytes of an acceptable block hash must be less than. For user view, difficulty is expressed as a floating point number. 2^diff is the expected number of hashes required to mine a block. So difficulty 25.0 is twice as hard as 24.0.
Snowblossom uses a Proof-of-Work (PoW) function that uses a large deterministically generated file as part of the calculation. This means that mining will be IO limited.
This is done by having the miners reference large deterministic Snow Fields (more on them below) that have a known merkle root hash.
Roughly the algorithm is as follows:
- Hash normal block header things (prev block hash, transaction merkle root, utxo merkle root, timestamp), call that 'context'
- Repeat 6 times:
- Use the current 'context hash' to pick index into snow field, read that 16 byte chunk
- Combine the data from the snow field and current context to make new context
- Final context hash is block hash (checked against target).
The miner also includes a merkle proof that the read 16-byte chunk is actually part of the snow field. Verification is fast and does not require the snow fields since the merkle root of each field is known.
A key to the proof of work is to have large deterministic data files to reference. They need to be large so that it is hard to have them on-chip. They need to be deterministic so that nodes and clients that don't have them can validate the PoW by checking the merkle hash proofs.
Generating these files is "fun". They need to be deterministic but also, we need to avoid a situation where someone can keep checkpoints of state to allow them to regenerate arbitrary sections of the files in memory without the whole file.
To do this, the basic algorithm is to write the files as a quick RNG pass with a known seed and then do several passes of using the RNG to select an offset, reading the contents, adding that entropy to the RNG and re-writing that offset. We also keep some large in memory pools of entropy to make it impossible save space by checkpointing any part of the process.
This can be a bit slow and takes a lot of IOPS. So miners have the option of either generating them or downloading the torrents: Snow Fields
As hardware and techniques improve we want to increase the size of the snow fields as well as increasing the difficulty.
This file defines the progression and sizes of the snow fields: NetworkParamsProd.java
So it starts with a 1gb field at difficulty 25 and with each difficulty increase of 2 (indicating a 4x hash power increase) the snow field doubles. As we approach the end of the table, we will generate and validate additional fields long before the network gets there.
The snow required snow field changes when the trailing average difficulty exceeds the defined limit. This change is called a snow storm.
Miners may also mine at a larger field if they choose to. For example, if a miner does not have space for both the 32gb and 64gb field, they can switch to the 64gb ahead of time and mine with that before it is required. The protocol just defines a minimum acceptable field, not a max.
The HashedTrie allows for an unlimited number of UTXO trees to coexist in a single database. Each node's key in the database is the hash of its contents (including hashes of children nodes). This way, if two trees refer to the same node, that means those subtrees are identical. This gives us both a fail-safe (termination in the middle of an update is fine), idempotent (changes will necessarily be stored under different keys as they hash to different values).
This allows the UTXO database to be fast and allow for chain re-orgs in a simple and clean way.
By buffering updates to do in a batch a new block may be processed to determine if its UTXO hash does indeed match the hash produces when including those transactions before committing it to the database. This allow allows calculating the UTXO root hash to include with block templates for miners.