Centralization of Bitcoin

One of the defining features of bitcoin (as well as many other cryptocurrencies) is its decentralized network of miners. However, with the rise of mining pools, bitcoin mining has become increasingly centralized.

During the course of another project, I wrote Rust code for parsing the entire ~500GB bitcoin blockchain and for extracting data on mining centralization.

The result was this plot showing the number of unique miners in every 1000 consecutive blocks on the blockchain:

Number of unique miners for every 1000 consecutive blocks on the bitcoin blockchain. More unique miners represent more decentralization, while fewer unique miners represent more centralization.

(The miner of each block was identified by the address that received the most bitcoin from the block's coinbase transaction.)
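
As a rough illustration of the counting described above (a sketch, not the code from the repository), the two steps can be written in a few lines of Rust. The function names, the `(address, satoshis)` tuple representation, and the choice of non-overlapping windows are all assumptions made for this example; the plot may well have been produced with a different windowing.

```rust
use std::collections::{HashMap, HashSet};

/// Label a block's miner as the address that received the most bitcoin
/// from its coinbase transaction. `outputs` holds hypothetical
/// (address, value in satoshis) pairs; payouts to the same address are summed.
/// Returns None when there are no outputs.
fn block_miner(outputs: &[(String, u64)]) -> Option<String> {
    let mut totals: HashMap<&str, u64> = HashMap::new();
    for (address, sats) in outputs {
        *totals.entry(address.as_str()).or_insert(0) += *sats;
    }
    totals
        .into_iter()
        // Ties on the total are broken by address so the result is deterministic.
        .max_by_key(|&(address, total)| (total, address))
        .map(|(address, _)| address.to_string())
}

/// Count distinct miners in each non-overlapping window of `window` blocks.
/// `miners` is the per-block miner label in chain order. A trailing partial
/// window is dropped so it cannot show up as an artificially low count.
/// Panics if `window` is 0.
fn unique_miners_per_window(miners: &[String], window: usize) -> Vec<usize> {
    miners
        .chunks_exact(window)
        .map(|chunk| chunk.iter().collect::<HashSet<_>>().len())
        .collect()
}
```

Calling `unique_miners_per_window(&miners, 1000)` on the full list of per-block labels would give one count per 1000 blocks, which is the quantity on the vertical axis of the plot.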

Around 2012 to 2013, the number of unique miners plummets from 1000 per 1000 blocks to around 100 per 1000 blocks. I don't know enough about the history of bitcoin to definitively identify the cause of this shift toward centralization (contact me if you know!). My best guess is that around this time mining hardware improved, or the price of bitcoin rose, so that mining became profitable. That would have let bigger players justify investment and let miners reinvest their profits into more mining infrastructure, creating a reinforcing feedback loop.

Looking at this data on a log-scaled plot shows that while the rapid centralization of the early 2010s is over, gradual centralization has continued into the present:

Number of unique miners for every 1000 consecutive blocks on the bitcoin blockchain, on a log scale. More unique miners represent more decentralization, while fewer unique miners represent more centralization.

As of the time of writing (September 2022), the last 1000 blocks on the blockchain were mined by only 16 miners, a typical value these days.

Code

As mentioned above, I wrote a Rust package for parsing bitcoin's serialized blk files for a more ambitious project that didn't really work out. This is harder than it sounds: there are numerous seemingly good resources (including the official bitcoin wiki), but all of them are outdated and wrong. There have been several subtle changes to the blk format, including some that I only figured out by reading the C++ implementation of Bitcoin Core. One of my favorite absurdities is that you often need to refer to the txindex LevelDB key-value store, but the keys have been obfuscated to avoid problems with anti-virus software, something that I only understood by reading code and GitHub issues! The repository for my Rust code can be found here.
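
To give a sense of what parsing a blk file involves at the outermost level, here is a minimal sketch of the commonly documented record framing only: a 4-byte network magic, a 4-byte little-endian length, then the serialized block. It is not the code from the repository and deliberately ignores the subtler format changes mentioned above. The function name is hypothetical, and treating an all-zero magic as trailing padding is an assumption of this sketch.

```rust
/// Mainnet network magic that precedes each block record in a blk file.
const MAINNET_MAGIC: [u8; 4] = [0xf9, 0xbe, 0xb4, 0xd9];

/// Split the raw bytes of one blk file into serialized block payloads.
/// Each record is assumed to be: magic (4 bytes), size (u32, little-endian),
/// then `size` bytes of block data. Blocks are returned in file order,
/// which is not necessarily chain order.
fn split_blk_records(data: &[u8]) -> Result<Vec<&[u8]>, String> {
    let mut records = Vec::new();
    let mut pos = 0usize;
    while pos + 8 <= data.len() {
        let magic = [data[pos], data[pos + 1], data[pos + 2], data[pos + 3]];
        // Assumption: an all-zero magic marks unused padding at the end of the file.
        if magic == [0u8; 4] {
            break;
        }
        if magic != MAINNET_MAGIC {
            return Err(format!("unexpected magic bytes at offset {}", pos));
        }
        let size = u32::from_le_bytes([
            data[pos + 4],
            data[pos + 5],
            data[pos + 6],
            data[pos + 7],
        ]) as usize;
        let start = pos + 8;
        // Reject records whose declared size runs past the end of the buffer.
        let end = start
            .checked_add(size)
            .filter(|&e| e <= data.len())
            .ok_or_else(|| format!("truncated record at offset {}", pos))?;
        records.push(&data[start..end]);
        pos = end;
    }
    Ok(records)
}
```

Each returned slice would still need to be decoded into a header and transactions, which is where most of the real difficulty lies.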

The data extraction for the plots above was done with the bitcoin-explorer crate because it supports reading from the txindex LevelDB. The repository can be found here.