Bitcoin Network Shaken by Blockchain Fork

image

Yesterday, the Bitcoin network experienced one of the most serious hiccups that we have seen in the past four years. Starting from block 225430, the blockchain literally split into two, with one half of the network adding blocks to one version of the chain, and the other half adding to the other. For the next six hours, there were effectively two Bitcoin networks operating at the same time, each with its own version of the transaction history. The split lasted for 24 blocks or 6 hours, finally resolving itself when one version of the chain conclusively pulled ahead of the other at block 225454, leaving the other chain largely abandoned, with only a small number of miners that are incapable of recognizing what has now become the main chain still mining it, while the bulk of the network quickly returned to normal.

The fork was first noticed at about 23:30 GMT on Monday, March 11, when “thermoman” on the bitcoin-dev IRC channel mentioned that “some client told my client it (the other host) had 225431 blocks, but blockexplorer says that currently the block count is at 225430″. Some other blockchain resources also showed 225430 blocks. Over the course of the next thirty minutes, other users started reporting more strange reports from Bitcoin client logs. Bitcoin developer Peter Wuille (“sipa” on IRC) claimed that he was on block 225435, and then soon 225439, while other sources were still reporting 225431. At 00:00 GMT March 12, sipa posted “I wonder if there’s something that triggered it on the network, a large reorganization or so”. It turned out that a blockchain reorganization, an event that happens when a client discovers a new blockchain longer (and therefore more likely to be valid) than the one it was working with before, and switches to it, was indeed what happened, and over the next few minutes everyone realized what was going on: a blockchain fork.

What had happened was the following. The latest version 0.8 release of bitcoind, by far the most popular implementation of Bitcoin used by miners, switched the database that it used to store blocks and transactions from BerkeleyDB to the more efficient LevelDB as part of an effort to reduce blockchain synchronization time. However, what the developers did not realize at the time was that by doing so they also accidentally introduced a change to the rules of the Bitcoin protocol. In order to make an update to the database, the database process must make a “lock” on the part of the database which stores that particular item of information, a mechanism implemented to prevent two changes from occurring simultaneously and accidentally corrupting the database. In a b-tree, the data structure used by BerkeleyDB to store objects, two locks are required per update. However, BerkeleyDB requires its users to set a limit to the number of locks that can be made at the same time; “If the values are too small,” the FAQ page warns, “requests for locks in an application will fail. If the values are too large, the locking subsystem will consume more resources than is necessary.” In the case of Bitcoin, the limit was 10,000. What happened in block 225430 was that a single block simultaneously affected the status of over 5,000 transactions, requiring more than 10,000 locks on the b-tree to be made at the same time. As a result, the BerkeleyDB failed, and so the older bitcoind 0.7 (and earlier versions) could not read the block. In the case of bitcoind 0.8, LevelDB has no such restrictions, so it could accept such blocks just fine. Because the Bitcoin protocol builds up the transaction history, used primarily to calculate and agree on everyone’s account balances, by creating new blocks representing roughly ten-minute time intervals’ worth of transactions on top of existing valid blocks in a chain (hence, “blockchain”), miners using bitcoind 0.8 started building up a version of the blockchain that included the offending block, while miners using bitcoind 0.7 rejected it and started working on a another blockchain of their own. Ordinary users using BitcoinQt 0.7 or platforms that rely on bitcoind 0.7 as a server saw the 0.7 fork, and everyone else saw the 0.8 fork.

With the fork in progress, the Bitcoin developers had a choice: do they support the 0.8 fork or the 0.7 fork? Ultimately, there could only be one; a monetary system cannot function if there are two different databases of how much money each person has. The 0.8 fork had much more computing power behind it, and was already eight blocks ahead by the time the community could muster any effort toward fixing the problem, and upgrading to 0.8 is something that will have to be done eventually. On the other hand, if the 0.8 fork took over, thousands of users on 0.7 would be forced to upgrade in order to use Bitcoin at all, something which would not happen if the 0.7 fork took over since both versions of bitcoind can read it. The developers quickly settled on 0.7, and the community set to work on the next task: notifying major miners and mining pool operators of what they need to do.

Over the next few hours, nearly every major Bitcoin developer and mining pool operator joined the bitcoin-dev IRC channel. Major mining pools that were using bitcoind 0.8 shut down, downgraded to 0.7, and switched back on. Merchants were also notified; most large businesses, including BitcoinStore, BitPay, SatoshiDice and MtGox, shut down deposits to protect themselves from double spend attacks. BitPay quickly turned themselves back on once their servers were on the 0.7 fork; “safe mode alerted us there’s a problem,” BitPay’s Tony Gallippi writes. “That’s when Steve jumped on IRC to see where the consensus was going, and we were back in business very quickly.” Progress on switching hash power to 0.8 appeared to be slow at first, and at block 225451, the 0.8 chain was 13 blocks longer than 0.7. But that was the furthest that the 0.8 chain would get ahead. By then, the two chains were growing roughly in lockstep, and at about 03:30 the tipping point came. The 0.7 chain quickly caught up to being only 10 blocks behind, then 8 blocks, and at 06:19 both chains converged to the same length at block 225454, leading to nearly all remaining miners abandoning the other.

This incident will go down in history as one of the closest moments that we have come to the underlying Bitcoin protocol actually failing. But it is not the most serious breach ever made. In August 2010, a transaction in block 74638 contained two outputs summing to over 184 billion – just over 2^64 satoshis. The result was an integer overflow bug, the digital equivalent of a mechanical odometer wrapping around to zero after the car drives 999,999 kilometers.

The overflow caused the software to think that the transaction contained only a small amount of BTC while in reality the outputs together had thousands of times more than the 21 million that should ever exist. A new version of the Bitcoin software had to be published, the blockchain was forked, and a new, valid, chain overtook the old one at block 74691 – 53 blocks after the original fork. This time, it only took 24 blocks, and it was not even a life-critical threat to the system – if the developers had done nothing, then Bitcoin would have carried on nonetheless, only causing inconvenience to those bitcoind and BitcoinQt users who were on 0.7 and would have had to upgrade.

The economic damage was significant, but fairly small; the only monetary losses that have been reported are the $26,000 USD worth of mining block rewards from the 24 mined blocks of 25 BTC that are now forever lost in the now abandoned chain, as well as a $10,000 double spend against OKPay. Aside from the lost mining revenue and this double spend, transactions were not affected and no bitcoins were “lost”; any transaction that was included in the now abandoned chain was included in the new chain as well, so any bitcoins that were spent during the fork are now at their proper destinations.

In a way, this was the best possible time for such a thing to happen. The Bitcoin price was on a steady uptrend, and so the 24% drop in price that occurred at the time of the incident was quickly reversed, and as of the time of this writing Bitcoin stands at $44-$46, down from $48 the day earlier but up from $36 one week before. Public media attention on Bitcoin is very much positive, and rather than attacking Bitcoin as they would have in 2011 many journalists actually praised the Bitcoin development team on their rapid response. Ars Technica’s Timothy B Lee wrote a neutral piece on the event, writing that “the incident will be an important test of the cryptocurrency’s decentralized governance structure”, and an article at ecurrency.ec on the subject was entitled “Bitcoin software bug has been rapidly resolved”.

However, the incident opens up serious questions about the nature of the Bitcoin protocol, and puts into the spotlight some uncomfortable facts about Bitcoin’s notion of “decentralization”. Most security protocols, including encryption algorithms, hash algorithms and full-scale protocols, have dozens of implementations in many different programming languages, and the protocol specification is determined by a clear standard against which any individual implementation can be checked for compliance. In the case of Bitcoin, however, things are different. Although there is technically a standard on the Bitcoin wiki pages, it has at times been poorly updates, and the reality is that the bitcoind implementation is the standard, and nearly all miners on the Bitcoin network are using some version of it. There are a few alternative implementations, the most complete one being Amir Taaki’s libbitcoin, with Mike Hearn’s BitcoinJ (written in Java) close behind, but so far they have gained very little traction in use with mining, and, what’s more, there is a small portion of the Bitcoin development community which is actively against the idea of using multiple codebases.

Fortunately, most Bitcoin developers do not support this viewpoint, although many have come out in favor of keeping a healthy level of prudence. Mike Hearn wrote the following on the Bitcointalk forums in June 2011:

Gavin wrote to me only days after the BitCoinJ release to tell me how happy he was to see an alternative implementation. Satoshi expressed very similar sentiments. Nobody is against alternative implementations. What some people, especially Satoshi, have said is that there’s an unusual amount of risk involved with reimplementing the full system and using that reimplementation to mine. Bitcoin is very complex and if you aren’t skilled and very thorough you are likely to diverge from its behavior in small, hard to detect ways. This can fork the chain and split the economy. It’s one of the few things that could instantly kill Bitcoin beyond legal harassment of its users.

Lead Bitcoin developer Gavin Andresen replied to another poster in the same thread: “Really? I’ve been encouraging alternative implementations, who is the power-hungry core developer?”, and in November 2012 he wrote in a Bitcoin Foundation update that “part of the solution is to encourage alternative implementations that make different trust/convenience tradeoffs than the reference implementation. There has been a lot of behind-the-scenes work on cross-implementation testing (the “testnet3″ blockchain contains hundreds of transaction validation test cases, for example), and new features are being added to the protocol to support alternative implementations”

But alternative implementations are not just useful for supporting different trust/convenience tradeoffs. They are also crucial in making Bitcoin’s claim of decentralization a reality. If there had instead been five distinct Bitcoin implementation in use at the time of the fork, what would have happened is that one of the five would have recognized the wrong blockchain and forked off, leading to a loss of revenue for a small number of miners and requiring the users of clients using that implementation to upgrade. The aberrant implementation’s fork of the blockchain would end up much weaker than the others right from the start, so the risk of double spend attacks would be minimal. One can argue that there will be a greater number of forking incidents with more implementations, but each one will be smaller in effect, and testing all implementations together on the testnet before release would reduce the number of bugs that slip into production software to about the same frequency as we see today.

The other aspect of Bitcoin’s decentralization that this incident calls into question is that of mining pools. The reason why the controlled switch to the 0.7 fork was even possible was that over 70% of the Bitcoin network’s hash power was controlled by a small number of mining pools and ASIC miners, and so the miners could all be individually contacted and convinced to immediately downgrade. Another article on the fork reads [Russian]: “the real problem is not even in the code supporting the Bitcoin network; bugs are everywhere. Rather, it’s the matter of who controls it. This event clearly showed that even such a well thought-out system is controlled by the will of a very small number of people – particularly, the operators of mining pools. Over 70% of new blocks right now are being found on pools, and not on individual solo miners. The underlying idea of the system was that the benevolent majority can stop a small number of attackers, but in the present time it is simply not working. The winner in a possible takeover will be the one with greater computing power, and no one else.” Bitcoin is clearly not at all the direct democracy that many of its early adherents imagined, and, some worry, if a centralized core of the Bitcoin community is powerful enough to successfully undertake these emergency measures to set right the Bitcoin blockchain, what else is it powerful enough to do? Force double spends to reverse million-dollar thefts? Block or even redirect transactions known to originate from Silk Road? Perhaps even modify Bitcoin’s sacred 21 million currency supply limit?

However, a strong argument can be made that such fears are very unlikely to materialize. The reason why has nothing to do with the specific identities of the Bitcoin mining pool operators or the cohesiveness of the Bitcoin mining community; rather, it’s the fact that Bitcoin mining is still in fact quite decentralized; it simply is decentralized in a different way. Taking a political analogy, the closest equivalent would be a liquid democracy: a hybrid of direct and representative democracy where people can either vote for individual bills by themselves or assign politicians – with the proviso that if they do not like what a given politician is doing they can switch to assigning their voting power to someone else at any time. Back in the world of Bitcoin, although much of the Bitcoin network’s hash power is concentrated with mining pool operators in practice, every individual miner can switch from one pool to another almost instantly, so if a coalition of mining pool operators decides to start violating the Bitcoin protocol miners can simply switch to any pool that remains honest, instantly depriving the miscreants of their power. Although no mining pool has attempted to actively subvert the Bitcoin protocol so far, this kind of “voting” has been done before; in 2011, there were several incidents where the mining pool Deepbit pushed above 50% of the total network hash power, and in each case there was a mass exodus of miners toward other pools to balance things out. Although the nominal power may rest with the mining pool operators, the feedback of the community is always only one step away.

Altogether, the incident was handled very well, and all parts of the Bitcoin community should congratulate themselves for their speedy resolution of the problem and their unconditional cooperation. The Bitcoin community is not always in perfect harmony; Bitcoin gambling site SatoshiDice and a number of Bitcoin developers, notably Luke Dashjr, are usually at odds over concerns that SatoshiDice’s large transaction count is bloating the Bitcoin blockchain, but yesterday differences were laid aside as the community worked together to solve the problem. We also learned a lot, and merchants are likely to be much more prepared for such incidents in the future, perhaps implementing techniques like automatic fork detection to handle forks and avoid double spends without immediate manual intervention. Before today, many people knew that some test for the Bitcoin network would come, whether at the 1 MB block size limit or else where, but just how the community would handle such a thing was an unknown. Now, the test has come and gone, and how the Bitcoin community handled the test is known to everyone: we passed with flying colors.