Video - Consensus Algorithms, Blockchain Technology and Bitcoin UCL

An academic lecture by Andreas M. Antonopoulos explaining the consensus algorithm, "Proof of Work", used by bitcoin and many other blockchains. This talk was presented in collaboration with the Department of Computer Science, at University College London. Andreas is a UCL alum.


Andreas: So this is great. It's back to my roots. I was a student here in 1990. But I was actually born in London and I'm half British, which you won't hear in my accent anymore, Brooklyn changed that. The first day I arrived at UCL, I was telling my mom, "I'm going to be doing my orientation session at the University College Hospital, which is across the street, they have a big auditorium and that's where they're sending all the students." And she said, "Well, just be careful because last time you were there someone assaulted you with forceps." Turns out I was born at UCL and then ended up going to college at UCL and I had no idea. So this is this is really a fantastic honor for me and I'm truly very happy to be here and be doing this at UCL.

All right. So today's presentation is about the consensus mechanism of Bitcoin. And we're going to go into whatever technical depth is appropriate for the audience. So I need to get an idea. How many people here are programmers or have a computer science background? Very good. How many people are completely new to Bitcoin? Okay. And, oh, there's an overlap there, very interesting. Excellent. All right. So we're going to keep it quite technical. Bitcoin is a system that is made up of a network protocol, some core cryptographic functions and a set of gained theoretical equilibrium systems that dynamically adjust, which basically means there are some economics happening on a global scale. And these economics influence the operation of the network protocol. What we're going to talk about today is the invention that lies at the center of Bitcoin. Like many technologies, Bitcoin isn't entirely novel. In fact, the core idea in Bitcoin is to take five or six technologies that existed in the '70s, '80s and '90s and mash them together to create something completely new. All of the constituent components in Bitcoin existed about 15 years before Bitcoin actually showed up, but no one had combined them in this particular way. So what's really novel about Bitcoin is its architecture and the design characteristics that make it work. Think of it as a recipe. All of the ingredients were there. But no one had ever thought of combining them in this particular way. One of the most important components of Bitcoin is the use of cryptographic hash functions, specifically SHA-256. So how many people here are familiar with hash functions? At least understand a basic? Okay, great. I apologize in advance for my handwriting. I type fast. I write horribly. SHA-256 is an algorithm that takes an input, and this can be an input of any size, any form of data as input. And then it's mixes it up. And it produces a fixed size output, which is 256-bit long. So you can just take data in here and it will always produce 256-bits hash as it's called. A useful way to think of this is a fingerprint for the data. Now when you put data into SHA-256, you have no way of predicting what's going to come out. And what does come out looks random. If you do any kind of statistical test on the output of SHA-256, it will appear random. However, it's deterministic. Meaning, that if you put the same thing in the top, you get the same thing out at the bottom always, which means that if you have this fingerprint and someone gives you a piece of data, you can put the data in, verify the fingerprint, and you know that they apply SHA-256 to it. So it can be huge to prove a certain data set. This particular technology in 1997, Adam Back created a system for anti-spam called Hashcash. And he said, "Well, a lot of people are sending spam. So how about we use this function to make them do some work so that before they can post a message on this message board, they have to do a few hundred thousand SHA operations in a row, and then at the end of that you produce a fingerprint." Now it's fairly easy then for someone to check that for a specific value. And if they check it they know that the other person has done the work. As a result, what that means is that if you want to post a message on the message board you have to do maybe half a second of computing before you post the message. Now if you're a legitimate user of this message board, half a second of computing is nothing. If you're a spammer and you want to send a million messages, half a second of computing means you have to do half a million seconds to send a million messages. And suddenly it changes the economics. This mechanism is called proof-of-work. And what Satoshi Nakamoto did was to use this proof-of-work mechanism in order to establish consensus on a global decentralized network and so this is at the core of Bitcoin. This invention SHA-256 has existed for decades, Hashcash had existed since 1997. What Satoshi Nakamoto did was combine this with a peer-to-peer network that operates very similar to how BitTorrent operates or many other peer-to-peer networks to create a digital cache system. So let's say you have a 256-bit fingerprint, you put data in, something comes out. It's seemingly random, but deterministic. What are the chances the first bit is going to be zero? Anyone?

Unidentified Male: 50%.

Andreas: 50%, right. So half the hashes that are going to come out of this will have the first bit as zero. What are the chances the first 2-bits is zero? 25%. What are the chances the first 10-bits is zero, right? So it gets exponentially more difficult as you increase the number of zeroes in the front. So if I say I want you to generate a random number using SHA-256 by putting some inputs into it. And the criteria I have, rather bizarre criteria if you think about it, is I want you to produce a random number, but I want that random number to have a certain pattern to it. And the pattern is it has to have 0-bits at the beginning of the number. In numeric numerical terms, you can describe this as the number has to be smaller than a certain value, right. So effectively, if you say the first bit is zero, has to be zero, what you're saying is that the number has to be less than 255 minus 1, right. If you say the first 2-bits have to be zero, you're saying the number has to be less than 2 to the 254, etcetera, right. So what I'm saying is generate a seemingly random number that is smaller than a specific value. That value is called the target. Specifically, in Bitcoin we call it the difficulty target because the lower that value is the harder it is to find one of these numbers. Now how do you find one of these numbers? Let's take a typical example, SHA-256 is often used to fingerprint documents so you can create a fingerprint that allows you to verify that a document hasn't been modified. Typical example of that, let's say you signed a contract with someone, you take the PDF, you throw it through SHA-256, you get a fingerprint. Now if any PDF that matches, that fingerprint, that produces that same fingerprint is the exact same PDF that you originally had. And you could verify that. If you want to download software from a website and that software is extremely sensitive, security sensitive code, you'll often see at the bottom of the website, it will say, "Verify that the software package you download has this fingerprint." And you can know that not a single bit in that software has changed. This is due to a characteristic of SHA-256,which is that the output changes dramatically even if you change just one bit. So this is a cascade effect, you change one bit in the input. What you get out is not just one bit different. It is in a completely different part of the 256-bit space, so you never know where you're going to land. So let's take a simple example, I take this phrase, hello! Anybody who has a laptop can throw SHA-256 sum or SHA? Any Mac, any UNIX based system, any windows-based system has that function in it, type SHA and it's going to give you a hash. You tell it to use the 256-bit algorithm, you type in capital H-E-L-L-O! It's going to give you a specific fingerprint. That fingerprint is specific to this phrase, you change the exclamation mark, you add a space, you change the first letter to a lowercase H completely different fingerprint, right. But as long as you enter the exact same phrase, the result will be a specific fingerprint. Anybody did that on their laptop? Okay. What's the result?

Unidentified Male: All this?

Andreas: Just give me the first few digits.

Unidentified Male: a8d19153 (Inaudible0:11:28).

Andreas: Anyone in the world can take the phrase, hello! Put it into SHA-256 and they will get a8d19153. So, how do I change that? Well, if I wanted to produce different fingerprints from this, I could introduce some change to the input. How about adding a number? So I can take the same string and instead I add a number to it, hello, zero. And please help me out here if you don't mind.

Unidentified Male: Completely different 0E0978C.

Andreas: 0E0978C. It's a 64 characters long. So completely different output and I can keep going, hello, one. No need to do it, but completely different outputs, right. Each time I do this I'm going to get a different result. Now you remember before I said, I want my target to be, let's say the target is that the first 4-bits are zero. Well, this one fulfills it. The first hexadecimal digit is zero, which means that this particular input produces a value less than the target. That was relatively easy to do, right, because once again the probability of finding a hash where the first hexadecimal digit is zero is not that low. I can just run a few iterations and I'm going to get at least one value that's less than zero. Now if I gave you, if you gave me the string, hello, and you said, "Find me a specific random number" or any number that you add to hello and when you add it, it produces a hash that starts with zeroes. How do I find that number? I have to brute-force it. I have to try every possible number. So I would start hello zero, hello one, two, three, four, five, six, seven, eight, and just keep going in a loop as fast as I can in order to produce hashes. And at some point, a hash would pop out that would meet my difficulty target. And as soon as I have that hash, I can show it to you and I can do two things: I can show you a hash; and I can show you the number. So I can say, for example, hello 39,137 produces a hash that has two zeroes in the beginning, I'm just guessing, right. If I do that, can you verify? Well, of course, you could take hello 39,137, plug it into a hash, come out with a fingerprint. And if the first few digits is zero, you're like, "Oh, that is correct." What have I just proved? What I've proved is that I did several tens of thousands of SHA operations to find that. There is no way I can produce that result without doing the work. There is no way I can produce that result without searching through the space of fingerprints in a brute-force manner. And based on the difficulty, 1 bits 0, 50%, 2 bits 0, 25%, 3 bits 0, 12%, 4 bits 0, 6 %, etcetera, etcetera, etcetera. You can estimate statistically, simply from the pattern approximately how many operations I need to do on average before I find it. All right. Who's got access to a blockchain right now and can look up the latest block? Anyone? Look up the block hash of the latest block.

Unidentified Male: What if you do and I then start?

Andreas: All right. Can you read me out the beginning?

Unidentified Male: 10C300000000000000000.15DE165.

Andreas: This is the fingerprint of the current block in the blockchain. In order to produce this fingerprint, a miner used the header of the block, which is a known set of data. It represents all of the transactions that are in the block, the date and time, and a few other data sets. It's a standard data structure, right. Think of the header as the hello part, that's the useful payload if you want, that we're trying to validate through proof-of-work. And then in addition to the header, they introduced a number which we call it nonce, a random number. So this is called the nonce. Now, if you take one look at this, what that means is that, when you take that blocks header and the specific nonce that was found by this miner, it produces this fingerprint, these are hexadecimal digits, 17 times 4 bits. So 68 bits at the beginning of this hash, the first 68 bits is zero. What is the probability that the first 68 bits are zero? You do the math. But I can estimate approximately how many hashes that took. At the moment the Bitcoin network is producing 500 peta hashes per second, that is five hundred quadrillion hashes per second. And it still takes on average 10 minutes to find a block, so 500 peta-hashes times 10 minutes on average. Is how many hash operations the miner needed to do this? If I see this hash, I know that the only way a miner was able to find that very special nonce that produce this result. There is no way to shortcut this function. The only way they did it is by performing quadrillions upon quadrillions of hash operations. And in order to perform these quadrillions upon quadrillion hash operations, they had to use a very important precious resource energy. So Laws of Thermodynamics tell me that in order to run the SHA algorithm you have to flip bits. To flip bits, you have to expend energy. Expending energy dissipates heat. And therefore, I know two things about this miner. One, they're running some serious hardware because they are able within a period of less than 10 minutes to do quadrillions of hashes. This hardware didn't exist three years ago. This hardware, I can tell you from the specs we see, it's probably a 20 nanometer fabrication chip that is designed to do just SHA-256 and nothing else. And the way it works is by having hundreds of thousands of parallel SHA-256 engines packed as densely as possible. These chips consume enormous power. You can heat your house with them. You can toast bread with them. And if you don't dissipate them they will melt. And these chips consume electricity. In fact, you can estimate approximately how much electricity is consumed in terms of watts per giga-hash. So how many watts of electricity you consume to do a billion calculations of hashes? And that metric is the efficiency of this mining equipment. So just by looking at this number, I know that the miner has expended enormous amounts of effort which translates into enormous amounts of electricity and heat, which means that they had to pay someone to give them electricity, which means they incurred a financial cost. And this is really important. You will hear people say that Bitcoin wastes electricity. Bitcoin does not waste electricity, Bitcoin uses electricity to underpin the security function because it creates an economic system, whereby, in order to participate you have to incur cost. And by incurring cost, the only reason you would incur cost is for the possibility of reward. And the possibility of reward is determined by whether your block meets the consensus rules, which means you spend money. And if you play fair by the rules, you get money back. If you spend money and you try to cheat, you don't get money back, which means you lose money. So therefore, it doesn't pay to cheat and that simple game theoretical equilibrium is the core of the Bitcoin consensus algorithm. So every 10 minutes a miner has to take a whole bunch of transactions, maybe a thousand transactions that are currently outstanding, unconfirmed, ready to be included in the blockchain. They have to then perform consensus rule validation, meaning they have to look at each one of these transactions and say, "Does it meet the rules?" The rules are the same for everyone. There's a list of maybe 30 or 40 rules that every transaction has to comply with and there's a list of 30 or 40 rules that every block has to comply with. So they must construct a block with transactions that match all of these rules because once they construct that block, they're then going to try and find a nonce that makes its header fingerprint look like this. They're going to spend a lot of electricity trying to find that nonce. And if they then find that nonce and send it to the rest of the network, and the rest of the network says, "Hmm. Sorry, doesn't meet the consensus rules." They just wasted all electricity. So they have to make sure that that block meets the rules, which means that every transaction in it meets the rules, which means that every transaction is properly signed, has never been spent before, is properly structured, the amount inside doesn't exceed the number of inputs, the outputs are properly formatted, it doesn't spend mining reward in less than 100 blocks, blah, blah, blah, blah. There's a whole list of rules. And that way you align the interest of the miner who's wasting electricity or using electricity with validating the consensus rules. And in every block, the miner gets to write themselves a check. So when a miner is constructing a block, the first transaction they put in that block is a transaction that pays them reward. It pays them 25 Bitcoins. And that is a unique transaction in Bitcoin because most transactions in Bitcoin, all transactions in Bitcoin, have inputs and outputs. But the coinbase transaction, as it's known, the first one has no inputs. So what that means is it's writing a check from nowhere to myself for 25 Bitcoin. So every miner will put a check in the beginning of the block that says, "Pay me out of thin air 25 Bitcoin." And that 25 Bitcoin is new, brand-new, never existed before and is created. So here's a simple question, why 25? Why doesn't the miner write themselves a check for 26 Bitcoin or 30 Bitcoin?

Unidentified Male: Because miner moves forward.

Andreas: Sorry.

Unidentified Male: None of the other miners would agree to trust that.

Andreas: No, none of the other miners would agree to trust that. So imagine a scenario, where the miner gets greedy. And instead of writing a check to themselves for 25 Bitcoin, they write a check for 26 Bitcoin. They construct the candidate block. They fill out with transactions. They put the first transaction to pay themselves 26 Bitcoin. And now they have the header, and now they have to search for this number, the nonce. And they do all of this work and they produce a hash. They find the proof-of-work. They then disseminate that block to all of the other miners. What do the other miners do? They validate the block consensus rules and they look through. And one of the consensus rules is you can only pay yourself reward at the correct rate based on what block we're at. So if it's 2009 to 2012 year, the first 800,000 blocks, 50. If it's now until approximately August 2016, 25. At a very specific moment on a specific block that equation is going to produce a result of twelve and a half, shortly after approximately August 2016 when enough blocks have been mined. Sorry. It's after every 200,000 blocks. So after 400,000 blocks have been mined on the very next block, everyone will expect to see a reward of twelve and a half. And if you write yourself a check for 25 at that point, your block is going to be rejected. Okay. Yes.

Unidentified Male: Who sets the validation rules?

Andreas: Who sets the validation rules? This is a really good question. The validation rules are defined by the software, the reference implementation. So if you want to know what the current validation rules are, you read the software code for the Bitcoin core reference implementation. So it's a C++ program that contains within it, functions that do things like its block valid is transaction valid, you know. And these functions evaluate things based on a simple set of rules. Now you might ask, "Where are these rules documented?" And the answer is "Nowhere." The rules are whatever the core implementation says the rules are. So the rules are only documented in C++. In fact, this is one of the tricky parts of consensus. The rules are whatever the core implementation does, including the bugs, including every bug ever found since 2009. If you want to write a competing implementation you have to simulate every bug that was found in the code since 2009 and process the blocks in exactly the same way because you have to reprocess them from the beginning until today, which means that you have to fail where Bitcoin core failed and succeed where Bitcoin core succeeded, bugs and all. Yes.

Unidentified Male: Whose definition of C++?

Andreas: Whose definition of C++? Well, it's --

Unidentified Male: It keeps changing.

Andreas: Yes. So with that software code comes a long list of very specific dependencies. Dependencies on underlying database code C++ versions, boost and various other libraries that are used up to recently open SSL libraries. Now they've changed that a bit. So, yes, there is a whole edifice of dependencies that constitutes the reference code. Yes.

Unidentified Male: (Inaudible0:28:37) a consensus (Inaudible0:28:39) change it or a centralized (Inaudible0:28:40) change it?

Andreas: Right. So let's talk about that. All right. First of all, let's explain how the decision is made. Now I've said until now that when minors receive a new block, they validated by the rules and then they decide if they're going to accept it. How do they decide? How do they vote? How is consensus actually affected on the network? And consensus is affected on the network based on a decentralized mechanism of voting by means of the longest difficulty chain. Let me explain what I mean by that. So the blockchain is a sequence of blocks. Each block within it contains the hash of the previous block. And the hash of the previous block is based on the header which contains the hash of the previous block, which is based on the header which contains the hash of the previous block, which means that each block within it contains something that if it changed, would change its own hash, which would change the next blocks hash, which would ripple all the way through the block chain. And this all goes down back to the first block. So in January 2009, Satoshi Nakamoto mined the first block. They did not use the Bitcoin consensus algorithm. To mine this first block, they kind of jury rigged it so that it would come out properly because there was no consensus mechanism at the time. The second block was mined based on consensus. The first block is actually embedded in every version of Bitcoin. So every piece of software that implements the Bitcoin consensus algorithm has within it statically defined the Genesis block, including the Genesis block's hash, its fingerprints. If you are looking at the Bitcoin blockchain and you have the Genesis block and you have its fingerprint and someone gives you a block and says, "This is block number two." You can verify it because block number two contains a reference to the hash of the previous block, which is the Genesis block, and you have that as a constant in your code. And then you can verify that the second block was mined properly because its hash should contain a number of zeroes in front, which means that the proof-of-work was done. And now you validated the second block. If someone gives you the third block, you would then do the same process, you'd validate all the transactions within it, you'd validate that it links to the second block, and you would validate the third block. And if you do this process after three or four days, you're going to get to block 366,000 and then the network is going to tell you, "You already have the latest block." And you've essentially rebuilt the entire chain from Genesis to today. And you can independently verify that every block you have is correctly linked to every previous block all the way back to the Genesis block, which means you have the authentic blockchain. This property is really important. This property means that in a distributed system, nodes can join and leave at will. And one of the important properties of Bitcoin is what happens when you turn your back? What happens when you leave the network, stop paying attention and then you come back? You come back and the network tells you, "You're a thousand blocks behind." But you can then retrieve each one of these blocks. Rebuild them on top of what you knew to be the truth before and arrive at the same truth as everybody else in a way that is completely irrefutable that requires no appeal to authority because that can be verified independently by your node. So the blocks essentially get build in this chain that references everything down to the past. So far so good? All right. So let's say I'm a block 366,001. I validate it that. Up to this block, I can link everything back to the Genesis block. As far as I know this is the latest block on the network. And I'm connected to the network and somebody sends me a message that says "I have blocked 366,002." So I can add that block to the chain and validate it. What I can also do is I can calculate how much proof-of-work was in this block based on how many zeroes are in the front of the header. And then I can add up all of the proof-of-work so far in this chain and I can estimate the total amount of difficulty represented by all of the proof-of-work that has gone into all 366,002 blocks. That gives me the weight of that chain, which means that I can estimate the total cumulative difficulty that is represented by that. Here's why I need that, imagine this is a network that is distributed across the world as it is, of course, and imagine that miners are working on this problem in different places in the world. Now the miners all start with the same data. They include more or less the same transactions and then they race to find proof-of-work for a block. Let's say that a miner in Canada finds a block. And 200 milliseconds later, a miner in Australia finds proof of a block. So the Canadian miner started construction of candidate block 366, 003 and at the same time the Australian miner started construction of Canada's block 366,003. When do the miners start constructing these blocks? As soon as they receive the previous block. So think of this as a race. You're trying to find blocks, but if someone tells you there's a new block. That means you just lost the race, right. And upon receiving a block from someone else you validate it as quickly as possible and then you know the race has started again. You immediately stop doing what you were doing before, construct a new block on top of it, referencing the hash of the previous block, and start trying to find proof-of-work as fast as possible. This is a race, milliseconds matter, right. So the moment this block was broadcast across the network, the Canadian miner and the Australian miner both start their engines and they start hashing as fast as they can. They've constructed a candidate block, they stuffed all the transactions in, they calculated the header, and they start throwing random numbers next to it in order to fill that hash function and produce proof-of-work. Eight and a half minutes later, the Canadian miner finds a nonce that gives them proof-of-work. Now their header is going to be slightly different than the Australian miner. The time stamp is going to be different. They might be off by a couple of seconds. The clocks are not perfectly synchronized. They may have received transactions in a different order, maybe they saw the Australian transactions first, and then the Canadian transactions arrived a few milliseconds later. So the order in which they added them to the blockchain was slightly different. All of these things completely changed the fingerprint of the header and they completely changed the outcome of the proof-of-work. So effectively, there were working on blocks that are slightly different, maybe just a few bits different, maybe a few bytes, maybe a few kilobytes different, right. And they're both racing. At eight and a half minutes, this miner finds a block. At eight and a half minutes and 100 milliseconds, this miner finds a block. But they're all on opposite sides of the world. What do they do as soon as they find a block? They tell everyone about it because now the race is to make their block, the one that the world knows about. They've won the race, but they need to make sure the world knows that they won the race because that's what matters. So they start propagating. They start sending that block on all of the nodes that are connected to them as fast as they can. So imagine now you've got these nodes and you have these two blocks pop, pop, and they start propagating. Every node receives the block says, "Okay. A new one has been found." Sends it to all of their peers after validating it and they each receive it validated to send it to all of their peers, and propagation starts, a ripple, like dropping two stones and two opposite sides of the lake. And they start creating these ripples across the network. Effectively, half the network is being painted green while half the network is being painted red. All of the nodes that are closest to Canada see the green block. They assume this is now the latest block. All of the nodes that are closest to Australia see the red block. They assume this is the latest block. We have a race condition. In distributed system terms, we have a race condition. Now this race condition now has to be resolved. In Bitcoin terms, this is called a Fork. Both of these blocks are valid. They have all of the consensus rules, they have sufficient proof-of-work below the target, they're fully validated by everyone, but only one can survive. At the moment, there are now two versions of the Bitcoin blockchain across the world, two competing versions of history. Only one can survive. So at this moment you have this Fork, what happens next? There is no voting mechanism in Bitcoin. You don't get nodes going, "Hey, I think it's green." Or, "I think it's red." Voting happens through the application of mining power. Essentially, what happens now is, what does a miner do when they receive a block?

Unidentified Male: Start on the next one.

Andreas: They start working on the next one as fast as they can. But when you receive a block and you start building a child of that block, what you're effectively doing is you're voting for that block and your vote is your hashing power. You put your computing behind that block and you say, "This is the longest chain as far as I know, I'm going to build on this." So at this point, you have all of the nodes that receive this one building on top and all of the nodes that receive this one building on top, and the race is on. But now there's two racetracks, a Fork. So what happens next? Someone finds a block. Seven and a half minutes later, someone who had this block as their parent by complete coincidence, doesn't matter, finds a new block. They immediately broadcast it to the entire world. Everybody who had the Australian block before sees the new block and says, "Okay, new block. I have its parent," right? Connect it, validates it, and immediately starts building on top of it. Why it was these people? Everybody who is on this side and suddenly sees this block, looks at that block, starts validating, looks at the parent hash and says, "Oops, this is a child of red. I thought Green was winning, I was wrong." The fact that a child of red won means that Green was not the longest chain, meaning that every miner who thought Green was the longest chain now has a new picture of reality. They now know that the rest of the network thinks red is the longest chain. They are on the wrong side of the Fork. So what do they do? They orphan this block. They basically say, "That was a momentary mistake." Our history deviated into a parallel universe for 10 minutes where we were wrong, but now we know better. What happens to all of the transactions that were in that block? Well, if you think about it, most of the transactions that were in this block are also in this block. There's probably very little difference between them. So the first thing that happens is all of these miners go, "Could somebody tell me what the red block looks like because I've never heard of it, right?" So they go out on the network and they ask for the specific parent fingerprint. They say, "Please give me (INDISTINCT) 5D165, I've never seen it, this one. And someone on their network says here. So it propagates the red block fully. Now most likely they've already received it. So this is a bit of a cheat, right, which means that even if you got this one first you also got this one eventually, maybe a few seconds later. And to avoid duplicating communication, what you do is you put this one on the shelf. You say, "I think we're green, but this red one also is a candidate. So I'm going to put it on the shelf there just in case I find them on the wrong side of history." And then when you find out you're on the wrong side of history, you go back to your shelf and you say, "Uh, turns out it was red." You take green, you throw away all of the transactions that are in, and you put them back in the queue. Then you take red, you start checking off the list, all of the transactions that are already in red, and see if there's anything left, which is the difference between the two. And you start putting those. You then check off all of the transactions that are in orange and then you start building on top of orange. So this side of the network now converges onto orange and they also join the race on top of orange. Okay. Let me just repeat that. Race condition, eventual convergence triggered by the discovery of the next block, forcing a longer chain validation, this is now the longest chain, therefore this wins, therefore this is not on the longest chain, therefore this loses, the entire network then converges on the longest chain and then they vote again by building on top of this, a new child, and continue going. Yes, questions.

Unidentified Male: Would it be possible to reach consensus earlier by (Inaudible0:46:00) then you wouldn't have to wait for the next block (Inaudible0:46:13)?

Andreas: Well, the difference in proof-of-work will be very, very miner.

Unidentified Male: Yes.

Andreas: Right.

Unidentified Male: But it's still (Inaudible0:46:23).

Andreas: There are other protocols for consensus.

Unidentified Male: Okay.

Andreas: The problem with that is that it can cause various weird effects in the network. So this model of eventual convergence after 10 minutes works, right. There is another protocol called Gost, G-O-S-T, that does some interesting things that allow, for example, children proof-of-work to still be counted because you still did the work and to get maybe a partial reward for that. That's not Bitcoin, right. Bitcoin works with this very simplistic algorithm. There are many other competing consensus mechanisms.

Unidentified Male: So if I'm a miner who has proved the block (Inaudible0:47:13) --

Andreas: Yes.

Unidentified Male: -- use the 25 Bitcoins?

Andreas: Well, so here's the question, each one of these has a check that pays 25 Bitcoins to someone. Checks are only valid if they're on the longest blockchain. So this check is on the longest blockchain. The entire network knows that this check is no longer on the longest blockchain. Essentially, this transaction never happened, you disappear from history. The truth is the longest chain. The other chain never happened. Now here's where our critical consideration comes in. When you earn a reward check for mining a block, you can't spend that for 100 blocks. Why? Because for one block, history is fickle, right. After 100 blocks, if your transaction is still in the chain, it is history, right. Think about the Bitcoin blockchain as geological strata. You're drilling a core sample in the ice in Antarctica, the top ten centimeters are slush, they come, they go, they melt, the wind blows, stuff around, you can't tell anything. You go three meters down, you're looking at 100 years of history and that layer hasn't moved in 100 years. You go 300 meters down and you're looking at the Cretaceous era and that millimeter thin layer hasn't moved in millions of years at all. And the way that happens is because layers get deposited on top and compacted. And the deeper they are, the harder it is to change them. Bitcoins consensus algorithm creates a geological history if you like. At the top, the winds blowing and things are very fickle. You go 144 blocks down which is one-day old, the probability of a block changing after 144 confirmations is vanishingly small. But the algorithm allows it. You can change block two. All you have to do is produce a competing longest chain that has more proof-of-work cumulatively than 366,000 blocks until today in ten minutes because you have to do that before the other chain gets one block longer, right. And this is how the cumulative work. Yes.

Unidentified Male: (Inaudible0:50:10) alternate change, though, they have discard them (Inaudible0:50:14)?

Andreas: They discard them pretty quickly because you can always retrieve the alternate block. Because if the node is publishing a longest chain, it also has the history behind it, so you can always retrieve that. I'm not sure, I mean, that's an implementation detail. But it's a good question. All right. Yes.

Unidentified Male: But it means that for achieving more Bitcoins, it's also necessary that you are near the right part of the network because if your mining a cheaper block and you spread it, if you are near, like, low computing --

Andreas: Yes.

Unidentified Male: -- (Inaudible0:50:57) you will not achieve that.

Andreas: So this is a very, very important consideration.

Unidentified Male: Like that, who is richer (Inaudible0:51:06)?

Andreas: Who is bandwidth rich? There are two important considerations in mining: how cheap you can buy electricity, and how low latency networking you can achieve. If you have cheap electricity you can mine as close as possible. Keep in mind that the efficiency of mining equipment is bound on the upper side by Moore's law. We're already seeing 14 nanometer chip fabrication. Bitcoin mining is approaching the edge of Moore's law faster than desktop computing. Why? Because there's a three-billion-dollar economic incentive behind it, which doesn't exist in desktop computing anymore. Bitcoin mining is now driving the development of silicon fabrication, which is shocking. But what that means is that you cannot squeeze more efficiencies out of silicon. Now it becomes a matter of how densely you can pack that silicon in a chip, how densely you can pack the chips on a board, how densely you can pack the boards in a rack, how quickly you can suck heat out of that, and how quickly you can push gigawatts or kilowatts or megawatts of energy into that rack. It's a data center game. And then your economic efficiency depends on the cost of your inputs electricity, the cost of your operations and ability to maintain the hardware, and your ability to propagate these blocks faster on the network. Latency is enormously important, which means that at the moment most mining has migrated to China. And the reason for that is because subsidized coal-fired electrical power in China is I think the ironic term would be dirt-cheap. And so as a result, it leads that economic concentration there. However, China has bandwidth issues and latency is a big problem, which means that as the block size increases it puts the Chinese miners at a disadvantage. If you have a one megabyte block and you're trying to propagate it to eight nodes, it takes a certain amount of time. If you take that one megabyte block and you increase the size limit to eight megabytes, it takes you eight times as long. And while you're propagating a block, someone else is beating you to it. All right. How often does a Fork happen? Approximately, every day. Once a day on average. Two-block Fork, maybe once a week, maybe once a month. If you start seeing a three-block Fork, something really weird is happening on the Bitcoin network. Why? Because imagine the two competing sides created two blocks, then the two competing sides of the network started building on top. And again by coincidence, two blocks were discovered sufficiently far on the network to propagates two equal parts of the network and sufficiently close together in time to not be able to overwhelm each other. And then everybody starts building on top. And again by coincidence, two blocks are found almost simultaneously at opposing sides of the network. The probability of that happening once, okay; twice, rare; three times, exceedingly rare, etcetera, etcetera. It's an exponential function. And that is the basis of the consensus algorithm. Do we have a crowd?

Unidentified Male: There is nobody, we can keep go on.

Andreas: We'll keep --

Unidentified Male: Somebody might be (Inaudible0:54:33)

Andreas: If someone shows up, we'll have to leave in a hurry. All right. So how often do Forks that are longer than this happen? In April of 2013, Bitcoin experienced a two-block Fork. Eight minutes later, Bitcoin experienced the three block Fork. Ten minutes after that Bitcoin experienced the four-block Fork. At this point, people started getting worried. Statistically, we are now in a three-four Sigma event, then Bitcoin experienced the five-block Fork, followed by a six-block Fork, followed by a seven-block Fork and that's when the emergency alert message went out. So April 2013, Bitcoin had an, oh-shit moment, because this isn't supposed to happen. So what had happened? This is a really interesting study in the mechanics of the network. That day or actually about a week before, a new version of Bitcoin had been released. This new version of Bitcoin used a new database for storage of the blocks. Instead of using Berkeley DB, it's used Google's Level DB to improve efficiency indexing and various other characteristics. Now databases are not part of the consensus mechanism, only they are because there's a difference between the implied consensus rules or defined consensus rules in the reference implementation. And the runtime consensus that is exhibited by the network dynamically as a behavioral artifact and bugs affect the exhibited runtime consensus. And all that matters is runtime consensus. So what happened? Berkeley DB had a bug. That bug was that it was limited to 1024 file descriptors. That bug could not be triggered when everyone was running Berkeley DB because no one could create a block with more than 1000 transactions. Because if they did, it would crash their machine and they would never propagate it. But when version 0.8 came out and machines were now mining on Level DB, one of those machines was able to create a block with 1200 transactions. That block could not be consumed by any node running Berkeley DB. They would start processing the transactions to validate them, they would open file descriptors, they would process the first 1024 transactions. And then they would attempt to validate transaction 1025, choke on it, crash, and restart. They would restart, join the network, ask it what the latest block is, receive the exact same block, start validating 1025 transactions later, choke, crash, reboot, ask for a block, validated, choke, crash, reboot. Problem is half the network adapted to Level DB, half the network was on Berkeley DB. The network suffered a complete bifurcation almost perfectly 50-50% balanced, and one side could not move forward. They couldn't move to the next block because every time they got on the network they would try to validate the same block. So one side mines another block, this one chokes, and another block, this one chokes, and another block. The April 2013 event resulted in a 26-block Fork, which should not happen during the life of the universe. This was a perfect example of runtime consensus being different than explicit consensus in the rules. And it was caused by a bug. It led to an emergency overnight summit, let's call it, optimistically a summit. That's when everyone is screaming on IRC and running around as if their hair is on fire. This happened around midnight in the U.S. By that morning, the mining operators had all gotten on board. They did some emergency upgrades, downgrades, patches, etcetera. And by running code that anticipated this mistake, they were able after 26 blocks to shift all of the hashing power back to the first Fork, create a 27-block longer chain, then invalidated the 26 blockchain that had gotten in front, wiped out all of those transactions, reloaded them on the other side, and the network continued. What's really fascinating about this example is that if you made a transaction in the network, it got delayed by 26 blocks. But it got processed eventually. Not a single transaction was dropped. There were a couple of double spends. People spend transactions on both sides of the Fork and they did this as a proof of concept, and then they refunded the merchants for their lost money. And this was one of the events that gives us an opportunity to study the consensus mechanism. On July 4, 2015, five days ago, we had a six-block Fork event. The six-block Fork event was caused by the introduction of a new standard for processing consensus rules called BIP-66, Bitcoin Improvement Proposals 66. BIP-66 has been in planning for more than a year. BIP-66 requires that after a certain point all transactions have elliptic curve digital signature algorithm signatures that are encoded in a very strict way. Previously, anything that open SSL accepted as a valid signature was a valid signature. The problem with that is open SSL has bugs and this leads to another phenomenon called transaction malleability. So in order to fix that, a proposal was introduced about a year ago to require that all signatures are strictly encoded in a specific way. It narrowed the range of possibilities and signatures, tightening the consensus rules, which means that some transactions that were previously valid would now be invalid. This is a soft Fork event, meaning it's backwards compatible. This proposal was put to a vote, meaning that it uses the blockchain mechanism to do an automatic upgrade. Every minor puts a version number in the block, so they say this is version two. If a minor was supportive of BIP-66 strict encoding of signatures, they signal this to the network by making all of their blocks Version Three blocks. And then they count. They look at the blocks that are coming in and they count of the last 1000 blocks that we've seen on the network, approximately ten days of blocks. What percentage are Version Three? Once the percentage goes about above 75%, BIP-66 is in effect, meaning that everybody should create transactions with strict signature encoding because we're about to make a major transition in the network. This is the grace period. Everybody should switch over to BIP-66 at that point. It's been voted in by the network. Once, 95% of the previous 1000 blocks or 950 blocks are all Version Three. This is a transition. And now, not only is BIP-66 encoding required but non BIP-66 encoded, non-strictly encoded, old consensus rules transactions are now considered invalid, meaning that every node should not only produce correctly-signed transactions, they should reject incorrectly-signed transactions after that threshold. That threshold was reached on the morning of July 4th. In the afternoon of July 4th, we discovered something interesting: miners who received blocks were cheating. What they were doing is they were building the next block without validating all of the transactions. Why? Because it takes time to validate the transactions and this is a race. So because they weren't validating transactions and proceeding to mine the next block, when someone who was on the minority--the 5% who had not yet upgraded to BIP-66--created a transaction that was in valid and mined it into a block. That block was accepted. 75% of the miner started mining the next block without validating it. And then the rest of the network validated it, found the invalid transaction, and rejected that block. And the network progressed one block. And part of the network kept rejecting that block. And so you had a Fork emerge. It lasted six blocks. The miners who were taking the short cut lost $50,000 in Bitcoin reward over these blocks in cumulative power that they consumed. So they basically were building blocks that were doing the proof-of-work, and those blocks were rejected by consensus because they did not conform to the new BIB-66 standard. So because they were taking a shortcut, because they were not validating to consensus rules, they found themselves on the wrong side of consensus. And in Bitcoin, you can have opinions, you can believe in big blocks, you can vote for small blocks, you can vote for BIP-66, you can vote against BIP-66. The one thing you can't go is you can't go against consensus. You go against consensus, you're burning electricity for nothing and you will pay for that. And that is the basic game theory of consensus. On July 5th, we had a three-block Fork; on July 6, we had a four-block Fork; on July 7th, we had another three-block Fork. BIP-66 is still causing instability in the Bitcoin network. But over the next couple of weeks, we expect things will get upgraded and we will see stability and the network will converge on strict encoding of signatures. In the meantime, what's interesting about this scenario is that while this is happening in the network, the network is self-healing. So when these Forks happen, they get resolved by eventual convergence. And as soon as they get resolved, all of the transactions get properly sequence and the network continues. This is an extremely resilient mechanism for a massively decentralized system. All right. Any questions?

Unidentified Male: So 51% of hash rate (Inaudible1:06:49)

Andreas: No.

Unidentified Male: No. Okay.

Andreas: Theoretically, you could. The protocol allows it.

Unidentified Male: Yeah.

Andreas: You would have to not only sustain 51% hash rate, you would then have to do 366,000 blocks worth of proof-of-work in 10 minutes.

Unidentified Male: Okay.

Andreas: So more likely you can change one or two blocks in the most recent history.

Unidentified Male: Yes.

Andreas: Maybe three.

Unidentified Male: But the (Inaudible1:07:18)

Andreas: No. Because every node should be able to fully validate from the Genesis block up no matter what you present to it based on the same rules that were in existence at the time.

Unidentified Male: Right.

Andreas: If you present a completely valid alternate history with sufficient proof-of-work to a Bitcoin node, it should be able to validate it all the way to the present from the Genesis block. In fact, every node when it starts only knows the Genesis block. The first thing it has to do is synchronize with a network. And it does that by independently and painstakingly verifying everything from Genesis block to today. It reconstructs the entire chain independently. Every node does this. It takes four to five days.

Unidentified Male: Thank you.

Andreas: The blockchain is about 40 gig at the moment, I think, depending on whether you're indexing all of the transactions or not. And it's growing fast. So the actual synchronization takes quite a bit of bandwidths and quite a bit of time. Okay, questions? Yes.

Unidentified Male: The many variations of Bitcoins are like (Inaudible1:08:29) all these. Do they use the same mechanics as Bitcoin?

Andreas: Not all. So the vast majority of altcoins operate using what we would call Nakamoto consensus, Nakamoto consensus being longest chain proof-of-work as determined by usually a SHA-256 algorithm. Some use a SHA-3 algorithm or a script algorithm or different forms of hashing algorithms, but they still implement Nakamoto consensus in terms of longest chain proof-of-work. But there are altcoins that use other forms of consensus, modified consensus with taking into consideration orphan child's, for example, which is Gost, what I described before. We have some experimental implementations of that. There are consensus mechanisms that instead of proof-of-work are based on proof of stake or delegated proof of stake. And we're seeing all kinds of new consensus algorithms being dreamt up. How many of those can scale to a global level of security that is resistant to global attacks? So far, one, Bitcoin. But that doesn't mean another one can't scale. What's difficult however is that if you try to scale a consensus algorithm today, you have to reach scale before you are attacked at scale. You have to build an industrial infrastructure of hashing or mining or a user adoption base or an economic base that is big enough to resist attack before you are attacked. Bitcoin did this by everybody ignoring it for a couple of years because they didn't think it was important. And by the time everybody noticed and thought, "Okay, maybe this is important and worth attacking," it was already strong enough that it couldn't be attacked. And then the strength of the network has outpaced the adoption and demand and attacks to make it extremely resilient to attack. The problem is you can't do that again because if you actually have a really innovative consensus algorithm and people think that's going to be valuable and they think it's going to be valuable enough to join the consensus algorithm and mind for it, they're also going to think it's valuable enough to attack. And there's no flying under the radar anymore. This is not just an algorithm. This is now a completely new scientific discipline. Consensus algorithms will be entire computer science curricula in the future. This is a completely new branch of distributed computing. It is extremely important. And it is now six years old. This is the birth of a new scientific discipline. And we've gone from one scientific paper, the Satoshi Nakamoto paper. Last year, about 140 papers, peer reviewed academic journal papers were written on consensus algorithms. This is now a thriving scientific discipline with dozens and dozens of researchers around the world working on this. Yes.

Unidentified Male: Do you worry consensus having (Inaudible1:11:35) transactions (Inaudible1:11:37) software is updated. Do you worried (Inaudible1:11:48) is that centralize and decentralize? Do you worry about that?

Andreas: So consensus works in waves. And this is an important concept to understand. That's what I would call, process consensus, which is a process of debate and proposal that happens among the development community. It starts with Bitcoin improvement proposals, discussions on GitHub, Bitcointalk, the Bitcoin developers' mailing lists in various other places, where people suggest changes or slight modifications to the rule or major modifications to the rules. That discussion, the reason for that discussion is to enable smooth software transitions in runtime consensus. So what you do is you gather debate and try to reach process consensus, which means that you have enough people in the development community agreeing with you. You then do a lot of testing. In order to validate the software, you provide a reference implementation that implements the change, you demonstrate that reference implementation on the test net which is a parallel Bitcoin blockchain, you run a battery of tests, the other developers poke holes in it, find bugs, suggest improvements. And at some point, you reach consensus. Then that is implemented in the Bitcoin core reference implementation. Now you've reached reference consensus, which means that it is introduced as a release in the software. In order to do that, you have to get a broader set of consensus. Now that software is released. In order for that software to actually go into the network, people have to upgrade. So now you have to have consensus among the constituencies. And people think that miners are the only constituency. But they're actually five consensus constituencies in Bitcoin. The software developers who are making the reference implementation, the miners who are implementing the runtime consensus for mining blocks, but also the exchanges, each exchange that exchanges Bitcoin into other forms of currencies is running nodes that validates transactions and they choose which version of the software they're running. The wallets, each wallets company, or wallet software, out there creates transactions that must be validated by consensus rules. And if they're doing centralized wallet processing, they're also running nodes that must validate transactions based on consensus rules. And finally, merchants and merchants processing, the economic engine of Bitcoin. Merchants either directly or through processors are running nodes to validate transactions. And they're doing the strictest validation possible because they're the ones who give out a plasma TV for this mythical magic internet money. And so if they don't validate the transaction correctly, they are out one TV and don't have Bitcoin to show for it. Now what happens if the miners go off on their own and the merchants' exchanges and wallets choose a different version? Well, the miners create Bitcoin and they mine it. And a hundred transactions later, they try to spend that Bitcoin. Only they can't spend the Bitcoin because they can't buy anything it because the merchants are on a different chain and therefore their transaction never happened according to the merchants. They can't convert it into currency to pay for their electricity because the exchanges are on a different change and therefore the transaction never happened according to them. And by the way, all this time they've been mining empty blocks because the wallets are on a different chain and they're producing transactions based on different consensus rules. It's not so easy to shift consensus in Bitcoin. In fact, what we're seeing over time is that it's getting harder and harder to modify consensus rules. This is a process that we see in distributed systems and protocols. I call it ossification. I don't know if that's an industry standard term. But the idea is that after a while as the protocol gets embedded in more software systems, more developers learn how to use it. It gets embedded in hardware. It gets embedded in systems that are not updated often enough or maintained often enough. Then it becomes harder and harder to change. A great example of that is IPv4. IPv4 got so embedded that we've now spent almost 20 years trying to upgrade it to IPv6 and it is resisting its own successor. It's become incredibly difficult to upgrade IPv4. And the reason for that is because you have it embedded in fridges and light switches, and wireless access points, and things that don't have interphrases and can't be upgraded or don't have enough memory and can't be upgraded. IPv4became ossified. The best protocol doesn't win. The protocol that's good enough and achieves network scale first wins. So the consensus rules of Bitcoin today are likely to be able to absorb change for a couple more years at the core protocol level. After that, most of the innovation has to move to protocol layers above just like most of the innovation on the Internet moves from IP to TCP to HTTP and to other protocols above HTTP because each of the layers below gradually became ossified and could not be changed dramatically. You can't go and change TCP/IP today. It's simply impossible. All right. Let me take one more question and then we're going to wrap because we're running late. Yes.

Unidentified Male: How many transactions can you actually get done in 10 minutes?

Andreas: So how many transactions can you get done in 10 minutes? That is a capacity limitation, which is artificially constrained by the maximum size of a block. A block today, it can be a maximum size of one megabyte. With a maximum size of one megabyte, it can fit a few thousand transactions depending on the size of each transaction which is variable. The average processing capacity is between three transactions per second to seven transactions per second with the current constraints. These are artificial constraints. So there's a big debate going on in Bitcoin at the moment as to how and when to raise the capacity limit. For the time being, blocks are coming in full about 60 to 75% of the time, meaning that there's still excess capacity to fill with low fee transactions. And in most transactions, the vast majority gets confirmed within the next available block on a best-effort basis. Occasionally, when the network is under stress it may take two or three blocks for a transaction to be processed. Transactions don't have an expiration date. As long as the network knows about them, they will be eventually confirmed. They are valid forever. And so therefore, you can keep retransmitting a transaction until there's sufficient capacity in the system to absorb it. It's fairly resilient in that way. Now the proposals of the moments are to increase the block size capacity by January 2016 to eight megabytes, an eightfold increase, followed by an increase twice over every four years. So 8, 16, 32 every four years. And to keep increasing the size in 2032, that gets us to a twenty gigabyte block, which if we keep approximately the same size of transaction, means 20,000-time increase in the capacity of transactions. So you're looking approximately 100,000 transactions per second. 100,000 transactions per second is the global capacity of the Visa Network. So depending on whether you need to do all of the transactions on the blockchain because there's a big argument that you can actually do a lot of the transactions off the blockchain. You don't need to put every transaction on the blockchain. There are many circumstances where you can do incremental transactions between two parties. This technology is called payment channels. And then do eventual settlement of the net difference between all of the sum of transactions as a single transaction on the blockchain. What that does is it takes the trust capability of the network and it provides it as an attributes to a layer above that can leverage it, but without flooding the blockchain with transactions. So there's the base amount of transactions you can actually record on the blockchain, but each one of those could represent hundreds of transactions that happen off blockchain in between two parties. So the actual capacity may be a lot higher. It's a bit like in monetary terms we talk about M0, which is the base amount of physical currency that is in existence, right. And the amount of cash that exists in the economy is less than 3% of the total amount of money in the economy. But because that cash gets used again and again and again, it actually enables for a lot more economic activity, velocity for each unit of currency. In blockchain, you can think of the base transaction rate as the capacity of the base mechanism. But with overlay networks, you can magnify that and increase the velocity of each transaction. Does that answer your question?

Unidentified Male: So essentially, you're not expecting to (inaudible1:21:33) in each block that will be dealt with (Inaudible1:21:31).

Andreas: Well, at the moment you could if you wanted to, just about.

Unidentified Male: But if you had more -- I'm thinking in terms of value.

Andreas: You have to predict. Is Bitcoin the currency with which you buy aircraft carriers? Or, is Bitcoin the currency with which every one, two, three billion people buy a cup of coffee every single day, right? And then the secondary question is, if Bitcoin is the currency with which billions of people buy a cup of coffee every day, do all of those transactions happen on the core blockchain, which means that we need to massively increase capacity? Or, do many of them happen on overlay networks with eventual settlement which still preserves the decentralized nature. It still preserves the transactional trust in the thing, but it doesn't flood the blockchain. And there's competing schools of thought on this. So we necessarily must scale up capacity. And the question is not whether we will use technique A, B or C, but more how quickly can we use A and B and C in parallel to increase capacity. We're seeing overlay networks and block size increases, and optimization of block propagation, and pruning of transactions of the blockchain to reduce the footprint on disk, and optimizations on validation and processing. All of these things are happening at the same time. So it's a bit of a philosophical issue at the moment. We don't know where Bitcoin is going as a transaction processing environment yet.

Unidentified Male: So, you're suggesting (Inaudible1:23:15) the whole thing it would be a simple (Inaudible1:23:17)?

Andreas: Not necessarily. I think that's more likely, but it could very well scale to support hundreds of thousands of transactions per second. Because one of the other things that is the context in which all of this is happening, of course, is Moore's Law. And so if bandwidths and storage and compute capacity continue to increase at Moore's Law, then in the next 20 years as Bitcoin reaches mainstream adoption, you could actually support billions of users with quadrillions of transactions. You just have to start moving a lot more data on a lot beefier computers. And so we don't know exactly where it's going. We'll see. This is a software engineering problem. This is why it's exciting. Hopefully, that research will be done here at UCL. Thank you so much for (Inaudible1:24:16).


Written by Andreas M. Antonopoulos on January 31, 2016.