When Bitcoin (BTC) first appeared in 2009, few people had a clear idea of what it was, let alone the waves it would generate both financially and technologically. The underlying blockchain technology was more or less a new concept, and like most new concepts was poorly understood in general. In 2018, blockchain remains a hot topic: while it is tied in many people’s minds to cryptocurrencies, it is actually a standalone concept on which cryptocurrencies can be based. This article will clarify how blockchains work and, just as importantly, where blockchain ends and technologies based on it begin.
This post is an updated and expanded version of our 2017 cryptocurrency primer.
What Is Blockchain and What Does it Look Like?
The purpose of blockchain is to create a ledger; that is, a record of historical transactions (be those financial transactions, messages, etc.).
Fundamentally, the blockchain is aptly named: it is a chain of blocks of data which at their most basic level (at least in most current implementations) can be conceptualised as something similar to the diagram below, which is based on the blockchain as famously implemented by Bitcoin.
Any given block of data in this implementation contains four pieces of information:
Timestamp – The time at which the block was created.
Transaction Root – The details of the transactions contained in this block – i.e. this section of the ledger. The amount of data held in this section can vary significantly: in Bitcoin, it will be approximately ten minutes’ worth of transactions. Other implementations use shorter windows.
Previous Hash – The hash of the last block in the chain – this is how the chain is linked together. When any given block has been processed, its hash becomes the Previous Hash of the next block in the chain, thus allowing historical records to be linked together and traversed.
Nonce – A cryptographic term referring to an arbitrary value used only once in a transaction. The purpose of this will be discussed in more detail later on.
The hash of the block – which becomes the Previous Hash value in the following block – is the hashed value of all the data held in these four chunks taken together.
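To make the structure above concrete, here is a minimal sketch in Python. It is an illustration under simplifying assumptions rather than Bitcoin's actual format: real Bitcoin block headers are binary and carry additional fields. The `Block` class, the `|` separator, and the sample transactions are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    timestamp: int          # when the block was created (Unix time)
    transaction_root: str   # stand-in for this block's section of the ledger
    previous_hash: str      # hash of the preceding block in the chain
    nonce: int              # arbitrary value, discussed later in the article

    def block_hash(self) -> str:
        # Hash all four fields taken together. The '|' separator is an
        # arbitrary choice made for this sketch.
        payload = (
            f"{self.timestamp}|{self.transaction_root}|"
            f"{self.previous_hash}|{self.nonce}"
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Placeholder value: the very first block has no predecessor to point at.
GENESIS_PREVIOUS_HASH = "0" * 64

first = Block(1, "alice pays bob 5", GENESIS_PREVIOUS_HASH, 0)
# The hash of the first block becomes the Previous Hash of the second.
second = Block(2, "bob pays carol 2", first.block_hash(), 0)

print(second.previous_hash == first.block_hash())
```

Because each block stores the hash of its predecessor, changing any field in an earlier block changes that block's hash and breaks the link held by the block after it.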
Is blockchain tamper-proof?
Perhaps a truism of security is that nothing is inherently tamper-proof: it has to be designed to make tampering difficult and then protected by as many anti-tampering controls as possible. It shouldn’t, therefore, come as a surprise that blockchains are not themselves tamper-proof without some additional controls.
The first of these controls is distribution and decentralisation: by ensuring that all interested parties have access to the ledger and any new transactions which are supposed to be added to it, tampering should become much more evident. If all of the parties involved have access to the same information, an attempt by anything less than a majority of stakeholders to incorrectly report a transaction will be noticed by all of the other parties who are processing the data honestly.
Without distribution and decentralisation – and therefore equal access to data for all interested parties – blockchain is no more tamper-proof than any other data storage mechanism. A blockchain owned and exclusively processed by one individual, regardless of how many nodes they operate and how many people can read the data stored in the blockchain, could be tampered with by virtue of the fact that one individual controls all of the processing.
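The point about a sole operator can be sketched in a few lines of Python. This is a toy model, not any real implementation: blocks are plain dictionaries, there is no Proof of Work, and the helper names (`build_chain`, `links_are_valid`, `rewrite_history`) are hypothetical.

```python
import hashlib


def block_hash(block):
    payload = (
        f"{block['timestamp']}|{block['data']}|"
        f"{block['previous_hash']}|{block['nonce']}"
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    chain, previous = [], "0" * 64
    for i, data in enumerate(records):
        block = {"timestamp": i, "data": data, "previous_hash": previous, "nonce": 0}
        chain.append(block)
        previous = block_hash(block)
    return chain


def links_are_valid(chain):
    # Every block must reference the hash of the block before it.
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


def rewrite_history(chain, index, new_data):
    # What a sole operator could do: alter one block, then re-link
    # every block that follows it.
    forged = [dict(b) for b in chain]
    forged[index]["data"] = new_data
    for i in range(index + 1, len(forged)):
        forged[i]["previous_hash"] = block_hash(forged[i - 1])
    return forged


honest = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"])

# Naive tampering: change one record and leave everything else alone.
tampered = [dict(b) for b in honest]
tampered[0]["data"] = "alice pays mallory 5"
print(links_are_valid(tampered))   # False: the link from block 1 is broken

# A sole operator re-links the chain, so it is internally consistent again.
forged = rewrite_history(honest, 0, "alice pays mallory 5")
print(links_are_valid(forged))     # True

# Anyone holding an independent copy would still see a different final hash.
print(block_hash(forged[-1]) == block_hash(honest[-1]))  # False
```

The last comparison is the reason distribution matters: the forged chain only stands out if other parties hold their own copy to compare against.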
At this point, we need a method whereby the interested parties can communicate with each other and check the validity of a new block submitted to the chain. This is where implementations diverge. Historically, there have been three common approaches:
Proof of Work – Make calculating a valid hash for a block difficult to do, but easy for other parties to verify. The first person to calculate a valid hash submits it to the network and it is validated by the other parties prior to adding it to their chain.
Well-Known Uses: Bitcoin (cryptocurrency); Monero (cryptocurrency)
Proof of Stake – Block creators are determined pseudo-randomly based on their ‘stake’ in the blockchain. This is primarily used by cryptocurrencies as the stake is easily calculated based on the amount of currency held by each member.
Well-Known Uses: DASH (cryptocurrency); Ethereum (cryptocurrency, hybrid PoS/PoW)
Practical Byzantine Fault Tolerance – Something of a mouthful, PBFT is a consensus-based method of ‘tolerating’ faults in the data and recovering automatically. The specifics of the system are beyond the scope of this article.
Well-Known Uses: Hyperledger Fabric
As an aside: all of the above are solutions to the risk of what is known as a Byzantine Fault – that is, a fault where imperfect or incomplete information may cause the fault to present differently to the various parties involved. Consider that each party doesn’t know whether there are malicious participants or how many, and that some of these, while ‘in’ on a scheme to falsify data, may only be malicious ‘approvers’ of bad data rather than generators of it.
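As a rough illustration of the Proof of Stake idea described above, the following Python sketch picks a block creator pseudo-randomly, weighted by stake. Real Proof of Stake systems are considerably more involved. The function name, the sample balances, and the idea of seeding the choice from a shared string such as the previous block's hash are all assumptions made for this example.

```python
import hashlib
import random


def pick_block_creator(stakes, seed):
    # Derive the random seed from shared data so that every participant
    # running this function arrives at the same choice.
    rng = random.Random(int(hashlib.sha256(seed.encode("utf-8")).hexdigest(), 16))
    holders = sorted(stakes)                 # fixed order for determinism
    weights = [stakes[h] for h in holders]   # more currency held, more weight
    return rng.choices(holders, weights=weights, k=1)[0]


# Hypothetical balances: the 'stake' is simply the amount of currency held.
stakes = {"alice": 60, "bob": 30, "carol": 10}

winner = pick_block_creator(stakes, seed="hash-of-previous-block")
print(winner)
```

Over many blocks, each participant should be selected roughly in proportion to their stake, while any single selection remains hard to predict without knowing the seed.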
Is it time to talk about mining yet?
Yes and no.
Yes, because we already have: mining is Proof of Work.
No, because mining is really an artefact of cryptocurrencies using the Proof of Work fault tolerance solution and not generally desirable for the majority of other blockchain applications.
How do you prove work?
Proof of Work functions rely on setting a ‘difficulty target’ for the hash: you set some sort of numerical goal which the hash has to meet. As hashes will always be the same for the same set of data, we need to change some value which makes up the data being hashed in order to change the hash value – this is where we use the nonce.
Take a look at the diagram below, which borrows from the example on the en.bitcoin.it wiki:
Let’s simplify all of the data held in the Previous Hash, Timestamp, and Transaction Root elements to the text ‘Hello, world!’ – shown in red on both the left and right side of the diagram.
Let’s also set an arbitrary difficulty target which states that the hash must start with four zeroes.
To achieve this, we start concatenating an arbitrary piece of data – the nonce, shown in blue – to the data we want to record and then hashing the entire string. If that doesn’t meet our difficulty target, we add a different nonce on to the original data we wanted to record and try again.
In this case, to hash ‘Hello, world!’ so that it meets our ‘four zeroes’ difficulty target, we need to count up to a nonce of 4250, which is 4,251 attempts when starting from zero. In the real world, the result would then be sent to the network, which can validate our effort very quickly by hashing just the value we’ve sent and comparing the result to the difficulty target. Remember that all of the people checking should have a copy of the same data as is shown in red above, so if we’ve cheated and used falsified data to generate our hash, they’ll get a different one to us.
This may sound a lot like ‘brute forcing’ something, and that’s because it is. However, because it’s quick to check that the work has been done legitimately, the network can easily verify results and reject any with which it doesn’t agree.
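The search described above can be sketched in Python as follows. This assumes SHA-256 as the hash function and a nonce that counts up from zero and is appended to the data as decimal text. The function names are hypothetical.

```python
import hashlib


def hash_with_nonce(data, nonce):
    # Concatenate the nonce onto the data and hash the entire string.
    return hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()


def meets_target(digest, zeroes):
    # Our simplified difficulty target: the hex digest must start
    # with the given number of zeroes.
    return digest.startswith("0" * zeroes)


def mine(data, zeroes):
    # The expensive part: try nonce after nonce until one works.
    nonce = 0
    while True:
        digest = hash_with_nonce(data, nonce)
        if meets_target(digest, zeroes):
            return nonce, digest
        nonce += 1


def verify(data, nonce, zeroes):
    # The cheap part: a single hash is enough to check a claimed nonce.
    return meets_target(hash_with_nonce(data, nonce), zeroes)


nonce, digest = mine("Hello, world!", 4)
print(nonce, digest)
print(verify("Hello, world!", nonce, 4))
```

The asymmetry is the point: `mine` may need thousands of hashes, while `verify` needs exactly one, and it only succeeds if the verifier hashes the same data the miner did.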
First and foremost, try to think of cryptocurrencies as applications which use blockchains for storage. Equally, remember that cryptocurrencies could conceivably be based on any of the above fault tolerance approaches: Proof of Work, Proof of Stake, or PBFT, although in reality all major implementations are either PoW- or PoS-based at the time of writing.
We can talk about mining now
As we previously discussed, certain currencies being minable is a result of their use of Proof of Work fault tolerance. As an inducement to perform the intensive calculations required by the PoW approach, the first person to generate a valid hash which is subsequently accepted by the network is rewarded with the transaction fees included in that block (generally the pieces of a single ‘coin’ included in a transaction after a certain number of decimal places), a new coin in the currency, or both.
Unsurprisingly, this led to a rush to mine these currencies and, as with anything where money is involved, a rush to develop better and faster ways of mining. In Bitcoin’s case, this prompted the development of Application Specific Integrated Circuits (ASICs) dedicated to mining, which pushed the network hash rate to over 30,000,000 terahashes per second in April 2018. That’s 30 quintillion hashes per second across the whole network.
To compensate for this, the Bitcoin network periodically adjusts the mining difficulty so that a new block is ‘mined’ roughly every ten minutes on average. Keep in mind that different PoW-based currencies frequently have different target times for blocks.
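The adjustment can be sketched as a simple proportional rule. The constants below reflect Bitcoin's behaviour (a recalculation every 2,016 blocks, with each change capped at a factor of four), but the function itself is a simplification: the real network adjusts a numeric hash target rather than a floating-point difficulty, and the name `retarget` is hypothetical.

```python
TARGET_BLOCK_SECONDS = 600        # ten minutes per block
RETARGET_INTERVAL_BLOCKS = 2016   # roughly two weeks of blocks at that pace
MAX_ADJUSTMENT = 4                # each adjustment is capped at a factor of four


def retarget(old_difficulty, actual_seconds):
    # How long the last interval *should* have taken at ten minutes per block.
    expected = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL_BLOCKS
    # Clamp the measured time so one adjustment cannot swing too far.
    clamped = min(
        max(actual_seconds, expected / MAX_ADJUSTMENT),
        expected * MAX_ADJUSTMENT,
    )
    # Blocks arrived too quickly -> difficulty rises; too slowly -> it falls.
    return old_difficulty * expected / clamped


one_week = 7 * 24 * 60 * 60
# If 2,016 blocks took one week instead of two, difficulty doubles.
print(retarget(100.0, one_week))  # 200.0
```

This is why adding faster hardware does not make blocks arrive faster in the long run: the extra hash rate is absorbed by a higher difficulty at the next adjustment.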
This huge hashing rate quickly made using anything other than specialised hardware massively inefficient (to the point where you’d likely make a loss on the electricity used). The following table was again compiled from data on en.bitcoin.it/wiki and manufacturer specifications:
|2017 Antminer T9 ASIC||0.126|
|2013 AMD 7870 XT GPU||326.8|
|2011 Intel Core i5||~20,000|
Note: The 2013 AMD 7870 XT is the fastest, most recent GPU for which anyone has submitted a Bitcoin hashing benchmark.
As a result of this arms race, implementations diverge once again, this time with some cryptocurrencies branching off to use a different type of Proof of Work algorithm. Broadly, these can be categorised as:
CPU-Constrained Algorithms – These are algorithms such as SHA256 (as used by Bitcoin) which require very little memory per instance (< 512 bytes) and are therefore easily performed by cheaply-made ASICs.
Well-Known Uses: Bitcoin (SHA256)
Memory-Constrained Algorithms – Algorithms which require significantly more memory per instance and therefore significantly increase the cost and effort involved in ASIC development.
Well-Known Uses: Litecoin (Scrypt); Monero (CryptoNight)
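The difference between the two categories can be illustrated with Python's standard `hashlib` module, which provides both SHA-256 and scrypt (the latter requires a Python build linked against a suitable OpenSSL version). The header bytes are a hypothetical stand-in, and a single SHA-256 pass is used for simplicity. The scrypt parameters shown are the ones Litecoin uses, with the header serving as both input and salt.

```python
import hashlib

# Hypothetical stand-in for a serialised block header.
header = b"example block header"

# SHA-256 needs only a tiny amount of working state per hash.
sha_digest = hashlib.sha256(header).digest()

# scrypt deliberately requires a large scratch buffer per hash.
N, R, P = 1024, 1, 1
scrypt_digest = hashlib.scrypt(header, salt=header, n=N, r=R, p=P, dklen=32)

# scrypt's main scratch buffer is 128 * N * r bytes.
scratch_bytes = 128 * N * R
print(scratch_bytes)  # 131072 bytes, i.e. 128 KiB per hashing instance

print(sha_digest.hex())
print(scrypt_digest.hex())
```

An ASIC running many instances in parallel must supply that memory for every instance, which is where the extra cost and design effort come from.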
At this point, we need to introduce the topic of malicious miners: the prospect of easy financial reward often goes hand-in-hand with malicious activity.
We have spoken before on this blog about malicious mining, in particular on the rise of Monero mining malware throughout 2017. The memory-constrained CryptoNight algorithm used by Monero was developed specifically to be more effective on home PC hardware than on ASICs. In fact, the design spec (called CryptoNote) states the following:
[CryptoNote] is designed to make CPU and GPU mining roughly equally efficient and restrict ASIC mining.
As such, it naturally appeals to malicious actors looking to deploy miners on home PCs – be that via traditional malware which executes natively on a PC or, latterly, through in-browser miners which run when a user visits a webpage with the miner embedded. Our next blog post will look at the latter in more detail.