Sharding
Sharding is the process of splitting a blockchain into smaller partitions, called shards, to make data easier to manage, improve scalability, and increase transaction speed. [1][2][3]
Overview
Sharding is a scalability technique that blockchains borrow from traditional databases, where it lets a system process more transactions per second. It originates from database partitioning, specifically horizontal partitioning, in which a large database is divided into smaller units known as "shards". This division makes the information in the database quicker to access. [2]
Each shard holds its own data, which makes it distinct from and independent of the other shards. A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard is held on a separate database server instance to spread load. Some data is present in all shards, but other data appears in only a single shard, and that shard acts as the single source for its subset of data.
Because it splits a blockchain network into separate shards that work in parallel, sharding can help reduce network latency. However, it also raises security concerns, since individual shards can be attacked.[7]
Sharding takes the load off a single chain that would otherwise process every interaction and transaction on the blockchain network. Each shard has its own ledger, processes its own transactions, and holds a unique set of smart contracts. [1][3]
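The routing idea behind horizontal partitioning can be sketched in a few lines. The following is a minimal illustration, not any particular database's implementation: the `ShardedStore` class, its method names, and the choice of SHA-256 modulo the shard count as the routing rule are all assumptions made for this example.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys across independent shards."""

    def __init__(self, num_shards: int) -> None:
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        # Each dict stands in for a separate database server instance.
        self.shards = [{} for _ in range(num_shards)]

    def shard_for(self, key: str) -> int:
        # A stable hash keeps routing consistent across runs
        # (Python's built-in hash() is randomized per process for strings).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key: str, value) -> None:
        self.shards[self.shard_for(key)][key] = value

    def get(self, key: str):
        # Only the one shard that owns the key is consulted.
        return self.shards[self.shard_for(key)].get(key)


store = ShardedStore(num_shards=4)
for user in ("alice", "bob", "carol", "dave"):
    store.put(user, {"name": user})

print(store.get("alice"))
print([len(shard) for shard in store.shards])
```

Each key lives in exactly one shard, so that shard is the single source for its subset of the data. A simple modulo rule like this one forces most keys to move when the shard count changes, which is one reason production systems often prefer range-based or consistent-hashing schemes.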
Advantages
- Scalability: Sharding is well suited for large, distributed enterprise applications. It allows for the fast execution of a command or a query. Storage segmentation, which is a key feature of the sharding pattern, enables the physical infrastructure to scale in a controlled manner.[8]
- Decentralization: Beyond its positive impact on scalability, sharding also has implications for decentralization. It lets a greater number of nodes take part in handling transactions, which spreads power more evenly across the network. This could in turn enhance security, since a highly decentralized network has no single central point of vulnerability.[5]
- Performance: By distributing the data across multiple machines, a sharded database can handle more requests than a single machine can. Sharding is a form of scaling known as horizontal scaling or scale-out, as additional nodes are brought on to share the load.[8]
Disadvantages
Although sharding might seem like a resolution to blockchain's scalability concerns, its implementation comes with challenges. The two primary drawbacks of sharding a blockchain are complexity and security; infrastructure cost, required expertise, and network traffic are further concerns.
- Complexity: Introducing sharding to existing blockchain networks is a formidable task due to the intricate process of network partitioning and the necessary reassignment of state. Sharding greatly increases the complexity of a software development project. Additional logic is required to shard the database and properly direct queries to the correct shard. This increases development time and cost. [3][4]
- Security Concerns: Security concerns surrounding sharding include a hack or shard takeover, where one shard attacks another, resulting in a loss of information. Inadequate implementation can lead to the risk of double spending, significantly impacting the overall security of the network. In addition, a segmented blockchain could become vulnerable to attacks, as hackers might find it easier to gain control over a single shard due to the reduced hash power necessary to manipulate individual segments. If a segment is compromised, malicious transactions could potentially be disseminated to the broader network, causing disruptions to the entire system.[3][6][7]
- Infrastructure Costs: A more elaborate network mesh is often necessary, which leads to an increase in lab and infrastructure costs.[9]
- Expertise Required: The sharding pattern requires that DBAs have both specific domain expertise and experience with the best practices of the database technologies in play in order to manage the sharding segmentation effectively.[8]
- Network Traffic: Shards distributed over a large number of geolocations can be susceptible to performance degradation due to excessive network traffic.[8]
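The single-shard takeover concern listed above can be made concrete with some simple arithmetic. The sketch below rests on strong simplifying assumptions that are not claims about any specific network: hash power (or stake) is split evenly across shards, assignment to shards is static, and control of a shard requires exceeding a fixed threshold (50% here). The function name `takeover_share_of_total` is hypothetical.

```python
def takeover_share_of_total(num_shards: int, threshold: float = 0.5) -> float:
    """Fraction of TOTAL network power needed to reach `threshold` of one shard.

    Assumes power is divided evenly across shards and that an attacker
    can concentrate all of their power on a single shard.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    return threshold / num_shards


for n in (1, 10, 100):
    print(f"{n:>3} shards: {takeover_share_of_total(n):.2%} of total power")
```

Under these assumptions, a network with 100 shards could lose a shard to an attacker holding about 0.5% of total power, compared with 50% for an unsharded chain. Sharded designs typically counter this by assigning nodes to shards randomly and reshuffling them, so an attacker cannot choose where to concentrate.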
Ethereum Sharding
Ethereum, initially proposed by Vitalik Buterin, operates as a single unified blockchain. Although it accommodates a growing number of decentralized applications and users, every transaction must be processed on that one chain by a limited set of block producers, and this bottleneck can result in network congestion.[2]
In 2018, Buterin noted in a publication on the proof of concept for a sharding update of the Ethereum blockchain:[17]
"A not yet fully solved challenge is determining how to incentivize and when to allow cross-links."
Ethereum sharding is a method for increasing the number of transactions that the Ethereum network can process. The concept involves splitting the entire Ethereum network into multiple portions known as "shards". Each shard contains its own independent state, meaning a unique set of account balances and smart contracts.[14]
In one of his posts in 2021, Buterin emphasized the importance of sharding: [15]
"Sharding is the future of Ethereum scalability, and it will be key to helping the ecosystem support many thousands of transactions per second and allowing large portions of the world to regularly use the platform at an affordable cost."
Earlier, in December 2018, Buterin had tweeted: [16]
"Proof-of-stake blockchains with sharding will be “thousands of times” more efficient in the future."
Danksharding
Danksharding is a method for increasing the number of transactions that the Ethereum network can process, and it is central to Ethereum's plans for scaling. It is intended to provide a large amount of space on Ethereum where rollups can post their compressed transaction data.
Ethereum’s roadmap includes a protocol upgrade called Danksharding. This upgrade aims to make transactions on Layer 2 as cheap as possible for users and should scale Ethereum to >100,000 transactions per second. Proto-Danksharding, also known as EIP-4844, is an intermediate step along the way.
Proto-Danksharding introduces data blobs that can be sent and attached to blocks. The data in these blobs is not accessible to the EVM and is automatically deleted after a fixed retention period (roughly 18 days, or 4096 epochs, under the deployed EIP-4844 parameters). This means rollups can send their data much more cheaply and pass the savings on to end users in the form of cheaper transactions.
The blob of data submitted by a rollup has to be verified to ensure the rollup is not misbehaving. This involves a prover re-executing the transactions in the blob to check that the commitment was valid. This is conceptually the same as the way execution clients check the validity of Ethereum transactions on layer 1 using Merkle proofs.[13]
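The blob lifecycle described above (attach to a block, keep a commitment, prune after a retention period) can be modeled with a toy in-memory store. This is only an illustrative sketch, not Ethereum client code: `Blob`, `BlobStore`, and the 100-second retention value are hypothetical, and a SHA-256 digest stands in for the KZG commitments that EIP-4844 actually uses.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Blob:
    data: bytes
    posted_at: int  # seconds on a hypothetical clock

    @property
    def commitment(self) -> str:
        # Stand-in commitment: a plain hash, NOT the scheme Ethereum uses.
        return hashlib.sha256(self.data).hexdigest()


class BlobStore:
    """Keeps blobs only for a fixed retention window."""

    def __init__(self, retention_seconds: int) -> None:
        self.retention_seconds = retention_seconds
        self._blobs: Dict[str, Blob] = {}

    def attach(self, blob: Blob) -> str:
        # The commitment is what a block would reference long term.
        self._blobs[blob.commitment] = blob
        return blob.commitment

    def get(self, commitment: str) -> Optional[Blob]:
        return self._blobs.get(commitment)

    def prune(self, now: int) -> int:
        """Delete blobs older than the retention window; return how many."""
        expired = [
            c for c, b in self._blobs.items()
            if now - b.posted_at >= self.retention_seconds
        ]
        for c in expired:
            del self._blobs[c]
        return len(expired)


blob_store = BlobStore(retention_seconds=100)  # illustrative value only
ref = blob_store.attach(Blob(data=b"compressed rollup batch", posted_at=0))
blob_store.prune(now=50)    # still inside the window, nothing removed
blob_store.prune(now=100)   # window elapsed, blob data is dropped
print(blob_store.get(ref))  # the data is gone; only the reference remains
```

The point of the design is that the short commitment can persist while the bulky data is discarded, which is what keeps the storage burden on nodes bounded.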
Zilliqa Sharding
Zilliqa is a high-performance, high-security blockchain platform that uses sharding to scale. Sharding in Zilliqa takes three forms: network sharding, transaction sharding, and computational sharding.
- Network Sharding: This is a mechanism that allows the Zilliqa network to be divided into smaller groups of nodes each referred to as a shard. For example, imagine a network of 1,000 nodes, then, one may divide the network into 10 shards each composed of 100 nodes. These shards can process transactions in parallel. If each shard is capable of processing 10 transactions per second, then all shards together can process 100 transactions per second.
- Transaction Sharding: Whenever a transaction reaches the network, it is assigned to a specific shard. The assignment is determined by the first few bits of the sending address of the transaction. This strategy, however, only works for payment transactions. To handle both payment and smart contract transactions properly, transactions are categorized, and each category gets its own assignment strategy.
- Computational Sharding: In Zilliqa's approach to computational sharding, every node keeps a copy of the current state, but the transaction history is split into pieces so that not every node has to hold a full copy of it.
The ability to process transactions in parallel due to the sharded architecture ensures that the throughput in Zilliqa linearly increases with the size of the network.[10][11][12]
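The network and transaction sharding descriptions above can be sketched as follows. This is a simplified illustration under stated assumptions rather than Zilliqa's actual implementation: the function names are hypothetical, nodes are dealt into shards round-robin purely for simplicity, and the address-based rule assumes the number of shards is a power of two so that a whole number of leading bits selects the shard.

```python
def partition_nodes(node_ids, num_shards: int):
    """Deal nodes into shards round-robin (a stand-in for real assignment)."""
    shards = [[] for _ in range(num_shards)]
    for i, node in enumerate(node_ids):
        shards[i % num_shards].append(node)
    return shards


def aggregate_tps(num_shards: int, tps_per_shard: int) -> int:
    """Idealized throughput when shards process transactions in parallel."""
    return num_shards * tps_per_shard


def shard_for_sender(sender_address_hex: str, num_shards: int) -> int:
    """Choose a shard from the leading bits of the sender's address."""
    if num_shards < 1 or num_shards & (num_shards - 1):
        raise ValueError("this sketch assumes a power-of-two shard count")
    hex_str = sender_address_hex
    if hex_str.startswith("0x"):
        hex_str = hex_str[2:]
    bits_needed = num_shards.bit_length() - 1  # e.g. 4 shards -> 2 bits
    total_bits = len(hex_str) * 4
    # Shift away everything except the leading `bits_needed` bits.
    return int(hex_str, 16) >> (total_bits - bits_needed)


# The example from the text: 1,000 nodes, 10 shards, 10 TPS per shard.
shards = partition_nodes(range(1000), 10)
print([len(s) for s in shards])  # ten shards of 100 nodes each
print(aggregate_tps(10, 10))     # 100

# Payment routing with 4 shards (illustrative 20-byte addresses).
print(shard_for_sender("0x" + "c0" + "00" * 19, 4))  # leading bits 11 -> 3
print(shard_for_sender("0x" + "40" + "00" * 19, 4))  # leading bits 01 -> 1
```

Routing by sender address keeps all payments from one account in a single shard, which lets that shard detect double spends by that account without consulting the others. The throughput figure is an upper bound that ignores coordination overhead between shards.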