Sharding and Small-world Networks: the golden needle in the blockchain haystack


By Lucius Gregory Meredith & Ralph Benko

Introduction and motivation

The late, great US Senator Everett Dirksen allegedly once said, “a billion here, a billion there, pretty soon you’re talking real money.” The same concept applies to data. The masters of the digital universe process many billions of high-level transactions (emails, Google Maps requests, Facebook and Twitter posts, Spotify streams, and many more) per month and, as a consequence, generate billions of dollars in profits.

Meanwhile, the digital world has taken the mistaken position that blockchain’s primary value lies in cryptocurrency and DeFi, i.e. financial transactions. One of the reasons for this is its Secret Origin Story: Bitcoin, which was proposed as a digital currency. The misconception persists because the dominant incumbent blockchain networks, Bitcoin and Ethereum, cannot scale. In fact, the big money in digital networks lies in data, whether transmitting it or retrieving it, and in the advertising or subscriptions associated with providing data services.

For example, Ethereum miners make about a million dollars a day. In doing so they burn inordinate amounts of electricity. To replace that infrastructure with a more elegant, energy-efficient, and sustainable one, the economic proposition has to become inescapably compelling. Welcome to capitalism!

What if they could make $10M a day and do so by charging 1% of their current transaction fees? To achieve that goal, they would have to get 1,000x as many transactions on the network.  

Is that even possible?

The goal of this essay is to show that we can indeed create a robust decentralized small-world network for supply chain, one that will be resilient in the face of the systemic supply chain failures that are almost certain to happen as a result of climate-related and other disasters. In other work we have cataloged the challenges facing supply chain management as climate-related events become more numerous and more frequent.

Here we will emphasize that preparing for these eventualities can at the same time create very lucrative and socially relevant business opportunities. In particular, we argue that this strategy is in fact the fastest way to get engagement on a utility token in a large-scale public blockchain network.

A billion here, a billion there, pretty soon you’re talking real money.  

Is there a way to provide real scalability to the blockchain and Web3 markets? As it happens, there is: by breaking the bottleneck between computer scientists and software engineers.

Computer science has discovered the power of concurrency to make blockchain scalable. This has not translated into the requisite software. Until now.

The commitment to scale requires thinking about scaling from the very beginning, and not just the technical requirements, but also the economic opportunity. In reality, most transactions are isolated, and as such can run concurrently. This is how today’s financial networks scale. In fact, the market has been scaling software systems by adding hardware for the last half century. 

Only blockchain networks have opted for architectures that get slower as more nodes are added. Next-generation blockchain, however, recognizes that the only way to achieve scale is to exploit concurrency: assigning the ever-growing and ever-cheaper (in both cash and power consumption) supply of physical threads provided by today’s hardware markets to transactions that can be processed independently at the same time.

Note, however, that the value proposition of a network based on concurrency only kicks in when the transaction volumes reach a certain number. Roughly speaking, using a sharded network to store and retrieve data, with the mainnet coordinating between the shards, really makes sense when the transaction volumes are three orders of magnitude (1,000X) above the transaction volumes of networks like Ethereum and Bitcoin.

At such volumes validators can charge 1/100th the transaction fees typically associated with Ethereum and yet make 10X the revenue of Ethereum miners. Ethereum fees frequently run as high as 25% of the value of a transaction, and for smaller transactions they can exceed the value of the transaction itself.
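The fee arithmetic above can be checked in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes revenue is simply the fee per transaction times the number of transactions, and the function name `required_volume_multiplier` is ours, introduced for illustration.

```python
def required_volume_multiplier(revenue_multiplier: int, fee_divisor: int) -> int:
    """How many times more transactions are needed to multiply revenue by
    `revenue_multiplier` while charging 1/`fee_divisor` of the current fee.

    Assumes revenue = fee per transaction * number of transactions.
    """
    return revenue_multiplier * fee_divisor


# Figures from the text: 10X the revenue at 1/100th of the fee.
volume_multiplier = required_volume_multiplier(revenue_multiplier=10, fee_divisor=100)
print(volume_multiplier)  # 1000, i.e. three orders of magnitude
```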

To be clear we are suggesting that a concurrency-based protocol becomes economically compelling when it reaches billions of transactions per month, instead of the millions of transactions per month that Ethereum and Bitcoin process today, combined. Is this realistic?

When coupled with the significantly lower energy costs associated with concurrency-based transaction validation (hundreds of millions of times more efficient than Bitcoin), the choice to run these validators over proof-of-work or even most proof-of-stake networks becomes economically overwhelming. Such an approach means rethinking a lot of received wisdom about blockchain in the light of the actual mathematical, scientific, and market data about networks and how they scale. So, let’s get right to it.

A quick feasibility study

Data transactions instead of financial transactions. 

First, let’s check feasibility. Is it even possible to get to billions of transactions per month? It certainly is. Let’s rethink the source of transactions. 

Stop for a minute and consider your own behavior on your mobile, laptop, and desktop applications. How often are you clicking to pay versus clicking to play? By that we mean: how often are you clicking to access or update data stored in some cloud-based service rather than to transfer cash?

Specifically, how often does a user click to access or send email through Gmail? How often does a user click to access Google Maps? What about WeChat? Telegram? Facebook? Twitter? Yet, in those services the end user never clicks to pay.

The use of networked services for data access and update, i.e. data transactions, overwhelmingly dominates that of financial transactions. So, to achieve transaction volumes in the billions it is necessary to serve data transactions above financial transactions. 

Coordination costs.

But even if we do serve data transactions how hard is it to actually get to billions of transactions? Is the coordination cost to get to this scale too much of a barrier? 

Let’s consider a simple example from the world of audio applications such as Spotify. Suppose a dApp development community wanted to create a decentralized Spotify. How big would they have to be to get to a billion transactions per month? 

Well, if they could land 100 artists, each of whom had 100,000 followers, assuming there was almost no overlap in the artists’ audiences, that’s 10M followers. If each follower listened to 100 songs per month (around three a day, not much of a stretch), that’s a minimum of 1B transactions per month.
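The estimate above is easy to reproduce. The sketch below uses only the figures from the text and makes two assumptions explicit: audiences do not overlap, and each song played counts as one data transaction.

```python
ARTISTS = 100
FOLLOWERS_PER_ARTIST = 100_000
SONGS_PER_FOLLOWER_PER_MONTH = 100

# Assumption from the text: audiences barely overlap, so followers simply add up.
listeners = ARTISTS * FOLLOWERS_PER_ARTIST

# Assumption: each song played counts as one data transaction.
monthly_transactions = listeners * SONGS_PER_FOLLOWER_PER_MONTH

print(f"{listeners:,} listeners")                          # 10,000,000 listeners
print(f"{monthly_transactions:,} transactions per month")  # 1,000,000,000 transactions per month
print(f"{SONGS_PER_FOLLOWER_PER_MONTH / 30:.1f} songs per listener per day")  # 3.3 songs per listener per day
```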

Further, it turns out that the fastest growing sector of online music is the small to medium artists with audiences in the range of 1M followers. So, this sort of dApp is not only feasible, it actually represents an optimal product market fit. 

Moreover, a quick survey of online applications shows that Spotify is not unique. Spotify-like services are being provided not just in music, but in all of the creative content markets, from video to journalism.

We are looking at a volume of trillions, easily.  There is gold in them thar hills!

On-chain data and search. 

Of course a dApp like this also runs into technical feasibility questions. How can blockchains actually handle data like audio data? Specifically, if the data isn’t actually on chain, could the dApp be said to be decentralized? If it isn’t actually decentralized, what’s the advantage over Spotify or Pandora? 

Further, if the data isn’t on chain, how would users search for songs or podcasts, videos or other content?

Fortunately, a consensus-based system has already demonstrated that it scales well enough to store not only audio data but also video data on chain, and to stream it back from chain in real time. Further, the design of its smart contracting language, rholang, makes it a transactional query language, thereby providing searchability of Web3 data.

Consider the implications of a searchable blockchain.

From the very outset, rholang was designed with the requirements of catering to the data transaction markets by providing query and therefore search capabilities in a decentralized setting. And, if there is one project to have found this kind of solution, eventually, there will be many more. Thus, a decentralized dApp replacing Spotify now becomes technically feasible.

In short, it is eminently practical to get to transaction volumes in the billions with a blockchain architecture within today’s market requirements and market demands.

Digital currency versus utility tokens.

Note that already something is very different in this picture. The role of tokens in a blockchain facilitating data access is network security. It’s not providing a digital currency. Instead, it’s providing protection against denial of service attacks. 

This is a tried and true approach to managing Internet-facing APIs. Companies like Google, Microsoft, Amazon, and others use it in practice to prevent bad actors from calling the APIs in such volume and frequency that no other customer can use the network service to access or update data.

The practice is to issue a randomly generated unique number, a/k/a a token, that can be used as a key to allow for a certain allocation of uses of the service. Smart contracting blockchains that provide a data access and update capability merely upgrade this kind of token to a cryptographic token connected to a consensus protocol. In short, the token is also used to make sure all the network nodes managing the data agree on the data being stored.
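The conventional, pre-blockchain version of this practice can be sketched as follows. This is a minimal in-memory illustration, not any vendor’s actual API: the class `TokenQuota` and its methods are hypothetical names, and the consensus layer that a blockchain adds on top is deliberately left out.

```python
import secrets


class TokenQuota:
    """Hypothetical in-memory API gate: each token carries a fixed allocation of calls."""

    def __init__(self):
        # Maps each issued token to the number of calls it has left.
        self._remaining = {}

    def issue(self, allowed_calls):
        """Issue a randomly generated token good for `allowed_calls` uses."""
        token = secrets.token_hex(16)
        self._remaining[token] = allowed_calls
        return token

    def use(self, token):
        """Spend one call; return False for unknown or exhausted tokens."""
        remaining = self._remaining.get(token, 0)
        if remaining <= 0:
            return False
        self._remaining[token] = remaining - 1
        return True


gate = TokenQuota()
token = gate.issue(allowed_calls=2)
print([gate.use(token) for _ in range(3)])  # [True, True, False]
print(gate.use("not-a-real-token"))         # False
```

A caller who exhausts the allocation is simply refused, which is what keeps a flood of requests from one party from crowding out everyone else.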

The value of a utility token guaranteeing the security of the network in this fashion is secured by the use of the network to store, access, and presumably compute over the data on the network. It is not “magic Internet money”, like Bitcoin, or Dogecoin, or any other ice cream coin that melts at the first sign of a bear market. It is an information utility, sharing more in common with electricity than gold.

But there’s a lot more to understand about the big-picture economics when we consider networks at scale: how to achieve that scale, and what that means for network architecture and go-to-market strategy. Let’s unpack this.

Economics, meet Sharding

The power of the group. 

Think for a moment about Facebook groups or WeChat groups. How many of them are there by comparison to individual users? Perhaps you might think that there are more users than groups. After all, there are billions of people on the planet. 

However, if there are 1.5B Facebook users, then there are at least 2^1,500,000,000 possible groups, assuming people don’t make more than one group with exactly the same members (which, in fact, they often do, making that astronomical figure the minimum).

The number of groups evidently completely dwarfs the number of users and this fact dominates considerations for plans to achieve network scaling, network effects, and network economics.

More generally, for a community of N participants, there are at least 2^N groups. If we are serious about getting to transaction volumes in the billions quickly, it’s very useful to visualize how quickly the population of groups grows versus the population of users.

For a community of only three people, say Alice, Bob, and Carol, there are eight subgroups.

Figure 1
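The eight subgroups can be listed explicitly. The short sketch below enumerates every subset of the three-person community; note that the count of eight includes the empty group and the three single-person groups.

```python
from itertools import combinations

community = ["Alice", "Bob", "Carol"]

# Every subgroup of every size, from the empty group up to the whole community.
subgroups = [
    set(members)
    for size in range(len(community) + 1)
    for members in combinations(community, size)
]

print(len(subgroups))  # 8, which is 2**3
for group in subgroups:
    print(sorted(group))
```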

At this level of detail we don’t really care too much who is actually present in the community. So, let’s abstract the picture a little in order to see how this grows as we add more people.

Figure 2

Now, if we add just one more person to the community, the population of the community is four, but the population of groups is 16.

Figure 3

And, if we add one more person to the community, the population is five, but the population of groups is 32.

Figure 4
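The doubling shown in the figures can be tabulated directly. The sketch below prints the group count for a few small communities and then, since the number itself is far too large to print, estimates how many decimal digits the group count for 1.5B users would have. The helper name `group_count` is ours.

```python
import math


def group_count(population: int) -> int:
    """Number of distinct subgroups (subsets) of a community of the given size."""
    return 2 ** population


for population in (3, 4, 5, 10, 20, 30):
    print(f"{population:>2} people -> {group_count(population):,} groups")

# 2**1_500_000_000 is too large to print, so estimate its length in decimal digits instead.
facebook_users = 1_500_000_000
digits = math.floor(facebook_users * math.log10(2)) + 1
print(f"2^{facebook_users:,} has about {digits:,} decimal digits")
```

Each added person doubles the number of possible groups, so by 30 people the groups already outnumber the users by a factor of tens of millions.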

Network architecture and Network evolution. 

These pictures are useful not only for seeing how the network of groups grows as a function of the size of the community population, but also for understanding how a real public network comes into being.

A public network does not spring up overnight. That’s not how the Internet was created. It was built through the development of lots of private networks (“intranets”). Then the value of connectivity between the subnetworks of the emerging public network – the ability to communicate between, rather than only within, an organization – overwhelmed the security concerns associated with being connected. 

The reason for this, again, resides in the relationship between the size of the population of a community versus the size of the population of the groups. To wit, the cost of coordinating 2^N people quickly becomes prohibitive.

Yet if we break the populace into groups, coordinating the smaller groups is much more manageable. This is, in practice, how actual networks form.

To be perfectly clear, a flat public network of 2^N participants costs vastly more, and takes more time to create, than letting N shards form and then letting them connect. 

It’s common sense.  Nations are made up of provinces or states, which are made of counties, which comprise cities, towns and villages, which comprise people. The geopolitical organization of most nation states arises from the relationship between coordination costs and concurrency afforded by autonomy. 

In other words, Houston operates independently of Seattle.  Seattle operates independently of Houston. If they had to coordinate, nothing would get done. Houston does Houston things, Seattle does Seattle things. Texas does Texas things. Virginia does Virginia things. 

The connections between the states and the federal government provide the backbone of a small-world network. States upload to the federal level, the federal level downloads to the states, and the states form their own shards. The US is sharded into states.

This is much, much less expensive than attempting to create a flat network of the total population. This mathematical reality shows up in nature as well as human networks. More on that later.
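One rough way to see the cost difference is to count pairwise coordination links. The sketch below assumes that coordination cost is proportional to the number of pairs that must be able to talk to each other, that shards are equal in size, and that each shard sends one representative to a fully connected backbone. These are simplifying assumptions of ours, not a model taken from the text, and the function names are hypothetical.

```python
def flat_links(members: int) -> int:
    """Pairwise links needed if everyone coordinates with everyone else."""
    return members * (members - 1) // 2


def sharded_links(members: int, shard_size: int) -> int:
    """Links needed when members coordinate only within their shard,
    plus one representative per shard on a fully connected backbone.

    Assumes `members` divides evenly into shards of `shard_size`.
    """
    shards = members // shard_size
    within_shards = shards * flat_links(shard_size)
    backbone = flat_links(shards)
    return within_shards + backbone


population = 1_000_000
flat = flat_links(population)
sharded = sharded_links(population, shard_size=1_000)
print(f"flat:    {flat:,} links")
print(f"sharded: {sharded:,} links")
print(f"ratio:   {flat / sharded:,.0f}x")
```

Under these assumptions, a million participants split into a thousand shards need roughly a thousandth of the links that a flat network would.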

Understanding both the architecture of computer and human networks and their evolutionary development shows why sharding is essential. The central point of this design for sharding is to allow for organic growth of a public network through the creation of many shards, so that the public network can emerge through the connection of the shards to higher-level coordinating networks.

This approach to the market is unequivocally the fastest route to a public network enjoying the transaction volumes contemplated to achieve unprecedented profitability because it solves for maximal concurrency in the development of the network, and thus minimum time.

Of course, groups are not the same as networks. Networks, almost by definition, enjoy connections between the members that represent communication capabilities, such as being able to send data (like emails, DMs, and documents) or funds. When we are talking about the sorts of computing networks making up the subnetworks of the Internet, these communication capabilities give rise to the transaction volumes necessary to scale a massively profitable blockchain network.

So, it becomes important to understand how much communication happens between subnetworks and where in the hierarchy of networks the bulk of the communication happens.

Sharding, meet Economics

The Dunbar number versus small-world networks. 

This question of where the bulk of communication happens in a sharded network is best answered in the context of data regarding actual networks. Intuitively, there will be a balance of overlap versus autonomy between shards before there is a need for cross-shard communication. 

If there is a great deal of overlap, the likelihood is that the bulk of the communication occurs in one shard or the other. If, on the other hand, there is little overlap, communication is unlikely, as there are too few contact points.

Cross-shard communication becomes effective when there is a balance between overlap and autonomy. There must be an interlocking or interpenetrating structure amongst the shards to create the demand for cross-shard communication.

In human networks the amount of interpenetration is correlated with the Dunbar number, which is to say that most people maintain about 150 active relationships. Specifically, these active relationships determine how much overlap and independence there is in subnetworks of a given population.

In terms of the diagrams above, there is very little demand for cross-shard communication at the layer just above the bottom of the lattice, because there is no overlap in those network nodes. Likewise, it would appear that there is not as much need for cross-shard communication near the top of the lattice. 

If this were actually true then the mainnet where all the shards roll up would not be the place to expect the resolution of cross-shard communication. Instead, somewhere in the layers above the middle of the lattice, but below the top would be where we would expect the maximum occurrence of cross-shard communication. 

Thus, placing bets on validation at that level of the network would be the right economic move.

What this doesn’t take into account is the “small-world network phenomenon.” Prof. Strogatz and others have shown that actual networks, from the design of physical computing devices to livestock distribution, tend to enjoy a backbone of communicators linking subnetworks one level down. Here are two examples illustrating the point.

Figure 5

These network shapes are typical of what we see in actual network development, whether they arise from a human design process or natural selection. One way to see this is to look at the most common domain name structure: it is three levels deep, indicating a backbone, middle layer, and leaves, just like we see in the networks in Figure 5, above.

Technically, a small-world network is defined as a network in which most nodes are not neighbors of one another, but the neighbors of any given node are likely to be neighbors of each other and most nodes can be reached from every other node by a small number of hops or steps. Specifically, a small-world network is defined to be a network where the typical distance between two randomly chosen nodes (the number of steps in a path between them) grows proportionally to the logarithm of the number of nodes in the network. 

As summarized in the Wikipedia article on the subject, small-world properties are found in websites with navigation menus, food webs, electric power grids, metabolite processing networks, networks of brain neurons, voter networks, telephone call graphs, and airport networks. Typical small-world networks, i.e. networks satisfying this path length to population size property, are highly clustered, with “short cuts” between the clusters. In other words, shards and a backbone.
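The effect of a few “short cuts” on typical distance can be seen in a small experiment. The sketch below builds a ring in which each node is linked only to its nearest neighbours, measures the average number of hops between nodes, then adds a handful of random long-range links and measures again. It is a simplification in the spirit of the Watts-Strogatz small-world model (it adds links rather than rewiring them); the network size, the number of shortcuts, and all function names are illustrative choices of ours.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbours on each side of a ring."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def add_shortcuts(adj, count, rng):
    """Add `count` random long-range links between nodes not yet connected."""
    nodes = list(adj)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            added += 1


def average_path_length(adj):
    """Mean number of hops over all ordered pairs, using a breadth-first search per node."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


rng = random.Random(42)  # fixed seed so the run is repeatable
network = ring_lattice(n=200, k=2)
plain = average_path_length(network)
add_shortcuts(network, count=20, rng=rng)
with_shortcuts = average_path_length(network)

print(f"local links only:  {plain:.1f} hops on average")
print(f"with 20 shortcuts: {with_shortcuts:.1f} hops on average")
```

Adding links can never lengthen a shortest path, so the second figure is always lower; the point of the experiment is how much lower it gets for how few extra links.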

This curious fact about the design and evolution of effective networks underlies the requisite blockchain design and go-to-market strategy. Allowing the creation of shards minimizes coordination cost and time to network design while maximizing the utilization of the utility tokens.

The utility token is the backbone in a “small-world network” of blockchains

The purpose of the mainnet of such a sharded network is to serve as the backbone in a small-world network of shards. This architecture maximizes the utilization of the tokens and achieves the maximum utilization in the shortest amount of time. 

When we consider a small-world network of dApps that decentralize the existing online services, from Spotify to Google Maps to WeChat, we are looking at hundreds of billions of cross-shard transactions on a monthly basis, to say nothing of the dApps that people will create once they understand the extraordinary possibilities this creates.

A billion here, a billion there, pretty soon you’re talking real money.

The enterprise’s role in jumpstarting a small-world network of blockchains. 

Of course, the market has to be convinced to build those dApps, and this is where enterprise development strategy is crucial. The enterprise sector understands this kind of small-world network in a way that the B2C markets don’t. 

Consider soft supply chain management, by which we mean supply chain management in non-mission-critical markets, such as newsstand distribution of magazines and newspapers.

Risk to transaction volume. 

A brief digression to understand the transaction risk to transaction volume in the evolution of small-world networks is in order. 

In the case of Spotify, no one dies if a song is not delivered to a user. Likewise, no one dies if a pallet of magazines doesn’t make it to the corner shop on time, or if the latest paperback romance doesn’t hit the bookstore on the high street until next week.

These are soft distribution markets because the risk to consumers of the failure of a transaction or even a whole cycle of transactions is low. 

Financial market transactions, by contrast, are high risk. The failure of a single financial transaction, if sufficiently large, can be catastrophic. The failure of one day’s cycle of smaller financial transactions also can ruin the provider and cascade destruction into the world financial system.

Networks that develop critical transmission-of-value capabilities in high-volume low-risk contexts can take over high-risk markets much more effectively than the other way around, because failures due to scale are much more frequent than failures due to transaction content. 

For example, consider what Google did with search. Google built a giant computer network making the Internet searchable. When you build a network to serve hundreds of millions of search requests per day, that network is robust. Its users are hitting all the possible failure modes all the time. 

The network doesn’t care what the content of the search is. It cares that it is serving up each search result in a speedy fashion, not fumbling the delivery of the search result.

This wisdom was one of the key takeaways from Web2. Nobody dies if Google’s search function suffers a bug. Nobody dies if Gmail has a glitch. Nobody dies if a Facebook or Twitter post gets dropped on the floor.

Rolling out a search service or a social network to a billion people creates a much more robust data transmission network than building a bulletproof network for a few hundred people with a transaction volume in the thousands per day.

A concurrency-based blockchain enterprise strategy builds on these ideas. Soft physical distribution networks, like print media, understand small-world networks. For example, a print media wholesaler provides to multiple chains, like supermarket and convenience store chains, which in turn have individual stores, each with a high degree of autonomy. 

This means that managing the supply chain from end to end crosses trust boundaries. That is the ideal use for a blockchain. A blockchain with the necessary scaling characteristics, such as RChain, can handle the transaction volume. Likewise, it can allow each subnetwork to develop its own shard, thus allowing maximal concurrency in the creation of an emerging public network.

Once each of these shards is working, the value of cross-shard communication only increases. For example, if a supermarket in one region is out of stock of a hot-selling tabloid, it is to both parties’ advantage if others can step in to fill the shortfall.

This becomes immediately possible in the sort of small-world network proposed. More to the point, the enterprise already gets these uses… and many others. This is effectively going on already by hook and by crook. A “small-world network” architecture just formalizes these capabilities. In doing so it maximizes the utility, and value, of the associated tokens.

Conclusion and future work

If blockchains are to realize their potential as a key coordination technology, they must be designed from the ground up to fit actual network architecture and evolution. That is, they must be suited to small-world networks. 

The proposed sharded architecture, now working in practice and not just in theory, is designed to support the emergence of large-scale public networks in the shortest possible time, maximizing the utilization of the utility token as the backbone of a small-world network of blockchain shards.

To reiterate, we have created a robust decentralized small-world network for supply chain, one that is resilient in the face of the systemic supply chain failures almost certain to result from disasters, including those associated with climate change as well as other geopolitical and force majeure contingencies.

In other work we have cataloged the challenges facing supply chain management. Preparing for them creates very lucrative and socially relevant business opportunities. This strategy is in fact the fastest way to get engagement on a utility token in a large-scale public network, creating epic new investment opportunities at least as lucrative as those of the market leaders of Web 2.0.

On beyond Google!

A billion here, a billion there, pretty soon you’re talking real money.

© 2022 Lucius Gregory Meredith and Ralph Benko

Lucius Gregory (Greg) Meredith is president and founder of RChain Cooperative. Greg is a mathematician, the discoverer of the rho-calculus, a co-inventor of OSLF (Operational Semantics in Logic Form), and the inventor of the ToGL approach to graph theory.

Ralph Benko, a former White House official, is an advisor to RChain Cooperative, co-founder and general counsel of New Pavilion, an NFT consultancy, senior counselor to The American Blockchain PAC, and co-author of Redefining the Future of the Economy: governance blocks and economic architecture.