As a community that likes blockchain, we talk about putting lots of things on the blockchain. Blockchain offers advantages of durability, security, and decentralization. Users can control what belongs to them, without needing the trust or approval of a central authority or third party. So it makes sense that this could be useful for things beyond simply records of money changing hands. In other articles we’ve talked about using blockchain for things like contracts, deeds, educational records, identity documents, and art, in the form of NFTs.
There is something important to note, however, about using a blockchain to track larger files like big documents or NFTs: More often than not, those files are NOT directly on the blockchain, locked into a block for all time. Not the same way a financial transaction record is, anyway.
First of all, why not?
The quick and easy answer is file size. The nature of a blockchain is that everything has to be synced among participating nodes. If the amount of data in a block were extremely large, syncing blocks would take much longer, giving attackers a bigger window to work with and thus compromising security. The file size of a high-quality image, for example, isn’t just a little bit bigger than a simple transaction record - it’s many, many times bigger. Ultimately, large and uncontrolled block sizes would hurt the speed and scalability of the chain as well as its security.
Every blockchain defines its own limit for block size. On some chains, the block size is big enough that you actually could fit somewhat larger files into a block. The limit isn’t about whether it’s a transaction record, or an NFT, or anything else. They’re all just 1s and 0s at some point! The real limit is the size of that data. Cardano actually has a smaller block size limit than some. One of the reasons for that decision was to specifically encourage and enforce the idea that not everything should go directly ON the blockchain.
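To get a feel for the size gap, here is a rough back-of-the-envelope sketch in Python. All three numbers are made-up assumptions for illustration, not the real parameters of Cardano or any other chain.

```python
# Illustrative numbers only; real limits differ by chain and change over time.
BLOCK_LIMIT_BYTES = 90_000     # assumed maximum block size
TX_RECORD_BYTES = 300          # assumed size of a simple transaction record
IMAGE_BYTES = 5_000_000        # assumed size of a high-quality image


def blocks_needed(payload_bytes: int, block_limit: int = BLOCK_LIMIT_BYTES) -> int:
    """Minimum number of completely full blocks needed to hold the payload."""
    # Integer ceiling division: round up without using floats.
    return (payload_bytes + block_limit - 1) // block_limit


print(blocks_needed(TX_RECORD_BYTES))        # 1
print(blocks_needed(IMAGE_BYTES))            # 56
print(round(IMAGE_BYTES / TX_RECORD_BYTES))  # 16667
```

Under these assumptions, one image would fill 56 entire blocks by itself, and it is roughly 16,667 times the size of a single transaction record.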
Then how does it work?
So if your valuable credential, NFT, or document isn’t itself locked in a block, where is it? And if it’s not in a block, what does it then mean to put these things on a blockchain?
What goes on the block is not the data of your file, but the metadata - data that describes the who, what, where, and when of the file. Sometimes, along with the metadata, a hash function is used to create a small, fixed-size fingerprint of the bigger raw data. That fingerprint is stored on chain as an attestation: anyone who has the file can hash it again and check that the result matches.
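Here is a minimal sketch of that idea in Python. It assumes SHA-256 as the hash function, and the field names and the storage location string are hypothetical, chosen for illustration rather than taken from any real NFT metadata standard.

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Small, fixed-size representation of the raw data (SHA-256, as hex)."""
    return hashlib.sha256(data).hexdigest()


def build_record(data: bytes, name: str, owner: str, location: str) -> dict:
    """Metadata that could go on chain in place of the file itself.

    The field names here are hypothetical.
    """
    return {
        "name": name,
        "owner": owner,
        "location": location,
        "size_bytes": len(data),
        "sha256": fingerprint(data),
    }


def matches_record(data: bytes, record: dict) -> bool:
    """Re-hash the file and compare it with the stored fingerprint."""
    return fingerprint(data) == record["sha256"]


# Stand-in bytes for a large image file.
image = b"pretend these are the bytes of a large image" * 1000
record = build_record(image, "Sunset", "alice", "example-storage://sunset.png")

print(json.dumps(record, indent=2))
print(matches_record(image, record))         # True
print(matches_record(image + b"!", record))  # False
```

The record stays tiny no matter how big the file is, and changing even one byte of the file makes the check fail.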
BUT WHERE IS IT??
The value of putting something on-chain isn’t just knowing that it’s secured. In the case of a pretty JPG, for example, you might actually want to see it!
It’s hypothetically possible that the JPG could just be stored in the cloud somewhere, with the on-chain metadata pointing to its location. However, traditional cloud storage is centralized: a single provider controls the file, and it can be changed, moved, or deleted at any time. So it should come as no surprise that the builders behind blockchains have turned to new ways to solve this problem. The two most commonly used are IPFS and Arweave.
The InterPlanetary File System (IPFS)
IPFS is a free, distributed, peer-to-peer file sharing system. Thousands of participants around the world share the task of storing data, and producing that data when it’s requested. Every time a file is requested, the node that fetches it keeps a copy, which it can then serve to others. Files that get requested a lot are always readily available, with fresh copies stored on numerous nodes. A file that is regularly requested on IPFS could stay available indefinitely.
However, the storage available on any single IPFS node does have a limit. When a node reaches its capacity limit, it purges files that it hasn’t been told to keep, starting with the ones nobody has asked for lately. If a file on IPFS is never requested, all copies of it can eventually disappear from the network. To deal with that, companies now offer what are called “pinning services.” For a fee, they take on the responsibility for ensuring your file on IPFS remains safe from purging. If you want to buy an NFT or use a service that is sharing files on IPFS, you might want to know if that project is paying for pinning, or if it’s up to you to make sure your files stay fresh.
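The purge-and-pin behavior can be pictured with a toy model in Python. This is a simplified sketch, not how IPFS actually implements storage: the `ToyNode` class, its capacity units, and its "least recently requested goes first" rule are all assumptions made for illustration.

```python
from collections import OrderedDict


class ToyNode:
    """Toy storage node: limited capacity, pinned files are never purged."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.files = OrderedDict()  # name -> size, least recently requested first
        self.pinned = set()

    def used(self) -> int:
        return sum(self.files.values())

    def request(self, name: str, size: int) -> None:
        """Keep a copy of a requested file and mark it as freshly used."""
        self.files[name] = size
        self.files.move_to_end(name)
        self._purge()

    def pin(self, name: str) -> None:
        self.pinned.add(name)

    def _purge(self) -> None:
        """Drop unpinned files, stalest first, until the node is within capacity."""
        for name in list(self.files):
            if self.used() <= self.capacity:
                break
            if name not in self.pinned:
                del self.files[name]


node = ToyNode(capacity=100)
node.request("pinned.jpg", 40)
node.pin("pinned.jpg")
node.request("forgotten.jpg", 40)
node.request("popular.jpg", 20)
node.request("popular.jpg", 20)  # requested again, so it stays fresh
node.request("new.jpg", 40)      # pushes the node over its limit

print(sorted(node.files))  # ['new.jpg', 'pinned.jpg', 'popular.jpg']
```

In this run, "pinned.jpg" is the stalest file on the node but survives because it is pinned, while "forgotten.jpg" is purged to make room.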
Arweave
Arweave is a solution that addresses the durability problem. Arweave uses a structure that is much like a blockchain. Unlike IPFS, node operators are directly incentivized, like the node operators of any blockchain. Naturally then, using Arweave isn’t free. However, it is very low cost, and the cost is predictable. Arweave also provides a gateway for easily viewing your files, which is useful. Lots of blockchain projects use Arweave for file storage, because for a fixed, reasonable cost, you are assured of secure storage that should last a very long time.
What about privacy??
If your file is floating around in a peer-to-peer file sharing network, or on a public blockchain, that means anyone can see it. If it’s a plain .jpg, they can look at it with any image viewer. If it’s a .doc or .txt file, they can read it. It is possible to use these systems with “unsecured” files of these types; indeed, they can be used specifically for sharing files in a way that is intentionally public. However, if privacy, copyright protection, or exclusivity is a concern, files can be encrypted before putting them on IPFS or Arweave.
Weighing the options
IPFS is free to use, has been around for a while, has the most ecosystem tooling, and is widely compatible with existing systems. However, it’s less durable - or you have to pay for pinning to make it durable.
Arweave offers a truly blockchain-like solution, with very high durability for a low fixed cost. However, the future of Arweave is unknown. If the economics of the treasury and node incentives don’t work out the right way, the costs could someday change, or the viability of the whole enterprise could come into question. For now, however, it seems like a good quality solution for long-term, decentralized file storage.
For the “Day at the Lake” NFT project here at Lido Nation, we chose Arweave. We made that decision based on our preference, in this case, for great durability for a fixed, affordable cost. It’s possible that we would choose IPFS for a different project. If a project simply didn’t require long-term durability, we would use IPFS because it’s free. Similarly, if there were a guarantee of ongoing active interest in the files, IPFS could be both durable and free, and would be a good choice.
In the future, the landscape of durable, decentralized file storage will continue to evolve. Arweave and IPFS may grow, or shrink, and new solutions may emerge. Being aware of the factors and considerations described here can help you better understand the projects you choose to participate in.