The InterPlanetary File System (IPFS) is a powerful technology that has, unfortunately, acquired a negative reputation through its association with a wave of low-quality cryptocurrency projects. When we created Macula.Link, we made a conscious decision to use IPFS as the storage engine, taking advantage of its benefits in a secure, controlled environment that is not connected to the global network.

Macula.Link uses a private IPFS network to provide reasonable redundancy, content versioning, and deduplication.

IPFS: ELI5

Alice keeps a photo of her beloved cat on her personal website so that all her friends can go there and see it. One day she forgets to pay a monthly fee, and an evil hosting provider deletes the website completely. This makes Alice very sad because she wanted everyone to always be able to see her cat. To preserve these precious photographs, Alice decides to upload them to IPFS.

In IPFS, thousands of computers take Alice’s photo, split it into many small chunks, and share copies of these chunks with each other. The photo itself gets a unique name, a long string of letters and numbers called a hash. Now, instead of the link to the website, Alice shares this hash with her friends. Knowing the hash, they can go to any computer in IPFS and ask for the image they want to see. If the computer has all the chunks, it assembles them and gives back the photo. If not, it gives the chunks it has and the address of a computer that has the missing chunks.

No evil hosting provider can delete Alice’s photos now because thousands of computers keep them and are ready to give them to anyone who asks. The owners of these computers decide to keep her photos voluntarily because they, like Alice, enjoy cat pictures and want to help others preserve them. Unless each and every computer decides to purge them, Alice can be sure that her photographs will be available for many years to come.

What is IPFS? (Explanation for Grown-Ups)

IPFS is a protocol and peer-to-peer network for storing and sharing files in a decentralized way.

In a traditional client-server approach, the client sends a request to a single server or a group of servers, all managed by the owners of the resource, platform, or website they are communicating with. When making a request, the client must know the site address and the path to the content (e.g., https://example.com/cat.jpg). Obviously, if the client tries to get the same content from a different server (e.g., https://example.org/cat.jpg), they will either get a different result or nothing.

In IPFS, there is no central server that a client can refer to. Instead, there are many independent nodes that may or may not store the requested content. Once the file is uploaded to IPFS, it gets a unique identifier based on its contents. With this identifier, the client can get the file from anyone who keeps it or assemble it from several peers, similar to BitTorrent.
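
The idea of assembling a file from several peers can be illustrated with a minimal sketch. It is not how an IPFS node is implemented: the peers are plain dictionaries, the chunk size is tiny on purpose, plain SHA-256 hex digests stand in for real identifiers, and the names `split_into_chunks`, `fetch`, `peer_a`, and `peer_b` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose; an illustrative value only


def digest(data: bytes) -> str:
    """Identify a chunk by the SHA-256 hash of its bytes."""
    return hashlib.sha256(data).hexdigest()


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    return [data[i:i + size] for i in range(0, len(data), size)]


def fetch(manifest: list[str], peers: list[dict[str, bytes]]) -> bytes:
    """Assemble content from whichever peers hold each chunk."""
    parts = []
    for chunk_id in manifest:
        for peer in peers:
            chunk = peer.get(chunk_id)
            # Accept a chunk only if it hashes to the identifier we asked for.
            if chunk is not None and digest(chunk) == chunk_id:
                parts.append(chunk)
                break
        else:
            raise LookupError(f"no peer holds chunk {chunk_id[:8]}")
    return b"".join(parts)


photo = b"a photo of a cat"
chunks = split_into_chunks(photo)
manifest = [digest(c) for c in chunks]  # ordered list of chunk identifiers

# Two hypothetical peers, each holding only part of the file.
peer_a = {digest(c): c for c in chunks[:2]}
peer_b = {digest(c): c for c in chunks[2:]}

assert fetch(manifest, [peer_a, peer_b]) == photo
```

Neither peer holds the whole file, yet the client can rebuild it, and because every chunk is checked against its identifier, it does not matter which peer supplied it.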

IPFS Building Blocks

Content Addressing

Content addressing in IPFS is based on cryptographic hashing: a hash of the file's contents acts as its digital fingerprint. This fingerprint is called a Content Identifier (CID). When you upload a file, a CID is generated, and that CID is then used to find the file regardless of where it is stored.

This also means that content is immutable: if you upload the same file with even the slightest modification (such as one changed character in a document or one changed pixel in an image), it gets a different hash. Luckily, IPFS supports content versioning, so you can link successive versions of your content together, much as version control systems such as Git, Subversion, or Mercurial do.
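
Both points can be shown in a few lines. The sketch below is an assumption-laden illustration in the spirit of Git, not a description of how IPFS stores versions: plain SHA-256 hex digests stand in for CIDs, and the `make_version` helper and its JSON record layout are hypothetical.

```python
import hashlib
import json
from typing import Optional


def fingerprint(data: bytes) -> str:
    """Stand-in for a CID: the SHA-256 hex digest of the bytes."""
    return hashlib.sha256(data).hexdigest()


def make_version(content: bytes, parent: Optional[str]) -> tuple[str, bytes]:
    """Return (identifier, record) for a version that points at its parent."""
    record = json.dumps(
        {"content": fingerprint(content), "parent": parent},
        sort_keys=True,  # stable key order, so equal records hash equally
    ).encode()
    return fingerprint(record), record


v1_id, v1 = make_version(b"Hello, cat!", parent=None)
v2_id, v2 = make_version(b"Hello, cat?", parent=v1_id)

# One changed character gives a different identifier...
assert fingerprint(b"Hello, cat!") != fingerprint(b"Hello, cat?")
# ...and the newer record names the older one, forming a history chain.
assert json.loads(v2)["parent"] == v1_id
```

Each version stays immutable, while the `parent` field lets a reader walk back from the latest version to every earlier one.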

CIDs also let you verify file integrity and make deduplication possible. If the CID you calculate for a file does not match the one you expect, the file has been modified. Likewise, identical content yields the same CID, so if you try to upload a file that a node already holds, the node saves space by not storing it a second time.
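
A minimal sketch of both properties, assuming an in-memory store keyed by a SHA-256 hex digest. Real CIDs wrap the hash with extra metadata and use a different encoding, and the `ContentStore` class is hypothetical rather than part of any IPFS library.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived from the data."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Deduplication: identical data maps to the same key, stored once.
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        data = self._blocks[key]
        # Integrity check: recompute the hash and compare it with the key.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its identifier")
        return data

    def __len__(self) -> int:
        return len(self._blocks)


store = ContentStore()
first = store.put(b"a photo of a cat")
second = store.put(b"a photo of a cat")  # same bytes added again

assert first == second   # same content, same identifier
assert len(store) == 1   # only one copy is kept
assert store.get(first) == b"a photo of a cat"
```

Because the key is computed from the data itself, a reader never has to trust whoever served the bytes: recomputing the hash is enough to detect tampering.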

You can learn more about content addressing in IPFS here.

Distributed Hash Table

The Distributed Hash Table (DHT) is a decentralized key-value store that allows nodes in the IPFS network to find and share information about where data is located and which peers store it. When a node wants to retrieve a piece of data from the network, it queries the DHT with the content's hash to find out which peers hold it.
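
As a rough sketch of that key-value role, the toy below spreads "who has this content" records over several nodes. Everything here is an assumption made for illustration: the `ToyDHT` class, the peer addresses, and especially the modulo rule that picks the responsible node, which is a placeholder and not the routing a real DHT uses.

```python
import hashlib
from collections import defaultdict


class ToyDHT:
    """Provider records spread over several nodes (placeholder routing)."""

    def __init__(self, node_count: int) -> None:
        # One record table per node: content hash -> set of peer addresses.
        self.nodes = [defaultdict(set) for _ in range(node_count)]

    def _responsible(self, content_hash: str):
        # Placeholder rule; a real DHT routes by distance between IDs.
        return self.nodes[int(content_hash, 16) % len(self.nodes)]

    def provide(self, content_hash: str, peer: str) -> None:
        """Announce that `peer` stores the content with this hash."""
        self._responsible(content_hash)[content_hash].add(peer)

    def find_providers(self, content_hash: str) -> set[str]:
        """Return the peers known to store the content (possibly none)."""
        return set(self._responsible(content_hash).get(content_hash, set()))


dht = ToyDHT(node_count=4)
key = hashlib.sha256(b"a photo of a cat").hexdigest()
dht.provide(key, "peer-a.example:4001")  # hypothetical peer addresses
dht.provide(key, "peer-b.example:4001")

print(dht.find_providers(key))
```

Note that the table stores only pointers to peers, never the content itself; the file is fetched from those peers afterwards.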

IPFS uses a DHT based on the Kademlia protocol. Kademlia measures the distance between two IDs (hash values) as their bitwise XOR, which groups nodes into a structure similar to a binary tree, with each level corresponding to a particular number of prefix bits that the IDs share.
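
The relationship between distance and shared prefix bits can be worked through with small numbers. This is a sketch of Kademlia's XOR metric only, assuming 8-bit IDs for readability (real networks use much longer ones); the node IDs and the function names are made up for the example.

```python
ID_BITS = 8  # assumption for readability; real IDs are much longer


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: the bitwise XOR of two IDs, read as a number."""
    return a ^ b


def shared_prefix_bits(a: int, b: int) -> int:
    """Number of leading bits two IDs have in common."""
    return ID_BITS - xor_distance(a, b).bit_length()


def closest(target: int, node_ids: list[int], k: int = 2) -> list[int]:
    """Return the k node IDs nearest to the target by XOR distance."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


nodes = [0b00010110, 0b00011001, 0b10100000, 0b11110000]
target = 0b00010000

# 00010110 and 00010000 agree on their first five bits.
assert shared_prefix_bits(0b00010110, target) == 5
# 10100000 and 00010000 differ in the very first bit.
assert shared_prefix_bits(0b10100000, target) == 0
# The longer the shared prefix, the smaller the XOR distance.
assert closest(target, nodes) == [0b00010110, 0b00011001]
```

A lookup repeatedly asks the closest nodes it knows about for nodes that are closer still, so each step extends the shared prefix until the nodes responsible for the key are reached.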

You can learn more about the specifics of Distributed Hash Tables in IPFS here.