IPFS

General Architecture


While this course will mainly focus on IPFS, a number of other solutions have been proposed for blockchain-based and decentralized storage. Their general architecture is quite similar, so it helps to review how these systems operate before comparing them.

Figure: How blockchain storage works.
Components: A: Data; B: Shards; C: Encrypted Shards; D: Hashes; E: Blockchain Ledger; F: Distributed Nodes with Synced Ledger.
Steps: 1: Shard Data; 2: Encrypt Shards; 3: Generate Hashes; 4: Replicate Shards; 5: Distribute Shards; 6: Record Transactions.
Because a blockchain must be stored on every node of the network, storing data on it is expensive. As a result, hashes are used as a compact fingerprint of each piece of a file, and the pieces themselves can then be safely distributed to storage nodes without risk of undetected substitution or compromise.
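
As a rough illustration of why a hash is cheap to record yet still protects a piece of data, the sketch below fingerprints a large piece and then checks a tampered copy against the recorded value. SHA-256 and the names `fingerprint` and `on_chain_record` are assumptions chosen for the example; they are not taken from any particular storage network.

```python
import hashlib


def fingerprint(piece: bytes) -> str:
    """Return the SHA-256 digest of a piece as a hex string."""
    return hashlib.sha256(piece).hexdigest()


# A 1 MB piece that would live on a storage node, not on the ledger.
piece = b"x" * 1_000_000

# Only this short value would need to be recorded on the ledger (assumption).
on_chain_record = fingerprint(piece)

# The digest has a fixed size (64 hex characters) no matter how large the piece is.
digest_length = len(on_chain_record)

# A storage node that swaps even one byte produces a different fingerprint.
tampered = piece[:-1] + b"y"
original_ok = fingerprint(piece) == on_chain_record
tampered_ok = fingerprint(tampered) == on_chain_record

print(digest_length, original_ok, tampered_ok)
```

Anyone holding the recorded hash can repeat this check without trusting the node that returned the piece.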

 

GENERAL PROCESS


  1. Shard Data

    To optimize storage and retrieval of files, data is broken down into small pieces called shards, each of which can be stored on a different node.

  2. Encryption

    Each shard is encrypted before it is stored publicly, so that the nodes holding it cannot read its contents.

  3. Hashing

    To track each shard or file, a hash is generated that uniquely identifies each unit of storage. This hash is later used to verify that nodes are storing the correct files at the correct addresses.

  4. Replication

    In a decentralized network, it is important to keep multiple redundant copies of each shard. This allows nodes to join and leave the network as they see fit without the file becoming unavailable.

  5. Distribution

    Shards are then distributed across the network to a group of nodes to maximize availability.

  6. Recording

    In a truly decentralized network, a ledger or blockchain can be used to record when and where files are stored, and to ensure that parties are held accountable for breaches of trust.
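
The six steps above can be sketched end to end in a few lines of Python. This is a toy model under stated assumptions, not how any real network is implemented: nodes are plain dictionaries, the ledger is an in-memory list standing in for a blockchain, and `toy_encrypt` is a deliberately simple, insecure placeholder for a real cipher. All function names, `SHARD_SIZE`, and `REPLICAS` are hypothetical.

```python
import hashlib
import secrets

SHARD_SIZE = 4   # bytes per shard; tiny so the example is easy to follow (assumption)
REPLICAS = 2     # copies kept of each shard (assumption)


def shard(data: bytes, size: int = SHARD_SIZE) -> list[bytes]:
    """Step 1: split the data into fixed-size shards."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def toy_encrypt(chunk: bytes, key: bytes, index: int) -> bytes:
    """Step 2: XOR with a SHA-256-derived keystream.

    NOT secure; a placeholder for a real cipher. XOR is symmetric,
    so the same function also decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(chunk):
        block = key + index.to_bytes(8, "big") + counter.to_bytes(8, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(chunk, stream))


def store(data: bytes, key: bytes, nodes: list[dict], ledger: list[dict]) -> None:
    """Run steps 1 to 6 for one file."""
    for index, chunk in enumerate(shard(data)):
        encrypted = toy_encrypt(chunk, key, index)            # 2. encrypt
        digest = hashlib.sha256(encrypted).hexdigest()        # 3. hash
        holders = [(index + r) % len(nodes) for r in range(REPLICAS)]  # 4. replicate
        for node_id in holders:                               # 5. distribute
            nodes[node_id][digest] = encrypted
        ledger.append({"index": index, "hash": digest, "nodes": holders})  # 6. record


def retrieve(key: bytes, nodes: list[dict], ledger: list[dict]) -> bytes:
    """Rebuild the file, accepting a shard only if it matches its recorded hash."""
    parts = []
    for entry in ledger:
        for node_id in entry["nodes"]:
            blob = nodes[node_id].get(entry["hash"])
            if blob is not None and hashlib.sha256(blob).hexdigest() == entry["hash"]:
                parts.append(toy_encrypt(blob, key, entry["index"]))
                break
        else:
            raise LookupError(f"shard {entry['index']} is unavailable")
    return b"".join(parts)


nodes = [{} for _ in range(3)]        # three storage nodes
ledger = []                           # stand-in for the blockchain ledger
key = secrets.token_bytes(32)
data = b"hello decentralized storage"

store(data, key, nodes, ledger)
nodes[0].clear()                      # one node leaves the network
recovered = retrieve(key, nodes, ledger)
print(recovered == data)
```

Because each shard is held by two of the three nodes, the file can still be rebuilt after one node disappears, and a node that returns altered bytes is skipped because its shard no longer matches the recorded hash.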

 

In the next section, we’ll cover in more detail how IPFS implements this functionality.