Storing Data Permanently, as a Website: A Review of Arweave and AR.IO
Rei Sato Oct. 2024
1. Introduction
I am interested in remote storage that meets the following five requirements:
Decentralizing: The service is provided not by a single organization but by a group of independent parties. Unlike with centralized services,
service provision and data storage continue even if a specific actor withdraws. This prevents data loss due to the whims of a service provider.
Storing Over 1 GB of Data: Sufficient capacity to store multiple media files such as images and videos.
Paying with Cryptocurrency: Enhances anonymity.
Maintaining Data for Over 10 Years without Contract Renewal: Data will not be lost even if you lose interest in it or neglect payments.
Publishing Data as a Website: Allows easy distribution of data to a large audience, similar to GitHub Pages.
I have found that Arweave and AR.IO, two blockchain projects, together fulfill these conditions. In Section 2, I review how Arweave and AR.IO meet
the five conditions. In Section 3, I demonstrate how to store data on Arweave and display it via AR.IO.
2. How Arweave and AR.IO Together Meet the Five Conditions
2.1. Decentralizing
Arweave is a peer-to-peer network composed of multiple independent nodes (miners) that store data entrusted to them by users. Your data is not stored in a single
location; instead, it is ideally replicated more than 20 times and distributed across multiple nodes, making it robust against data loss [code1]. To entrust new
data to the network, a user attaches it to a request called a transaction and sends it to the network. Additionally, Arweave's code and specifications are
available as open-source software (OSS), so anyone can join the network as a new miner, which keeps the system decentralized.
2.2. Storing Over 1 GB of Data
Transactions submitted by users can include data, and the size of this data is effectively unlimited, so large binary files can be stored as well. In most
client implementations, large files are divided into small chunks before being uploaded [code2].
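As a rough illustration of the chunking step, the sketch below splits a byte buffer into fixed-size pieces. The 256 KiB chunk size and the function name `splitIntoChunks` are assumptions made for this example; real clients define their own limits and typically also build a Merkle tree over the chunks, which is omitted here.

```javascript
// Assumed chunk size for illustration only; real clients define their own limit.
const CHUNK_SIZE = 256 * 1024;

// Split a Uint8Array into consecutive chunks of at most `chunkSize` bytes.
function splitIntoChunks(data, chunkSize = CHUNK_SIZE) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray() returns a view on the same memory, so no bytes are copied.
    chunks.push(data.subarray(offset, offset + chunkSize));
  }
  return chunks;
}

// Usage: a 290,000-byte file yields two chunks under the assumed chunk size.
const file = new Uint8Array(290 * 1000);
const chunks = splitIntoChunks(file);
console.log(chunks.length, chunks.map((c) => c.length));
```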
2.3. Paying with Cryptocurrency
The user's payment, the transaction fee, is required only once, when submitting a transaction. The fee is paid in Arweave's token, AR, a
cryptocurrency that can be obtained on exchanges such as Bybit. By default, the official client pays the minimum transaction fee [code3][api1]. Note that the
minimum transaction fee depends on the size of the attached data, and transactions that offer less than this fee are rejected, so submitting larger files
requires higher payments.
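To make the size-dependent fee concrete, here is a minimal sketch of how the minimum fee could be looked up and converted to AR. It assumes the gateway serves the `GET /price/{bytes}` endpoint of the Arweave HTTP API and answers with an amount in winston (1 AR = 10^12 winston); the function names are hypothetical, and the network call is defined but not executed here.

```javascript
// 1 AR = 10^12 winston (the smallest unit of AR).
const WINSTON_PER_AR = 10n ** 12n;

// Convert a non-negative winston amount (string or BigInt) to a decimal AR string.
function winstonToAr(winston) {
  const w = BigInt(winston);
  const whole = w / WINSTON_PER_AR;
  const frac = (w % WINSTON_PER_AR).toString().padStart(12, "0");
  return `${whole}.${frac}`;
}

// Ask a gateway for the minimum fee for `bytes` bytes of data.
// Assumption: the host serves GET /price/{bytes} and replies with winston as plain text.
async function fetchMinimumFee(bytes, host = "https://arweave.net") {
  const res = await fetch(`${host}/price/${bytes}`);
  if (!res.ok) throw new Error(`price request failed: HTTP ${res.status}`);
  return winstonToAr((await res.text()).trim());
}

// Offline example: 400,000,000 winston corresponds to 0.0004 AR.
console.log(winstonToAr("400000000"));
```

Calling `fetchMinimumFee(290000)` would return the current minimum fee for a file of that size, provided the gateway is reachable and behaves as assumed.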
2.4. Maintaining Data for Over 10 Years without Contract Renewal
Reward Emission.
As previously mentioned, the user pays only once, when submitting a transaction. But do miners have an incentive to store data for over 10 years after
only a one-time payment from users?
To receive rewards, miners need to bundle up to 1,000 user-submitted transactions into a block and have it approved by the network. The
task of generating new blocks is called mining, and a new block is generated approximately every 2 minutes across the network. The reward for the miner who
generates the block is calculated to exceed
    total data size on the network (GiB) × storage cost per GiB per minute × average block generation time (2 minutes).
Therefore, the total rewards paid to miners are guaranteed to exceed the overall storage cost of the entire network at any given moment. Note that the storage
cost of keeping 1 GiB of data for 1 minute is estimated from the miner rewards of the last 30 days, taking the AR/USD exchange rate into account
[p2023][spec2.6][code4][code5].
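The lower bound on the block reward is a simple product, which the following sketch spells out. The weave size and the per-GiB-minute cost used below are hypothetical numbers chosen only to show the arithmetic; they are not measurements of the real network.

```javascript
// Lower bound on the reward for one block:
// total data size (GiB) x storage cost per GiB-minute x block time (minutes).
function minimumBlockReward(totalGiB, costPerGiBMinute, blockTimeMinutes = 2) {
  return totalGiB * costPerGiBMinute * blockTimeMinutes;
}

// Hypothetical inputs, for illustration only.
const assumedWeaveSizeGiB = 200 * 1024; // assume a 200 TiB weave
const assumedCostPerGiBMinute = 1e-9;   // assume 1e-9 AR per GiB per minute
console.log(minimumBlockReward(assumedWeaveSizeGiB, assumedCostPerGiBMinute));
```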
Endowment Pool.
To keep rewards stable and sustainable, Arweave composes the miner reward from three parts: R_fees, R_inflation, and R_endowment [yellow]. R_fees is the
total of the transaction fees included in a block, but not all of these fees are paid to the miner immediately. Instead, the majority is reserved in the
endowment pool for future R_endowment payments. R_inflation is a predetermined reward paid by the protocol, which decreases as the block height (the number of
generated blocks) increases. R_endowment is paid from the endowment pool only when the sum of the other two components falls below the storage cost.
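The interplay of the three components can be sketched as a small function. This is a simplified model, not the protocol's actual code: the parameter `feeShareToMiner` (the fraction of fees paid out immediately) and the rule that R_endowment covers exactly the shortfall are assumptions made for this example.

```javascript
// Simplified model of one block's reward (all amounts in the same unit, e.g. AR).
// Assumptions: `feeShareToMiner` is a hypothetical parameter, and the endowment
// pays exactly the shortfall between the storage cost and the other two components.
function splitBlockReward({ fees, inflation, storageCost, pool, feeShareToMiner }) {
  const rFees = fees * feeShareToMiner;  // part of the fees paid out now
  const toPool = fees - rFees;           // remainder is reserved in the pool
  const poolAfterDeposit = pool + toPool;
  const shortfall = Math.max(0, storageCost - (rFees + inflation));
  const rEndowment = Math.min(shortfall, poolAfterDeposit);
  return {
    rFees,
    rInflation: inflation,
    rEndowment,
    total: rFees + inflation + rEndowment,
    pool: poolAfterDeposit - rEndowment,
  };
}

// Fees and inflation (2 + 1) fall short of the storage cost (5),
// so the pool contributes the missing 2.
console.log(
  splitBlockReward({ fees: 8, inflation: 1, storageCost: 5, pool: 100, feeShareToMiner: 0.25 })
);
```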
Transaction Fee.
To sustain the incentive mechanism for over 200 years (which is virtually permanent), the minimum transaction fee imposed on users is calculated as the
cost of perpetual storage. To keep this fee from diverging to infinity, Arweave assumes that the cost of storage will keep declining over time, so that the
sum of all future storage costs remains finite [code6].
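Why a declining storage cost keeps the fee finite can be seen from a geometric series: if the yearly cost shrinks by a fixed fraction d each year, the sum of all future yearly costs can never exceed (first-year cost) / d. The sketch below adds up the yearly costs over a 200-year horizon. The 0.5% annual decay rate is an assumption for illustration, and the function name is hypothetical.

```javascript
// Sum the yearly storage costs over `years`, assuming the cost shrinks by a
// constant fraction `annualDecay` each year (a geometric series).
function perpetualStorageCost(costFirstYear, annualDecay, years = 200) {
  let total = 0;
  let yearly = costFirstYear;
  for (let t = 0; t < years; t++) {
    total += yearly;
    yearly *= 1 - annualDecay;
  }
  return total;
}

// Assumed decay of 0.5% per year: the 200-year sum stays below the
// infinite-horizon limit of costFirstYear / annualDecay = 200.
const assumedDecay = 0.005;
console.log(perpetualStorageCost(1, assumedDecay));

// Without any decay, the cost would grow linearly with the horizon instead.
console.log(perpetualStorageCost(1, 0));
```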
Succinct Proof of Random Access (SPoRA).
If rewards are paid for block generation, how are block generation and data storage related? In Arweave, the data stored across the entire network is called the
weave, and mining a new block requires calculating a hash over a randomly selected portion of the weave. Therefore, to increase their probability of
generating new blocks and earning rewards, miners are incentivized to store as many partitions of the weave as possible [spec2.6]. Additionally, since the
miner who first finds a suitable hash receives the reward, miners have an incentive to use storage with faster read speeds [103].
In Arweave (especially from version 2.6 onward), the probability of earning rewards increases with the following factors: (1) holding partitions that are less
likely to be held by other miners, (2) consolidating storage into as few nodes as possible, and (3) storing as many unique partitions as possible on each node.
Together, these factors discourage imbalances in the number of replicas per partition [spec2.6][yellow].
2.5. Publishing Data as a Website: AR.IO
Gateway.
Arweave's miner nodes expose an API that returns data from previously generated blocks in response to HTTP requests, allowing content (e.g., HTML, JPEG) to
be displayed in web browsers [api2][api3]. However, miners have no incentive to make this API publicly accessible to a large number of users or to maintain
high availability. In fact, in most cases the API of a randomly selected node returns no response [peers]. To ensure long-term access via web browsers to data
stored on the Arweave network, both the incentive design and the decentralized management of the web servers that provide access to Arweave, referred to as
gateways, must be considered.
AR.IO.
Here, I introduce AR.IO, a noteworthy gateway project. In AR.IO, nodes serve not only as gateways that return content stored on the Arweave network in response
to HTTP requests, but also as observers that monitor other gateways. For fulfilling these roles, nodes are rewarded with IO tokens by the protocol, which gives
them an incentive to increase their availability. The tokens are supplied both by the protocol itself and through revenue earned from the Arweave Name System
(ArNS), which functions similarly to DNS on AR.IO [p2024]. At the time of writing, the mainnet has not yet launched, but several instances are operational on
the testnet [nodes].
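Because any single gateway may be unavailable, a client can try several in turn. The following is a minimal sketch of that idea, assuming each gateway serves a transaction's data at `https://<host>/<transaction ID>`. The host names are the gateways named in this article, the function names are hypothetical, and `fetchFn` is injectable so the logic can be exercised without network access.

```javascript
// Gateways named in this article; their availability is not guaranteed.
const GATEWAYS = ["arweave.net", "ar-io.dev", "permagate.io"];

// Build the URL under which a gateway is assumed to serve a transaction's data.
function gatewayUrl(host, txId) {
  return `https://${host}/${txId}`;
}

// Try each gateway in order and return the first successful response.
async function fetchFromAnyGateway(txId, hosts = GATEWAYS, fetchFn = fetch) {
  const errors = [];
  for (const host of hosts) {
    try {
      const res = await fetchFn(gatewayUrl(host, txId));
      if (res.ok) return res;
      errors.push(`${host}: HTTP ${res.status}`);
    } catch (err) {
      // Network-level failure (DNS, timeout, refused connection): try the next host.
      errors.push(`${host}: ${err.message}`);
    }
  }
  throw new Error(`all gateways failed (${errors.join("; ")})`);
}
```

A call such as `fetchFromAnyGateway("<transaction ID>")` would then return the response of the first gateway that answers successfully, and throw only if every listed gateway fails.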
3. Practice
I present minimal code for sending a transaction with image data. It is based on the official client and has been tested on Node.js v20.17.0 [repo].
[example.js]:
[Image: Beautiful venetia forever.]
The above image is stored on the Arweave network and is downloaded to your browser via the arweave.net gateway. You can also access the data through other
gateways, such as ar-io.dev or permagate.io. The image is 290 KB in size, and storing it permanently cost approximately
0.0004 AR × $22 ≈ $0.0088. Since there is no way to delete information once it has been stored on the network, and anyone can access it, please be careful
when considering uploading sensitive information.