Yesterday I set up a full node (Geth) to help the Ethereum network and learn a bit more about how everything fits together. But it just doesn't work. After more than a day it still hasn't synced all the blocks and state. (Yes, I have a good internet connection, and the Dappnode server has good hardware.) You need SSDs in the machine; hard drives are too slow. A 256 GB SSD lasted a few hours, so I popped in a 120 GB one last night, and that one is nearly full too. Am I going to have to add a new 1 TB SSD to my blockchain server every other week, or what is going on?
The dataset that every full node has to sync changes 200 times per second. That means you can never catch up, no matter how good your internet connection and server are! No, I don't think Ethereum will ever take off; it is far too resource-hungry. There's a lot of talk about decentralization, which is fine in theory, but then why haven't they built a system that runs on the ordinary hardware ordinary people have?
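For reference, the 200 changes/second figure follows from the numbers in the quoted explanation further down: roughly 1000 trie nodes deleted and 2000 added per block, with a block about every 15 seconds. A small sketch of that arithmetic (the constant names are my own; the per-day growth is just those same figures extrapolated, not a measured value):

```python
# Figures taken from the quoted explanation; approximate by the quote's own wording.
BLOCK_TIME_SECONDS = 15
NODES_DELETED_PER_BLOCK = 1000
NODES_ADDED_PER_BLOCK = 2000

# Every deletion and every addition is a change the syncing node has to track.
changes_per_block = NODES_DELETED_PER_BLOCK + NODES_ADDED_PER_BLOCK
changes_per_second = changes_per_block / BLOCK_TIME_SECONDS

# Extrapolation (assumption: the per-block figures stay constant all day).
blocks_per_day = 24 * 60 * 60 // BLOCK_TIME_SECONDS
net_new_nodes_per_day = (NODES_ADDED_PER_BLOCK - NODES_DELETED_PER_BLOCK) * blocks_per_day

print(f"changes per second: {changes_per_second:.0f}")
print(f"blocks per day: {blocks_per_day}")
print(f"net new trie nodes per day: {net_new_nodes_per_day}")
```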
I'm keeping my investments in ETH in case they miraculously take off later on, but helping out is off the table. It doesn't feel stable and scalable at the level that's needed.
"Wait that out"... yeah, sure, buddy. For how long, a month? And how often do I have to buy more 1 TB SSDs to keep up with the rapidly swelling blockchain?
Quote:
Syncing Ethereum is a pain point for many people, so I'll try to detail what's happening behind the scenes so there might be a bit less confusion.
The current default mode of sync for Geth is called fast sync. Instead of starting from the genesis block and reprocessing all the transactions that ever occurred (which could take weeks), fast sync downloads the blocks, and only verifies the associated proof-of-works. Downloading all the blocks is a straightforward and fast procedure and will relatively quickly reassemble the entire chain.
Many people falsely assume that because they have the blocks, they are in sync. Unfortunately this is not the case, since no transaction was executed, so we do not have any account state available (i.e. balances, nonces, smart contract code and data). These need to be downloaded separately and cross checked with the latest blocks. This phase is called the state trie download and it actually runs concurrently with the block downloads; alas it takes a lot longer nowadays than downloading the blocks.
So, what's the state trie? In the Ethereum mainnet, there are a ton of accounts already, which track the balance, nonce, etc. of each user/contract. The accounts themselves are however insufficient to run a node, they need to be cryptographically linked to each block so that nodes can actually verify that the accounts are not tampered with. This cryptographic linking is done by creating a tree data structure above the accounts, each level aggregating the layer below it into an ever smaller layer, until you reach the single root. This gigantic data structure containing all the accounts and the intermediate cryptographic proofs is called the state trie.
Ok, so why does this pose a problem? This trie data structure is an intricate interlink of hundreds of millions of tiny cryptographic proofs (trie nodes). To truly have a synchronized node, you need to download all the account data, as well as all the tiny cryptographic proofs to verify that no one in the network is trying to cheat you. This itself is already a crazy number of data items. The part where it gets even messier is that this data is constantly morphing: at every block (15s), about 1000 nodes are deleted from this trie and about 2000 new ones are added. This means your node needs to synchronize a dataset that is changing 200 times per second. The worst part is that while you are synchronizing, the network is moving forward, and state that you began to download might disappear while you're downloading, so your node needs to constantly follow the network while trying to gather all the recent data. But until you actually do gather all the data, your local node is not usable since it cannot cryptographically prove anything about any accounts.
If you see that you are 64 blocks behind mainnet, you aren't yet synchronized, not even close. You are just done with the block download phase and still running the state downloads. You can see this yourself via the seemingly endless Imported state entries [...] stream of logs. You'll need to wait that out too before your node comes truly online.
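To make the "tree above the accounts" idea from the quote a bit more concrete, here is a minimal sketch of aggregating accounts layer by layer into a single root. It is heavily simplified and not how Geth does it: it uses a binary tree and SHA-256, whereas Ethereum's real structure is a hexary Merkle Patricia trie with Keccak-256 and RLP encoding. The account encoding and all names below are made up for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 as a stand-in; Ethereum uses Keccak-256)."""
    return hashlib.sha256(data).digest()


def encode_account(address: str, balance: int, nonce: int) -> bytes:
    """Hypothetical flat encoding of an account, for illustration only."""
    return f"{address}:{balance}:{nonce}".encode()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Aggregate each layer into a smaller one until a single root remains."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized layers
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


accounts = [
    encode_account("0xaa", 100, 0),
    encode_account("0xbb", 250, 3),
    encode_account("0xcc", 0, 7),
]
root = merkle_root(accounts)

# Tampering with a single balance changes the root, which is what lets a
# node check downloaded account data against the root it expects.
tampered = list(accounts)
tampered[1] = encode_account("0xbb", 999, 3)
print(root.hex())
print(merkle_root(tampered) != root)
```

The point is only that every intermediate hash has to be present and correct for the root to match, which is why a node needs all the trie nodes and not just the account data.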
Quote:
A: State sync is mostly limited by disk IO, not bandwidth.
The state trie in Ethereum contains hundreds of millions of nodes, most of which take the form of a single hash referencing up to 16 other hashes. This is a horrible way to store data on a disk, because there's almost no structure in it, just random numbers referencing even more random numbers. This makes any underlying database weep, as it cannot optimize storing and looking up the data in any meaningful way.
Not only is storing the data very suboptimal, but due to the 200 modifications per second and pruning of past data, we cannot even download it in a properly pre-processed way to make it import faster without the underlying database shuffling it around too much. The end result is that even a fast sync nowadays incurs a huge disk IO cost, which is too much for a mechanical hard drive.
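A rough sketch of why that layout is hard on a disk, based on the description in the quote (a node is a hash referencing up to 16 other hashes). Assumptions: a plain Python dict stands in for the on-disk key-value database, SHA-256 stands in for Keccak-256, and the payload format is invented; this is not Geth's actual storage code.

```python
import hashlib

HASH_SIZE = 32  # bytes per SHA-256 digest


def store(db: dict, payload: bytes) -> bytes:
    """Store a payload under the hash of its content and return that key."""
    key = hashlib.sha256(payload).digest()
    db[key] = payload
    return key


def child_keys_of(db: dict, key: bytes) -> list[bytes]:
    """Split a branch payload back into the child hashes it references."""
    payload = db[key]
    return [payload[i:i + HASH_SIZE] for i in range(0, len(payload), HASH_SIZE)]


db: dict[bytes, bytes] = {}

# Sixteen leaves, each stored under an effectively random key.
child_keys = [store(db, f"account-{i}".encode()) for i in range(16)]

# One branch node: nothing but the 16 child hashes glued together.
branch_key = store(db, b"".join(child_keys))

# Reading this tiny subtree takes 1 + 16 lookups, each at an unrelated key,
# so a sorted on-disk database cannot keep neighbours in the tree close together.
resolved = [db[k] for k in child_keys_of(db, branch_key)]
print(len(db), resolved[3])
```

Scale the 17 lookups here up to hundreds of millions of nodes and you get the random-access pattern the quote is describing.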
So the server ends up completely tied up with a single task, because syncing eats 100% of the available disk I/O on an ordinary PC with ordinary hardware (not hideously expensive server gear). Constantly. How long does an SSD last under that load?
Someone got things badly wrong when they designed Ethereum. Unfortunately, these mathematical geniuses don't seem to know much about computer hardware...