< randy-waterhouse> ok, another 'devil's advocate' tester on-board segwit test smoothly ... thnx gmaxwell, sipa wumpus, etc ... and many others, good job.
< Lauda> Does the dbcache=N parameter only start using up additional memory during block processing?
< gmaxwell> Lauda: I think what you're asking is if it will use it right away, no it won't-- only as more data is read into the cache
< Lauda> gmaxwell all that is required is "dbcache=3000" in bitcoin.conf right (since mine is practically empty) and I want to run another reindex over night?
< Lauda> since*
< gmaxwell> right!
< Lauda> Okay great. Thanks!
< Lauda> The reindex issue is independent of the blockchain, i.e. it doesn't matter whether one tests on testnet or mainnet (someone asked me this)?
< midnightmagic> i can convert my testnet node(s) to segwitty. is it time?
< jl2012> What would happen if I upgrade my node after segwit is active and there are already segwit blocks on the chain?
< GitHub136> [bitcoin] paveljanik opened pull request #8261: The bit field is shown only when status is "started" (master...20160625_sw_getblockchaininfo_bit) https://github.com/bitcoin/bitcoin/pull/8261
< gmaxwell> jl2012: it reorgs back to before segwit activated.
< gmaxwell> jl2012: then will download blocks as needed.
< jl2012> Same for other soft forks like csv?
< jl2012> I think it has to be
< gmaxwell> jl2012: it arguably should be, but we haven't done that before.
< gmaxwell> for segwit it had to go get the witness data in order to serve blocks
< gmaxwell> we'd talked about doing it in the past for other soft forks but I think we thought it would be harder to implement than it turned out to be.
< gmaxwell> (and unlikely to really matter that much unless the reason you were upgrading was to fight a network split, in which case the invalidateblock rpc can be used)
< jl2012> But theoretically I may be following an invalid chain
< gmaxwell> Potentially; though if there were a large invalid chain we would make loud announcements to recommend the use of the invalidateblock command. I think now that the code exists, we'd likely use it in the future.
<@wumpus> segwit testnet node: 213.46.222.31:18333
< Lauda> What is the main bottleneck for reindex, storage speed or CPU processing?
< gmaxwell> probably depends on the hardware, on my laptop I think it's IO. on systems with faster IO, I think it's cpu inside leveldb code... at least with default dbcache. with dbcache cranked, it's likely cpu elsewhere in bitcoin.
< Lauda> Hmm, seems like the last 20-30 weeks are taking forever
<@wumpus> Lauda: unless you have a very fast disk/ssd, or increase the dbcache, i/o will be your main bottleneck
< Lauda> I've just checked in on my node, and some bans say e.g. until June 19th but the nodes are still banned. Do these get removed after a restart or something went wrong?
<@wumpus> Lauda: you can easily check though: is CPU maxed out?
< Lauda> It isn't
< Lauda> DBcache 3GB
<@wumpus> (unless you have changed the number of script verification threads, bitcoind will max out your CPU cores in initial sync when it's not i/o bound)
< Lauda> Okay so even with 3GB dbcache that's still not enough
<@wumpus> I have a branch to run bitcoind db-less: https://github.com/laanwj/bitcoin/tree/2016_04_dummy_db it does mean it loses all data when e.g. bitcoind crashes
<@wumpus> but the utxo and block index is simply stored in a flat file, which is loaded at startup and written at shutdown
< Lauda> That's interesting and those times are amazing in comparison to leveldb
<@wumpus> if you can afford the memory :)
<@wumpus> though it uses less memory than keeping everything in the dbcache, and doesn't have the issue that the cache is not seeded at startup
<@wumpus> I think research and experimentation on how to best store the utxo set is in order
< Lauda> The move towards SSDs should definitely help with this, but the industry is not there yet..
< Lauda> I can afford 4 GB on this machine, but it still takes a fair amount of time.
<@wumpus> memristor would be nice
<@wumpus> but no matter what, also at some point, unrestrained growth of the utxo set needs to be addressed
< Lauda> ^
< sipa> wumpus: we should try switching to a model where all utxos are stored as separate db entries, rather than in a vector of unspends per txid
<@wumpus> but it may well be we're running against the limits of what databases can (with good performance) handle, which means there is no room for scaling there at all
< gmaxwell> gigantic cuckoo hash table. with a update log. :P
<@wumpus> sipa: yes, that would be an interesting experiment too
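The two layouts sipa contrasts above can be compared with a toy model. This is a minimal sketch, not Bitcoin Core's actual data structures: the class names `PerTxStore` and `PerOutputStore`, the `outputs_rewritten` counter, and the use of plain dicts in place of leveldb are all assumptions made for illustration.

```python
class PerTxStore:
    """One record per txid, holding a vector of that tx's outputs."""

    def __init__(self):
        self.db = {}                # txid -> list of outputs (None = spent)
        self.outputs_rewritten = 0  # rough proxy for re-serialization cost

    def add_tx(self, txid, outputs):
        self.db[txid] = list(outputs)

    def spend(self, txid, vout):
        record = self.db[txid]
        out = record[vout]
        if out is None:
            raise KeyError((txid, vout))
        record[vout] = None
        remaining = sum(o is not None for o in record)
        if remaining == 0:
            del self.db[txid]       # fully spent: drop the whole record
        else:
            # The whole record is written back, including untouched outputs.
            self.outputs_rewritten += remaining
        return out


class PerOutputStore:
    """One record per (txid, vout) pair."""

    def __init__(self):
        self.db = {}                # (txid, vout) -> output
        self.outputs_rewritten = 0  # stays 0: a spend is a plain delete

    def add_tx(self, txid, outputs):
        for vout, out in enumerate(outputs):
            self.db[(txid, vout)] = out

    def spend(self, txid, vout):
        return self.db.pop((txid, vout))
```

In this model, spending one output of a three-output transaction rewrites the two surviving outputs in the per-txid layout, while the per-output layout only deletes a key. The trade-off, not modelled here, is that the per-output layout has more keys in the database.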
<@wumpus> also the access pattern is essentially random, so the only type of caching that helps very well is keep everything
< gmaxwell> sipa: the ripple people have claimed that leveldb performance falls off a cliff with more than some threshold number of entries (I believe they were storing every transaction in it)
< sipa> gmaxwell: i think they don't have application level caching
<@wumpus> well I'm not actually sure how random the access pattern is, but it looks like that from a disk perspective with the current organization
<@wumpus> it's very possible that optimizations could be made by smartly sorting utxos which are expected to be accessed together? I don't know
< gmaxwell> sipa: sure, but that wouldn't change the performance of the underlying database.
<@wumpus> it's not like the other databases that we tried performed better
<@wumpus> leveldb still seems, all in all, the best performing on-disk database for utxo storage
< gmaxwell> Ripple folks created their own.
< gmaxwell> (and also suggested we might be interested in using it)
<@wumpus> lmdb looked promising but it has its own performance cliff
<@wumpus> (depending on the amount of memory in the system, it seems)
<@wumpus> what license is it under?
<@wumpus> I could give it a try pretty easily
< GitHub74> [bitcoin] btccode opened pull request #8262: Forgetaddress 0.12 (0.12...forgetaddress-0.12) https://github.com/bitcoin/bitcoin/pull/8262
< gmaxwell> I think it's permissively licensed, looking for it now
< GitHub2> [bitcoin] btccode closed pull request #8262: Forgetaddress 0.12 (0.12...forgetaddress-0.12) https://github.com/bitcoin/bitcoin/pull/8262
< sipa> gmaxwell: well is it read performance or write performance that is bad?
<@wumpus> with leveldb it's read performance, and also latency
<@wumpus> write performance of leveldb is quite good, I suppose because it writes consecutive files
<@wumpus> but no database likes huge databases + random seek patterns for reads
< gmaxwell> (thats why I made the half serious suggestion of a gigantic hash table)
<@wumpus> lmdb read latency seems to be - on average- better, but its writing is worse than leveldb, I think it does more random writing
< GitHub121> [bitcoin] laanwj pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/5cdc54b4b62d...63fbdbc94d76
< GitHub121> bitcoin/master b0be3a0 Wladimir J. van der Laan: doc: Mention Windows XP end of support in release notes...
< GitHub121> bitcoin/master 63fbdbc Wladimir J. van der Laan: Merge #8240: doc: Mention Windows XP end of support in release notes...
< GitHub49> [bitcoin] laanwj closed pull request #8240: doc: Mention Windows XP end of support in release notes (master...2016_06_windows_xp) https://github.com/bitcoin/bitcoin/pull/8240
< gmaxwell> as every read would simply be one or two random disk accesses... and its hard to do better than that. it's just writing is awful. (e.g. end up with read-write-write to update a log, with both sequential reads and writes, and if the table needs to be resized woe is you).
<@wumpus> going to try nudb when I have some time
<@wumpus> unfortunately we also do a lot of writing, at least during initial sync, every utxo read is updated and written back
<@wumpus> so making reading much faster at the expense of writing is going to give you mixed results
<@wumpus> has research been done on utxo access patterns? e.g. are more recent blocks more often accessed, or the other way around, or are there other regularities that could be used?
< gmaxwell> Spending is more frequently from recently created utxo.
<@wumpus> interesting
< gmaxwell> I would expect naively that the expected remaining lifetime of a utxo is how long it's lived so far. If something had made it a year without being spent, you should expect it to last another year. But beyond knowing that an unusually large number of utxo have short lives, I've not done anything to try to verify this hypothesis.
< gmaxwell> we could probably construct fairly elaborate predictions using other features like how many txouts were in the creating transactions, reuse of the pubkey, and the amount of the coin.
< gmaxwell> (or even using non-fungibility-- a coin is likely to be spent soon if its recent ancestors were spent soon)
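One way to check gmaxwell's lifetime hypothesis against real data would be to compute the mean residual life: the average remaining lifetime of outputs that have already survived to a given age. The sketch below assumes a hypothetical list of observed utxo lifetimes (in any time unit); the function name `mean_residual_life` is illustrative and no such dataset appears in this discussion.

```python
def mean_residual_life(lifetimes, age):
    """Average remaining lifetime among items that survived past `age`.

    `lifetimes` is a hypothetical list of creation-to-spend durations.
    Returns None when nothing in the sample survived that long.
    """
    remaining = [t - age for t in lifetimes if t > age]
    if not remaining:
        return None
    return sum(remaining) / len(remaining)
```

If the hypothesis holds, `mean_residual_life(lifetimes, age)` should come out roughly equal to `age` across a range of ages. A Pareto distribution with shape parameter 2 has exactly this property, so a heavy-tailed fit would be one thing to look for.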
< sipa> wumpus: that's why the "fresh" optimization helps a lot... we create utxo entries inside the cache, and fully spend them before they even hit disk
< gmaxwell> with the cache turned way up, the whole initial sync runs without writing the chainstate until the end.
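The effect of the "fresh" optimization described above can be shown with a toy write-back cache. This is a sketch under stated assumptions, not Bitcoin Core's coins cache: the class `FreshCache`, its dict-based `backing` store, and the `disk_writes` counter are all hypothetical names chosen for illustration.

```python
class FreshCache:
    """Toy write-back cache; `backing` stands in for the on-disk chainstate."""

    def __init__(self, backing):
        self.backing = backing  # dict: key -> value
        self.entries = {}       # key -> {"value": v or None if spent, "fresh": bool}
        self.disk_writes = 0

    def add(self, key, value):
        # A newly created coin is assumed unknown to the backing store.
        self.entries[key] = {"value": value, "fresh": True}

    def spend(self, key):
        entry = self.entries.get(key)
        if entry is None:
            # Cache miss: load from the backing store (a disk read in reality).
            entry = {"value": self.backing[key], "fresh": False}
            self.entries[key] = entry
        value = entry["value"]
        if value is None:
            raise KeyError(key)
        if entry["fresh"]:
            # Created and spent inside the cache: the disk never sees it.
            del self.entries[key]
        else:
            entry["value"] = None  # remember the deletion until flush
        return value

    def flush(self):
        for key, entry in self.entries.items():
            if entry["value"] is None:
                del self.backing[key]
            else:
                self.backing[key] = entry["value"]
            self.disk_writes += 1
        self.entries.clear()
```

A coin that is added and spent between two flushes costs no disk write at all in this model, which is why a larger cache (fewer flushes) lets more short-lived outputs disappear before reaching disk.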
< gmaxwell> oh, seems nudb is a big hashtable (uses external storage for values)
< sipa> it keeps the entire keyset in memory?
< gmaxwell> no, the keys are in a file. sounds like it's chunked so it can independently resize sub tables.
<@wumpus> isn't the fact that nudb is insert-only a problem? we delete and change entries a lot
<@wumpus> gmaxwell: would be a good research project to investigate that hypothesis in detail, and see if it is possible to optimize storage based on those predictions/assumptions. Maybe one huge key/value store is not the best way to handle this
< gmaxwell> hm. I thought it could delete keys but not the values in external storage.
< gmaxwell> Oh I see what you mean there... I hadn't caught that implication before.. that effect is more or less why caching smaller than the utxo set in memory is still effective, but depending on the geometry of the effect it might make sense to have two databases.. so that the high access parts are in something with low log(n) costs.
< gmaxwell> LOL, totally offtopic: https://lkml.org/lkml/2016/3/31/1109
< btcdrak> inb4 linus splats him
< gmaxwell> it's old, just turned up in a random google search
< btcdrak> oh it's an april fool!
< Lauda> Error reading from database, shutting down 15 weeks left :<
< Lauda> I think I can still get decent results if I test another version and break at 15 weeks left?
< gmaxwell> do you know why it failed?
< gmaxwell> if you're benchmarking you can just compare to a common height.
< gmaxwell> e.g. use the timestamps in the log
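The comparison gmaxwell suggests can be scripted by pulling timestamps out of `UpdateTip` lines in debug.log. This is a minimal sketch: the assumed line shape (a `YYYY-MM-DD HH:MM:SS` prefix followed by `UpdateTip:` and a `height=N` field) should be checked against your own log, and the helper names `tip_times` and `seconds_to_height` are hypothetical.

```python
import re
from datetime import datetime

# Assumed line shape: "2016-06-25 12:34:56 UpdateTip: new best=... height=123 ..."
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UpdateTip: .*\bheight=(\d+)\b"
)


def tip_times(lines):
    """Map each block height to the first time the log shows it as the tip."""
    times = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            times.setdefault(int(m.group(2)), ts)
    return times


def seconds_to_height(lines, height):
    """Seconds from the lowest logged height to `height`, or None if absent."""
    times = tip_times(lines)
    if height not in times:
        return None
    start = times[min(times)]
    return (times[height] - start).total_seconds()
```

Running `seconds_to_height` on the log of each build with the same target height gives two numbers that can be compared directly, even if one run was interrupted later on.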
< Lauda> I can't be sure. It's possible that my HDD disconnected for a second (the cable seems a bit unstable if touched).
< spudowiar> Lauda: check dmesg for I/O errors?
< Lauda> Okay then, I'll compare the timestamp of the same height. Running a test on a revision before the reindex changes now
< Lauda> I'll check system error log (these tests are on Windows not Unix).
< Lauda> http://pastebin.com/Zau75AHY it doesn't tell me much.
< sipa> Lauda: you have debug.log.
< sipa> ?
< Lauda> Yes
< Lauda> The last entry is just an UpdateTip.
< Lauda> Comparing the partial data shows that re-index is much faster on the newer version than one before the re-index changes (at least on custom dbcache).
< sipa> yes, for a fair comparison you need to disable checkpoints
< sipa> before the reindex changes, signatures were always checked
< Lauda> How do I disable checkpoints?
< sipa> after, they're only checked past the last checkpoint
< sipa> -nocheckpoints i think
< Lauda> So I should delete this data (version before re-index changes) and run it again with that flag?
< MarcoFalke> no need to delete data
< Lauda> It's still re-indexing the build from 16-05.
< sipa> you can start over
< Lauda> How would I add that within the .conf file?
< sipa> checkpoints=0
< sipa> or nocheckpoints=1
< sipa> but please consult the help
< sipa> (bitcoind -help)
< Lauda> Okay thanks!
< Lauda> sipa is it normal that the wallet shows weird/non-existing transactions (date-wise) during reindex?
< sipa> example?
< Lauda> e0a871f4897af619c4e0d8ab91d6c6f81e25d23f4dea421439b60e9c9dd8cb83
< Lauda> Received Time 2015-09-09 20:57:49
< Lauda> wallet shows yesterday
< Lauda> I don't even recognize these transactions, a (big) list of microtransactions (incoming and outgoing) for 24/06
< sipa> heh
< sipa> did you import some common brainwallet
< Lauda> No. I didn't do anything. I'm running this on my wallet machine (since my other one is down)
< Lauda> I have never used anything besides QT.
< Lauda> I think it wasn't showing on 24-06 nightly build.
< sipa> that seems unlikely :)
< Lauda> The dates seem correct for all transactions up to this point. There are surely a few hundred TX's stamped at this date now
< Lauda> Hmm..
< Lauda> My wallet.dat grew. I made a backup before I started reindex tests. It was ~400kb, now it is 4.6MB
< GitHub165> [bitcoin] laanwj pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/63fbdbc94d76...1922e5a65458
< GitHub165> bitcoin/master 27f8126 Daniel Cousens: remove unnecessary LOCK(cs_main)
< GitHub165> bitcoin/master 1922e5a Wladimir J. van der Laan: Merge #8244: remove unnecessary LOCK(cs_main) in getrawpmempool...
< GitHub97> [bitcoin] laanwj closed pull request #8244: remove unnecessary LOCK(cs_main) in getrawpmempool (master...patch-1) https://github.com/bitcoin/bitcoin/pull/8244
< da2ce7_mobile> Well done! https://github.com/bitcoin/bitcoin 11,111 comments :)
< sipa> *commits
< spudowiar> No one is allowed to commit
< spudowiar> You must squash commits in order to add more
< spudowiar> Huh, it says 10,000 commits
< spudowiar> Not 11,111
< da2ce7_mobile> oh spelling. oh well.
< spudowiar> da2ce7_mobile: you are a commit :)
< spudowiar> [da2ce7] Fix spelling mistakes
< spudowiar> Committer: spudowiar
< GitHub41> [bitcoin] bitcoiner opened pull request #8264: src: Fix typo in comment - tinyformat.h (master...bitcoiner-fix-typo-tinyformat) https://github.com/bitcoin/bitcoin/pull/8264
< GitHub43> [bitcoin] bitcoiner opened pull request #8265: src: Fix spelling error in comment - netbase.h (master...bitcoiner-fix-typo-netbase) https://github.com/bitcoin/bitcoin/pull/8265