< bitcoin-git>
[bitcoin] fanquake opened pull request #18415: scripts: add MACHO tests to test-security-check.py (master...add_MACHO_to_security_check) https://github.com/bitcoin/bitcoin/pull/18415
< vasild>
What is the reason we store blocks in blk*.dat files in the order they are received and undos in rev*.dat files in height order? This convolutes the code a lot. If we can store in the same order (either receive or height) then some significant simplifications in the code can be done.
< aj>
they're both stored in the order the data is discovered, which minimises having to shuffle the data on disk?
< sipa>
i experimented with 1 file per block/undo early on
< sipa>
but most filesystems didn't deal well with that
< vasild>
aj: is it the case that we don't have the undo for a given block before having all prior blocks?
< sipa>
vasild: undo data is created when a block is activated
< sipa>
a block can only be activated once we have its data + its parent is active
< vasild>
sipa: right, that would be 600k small files
< vasild>
I see
< sipa>
so yes it requires a block + its ancestors
< sipa>
but we only attempt activation in chains that would lead to a new best tip
< vasild>
so if we want (I am not sure if we do) store block and undo in the same order it means to store both in height order.
< sipa>
so it's possible we download a block and then never activate it
< sipa>
that would mean killing parallel block download
< sipa>
(which inherently downloads them in non-height order)
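The activation rule sipa describes (a block can be activated once we have its data and its parent is active) can be sketched as a toy model. This is only an illustration: `receive_block`, `parent_of`, `have_data`, `active` and `undo_order` are hypothetical names, not Bitcoin Core identifiers, and the sketch ignores the best-tip rule and reorgs.

```python
def receive_block(block, parent_of, have_data, active, undo_order):
    """Record that `block` arrived, then activate everything that became connectable.

    Toy model: undo data is "written" (appended to undo_order) at activation time,
    which is why its order can differ from the order blocks were received in.
    """
    have_data.add(block)
    progress = True
    while progress:
        progress = False
        # Only blocks whose data we hold and that are not yet active are candidates.
        for candidate in sorted(have_data - active):
            if parent_of[candidate] in active:
                active.add(candidate)
                undo_order.append(candidate)
                progress = True


# Chain genesis -> a -> b -> c, with blocks arriving out of order: c, a, b.
parent_of = {"a": "genesis", "b": "a", "c": "b"}
have_data, active, undo_order = set(), {"genesis"}, []
for blk in ["c", "a", "b"]:
    receive_block(blk, parent_of, have_data, active, undo_order)
```

In this run the blocks arrive as c, a, b, but the undo entries are produced as a, b, c.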
< vasild>
sipa: what about keeping the downloaded blocks which are not connected yet in memory and only send them to disk once we have all prior blocks, together with the undo (both in height order)?
< sipa>
vasild: that's not very realistic
< sipa>
except for big machines
< vasild>
too much memory would be required?
< sipa>
we download up to 1000 blocks simultaneously
< sipa>
that number can probably be reduced
< sipa>
it's too low early on in the chain, and too large later on
< sipa>
i don't think this should be a big problem really
< vasild>
hmm, we can deduce how much memory would have been needed if we look at an existing database where blocks are stored in receive order
< sipa>
i guess
< sipa>
but you can't just add gigabytes (or even 100 MB) to the memory footprint
< sipa>
people configure nodes on small devices
< vasild>
If this turns out to be worth exploring, then I guess the download would have a buffer where out-of-order blocks are stored until being connected, and if this buffer grows too much then some blocks are evicted from it (maybe the highest ones), meaning they would have to be re-downloaded because we couldn't store them in memory and couldn't put them on disk
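A minimal sketch of the buffering idea vasild outlines here, assuming a single linear chain identified by height. `simulate_bounded_buffer` and its parameters are hypothetical names for illustration only; nothing here reflects actual Bitcoin Core code.

```python
def simulate_bounded_buffer(arrivals, capacity):
    """Simulate an in-memory buffer for out-of-order blocks.

    arrivals: heights in the order blocks arrive (linear chain starting at 0).
    capacity: maximum number of out-of-order blocks kept in memory.
    Returns (written, evicted): heights written to disk in height order, and
    heights dropped from the buffer that would have to be downloaded again.
    """
    buffered = set()
    next_height = 0
    written = []
    evicted = []
    for height in arrivals:
        if height < next_height or height in buffered:
            continue  # already written or already buffered
        buffered.add(height)
        # Flush every block that is now connected to the written prefix.
        while next_height in buffered:
            buffered.remove(next_height)
            written.append(next_height)
            next_height += 1
        # Over capacity: evict the highest block, as suggested in the message above.
        if len(buffered) > capacity:
            victim = max(buffered)
            buffered.remove(victim)
            evicted.append(victim)
    return written, evicted
```

With arrivals `[3, 2, 1, 0]` and a capacity of 2, block 3 is evicted before block 0 shows up, so it would need to be fetched a second time.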
< sipa>
gah
< sipa>
we did that in 0.9
< sipa>
it was horrible
< sipa>
you'd download the same blocks over and over again
< vasild>
too many re-downloads?
< sipa>
i really don't see a reason to change that (i'm biased, i wrote the parallel block download code...)
< vasild>
how many blocks were kept in memory before starting eviction?
< vasild>
well, parallel download is a must, I am not saying to kill it
< sipa>
1000, i think - but the whole block download logic was a bunch of hacks duct-taped together back then
< sipa>
it's unfair to compare it to anything you're suggesting now
< sipa>
but i also don't see a good motivation for changing the order
< sipa>
we could just reorder the undo data in files whenever needed too by rewriting them
< vasild>
ok, two key points from the above discussion - 1. we can't write undo in receive order and 2. there is no strict reason to store blocks in receive order - it is being done to ease parallel download.
< sipa>
except that'd be much harder than just fixing the flushing logic
< sipa>
well, they're stored in order to avoid needing to rewrite things
< sipa>
it's just the easiest thing to do
< vasild>
to rewrite things you mean rewrite the source code or data on disk? :)
< aj>
data on disk
< vasild>
if we buffer blocks and store them in height order on disk then nothing will be rewritten on disk?
< aj>
if you buffer them on disk, you'll be rewriting them; if you buffer them in ram, you're adding 100M-1G of extra ram which is unreasonable
< vasild>
I am talking about buffering in ram. Buffering on disk makes no sense to me.
< sipa>
buffering in ram makes no sense to me
< sipa>
it's not data we likely ever need
< vasild>
Maybe I would write a tool to analyze an existent blk/rev database and it would show max peak buffer size for that database, if it was downloaded in a way that out of order blocks are buffered in ram.
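The analysis described in this message could look roughly like the following, assuming the receive order and block sizes have already been extracted from the blk files into a list of `(height, size_in_bytes)` pairs. The function name and the input format are assumptions made for this sketch; parsing blk/rev files is not shown.

```python
def peak_buffer_bytes(received):
    """Return the peak number of bytes held in RAM if out-of-order blocks were buffered.

    received: iterable of (height, size_in_bytes) in receive order, for a
    linear chain starting at height 0. A block leaves the buffer as soon as
    all lower heights have been seen, so in-order blocks are never counted.
    """
    buffered = {}
    next_height = 0
    current = 0
    peak = 0
    for height, size in received:
        buffered[height] = size
        current += size
        # Drain every block that is now contiguous with the connected prefix.
        while next_height in buffered:
            current -= buffered.pop(next_height)
            next_height += 1
        peak = max(peak, current)
    return peak
```

For example, `peak_buffer_bytes([(3, 10), (2, 20), (1, 30), (0, 40)])` gives 60: blocks 3, 2 and 1 wait in memory until block 0 arrives.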
< sipa>
the max peak buffer size is 4 GB
< sipa>
1000 blocks @ 4 MB/block
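The worst-case figure above is a straightforward product. The constants below restate the numbers from the conversation (1000 blocks in flight, 4 MB per block, taken as 4,000,000 bytes); the variable names are made up for this illustration.

```python
BLOCKS_IN_FLIGHT = 1000        # download window mentioned in the conversation
MAX_BLOCK_BYTES = 4_000_000    # 4 MB per block, as quoted above

worst_case_bytes = BLOCKS_IN_FLIGHT * MAX_BLOCK_BYTES
worst_case_gb = worst_case_bytes / 1e9  # decimal gigabytes
```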
< vasild>
but it feels like the simplification of writing block and undo in the same order will come at a cost - complicating parallel block download with buffering out-of-order blocks.
< vasild>
ok, sipa, aj, thanks for the explanations!
< sipa>
vasild: yw!
< bitcoin-git>
[bitcoin] pierreN opened pull request #18416: Prevent num op overflows in ParseScript() helper (master...fix-parsescript-numop-overflow) https://github.com/bitcoin/bitcoin/pull/18416
< fjahr>
sipa: when you looked into single block *.blk files did you also think about concatenating these to a larger file at a certain depth? At least there would not be too many small files or were there other issues caused by the file system? Just curious.
< bitcoin-git>
[bitcoin] practicalswift opened pull request #18417: tests: Add fuzzing harnesses for functions in addrdb.h, net_permissions.h and timedata.h (master...fuzzers-misc) https://github.com/bitcoin/bitcoin/pull/18417
< bitcoin-git>
[bitcoin] naumenkogs opened pull request #18421: Periodically update DNS caches for better privacy of non reachable-nodes (master...2020_03_dns_cache_update) https://github.com/bitcoin/bitcoin/pull/18421
< achow101>
is anyone else missing bitcoin/bitcoin on the travis.org sidebar (under My Repositories)?
< achow101>
oh, it looks like i've lost access to our travis stuff. no longer have the option to restart things. also lost access to HWI's travis
< bitcoin-git>
[bitcoin] jnewbery opened pull request #18422: [consensus] MOVEONLY: Move single-sig checking EvalScript code to EvalChecksig (master...2020-03-evalchecksig) https://github.com/bitcoin/bitcoin/pull/18422
< bitcoin-git>
[bitcoin] practicalswift opened pull request #18423: tests: Add fuzzing harness for classes/functions in blockfilter.h. Add integer {de,}serialization fuzzing. (master...fuzzers-misc-2) https://github.com/bitcoin/bitcoin/pull/18423
< wumpus>
looks like i still have access to travis for bitcoin/bitcoin
< achow101>
logging out and back in did not fix it for me
< bitcoin-git>
[bitcoin] hebasto opened pull request #18424: qt: Use parent-child relation to manage lifetime of OptionsModel object (master...20200324-options-model) https://github.com/bitcoin/bitcoin/pull/18424
< bitcoin-git>
[bitcoin] achow101 opened pull request #18425: releases: Update with new Windows code signing certificate (master...win-cert-3-20) https://github.com/bitcoin/bitcoin/pull/18425
< bitcoin-git>
[bitcoin] theStack opened pull request #18426: scripts: previous_release: improve behaviour on failed download (master...20200324-scripts-previous-release-show-error-message-if-download-fails) https://github.com/bitcoin/bitcoin/pull/18426