< warren>
cfields: btw, Ubuntu 12.04 Precise is hitting EOL, will gitian switch to a newer base soon?
< sipa>
cfields: or we shouldn't, i guess
< sipa>
if getrandom isn't available on linux, i think we should fall back to using /dev/urandom directly rather than through a fallback wrapper for a BSD function :)
< cfields>
sipa: atm it's being used in linux, best i can tell
< cfields>
(i get a runtime link error, so it must be)
< cfields>
warren: gitian uses Trusty currently
< warren>
oops
< sipa>
cfields: yes, we compile in code to use getrandom whenever configure detects it
< cfields>
sipa: sorry, i was referring to getentropy
< sipa>
eh, i mean getentropy
< sipa>
cfields: the getentropy implementation in glibc on Linux just uses SYS_getrandom, just like we already do directly
< sipa>
so a fallback that just returns 0 should be fine
< sipa>
we check the return status
< sipa>
ah, no
< sipa>
hmm
< cfields>
sipa: looks like getentropy needs to check for ENOSYS, and do GetDevURandom as a fallback. Then the weak getentropy just returns ENOSYS.
< cfields>
I think that works?
< sipa>
cfields: yup
< cfields>
ok, will do.
< sipa>
cfields: i think our GetOSEntropy call should always compile in the GetDevURandom fallback
< sipa>
(so move it out of the #ifdef ... #elif .. #endif switch)
< cfields>
yep, agreed
< sipa>
then the getrandom/getentropy calls can fail without problems
< sipa>
and agreed, a simple fallback returning ENOSYS is perfect then
< cfields>
<sipa> cfields: i think our GetOSEntropy call should always compile in the GetDevURandom fallback
< cfields>
sipa: looking again, I'm not sure what you meant by that
< cfields>
we always compile GetDevURandom, except for win32
< sipa>
cfields: i mean the GetDevURandom call at the bottom of GetOSEntropy
< sipa>
which is only compiled if getrandom and getentropy are not available
< sipa>
that should change
< cfields>
sipa: so you want early returns for any better function, and fallback if one isn't hit by the end?
< sipa>
right
< cfields>
makes sense, will do
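(For reference, the shape being agreed on here looks roughly like the following sketch. Names such as GetDevURandom and the HAVE_* macros follow the discussion rather than the actual sources; this is an illustration of the layout, not the eventual patch.)

    // Illustrative sketch only: early-return on each preferred source, with the
    // /dev/urandom fallback compiled unconditionally, outside the #ifdef chain.
    #include <cerrno>
    #include <cstdlib>
    #include <fcntl.h>
    #include <unistd.h>
    #if defined(__linux__)
    #include <sys/syscall.h>
    #endif

    // Always available (except on win32): read 32 bytes from /dev/urandom.
    static void GetDevURandom(unsigned char* ent32)
    {
        int f = open("/dev/urandom", O_RDONLY);
        if (f < 0) abort();
        ssize_t have = 0;
        while (have < 32) {
            ssize_t n = read(f, ent32 + have, 32 - have);
            if (n <= 0) abort();
            have += n;
        }
        close(f);
    }

    static void GetOSEntropy(unsigned char* ent32)
    {
    #if defined(__linux__) && defined(SYS_getrandom)
        // Preferred: the syscall itself. Old kernels return ENOSYS; fall through.
        if (syscall(SYS_getrandom, ent32, 32, 0) == 32) return;
    #endif
    #ifdef HAVE_GETENTROPY
        // May be the real libc getentropy(), or a weak stub that merely sets
        // errno = ENOSYS and returns -1 when the real function is unavailable.
        if (getentropy(ent32, 32) == 0) return;
    #endif
        GetDevURandom(ent32);  // shared fallback, no longer inside an #elif branch
    }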
< sipa>
getrandom currently has a fallback already
< jonasschnelli>
Grml... if we enforce a min HD recovery gap limit (currently 20 keypool keys), all tests are broken (most tests run with a keypool of size 1). I guess I have to add a -hdignoregaplimit and set it to true by default for the tests.
< jonasschnelli>
Ah.. yes. the [] is also part of it. Will fix that now
< wumpus>
yeah :)
< wumpus>
I don't understand why we'd need *yet another* workaround for getentropy on linux
< wumpus>
man if it's really that difficult let's just use /dev/urandom and give this up
< jonasschnelli>
wumpus: Agree. If /dev/urandom is not reliable, your OS is not and thus, getentropy doesn't give much more value... or do I misinterpret this?
< wumpus>
well the idea is that it can be run in a jail/container without /dev
< jonasschnelli>
I see...
< wumpus>
it's not so much that the syscall is more reliable (though that may be the case) but that it's always available (if the OS happens to have it available...)
< wumpus>
I just don't see the point in a getentropy() implementation that reads /dev/urandom anyway
< wumpus>
we already have a function for that
< wumpus>
this is getting way more complicated than I imagined
< jonasschnelli>
morcos: Would you say it makes sense to keep a single samples vector and add fixed time limits for sample frequency and purge out in-between values? Example: we add a sample whenever the mempool has changed while respecting a min delta of 2s, i.e. no extra thread required.
< jonasschnelli>
Then, during the cleanup (every 100 samples or so), we purge out in-between samples to ensure a min frequency of, say, delta 30s for samples older than 2h (TBD)
< jonasschnelli>
same for samples older than 24h (maybe ensure a min delta of 2min), etc.
< jonasschnelli>
The question then is if we should interpolate fixed time steps during cleanup/purge. Because the samples may be distributed "randomly" over time.
< morcos>
jonasschnelli: that seems considerably messier to me than keeping several samples vectors (one for each time scale we might care about) and only recording a data point in the appropriate vector if it's time to. i don't think that requires any extra thread either (although it's not clear that it wouldn't be better in a separate thread)
< morcos>
periodic cleanup just seems weird
< jonasschnelli>
morcos: Separate vectors would require moving samples from one vector to another... though not sure if that's bad or good. :)
< morcos>
i think the lack of precision in recording the data point at exactly the right time is just something i would ignore, with the exception that if we miss a sample period entirely we should do some sort of filling in
< morcos>
i don't understand that
< morcos>
every 2 seconds record in the short vector.. on the 60th second add the same sample point to the short vector and the medium vector at the same time.. on the hour add to all 3
< morcos>
there is some very small data duplication, but no moving, cleaning or reorganizing
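(A rough sketch of the multi-resolution scheme morcos describes; the sample struct, intervals and retention limits below are illustrative assumptions, not an agreed design.)

    // Sketch only: one vector per time scale, with the same sample occasionally
    // recorded in more than one of them (small duplication, no reorganizing).
    #include <cstddef>
    #include <cstdint>
    #include <deque>

    struct MempoolSample {
        int64_t time;        // unix time the sample was taken
        uint64_t dyn_usage;  // mempool dynamic memory usage
        size_t tx_count;     // number of transactions in the mempool
    };

    class MempoolStats {
        std::deque<MempoolSample> m_short;  // every 2 s, keep ~2 h
        std::deque<MempoolSample> m_mid;    // every 60 s, keep ~24 h
        std::deque<MempoolSample> m_long;   // every hour, keep much longer
        int64_t m_last_short{0}, m_last_mid{0}, m_last_long{0};

        static void Push(std::deque<MempoolSample>& v, const MempoolSample& s, size_t cap)
        {
            v.push_back(s);
            if (v.size() > cap) v.pop_front();  // fixed-size ring-buffer behaviour
        }

    public:
        void AddSample(const MempoolSample& s)
        {
            if (s.time - m_last_short >= 2)    { Push(m_short, s, 3600); m_last_short = s.time; }
            if (s.time - m_last_mid   >= 60)   { Push(m_mid,   s, 1440); m_last_mid   = s.time; }
            if (s.time - m_last_long  >= 3600) { Push(m_long,  s, 8760); m_last_long  = s.time; }
        }
    };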
< jonasschnelli>
Oh. Yes. I overcomplicated.
< jonasschnelli>
Yes. Some duplication.. but I guess that's okay.
< morcos>
very little
< jonasschnelli>
Yeah. Use shared pointers *duck*
< jonasschnelli>
About the periodic cleanup:
< jonasschnelli>
I thought calculating the dynamic allocation on each added sample seems a bit too expensive... that's why I added the periodic check
< jonasschnelli>
But if we have fixed timespans with a thread, then this can be solved more simply.
< jonasschnelli>
Though I wanted to avoid grabbing the cs_mempool to calculate the dynamic mempool size, etc.
< jonasschnelli>
Stats could then have a more significant impact on the mempool performance than with "passive collecting"
< sipa>
wumpus: presumably his configure did not detect getrandom (old kernel headers?), but did detect getentropy?
< morcos>
jonasschnelli: you could cache the latest value for things that were being calculated anyway, and then only record it on the periodic sample
< morcos>
but i'm not sure what you mean about the dynamic allocation for the added sample.. i'm assuming that at least for the smaller time scales you're going to have a limited time span for which you keep samples.. like just a recent buffer of the last couple hours at 2 sec sample periods or something.. so it seems like it should not really require much allocation
< morcos>
anyway.. i'm happy to revisit it in a week or so.. but i'm mostly out of the office this week.. so just trying to keep head above water
< jonasschnelli>
What I meant is the check against the set max memory size for the stats.... say someone has set 10MB as max memory for stats, then you have to know when you overshoot it. Not sure if vector-size * sizeof(sample) and cutting off the unnecessary samples is very expensive... I thought doing it every 100 added samples makes sense.
< jonasschnelli>
morcos: Yes. No hurry... I'll ping you once I have more...
< morcos>
jonasschnelli: yeah i was imagining you initially just calculate how many samples you're going to get to keep given their memory max and then have a ring buffer, but you could do it many ways.. personally i think it's kind of an over-configuration option. i'd have a -stats=0 option to save the memory, but then otherwise everyone gets whatever default memory usage we think makes sense. not everything has to be configurable
< jonasschnelli>
Yes. fair enough... thanks for your inputs
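(The sizing morcos suggests is just arithmetic done once at startup; a hypothetical example, assuming an option like -statsmaxmemory and the sample struct sketched above:)

    // Hypothetical: derive a fixed ring-buffer capacity from the memory budget
    // once, instead of re-checking memory usage every N added samples.
    const size_t max_stats_memory = 10 * 1024 * 1024;                  // e.g. a 10 MB budget
    const size_t max_samples      = max_stats_memory / sizeof(MempoolSample);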
< sipa>
wumpus: we don't need another wrapper around urandom though; only a weak alias to a function that always returns ENOSYS, and a change in GetOSEntropy that falls back to GetDevURandom in case getentropy fails
< sipa>
wumpus: or, alternatively, just never use getentropy on Linux
< wumpus>
sipa: not using getentropy on linux would make sense, after all there is specific syscall handling for that
< * wumpus>
is probably not going to make today's meeting, can anyone else chair please?
< BlueMatt>
jonasschnelli: how is #10238 intended to interact with #10235? It looks like it was designed to be complementary, but which performance improvements are you envisioning using 10238 for?
< jonasschnelli>
BlueMatt: I wrote 10238 to ensure we have a fast way to lookup CKeyIds... without this, we have to read each keypool key from the database on each SyncTransaction
< jonasschnelli>
(with HD restore)
< jonasschnelli>
It shouldn't conceptually conflict with your 10235
< BlueMatt>
ahh, ok, i was missing the motivation part, thanks
< jonasschnelli>
I thought the description was good... need to check...
< BlueMatt>
oh, maybe i missed that
< jonasschnelli>
Currently we only keep the keypool index in memory.. the rest is database AFAIK
< morcos>
wumpus: jonasschnelli: does anyone care about the QT option in Custom fee to pay "total at least"?
< morcos>
I've been looking at some of these recently reported issues with transactions paying too high a fee, and they seem likely to be a result of coin selection logic
< morcos>
I think it might make for a cleaner redesign to make all fee calculations in terms of fee rate and then take that into account when selecting inputs the first time around
< jonasschnelli>
morcos: not sure... I'm not sure if we want this feature...
< morcos>
The "total at least" option only applies to pre-selected coins anyway, but it seems like it would make for cleaner code to just dump that
< morcos>
I mean why does anyone care about a minimum amount of fee, only the rate matters, its not like there is a dust issue with fees
< morcos>
Another thing that might be nice to clean up is the fact that setting Custom Fee in the GUI changes the paytxfee used by the command line, that just seems like a mistake to me, but i suppose changing that now might be a surprise to some people
< gmaxwell>
The total at least stuff makes no sense, it should go. Feerate at least, sure. Total no more than, sure.
< morcos>
gmaxwell: hmm... total no more than... you mean just the existing maxtxfee? shoot.. i had forgotten about that
< morcos>
i wanted to reduce each coin by the fee needed to spend it before doing coin selection, and then we could be a bit smarter about which ones we added..
< morcos>
i suppose it would make some sense to have the -maxtxfee failsafe, but maybe that is at a different place in the code
< gmaxwell>
Yes, I think thats at a different place in the code.
< morcos>
no its not now, but i could move it to a different place.. like repeat coin selection if your coin selection violated maxtxfee
< morcos>
i want to also get rid of the knapsack solver
< morcos>
if you have enough inputs that are lower than the target to reach target + CENT, just start randomly adding them until you are > target+CENT, and then you're done
< morcos>
the idea that you are trying to get exactly target + CENT is stupid. target + 3.5 CENTS is just as good or whatever
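(A toy version of the replacement morcos is sketching: shuffle the candidate inputs and keep adding until the total clears target + CENT, rather than knapsack-searching for an exact sum. CAmount and the function name are stand-ins for illustration, not wallet code.)

    // Toy illustration of the idea, not the wallet's actual SelectCoins code.
    #include <algorithm>
    #include <cstdint>
    #include <random>
    #include <vector>

    using CAmount = int64_t;
    static const CAmount CENT = 1000000;  // 0.01 BTC in satoshis

    // Returns the chosen input values, or an empty vector if they can't cover the target.
    std::vector<CAmount> SelectUntilEnough(std::vector<CAmount> coins, CAmount target)
    {
        std::shuffle(coins.begin(), coins.end(), std::mt19937{std::random_device{}()});
        std::vector<CAmount> selected;
        CAmount total = 0;
        for (const CAmount c : coins) {
            selected.push_back(c);
            total += c;
            // Stop as soon as we clear target + CENT; any overshoot just becomes a
            // comfortably-above-dust change output, so target + 3.5*CENT is fine too.
            if (total >= target + CENT) return selected;
        }
        return {};  // the candidate inputs are insufficient
    }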
< gmaxwell>
getting a result with no change is also attractive, but I think that should be a separate algorithm run first.
< morcos>
gmaxwell: hmm.. i think that's rare enough that i'm not sure it's worth trying too hard for
< gmaxwell>
I don't think it should be, at least if we're reasonable about how much fee we'll throw away.
< morcos>
its the way we do that now that is kind of causing the super high fees i think
< BlueMatt>
so apparently the coin selection dialog's "received by" address is, essentially, complete gibberish
< morcos>
but that is fixable while still trying to do that, for sure
< BlueMatt>
and its "tree" (which I thought was supposed to group things ala listaddressgroupings) does not at all do that, and is equally useless
< gmaxwell>
morcos: I've tried a bit and have not managed to make it cause high fees since the changes we made to fix the fees in later passes. How confident are we that there is still an issue?
< morcos>
i think its pretty plausible
< morcos>
if nTotalLower is < target + CENT, then you have a decent chance of the problem happening
< morcos>
once nTotalLower < target + CENT, the knapsack algorithm tries to find you exactly target
< morcos>
which leads to decent chance that your change is dust
< morcos>
so your change gets eliminated
< morcos>
then when you calculate the fee and you need to go re-find inputs which meet target_2 (= target + fee)
< morcos>
you end up choosing fewer different inputs, have a lower fee, and now no change to dump the excess fee on
< sipa>
why does CENT even still occur in the algorithm?
< morcos>
actually, i used to hate that, now i think its the best part of the algorithm!
< morcos>
its the goal size for change, it shouldn't be an exact target, but a minimum as i mentioned above
< morcos>
but it's better to have a target like that, orders of magnitude higher than dust, rather than just randomly creating change outputs that might barely pass dust
< morcos>
what's silly is that if you can't get CENT, then you aim for 0.
< morcos>
but i think the key improvement is to look at the net value of inputs in coin selection given the desired fee rate
< morcos>
i suppose i could have written this all up first (i still will) but my idea was you'd have some sort of economic threshold such as only use inputs whose net_input_value > total_input_value/2, unless your tx confirm target > 48, then you use all inputs whose net_input_value > 0
< morcos>
the current algorithm throws in stupid inputs whose net_input_value is 0
< morcos>
i mean negative
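(A sketch of the net-value idea: value each candidate input net of the fee its spend adds at the desired fee rate, drop anything non-positive, and be pickier at urgent confirm targets. The struct, helper name, and exact threshold rule just mirror the numbers mentioned above and are assumptions.)

    // Illustration only; sizes and the 48-block cutoff come from the discussion above.
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    using CAmount = int64_t;

    struct CandidateInput {
        CAmount value;      // gross value of the UTXO
        size_t spend_size;  // estimated bytes added to the tx to spend it
    };

    // feerate in satoshis per byte, for simplicity
    std::vector<CandidateInput> FilterByNetValue(const std::vector<CandidateInput>& coins,
                                                 CAmount feerate, int confirm_target)
    {
        std::vector<CandidateInput> usable;
        for (const auto& in : coins) {
            const CAmount net = in.value - feerate * static_cast<CAmount>(in.spend_size);
            if (net <= 0) continue;  // spending it costs more than it contributes
            // At urgent (low) confirm targets, skip inputs that would lose more than
            // half their value to fees; above a 48-block target accept any positive net.
            if (confirm_target <= 48 && net < in.value / 2) continue;
            usable.push_back(in);
        }
        return usable;
    }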
< morcos>
btw.. i'm probably out for the meeting today.. so if anyone has any fee estimation questions, feel free to ask before.. i'm still working through comments on PR
< gmaxwell>
morcos: I'd rather do two total solutions, e.g. a change and a no-change solution, and then choose between them post-selection.
< morcos>
What information would you use from the change solution to make that decision? It seems to me either you find a good no-change solution or you don't?
< gmaxwell>
both may be 'less than perfect', e.g. no-change might give away more than we'd like, but change may involve higher fees, so it's no better.
< gmaxwell>
e.g. because the extra output increases the size, and then an extra input is needed... which increases the size further.
< gmaxwell>
and so you'd prefer the no change even though it gave away more than you might normally like, because the other option was worse.
< morcos>
hmm.. so my way of thinking about it, which i thought was your logic, is that a solution that is 10x as big and costs 10x as much is not worse
< morcos>
granted, the change output vs no change is strictly worse
< morcos>
but that in my mind is a bit how you would go about figuring out whether you have an acceptable no change solution
< gmaxwell>
it's true that you'll pay the fees eventually.
< gmaxwell>
but then there is another angle: our strategy should be changing depending on whether we think the feerate is 'high' for the moment, vs 'low'... in the latter case we should optimize for larger transactions.
< morcos>
yeah, that last part is important, but i think a complicated optimization
< morcos>
maybe not ready for that yet
< morcos>
i'm including a tiny bit of that in not being willing to spend most of inputs unless its a low confirm target
< morcos>
but then there get to be all kinds of complicated questions: what kinds of fee rates does this user usually use, compared to current market conditions and the size of the current transaction
< gmaxwell>
it's a fair point that you don't want to tie up all your inputs on a rare 25-confirm-target payment.
< morcos>
that was all kind of a next level in my mind... first level was meant to try to eliminate some stupid edge cases but not deviate too much from existing logic
< morcos>
yeah.. so many ways to grade tx selection
< gmaxwell>
Fair enough.
< instagibbs>
morcos, I think earlier you basically described murch's algorithm, first it does some quick search for "exact match", meaning sum of effective amounts of the inputs that don't create a change output, and then if that fails backs off to random selection
< morcos>
instagibbs: ah yes, ok i'll go reread what he said... i wanted to keep random selection from the lower than target value inputs if possible.. and do them on net value... that seems to me the important improvements
< instagibbs>
sure, that seems to mesh. He said he got 50x more exact matches than Core. Which means it might be worth it.
< instagibbs>
(on a real world scrubbed dataset)
< instagibbs>
I still think 10333 is complementary. Anytime we screw up somehow hitting the feerate we want, we can try and salvage. Hopefully will be much less important at least.
< gmaxwell>
and exploiting no change is important for reducing the utxo set size; I think if payment amounts come from a unimodal distribution, the utxo count will grow exponentially under any algorithm that minimizes input counts and always creates change.
< morcos>
instagibbs: yes, i haven't read the code yet, but i'm not opposed to the idea... but i think if we consider net values for inputs, then there is no such thing as screwing up the fee.
< sipa>
gmaxwell: but aiming for no change probably requires that every wallet permanently has a set of utxos of various sizes
< sipa>
gmaxwell: which on itself may not be ideal
< gmaxwell>
well then there is the extra change output, which I'd like to work out how to do, at least when the change would be large we should split the outputs.
< instagibbs>
morcos, yeah I can't convince myself otherwise, but having a safety net feels better. It's not a lot of logic. Treating the cause is clearly fixing coin selection.
< gmaxwell>
as far as the safety net, I think it's fine if it's just a dumb check at the end that aborts the transfer if it fails.
< gmaxwell>
people get spazzy about automatic fees because they don't know what it could pay.
< gmaxwell>
omg what if it pays all my monies!
< sipa>
cfields: a bit ironic... if you build with new glibc on an old kernel, but later run on a new kernel... using getentropy actually improves things
< cfields>
sipa: ironic how?
< sipa>
cfields: in that it would seem that getentropy is never useful on linux
< sipa>
s/ironic/surprising/
< cfields>
ah
< cfields>
well, it basically just forces the syscall to be tried on all systems, when built on something newish. but yea, the path isn't exactly obvious in all cases :)
< sipa>
we could of course just hardcode the SYS_getrandom constant :p
< * sipa>
ducks
< cfields>
I suppose we could add a weak getrandom() that does the syscall and avoid the loops
< cfields>
haha
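(For reference, the weak-definition idea being floated would look roughly like this; it illustrates the mechanism only, not the eventual compat shim, and the interaction with a dynamically linked libc needs more care than shown.)

    // Illustrative only: a weak getrandom() that forwards straight to the raw
    // syscall. When a libc ships a real (strong) getrandom, that definition is
    // preferred; otherwise this fallback satisfies the symbol.
    #include <cerrno>
    #include <unistd.h>
    #if defined(__linux__)
    #include <sys/syscall.h>
    #endif

    extern "C" __attribute__((weak))
    ssize_t getrandom(void* buf, size_t buflen, unsigned int flags)
    {
    #if defined(__linux__) && defined(SYS_getrandom)
        return syscall(SYS_getrandom, buf, buflen, flags);
    #else
        errno = ENOSYS;  // kernel headers too old to know the syscall number
        return -1;
    #endif
    }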
< cfields>
sipa: i find the logic harder to follow in your version :\
< cfields>
i'm not sure if we want to cascade like that?
< sipa>
ok, just a suggestion
< sipa>
yow
< sipa>
meeting?
< sdaftuar>
did anyone volunteer to chair?
< sipa>
#startmeeting
< lightningbot>
Meeting started Thu May 4 19:01:13 2017 UTC. The chair is sipa. Information about MeetBot at http://wiki.debian.org/MeetBot.
< luke-jr>
BlueMatt: I agree change shouldn't be grouped as it is, but I don't understand how "received by address" is wrong
< BlueMatt>
turns out the coin control dialog is almost entirely useless
< jonasschnelli>
BlueMatt: entierly useless? I disagree
< BlueMatt>
the "received by address" thing works fine for coins you just received, but if it is a change output it walks back the first input until it finds a non-change output
< luke-jr>
BlueMatt: seems plenty useful to me
< BlueMatt>
so it, essentially, picks an address at random if the output is change
< luke-jr>
BlueMatt: I see different "received by address" for change too
< BlueMatt>
and groups by it, which obviously isnt what you want for privacy
< BlueMatt>
luke-jr: for addresses marked as change in the wallet? no
< luke-jr>
BlueMatt: yes..
< BlueMatt>
for random addresses of yours it should work, but not for addresses via getrawchangeaddress
< luke-jr>
I don't use that RPC, but it works for normal change
< sipa>
it sounds to me like it is doing a mix of "receive by address" and "linked grouping"
< sipa>
both are perhaps useful
< BlueMatt>
more importantly, really, is that I've repeatedly seen the tree mode of the coin picker dialog described as the same as listaddressgroupings, which it is clearly not
< BlueMatt>
sipa: well the change-output-results-in-random-grouping thing is kinda strange
< sipa>
right, it shouldn't walk for receive address
< luke-jr>
BlueMatt: I have nfc what that code does, but it *looks* right in the end window :/
< sipa>
and it should alwaya walk (to some deterministic representative) for grouping
< sipa>
*alwaya
< sipa>
*alwayS
< BlueMatt>
*always
< BlueMatt>
but, yea, anyway, I think this should really emulate the grouping rpc
< BlueMatt>
the "where did these coins come from" question is not really useful for anything but coins you just got, in which case they will already be ungrouped
< sipa>
a "received by address" is still useful i think, but it's not the same as grouping
< kanzure>
hi.
< BlueMatt>
yes, but if it is not received directly, it should be "Change"
< jonasschnelli>
I have worked out a solution that seems to work for encrypted and encrypted&pruned wallets.
< * BlueMatt>
wonders why we can't have a derivation key which is not encrypted in our wallet so that we don't have to pause sync
< jonasschnelli>
It can halt the sync / validation progress.
< jonasschnelli>
But,.. I'm not sure what gap limit and default keypool size we should use
< BlueMatt>
(to generate new keys)
< jonasschnelli>
100 and 20 seems very little
< sipa>
BlueMatt: that requires non-hardened derivation
< jonasschnelli>
BlueMatt: IMO encryption (private key wallets) with public key derivation should be avoided
< BlueMatt>
sipa: yes, is that a concern?
< sipa>
BlueMatt: yes, we have a dumpprivkey command
< jonasschnelli>
We should aim – long-term – for watchonly-hd (see NicolasDorier's work) and add a signing-agent ala gpg / ssh
< sipa>
leaking one private key means you leak your whole wallet
< BlueMatt>
jonasschnelli: why? your list of funds is already public to the encrypted wallet holders? that wouldn't change?
< jonasschnelli>
But that is already the next topic. :)
< luke-jr>
sipa: dumpprivkey isn't supposed to be safe
< sipa>
luke-jr: i know
< luke-jr>
but we could make it fail for non-hardened keys?
< sipa>
luke-jr: but it breaks expectations
< sipa>
people have a mental model about how it works
< BlueMatt>
sipa: maybe I'm confused on the format of HD...seems you can build a list of derivation secrets which is based on a non-encrypted private key which is unexportable?
< BlueMatt>
sipa: and then that would not be the case
< jonasschnelli>
Yes. But I vaguely remember that we once said we don't want to mix private-key wallets with public key derivation... and this makes very much sense to me
< BlueMatt>
(dont know the format of HD, but we could do something else if its way better)
< sipa>
BlueMatt: if you have the parent public key + private child key, you can compute all private child keys
< jonasschnelli>
If we would do child pubkey derivation, keypools could be removed (at least for HD)
< sipa>
BlueMatt: that is an inevitable weakness of EC based derivation
< jonasschnelli>
What sipa said
< sipa>
BlueMatt: and it is reason why bip32 has hardened keys
< sipa>
which core uses
< BlueMatt>
sipa: even if your list of privkeys is based on adding a new random value to the previous privkey where the new random value is just a hashchain of a private secret?
< sipa>
BlueMatt: then you lose public derivation
< sipa>
(the ability to compute child pubkeys without knowing the parent privkey)
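(The BIP32 detail behind this, slightly simplified: under non-hardened derivation the child key differs from the parent only by a publicly computable tweak.)

    % Non-hardened (public) BIP32 derivation, simplified; I_L is the left half of
    % HMAC-SHA512 keyed with the parent chain code c_par over (K_par, i).
    k_i = k_{par} + I_L(c_{par}, K_{par}, i) \pmod{n}, \qquad K_i = k_i \cdot G
    % So anyone holding K_par and c_par can compute every K_i (the feature), but
    % given any single child private key they can also recover the parent:
    k_{par} = k_i - I_L(c_{par}, K_{par}, i) \pmod{n}
    % Hardened derivation hashes k_par instead of K_par, which blocks this
    % recovery but also removes public derivation.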
< BlueMatt>
lets take this offline
< sipa>
sounds good
< jonasschnelli>
back to the topic: what GAP limit should we enforce by default?
< BlueMatt>
1000
< BlueMatt>
default keypool 10k
< jonasschnelli>
Yeah.. I like this
< jonasschnelli>
But only for encrypted wallets?
< jonasschnelli>
IMO we should (only encrypted)
< sipa>
but in general i believe that most cases where the public derivation is wanted, just use huge precomputed key lists
< jonasschnelli>
non-encrypted can stay with 100
< sipa>
jonasschnelli: meh
< sipa>
why bother differentiating?
< gmaxwell>
why only encryption? I don't think that makes sense.
< jonasschnelli>
True, the gap limit must be the same.. right
< jonasschnelli>
sry
< gmaxwell>
less complexity please, and keys are cheap.
< jonasschnelli>
Okay, ... I'll bump it to 1000
< bitcoin-git>
[bitcoin] jtimon opened pull request #10339: Optimization: Calculate block hash less times (master...b15-optimization-blockhash) https://github.com/bitcoin/bitcoin/pull/10339
< jonasschnelli>
Next question: the tests are mostly running with a keypool size of 1... so the gap limit stuff is only enforced for the new test in #10240
< BlueMatt>
gmaxwell: needs more review, could use side-by-side benchmarks incl: memory usage, disk usage, performance numbers
< sipa>
yes, i'm planning to do benchmarks
< gmaxwell>
BlueMatt: "much faster"
< sipa>
other todos are better upgrade code (with a fancy progress bar...), that doesn't leave gigabytes of uncompacted data in the chainstate
< sipa>
but i believe it is functionally complete and tested
< BlueMatt>
alright, if there are no more topics I'd implore people to keep reviewing the big 0.15 things, since it looks like we're gonna slip a few, which is sad
< sipa>
it seems to make the chainstate some 20% larger
< sipa>
i'll report numbers on the PR, no need to discuss here
< sipa>
BlueMatt: ack
< sipa>
#endmeeting
< lightningbot>
Meeting ended Thu May 4 19:32:31 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
< jtimon>
yeah, I was just wondering if further preparations could be separated, to make review less scary, but in the end they will be the same commits
< * BlueMatt>
votes for 10199 as most-critical user-facing feature, maybe multiwallet too
< cfields>
I think #10189 is ready to go, and it's blocking a few people. jtimon at least.
< gribble>
https://github.com/bitcoin/bitcoin/issues/10189 | devtools/net: add a verifier for scriptable changes. Use it to make CNode::id private. by theuni · Pull Request #10189 · bitcoin/bitcoin · GitHub
< sipa>
i could use 10189 for some of the changes in 10195 as well
< sipa>
jtimon: looks like something for developer-notes.md
< cfields>
jtimon: thanks!
< cfields>
jtimon: not sure what 3) is saying though
< jtimon>
sipa: right, well, cfields feel free to use whatever parts of it make sense to you
< cfields>
jtimon: lgtm, I'm just not following that one part
< jtimon>
well, I needed point 3 because I was sometimes editing the script trying to do more specific things, so I didn't want to keep changes done by an earlier version of the script or by me manually, because I'm stupid and forget that the commit should only do what the script does and nothing else
< jtimon>
maybe we can keep it out, I don't know
< cfields>
jtimon: oh, i see.
< cfields>
jtimon: maybe just say "1) Writer: apply your script to a clean tree and commit with an appropriate message:"
< cfields>
then list the other steps. I was confused because you mention committing before running the script :)
< praxeology>
While upgrading the chainstate to keying on id+o.ix, does it pretty much freeze the init process of the program?
< sipa>
praxeology: yeah...
< sipa>
praxeology: and it takes a while
< praxeology>
Sounds like you are using the same db, not trying to make a new one?
< sipa>
indeed
< jtimon>
cfields: yeah that may do it without explicitly saying the "git checkout ." trick
< sipa>
praxeology: it's an in-place upgrade, and it is interruptible
< praxeology>
interruptible? so you make the new entries, then delete the old ones? and if interrupted, you delete the new ones?
< sipa>
it adds and deletes simultaneously, atomically
< sipa>
you can just kill it at any point
< praxeology>
so the new code can handle lookup and editing of both kinds of entries?
< praxeology>
I guess it doesn't matter that much if I understand how it works if I'm not going to help code review/test/code the progress UI
< sipa>
praxeology: no, at startup it just iterates through all old entries and converts them
< sipa>
if you interrupt during that conversion, you need to continue later
< sipa>
but by interruptible i mean you don't lose progress from doing that
< praxeology>
"need to continue later" or else your client doesn't process new blocks/transactions?
< gmaxwell>
when the software restarts it will continue where it left off.
< gmaxwell>
this is all in the init process before it even makes any connections to anything.
< sipa>
praxeology: by interrupt i mean you quit during the conversion
< sipa>
praxeology: the node can't do anything until the conversion is complete
< praxeology>
ok I see
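(A very rough sketch of the kind of conversion loop being described, written directly against the leveldb API; the old per-tx 'c' key prefix, the new per-output records, and the DecodeOldCoins helper are assumptions for illustration, and error handling is omitted.)

    // Sketch only: converts old entries to new ones in batches, deleting each old
    // key in the same atomic batch that adds its replacements, so killing the
    // process mid-way just means the remaining old entries are converted on the
    // next start.
    #include <memory>
    #include <string>
    #include <utility>
    #include <vector>
    #include <leveldb/db.h>
    #include <leveldb/write_batch.h>

    // Hypothetical helper: expand one old per-transaction record into the
    // corresponding new per-output (key, value) records.
    std::vector<std::pair<std::string, std::string>>
    DecodeOldCoins(const leveldb::Slice& key, const leveldb::Slice& value);

    void UpgradeInPlace(leveldb::DB& db)
    {
        std::unique_ptr<leveldb::Iterator> it(db.NewIterator(leveldb::ReadOptions()));
        leveldb::WriteBatch batch;
        size_t batched = 0;
        for (it->Seek("c"); it->Valid() && it->key()[0] == 'c'; it->Next()) {
            for (const auto& rec : DecodeOldCoins(it->key(), it->value())) {
                batch.Put(rec.first, rec.second);  // new-format entry
            }
            batch.Delete(it->key());  // old entry removed atomically with the adds
            if (++batched >= 1000) {
                db.Write(leveldb::WriteOptions(), &batch);
                batch.Clear();
                batched = 0;
            }
        }
        db.Write(leveldb::WriteOptions(), &batch);  // flush the final partial batch
    }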
< sipa>
it would be possible to do the conversion on the fly, but it would result in a long term slowdown
< praxeology>
also more complicated code that only need be used once
< praxeology>
Are you making the upgrade optional?
< sipa>
no
< sipa>
the new database code cannot read or write the old format anymore, and supporting that would be complicated and slow
< gmaxwell>
It's optional: don't use newer software if you don't want it upgrading.
< gmaxwell>
:)
< praxeology>
Alrighty... I can see why this would be a delayed commit to the main branch.
< praxeology>
I wish I could clone myself
< gmaxwell>
praxeology: we do not do development in master. (few large OSS projects that use Git do)-- things that go into master are at least believed to be done and more or less releasable.
< praxeology>
Yea I know. I am saying that I see why this feature is/may be pushed back to later and later release versions due to the requirement of locking up the node for... how long does it take to process that 2GB of data?
< sipa>
maybe an hour on slow hardware
< sipa>
slow disk, mostly
< gmaxwell>
15 minutes on fast hardware, I'd guess?
< sipa>
yeah
< praxeology>
and then is leveldb optimized to handle such a db transformation? And does it compress well after deleting all the old entries?
< sipa>
nope, it seems
< sipa>
but there is an explicit call to ask it to compact
< sipa>
which i'll experiment with
< praxeology>
Why not create a new db?
< sipa>
Why create a new db?
< praxeology>
if leveldb has bad performance w/ this sort of transformation
< sipa>
calling db.CompactRange(); is much simpler than dealing with multiple database at once during the conversion
< praxeology>
because you want atomic?
< gmaxwell>
praxeology: it doesn't have bad performance.
< sipa>
praxeology: no...
< sipa>
praxeology: the argument you're bringing up in favor of multiple databases is because leveldb doesn't deal well with the conversion
< sipa>
i offer an alternative to having multiple databases at once which also deals well with the conversion
< sipa>
which is simpler
< praxeology>
ok
< gmaxwell>
You need to ask it to free the space or its very lazy about it, thats all.
< gmaxwell>
Unrelated, while watching a debug=1 logging rescan.. why is it battering out leveldb activity during rescan?
< praxeology>
ah, so CompactRange() will just clean up a sorted key range of the db? Which could be done incrementally along with a process that alters the format with a sorted key iterator
< gmaxwell>
the repeatedbash is some instrumentation I added to count repeated hashings of the block header.. every 6 or so of those is processing a new block.
< gmaxwell>
praxeology: ya, thats what he's going to have it do now.
< sipa>
gmaxwell: that looks like just background compactions
< gmaxwell>
e.g. every time the leading byte of the key changes, compact.
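(i.e. something along these lines during the conversion, using leveldb's explicit compaction call; the per-leading-byte split is just the heuristic gmaxwell mentions.)

    // Ask leveldb to reclaim the space of the already-deleted old entries for one
    // key prefix, so disk usage shrinks incrementally instead of all at the end.
    #include <string>
    #include <leveldb/db.h>

    void CompactPrefix(leveldb::DB& db, char prefix)
    {
        const std::string begin(1, prefix);
        const std::string end(1, static_cast<char>(prefix + 1));
        const leveldb::Slice b(begin), e(end);
        db.CompactRange(&b, &e);
    }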
< praxeology>
I've heard rumors that leveldb is unreliable. Is that still true?
< gmaxwell>
praxeology: a lot of those rumors are a result of several things: (1) (primary) leveldb has aggressive checking and will actually catch lower-level corruption that other things miss, (2) there is no windows filesystem interface for leveldb and the one we previously used was incorrect, leading to unnecessary corruption with unclean shutdowns, (3) there is a usenix paper that panned leveldb for not handling crashes given worst-case behavior of the VFS contract that they inferred.
< gmaxwell>
(3) appears to be completely inconsequential in practice, e.g. I left systems running for months in constant high load plus crash loops and it always recovered.
< gmaxwell>
When we do get reliability reports from users we sometimes investigate them in detail, and frequently find actual disk corruption.
< gmaxwell>
Unfortunately, there is a lot of windows-grade hardware out there.
< sipa>
or overheating CPUs
< praxeology>
I would guess memory failure would be more common than cpu failure in my experience
< gmaxwell>
yes, esp laptops, many brands of laptop cannot be run at full speed for an hour at a time without actually ending up with apparent 'memory' corruption. (presumably cpu caches).
< sipa>
praxeology: i would think so too, but that's not very consistent with what we're seeing (for example, for some people, running with -par=1 actually improves things)
< gmaxwell>
well it could be ram which behaves more poorly with the cpu taxing the power rails and making things hot. who knows.
< praxeology>
I've had terrible experience with cheap consumer memory from lower cost computer brands, when operating under load
< sipa>
i think we're seeing fewer db corruptions than a few years ago
< praxeology>
Are you saying there are still issues with crash recovery in windows?
< sipa>
that may be due to better windows env support in leveldb
< sipa>
or due to people running on better hardware
< sipa>
or just fewer people running bitcoin core on windows
< sipa>
not all corruption reports are on windows, to be clear
< sipa>
but historically those were a majority
< praxeology>
I have like 3 windows machines running core on windows, and they haven't corrupted. But I have all very reliable machines that don't crash or shutdown unexpectedly
< sipa>
praxeology: ha, that's good to know
< praxeology>
Alright well best of luck w/ the utxo index change. I'd consider helping test or maybe doing some UI work on it. But I am very busy
< sipa>
thanks regardless!
< praxeology>
Maybe if you point me to the code and ask me for help directly I'll add it to my todo list and maybe I will do it if my priorities make it there.
< BlueMatt>
jonasschnelli: you might kill me for #10340, sorry :/