< NicolasDorier>
the problem that I found out by trying that
< NicolasDorier>
is that I was not expecting a flush to wipe the cache clean
< NicolasDorier>
I thought only dirty entries in the cache would be committed into a batch
< sipa>
i think you misunderstand what this does
< sipa>
it's not about limiting the number of dirty entries
< NicolasDorier>
This flushes the CoinViewCache when certain conditions arise
< NicolasDorier>
right ?
< sipa>
it's splitting them up over different batches
< sipa>
we want to limit the size of the batches, as they cause a memory blowup
< NicolasDorier>
I know
< NicolasDorier>
My point is that we can prevent a batch from being too big at the CoinViewCache level
< sipa>
no
< NicolasDorier>
I was assuming CoinViewCache was not deleting all cached coins during a flush
< NicolasDorier>
I was wrong on that
< sipa>
yeah, that's weird
< sipa>
but we've tried many things to change that, and they all make things slower
< NicolasDorier>
if it was not deleting all cached coins during a flush, then we could flush when the CoinViewCache knows that a batch would be too big
< NicolasDorier>
ha I see
< sipa>
that's an eventual goal here
< sipa>
but this is doing something much more basic: not requiring that flushes are consistent with blocks
< sipa>
let me take a step back
< NicolasDorier>
I understood your PR and the goal of it
< NicolasDorier>
I thought there was a simpler way of doing it, but I relied on the assumption that a flush did not throw away all the cachedCoins
< NicolasDorier>
if my assumption was right, fixing the problem you attempt to solve would have been way easier.
< sipa>
the reason why we have the cache, and go through all this effort (as opposed to just using leveldb's caching), is that we can do an awesome optimization: if a utxo is created in the cache, and spent before it is ever flushed, we can just delete it from the cache, preventing it from ever being serialized or written to disk
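[To illustrate the optimization sipa describes, a minimal sketch. This is not the real CCoinsViewCache from src/coins.h, which tracks this with DIRTY/FRESH flags on per-entry state; the names and types here are simplified stand-ins:]

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct CacheEntry {
    int64_t value;
    bool dirty;  // state differs from what is on disk
    bool fresh;  // created in the cache; the disk has never seen it
    bool spent;
};

class CoinCache {
    std::unordered_map<std::string, CacheEntry> cacheCoins;

public:
    void AddCoin(const std::string& outpoint, int64_t value) {
        // A coin born in the cache is both dirty and fresh.
        cacheCoins[outpoint] = CacheEntry{value, true, true, false};
    }

    void SpendCoin(const std::string& outpoint) {
        auto it = cacheCoins.find(outpoint);
        if (it == cacheCoins.end()) return;  // (real code would fetch from disk)
        if (it->second.fresh) {
            // Created and spent between two flushes: the disk never needs
            // to hear about this coin, so it is simply forgotten.
            cacheCoins.erase(it);
        } else {
            it->second.spent = true;  // the deletion must reach disk
            it->second.dirty = true;
        }
    }

    // Only dirty entries are committed into the batch, and (as NicolasDorier
    // found out above) the flush then wipes the cache entirely.
    std::vector<std::pair<std::string, CacheEntry>> Flush() {
        std::vector<std::pair<std::string, CacheEntry>> batch;
        for (const auto& kv : cacheCoins) {
            if (kv.second.dirty) batch.push_back(kv);
        }
        cacheCoins.clear();
        return batch;
    }
};

int main() {
    CoinCache cache;
    cache.AddCoin("txid:0", 50);
    cache.SpendCoin("txid:0");
    std::printf("batch entries: %zu\n", cache.Flush().size());  // prints 0
}
```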
< NicolasDorier>
yes I am aware of it
< sipa>
that means that we must maximize the time a utxo is in the cache before being flushed
< sipa>
what you're suggesting is limiting the amount of dirty entries in the cache
< NicolasDorier>
right
< sipa>
that would radically reduce our ability to use that optimization above
< NicolasDorier>
yes because at every flush
< NicolasDorier>
you throw away all the cachedCoins
< sipa>
no
< sipa>
that has nothing to do with it
< sipa>
it's more fundamental
< NicolasDorier>
ah yes let me think
< sipa>
you're suggesting more frequent flushes (which don't wipe the cache, just mark the written entries as non-dirty, right?)
< NicolasDorier>
yes
< sipa>
more frequent flushes would reduce the time between when a utxo is created and when it hits disk
< NicolasDorier>
aaaaah
< NicolasDorier>
got it
< sipa>
at that point, it's too late
< NicolasDorier>
yes, I understand now
< NicolasDorier>
I'll remove my comment
< NicolasDorier>
removed
< sipa>
so my eventual goal is to have continuous flushing that indeed only writes small amounts of entries
< sipa>
but not by limiting the amount of dirty entries
< NicolasDorier>
mmh so how?
< sipa>
just have sort of a rolling window... once an entry has been in the cache without being written, flush it
< sipa>
however, that requires that the disk cache is allowed to be in an inconsistent state
< sipa>
and needs replay to correct at startup
< sipa>
which this PR is the first step towards (and a great memory usage improvement on its own)
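[A hedged sketch of that rolling-window idea; this is the eventual goal sipa describes, not code from the PR, and every name here is invented:]

```cpp
#include <cstdint>
#include <deque>
#include <string>
#include <unordered_map>
#include <utility>

struct Entry {
    int64_t value;
    bool dirty;
};

class RollingFlushCache {
    std::unordered_map<std::string, Entry> entries;
    std::deque<std::pair<int64_t, std::string>> arrivals;  // (height added, key)

public:
    void Add(const std::string& key, int64_t value, int64_t height) {
        entries[key] = Entry{value, true};
        arrivals.emplace_back(height, key);
    }

    // Continuously write out entries that have sat unflushed for a full
    // window, marking them clean instead of evicting them. These small
    // batches no longer line up with block boundaries, so the on-disk
    // state can be mid-block, which is why startup replay is needed.
    template <typename WriteFn>
    void FlushOlderThan(int64_t tip_height, int64_t window, WriteFn write) {
        while (!arrivals.empty() &&
               arrivals.front().first + window <= tip_height) {
            auto it = entries.find(arrivals.front().second);
            if (it != entries.end() && it->second.dirty) {
                write(it->first, it->second);
                it->second.dirty = false;  // stays cached, just no longer dirty
            }
            arrivals.pop_front();
        }
    }
};
```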
< NicolasDorier>
I see
< sipa>
it's a very unusual cache structure, which only works because we mostly create and delete recently created entries
< sipa>
and only write them once, read them once, delete them once
< sipa>
most caches are optimized for many reads
< NicolasDorier>
ok this is clear thanks. So I am fine with 10148, would just like to see a simple python test though
< sipa>
agree, if you have ideas for what it should do, let me know
< NicolasDorier>
sipa: A dbbatchsize of 1 byte, dbcrashratio of 10. Check if the node can eventually sync 200 blocks and have the same utxoset hash as the other node.
< NicolasDorier>
does not test the forking logic though
< sipa>
ah, yes
< sipa>
well we could construct forks as well, i guess
< NicolasDorier>
yes, I don't think this is very difficult to test. Just the dbbatchsize being in MB makes it inconvenient
< NicolasDorier>
maybe having a magic number just for the tests would be enough
< sipa>
agree
< sipa>
it can be in bytes
< sipa>
it's a test-only option really
< NicolasDorier>
yes indeed
< xinxi>
gmaxwell: Please check my PM.
<@wumpus>
indeed, no specific reason that abortrescan should not be allowed in safe mode, though should anything that triggers a rescan be allowed?
<@wumpus>
safe mode = the block chain is in uncertain state
< gmaxwell>
sipa: I think for testing the atomic flushing, what we really need is to insert a few bugs in the code and see that the tests catch them... at least that would give some feel for how adequate the tests are.
< jonasschnelli>
gmaxwell: You once mentioned that there are better filters for "block filters" than bloom. I think what mostly matters is the compactness. What filter type would you recommend?
< bitcoin-git>
bitcoin/master e367ad5 John Newbery: [tests] rename nodehandling to disconnectban
< bitcoin-git>
[bitcoin] laanwj closed pull request #10143: [net] Allow disconnectnode RPC to be called with node id (master...disconnect_node_by_id) https://github.com/bitcoin/bitcoin/pull/10143
< sipa>
jonasschnelli: the optimal datastructure is something like a bloom filter, but with only 1 hash function, and then instead of storing the bits directly, store the distances between the 1s
< jonasschnelli>
sipa: okay.. I'll try to parse your answer.. give me some days. :)
< sipa>
it's 44% smaller than bloom filters, afaik
< sipa>
but much much slower to look things up
< sipa>
as you need to decompress the data
< jonasschnelli>
So just use a single MurmurHash? I need to understand what you mean by "store the distances between the 1s". Let me think a bit about it.
< sipa>
if you only have 1 hash function, there will be a few 1s and many 0s in your filter
< sipa>
which can be compressed using run length encoding
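[A toy version of that structure, for the curious. This single-hash, gap-coded design is essentially what later became the Golomb-coded sets of BIP 158; the parameters and the use of std::hash below are illustrative stand-ins, not the real keyed hash or any spec:]

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Append the Golomb-Rice code of `gap` (parameter 2^p) to a bit vector:
// the quotient in unary, a terminator bit, then p remainder bits.
void RiceEncode(std::vector<bool>& bits, uint64_t gap, int p) {
    uint64_t q = gap >> p;
    for (uint64_t i = 0; i < q; ++i) bits.push_back(true);
    bits.push_back(false);
    for (int i = p - 1; i >= 0; --i) bits.push_back((gap >> i) & 1);
}

// Hash every item into a sparse domain with a single hash function, sort
// the positions, and store only the distances between consecutive 1s.
std::vector<bool> BuildFilter(const std::vector<std::string>& items,
                              uint64_t domain, int p) {
    std::vector<uint64_t> positions;
    positions.reserve(items.size());
    for (const auto& item : items) {
        // Stand-in hash; a real filter would use a keyed hash (e.g. SipHash).
        positions.push_back(std::hash<std::string>{}(item) % domain);
    }
    std::sort(positions.begin(), positions.end());

    std::vector<bool> bits;
    uint64_t last = 0;
    for (uint64_t pos : positions) {
        RiceEncode(bits, pos - last, p);  // distance to the previous 1
        last = pos;
    }
    return bits;
}
```

[Querying a single item means decoding the gaps back into positions, which is the decompression cost sipa mentions above.]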
< jonasschnelli>
Okay. Got that part.
< gmaxwell>
jonasschnelli: I think this stuff is a waste of time. even with committed filters the user's privacy is significantly harmed. Saving 14kb/s hardly seems worth it.
< jonasschnelli>
gmaxwell: You propose to just scan all blocks?
< gmaxwell>
sipa: as for slower, I doubt it; even for a pretty huge filter the decompression time would be insignificant compared to transfer time.
< gmaxwell>
jonasschnelli: scan blocks since the creation of the keys.
< sipa>
gmaxwell: i mean cpu time
< jonasschnelli>
gmaxwell: Yes. But catching up two weeks on a cellphone would result in 144 blocks/day * 14 days * ~1MB ≈ 2GB of data...
< jonasschnelli>
Resulting in users sticking to the current BF SPV model
< gmaxwell>
users don't use that in any large number now.
< gmaxwell>
jonasschnelli: already almost no one uses multibit/android wallet due to poor performance, even though it completely destroys the user's privacy. You cannot compete with server based lookup which is what almost everyone uses.
< gmaxwell>
also, why would the cellphone be two weeks out of date? should be catching up in the background...
< jonasschnelli>
Hmm... well, .. SPV BF is pretty fast and I think it's used by a large user group... though, I don't have reliable numbers.
< sipa>
what is BF?
< jonasschnelli>
gmaxwell: Catching up data in the background puts your app in a different app-group
< jonasschnelli>
BF = bloom filter
< jonasschnelli>
gmaxwell: the group that has the significant warning about battery consumption. :)
< jonasschnelli>
sipa: It may be different on Android. But on Apple's iOS, if you use background activity, you'll end up in a different app-group, resulting in a different review process... I think you need to add warnings to your app description, etc.
< gmaxwell>
sipa: I fail to see how cpu time alone is interesting for what is being discussed. :)
< sipa>
gmaxwell: that chart is just reachable nodes
< sipa>
no?
< gmaxwell>
sipa: no.
< gmaxwell>
sipa: the opposite of that, in fact.
< sipa>
gmaxwell: i was discussing the data structure, not the use case
< gmaxwell>
ah well, you wouldn't use it for any of the things you normally use a bloomfilter for.
< gmaxwell>
Though FWIW, the cuckoo-like filter also is close to capacity achieving (at least if you have a high enough N to allow a high fill rate) and works incrementally.
< jonasschnelli>
Hmm... is it entirely stupid to only extend and check the keypool keyrange (HD restore) if the wallet's bestblock lags some blocks behind the chaintip? I think a check during init could give an indication of whether the wallet is in restore mode or not.
< jonasschnelli>
Alternatively we could add an option hdalwayscheckkeypool=1
<@wumpus>
what would be the rationale behind doing that only then?
< jonasschnelli>
wumpus: a) performance for unencrypted wallets, b) [more important] encrypted wallets would require an unlock during the time of the scan
<@wumpus>
"encrypted wallets would require a unlock during the time of the scan" but only if the scan notices that new keys are needed, right?
< jonasschnelli>
for bitcoind, there is the problem of how to warn the user if the gap limit is reached, because at that point he would need to unlock the wallet in order to continue scanning
< jonasschnelli>
wumpus: Right.
<@wumpus>
yes I understand there's a notification problem there. The GUI could just pop up a dialog, not so much for bitcoind
< jonasschnelli>
And... to do a precautionary scan, you probably should use a gap limit of 100 (configurable).
< jonasschnelli>
Yes. The GUI way is much simpler.. but even there. Do we always want to extend up to a default gap limit (even in normal operations)?
<@wumpus>
so yes it may make sense to have a separate 'wallet is reconstructing' mode
< jonasschnelli>
Because someone may have handed out 100+ addresses and want to make sure he catches all of them in a HD rescan
< jonasschnelli>
Though, most people probably don't want to auto-extend their keypool over 100+
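[A hedged sketch of the gap-limit bookkeeping under discussion; this is a hypothetical helper, not the actual CWallet keypool code:]

```cpp
#include <cstdint>

struct KeypoolState {
    int64_t highest_generated;  // last child index we have derived
    int64_t highest_used;       // highest index seen receiving funds so far
};

// After the rescan marks `index` as used, return how many new keys must be
// derived so that `gap_limit` unused keys lie ahead of it. Deriving them is
// the step that requires an unlock when the wallet is encrypted.
int64_t KeysToDerive(KeypoolState& kp, int64_t index, int64_t gap_limit) {
    if (index > kp.highest_used) kp.highest_used = index;
    const int64_t target = kp.highest_used + gap_limit;
    if (target <= kp.highest_generated) return 0;
    const int64_t need = target - kp.highest_generated;
    kp.highest_generated = target;
    return need;
}
```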
<@wumpus>
this flag could also be stored in the wallet (like the reindex flag in the utxo db) instead of determining this based on the wallet's bestblock
< jonasschnelli>
but if you load an initial HD wallet backup, you probably can only identify the possible backup by comparing the bestblock against the chaintip
<@wumpus>
one argument against treating reconstruction specially would be simultaneous use of the wallet on different machines. I know, we don't support this, but then it could be detected.
<@wumpus>
jonasschnelli: that's true
< jonasschnelli>
Yes. That.
< jonasschnelli>
Asking the user makes sense... (GUI).
<@wumpus>
yes
< jonasschnelli>
"Wallets is out of sync, do you want to restore a backup?"
< jonasschnelli>
Then extend the keypool +1000 or ask about the previous usage
<@wumpus>
#10231 gives me a compile error : bitcoin/src/qt/clientmodel.h:85:30: error: implicit instantiation of undefined template
< morcos>
gmaxwell: fee estimation currently does not use mempool queue (nor in the improvements for 0.15) it's an idea that i've been contemplating since the beginning, but i never settled on a design that i thought met all the criteria
< morcos>
balancing performance, usefulness, and security is hard.
< bitcoin-git>
[bitcoin] jonasschnelli opened pull request #10238: Change setKeyPool to hold flexible entries (master...2017/04/keypool_fix_a) https://github.com/bitcoin/bitcoin/pull/10238
< jonasschnelli>
(INFO): Tests successful... but signmessages.py failed, Duration: 3 s
< jonasschnelli>
if proc.returncode == TEST_EXIT_PASSED and stderr == "":
< jonasschnelli>
the latter if statement
< jonasschnelli>
or at least split by newline and pass if all lines start with /Warning/
< jonasschnelli>
(or a clever regex)
<@wumpus>
stderr == "" should go
<@wumpus>
return code should be what determines whether a test passed
<@wumpus>
anything else is insane
< jonasschnelli>
I think so. Tests may be successful even if there is something in stderr
< jonasschnelli>
Okay. I'll PR
< bitcoin-git>
[bitcoin] jonasschnelli opened pull request #10241: Allow tests to pass even when stderr got populated (master...2017/04/test_stderr) https://github.com/bitcoin/bitcoin/pull/10241
< gmaxwell>
I think the sanitizer stuff is only useful in our current test harnesses because we fail on stderr output.
< luke-jr>
ah
<@wumpus>
sanitizer stuff?
< gmaxwell>
TSAN/ASAN/UBSAN.
<@wumpus>
do we use that in travis?
< jonasschnelli>
Well, we could add a test_runner argument (fail_on_stderr) if someone wants to use that with sanitizer
< sdaftuar>
meeting time?
<@wumpus>
but ok, at least I understand why the stderr check is there now, it's for private test runs with sanitizer?
< jonasschnelli>
however.. meeting
<@wumpus>
#startmeeting
< lightningbot>
Meeting started Thu Apr 20 19:02:12 2017 UTC. The chair is wumpus. Information about MeetBot at http://wiki.debian.org/MeetBot.
< luke-jr>
IIRC one of the sanitisers used to require a special env var to cause an exit
< gmaxwell>
Not yet, only with some not yet merged PRs are we finally TSAN clean, but many of us run it locally and it has found real bugs. I'm not protesting, but just bringing up the one thing I remember that interacts with that assumption.
< luke-jr>
but I can't find that now (some do need an extra build option tho)
< luke-jr>
jonasschnelli: I can put it in Knots 0.14.1
< jonasschnelli>
luke-jr: Yes. Do that.
<@wumpus>
jonasschnelli: is that even tagged for backport? anyhow, tag it for 0.14.2 I'd say
< jonasschnelli>
wumpus: Yeah. I tagged (not the project though).. 0.14.2 is good IMO.
<@wumpus>
jonasschnelli: ok!
<@wumpus>
next topic?
< * jonasschnelli>
damns cs_main
<@wumpus>
jonasschnelli: if it's any consolation, many projects had a similar issue with a central lock
< * luke-jr>
coughs at Python
<@wumpus>
I was thinking about the Big Kernel Lock, but yes, python is guilty too
< jonasschnelli>
wumpus: Yes. I guess there is much room for optimisation.
< gmaxwell>
There has been some interesting discussion on github related to the wallet's handling of address reuse and dust and whatnot. anyone interested in that subject might want to check out the discussion on #10233 and PRs linked from there.
< gmaxwell>
jonasschnelli: on locking we need some better lock profiling. If we had some instrumentation that yelled any time lock contention caused >100ms delays, we'd probably find a number of things to fix.
< jonasschnelli>
gmaxwell: Yes. That!
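[A sketch of the kind of instrumentation gmaxwell means; this is a hypothetical wrapper, not Bitcoin Core's own DEBUG_LOCKCONTENTION logging. It times how long acquiring a lock blocks and yells past a threshold:]

```cpp
#include <chrono>
#include <cstdio>
#include <mutex>

class ContentionLock {
    std::mutex m;

public:
    void lock(const char* site) {
        using namespace std::chrono;
        const auto t0 = steady_clock::now();
        m.lock();  // blocks while another thread holds the lock
        const auto waited =
            duration_cast<milliseconds>(steady_clock::now() - t0).count();
        if (waited > 100) {
            std::fprintf(stderr, "lock contention at %s: waited %lld ms\n",
                         site, static_cast<long long>(waited));
        }
    }
    void unlock() { m.unlock(); }
};

int main() {
    ContentionLock lock;
    lock.lock("example-site");  // uncontended here, so nothing is printed
    lock.unlock();
}
```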
< gmaxwell>
I don't think cs_main is itself really the issue there... just not carefully avoiding it via things like caches.
< jonasschnelli>
gmaxwell: I was printf profiling yesterday
< luke-jr>
I looked into disabling address reuse and it looks harder than I'd like :/
< morcos>
i would like to briefly discuss fee estimation (maybe as separate topic)
< gmaxwell>
on the blocker PRs-- I'm kinda lost where we are with non-atomic writes.
< gribble>
https://github.com/bitcoin/bitcoin/issues/10179 | Give CValidationInterface Support for calling notifications on the CScheduler Thread by TheBlueMatt · Pull Request #10179 · bitcoin/bitcoin · GitHub
< BlueMatt>
gm2053: i think it's ready for review now?
< morcos>
gmaxwell: #10148 in its current form without multi head just needs more review i think
< gribble>
https://github.com/bitcoin/bitcoin/issues/10179 | Give CValidationInterface Support for calling notifications on the CScheduler Thread by TheBlueMatt · Pull Request #10179 · bitcoin/bitcoin · GitHub
< BlueMatt>
wumpus: oh, wasn't sure if it got switched after the last merge, sorry
<@wumpus>
adding 10148
< gmaxwell>
9792 isn't hard to review, FWIW, in my experience.
< sdaftuar>
gmaxwell: i've become more comfortable conceptually with the non-atomic writes (it did take me a while to come around to it being worth the effort). i'd like to review and test more.
<@wumpus>
if there's something that can be merged you should tell me, preferably outside the meeting :)
< morcos>
yeah wumpus i think that is now just wasting review cycles, it has more than enough ACKs (we have told you a couple times.. :) )
< luke-jr>
multiwallet is rebased and nits fixed btw
< gmaxwell>
sdaftuar: letting us effectively double the dbcache size is a first benefit... plus it should allow some really nice improvements later post per-txo. Sorry if it wasn't communicated well; the value was more obvious to pieter and me, perhaps because we've been hammering on caching policy changes based on per-txo for a while.
< luke-jr>
CWalletDB still needs some serious refactoring, but IMO that's something to do outside multiwallet's PR
< jonasschnelli>
luke-jr: agree
<@wumpus>
morcos: I don't remember
< morcos>
wumpus: no problem, i just wasn't telling you because i didn't want to tell you too many times.. in any case i think it's ready (9942 that is)
< bitcoin-git>
[bitcoin] luke-jr closed pull request #7289: [WIP] Make arguments reconfigurable at runtime via RPC (master...rpc_setarg) https://github.com/bitcoin/bitcoin/pull/7289
< sdaftuar>
gmaxwell: yeah, makes sense to me now -- there are a lot of steps of "why don't we do X simpler thing instead" that i know you guys have tried/thought through already, that i needed to think through myself
< gmaxwell>
morcos: thanks for that fee estimation writeup, I guess I understood it better than I thought I did, I think I thought more of the discussed things were actually implemented.
< bitcoin-git>
[bitcoin] ryanofsky opened pull request #10242: [qt] Don't call method on null WalletModel object (master...pr/rbfnull) https://github.com/bitcoin/bitcoin/pull/10242
< gmaxwell>
morcos: I think that writeup is good and should go into the codebase.
< morcos>
And then I wrote #10199 with a bunch of improvements. I suppose it makes sense to add another section to the gist that provides a high level overview of the improvements?
< gmaxwell>
I think the estimator is a complex enough machine that we should maintain a separate description of it, if not an actual spec. Just like we do for many major protocol features.
< morcos>
But what I would like to do is err on the side of merging 10199 early and then if there are small bugs or fixes, we can fix them in master
< * jtimon>
remembers that he also wants to decouple the estimator from the mempool
< sipa>
oops, forgot about meeting
< gmaxwell>
morcos: the writeup could use some more details about the reliability estimates and how it merges bins.
< morcos>
it takes 2 weeks of continuous up time to even explore all the code paths
< gmaxwell>
sipa: the meeting did not forget about you.
< morcos>
jtimon: yes, i have a plan to do that that builds off BlueMatt's CValidationInterface. The groundwork is laid in 9942 that was just merged
< gmaxwell>
morcos: are we not saving enough data between restarts that we really do need two weeks of continuous uptime to hit it all?
< morcos>
reliability estimates? reliability OF estimates?
< gmaxwell>
morcos: I know that if there aren't many samples in a bin it doesn't use the bin.
< morcos>
gmaxwell: well if you want to know how much fee it'll take to be confirmed in a week, you sure as hell better wait at least a week (but yes once you've done that once, you may not need to do it again on a restart)
< morcos>
gmaxwell: some of that stuff is changed in 10199 (for the better, obviously i guess)
< luke-jr>
if anyone else acks #10242, maybe mention the meeting going on in a P.S. :p
< jtimon>
morcos: I know, I reviewed it yesterday and linked to similar PRs of my own. at the time you only wanted to decouple the mempool from the estimator (9942 just did it), but not the estimator from the mempool; happy if you changed your mind and want both like me now
< gmaxwell>
I guess in general a thing to keep in mind for this sort of description is that we should try to make it detailed enough that if an academic showed up and wrote a paper on it based only on the description (which they will), their results would be likely to be useful to us. :)
< morcos>
but there are some open questions in 10199 that would be helpful to get feedback on
< morcos>
such as starting with not being able to upgrade from the old estimates
<@wumpus>
imo the most important thing about the description is that we understand it
< gmaxwell>
morcos: we should think about saving more of its state in the future. I have nodes that don't spend more than a few minutes down per month, but don't often make it to two weeks up.
< luke-jr>
morcos: how complicated is the upgrading? we only would need it for one version at most IMO
< morcos>
gmaxwell: i think that's problematic for being able to predict really long time horizons...
<@wumpus>
upgrading isn't much of an issue, if the estimation algorithm changes, feel free to throw away the data from the previous one
<@wumpus>
just make sure it doesn't crash on upgrading/downgrading
< gmaxwell>
well, to really advance I think what we would probably want is a simulator (perhaps based on historical data) and a metric of success.
<@wumpus>
yes
< morcos>
luke-jr: the complicated part is deciding what we want to do, implementing it probably isn't that bad... but for instance the new estimates are smart about whether your estimates file is stale... but should it just dumbly use your old estimates until it has new estimates... what if the new estimate for 5 blocks (which you do have) is lower than the old estimate for 25 (which you don't have a new estimate for)
< morcos>
etc.
< gmaxwell>
I think it's more or less fine to toss out data on algo changes. we could worry about doing better when the algo is stable for a long time.
<@wumpus>
the difficult part, as with coin selection, is evaluating algorithms
< morcos>
the historical data is useless... the question is whether you use the old estimates until your new estimates are warmed up (by calculating them before you throw away the data)
< luke-jr>
huh, someone's playing malleability games on testnet.
< gmaxwell>
Electrum does some things with using static estimates when it doesn't have data to estimate on its own. I think there are a lot of interesting tradeoffs we could make to hotstart. But I don't think starting speed is at all our biggest concern.
< gmaxwell>
luke-jr: they have been for months.
< gmaxwell>
The purely retrospective algorithm is really slow to update to changing network conditions; in particular, it doesn't track the weekly load cycle well.
< morcos>
gmaxwell: right so that is the question.. my preference would be to merge as is.. and then if we get around to it before 0.15 we add a smarter hotstart
< luke-jr>
morcos: well, if it's not already implemented, I'd say it's not important enough to spend time implementing
< luke-jr>
(upgrading, that is)
< morcos>
gmaxwell: yes... one of the main problems the new design was meant to address... still using only a purely retrospective algorithm, so the problem fundamentally remains, but in practice its much more responsive (b/c it looks at different time horizons simultaneously)
< jcorgan>
clearly this calls for Deep Fee Estimation
< gmaxwell>
die
< jcorgan>
tell me what you *really* think
<@wumpus>
hehe, deep fee estimation
< luke-jr>
no no, Xtreme Deep Fee Estimation!
< morcos>
anyway, ok for now i'll update the gist with a high level description of the algorithm
< gmaxwell>
I have a lovely algorithm for an efficient limited-memory 2D exponentially weighted moving average somewhere....
< gmaxwell>
morcos: great.
< sipa>
Xthin fees
< luke-jr>
XD
< morcos>
but my basic point here is that ideally we need weeks/months of testing in master to uncover possible edge cases
< morcos>
i'm relatively confident that overall this is better, but that's not the same thing as saying it doesn't have problems that need fixing...
<@wumpus>
yes, it shouldn't be merged last minute
< gmaxwell>
morcos: well, get your description up soon, and I'll review shortly after. I think fee estimation is self contained enough that we could merge something and back it out if we don't like it... but we need to have more than just you understanding what we're doing, at least. :)
< morcos>
gmaxwell: yes basically my point.. ok sounds good
< gmaxwell>
(if for no other reason than we need to understand it better to spot failures with it.)
< jonasschnelli>
morcos: maybe it was asked already, how fast are the estimations available after startup? Does it work with prune=550?
< morcos>
prune is irrelevant
< morcos>
it can give you an estimate for a target of N once it has been caught up for 2*N blocks...
< gmaxwell>
jonasschnelli: Estimations for depth X need to at least see some multiple of X blocks.
< morcos>
but then it saves that
< gmaxwell>
Because you have a moving window of analysis, and no data for longer windows.
< jonasschnelli>
So for a conf target of 2 you need to wait ~40min after startup?
< morcos>
so if you stop and restart you're starting over again in terms of increasing your max possible target, but you still have access to estimates up to that max possible target
< morcos>
jonasschnelli: correct, but again, only the first time (or if you have been down for more than 6 weeks i think)
< gmaxwell>
jonasschnelli: well not really, because you save the results. so the first time, yes. But that goes back to the hot start question.. and there are lots of ways we could hot start these things, if we really had something that was working well otherwise.
< jtimon>
jcorgan: after alphago took go away from me I was looking for other problem to solve with https://github.com/jtimon/preann as an excuse to run it again </offtopic spam>
< jonasschnelli>
Okay. Thanks... I'll test 10199 with the mainnet GUI then a bit (before or after merging)
< bitcoin-git>
[bitcoin] ryanofsky opened pull request #10244: [qt] Add abstraction layer for accessing node and wallet functionality from gui (master...pr/ipc-local) https://github.com/bitcoin/bitcoin/pull/10244
< jonasschnelli>
*or
< morcos>
gmaxwell: in the meantime the PR description in 10199 covers the changes pretty close to what i will write up on the gist
< morcos>
i guess i need to do a quick rebase now that 9942 is done
< * luke-jr>
crickets
<@wumpus>
any other topics?
< gmaxwell>
I want to talk to luke some about the address reuse thing, but it can be post meeting.
<@wumpus>
time to tag 0.14.1 final and start gitian building
< * luke-jr>
spins out a Knots branch :p
<@wumpus>
gmaxwell: well there's time
<@wumpus>
#topic address reuse thing
< gmaxwell>
so a serious privacy problem which has been actively exploited for a long time is that people make near-dust payments to addresses once they've been spent from, so that you spend from them again in new txns, creating a snowball that links all your transactions together.
< gmaxwell>
Latest discussions seem to be driven by a user who runs a gambling site and who cares about this because his customers keep running into issues with transactions that link back to him.
<@wumpus>
do we need a 'block transaction' functionality against such transaction abuse?
< gmaxwell>
but it's a general concern for everyone.
< gmaxwell>
So there has been discussion about some very heavy-handed manual methods, but I think I have a suggestion that could potentially be a default behavior:
< gmaxwell>
but I'm interested in hearing if other people think I'm crazy.
<@wumpus>
wouldn't be the first time I have a UTXO I just want to ignore
< morcos>
stop the suspense!
< BlueMatt>
morcos: ack
< gmaxwell>
First create a separate quarenteened balance. Any address or specific txo could be manually quarenteened or unquarenteed at any time.
< gmaxwell>
Then adjust coin selection to always spend all payments to a particular address at once (+/- some filtering with dust that might be needed to prevent dust attacks).
< gmaxwell>
Then once an address has been spent from, it's automatically added to the quarenteen list (with any outputs that weren't spent, e.g. whatever failed the dust filtering).
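[To make the three-part proposal concrete, a hedged sketch with entirely hypothetical types; the real wallet's coin selection and CWallet interfaces are more involved:]

```cpp
#include <set>
#include <string>
#include <vector>

struct Utxo {
    std::string address;
    long long value;  // satoshis
};

struct Wallet {
    std::vector<Utxo> utxos;
    std::set<std::string> quarantined;

    // Quarantined funds form a separate balance and are skipped by coin
    // selection (but still shown in the tx list).
    bool IsSpendable(const Utxo& u) const {
        return quarantined.count(u.address) == 0;
    }

    // Given the addresses picked by normal coin selection, spend *all*
    // outputs at those addresses (modulo a dust floor), then quarantine
    // the addresses so dust sent to them later is never auto-spent.
    std::vector<Utxo> SpendWholeAddresses(const std::set<std::string>& addrs,
                                          long long dust_floor) {
        std::vector<Utxo> selected;
        for (const auto& u : utxos) {
            if (addrs.count(u.address) && IsSpendable(u) &&
                u.value >= dust_floor) {
                selected.push_back(u);
            }
        }
        for (const auto& a : addrs) quarantined.insert(a);
        return selected;
    }
};
```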
<@wumpus>
I think the quarantaine is a good idea, not so sure about adding things automatically though
< gmaxwell>
If users want to intentionally reuse an address, I suppose they'd need a way to prevent them from being reblocked.
< morcos>
i like the general idea
< luke-jr>
gmaxwell: and quarantined funds are excluded from balance or tx list somehow?
< gmaxwell>
Well I think the attacks will continue unless we can come up with something close to automatic... Could be something that gives a GUI user a choice the first time it happens, or if the Q balance becomes non-negligible.
< BlueMatt>
morcos: I might prefer if we were Quarantine ing things and not quarantaine or quarenteened :p
< morcos>
but what if you have 10 10btc utxos at the same address and you need to pay someone 1 btc
< morcos>
you spend them all?
< gmaxwell>
luke-jr: they'd be shown in the tx list, but skipped for spending, and shown as a separate balance. Like confirmed vs unconfirmed balance.
< BlueMatt>
gmaxwell: I'm somewhat unsold on this as default policy
< gmaxwell>
morcos: yep. and create a big change output. Which I think is fine. I think a separate issue is that we should auto-split very large change. But that's 90% independent.
< BlueMatt>
it seems to be a somewhat-surprising break
< jcorgan>
and the name 'quarantined' might be a bit heavy handed
< gmaxwell>
BlueMatt: the current privacy trashing is itself a very surprising break.
< BlueMatt>
fair
< luke-jr>
it might be less confusing if only the first receive ever was displayed/accepted, and all subsequent ones got quarantined
< gmaxwell>
jcorgan: well I came up with that on the spot, on github they're calling it frozen, which I think is super misleading (bank froze my funds!). :P
< jcorgan>
reserved?
< luke-jr>
suspicious? :P
<@wumpus>
trash can
< Chris_Stewart_5>
extraneous?
< gmaxwell>
luke-jr: first received leads to an immediate attack: dust spammer races the payment, then you get the dust and not the payment. :)
< morcos>
gmaxwell: hmm... i do like the idea of auto-quarantining spent addresses or dust left over in mostly spent addresses, but not sure i like default spending all the inputs and possibly giving you large change
< luke-jr>
gmaxwell: first confirmed with a larger value?
< gmaxwell>
morcos: really I'm surprised at that. That change alone is something I've wanted to do for a while (and was carrying patches for it for a bit)
< morcos>
quarantine is a good name, but let's not bikeshed that
< luke-jr>
gmaxwell: maybe auto-quarantine dust too
< instagibbs>
morcos, assuming spending the dust is worthwhile, what's the concern?
< BlueMatt>
gmaxwell: I'm not sold on non-end-user wallets here. it seems like it would break many merchant workflows that use bitcoind
< sdaftuar>
auto-spending all the funds with a given address makes sense to me as well
<@wumpus>
morcos: let's quarantine the bikeshed
< luke-jr>
BlueMatt: merchant workflows that reuse addresses are broken anyway
< BlueMatt>
(eg you receive half a payment, your coin selection spends from that addr, then you receive the other half, and now you don't realize you got paid?)
< luke-jr>
wumpus: lol
< gmaxwell>
BlueMatt: why? (that's why I'm asking.) -- obviously it would be configurable.
< jtimon>
re bikeshedding: the class managing this obviously needs to be called quarantiner
< gmaxwell>
BlueMatt: how often are merchants doing that? I mean you can get into advanced things like biasing selection to choose SPKs that have least recently received funds to avoid that.
< BlueMatt>
gmaxwell: I assume most merchants at least support multiple txn to complete your payment?
< BlueMatt>
most guis i've read seem to imply that
< morcos>
yeah i suppose if it's configurable, an option that autospends everything from any address that gets spent makes sense
< gmaxwell>
BlueMatt: kinda. but they also require them to be received at effectively the same time. I think it's manageable.
< instagibbs>
I'm not sure they accept multiple txn as policy
< instagibbs>
err automatically
< gmaxwell>
Just the autospend alone would radically improve privacy, and would almost be enough except for the malicious dust creation.
< BlueMatt>
gmaxwell: to make it compatible you'd have to never spend outputs newer than X, where X is merchant time frame
< gmaxwell>
BlueMatt: ya, which would be a trivial 'first try without x' in the current framework.
< BlueMatt>
i agree in principle, but it sounds like you'd break some folks' workflow in subtle ways. adding the option and defaulting off for non-gui users, maybe?
< sipa>
i don't see how autospending would break anything?
< luke-jr>
or just tweak how RPC shows quarantine
< gmaxwell>
well it could evolve over time, too-- I do think it's not worth our time to do things here that we don't think could be on for a majority of users eventually in some form.
< instagibbs>
At a minimum you could make near-dust be quarantined
< morcos>
i would think there could be some threshold for auto-un-quarantining too right? like if your quarantine address receives 1 BTC? or maybe not, maybe it just becomes common sense to check that
<@wumpus>
in the first version this is introduced it should be disabled by default in any case, I think, let's present it as a security feature first. Could always be enabled by default later but that should not be the initial goal.
< gmaxwell>
Because people who are super aware of privacy can and will already manually do coin selection to achieve ends like this.
< BlueMatt>
gmaxwell: agreed in principle, but there are also easier fixes we can do initially. eg bias coin selection towards this with fallbacks?
< luke-jr>
wumpus: true
< gmaxwell>
wumpus: absolutely.
<@wumpus>
anything that potentially 'disappears' funds shouldn't be enabled lightly
< luke-jr>
it's harmless to add if it's disabled by default initially
<@wumpus>
luke-jr: exactly
< morcos>
ok, so this sounds like general agreement that this is a good idea and has degenerated into arguing about defaults. all development discussion in a nutshell!
< BlueMatt>
"disabled by default" can also mean "if something fails, fall back to the current behavior"
< gmaxwell>
Just in principle, I don't think the resource investment is worth it if we don't think the end goal could be default-ish (e.g. GUI) use.
< BlueMatt>
morcos: yup
<@wumpus>
I'd enable it personally
<@wumpus>
it's worth the resource investment if we think it's useful to have
< sdaftuar>
so perhaps a first simple step would be to enable auto-spending of all funds from a given address in the coin selection logic?
< gmaxwell>
I think users should be okay with 'multiple balances'; we already have confirmed vs unconfirmed, and normal bank accounts have multiple balances.
< luke-jr>
IMO the end goal should be to treat address reuse as something that just doesn't work, and have a quarantine from which people can dig out lost funds if necessary
< BlueMatt>
sdaftuar: ack
< BlueMatt>
DONG
< jtimon>
I would start with the quarantine thing as only usable manually, which we all seem to like, and then propose automatic things
<@wumpus>
#endmeeting
< gmaxwell>
sdaftuar: yea, that would be a good first step with minimal impact, we might have to add some extra features with it, like automatic change splitting.
< lightningbot>
Meeting ended Thu Apr 20 20:00:36 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
< morcos>
i do think it could be default on.. i just think it's a matter of avoiding surprised users who wonder why money appears to be lost, but that can be solved by informing them
<@wumpus>
luke-jr: yeah, for the long long term I agree on that
< BlueMatt>
morcos: or fallbacks
< luke-jr>
jtimon: +1
< BlueMatt>
"wait, i hit spend and it says not enough money" can be prevented while still using this
< gmaxwell>
sdaftuar: I used to have a patch that basically post processes coin selection to add all other inputs that were in the same "address group" (listaddressgroupings) as anything selected.
< gmaxwell>
sdaftuar: but it was kind of suboptimal, and with the existence of coinjoin it's less important to use a whole group rather than a whole address.
<@wumpus>
no one would send 1 BTC to an address to breach someones privacy
<@wumpus>
it's always small amounts
< luke-jr>
wumpus: if they did, many people would be happy to give up their privacy XD
< gmaxwell>
wumpus: well never say never, but "champagne problems"
<@wumpus>
luke-jr: xD
< sdaftuar>
gmaxwell: one thing just occurred to me, if you receive lots of payments to the same address, and issue lots of payments yourself, then this will tie up lots of your utxos, which could be operationally annoying?
< gmaxwell>
unfortunately it can be realistic for an attacker to send several dollars to do so, but there is only so much we can do.
< sdaftuar>
ie you'll be generating lots of big unconfirmed chains, and run out of utxos
< sdaftuar>
could*
< gmaxwell>
sdaftuar: thus the comment about change splitting.
< sdaftuar>
that doesn't help?
< sdaftuar>
oh, some
< gmaxwell>
yes it does, they'll go to different addresses.
< jcorgan>
jtimon: i'll take a look at it
< sdaftuar>
well i was thinking that you're still spending an unconfirmed output, which will be under the descendant limit for the parent... eh, not sure how it would work out.
< morcos>
also the threshold as to what is too small to include in the spend and leave quarantined is tricky...
<@wumpus>
in any case I agree on the long run, address reuse should be seen as something suspicious unless the user opted in to it (e.g. a publicly published address)
< gmaxwell>
morcos: well I think there is a simple objective measure: at the current target feerate, what is the break even level?
< gmaxwell>
monero seems to get along okay with protocol prohibited address reuse, we're maybe too conservative on some of these things. :)
< morcos>
let's say you have 1mBTC that would cost 0.5mBTC to add as an input at the fee rate you are proposing for this tx... maybe you prefer to leave that quarantined and send it separately at a lower fee rate .. certainly if you have multiple ones like that they could be combined
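[morcos's numbers drop straight out of gmaxwell's break-even measure; a toy computation, with the input size and feerate as illustrative assumptions:]

```cpp
#include <cstdio>

int main() {
    const long long input_vsize = 148;      // rough P2PKH input size, bytes
    const long long feerate = 350;          // sat/byte, assumed target rate
    const long long output_value = 100000;  // 1 mBTC in satoshis

    // What it costs to drag this output along as an extra input right now.
    const long long cost = input_vsize * feerate;  // 51800 sat, ~0.5 mBTC

    // Rule of thumb from the discussion: if most of the value would be
    // eaten by fees, leave it quarantined and combine several such outputs
    // later at a lower feerate.
    std::printf("cost %lld sat vs value %lld sat: %s\n", cost, output_value,
                2 * cost > output_value ? "leave quarantined" : "sweep now");
}
```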
< jtimon>
jcorgan: unfortunately the documentation for the university had to be in spanish and I never bothered translating it https://github.com/jtimon/preann/blob/master/doc/pfc-jorge-timon.org (there's a latex file generated from that and I can give you a pdf as well) </sorry offtopic again>
<@wumpus>
so it isn't such a bad idea to 'forget' old addresses after they've been used too long ago, or at least auto-quarantine them
< gmaxwell>
morcos: if only we had a fee estimator that could give us a reasonable floor feerate! :)
< morcos>
we do!
< morcos>
2 sat/byte will get you confirmed within 500 blocks right now
<@wumpus>
but shouldn't be enabled by default at this point
< morcos>
ahh, it went up to 3.3 since i last checked.. :)
<@wumpus>
I'd like my transactions to be confirmed within 100 years at least :)
< gmaxwell>
morcos: right, so you could use a floor feerate. But also I think it would be reasonable for us to have behavior that cleans up the UTXO set some at the user's (small) expense; I think most users would support that, especially when it has privacy benefits.
<@wumpus>
otherwise I'd have to restart my node first to prevent 64 bit node ids from overflowing
< instagibbs>
wumpus, no patience eh?
< gmaxwell>
There is a whole layer of extra features we could think about for what to do with the quarantined funds... but I think that should be future work.
< jcorgan>
jtimon: that's no problem
< jtimon>
hahaha, great!
< gmaxwell>
e.g. if later we have some kind of coinjoin integration, they could be preferentially sent into that.
< gmaxwell>
too many things going on, we've responded to this too slowly. :( oh well, in any case, I think that this will prevent a lot of dust from getting created.
< gmaxwell>
so it's kind of counterintuitive: you might worry that the quarantine would result in UTXO bloat, but I suspect the opposite, at least if we're able to make this default-ish: with the incentive to make the tracing payments gone, they'll stop being created.
<@wumpus>
add an 'empty trash can' that sends the quarantined funds into devnull
< morcos>
participate in network spam attack using quarantined funds button
< gmaxwell>
I've mused about the idea of having some 'shred wallet' feature that creates some long-timelocked spend of any remaining coins and gives them over to fees... then sends them off somewhere.
<@wumpus>
'send wallet to wumpus'
< sipa>
ACK
< gmaxwell>
because I am somewhat pained by all the utxo bloat created by people who end up with 0.00001 BTC in a wallet, in 100 inputs, and then just delete the file because it's effectively worthless.
< gmaxwell>
yea, wumpus is fine too. tricky part is timelocking it so that they have some time to reconsider their decision. :P
< gmaxwell>
personally I wouldn't want it, you'll get clowns using it as a backup service and demanding their funds back. :P
<@wumpus>
gmaxwell: yes, there are certainly some practical issues :p
< morcos>
that reminds me, we should revisit before 0.15 the dust level the wallet will create
<@wumpus>
sending to fees would be a better idea
<@wumpus>
especially those small amounts...
< gmaxwell>
morcos: if we had a really good lower bound fee estimate it would be sensible to use that. e.g. don't create any change output where more than half its value would be lost in fees.
< morcos>
but it depends on what you mean by that
<@wumpus>
morcos: what would you like to revisit it to?
< morcos>
wumpus: not change the network definition, but make the wallet smarter about not (ever) creating outputs just above the network standard limit
< morcos>
gmaxwell: the problem is historically the lower bound fee estimate is 1 sat/byte
< morcos>
i think any transaction ever created which paid more than that could have been mined by now, some probably weren't because they were collectively forgotten about
< morcos>
but the lowest feerate mined on the weekend often drops that low
< morcos>
people are just in a hurry to be confirmed quickly
<@wumpus>
doesn't seem to be doing that here, win already finished building
< luke-jr>
wumpus: relative to 0.14.0
< luke-jr>
I missed the RCs
<@wumpus>
I'm not sure of that
< cfields>
yes
< cfields>
due to zlib bump
< luke-jr>
ah, since zlib is a dep?
< cfields>
yep
< * luke-jr>
goes to figure out food then
< jtimon>
regarding https://github.com/bitcoin/bitcoin/pull/10193 I'm still failing at replacing BOOST_REVERSE_FOREACH and I don't understand why, maybe I should reduce the scope of the PR (I moved from working stuff to trying to fully remove boost/foreach.hpp by popular demand)
< jtimon>
?
<@wumpus>
zlib is at the bottom of the food chain, dependency-wise, not surprised it triggers rebuild of everything
<@wumpus>
jtimon: what's the problem with BOOST_REVERSE_FOREACH?
< nanotube>
sipa: yes, though it's easier to just fix the problem :) which is that github decided to add fancy middle-dots to its title, which make the existing code barf with unicode errors >_<
< luke-jr>
lol
< luke-jr>
nanotube: you're alive!
< nanotube>
luke-jr: o/ :)
< jtimon>
wumpus: it seems to interfere with prevector templates in prevector_tests.cpp, see https://github.com/bitcoin/bitcoin/pull/10193/commits/cfef34884684e71c6f43ef3e4f2e87590fc87c9e for the trick to make the PR pass travis. probably I should remove that commit already, but I wanted to make sure the commits after removing BOOST_REVERSE_FOREACH weren't breaking something else, plus that commit is kind of the link that maybe answers your question
< jtimon>
I would really prefer to just solve the problem but I'm kind of lost
<@wumpus>
jtimon: it's sad that c++11 doesn't provide an as-is replacement
< jtimon>
wumpus: yep, it seems things get slightly better on c
< cfields>
wumpus: does reverse_iterate not end up being a dangling reference? I'm not sure what lifetime is expected for the container in a range-based-for
< jtimon>
c++0.14, I mean...c++14
< cfields>
(rather, I don't know if the loop extends the lifetime of the container)
< cfields>
er, that was for jtimon, sorry
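[For context, the C++11 adapter under discussion looks roughly like this sketch (not the PR's actual code), and it shows why cfields's lifetime question matters: a range-based for extends the lifetime of the adapter temporary, but not of a temporary container captured by reference inside it, so rvalue containers should be rejected at compile time:]

```cpp
#include <cstdio>
#include <vector>

template <typename T>
struct reverse_range {
    T& container;
    auto begin() -> decltype(container.rbegin()) { return container.rbegin(); }
    auto end() -> decltype(container.rend()) { return container.rend(); }
};

template <typename T>
reverse_range<T> reverse_iterate(T& container) {
    return reverse_range<T>{container};
}

// A temporary would dangle: the loop keeps the adapter alive, not the
// container it refers to. Deleting the const-rvalue overload turns that
// bug into a compile error.
template <typename T>
void reverse_iterate(const T&&) = delete;

int main() {
    std::vector<int> v{1, 2, 3};
    for (int x : reverse_iterate(v)) std::printf("%d ", x);  // prints 3 2 1
    // for (int x : reverse_iterate(std::vector<int>{4, 5})) {}  // rejected
}
```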
<@wumpus>
jtimon: ok, so it's just a matter of waiting a few years (hey at least not 100 years :p)
<@wumpus>
c++17 is pretty nice too, esp std::optional
< jtimon>
no, this can certainly be solved in c++11, but maybe not in a very clean way; perhaps we want our own macro to replace it
<@wumpus>
I'd say replacing it with another macro would not be worth it
<@wumpus>
there are still significant impediments to ditching boost wholesale, and until we have replacements for those, there's no use in rolling our own for the simpler things. Certainly not if they're not clean and simple.
< jtimon>
or just get rid of BOOST_FOREACH, ¿Q_FOREACH? and PAIRTYPE for now, but not BOOST_REVERSE_FOREACH or #include <boost/foreach.hpp> for now (although the include would only be used for BOOST_REVERSE_FOREACH now)
<@wumpus>
but maybe we can get your reverse_iterator to work
<@wumpus>
I would be surprised if it's not possible
< jtimon>
I am already surprised that this is not working, it's working for all the other cases using BOOST_REVERSE_FOREACH, just not compiling prevector_tests for reasons beyond me. it seems the templating is colliding somehow
< jtimon>
but the option of reducing the scope and leaving the full removal of boost/foreach.hpp for a later PR is also there; that's why I ask
< jtimon>
btw, sorry for asking again, but any blockers for #10189 ?
< gribble>
https://github.com/bitcoin/bitcoin/issues/10189 | devtools/net: add a verifier for scriptable changes. Use it to make CNode::id private. by theuni · Pull Request #10189 · bitcoin/bitcoin · GitHub
< TD-Linux>
gmaxwell, you could have a checkbox when generating an address that indicates it's supposed to be public
< jtimon>
nevermind, just remembered cfields needs to enforce a commit title prefix
< cfields>
jtimon: that's done
< TD-Linux>
it seems more useful than the "reuse an existing address" checkbox there now
< cfields>
I suppose I should comment
< gmaxwell>
TD-Linux: there is a reuse checkbox? I don't recall that.
< TD-Linux>
gmaxwell, maybe it's gone in the latest version, let me check. regardless I don't understand the use case
< luke-jr>
gmaxwell: I have a PR open to remove it..
< jtimon>
cfields: oops, great! thanks
< gmaxwell>
is that just the thing that brings up the dialog that shows you existing addresses?
< jtimon>
let me rebase and remove the travis-cheating commit for everyone to see the error without having to build locally, maybe it's obvious to someone else
< gmaxwell>
oh well generating a QR code for an older address still seems useful.
< gmaxwell>
should probably be elsewhere though.
< TD-Linux>
I'd rather have it accessible via a right click in the list below but I guess that list is only so long
< bitcoin-git>
[bitcoin] shigeya opened pull request #10245: Minor fix in build documentation for FreeBSD 11 (master...freebsd-11-build-doc-fix) https://github.com/bitcoin/bitcoin/pull/10245
< bitcoin-git>
[bitcoin] practicalswift opened pull request #10246: Silence "warning: "MSG_DONTWAIT" redefined" when compiling under Linux (master...silence-msg_dontwait-warning) https://github.com/bitcoin/bitcoin/pull/10246
< jtimon>
also I insist if you don't find #8855 interesting to review, maybe you do find #8994 interesting