< GitHub14> [bitcoin] gmaxwell opened pull request #9002: Make connect=0 disable automatic outbound connections. (master...connect0) https://github.com/bitcoin/bitcoin/pull/9002
< midnightmagic> :-/
< luke-jr> does it not do that again? :x
< sipa> do what?
< luke-jr> sipa: disable outbound connections.
< luke-jr> it used to do so at least, and seemed to work when I recently did it for testing
< gmaxwell> My belief is that it used to work, but what it does right now is just loop continually trying to connect to '0' and on my system this manages to connect to myself.
< gmaxwell> this, indeed, does prevent it from connecting to other parties...
< luke-jr> O.o
< gmaxwell> but in a fairly log spammy way (esp with debugging turned up)
< sipa> any theory why '0' results in a connect to self? :s
< sipa> that shouldn't even resolve to a valid ip
< luke-jr> sipa: most integers are a valid IP
< gmaxwell> I didn't bother checking my belief that it used to behave better.
< aj> sipa: telnet 0 22 -> 0.0.0.0:22 -> ssh to localhost
< gmaxwell> $ telnet 0 8333
< gmaxwell> Trying 0.0.0.0...
< gmaxwell> Connected to 0.
< gmaxwell> Escape character is '^]'.
< gmaxwell> ^]
< gmaxwell> telnet> close
< sipa> well i knew it'd resolve to 0.0.0.0
< sipa> i'm surprised 0.0.0.0 is a valid thing to connect to
< gmaxwell> This surprised me too. I figured it was some quirk of my system... but in any case, better to just not have the useless thread.
< luke-jr> ping 2130706433
< luke-jr> (this is 0x7f000001)
< aj> 0.0.0.0 means "all ipv4 addresses on the local machine", so seems kinda plausible
< sipa> luke-jr: some browsers even allow much larger integers (up to 2^1000-ish), resolving them modulo 2^32
< luke-jr> >_<
< sipa> as to why anyone ever thought to support that: i believe a naive decimal-to-uint32 converter will actually behave that way
< gmaxwell> why it would stop at 2^1000 isn't clear. :P
< luke-jr> I should access my KVMoIP by decimal. I don't have DNS on it anyway. :p
< sipa> gmaxwell: ish
< sipa> it's probably just an input buffer size limit
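sipa's naive-converter explanation can be sketched concretely (a toy illustration, not libc's or Bitcoin Core's actual parser): accumulating decimal digits into a uint32_t silently wraps modulo 2^32, so even absurdly large integers map onto "valid" IPv4 addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy sketch of a naive decimal-to-uint32 converter (NOT libc's parser):
// accumulating digits into a uint32_t silently wraps modulo 2^32, which is
// why some parsers happily accept huge integers as IPv4 addresses.
uint32_t NaiveParse(const std::string& s)
{
    uint32_t v = 0;
    for (char c : s) v = v * 10u + static_cast<uint32_t>(c - '0');
    return v;
}

// Format a host-order uint32 as a dotted quad.
std::string ToDotted(uint32_t v)
{
    return std::to_string(v >> 24) + "." + std::to_string((v >> 16) & 0xff) + "." +
           std::to_string((v >> 8) & 0xff) + "." + std::to_string(v & 0xff);
}
```

With this, "2130706433" (0x7f000001) becomes 127.0.0.1, "0" becomes 0.0.0.0, and "4294967296" (2^32) wraps right back to 0.0.0.0.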
< midnightmagic> 0 is the inaddr_any wildcard/alias address, and in some systems (the ones I'm familiar with) it means "the first interface that was ifconfig'd at boot" -- in other words, it's *not* guaranteed to be 127.0.0.1.
< luke-jr> I would have expected 0.0.0.0 to be a blackhole :/
< luke-jr> midnightmagic: what if you ifconfig the first interface with 0.0.0.0? <.<
< midnightmagic> luke-jr: i dunno, can you ifconfig an interface to 0.0.0.0?
< luke-jr> probably not
< luke-jr> I think that means unconfigure
< midnightmagic> or on some machines you get:
< midnightmagic> :dom13:06:14:57 ~# telnet 0 22
< midnightmagic> 0: No address associated with hostname
< wumpus> IP parsing is not guaranteed to parse '0' as '0.0.0.0', in this case it just thinks it's a hostname
< wumpus> any other 0.x.y.z IP seems to give "invalid argument" on linux
< wumpus> anyhow the problem here is that bitcoin core uses the OS's IP parsing, this has resulted in confusion before
< gmaxwell> I don't ~think~ special casing "0" is too ugly, but I am open to that idea. :)
< wumpus> but it's not just 0, some IP parsers will happily parse any int32 into an IPv4 address
< wumpus> +have other OS/libc specific quirks
< gmaxwell> right, this discussion is inspired by my patch that special cases "0" to disable connect. Which itself was inspired by me getting irritated with my logs being flooded with connects to self after setting connect=0 to disable automatic outbound connections.
< wumpus> yes, the reason to special-case 0 would be for -noconnect
< wumpus> which is fine with me
< gmaxwell> Yes, right now, my PR doesn't handle -noconnect (I believe), good point on that.
< wumpus> noconnect should automatically get converted to connect=0
< wumpus> you shouldn't have to do anything special for that
< wumpus> -noX becomes X=1 in the low-level arg handling
< wumpus> ehhh X=0
< gmaxwell> Will test.
< wumpus> in any case special-casing 0 makes sense because the argument handling special-cases 0 for 'no'
< wumpus> we do that in plenty of places in init.cpp, and adding one more is not a problem at all. If someone really wants to connect to 0.0.0.0, let them specify 0.0.0.0
< wumpus> -noconnect should ideally mean "no outgoing connections"
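A minimal sketch of the "-noX becomes X=0" convention wumpus describes; `InterpretNegation` is a hypothetical name for illustration, not Bitcoin Core's actual argument-handling code:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical sketch of the "-noX becomes X=0" convention: a negated option
// like "-noconnect" is rewritten to "-connect" with value "0", so downstream
// code only ever has to special-case connect=0. Illustration only.
std::pair<std::string, std::string> InterpretNegation(std::string key, std::string val)
{
    if (key.rfind("-no", 0) == 0) { // key starts with "-no"
        key = "-" + key.substr(3);  // strip the "no"
        val = "0";                  // negation means value 0
    }
    return {key, val};
}
```

Under this sketch, `-noconnect` arrives at the connect-handling code indistinguishable from `connect=0`, which is why special-casing 0 there covers both spellings.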
< wumpus> ooh rowhammer on ARM, we're nowhere safe are we
< gmaxwell> positive news: now that ECC memory will be needed to prevent users from controlling their own devices, we might get ECC memory in more devices.
< gmaxwell> so I thought about "should it imply nodnsseed" and "what if you are goofy and combine connect=0 and addnode? should it still addnode?".
< gmaxwell> the former seems like a harmless and reasonable thing to do, the latter... I dunno if it should touch the addnode behavior, I think probably not.
< wumpus> I guess it should mean "no automatic connections"
< gmaxwell> Yes. No automatic connections makes sense.
< wumpus> if you really want to force a connection using addnode, I don't see why it should stop it
< gmaxwell> Okay, I'll add the imply on dnsseed.
< wumpus> yes stopping dnsseed is an optimization that makes sense
< wumpus> why dnsseed if you're going to ignore the result
< gmaxwell> hardly likely to cause a negative surprise... you won't know if you got seeded or not, with no automatic connections. :)
< gmaxwell> well it's not a total no-op as it does update your peers.dat. If a tree falls in the forest...
< wumpus> well if you really insist on that you can still override it to true right? I suppose you're using 'soft' parameter interaction?
< gmaxwell> Yep.
< gmaxwell> oh actually any setting of connect already softsets dnsseed to off, even -noconnect. (also listen too, which is potentially confusing, but historic, and conservative)
< wumpus> re: android I'm happy that these kind of exploits allow full control of the device by their owner, on the other hand, OSes such as android *invite* people to install all kinds of bullshit apps because of their claimed secure sandbox, and all of those can take full control of the device too, that worries me
< wumpus> ah yes that's true
< wumpus> although the DRM-on-IoT thing is perverse, if someone hacks your phone (or toaster) through a rogue app, they will have root, but you can't get control yourself to chase them off again
< wumpus> "how to hand the world to blackhats with one easy trick"
< midnightmagic> wumpus: you're talking about rowhammer?
< gmaxwell> part of the reason we have this problem is due to technical illiteracy-- to Joe Blow, he doesn't know how to control his device in a meaningful way no matter how much root you give him; so to actually take control from him (even while still accidentally giving it to hackers) doesn't seem like that big of a loss.
< wumpus> yes rowhammer, though it similarly applies to dirty cow, or still-undisclosed-exploit-of-the-day
< wumpus> technical illiteracy is certainly part of the problem - another one would be lack of standards in regard to gaining control of a device by someone with technical literacy
< midnightmagic> "It's okay I'm just going to root your phone.." "Uh, okay Jim. Would you like another beer too?"
< wumpus> + stupid laws that disallow circumventing "security measures"
< wumpus> all to protect mickey mouse, of course
< wumpus> "International Journal of Proof-of-Concept or Get The Fuck Out" hah I'd never seen that one written out
< wumpus> -noconnect seems to work fine w/ 9002
< GitHub101> [bitcoin] laanwj pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/f08222e882b1...fd29348dbe82
< GitHub101> bitcoin/master 1d8e12b Pavel Janík: Fix doxygen comment: the transaction is returned in txOut
< GitHub101> bitcoin/master fd29348 Wladimir J. van der Laan: Merge #8993: Trivial: Fix doxygen comment: the transaction is returned in txOut...
< GitHub145> [bitcoin] laanwj closed pull request #8993: Trivial: Fix doxygen comment: the transaction is returned in txOut (master...20161021_fix_GetTransaction_comment) https://github.com/bitcoin/bitcoin/pull/8993
< GitHub87> [bitcoin] unsystemizer opened pull request #9004: Clarify `listenonion` (master...patch-3) https://github.com/bitcoin/bitcoin/pull/9004
< da2ce7> Hello, is there an overview of this blocksign feature that is being developed?
< gmaxwell> da2ce7: I presume it would be a cut down port of whats in elements alpha.
< da2ce7> I can only suppose it is only to be used for testnet.
< gmaxwell> the motivation is just to have testnets that are less unreliable than a very small pow blockchain, for use for testing that doesn't really care about the consensus mechanism.
< gmaxwell> yea, of course.
< gmaxwell> I'm curious where you heard about it where that wasn't clear?
< wumpus> yes, which is the only thing that makes the implementation difficult in practice
< wumpus> if you are looking for a proposal to enable it on mainnet, there's none, no chance
< da2ce7> I'm curious as it is similar to a feature where a miner signs their block with a random key (with the pk fingerprint included in the coinbase), where after the fact the miner could prove that he/she mined that block.
< gmaxwell> that doesn't make a lot of sense, someone signing something doesn't demonstrate authorship
< gmaxwell> instead, to achieve what you describe needs only a commitment to a public key in the block.
< gmaxwell> Though it's somewhat important to the system that miners operate anonymously, to reduce their exposure to coercion.
< da2ce7> well it does prove that you own that public key, so it means that I cannot put _your_ public key in my block.
< luke-jr> not necessarily. you could sign the template and ask me to mine it for you :p
< gmaxwell> If the key is random and used once it doesn't really matter.
< da2ce7> however then you would need a mechanism to enforce using a different key every block. Otherwise, BadMiner could put a GoodMiner public key, and be a nuance.
< gmaxwell> da2ce7: huh? no... if you care about that you just ignore reuse.
< da2ce7> *nuisance
< gmaxwell> same as someone not providing a key at all.
< da2ce7> well anyway, maybe there is no demand (or it isn't a wise thing at all), for there to be a standard way for miners to prove they made a particular block.
< gmaxwell> Sort of the opposite. I think it's extremely risky for the system for miners to be attaching any kind of identity to blocks at all.
< da2ce7> it could/would make miner/share statistics more reliable if used; again, whether that is a wise idea is very debatable.
< gmaxwell> There are ways to do that without any publication of info to the general public though.
< gmaxwell> (presumably we'll see a change to miners self identifying the first time after someone gets sued because someone was unhappy about which transactions they happened to confirm. :( )
< da2ce7> I can imagine that it could be useful in the case the network was under a 51% attack, if miners could attach pseudo-anonymous identities to blocks. However it would be much preferable to never be in such a dystopian state.
< da2ce7> anyway, off-topic. Blocksign for testnet is good for testing :).
< gmaxwell> yes, centralizing the system can make it more secure against many kinds of attacks... :)
< GitHub42> [bitcoin] laanwj pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/fd29348dbe82...ced22d035ac0
< GitHub42> bitcoin/master dfe7906 Matt Corallo: Add missing cs_main lock to ::GETBLOCKTXN processing...
< GitHub42> bitcoin/master ced22d0 Wladimir J. van der Laan: Merge #8995: Add missing cs_main lock to ::GETBLOCKTXN processing...
< GitHub156> [bitcoin] laanwj closed pull request #8995: Add missing cs_main lock to ::GETBLOCKTXN processing (master...2016-10-fix-getblocktxn-locks) https://github.com/bitcoin/bitcoin/pull/8995
< arubi> wumpus, so specifically re. parsing partial transactions as pre-segwit (#8837), I actually hit this issue in my own toy parser and it shows in core too: http://paste.debian.net/plainh/8c80d8cd so having some flag would be great
< wumpus> thanks, so I wasn't being overly paranoid about that
< arubi> yea those 0-input pre-segwit transactions are something to think about when using fundrawtransaction
< wumpus> please comment on the issue too
< wumpus> ah yes :(
< arubi> well those fivepiece comments are mine :)
< arubi> I'll copy this comment too
< wumpus> does anyone care about DecodeBase58 performance? If so, please review/test https://github.com/bitcoin/bitcoin/pull/8736
< jonasschnelli> during IBD, there is "chainActive" like (CChain) object that contains the headers-only chain?
< jonasschnelli> *there is no "chainActive"
< jonasschnelli> It looks like getchaintips is ripping out the headers-tip from setOrphans
< jonasschnelli> nm: I think pindexBestHeader is acceptable for my usecase
< wumpus> ok :)
< jonasschnelli> hmm.. why does pindexBestHeader->pprev point to pindexBestHeader?!
< jonasschnelli> looks like it's not traversable
< jonasschnelli> again... nm! local issue.
< BlueMatt> wait, what?
< BlueMatt> argh, we did break addnode :(
< sipa> how so?
< BlueMatt> i was just trying to use it for fibre with a hostname that has both ipv4 and ipv6, but it will only try to connect to one resolved addr even though the other one will work
< sipa> hmm
< BlueMatt> i didnt realize we broke the pick-different-address-if-it-doesnt-connect logic :(
< sipa> i didn't think so either
< BlueMatt> there doesnt seem to be any handling for it
< sipa> ConnectSocketByName should pick a random resolved address for each attempt
< BlueMatt> ThreadOpenAddedConnections does a LookupNumeric before calling OpenNetworkConnection
< BlueMatt> lol, the client in question doesnt even have a v6 default route...bitcoind was trying (and failing) to connect to the v6 host, it seems :/
< sipa> and LookupNumeric will fail if it's not a x.y.z.w or aaaa::bbbb:ccc or .onion style string
< BlueMatt> ahh
< BlueMatt> hmm, maybe they just didnt wait long enough
< BlueMatt> but it did try to connect to v6, get a network unreachable error, and give up :(
< sipa> it will only retry 2 minutes later, i think
< BlueMatt> yes, it did, and tried again to ipv6
< BlueMatt> though if its random i suppose he could have waited longer and maybe it would have worked
< BlueMatt> though its possible the hosts ipv6-preference logic is bad?
< Lightsword> it’s a fairly standard EC2 ubuntu 16.04 VPS
< sipa> maybe it only resolves to IPv6 addresses for some reason?
< BlueMatt> Lightsword: can you disconnect the ipv4 peer and just addnode the hostname and wait a few sets of 2 minutes?
< Lightsword> BlueMatt, yeah not seeing any connection
< Lightsword> 2016-10-24 16:40:07.057232 trying connection us-west.fibre.bitcoinrelaynetwork.org lastseen=0.0hrs
< Lightsword> 2016-10-24 16:40:07.237914 connect() to [2607:f0d0:2002:169::2]:8333 failed: Network is unreachable (101)
< BlueMatt> yea, so it seems broken :(
< BlueMatt> argh, someone wanna tag https://github.com/bitcoin/bitcoin/issues/9007 0.13.1?
< paveljanik> I can't do so, sorry.
< wumpus> already done
< wumpus> another assert that should be removed, like 1ab21cf344ed0547de5ae679b7e479cb4b1a923b I guess...
< wumpus> should check there's no other weird asserts added
< sipa> is 9007 in 0.13?
< wumpus> yes
< sipa> ah, caused by feelers?
< wumpus> yes, maxconnections cannot be lower than maxoutbound+maxfeeler, I suppose if he'd just set maxconnections=9 it'd be ok
< gmaxwell> addnode can take you beyond the nMaxOutbound count.
< wumpus> that's not what that assert is checking though
< wumpus> it doesn't look at your actual connections, just the max possible
< gmaxwell> ah, indeed.
< gmaxwell> that should be a return, not an assert, in any case.
< * gmaxwell> feels stupid for missing that.
< gmaxwell> There was another inappropriate assert added in the same commit, but it was already removed by PR 8944.
< wumpus> well the previous assert based bug was that
< wumpus> right
< gmaxwell> sipa: I can't see how that couldn't be reproduced in rc2.
< gmaxwell> even returning there wouldn't be right.
< gmaxwell> lets say I set -connect=1.2.3.4 and maxconnections=4 ... I should still be able to accept 3 connections.
< sipa> if you set -connect, isn't listening disabled by default?
< gmaxwell> its softset off.
< gmaxwell> so if you -connect + listen=1 to be precise.
< sipa> ok
< gmaxwell> I was about to suggest that maxconnections<8 should hard force listen off, because then it would make it easier to troubleshoot why things aren't connecting; then I realized that, no, actually in some configs inbounds should still be working even with low maxconnections.
< gmaxwell> wait.. what.. is eviction now broken?!
< gmaxwell> okay, its not, just kind of stupid.
< gmaxwell> without the assert, the -noconnect + listen=1 case with maxconnections<8 will continually try evicting a connection and fail.
< morcos> sipa: sorry for the annoying questions here.. it appears to me that the dynamic memory usage tracking in CCoinsViewCache assumes that the memory usage of a pruned coin is 0. i'm guessing this is usually the case, but it's not guaranteed right? (depends on capacity = 0)
< sipa> morcos: i believe we actually always call CCoins::Cleanup()
< sipa> which sets the capacity to 0
< morcos> i was trying to track down some significant variations between actual usage and the tracked usage that appear in my code.. and am starting by trying to understand why the existing code is correct
< morcos> well it constructs a new vector and swaps
< sipa> yes, that was the C++03 trick to set the capacity to 0
< morcos> but from my googling it seems implementation dependent whether a new vector has capacity 0?
< sipa> ok, i agree there is in theory a standard-compliant implementation which doesn't have that behaviour
< morcos> ok.. i noticed that a while ago.. and its probably not the cause of my problem, but just wanted to understand
< morcos> second , kind of unrelated question
< morcos> prevector.resize(0) doesn't seem like the fastest way to clean up a prevector<unsigned char> does it?
< sipa> the c++11 way is std::vector::shrink_to_fit() btw
< morcos> looking at the resize code, it calls erase, which iterates the elements destructing them
< morcos> but in our case where we're always using unsigned chars and we want to clear the whole thing.. couldn't we do that faster?
< sipa> so there is a question of what prevector should support
< sipa> if it can only contain POD types, i think more complexity can go away
< sipa> but if it's to be usable for any movable c++ type (as it does now), it has to call erase (which may still be optimized out, when instantiated for simple types)
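The two capacity-release idioms mentioned above, side by side (a sketch; as the chat itself notes, a freshly constructed vector having capacity 0 is common practice on mainstream implementations but not guaranteed by the standard, and shrink_to_fit is only a non-binding request):

```cpp
#include <vector>

// C++03 swap trick: swap with a freshly constructed vector to release
// capacity. On mainstream implementations (libstdc++, libc++) a
// default-constructed vector has capacity 0, but the standard does not
// guarantee it -- the caveat raised in the discussion above.
void SwapTrickClear(std::vector<unsigned char>& v)
{
    std::vector<unsigned char>().swap(v);
}

// C++11 way: clear, then ask the implementation to drop the capacity.
// shrink_to_fit is a non-binding request, so this too is best-effort.
void ShrinkToFitClear(std::vector<unsigned char>& v)
{
    v.clear();
    v.shrink_to_fit();
}
```

Either way, memory-usage accounting that assumes capacity becomes exactly 0 after cleanup is relying on implementation behaviour, not on the standard.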
< sipa> i thought cfields_ was working on some improvements to prevector?
< cfields_> sipa: one of many things that got to 90% and shelved.
< sipa> s/erase/destructors/
< morcos> maybe.. i'm not sure. seems like it might be nice to at least optimize that one particular case; could we subtemplate that particular call or something?
< cfields_> sipa: in particular, I was doing a specialization for size=1
< sipa> yes, in c++11 you can use templates to figure out whether it's POD, and use a simpler implementation in that case or something
< cfields_> since that's our main use-case, and you can get huge speedups with that assumption
< morcos> ok.. thanks for the thoughts..
< GitHub49> [bitcoin] MarcoFalke opened pull request #9008: [net] Remove assert(nMaxInbound > 0) (master...Mf1610-netAssert) https://github.com/bitcoin/bitcoin/pull/9008
< nibor> Could someone let me know what they think about: https://gist.github.com/n1bor/d5b0330a9addb0bf5e0f869518883522
< nibor> Is a functioning proof of concept of chainstate only sync. Syncs in about 30mins to a pruned full node state.
< nibor> Obviously need a soft-fork to be any use.
< gmaxwell> nibor: it's more frequent than the model I've been thinking of. For security reasons you really don't want to have a case where miners could make a 100 block fork and then forward print themselves a lot of coins. :) Also it's quite common for nodes to be offline for 1-2 weeks, so if nodes aren't keeping that much in blocks easily available, then security degrades to SPV history (new chainstate sync). ... and downloading and syncing a few thousand blocks isn't really slow compared to 100 (relative to current sync times).
< gmaxwell> This is all particularly relevant because the snapshot management means that different peers really can't choose their own checking time.
< gmaxwell> I'd been thinking of something that was more like a 3 month interval.
< gmaxwell> Petertodd will protest that requiring a particular UTXO set construction will be a hard barrier to even more scalable things like STXO commitments in the future. I came up with a solution to that which you might want to use:
< gmaxwell> Two softfork rules: (1) if the commitment is present, it must be correct. and (2) the commitment must be present from activation until block XXX. If halfway to XXX everyone is still happy with the scheme, a new softfork is applied that says the commitment must be present until YYY.
< gmaxwell> That way if something better comes along, the commitment can eventually be dropped in a smooth and compatible manner.
< gmaxwell> (perhaps making new installs of old software take a long time to sync :) )
< gmaxwell> nibor: the hash chunking thing should use some kind of tree hash, it probably doesn't need to go down to the individual entry, but if I fetch chunks from N peers in parallel and one peer gives me garbage, I should be able to tell _which_ peer gave me garbage, otherwise you get DOS attacks.
< nibor> Could not see how a tree helped. The snapshot message contains the hash of every chunk. So you know if a node is nasty after the 1st chunk.
< gmaxwell> okay, that potentially makes the snapshot message quite large, the only difference that a tree would make is that the snapshot value is just the single hash in the blockchain, and the chunks give you enough to verify membership.
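The tree-hash idea can be sketched as a toy Merkle tree over snapshot chunks (std::hash stands in for a real cryptographic hash; illustration only): the chain commits to a single root, each chunk ships with a sibling path, and a garbage chunk immediately identifies the peer that served it, without a large flat snapshot message.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy Merkle tree over snapshot chunks. std::hash is a placeholder for a
// cryptographic hash; a real design would use SHA256. Illustration only.
using Hash = std::size_t;

Hash H(const std::string& s) { return std::hash<std::string>{}(s); }
Hash Parent(Hash l, Hash r) { return H(std::to_string(l) + "|" + std::to_string(r)); }

// Fold a level of hashes up to a single root, duplicating an odd last entry.
Hash MerkleRoot(std::vector<Hash> level)
{
    while (level.size() > 1) {
        if (level.size() % 2) level.push_back(level.back());
        std::vector<Hash> next;
        for (std::size_t i = 0; i < level.size(); i += 2) next.push_back(Parent(level[i], level[i + 1]));
        level = next;
    }
    return level[0];
}

// Sibling path for the leaf at `index`, bottom-up.
std::vector<Hash> MerklePath(std::vector<Hash> level, std::size_t index)
{
    std::vector<Hash> path;
    while (level.size() > 1) {
        if (level.size() % 2) level.push_back(level.back());
        path.push_back(level[index ^ 1]);
        std::vector<Hash> next;
        for (std::size_t i = 0; i < level.size(); i += 2) next.push_back(Parent(level[i], level[i + 1]));
        level = next;
        index /= 2;
    }
    return path;
}

// Recompute the root from one chunk plus its path; a mismatch pinpoints
// exactly which peer handed over garbage.
bool VerifyChunk(const std::string& chunk, std::size_t index, const std::vector<Hash>& path, Hash root)
{
    Hash h = H(chunk);
    for (Hash sib : path) {
        h = (index % 2) ? Parent(sib, h) : Parent(h, sib);
        index /= 2;
    }
    return h == root;
}
```

The point of contrast with the flat scheme: here only the 32-byte root needs to live in the block commitment, and each chunk carries a log-sized proof of membership.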
< nibor> Regarding gap between snapshots problem with going too long is that the chainstate grows quite fast. Keeping snapshot from 300 blocks back makes chainstate 2.4G vs 1.7G with no snapshots.
< gmaxwell> The other important thing about this proposal is that it needs to be very upfront about this being a significant change to the Bitcoin security model, and justify it. I believe it is a necessary one.
< gmaxwell> We generally need to engineer for the worst case, so we should probably just assume that they're maximum size even though fancy COW handling could reduce that.
< nibor> Current snapshot message is about 200k and chunks are about 200k each. So msgs are small and should scale by a factor of 10..
< gmaxwell> reorganizing chainstate into 'old' and 'new' could help with that churn fwiw.
< nibor> Annoyingly the leveldb snapshots are only held in RAM. So with a big gap a node would really need to do a bit rewind to check state.
< gmaxwell> yea, I was surprised you got this working using leveldb snapshots.
< nibor> s/bit/big/
< nibor> Not sure I understand your 1st comment though. About miners creating big fork?
< gmaxwell> no rewind should be needed however, you should compute the hash as it goes by, e.g. snapshot at the height as you validate it, then at two blocks after, start computing the hash, and just save it.
< nibor> Sorry - yes you are right. Was thinking of putting the hash 20 blocks after the chainstate. So there's time to compute even when the chainstate is say 50gig.
< gmaxwell> (50 gb chainstate is likely unworkable with leveldb :( but thats an aside. :) )
< nibor> Prob ok with 64Gig RAM... In day job just ordered 2 boxes with 2TB so not so much..
< gmaxwell> nibor: majority hashpower can make their new commitment say that a million bitcoin that wasn't theirs is now theirs. Then all newly joining nodes will get the new chainstate, and eventually all old nodes will think they've hit a reorg larger than people have blocks available, and so they'll do a chainstate sync too...
< gmaxwell> nibor: we'd like to have a decentralized system you know, :P
< nibor> With a 100gig download?...
< gmaxwell> nibor: did you understand my "phase out" suggestion? (the line referencing petertodd)
< gmaxwell> nibor: more people handle a 100 GB download today than have more than 8GB ram available. (in particular hosted systems, VPSes, are often quite ram starved) but we're on a tangent.
< nibor> Not sure I do.. but might in a bit...
< gmaxwell> OKAY.
< gmaxwell> In any case, exciting work
< gmaxwell> Your POC is awesome.
< nibor> Thinking about a 3month interval. That is quite easy. Just copy the whole chainstate to another leveldb. Is not more work than hashing it really.
< gmaxwell> Or not another leveldb but just a serialized flat blob.
< gmaxwell> It would be faster and more space efficient. And random access is not needed, except to chunks.
< gmaxwell> (could be one file per chunk. Though 8000 chunks is a bit excessive for that.)
< nibor> Not really - leveldb is up to 1000 files.
< gmaxwell> I believe that only needs two snapshots at any time too...
< nibor> Yes - I only had 3 cos was short gap. And did not want a client that took over 100 blocks to download to be left with nothing to find.
< nibor> Will think about the Petertodd issue too. It's nasty to cause future issues.
< nibor> I guess some of the 100-block-fork issues would be reduced if the client back-populated blocks over time.
< gmaxwell> There is a longer term proposal that would eliminate the utxo set, effectively, that we don't want to block.
< gmaxwell> nibor: yes, though if we just want the fastest possible start, it could start immediately as SPV, and then back populate.. irrespective of the snapshotting behavior.
< nibor> Thanks for thoughts... am off now. Will see what others think..
< gmaxwell> (just as background for you: the way the utxo set is eliminated, is that there is a small, perhaps fixed size utxo set, and transactions are expired out of it and commited into an insertion ordered hash tree... then when spending one of those outputs the spending transaction must provide a membership and update proof that lets nodes verify it was there and mark the entry as spent)
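gmaxwell's background sketch might be caricatured like this (purely illustrative; a real design would commit to the log with a Merkle root so nodes need not store it in full, and spends would carry compact membership/update proofs rather than a bare index):

```cpp
#include <string>
#include <vector>

// Toy model of the scheme described above: aged outputs are expired into an
// insertion-ordered log; spending one of them supplies its position, the node
// checks the entry matches and is unspent, then marks it spent in place.
// In the real proposal this log is a hash tree and the "position" comes with
// a membership-and-update proof. Names here are hypothetical.
struct ExpiredEntry {
    std::string outpoint;
    bool spent = false;
};

struct TxoLog {
    std::vector<ExpiredEntry> log; // insertion-ordered, append-only

    // Move an aged output out of the hot UTXO set into the log.
    std::size_t Expire(const std::string& outpoint)
    {
        log.push_back({outpoint, false});
        return log.size() - 1; // position stands in for the proof here
    }

    // Verify the claimed entry and mark it spent; entries are never deleted.
    bool Spend(std::size_t pos, const std::string& outpoint)
    {
        if (pos >= log.size()) return false;
        if (log[pos].outpoint != outpoint || log[pos].spent) return false;
        log[pos].spent = true;
        return true;
    }
};
```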
< gmaxwell> Great!
< murch> Hey gmaxwell, I was missing you at Scaling Bitcoin. :)
< murch> I was curious what you'd say about my coin selection talk after we had chatted here about it.
< gmaxwell> molz: I was sad to see that wider-match only made a fairly modest improvement!
< sipa> s/molz/murch/
< gmaxwell> oops!
< gmaxwell> murch:
< molz> haha i was scratching my head "what's the wider-match" lol
< murch> gmaxwell: After Scaling Bitcoin I came up with a new algorithm. I'm still running experiments on it (and writing my evaluation chapter, thesis is due next week), but it looks pretty promising in all aspects.
< molz> gmaxwell, btw, is there a way to load the ban list into my node or do i have to type each line manually?
< murch> It has a much higher rate of direct matches than the current core coin selection and is less computationally expensive.
< sipa> molz: they're remembered across restarts already
< sipa> bans.dat
< gmaxwell> murch: one thing your framework doesn't consider is pathological cases; in the past, people actually attacked litecoin wallets by paying them lots of dust, and with enough of it, the subset sum solver would come up with solutions so bad that it couldn't transact.
< molz> sipa but i haven't banned any bad nodes
< gmaxwell> They 'solved' this in a really kludgy way just making the wallet ignore payments below a dust threshold.
< gmaxwell> molz: just paste the lines in a suitable format.
< molz> yea i meant copy and paste, ok thanks
< gmaxwell> murch: so I think SRD would suffer badly from that. But I think that could be addressed by trying multiple strategies and taking the 'best'.
< gmaxwell> molz: I put up pastebins with the commands in cli format as well as suitable for pasting into the debug console..
< murch> gmaxwell: Yeah, that's true. My new algorithm uses a two phase approach. First it purposefully looks for direct matches. It will not consider inputs there that have a negative payload (more fees than value), but after that it falls back to random selection, which may spend small inputs over time.
< murch> (Experiment is still running, can't give you good details on the final set composition yet)
< gmaxwell> By direct matches, you mean no change created--but possible 'extra fee', not one-input only, right?
< murch> gmaxwell: yeah, that
< gmaxwell> Good! sounds reasonable to me.
< gmaxwell> Then again lots of things that didn't work sounded reasonable to me.
< murch> Since a change output causes an additional output on creation and then an additional input some time in the future, I allow up to the cost of one input + one output as padding for the "exact" match.
< murch> gmaxwell: Is a "flood with dust" attack still reasonable? I thought that transactions with dust outputs are considered non-standard and they'd have a hard time getting confirmed?
< gmaxwell> Good, that is a rational way to set it. (arguably, even more would be justified either on the basis of of fees being higher in the future, or on the basis of that you may get faster confirmation for it... but your approach sounds reasonably conservative)
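The direct-match-with-window idea can be sketched as a small exhaustive search (a toy, exponential in the number of coins; not murch's actual BranchAndBound implementation): accept any subset whose sum lands in [target, target + window], where the window is the cost of one input plus one output.

```cpp
#include <cstdint>
#include <vector>

// Toy direct-match search in the spirit described above: find a subset of
// coin values summing into [target, target + window], so no change output is
// needed. Plain exhaustive recursion, illustration only -- a real wallet
// would bound and prune the search.
bool FindDirectMatch(const std::vector<int64_t>& coins, std::size_t i, int64_t remaining,
                     int64_t window, std::vector<int64_t>& picked)
{
    // remaining == target - sum(picked); success when overshoot <= window.
    if (remaining <= 0 && remaining >= -window) return true;
    if (remaining <= 0 || i == coins.size()) return false; // overshot too far, or out of coins
    // Branch 1: include coin i.
    picked.push_back(coins[i]);
    if (FindDirectMatch(coins, i + 1, remaining - coins[i], window, picked)) return true;
    picked.pop_back();
    // Branch 2: skip coin i.
    return FindDirectMatch(coins, i + 1, remaining, window, picked);
}
```

If no subset fits the window, the wallet would fall back to a random-drawing strategy as discussed above.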
< gmaxwell> murch: That particular one, perhaps less so (well a miner could do it fine)-- but I meant it more as a concrete example that pathological cases could be created, and it is desirable if the wallet doesn't behave badly in any situation easily set up by an attacker.
< murch> mh.
< murch> I haven't considered that attack scenario too much yet.
< gmaxwell> e.g. if you have one 50 btc input and 200 0.00001 inputs (still above dust threshold!) ... then the random selection would pick a pretty bad solution.
< murch> I was thinking to replace the SRD fallback with 7 random drawings and the median input set size
< murch> or some similar scheme
< gmaxwell> (er my example just now should have also said "and you try to pay 1 btc")
< murch> random drawing should have nice privacy properties though, and generates a wider range of utxos which are beneficial for finding direct matches
< murch> gmaxwell: Yes, of course.
< gmaxwell> I agree, well you don't have data for it, but I think in all cases the selection should try to spend all inputs paid to a particular scriptpubkey to limit linkage graph inflation, but that can be layered right on top of your ideas by treating an input set as an input.
< murch> exactly
< murch> I'm planning to put addresses in my simulation in the future, but right now I'm focusing on writing up what I have. ;)
< murch> I have to print next wednesday. :D
< murch> This is the new algorithm I have come up with, in case you want to take a look
< murch> gmaxwell: some preliminary results on the same data set I talked about at Scaling Bitcoin: https://docs.google.com/spreadsheets/d/1dzugGnAw2nBNL_BwpFR44jci8rO3RkkF5M9OcP2ESN0/pubhtml
< murch> the perpetrator is "BranchAndBoundWallet"
< murch> What I also like is that it has a very low standard deviation in the input set. (Although, as you have pointed out, I have not considered pathological cases yet.)