#bitcoin-core-dev on 2016-10-24 — searchable irc log

01:24 < GitHub14> [bitcoin] gmaxwell opened pull request #9002: Make connect=0 disable automatic outbound connections. (master...connect0) https://github.com/bitcoin/bitcoin/pull/9002

02:29 < midnightmagic> :-/

04:11 < luke-jr> does it not do that again? :x

04:15 < sipa> do what?

04:18 < luke-jr> sipa: disable outbound connections.

04:18 < luke-jr> it used to do so at least, and seemed to work when I recently did it for testing

04:32 < gmaxwell> My belief is that it used to work, but what it does right now is just loop continually trying to connect to '0' and on my system this manages to connect to myself.

04:32 < gmaxwell> this, indeed, does prevent it from connecting to other parties...

04:33 < luke-jr> O.o

04:33 < gmaxwell> but in a fairly log spammy way (esp with debugging turned up)

04:34 < sipa> any theory why '0' results in a connect to self? :s

04:34 < sipa> that shouldn't even resolve to a valid ip

04:34 < luke-jr> sipa: most integers are a valid IP

04:35 < gmaxwell> I didn't bother checking my belief that it used to behave better.

04:35 < aj> sipa: telnet 0 22 -> 0.0.0.0:22 -> ssh to localhost

04:35 < gmaxwell> $ telnet 0 8333

04:35 < gmaxwell> Trying 0.0.0.0...

04:35 < gmaxwell> Connected to 0.

04:35 < gmaxwell> Escape character is '^]'.

04:35 < gmaxwell> ^]

04:35 < gmaxwell> telnet> close

04:35 < sipa> well i knew it'd resolve to 0.0.0.0

04:35 < sipa> i'm surprised 0.0.0.0 is a valid thing to connect to

04:35 < gmaxwell> This surprised me too. I figured it was some quirk of my system... but in any case, better to just not have the useless thread.

04:36 < luke-jr> ping 2130706433

04:36 < luke-jr> (this is 0x7f000001)

04:36 < aj> 0.0.0.0 means "all ipv4 addresses on the local machine", so seems kinda plausible

04:36 < sipa> luke-jr: some browsers even allow much larger integers (up to 2^1000-ish), resolving them modulo 2^32

04:37 < luke-jr> >_<

04:38 < sipa> as to why anyone ever thought to support that: i believe a naive decimal-to-uint32 converter will actually behave that way

04:38 < gmaxwell> why it would stop at 2^1000 isn't clear. :P

04:39 < luke-jr> I should access my KVMoIP by decimal. I don't have DNS on it anyway. :p

04:44 < sipa> gmaxwell: ish

04:45 < sipa> it's probably just an input buffer size limit
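
A minimal sketch of the wrap-around sipa describes, assuming a converter that accumulates decimal digits into a 32-bit unsigned value and silently drops overflow. The function name is hypothetical; real resolvers and browsers may behave differently.

```python
def naive_decimal_to_ipv4(text: str) -> str:
    """Parse a decimal string the way a naive uint32 converter might."""
    if not text or any(ch not in "0123456789" for ch in text):
        raise ValueError(f"not a decimal integer: {text!r}")
    value = 0
    for ch in text:
        # Emulate 32-bit unsigned overflow: keep only the low 32 bits.
        value = (value * 10 + int(ch)) & 0xFFFFFFFF
    # Split the 32-bit value into four octets, most significant first.
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


print(naive_decimal_to_ipv4("0"))
print(naive_decimal_to_ipv4("2130706433"))
print(naive_decimal_to_ipv4(str(2**32 + 2130706433)))
```

Under that assumption, "0" maps to 0.0.0.0, luke-jr's 2130706433 (0x7f000001) maps to 127.0.0.1, and any larger integer maps to the same address as its value modulo 2^32.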

04:51 < midnightmagic> 0 is the inaddr_any wildcard/alias address, and in some systems (the ones I'm familiar with) it means "the first interface that was ifconfig'd at boot" -- in other words, it's *not* guaranteed to be 127.0.0.1.

04:52 < luke-jr> I would have expected 0.0.0.0 to be a blackhole :/

04:52 < luke-jr> midnightmagic: what if you ifconfig the first interface with 0.0.0.0? <.<

04:55 < midnightmagic> luke-jr: i dunno, can you ifconfig an interface to 0.0.0.0?

04:55 < luke-jr> probably not

04:55 < luke-jr> I think that means unconfigure

06:15 < midnightmagic> or on some machines you get:

06:15 < midnightmagic> :dom13:06:14:57 ~# telnet 0 22

06:15 < midnightmagic> 0: No address associated with hostname

06:25 < wumpus> IP parsing is not guaranteed to parse '0' as '0.0.0.0', in this case it just thinks it's a hostname

06:27 < wumpus> any other 0.x.y.z IP seems to give "invalid argument" on linux

06:30 < wumpus> anyhow the problem here is that bitcoin core uses the OS's IP parsing, this has resulted in confusion before

06:30 < gmaxwell> I don't ~think~ special casing "0" is too ugly, but I am open to that idea. :)

06:31 < wumpus> but it's not just 0, some IP parsers will happily parse any int32 into a IPv4

06:32 < wumpus> +have other OS/libc specific quirks

06:33 < gmaxwell> right, this discussion is inspired by my patch that special cases "0" to disable connect. Which itself was inspired by me getting irritated with my logs being flooded with connects to self after setting connect=0 to disable automatic outbound connections.

06:34 < wumpus> yes, the reason to special-case 0 would be for -noconnect

06:34 < wumpus> which is fine with me

06:34 < gmaxwell> Yes, right now, my PR doesn't handle -noconnect (I believe), good point on that.

06:34 < wumpus> noconnect should automatically get converted to connect=0

06:35 < wumpus> you shouldn't have to do anything special for that

06:35 < wumpus> -noX becomes X=1 in the low-level arg handling

06:35 < wumpus> ehhh X=0

06:36 < gmaxwell> Will test.

06:36 < wumpus> in any case special-casing 0 makes sense because the argument handling special-cases 0 for 'no'

06:38 < wumpus> we do that in plenty of places in init.cpp, and adding one more is not a problem at all. If someone really wants to connect to 0.0.0.0, let them specify 0.0.0.0
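
The argument handling discussed above can be sketched roughly as follows. This is an illustration only, not Bitcoin Core's parser: `KNOWN_OPTIONS`, `parse_args`, and `automatic_outbound_enabled` are hypothetical names, and the sketch assumes a negated `-noX` is stored exactly as `X=0`.

```python
KNOWN_OPTIONS = {"connect", "listen", "dnsseed"}  # hypothetical subset for the sketch


def parse_args(argv):
    """Map command-line style options to {name: [values]}."""
    settings = {}
    for arg in argv:
        if not arg.startswith("-"):
            continue
        name, sep, value = arg[1:].partition("=")
        if name.startswith("no") and name[2:] in KNOWN_OPTIONS:
            # Negated form: -noconnect is stored exactly like -connect=0.
            name, value = name[2:], "0"
        elif not sep:
            value = "1"  # bare flag such as -listen
        settings.setdefault(name, []).append(value)
    return settings


def automatic_outbound_enabled(settings):
    """Treat a lone connect=0 as 'no automatic outbound connections'."""
    return settings.get("connect") != ["0"]


print(automatic_outbound_enabled(parse_args(["-noconnect"])))
print(automatic_outbound_enabled(parse_args(["-connect=0.0.0.0"])))
```

With this shape, someone who really wants to dial 0.0.0.0 spells it out in full, as wumpus suggests, and only the literal `0` is read as "off".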

06:39 < wumpus> -noconnect should ideally mean "no outgoing connections"

06:40 < wumpus> ooh rowhammer on ARM, we're nowhere safe are we

06:40 < gmaxwell> positive news, now that ECC memory will be needed to prevent users from controlling their own devices, we might get ecc memory in more devices.

06:41 < gmaxwell> so I thought about "should it imply nodnsseed" and "what if you are goofy and combine connect=0 and addnode? should it still addnode?".

06:42 < gmaxwell> the former seems like a harmless and reasonable thing to do, the latter... I dunno if it should touch the addnode behavior, I think probably not.

06:42 < wumpus> I guess it should mean "no automatic connections"

06:42 < gmaxwell> Yes. No automatic connections makes sense.

06:42 < wumpus> if you really want to force a connection using addnode, I don't see why it should stop it

06:42 < gmaxwell> Okay, I'll add the imply on dnsseed.

06:43 < wumpus> yes stopping dnsseed is an optimization that makes sense

06:43 < wumpus> why dnsseed if you're going to ignore the result

06:43 < gmaxwell> hardly likely to cause a negative surprise... you won't know if you got seeded or not, with no automatic connections. :)

06:43 < gmaxwell> well it's not a total no-op as it does update your peers.dat. If a tree falls in the forest...

06:44 < wumpus> well if you really insist on that you can still override it to true right? I suppose you're using 'soft' parameter interaction?

06:44 < gmaxwell> Yep.

06:46 < gmaxwell> oh actually any setting of connect already softsets dnsseed to off, even -noconnect. (also listen too, which is potentially confusing, but historic, and conservative)
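
A small sketch of what "soft" parameter interaction means here, assuming a soft-set only fills in a value the user did not give explicitly. The `Args` class and function names are hypothetical and are not Bitcoin Core's API.

```python
class Args:
    """Holds option values; explicit user settings always win."""

    def __init__(self, explicit):
        self.values = dict(explicit)

    def soft_set(self, name, value):
        """Set name=value only if the user did not set it; report whether it applied."""
        if name in self.values:
            return False
        self.values[name] = value
        return True


def apply_connect_interaction(args):
    # Any setting of connect (including connect=0) soft-sets dnsseed and listen off.
    if "connect" in args.values:
        args.soft_set("dnsseed", "0")
        args.soft_set("listen", "0")


defaults_apply = Args({"connect": "0"})
apply_connect_interaction(defaults_apply)
print(defaults_apply.values)

user_override = Args({"connect": "0", "listen": "1"})
apply_connect_interaction(user_override)
print(user_override.values)
```

In the second case the explicit `listen=1` survives, which is the override wumpus asks about.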

06:46 < wumpus> re: android I'm happy that these kind of exploits allow full control of the device by their owner, on the other hand, OSes such as android *invite* people to install all kinds of bullshit apps because of their claimed secure sandbox, and all of those can take full control of the device too, that worries me

06:47 < wumpus> ah yes that's true

06:50 < wumpus> although the DRM-on-IoT thing is perverse, if someone hacks your phone (or toaster) through a rogue app, they will have root, but you can't get control yourself to chase them off again

06:51 < wumpus> "how to hand the world to blackhats with one easy trick"

06:53 < midnightmagic> wumpus: you're talking about rowhammer?

06:54 < gmaxwell> part of the reason we have this problem is due to technical illiteracy-- to Joe Blow, he doesn't know how to control his device in a meaningful way no matter how much root you give him; so to actually take control from him (even while still accidentally giving it to hackers) doesn't seem like that big of a loss.

06:54 < wumpus> yes rowhammer, though it similarly applies to dirty cow, or still-undisclosed-exploit-of-the-day

06:59 < wumpus> technical illiteracy is certainly part of the problem - another one would be lack of standards in regard to gaining control of a device by someone with technical literacy

06:59 < midnightmagic> "It's okay I'm just going to root your phone.." "Uh, okay Jim. Would you like another beer too?"

07:01 < wumpus> + stupid laws that disallow circumventing "

07:01 < wumpus> security measures"

07:01 < wumpus> all to protect mickey mouse, of course

07:06 < wumpus> "International Journal of Proof-of-Concept or Get The Fuck Out" hah I'd never seen that one written out

07:15 < wumpus> -noconnect seems to work fine w/ 9002

07:18 < GitHub101> [bitcoin] laanwj pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/f08222e882b1...fd29348dbe82

07:18 < GitHub101> bitcoin/master 1d8e12b Pavel Janík: Fix doxygen comment: the transaction is returned in txOut

07:18 < GitHub101> bitcoin/master fd29348 Wladimir J. van der Laan: Merge #8993: Trivial: Fix doxygen comment: the transaction is returned in txOut...

07:19 < GitHub145> [bitcoin] laanwj closed pull request #8993: Trivial: Fix doxygen comment: the transaction is returned in txOut (master...20161021_fix_GetTransaction_comment) https://github.com/bitcoin/bitcoin/pull/8993

07:50 < GitHub87> [bitcoin] unsystemizer opened pull request #9004: Clarify `listenonion` (master...patch-3) https://github.com/bitcoin/bitcoin/pull/9004

08:14 < da2ce7> Hello, is there a overview of this blocksign feature that is being developed?

08:15 < gmaxwell> da2ce7: I presume it would be a cut down port of whats in elements alpha.

08:16 < da2ce7> I can only suppose it is only to be used for testnet.

08:16 < gmaxwell> the motivation is just to have testnets that are less unreliable than a very small pow blockchain for use for testing that doesn't really care about the consensus mechanism.

08:16 < gmaxwell> yea, of course.

08:16 < gmaxwell> I'm curious where you heard about it where that wasn't clear?

08:16 < wumpus> yes, which is the only thing that makes the implementation difficult in practice

08:18 < wumpus> if you are looking for a proposal to enable it on mainnet, there's none, no chance

08:18 < da2ce7> I'm curious as it is similar to a feature that where a miner signs their block with a random key (with the pk fingerprint included in the coin base), where after-the-fact the miner could prove that he/she mined that block.

08:18 < gmaxwell> that doesn't make a lot of sense, someone signing something doesn't demonstrate authorship

08:19 < gmaxwell> instead, to achieve what you describe needs only a commitment to a public key in the block.

08:19 < gmaxwell> Though it's somewhat important to the system that miners operate anonymously, to reduce their exposure to coercion.

08:20 < da2ce7> well it does prove that you own that public key, so it means that I cannot put _your_ public key in my block.

08:21 < luke-jr> not necessarily. you could sign the template and ask me to mine it for you :p

08:21 < gmaxwell> If the key is random and used once it doesn't really matter.

08:23 < da2ce7> however then you would need a mechanism to enforce using a different key every block. Otherwise, BadMiner could put a GoodMiner public key, and be a nuance.

08:24 < gmaxwell> da2ce7: huh? no... if you care about that you just ignore reuse.

08:24 < da2ce7> *nuisance

08:24 < gmaxwell> same as someone not providing a key at all.

08:27 < da2ce7> well anyway, maybe there is no demand (or it isn't a wise thing at all), for there to be a standard way for miners to prove they made a particular block.

08:29 < gmaxwell> Sort of the opposite. I think it's extremely risky for the system for miners to be attaching any kind of identity to blocks at all.

08:29 < da2ce7> it could/would make miner/share statistics more reliable if used, again, if that is a wise idea is very debatable.

08:29 < gmaxwell> There are ways to do that without any publication of info to the general public though.

08:32 < gmaxwell> (presumably we'll see a change to miners self identifying the first time after someone gets sued because someone was unhappy about which transactions they happened to confirm. :( )

08:32 < da2ce7> I can imagine that it could be useful in the case the network was under a 51% attack, if miners could attach pseudo-anonymous identities to blocks. However it would be much preferable to never be in such a dystopian state.

08:34 < da2ce7> anyway, off-topic. Blocksign for testnet is good for testing :).

08:38 < gmaxwell> yes, centralizing the system can make it more secure against many kinds of attacks... :)

09:17 < GitHub42> [bitcoin] laanwj pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/fd29348dbe82...ced22d035ac0

09:17 < GitHub42> bitcoin/master dfe7906 Matt Corallo: Add missing cs_main lock to ::GETBLOCKTXN processing...

09:17 < GitHub42> bitcoin/master ced22d0 Wladimir J. van der Laan: Merge #8995: Add missing cs_main lock to ::GETBLOCKTXN processing...

09:17 < GitHub156> [bitcoin] laanwj closed pull request #8995: Add missing cs_main lock to ::GETBLOCKTXN processing (master...2016-10-fix-getblocktxn-locks) https://github.com/bitcoin/bitcoin/pull/8995

10:47 < arubi> wumpus, so specifically re. parsing partial transactions as pre-segwit (#8837), I actually hit this issue in my own toy parser and it shows in core too: http://paste.debian.net/plainh/8c80d8cd so having some flag would be great

10:47 < wumpus> thanks, so I wasn't being overly paranoid about that

10:48 < arubi> yea those 0 inputs pre segwit are something to think about when fundrawtransaction

10:49 < wumpus> please comment on the issue too

10:49 < wumpus> ah yes :(

10:50 < arubi> well those fivepiece comments are mine :)

10:50 < arubi> I'll copy this comment too

11:04 < wumpus> does anyone care about DecodeBase58 performance? If so, please review/test https://github.com/bitcoin/bitcoin/pull/8736

11:49 < jonasschnelli> during IBD, there is "chainActive" like (CChain) object that contains the headers-only chain?

11:49 < jonasschnelli> *there is no "chainActive"

11:50 < jonasschnelli> It looks like that getchaintips is ripping out the headers-tip from setOrphans

12:04 < jonasschnelli> nm: I think pindexBestHeader is acceptable for my usecase

12:05 < wumpus> ok :)

12:11 < jonasschnelli> hmm.. why points pindexBestHeader->pprev to pindexBestHeader?!

12:12 < jonasschnelli> looks like it not traversable

12:12 < jonasschnelli> again... nm! local issue.

16:25 < BlueMatt> wait, what?

16:25 < BlueMatt> argh, we did break addnode :(

16:26 < sipa> how so?

16:26 < BlueMatt> i was just trying to use it for fibre and the hostname as both ipv4 and ipv6 but it will only try to connect to one resolved addr even though the other one will work

16:27 < sipa> hmm

16:27 < BlueMatt> i didnt realize we broke the pick-different-address-if-it-doesnt-connect logic :(

16:28 < sipa> i didn't think so either

16:28 < BlueMatt> there doesnt seem to be any handling for it

16:29 < sipa> ConnectSocketByName should pick a random resolved address for each attempt

16:29 < BlueMatt> ThreadOpenAddedConnections does a LookupNumeric before calling OpenNetworkConnection

16:30 < BlueMatt> lol, the client in question doesnt even have a v6 default route...bitcoind was trying (and failing) to connect to the v6 host, it seems :/

16:30 < sipa> and LookupNumeric will fail if it's not a x.y.z.w or aaaa::bbbb:ccc or .onion style string
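
The distinction sipa draws (numeric or .onion strings versus hostnames that need DNS) can be approximated with Python's standard `ipaddress` module. This is only an analogy for illustration: `is_numeric_address` is a hypothetical helper, not the logic of Bitcoin Core's `LookupNumeric`.

```python
import ipaddress


def is_numeric_address(host: str) -> bool:
    """True for literal IPv4/IPv6 addresses and .onion names, False for hostnames."""
    if host.endswith(".onion"):
        return True
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        # Anything that is not a full literal address would need a DNS lookup.
        return False


for host in ("1.2.3.4", "2607:f0d0:2002:169::2",
             "us-west.fibre.bitcoinrelaynetwork.org", "0"):
    print(host, is_numeric_address(host))
```

A strict parser like this one also rejects a bare `0`, unlike the OS-level parsing discussed earlier in the log.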

16:30 < BlueMatt> ahh

16:31 < BlueMatt> hmm, maybe they just didnt wait long enough

16:31 < BlueMatt> but it did try to connect to v6, get a network unreachable error, and give up :(

16:32 < sipa> it will only retry 2 minutes later, i think

16:32 < BlueMatt> yes, it did, and tried again to ipv6

16:33 < BlueMatt> though if its random i suppose he could have waited longer and maybe it would have worked

16:33 < BlueMatt> though its possible the hosts ipv6-preference logic is bad?

16:33 < Lightsword> it’s a fairly standard EC2 ubuntu 16.04 VPS

16:34 < sipa> maybe it only resolves to IPv6 addresses for some reason?

16:34 < BlueMatt> Lightsword: can you disconnect the ipv4 peer and just addnode the hostname and wait a few sets of 2 minutes?

16:44 < Lightsword> BlueMatt, yeah not seeing any connection

16:44 < Lightsword> 2016-10-24 16:40:07.057232 trying connection us-west.fibre.bitcoinrelaynetwork.org lastseen=0.0hrs

16:44 < Lightsword> 2016-10-24 16:40:07.237914 connect() to [2607:f0d0:2002:169::2]:8333 failed: Network is unreachable (101)

16:45 < BlueMatt> yea, so it seems broken :(

16:46 < BlueMatt> argh, someone wanna tag https://github.com/bitcoin/bitcoin/issues/9007 0.13.1?

16:49 < paveljanik> I can't do so, sorry.

16:52 < wumpus> already done

16:53 < wumpus> another assert that should be removed, like 1ab21cf344ed0547de5ae679b7e479cb4b1a923b I guess...

16:53 < wumpus> should check there's no other weird asserts added

16:53 < sipa> is 9007 in 0.13?

16:54 < wumpus> yes

16:56 < wumpus> https://github.com/bitcoin/bitcoin/blob/0.13/src/net.cpp#L1024

16:56 < sipa> ah, caused by feelers?

16:57 < wumpus> yes, maxconnections cannot be lower than maxoutbound+maxfeeler, I suppose if he'd just set maxconnections=9 it'd be ok

16:58 < gmaxwell> addnode can take you beyond the nMaxOutbound count.

16:58 < wumpus> that's not what that assert is checking though

16:59 < wumpus> it doesn't look at your actual connections, just the max possible
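
The arithmetic behind the assert can be sketched as below. The constants are assumptions (8 outbound slots and 1 feeler slot), the names are hypothetical, and this shows only why a small maxconnections trips a `> 0` check. It is not the fix.

```python
MAX_OUTBOUND = 8  # assumed number of automatic outbound slots
MAX_FEELER = 1    # assumed number of feeler slots


def max_inbound(max_connections: int) -> int:
    """Inbound slots left after reserving the maximum possible outbound and feeler slots."""
    return max_connections - (MAX_OUTBOUND + MAX_FEELER)


def would_trip_assert(max_connections: int) -> bool:
    """An assert(nMaxInbound > 0) style check fails whenever no inbound slot is left."""
    return max_inbound(max_connections) <= 0


for setting in (125, 4):
    print(setting, max_inbound(setting), would_trip_assert(setting))
```

As wumpus notes, the value depends only on the configured maximums, never on how many connections are actually open.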

16:59 < gmaxwell> ah, indeed.

16:59 < gmaxwell> that should be a return, not an assert, in any case.

16:59 < * gmaxwell> feels stupid for missing that.

17:04 < gmaxwell> There was another inappropriate assert added in the same commit, but it was already removed by PR 8944.

17:04 < wumpus> well the previous assert based bug was that

17:04 < wumpus> right

17:07 < gmaxwell> sipa: I can't see how that couldn't be reproduced in rc2.

17:10 < gmaxwell> even returning there wouldn't be right.

17:10 < gmaxwell> lets say I set -connect=1.2.3.4 and maxconnections=4 ... I should still be able to accept 3 connections.

17:11 < sipa> if you set -connect, isn't listening disabled by default?

17:12 < gmaxwell> its softset off.

17:12 < gmaxwell> so if you -connect + listen=1 to be precise.

17:13 < sipa> ok

17:14 < gmaxwell> I was about to suggest that maxconnections<8 should hard force listen off, because then it would make it easier to troubleshoot why things aren't connecting; then realized that no actually in some configs inbounds should still be working even with low max connections.

17:16 < gmaxwell> wait.. what.. is eviction now broken?!

17:18 < gmaxwell> okay, its not, just kind of stupid.

17:20 < gmaxwell> without the insert, the -noconnect + listen=1 case with maxconnections<8 will continually try evicting a connection and fail.

18:06 < morcos> sipa: sorry for the annoying questions here.. it appears to me that the dynamic memory usage tracking in CCoinsViewCache assumes that the memory usage of a pruned coins is 0. i'm guessing this is usually the case, but it's not guaranteed, right? (depends on capacity = 0)

18:07 < sipa> morcos: i believe we actually always call CCoins::Cleanup()

18:07 < sipa> which sets the capacity to 0

18:07 < morcos> i was trying to track down some significant variations between actual usage and the tracked usage that appear in my code.. and am starting by trying to understand why the existing code is correct

18:07 < morcos> well it calls a new vector and swaps

18:08 < sipa> yes, that was the C++03 trick to set the capacity to 0

18:08 < morcos> but from my googling it seems implementation dependent whether a new vector has capacity 0?

18:08 < sipa> ok, i agree there could in theory be a standard-compliant implementation which doesn't have that behaviour

18:09 < morcos> ok.. i noticed that a while ago.. and its probably not the cause of my problem, but just wanted to understand

18:09 < morcos> second , kind of unrelated question

18:09 < morcos> prevector.resize(0) doesn't seem like the fastest way to clean up a prevector<unsigned char> does it?

18:09 < sipa> the c++11 way is std::vector::shrink_to_fit() btw

18:10 < morcos> looking at the resize code, it calls erase, which iterates the elements destructing them

18:10 < morcos> but in our case where we're always using unsigned chars and we want to clear the whole thing.. couldn't we do that faster?

18:11 < sipa> so there is a question of what prevector should support

18:11 < sipa> if it can only contain POD types, i think more complexity can go away

18:12 < sipa> but if it's to be valid for any movable c++ type (as it does now), it has to call erase (which may still be optimized out, when instantiated for simple types)

18:12 < sipa> i thought cfields_ was working on some improvements to prevector?

18:12 < cfields_> sipa: one of many things that got to 90% and shelved.

18:13 < sipa> s/erase/destructors/

18:13 < morcos> maybe.. i'm not sure. seems like it might be nice to at least optimize that one particular case, would we subtemplate or something that particular call.

18:13 < cfields_> sipa: in particular, I was doing a specialization for size=1

18:13 < sipa> yes, in c++11 you can use templates to figure out whether it's POD, and use a simpler implementation in that case or something

18:13 < cfields_> since that's our main use-case, and you can get huge speedups with that assumption

18:14 < morcos> ok.. thanks for the thoughts..

20:13 < GitHub49> [bitcoin] MarcoFalke opened pull request #9008: [net] Remove assert(nMaxInbound > 0) (master...Mf1610-netAssert) https://github.com/bitcoin/bitcoin/pull/9008

20:48 < btcdrak> https://github.com/bitcoin-core/bitcoincore.org/pull/239

21:58 < nibor> Could someone let me know what they think about: https://gist.github.com/n1bor/d5b0330a9addb0bf5e0f869518883522

21:59 < nibor> Is a functioning proof of concept of chainstate only sync. Syncs in about 30mins to a pruned full node state.

22:00 < nibor> Obviously need a soft-fork to be any use.

22:10 < gmaxwell> nibor: it's more frequent than the model I've been thinking of. For security reasons you really don't want to have a case where miners could make a 100 block fork and then forward print themselves a lot of coins. :) Also it's quite common for nodes to be offline for 1-2 weeks, so if nodes aren't keeping that much in blocks easily available, then security degrades to SPV history (new chainstate sync). ... and downloading and syncing a few thousand blocks isn't really slow compared to 100 (relative to current sync times).

22:10 < gmaxwell> This is all particularly relevant because the snapshot management means that different peers really can't choose their own checking time.

22:10 < gmaxwell> I'd been thinking of something that was more like a 3 month interval.

22:11 < gmaxwell> Petertodd will protest that requiring a particular UTXO set construction will be a hard barrier to even more scalable things like STXO commitments in the future. I came up with a solution to that which you might want to use:

22:12 < gmaxwell> Two softfork rules: (1) if the commitment is present, it must be correct. and (2) the commitment must be present from activation until block XXX. If halfway to XXX everyone is still happy with the scheme, a new softfork is applied that says the commitment must be present until YYY.

22:12 < gmaxwell> That way if something better comes along, the commitment can eventually be dropped in a smooth and compatible manner.
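
The two rules can be written down as a block-level check. This is a sketch under stated assumptions: the function and parameter names are hypothetical, nothing is enforced before activation, and "until block XXX" is treated as inclusive.

```python
def commitment_rules_ok(height, commitment, expected,
                        activation_height, required_until):
    """Check the two proposed softfork rules for one block.

    commitment is None when the block carries no commitment.
    """
    if height < activation_height:
        return True  # assumption: nothing is enforced before activation
    # Rule 1: if the commitment is present, it must be correct.
    if commitment is not None and commitment != expected:
        return False
    # Rule 2: the commitment must be present until block required_until.
    if height <= required_until and commitment is None:
        return False
    return True


# After required_until the commitment may be omitted, but a wrong one is still invalid.
print(commitment_rules_ok(150, None, b"utxo-hash", 100, 200))
print(commitment_rules_ok(250, None, b"utxo-hash", 100, 200))
print(commitment_rules_ok(250, b"bogus", b"utxo-hash", 100, 200))
```

Extending the scheme then only means a later softfork raising `required_until` (XXX to YYY); letting it lapse drops the requirement without invalidating blocks that still carry a correct commitment.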

22:13 < gmaxwell> (perhaps making new installs of old software take a long time to sync :) )

22:14 < gmaxwell> nibor: the hash chunking thing should use some kind of tree hash, it probably doesn't need to go down to the individual entry, but if I fetch chunks from N peers in parallel and one peer gives me garbage, I should be able to tell _which_ peer gave me garbage, otherwise you get DOS attacks.

22:15 < nibor> Could not see how tree helped. The snapshot message contains hash of all the chunks. So you know if a node is nasty after the 1st chunk.

22:16 < gmaxwell> okay, that potentially makes the snapshot message quite large, the only difference that a tree would make is that the snapshot value is just the single hash in the blockchain, and the chunks give you enough to verify membership.
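
The tree variant gmaxwell describes can be sketched with a plain binary Merkle tree over the chunks: only the root would be committed, and each chunk arrives with a short proof, so a bad chunk is attributable to the peer that sent it. This assumes SHA-256 and the simple "duplicate the last node" padding; the names are hypothetical and the construction is a simplification, not the proof-of-concept's format.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proofs(chunks):
    """Return (root, proofs); proofs[i] is a list of (sibling_hash, sibling_is_left)."""
    if not chunks:
        raise ValueError("need at least one chunk")
    level = [h(chunk) for chunk in chunks]
    positions = list(range(len(chunks)))  # where each chunk's ancestor sits in `level`
    proofs = [[] for _ in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # simple padding; fine for a sketch only
        for i, pos in enumerate(positions):
            sibling = pos ^ 1
            proofs[i].append((level[sibling], sibling < pos))
            positions[i] = pos // 2
        level = [h(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
    return level[0], proofs


def verify_chunk(chunk: bytes, proof, root: bytes) -> bool:
    """Recompute the path from one chunk up to the committed root."""
    node = h(chunk)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


chunks = [b"chunk-%d" % i for i in range(5)]
root, proofs = merkle_root_and_proofs(chunks)
print(all(verify_chunk(c, p, root) for c, p in zip(chunks, proofs)))
print(verify_chunk(b"garbage", proofs[2], root))
```

nibor's flat list of chunk hashes gives the same per-chunk attribution; the tree trades a larger snapshot message for a logarithmic proof attached to each chunk.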

22:17 < nibor> Regarding gap between snapshots problem with going too long is that the chainstate grows quite fast. Keeping snapshot from 300 blocks back makes chainstate 2.4G vs 1.7G with no snapshots.

22:17 < gmaxwell> The other important thing about this proposal is that it needs to be very upfront about this being a significant change to the Bitcoin security model, and justify it. I believe it is a necessary one.

22:18 < gmaxwell> We generally need to engineer for the worst case, so we should probably just assume that they're maximum size even though fancy COW handling could reduce that.

22:18 < nibor> Current snapshot message is about 200k and chunks are about 200k each. So msgs small so should scale by factor of 10..

22:18 < gmaxwell> reorganizing chainstate into 'old' and 'new' could help with that churn fwiw.

22:19 < nibor> Annoyingly the leveldb snapshots are only held in RAM. So with a big gap node would really need to do a bit rewind to check state.

22:19 < gmaxwell> yea, I was surprised you got this working using leveldb snapshots.

22:20 < nibor> s/bit/big/

22:20 < nibor> Not sure I understand your 1st comment though. About miners creating big fork?

22:21 < gmaxwell> no rewind should be needed however, you should compute the hash as it goes by, e.g. snapshot at the height as you validate it, then at two blocks after, start computing the hash, and just save it.

22:22 < nibor> Sorry - yes you are right. Was thinking of putting hash 20 blocks after chainstate. So have time to compute even when chainstate say 50gig.

22:22 < gmaxwell> (50 gb chainstate is likely unworkable with leveldb :( but thats an aside. :) )

22:23 < nibor> Prob ok with 64Gig RAM... In day job just order 2 boxes with 2Tb so not so much..

22:23 < gmaxwell> nibor: majority hashpower can make their new commitment say that a million bitcoin that wasn't theirs is now theirs. Then all newly joining nodes will get the new chainstate, and eventually all old nodes will think they've hit a reorg larger than people have blocks available, and so they'll do a chainstate sync too...

22:24 < gmaxwell> nibor: we'd like to have a decentralized system you know, :P

22:26 < nibor> With a 100gig download?...

22:26 < gmaxwell> nibor: did you understand my "phase out" suggestion? (the line referencing petertodd)

22:27 < gmaxwell> nibor: more people handle a 100 GB download today than have more than 8GB ram available. (in particular hosted systems, VPSes, are often quite ram starved) but we're on a tangent.

22:28 < nibor> Not sure I do.. but might in a bit...

22:29 < gmaxwell> OKAY.

22:29 < gmaxwell> In any case, exciting work

22:29 < gmaxwell> Your POC is awesome.

22:29 < nibor> Thinking about a 3month interval. That is quite easy. Just copy the whole chainstate to another leveldb. Is not more work than hashing it really.

22:30 < gmaxwell> Or not another leveldb but just a serialized flat blob.

22:30 < gmaxwell> It would be faster and more space efficient. And random access is not needed, except to chunks.

22:30 < gmaxwell> (could be one file per chunk. Though 8000 chunks is a bit excessive for that.)

22:31 < nibor> Not really - leveldb is up to 1000 files.

22:31 < gmaxwell> I believe that only needs two snapshots at any time too...

22:32 < nibor> Yes - I only had 3 cos was short gap. And did not want a client that took over 100 blocks to download be left with nothing to find.

22:33 < nibor> Will think about Petertodd issue too. Is nasty to cause future issues.

22:34 < nibor> I guess some of the 100fork issues would be reduced if client back populated blocks over time.

22:34 < gmaxwell> There is a longer term proposal that would eliminate the utxo set, effectively, that we don't want to block.

22:34 < gmaxwell> nibor: yes, though if we just want the fastest possible start, it could start immediately as SPV, and then back populate.. irrespective of the snapshotting behavior.

22:36 < nibor> Thanks for thoughts... am off now. Will see what others think..

22:36 < gmaxwell> (just as background for you: the way the utxo set is eliminated, is that there is a small, perhaps fixed size utxo set, and transactions are expired out of it and commited into an insertion ordered hash tree... then when spending one of those outputs the spending transaction must provide a membership and update proof that lets nodes verify it was there and mark the entry as spent)

22:36 < gmaxwell> Great!

22:38 < murch> Hey gmaxwell, I was missing you at Scaling Bitcoin. :)

22:39 < murch> I was curious what you'd say about my coin selection talk after we had chatted here about it.

22:48 < gmaxwell> molz: I was sad to see that wider-match only made a fairly modest improvement!

22:49 < sipa> s/molz/murch/

22:49 < gmaxwell> oops!

22:49 < gmaxwell> murch:

22:49 < molz> haha i was scratching my head "what's the wider-match" lol

22:50 < murch> gmaxwell: After Scaling Bitcoin I came up with a new algorithm. I'm still running experiments on it (and writing my evaluation chapter, thesis is due next week), but it looks pretty promising in all aspects.

22:50 < molz> gmaxwell, btw, is there a way to load the ban list into my node or do i have to type each line manually?

22:50 < murch> It has a much higher rate of direct matches than the current core coin selection and is less computationally expensive.

22:50 < sipa> molz: they're remembered across restarts already

22:50 < sipa> bans.dat

22:51 < gmaxwell> murch: one thing your framework doesn't consider is pathological cases. In the past, people actually attacked litecoin wallets by paying them lots of dust; with enough of it, the subset sum solver would come up with solutions so bad that it couldn't transact.

22:51 < molz> sipa but i haven't banned any bad nodes

22:51 < gmaxwell> They 'solved' this in a really kludgy way just making the wallet ignore payments below a dust threshold.

22:51 < gmaxwell> molz: just paste the lines in a suitable format.

22:52 < molz> yea i meant copy and paste, ok thanks

22:52 < gmaxwell> murch: so I think SRD would suffer badly from that. But I think that could be addressed by trying multiple strategies and taking the 'best'.

22:52 < gmaxwell> molz: I put up pastebins with the commands in cli format as well as suitable for pasting into the debug console..

22:52 < murch> gmaxwell: Yeah, that's true. My new algorithm uses a two phase approach. First it purposefully looks for direct matches. It will not consider inputs there that have a negative payload (more fees than value), but after that it falls back to random selection, which may spend small inputs over time.

22:53 < murch> (Experiment is still running, can't give you good details on the final set composition yet)

22:54 < gmaxwell> By direct matches, you mean no change created--but possible 'extra fee', not one-input only, right?

22:54 < murch> gmaxwell: yeah, that

22:54 < gmaxwell> Good! sounds reasonable to me.

22:54 < gmaxwell> Then again lots of things that didn't work sounded reasonable to me.

22:55 < murch> Since a change output causes an additional output on creation, then an additional input some time in the future. So I allow up to the cost of one input + one output as a padding for the "exact" match.
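
murch's matching window can be sketched as follows, assuming all amounts are in satoshis and using made-up fee figures. The names are hypothetical and this is not the code from the linked simulator.

```python
INPUT_FEE = 148 * 10   # hypothetical cost of spending one input later
OUTPUT_FEE = 34 * 10   # hypothetical cost of creating one change output now
COST_OF_CHANGE = INPUT_FEE + OUTPUT_FEE


def usable_for_matching(utxo_amounts):
    """Drop inputs with a negative payload (they cost more in fees than they are worth)."""
    return [amount for amount in utxo_amounts if amount > INPUT_FEE]


def is_direct_match(selected_total, target):
    """No change output needed if the excess is at most the cost of making and spending one."""
    return target <= selected_total <= target + COST_OF_CHANGE


print(usable_for_matching([500, 5_000, 2_000_000]))
print(is_direct_match(1_001_000, 1_000_000))
print(is_direct_match(1_005_000, 1_000_000))
```

Any excess inside the window goes to fees, which is the "extra fee" gmaxwell asks about; a wider window would be the less conservative option he mentions.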

22:56 < murch> gmaxwell: Is a "flood with dust" attack still reasonable? I thought that transactions with dust outputs are considered non-standard and they'd have a hard time getting confirmed?

22:56 < gmaxwell> Good, that is a rational way to set it. (arguably, even more would be justified either on the basis of of fees being higher in the future, or on the basis of that you may get faster confirmation for it... but your approach sounds reasonably conservative)

22:57 < gmaxwell> murch: That particular one, perhaps less so (well a miner could do it fine)-- but I meant it more of a concrete example that pathological cases could be created and it is desirable if the wallet doesn't behave badly in any situation easily setup by an attacker.

22:58 < murch> mh.

22:58 < murch> I haven't considered that attack scenario too much yet.

22:58 < gmaxwell> e.g. if you have one 50 btc input and 200 0.00001 inputs (still above dust threshold!) ... then the random selection would pick a pretty bad solution.

22:58 < murch> I was thinking to replace the SRD fall back with 7 random drawings and the median input set size

22:58 < murch> or some similar scheme

22:59 < gmaxwell> (er my example just now should have also said "and you try to pay 1 btc")
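
gmaxwell's example can be simulated with a plain single random draw: shuffle the wallet and add inputs until the target is covered. This is a sketch with hypothetical names, it ignores fees, and it is not the wallet's or the simulator's actual code.

```python
import random

COIN = 100_000_000  # satoshis per BTC


def single_random_draw(utxos, target, rng):
    """Pick inputs in random order until their sum covers the target."""
    pool = list(utxos)
    rng.shuffle(pool)
    selected, total = [], 0
    for amount in pool:
        selected.append(amount)
        total += amount
        if total >= target:
            return selected
    return None  # insufficient funds


# One 50 BTC input plus 200 inputs of 0.00001 BTC, paying 1 BTC.
wallet = [50 * COIN] + [1_000] * 200
picked = single_random_draw(wallet, 1 * COIN, random.Random(1))
print(len(picked), "inputs selected, ending with", picked[-1])
```

The 200 small inputs sum to only 0.002 BTC, so every draw ends when the 50 BTC input comes up, and on average about half of the small inputs are swept into the transaction first. Taking the median of several draws, as murch suggests, narrows the spread but not that average.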

22:59 < murch> random drawing should have nice privacy properties though, and generates a wider range of utxos which are beneficial for finding direct matches

22:59 < murch> gmaxwell: Yes, of course.

23:00 < gmaxwell> I agree, well you don't have data for it too, but I think in all cases the selection should try to spend all inputs paid to a particular scriptpubkey to limit linkage graph inflation, but that can be layered right on top of your ideas by treating an input set as an input.

23:00 < murch> exactly

23:01 < murch> I'm planning to put addresses in my simulation in the future, but right now I'm focusing on writing up what I have. ;)

23:01 < murch> I have to print next wednesday. :D

23:01 < murch> gmaxwell: https://github.com/Xekyo/CoinSelectionSimulator/blob/master/src/BnBWallet.scala#L59

23:01 < murch> This is the new algorithm I have come up with, in case you want to take a look

23:02 < murch> gmaxwell: some preliminary results on the same data set I talked about at Scaling Bitcoin: https://docs.google.com/spreadsheets/d/1dzugGnAw2nBNL_BwpFR44jci8rO3RkkF5M9OcP2ESN0/pubhtml

23:03 < murch> the perpetrator is "BranchAndBoundWallet"

23:04 < murch> What I also like is that it has a very low standard deviation in the input set. (Although, as you have pointed out, I have not considered pathological cases yet.)