< petertodd> I could make use of a breakdown by age fwiw
< sipa> yeah, i'll create such a graph too
< sipa> should be easier as it doesn't require reindexing
< gmaxwell> I'm thinking it would be interesting to solve the following optimization problem: given a set of utxo size, birth/death times, value. Find the coefficients of a multinomial f(value,size,age_in_blocks) such that a fixed cache of size N with retention sorted by f() has the highest hit rate (item still in cache at utxo death).
< petertodd> sipa: thanks!
< petertodd> gmaxwell: though note that you probably can't bake something like that into a consensus thing like txo commitments w/ utxo cache, as it changes people's behavior...
< petertodd> gmaxwell: potentially still useful for local optimisation though
< gmaxwell> petertodd: well you can to some extent, for example, having a different cache policy at some threshold, like 1BTC... 21 million txo is the absolute maximum number of outputs at 1BTC+ so there is a pretty strong upper bound on whatever impact you could have on encouraging people to make utxo at or over 1 BTC. :)
< petertodd> gmaxwell: I mean, if you take the coefficients from prior data, they're likely to be wrong after people change their behavior - if you use coefficients, you need to have a different rationale is all I'm saying
< petertodd> gmaxwell: keeping higher value UTXO's in cache longer probably does make sense
< gmaxwell> yea sure.. the full answer probably isn't that useful as a consensus rule
< gmaxwell> The part of the answer that tells you some value breakpoint(s) may be.
< gmaxwell> because even if they're slightly wrong due to changing usage which they themselves incentivize, the whole prospect of having value relative retention is reasonable.
< gmaxwell> e.g. if age * value is a good scoring function on the past data, it probably is a robust one, which could be prudently used in consensus.
< petertodd> yup
< gmaxwell> or some polynomial on age * value. Or really I think any degree multinomial on age and value is probably also okay. Only size is one that is probably busted. :)
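(For illustration, a minimal sketch of how a candidate scoring function could be evaluated against historical UTXO data: replay a trace block by block, keep the N highest-scoring entries, and count how often an output is still cached at the height it is spent. The UtxoRecord layout, Score(), and HitRate() below are assumptions made up for this sketch, not code from any existing tool.)

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct UtxoRecord {
        uint64_t value;         // satoshis
        uint32_t size;          // serialized size in bytes
        uint32_t birth_height;  // block that created the output
        uint32_t death_height;  // block that spent it
    };

    // One candidate scoring function from the discussion: age * value.
    static double Score(const UtxoRecord& u, uint32_t tip) {
        return double(tip - u.birth_height) * double(u.value);
    }

    double HitRate(std::vector<UtxoRecord> trace, size_t cache_size, uint32_t end_height) {
        std::sort(trace.begin(), trace.end(), [](const UtxoRecord& a, const UtxoRecord& b) {
            return a.birth_height < b.birth_height;
        });
        std::vector<const UtxoRecord*> cache;
        size_t next = 0, hits = 0, total = 0;
        for (uint32_t h = 0; h <= end_height; ++h) {
            // Admit outputs born at this height.
            while (next < trace.size() && trace[next].birth_height == h) cache.push_back(&trace[next++]);
            // Evict the lowest-scoring entries down to the cache limit.
            if (cache.size() > cache_size) {
                std::nth_element(cache.begin(), cache.begin() + cache_size, cache.end(),
                    [h](const UtxoRecord* a, const UtxoRecord* b) { return Score(*a, h) > Score(*b, h); });
                cache.resize(cache_size);
            }
            // A hit is an output that is still cached at the height it is spent.
            for (auto it = cache.begin(); it != cache.end();) {
                if ((*it)->death_height == h) { ++hits; it = cache.erase(it); } else { ++it; }
            }
        }
        for (const auto& u : trace) total += (u.death_height <= end_height);
        return total ? double(hits) / double(total) : 0.0;
    }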
< petertodd> gmaxwell: speaking of, what approaches are good for writing polynomials in consensus critical code? (I wonder if someone has a writeup on that)
< gmaxwell> So-- if possible, probably best by converting it to a simple segmented linear approximation. But failing that, I would assume fixed point with Horner's method (see wikipedia), which is more computationally efficient and has better numerical properties than the naive way you'd compute it.
< gmaxwell> this is where you write it as c0 + x * (c1 + x * (c2 + x * ...))
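(A minimal sketch of that nesting in Q16.16 fixed point; the function name, coefficient layout, and scaling are assumptions for illustration, not the Opus or Bitcoin Core code.)

    #include <cstddef>
    #include <cstdint>

    // Horner evaluation of c0 + x*(c1 + x*(c2 + ...)) in Q16.16 fixed point:
    // one multiply and one add per coefficient, rescaling after each multiply
    // so intermediates stay in Q16.16. Assumes |acc * x| fits in 64 bits for
    // the intended input range, and an arithmetic right shift for negatives.
    int64_t EvalPolyQ16(int64_t x_q16, const int64_t* coeff_q16, size_t n) {
        int64_t acc = coeff_q16[n - 1];
        for (size_t i = n - 1; i-- > 0;) {
            acc = coeff_q16[i] + ((acc * x_q16) >> 16);
        }
        return acc;
    }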
< gmaxwell> petertodd: there are all sorts of 'consensus critical' polynomials in opus (ones where a discrepancy in the integer value returned causes a total entropy decoding failure)-- they never were a big issue, they're written like that, and we tested them exhaustively. no biggie.
< petertodd> gmaxwell: cool, thanks. re: exhaustively, I assume that's 16bit ints at most?
< gmaxwell> no, 32 bits.
< petertodd> gmaxwell: huh, how does that work?
< gmaxwell> for some function that's just doing some multiplies and adds, trying all 32-bit inputs isn't a big deal.
< petertodd> gmaxwell: wait, as in 2^64?
< sipa> i assume it's a function with 1 input :)
< gmaxwell> no, as in computing f(x) where x is some 32-bit value and f is a single-variable polynomial in x. :)
< petertodd> gmaxwell: yeah, that makes a lot more sense :)
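(A sketch of what such an exhaustive check can look like, using a toy stand-in for the function under test: unsigned division by 3 via a magic multiply, which has the same multiply-and-shift shape as a fixed-point polynomial. The names and the toy function are illustrative, not the Opus test code.)

    #include <cstdint>
    #include <cstdio>

    // Toy function under test: floor(x / 3) computed as a multiply and shift.
    static uint32_t DivideBy3(uint32_t x) {
        return uint32_t((uint64_t(x) * 0xAAAAAAABULL) >> 33);
    }

    // With a single 32-bit input, trying every value is practical: roughly
    // 4.3 billion iterations of a few arithmetic ops per run.
    int main() {
        uint32_t x = 0;
        do {
            if (DivideBy3(x) != x / 3) {
                std::printf("mismatch at %u\n", x);
                return 1;
            }
        } while (++x != 0);  // wraps to 0 after 0xFFFFFFFF, visiting every input once
        std::printf("all 2^32 inputs match\n");
        return 0;
    }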
< petertodd> gmaxwell: I'm writing a spec for dex, and it's interesting how you can make an argument for only supporting 16-bit ints natively, as you can exhaustively test them
< gmaxwell> yes, indeed that is a big normative polynomial, though I think the current CWRS code there mostly uses tables.
< gmaxwell> petertodd: that's something you could argue, though it would have to be weighed against footgun and bloat risks.
< gmaxwell> (though I suppose I could make an anti-footgun argument-- the user is much more likely to _know_ the range of the type is limited, when it's so limited they are constantly forced to deal with it)
< gmaxwell> over and underflow can be defined as script failure.
< petertodd> gmaxwell: yeah, it'd only be practical if you can write reasonably efficient n-bit add/multiply/etc. routines, and make them part of the built-in library
< petertodd> gmaxwell: yes, I'm planning for under/overflow to be a failure condition (probably with wrapping and/or overflow variants)
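(A minimal sketch of checked 64-bit arithmetic where overflow is a failure condition rather than silent wrapping, using the GCC/Clang overflow-checking builtins; the CheckedAdd/CheckedMul names are hypothetical, not from any existing spec.)

    #include <cstdint>
    #include <optional>

    // Overflow is reported as failure (nullopt) instead of wrapping, so a
    // script interpreter can abort evaluation on out-of-range results.
    std::optional<int64_t> CheckedAdd(int64_t a, int64_t b) {
        int64_t r;
        if (__builtin_add_overflow(a, b, &r)) return std::nullopt;
        return r;
    }

    std::optional<int64_t> CheckedMul(int64_t a, int64_t b) {
        int64_t r;
        if (__builtin_mul_overflow(a, b, &r)) return std::nullopt;
        return r;
    }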
< gmaxwell> petertodd: also a commonly interesting case is things like where one input has small range, and that doesn't really impede exhaustive testing.
< gmaxwell> e.g. U32 * U8. It's really common in software to take arbitrary numbers and multiply or divide them by small constants.
< petertodd> gmaxwell: so, what I noted in my draft spec doc is that, interestingly, economically relevant numbers often have quite large ranges: e.g. coin value
< gmaxwell> like computing 2/3 of a coin value.
< petertodd> gmaxwell: though, consensus critical economic use-cases just need summation, not multiplication/division (usually)
< sipa> petertodd: but but demurrage
< gmaxwell> for example, you may want to compute input * 2/3, and input - input * 2/3.
< gmaxwell> for value splits in a contract.
< gmaxwell> and input is 64 bits.
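(A minimal sketch of that 2/3 split done entirely in 64-bit arithmetic, since 2*input could overflow for a 64-bit value; the function name is illustrative.)

    #include <cstdint>

    // With v = 3q + r, floor(2v/3) = 2q + (r == 2 ? 1 : 0), and 2q always
    // fits in 64 bits, so no wider intermediate is needed.
    void SplitTwoThirds(uint64_t v, uint64_t& two_thirds, uint64_t& one_third) {
        uint64_t q = v / 3, r = v % 3;
        two_thirds = 2 * q + (r == 2 ? 1 : 0);  // input * 2/3, rounded down
        one_third  = v - two_thirds;            // input - input * 2/3
    }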
< petertodd> sipa: ok, austrian economics... :P
< petertodd> gmaxwell: well, once you talk about contracts more generally, you get interest calculations, which gets very complex...
< gmaxwell> polynomial approximations of interest calculations generally work fine over limited ranges.
< petertodd> gmaxwell: but I'm assuming sane contracts would generally only need a few calculations along those lines, so slower approaches should be ok
< petertodd> yeah, polynomial approx being one approach
< petertodd> in any case, it's looking like having reasonable type checking is a way harder and more complex problem :)
< gmaxwell> one nice way for exponential functions is to use CLZ to get the integer part of a base 2 log, and then use a polynomial correction.
< petertodd> gmaxwell: CLZ?
< sipa> count leading zeroes
< petertodd> sipa: ah!
< sipa> __builtin_clz
< gmaxwell> er actually it's the log you want the clz for, for exp you just need to use the leading bits of the number.
< gmaxwell> https://github.com/xiph/opus/blob/58dbcf23f3aecfb9c06abaef590d01bb3dba7a5a/celt/mathops.h#L184 is such an example. (of course to get to any other base for the exponential is just some scaling)
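(A rough sketch of the CLZ-based log2 idea in Q16.16 fixed point; the quadratic correction and its coefficient are a crude illustrative fit with roughly 1% error, not the Opus coefficients, and real consensus code would use a higher-order fit and verify the worst-case error exhaustively.)

    #include <cstdint>

    // Returns log2(x) in Q16.16 for x > 0: CLZ gives the integer part, and a
    // small polynomial corrects for the normalized mantissa.
    int32_t Log2Q16(uint32_t x) {
        int e = 31 - __builtin_clz(x);  // integer part: index of the top set bit
        // Normalize so f = x / 2^e - 1 lies in [0, 1), kept as Q16.
        uint32_t f = (e >= 16) ? (x >> (e - 16)) - (1u << 16)
                               : (x << (16 - e)) - (1u << 16);
        // log2(1 + f) ~= f + 0.34 * f * (1 - f)
        uint32_t corr = uint32_t((uint64_t(f) * ((1u << 16) - f)) >> 16);  // f*(1-f) in Q16
        uint32_t frac = f + uint32_t((uint64_t(corr) * 22282) >> 16);      // 0.34 ~= 22282/65536
        return (int32_t(e) << 16) + int32_t(frac);
    }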
< petertodd> gmaxwell: a neat article to write would be numerical methods for consensus critical code :)
< sipa> "Bounded memory, bounded CPU usage... and bounded errors"
< gmaxwell> I think they mostly end up like numerical methods for fixed point realtime dsp.
< petertodd> sipa: lol, did chris send you a copy of my spec? :P
< sipa> no
< sipa> well, maybe he did, but i did not look at it
< petertodd> sipa: I mean, I have basically the exact same sentence in it - though no surprise, same problem :)
< petertodd> gmaxwell: yeah, probably true
< gmaxwell> with perhaps some additional considerations for exhaustive testing and/or formal verification.
< gmaxwell> but yea, the other reason you exhaustively test these approximations is to characterize the worst case error: https://github.com/xiph/opus/blob/58dbcf23f3aecfb9c06abaef590d01bb3dba7a5a/celt/mathops.c#L202
< petertodd> bbl, need a bigger battery...
< midnightmagic> i wonder how heavy that special 9-cell thinkpad thing is
< petertodd> midnightmagic: doesn't help that mine is a few years old - batteries wearing out
< sipa> petertodd: protip, when buying a new laptop, buy an extra battery that fits... once your battery wears out they'll be harder to find
< petertodd> sipa: good advice - though I've never actually bought a laptop new
< * midnightmagic> has no user-replaceable battery in his mbp. :(
< midnightmagic> petertodd: why not?
< petertodd> midnightmagic: because I act like I'm a poor student :P
< petertodd> midnightmagic: I have a t520 right now, which I think is about four years old
< midnightmagic> Humility and reasonable resource management are virtues.
< petertodd> midnightmagic: well, I was going through my corporate expenses the other day... and a new laptop would be a drop in the bucket compared to what gets spent on me travelling
< midnightmagic> i hate travelling :( the world is made for people much, much smaller than I am.
< petertodd> midnightmagic: I actually find I get more work done; I'm no tourist so when I'm in the middle of a foreign country I tend to find somewhere to sit down with my laptop :)
< midnightmagic> i do too, all the stuff that is on the to-do list in my home can't be addressed so I can let it go
< petertodd> midnightmagic: yeah, and when I'm home, friends want to like, hang out with me and take up all my time :P
< midnightmagic> psh. friends. so annoying.
< petertodd> sure
< petertodd> er, wrong window - stupid lag
< sipa> haha
< gmaxwell> sipa: lithium-ion batteries of various types have a shelf life, sadly.
< gmaxwell> The unused one will also fade.
< gmaxwell> But fortunately if you buy something like a thinkpad, people make batteries for them forever.
< gmaxwell> You can buy batteries for ten year old thinkpads no problem.
< gmaxwell> petertodd: go stab the people your blog post has confused: https://www.reddit.com/r/Bitcoin/comments/4s3a9r/segwit_code_review_update/
< kanzure> consensus-related non-malleability vs wallet/p2p-level, is my guess.
< petertodd> kanzure: it's just a flaw in the way the mempool/p2p refers to segwit txs; has little if anything to do with consensus
< petertodd> kanzure: tl;dr: is that the p2p asks for txs by txid, which doesn't commit to the witness, and marks invalid txs by txid, without being able to consistently know if a tx was invalid due to a bad witness
< petertodd> equally, s/invalid/unacceptable due to fee, size, whatever/
< sipa> i actually don't think the flaw is referring by txid instead of wtxid
< petertodd> sipa: how so?
< sipa> but we should have made resource limits part of the base transaction, not the witness
< kanzure> yes i was wondering about things like "how to actually show a node that a certain tx is valid later, if at first it receives a bad witness" :\
< petertodd> sipa: what do you mean by that?
< sipa> (fees, sigop count, byte sizes) go in scriptSig... or even better, in the inv
< sipa> so a node can decide to not process before you know... processing
< petertodd> sipa: duplicating that stuff has historically led to endless problems, basically because you have to check it twice
< sipa> how do you mean?
< petertodd> sipa: in a decent system, processing even invalid txs is something that happens very quickly, so there's no DoS attack
< petertodd> sipa: notice how satoshi screwed up sigops from the beginning...
< sipa> unfortunately, fees and sigop counting require fetching the utxos
< petertodd> sipa: in any of these schemes, you still have to count up sigops as they're actually executed, to check that the sum matches (or is less than) the claimed sum
< sipa> of course
< petertodd> sipa: so? fetching utxos can't be expensive, or we've already lost
< sipa> but a mismatch can then actually result in a ban, because it cannot happen accidentally
< petertodd> sipa: if we used wtxids, you could still ban based on that
< sipa> but our rate limiting is based on feerate, which depends on fee, which we cannot enforce until we've done the effort
< sipa> if there is no rate limit, even a cheap validation per transaction will be too much
< petertodd> sipa: huh? someone sends you a DoS tx, just ban them - there's no reason *legit* transactions should take significant cost to accept
< sipa> petertodd: so there is a trivial solution... fully validate every transaction you asked for
< sipa> so you don't prematurely discard a transaction before finding out it was an attempted attack
< sipa> if it is too expensive, or invalid, or malleated... you can ban whoever sent it
< petertodd> sipa: I think you're missing my point: the threshold that we consider it an "attempted attack" should be low enough that there's no DoS issues; txs fundamentally should never be expensive to validate, and cases where they are should be non-standard, and eventually, removed entirely from the protocol
< sipa> yes
< sipa> i agree
< sipa> but the issue here is that we fail to detect whether a too-expensive transaction is due to its creator or due to who relayed it
< petertodd> sipa: right, but wtxid solves that issue
< petertodd> if relayer malleates, we'll still ask for a different version of the same txid if another peer gives us it
< sipa> yes, it would... but it adds complications
< petertodd> how so?
< sipa> you need indexes by both txid and wtxid..
< sipa> and you always risk requesting things twice
< petertodd> sipa: in what, the mempool/p2p?
< petertodd> but they are different things!
< sipa> in my view they are the same thing
< sipa> one with an optional part omitted
< sipa> except we only find out too late that it was not optional
< petertodd> well, I don't agree at all - the optional part has a non-optional effect on the tx
< sipa> for those who care (which a full node does)
< sipa> that's a semantic discussion, though
< sipa> i can see your point
< petertodd> I certainly care: my tx fees depend on the witness
< petertodd> I may also have a protocol where I'm publishing something in the witness, e.g. hashlock
< gmaxwell> Cleanstack means that ordinary transactions cannot have their fees increased without invalidating them. (If not for that I would have already recommended we have some preprocessing to strip transactions as they're relayed)
< sipa> i think the easiest solution is to validate scripts even for things we know we won't accept
< gmaxwell> You should probably assume that relaying nodes will perform witness stripping in the future.
< sipa> we have spent almost all the work anyway (fetched inputs, and wasted bandwidth)
< petertodd> gmaxwell: with current tx patterns yes; it's non-trivial to make txs with more complex scripts that have non-malleable witnesses
< gmaxwell> (by witness stripping I mean compacting the witness to the minimal data needed to be valid, as best it can determine)
< petertodd> sipa: as in, validate scripts to determine if the witness is wrong?
< gmaxwell> Yes.
< gmaxwell> This was something I had been advocating for a while because there are some other potential DOS attacks that exist because we don't.
< gmaxwell> (a while = before segwit existed)
< petertodd> well, again, that assumes you know how to clean witnesses - there are tx patterns where that's not trivial (and indeed, the user may intentionally have more than one form of witness for the same txid)
< gmaxwell> though without an enforced feefilter it's a bit less than ideal.
< gmaxwell> petertodd: sure any cleaning would always be a best effort attempt.
< petertodd> I'm still of the opinion that asking for a wtxid is a way simpler overall conceptual design (obvs implementation level may suck due to historical baggage)
< sipa> this is orthogonal to fetching by wtxid, of course
< petertodd> it's not orthogonal at all: if I manage to clean a script significantly, I want my peers to get it, and then say "hey, this is way smaller, let's replace it in our mempool..."
< gmaxwell> yea, I agree it's orthogonal, fetching by wtxid has an amplification attack vector that is kind of sad.
< petertodd> gmaxwell: what's the vector?
< sipa> gmaxwell: presenting multiple versions of the same transaction with different witness?
< gmaxwell> The amplification attack vector is that I grab transactions and relay witness malleations to every node in the network, a different version to each node, so when I happen to get a txn early every node ends up with a different wtxid and you get N^2 bandwidth forwarding it around.
< petertodd> gmaxwell: but you can do that anyway with your own txs
< sipa> you could of course create invs with both txids and wtxids
< petertodd> sipa: yeah, that'd be fine
< petertodd> sipa: and obviously, our peer can tell us it's segwit and wants us to do that
< sipa> except that also breaks your try to replace with smaller witness use case
< petertodd> sipa: why? once you go down that rabbit hole, you can also advertise length
< sipa> petertodd: yes, that was my suggestion in the beginning of the discussion :)
< sipa> advertise sigops/size/fees in the inv
< petertodd> sipa: no, you suggested having the tx _commit_ to that info, which is a very different thing; non-consensus critical code advertising length isn't a big deal
< sipa> petertodd: reread my sentence
< petertodd> sipa: ah, ok, I have no issues with that
< petertodd> sipa: (I've been thinking about this exact kind of issue for my dex specification actually)
< sipa> "... or even better, in the inv$
< petertodd> sipa: yeah, for mempool I think invs advertising this stuff makes a lot of sense
< petertodd> for starters, if we screw that up, it's relatively easy to fix :)
< sipa> though i think it's not necessarily an urgent issue
< sipa> the worst case is that a bad peer can poison your reject cache, preventing you from getting a legitimate transaction before it confirms
< petertodd> so, is a reasonable game plan to release segwit with the current p2p design, and then add wtxids to invs later? (potentially with sigops/size in the inv)
< petertodd> sipa: it's a good thing no-one relies on zeroconf anymore :P
< sipa> i'm sure there are other ways that you can accomplish that
< sipa> (like inving and not responding to getdata)
< gmaxwell> wrt different kinds of relaying later, mostly I thought those improvements would go into mempool sync.
< gmaxwell> and rather than wtxid relay, wtxid,feerate,txid tuples (probably truncated/quantized) may make more sense.
< sipa> yeah, for mempool sync we can redesign things cleanly
< petertodd> gmaxwell: note too how schemes like txo commitments allow - and expect - nodes to do significant modifications to txs
< gmaxwell> (or even wtxid, feerate, vin id with lowest hash)
< gmaxwell> in any case, optimal sync behavior in the presence of double spends (of any kind) isn't a nut I've cracked yet.
< sipa> petertodd: yes, the priority should be to make sure no internal inconsistency or banning of unaware nodes occur
< gmaxwell> I think I have constructions for schemes which are close to optimal absent doublespends.
< gmaxwell> we already improved relay efficiency a LOT in 0.13, fwiw.
< sipa> petertodd: rejectioncache behaviour can either degenerate into attackers preventing you from receiving transactions on the one hand
< sipa> or to the old pre-rejectioncache behaviour of requesting failed txn from every peer (which is made much less bad due to feefilter already)
< gmaxwell> there are already several trivially exploited ways where an attacker can massively delay you getting a transaction.
< petertodd> sipa: well, again, an attacker can do that DoS attack without segwit by just sending multiple slightly different versions of the same tx
< gmaxwell> (e.g. just inv the damn thing from many sockets and don't respond.)
< sipa> yeah
< grubles> 1/3
< grubles> oops. sorry.
< sipa> 0.33333333
< grubles> :)