< petertodd> I could make use of a breakdown by age fwiw
< sipa> yeah, i'll create such a graph too
< sipa> should be easier as it doesn't require reindexing
< gmaxwell> I'm thinking it would be interesting to solve the following optimization problem: given a set of utxo size, birth/death times, value. Find the coefficients of a multinomial f(value,size,age_in_blocks) such that a fixed cache of size N with retention sorted by f() has the highest hit rate (item still in cache at utxo death).
< petertodd> sipa: thanks!
< petertodd> gmaxwell: though note that you probably can't bake something like that into a consensus thing like txo commitments w/ utxo cache, as it changes people's behavior...
< petertodd> gmaxwell: potentially still useful for local optimisation though
< gmaxwell> petertodd: well you can to some extent, for example, having a different cache policy at some threshold, like 1BTC... 21 million txo is the absolute maximum number of outputs at 1BTC+ so there is a pretty strong upper bound on whatever impact you could have on encouraging people to make utxo at or over 1 BTC. :)
< petertodd> gmaxwell: I mean, if you take the coefficients from prior data, they're likely to be wrong after people change their behavior - if you use coefficients, you need to have a different rationale is all I'm saying
< petertodd> gmaxwell: keeping higher value UTXO's in cache longer probably does make sense
< gmaxwell> yea sure.. the full answer probably isn't that useful as a consensus rule
< gmaxwell> The part of the answer that tells you some value breakpoint(s) may be.
< gmaxwell> because even if they're slightly wrong due to changing usage which they themselves incentivize, the whole prospect of having value relative retention is reasonable.
< gmaxwell> e.g. if age * value is a good scoring function on the past data, it probably is a robust one, which could be prudently used in consensus.
< petertodd> yup
< gmaxwell> or some polynomial on age * value. Or really I think any degree multinomial on age and value is probably also okay. Only size is one that is probably busted. :)
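(For illustration, a minimal sketch of how a candidate scoring function could be evaluated against historical UTXO data: replay a trace block by block, keep the N highest-scoring entries, and count how often an output is still cached at the height it is spent. The UtxoRecord layout, Score(), and HitRate() below are assumptions made up for this sketch, not code from any existing tool.)

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct UtxoRecord {
        uint64_t value;         // satoshis
        uint32_t size;          // serialized size in bytes
        uint32_t birth_height;  // block that created the output
        uint32_t death_height;  // block that spent it
    };

    // One candidate scoring function from the discussion: age * value.
    static double Score(const UtxoRecord& u, uint32_t tip) {
        return double(tip - u.birth_height) * double(u.value);
    }

    double HitRate(std::vector<UtxoRecord> trace, size_t cache_size, uint32_t end_height) {
        std::sort(trace.begin(), trace.end(), [](const UtxoRecord& a, const UtxoRecord& b) {
            return a.birth_height < b.birth_height;
        });
        std::vector<const UtxoRecord*> cache;
        size_t next = 0, hits = 0, total = 0;
        for (uint32_t h = 0; h <= end_height; ++h) {
            // Admit outputs born at this height.
            while (next < trace.size() && trace[next].birth_height == h) cache.push_back(&trace[next++]);
            // Evict the lowest-scoring entries down to the cache limit.
            if (cache.size() > cache_size) {
                std::nth_element(cache.begin(), cache.begin() + cache_size, cache.end(),
                    [h](const UtxoRecord* a, const UtxoRecord* b) { return Score(*a, h) > Score(*b, h); });
                cache.resize(cache_size);
            }
            // A hit is an output that is still cached at the height it is spent.
            for (auto it = cache.begin(); it != cache.end();) {
                if ((*it)->death_height == h) { ++hits; it = cache.erase(it); } else { ++it; }
            }
        }
        for (const auto& u : trace) total += (u.death_height <= end_height);
        return total ? double(hits) / double(total) : 0.0;
    }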
< petertodd> gmaxwell: speaking of, what approaches are good for writing polynomials in consensus critical code? (I wonder if someone has a writeup on that)
< gmaxwell> So-- if possible, probably best by converting it to a simple segmented linear approximation. But failing that, I would assume fixed point with Horner's method (see wikipedia), which is more computationally efficient and has better numerical properties than the naive way you'd compute it.
< gmaxwell> this is where you write it as c0 + x * (c1 + x * (c2 + x * ...))
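(A minimal sketch of that nesting in Q16.16 fixed point; the function name, coefficient layout, and scaling are assumptions for illustration, not the Opus or Bitcoin Core code.)

    #include <cstddef>
    #include <cstdint>

    // Horner evaluation of c0 + x*(c1 + x*(c2 + ...)) in Q16.16 fixed point:
    // one multiply and one add per coefficient, rescaling after each multiply
    // so intermediates stay in Q16.16. Assumes |acc * x| fits in 64 bits for
    // the intended input range, and an arithmetic right shift for negatives.
    int64_t EvalPolyQ16(int64_t x_q16, const int64_t* coeff_q16, size_t n) {
        int64_t acc = coeff_q16[n - 1];
        for (size_t i = n - 1; i-- > 0;) {
            acc = coeff_q16[i] + ((acc * x_q16) >> 16);
        }
        return acc;
    }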
< gmaxwell> petertodd: there are all sorts of 'consensus critical' polynomials in opus (ones where a discrepancy in the integer value returned causes a total entropy decoding failure)-- they never were a big issue, they're written like that, and we tested them exhaustively. no biggie.
< petertodd> gmaxwell: cool, thanks. re: exhaustively, I assume that's 16bit ints at most?
< gmaxwell> no, 32 bits.
< petertodd> gmaxwell: huh, how does that work?
< gmaxwell> for some function that's just doing some multiplies and adds, trying all 32-bit inputs isn't a big deal.
< petertodd> gmaxwell: wait, as in 2^64?
< sipa> i assume it's a function with 1 input :)
< gmaxwell> no, as in computing f(x) where x is some 32-bit value and f is a single-variable polynomial in x. :)
< petertodd> gmaxwell: yeah, that makes a lot more sense :)
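(A sketch of what such an exhaustive check can look like, using a toy stand-in for the function under test: unsigned division by 3 via a magic multiply, which has the same multiply-and-shift shape as a fixed-point polynomial. The names and the toy function are illustrative, not the Opus test code.)

    #include <cstdint>
    #include <cstdio>

    // Toy function under test: floor(x / 3) computed as a multiply and shift.
    static uint32_t DivideBy3(uint32_t x) {
        return uint32_t((uint64_t(x) * 0xAAAAAAABULL) >> 33);
    }

    // With a single 32-bit input, trying every value is practical: roughly
    // 4.3 billion iterations of a few arithmetic ops per run.
    int main() {
        uint32_t x = 0;
        do {
            if (DivideBy3(x) != x / 3) {
                std::printf("mismatch at %u\n", x);
                return 1;
            }
        } while (++x != 0);  // wraps to 0 after 0xFFFFFFFF, visiting every input once
        std::printf("all 2^32 inputs match\n");
        return 0;
    }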
< petertodd> gmaxwell: I'm writing a spec for dex, and it's interesting how you can make an argument for only supporting 16-bit ints natively, as you can exhaustively test them
< gmaxwell> yes, indeed that is a big normative polynomial, though I think the current CWRS code there mostly uses tables.
< gmaxwell> petertodd: that's something you could argue, though it would have to be weighed against footgun and bloat risks.
< gmaxwell> (though I suppose I could make an anti-footgun argument-- the user is much more likely to _know_ the range of the type is limited, when it's so limited they are constantly forced to deal with it)
< gmaxwell> over and underflow can be defined as script failure.
< petertodd> gmaxwell: yeah, it'd only be practical if you can write reasonably efficient n-bit add/multiply/etc. routines, and make them part of the built-in library
< petertodd> gmaxwell: yes, I'm planning for under/overflow to be a failure condition (probably with wrapping and/or overflow variants)
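(A minimal sketch of checked 64-bit arithmetic where overflow is a failure condition rather than silent wrapping, using the GCC/Clang overflow-checking builtins; the CheckedAdd/CheckedMul names are hypothetical, not from any existing spec.)

    #include <cstdint>
    #include <optional>

    // Overflow is reported as failure (nullopt) instead of wrapping, so a
    // script interpreter can abort evaluation on out-of-range results.
    std::optional<int64_t> CheckedAdd(int64_t a, int64_t b) {
        int64_t r;
        if (__builtin_add_overflow(a, b, &r)) return std::nullopt;
        return r;
    }

    std::optional<int64_t> CheckedMul(int64_t a, int64_t b) {
        int64_t r;
        if (__builtin_mul_overflow(a, b, &r)) return std::nullopt;
        return r;
    }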
< gmaxwell> petertodd: also a commonly interesting case is things like where one input has small range, and that doesn't really impede exhaustive testing.
< gmaxwell> e.g. U32 * U8. It's really common in software to take arbitrary numbers and multiply or divide them by small constants.
< petertodd> gmaxwell: so, what I noted in my draft spec doc is that, interestingly, economically relevant numbers often have quite large ranges: e.g. coin value
< gmaxwell> like computing 2/3 of a coin value.
< petertodd> gmaxwell: though, consensus critical economic use-cases just need summation, not multiplication/division (usually)
< sipa> petertodd: but but demurrage
< gmaxwell> for example, you may want to compute input * 2/3, and input - input * 2/3.
< gmaxwell> for value splits in a contract.
< gmaxwell> and input is 64 bits.
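(A minimal sketch of that 2/3 split done entirely in 64-bit arithmetic, since 2*input could overflow for a 64-bit value; the function name is illustrative.)

    #include <cstdint>

    // With v = 3q + r, floor(2v/3) = 2q + (r == 2 ? 1 : 0), and 2q always
    // fits in 64 bits, so no wider intermediate is needed.
    void SplitTwoThirds(uint64_t v, uint64_t& two_thirds, uint64_t& one_third) {
        uint64_t q = v / 3, r = v % 3;
        two_thirds = 2 * q + (r == 2 ? 1 : 0);  // input * 2/3, rounded down
        one_third  = v - two_thirds;            // input - input * 2/3
    }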
< petertodd> sipa: ok, austrian economics... :P
< petertodd> gmaxwell: well, once you talk about contracts more generally, you get interest calculations, which gets very complex...
< gmaxwell> polynomial approximations of interest calculations generally work fine over limited ranges.
< petertodd> gmaxwell: but I'm assuming sane contracts would generally only need a few calculations along those lines, so slower approaches should be ok
< petertodd> yeah, polynomial approx being one approach
< petertodd> in any case, it's looking like having reasonable type checking is a way harder and more complex problem :)
< gmaxwell> one nice way for exponential functions is to use CLZ to get the integer part of a base 2 log, and then use a polynomial correction.
< petertodd> gmaxwell: CLZ?
< sipa> count leading zeroes
< petertodd> sipa: ah!
< sipa> __builtin_clz
< gmaxwell> er actually it's the log you want the clz for, for exp you just need to use the leading bits of the number.
< gmaxwell> https://github.com/xiph/opus/blob/58dbcf23f3aecfb9c06abaef590d01bb3dba7a5a/celt/mathops.h#L184 is such an example. (of course to get to any other base for the exponential is just some scaling)
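(A rough sketch of the CLZ-based log2 idea in Q16.16 fixed point; the quadratic correction and its coefficient are a crude illustrative fit with roughly 1% error, not the Opus coefficients, and real consensus code would use a higher-order fit and verify the worst-case error exhaustively.)

    #include <cstdint>

    // Returns log2(x) in Q16.16 for x > 0: CLZ gives the integer part, and a
    // small polynomial corrects for the normalized mantissa.
    int32_t Log2Q16(uint32_t x) {
        int e = 31 - __builtin_clz(x);  // integer part: index of the top set bit
        // Normalize so f = x / 2^e - 1 lies in [0, 1), kept as Q16.
        uint32_t f = (e >= 16) ? (x >> (e - 16)) - (1u << 16)
                               : (x << (16 - e)) - (1u << 16);
        // log2(1 + f) ~= f + 0.34 * f * (1 - f)
        uint32_t corr = uint32_t((uint64_t(f) * ((1u << 16) - f)) >> 16);  // f*(1-f) in Q16
        uint32_t frac = f + uint32_t((uint64_t(corr) * 22282) >> 16);      // 0.34 ~= 22282/65536
        return (int32_t(e) << 16) + int32_t(frac);
    }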
< petertodd> gmaxwell: a neat article to write would be numerical methods for consensus critical code :)
< sipa> "Bounded memory, bounded CPU usage... and bounded errors"
< gmaxwell> I think they mostly end up like numerical methods for fixed point realtime dsp.
< petertodd> sipa: lol, did chris send you a copy of my spec? :P
< sipa> no
< sipa> well, maybe he did, but i did not look at it
< petertodd> sipa: I mean, I have basically the exact same sentence in it - though no surprise, same problem :)
< petertodd> gmaxwell: yeah, probably true
< gmaxwell> with perhaps some additional considerations for exhaustive testing and/or formal verification.
< gmaxwell> but yea, the other reason you exhaustively test these approximations is to characterize the worst case error: https://github.com/xiph/opus/blob/58dbcf23f3aecfb9c06abaef590d01bb3dba7a5a/celt/mathops.c#L202
< petertodd> bbl, need a bigger battery...
< midnightmagic> i wonder how heavy that special 9-cell thinkpad thing is
< petertodd> midnightmagic: doesn't help that mine is a few years old - batteries wearing out
< sipa> petertodd: protip, when buying a new laptop, buy an extra battery that fits... once your battery wears out they'll be harder to find
< petertodd> sipa: good advice - though I've never actually bought a laptop new
< * midnightmagic> has no user-replaceable battery in his mbp. :(
< midnightmagic> petertodd: why not?
< petertodd> midnightmagic: because I act like I'm a poor student :P
< petertodd> midnightmagic: I have a t520 right now, which I think is about four years old
< midnightmagic> Humility and reasonable resource management are virtues.
< petertodd> midnightmagic: well, I was going through my corporate expenses the other day... and a new laptop would be a drop in the bucket compared to what gets spent on me travelling
< midnightmagic> i hate travelling :( the world is made for people much, much smaller than I am.
< petertodd> midnightmagic: I actually find I get more work done; I'm no tourist so when I'm in the middle of a foreign country I tend to find somewhere to sit down with my laptop :)
< midnightmagic> i do too, all the stuff that is on the to-do list in my home can't be addressed so I can let it go
< petertodd> midnightmagic: yeah, and when I'm home, friends want to like, hang out with me and take up all my time :P
< midnightmagic> psh. friends. so annoying.
< petertodd> sure
< petertodd> er, wrong window - stupid lag
< sipa> haha
< gmaxwell> sipa: lithium-ion batteries of various types have a shelf life, sadly.
< gmaxwell> The unused one will also fade.
< gmaxwell> But fortunately if you buy something like a thinkpad, people make batteries for them forever.
< gmaxwell> You can buy batteries for ten year old thinkpads no problem.
< gmaxwell> petertodd: go stab the people your blog post has confused: https://www.reddit.com/r/Bitcoin/comments/4s3a9r/segwit_code_review_update/
< kanzure> consensus-related non-malleability vs wallet/p2p-level, is my guess.
< petertodd> kanzure: it's just a flaw in the way the mempool/p2p refers to segwit txs; has little if anything to do with consensus
< petertodd> kanzure: tl;dr: is that the p2p asks for txs by txid, which doesn't commit to the witness, and marks invalid txs by txid, without being able to consistently know if a tx was invalid due to a bad witness
< petertodd> equally, s/invalid/unacceptable due to fee, size, whatever/
< sipa> i actually don't think the flaw is referring by txid instead of wtxid
< petertodd> sipa: how so?
< sipa> but we should have made resource limits part of the base transaction, not the witness
< kanzure> yes i was wondering about things like "how to actually show a node that a certain tx is valid later, if at first it receives a bad witness" :\
< petertodd> sipa: what do you mean by that?
< sipa> (fees, sigop count, byte sizes) go in scriptSig... or even better, in the inv
< sipa> so a node can decide to not process before you know... processing
< petertodd> sipa: duplicating that stuff has historically led to endless problems, basically because you have to check it twice
< sipa> how do you mean?
< petertodd> sipa: in a decent system, processing even invalid txs is something that happens very quickly, so there's no DoS attack
< petertodd> sipa: notice how satoshi screwed up sigops from the beginning...
< sipa> unfortunately, fees and sigop counting require fetching the utxos
< petertodd> sipa: in any of these schemes, you still have to count up sigops as they're actually executed, to check that the sum matches (or is less than) the claimed sum
< sipa> of course
< petertodd> sipa: so? fetching utxos can't be expensive, or we've already lost
< sipa> but a mismatch can then actually result in a ban, because it cannot happen accidentally
< petertodd> sipa: if we used wtxids, you could still ban based on that
< sipa> but our rate limiting is based on feerate, which depends on fee, which we cannot enforce until we've done the effort
< sipa> if there is no rate limit, even a cheap validation per transaction will be too much
< petertodd> sipa: huh? someone sends you a DoS tx, just ban them - there's no reason *legit* transactions should take significant cost to accept
< sipa> petertodd: so there is a trivial solution... fully validate every transaction you asked for
< sipa> so you don't prematurely discard a transaction before finding out it was an attempted attack
< sipa> if it is too expensive, or invalid, or malleated... you can ban whoever sent it
< petertodd> sipa: I think you're missing my point: the threshold that we consider it an "attempted attack" should be low enough that there's no DoS issues; txs fundamentally should never be expensive to validate, and cases where they are should be non-standard, and eventually, removed entirely from the protocol
< sipa> yes
< sipa> i agree
< sipa> but the issue here is that we fail to detect whether a too-expensive transaction is due to its creator or due to who relayed it
< petertodd> sipa: right, but wtxid solves that issue
< petertodd> if relayer malleates, we'll still ask for a different version of the same txid if another peer gives us it
< sipa> yes, it would... but it adds complications
< petertodd> how so?
< sipa> you need indexes by both txid and wtxid..
< sipa> and you always risk requesting things twice
< petertodd> sipa: in what, the mempool/p2p?
< petertodd> but they are different things!
< sipa> in my view they are the same thing
< sipa> one with an optional part omitted
< sipa> except we only find out too late that it was not optional
< petertodd> well, I don't agree at all - the optional part has a non-optional effect on the tx
< sipa> for those who care (which a full node does)
< sipa> that's a semantic discussion, though
< sipa> i can see your point
< petertodd> I certainly care: my tx fees depend on the witness
< petertodd> I may also have a protocol where I'm publishing something in the witness, e.g. hashlock
< gmaxwell> Cleanstack means that ordinary transactions cannot have their fees increased without invalidating them. (If not for that I would have already recommended we have some preprocessing to strip transactions as they're relayed)
< sipa> i think the easiest solution is to validate scripts even for things we know we won't accept
< gmaxwell> You should probably assume that relaying nodes will perform witness stripping in the future.
< sipa> we have spent almost all the work anyway (fetched inputs, and wasted bandwidth)
< petertodd> gmaxwell: with current tx patterns yes; it's non-trivial to make txs with more complex scripts that have non-malleable witnesses
< gmaxwell> (by witness stripping I mean compacting the witness to the minimal data needed to be valid, as best it can determine)
< petertodd> sipa: as in, validate scripts to determine if the witness is wrong?
< gmaxwell> Yes.
< gmaxwell> This was something I had been advocating for a while because there are some other potential DOS attacks that exist because we don't.
< gmaxwell> (a while = before segwit existed)
< petertodd> well, again, that assumes you know how to clean witnesses - there are tx patterns where that's not trivial (and indeed, the user may intentionally have more than one form of witness for the same txid)
< gmaxwell> though without an enforced feefilter it's a bit less than ideal.
< gmaxwell> petertodd: sure any cleaning would always be a best effort attempt.
< petertodd> I'm still of the opinion that asking for a wtxid is a way simpler overall conceptual design (obvs implementation level may suck due to historical baggage)
< sipa> this is orthogonal to fetching by wtxid, of course
< petertodd> it's not orthogonal at all: if I manage to clean a script significantly, I want my peers to get it, and then say "hey, this is way smaller, let's replace it in our mempool..."
< gmaxwell> yea, I agree it's orthogonal, fetching by wtxid has an amplification attack vector that is kind of sad.
< petertodd> gmaxwell: what's the vector?
< sipa> gmaxwell: presenting multiple versions of the same transaction with different witness?
< gmaxwell> The amplification attack vector is that I grab transactions and relay witness malleations to every node in the network, a different version to each node, so when I happen to get a txn early every node ends up with a different wtxid and you get N^2 bandwidth forwarding it around.
< petertodd> gmaxwell: but you can do that anyway with your own txs
< sipa> you could of course create invs with both txids and wtxids
< petertodd> sipa: yeah, that'd be fine
< petertodd> sipa: and obviously, our peer can tell us it's segwit and wants us to do that
< sipa> except that also breaks your try to replace with smaller witness use case
< petertodd> sipa: why? once you go down that rabbit hole, you can also advertise length
< sipa> petertodd: yes, that was my suggestion in the beginning of the discussion :)
< sipa> advertise sigops/size/fees in the inv
< petertodd> sipa: no, you suggested having the tx _commit_ to that info, which is a very different thing; non-consensus critical code advertising length isn't a big deal
< sipa> petertodd: reread my sentence
< petertodd> sipa: ah, ok, I have no issues with that
< petertodd> sipa: (I've been thinking about this exact kind of issue for my dex specification actually)
< sipa> "... or even better, in the inv$
< petertodd> sipa: yeah, for mempool I think invs advertising this stuff makes a lot of sense
< petertodd> for starters, if we screw that up, it's relatively easy to fix :)
< sipa> though i think it's not necessarily an urgent issue
< sipa> the worst case is that a bad peer can poison your reject cache, preventing you from getting a legitimate transaction before it confirms
< petertodd> so, is a reasonable game plan to release segwit with the current p2p design, and then add wtxids to invs later? (potentially with sigops/size in the inv)
< petertodd> sipa: it's a good thing no-one relies on zeroconf anymore :P
< sipa> i'm sure there are other ways that you can accomplish that
< sipa> (like inving and not responding to getdata)
< gmaxwell> wrt different kinds of relaying later, mostly I thought those improvements would go into mempool sync.
< gmaxwell> and rather than wtxid relay, wtxid,feerate,txid tuples (probably truncated/quantized) may make more sense.
< sipa> yeah, for mempool sync we can redesign things cleanly
< petertodd> gmaxwell: note too how schemes like txo commitments allow - and expect - nodes to do significant modifications to txs
< gmaxwell> (or even wtxid, feerate, vin id with lowest hash)
< gmaxwell> in any case, optimal sync behavior in the presence of double spends (of any kind) isn't a nut I've cracked yet.
< sipa> petertodd: yes, the priority should be to make sure no internal inconsistency or banning of unaware nodes occur
< gmaxwell> I think I have constructions for schemes which are close to optimal absent doublespends.
< gmaxwell> we already improved relay efficiency a LOT in 0.13, fwiw.
< sipa> petertodd: rejectioncache behaviour can either degenerate into attackers preventing you from receiving transactions on the one hand
< sipa> or to the old pre-rejectioncache behaviour of requesting failed txn from every peer (which is made much less bad due to feefilter already)
< gmaxwell> there are already several trivially exploited ways where an attacker can massively delay you getting a transaction.
< petertodd> sipa: well, again, an attacker can do that DoS attack without segwit by just sending multiple slightly different versions of the same tx
< gmaxwell> (e.g. just inv the damn thing from many sockets and don't respond.)
< sipa> yeah
< grubles> 1/3
< grubles> oops. sorry.
< sipa> 0.33333333
< grubles> :)