< petertodd>
I could make use of a breakdown by age fwiw
< sipa>
yeah, i'll create such a graph too
< sipa>
should be easier as it doesn't require reindexing
< gmaxwell>
I'm thinking it would be interesting to solve the following optimization problem: given a set of utxos with size, birth/death times, and value, find the coefficients of a multinomial f(value, size, age_in_blocks) such that a fixed cache of size N with retention sorted by f() has the highest hit rate (item still in cache at utxo death).
< petertodd>
sipa: thanks!
< petertodd>
gmaxwell: though note that you probably can't bake something like that into a consensus thing like txo commitments w/ utxo cache, as it changes people's behavior...
< petertodd>
gmaxwell: potentially still useful for local optimisation though
< gmaxwell>
petertodd: well you can to some extent; for example, having a different cache policy at some threshold, like 1BTC... 21 million txos is the absolute maximum number of outputs at 1BTC+, so there is a pretty strong upper bound on whatever impact you could have on encouraging people to make utxos at or over 1 BTC. :)
< petertodd>
gmaxwell: I mean, if you take the coefficients from prior data, they're likely to be wrong after people change their behavior - if you use coefficients, you need a different rationale is all I'm saying
< petertodd>
gmaxwell: keeping higher-value UTXOs in cache longer probably does make sense
< gmaxwell>
yea sure.. the full answer probably isn't that useful as a consensus rule
< gmaxwell>
The part of the answer that tells you some value breakpoint(s) may be.
< gmaxwell>
because even if they're slightly wrong due to changing usage which they themselves incentivize, the whole prospect of value-relative retention is reasonable.
< gmaxwell>
e.g. if age * value is a good scoring function on the past data, it probably is a robust one, which could be prudently used in consensus.
< petertodd>
yup
< gmaxwell>
or some polynomial on age * value. Or really I think any degree of multinomial on age and value is probably also okay. Size is the only one that's probably busted. :)
< petertodd>
gmaxwell: speaking of, what approaches are good for writing polynomials in consensus critical code? (I wonder if someone has a writeup on that)
< gmaxwell>
So-- if possible, probably best by converting it to a simple segmented linear approximation. But failing that, I would assume fixed point with Horner's method (see wikipedia), which is more computationally efficient and has better numerical properties than the naive way you'd compute it.
< gmaxwell>
this is where you write it as c0 + x * (c1 + x * (c2 + x * ...))
< gmaxwell>
petertodd: there are all sorts of 'consensus critical' polynomials in opus (ones where a discrepancy in the integer value returned causes a total entropy decoding failure)-- they never were a big issue, they're written like that, and we tested them exhaustively. no biggie.
< petertodd>
gmaxwell: cool, thanks. re: exhaustively, I assume that's 16-bit ints at most?
< gmaxwell>
no, 32 bits.
< petertodd>
gmaxwell: huh, how does that work?
< gmaxwell>
for some function that's just doing some multiplies and adds, trying all 32-bit inputs isn't a big deal.
< gmaxwell>
no, as in computing f(x) where x is some 32-bit value and f is a single-variable polynomial in x. :)
< petertodd>
gmaxwell: yeah, that makes a lot more sense :)
< petertodd>
gmaxwell: I'm writing a spec for dex, and it's interesting how you can make an argument for only supporting 16-bit ints natively, as you can exhaustively test them
< gmaxwell>
yes, indeed that is a bit normative polynomial, though I think the current CWRS code there mostly uses tables.
< gmaxwell>
petertodd: that's something you could argue, though it would have to be weighed against footgun and bloat risks.
< gmaxwell>
(though I suppose I could make an anti-footgun argument-- the user is much more likely to _know_ the range of the type is limited, when it's so limited they are constantly forced to deal with it)
< gmaxwell>
over and underflow can be defined as script failure.
< petertodd>
gmaxwell: yeah, it'd only be practical if you can write reasonably efficient n-bit add/multiply/etc. routines, and make them part of the built-in library
< petertodd>
gmaxwell: yes, I'm planning on under/overflow to be a failure condition (probably with wrapping and/or overflow variants)
< gmaxwell>
petertodd: also a commonly interesting case is things like where one input has a small range, and that doesn't really impede exhaustive testing.
< gmaxwell>
e.g. U32 * U8. It's really common in software to take arbitrary numbers and multiply or divide them by small constants.
< petertodd>
gmaxwell: so, what I noted in my draft spec doc is that, interestingly, economically relevant numbers often have quite large ranges: e.g. coin value
< gmaxwell>
like computing 2/3 of a coin value.
< petertodd>
gmaxwell: though, consensus critical economic use-cases just need summation, not multiplication/division (usually)
< sipa>
petertodd: but but demurrage
< gmaxwell>
for example, you may want to compute input * 2/3, and input - input * 2/3.
< gmaxwell>
for value splits in a contract.
< gmaxwell>
and input is 64 bits.
< petertodd>
sipa: ok, austrian economics... :P
< petertodd>
gmaxwell: well, once you talk about contracts more generally, you get interest calculations, which gets very complex...
< gmaxwell>
polynomial approximations of interest calculations generally work fine over limited ranges.
< petertodd>
gmaxwell: but I'm assuming sane contracts would generally only need a few calculations along those lines, so slower approaches should be ok
< petertodd>
yeah, polynomial approx being one approach
< petertodd>
in any case, it's looking like having reasonable type checking is a way harder and more complex problem :)
< gmaxwell>
one nice way for exponential functions is to use CLZ to get the integer part of a base 2 log, and then use a polynomial correction.
< petertodd>
gmaxwell: CLZ?
< sipa>
count leading zeroes
< petertodd>
sipa: ah!
< sipa>
__builtin_clz
< gmaxwell>
er actually it's the log you want the clz for; for exp you just need to use the leading bits of the number.
< midnightmagic>
i wonder how heavy that special 9-cell thinkpad thing is
< petertodd>
midnightmagic: doesn't help that mine is a few years old - batteries wearing out
< sipa>
petertodd: protip, when buying a new laptop, buy an extra battery that fits... once your battery wears out they'll be harder to find
< petertodd>
sipa: good advice - though I've never actually bought a laptop new
< * midnightmagic>
has no user-replaceable battery in his mbp. :(
< midnightmagic>
petertodd: why not?
< petertodd>
midnightmagic: because I act like I'm a poor student :P
< petertodd>
midnightmagic: I have a t520 right now, which I think is about four years old
< midnightmagic>
Humility and reasonable resource management are virtues.
< petertodd>
midnightmagic: well, I was going through my corporate expenses the other day... and a new laptop would be a drop in the bucket compared to what gets spent on me travelling
< midnightmagic>
i hate travelling :( the world is made for people much, much smaller than I am.
< petertodd>
midnightmagic: I actually find I get more work done; I'm no tourist so when I'm in the middle of a foreign country I tend to find somewhere to sit down with my laptop :)
< midnightmagic>
i do too, all the stuff that is on the to-do list in my home can't be addressed so I can let it go
< petertodd>
midnightmagic: yeah, and when I'm home, friends want to like, hang out with me and take up all my time :P
< midnightmagic>
psh. friends. so annoying.
< petertodd>
sure
< petertodd>
er, wrong window - stupid lag
< sipa>
haha
< gmaxwell>
sipa: lithium-ion batteries of various types have a shelf life, sadly.
< gmaxwell>
The unused one will also fade.
< gmaxwell>
But fortunately if you buy something like a thinkpad, people make batteries for them forever.
< gmaxwell>
You can buy batteries for ten year old thinkpads no problem.
< kanzure>
consensus-related non-malleability vs wallet/p2p-level, is my guess.
< petertodd>
kanzure: it's just a flaw in the way the mempool/p2p refers to segwit txs; has little if anything to do with consensus
< petertodd>
kanzure: tl;dr: the p2p layer asks for txs by txid, which doesn't commit to the witness, and marks invalid txs by txid, without being able to consistently know if a tx was invalid due to a bad witness
< petertodd>
equally, s/invalid/unacceptable due to fee, size, whatever/
< sipa>
i actually don't think the flaw is referring by txid instead of wtxid
< petertodd>
sipa: how so?
< sipa>
but we should have made resource limits part of the base transaction, not the witness
< kanzure>
yes i was wondering about things like "how to actually show a node that a certain tx is valid later, if at first it receives a bad witness" :\
< petertodd>
sipa: what do you mean by that?
< sipa>
(fees, sigop count, byte sizes) go in scriptSig... or even better, in the inv
< sipa>
so a node can decide to not process before you know... processing
< petertodd>
sipa: duplicating that stuff has historically led to endless problems, basically because you have to check it twice
< sipa>
how do you mean?
< petertodd>
sipa: in a decent system, processing even invalid txs is something that happens very quickly, so there's no DoS attack
< petertodd>
sipa: notice how satoshi screwed up sigops from the beginning...
< sipa>
unfortunately, fees and sigop counting require fetching the utxos
< petertodd>
sipa: in any of these schemes, you still have to count up sigops as they're actually executed, to check that the sum matches (or is less than) the claimed sum
< sipa>
of course
< petertodd>
sipa: so? fetching utxos can't be expensive, or we've already lost
< sipa>
but a mismatch can then actually result in a ban, because it cannot happen accidentally
< petertodd>
sipa: if we used wtxids, you could still ban based on that
< sipa>
but our rate limiting is based on feerate, which depends on fee, which we cannot enforce until we've done the effort
< sipa>
if there is no rate limit, even a cheap validation per transaction will be too much
< petertodd>
sipa: huh? someone sends you a DoS tx, just ban them - there's no reason *legit* transactions should take significant cost to accept
< sipa>
petertodd: so there is a trivial solution... fully validate every transaction you asked for
< sipa>
so you don't prematurely discard a transaction before finding out it was an attempted attack
< sipa>
if it is too expensive, or invalid, or malleated... you can ban whoever sent it
< petertodd>
sipa: I think you're missing my point: the threshold at which we consider it an "attempted attack" should be low enough that there's no DoS issue; txs fundamentally should never be expensive to validate, and cases where they are should be non-standard, and eventually, removed entirely from the protocol
< sipa>
yes
< sipa>
i agree
< sipa>
but the issue here is that we fail to detect whether a too-expensive transaction is due to its creator or due to who relayed it
< petertodd>
sipa: right, but wtxid solves that issue
< petertodd>
if relayer malleates, we'll still ask for a different version of the same txid if another peer gives us it
< sipa>
yes, it would... but it adds complications
< petertodd>
how so?
< sipa>
you need indexes by both txid and wtxid..
< sipa>
and you always risk requesting things twice
< petertodd>
sipa: in what, the mempool/p2p?
< petertodd>
but they are different things!
< sipa>
in my view they are the same thing
< sipa>
one with an optional part omitted
< sipa>
except we only find out too late that it was not optional
< petertodd>
well, I don't agree at all - the optional part has a non-optional effect on the tx
< sipa>
for those who care (which a full node does)
< sipa>
that's a semantic discussion, though
< sipa>
i can see your point
< petertodd>
I certainly care: my tx fees depend on the witness
< petertodd>
I may also have a protocol where I'm publishing something in the witness, e.g. hashlock
< gmaxwell>
Cleanstack means that ordinary transactions cannot have their fees increased without invalidating them. (If not for that I would have already recommended we have some preprocessing to strip transactions as they're relayed)
< sipa>
i think the easiest solution is to validate scripts even for things we know we won't accept
< gmaxwell>
You should probably assume that relaying nodes will perform witness stripping in the future.
< sipa>
we have spent almost all the work anyway (fetched inputs, and wasted bandwidth)
< petertodd>
gmaxwell: with current tx patterns, yes; it's non-trivial to make txs with more complex scripts that have non-malleable witnesses
< gmaxwell>
(by witness stripping I mean compacting the witness to the minimal data needed to be valid, as best it can determine)
< petertodd>
sipa: as in, validate scripts to determine if the witness is wrong?
< gmaxwell>
Yes.
< gmaxwell>
This was something I had been advocating for a while because there are some other potential DOS attacks that exist because we don't.
< gmaxwell>
(a while = before segwit existed)
< petertodd>
well, again, that assumes you know how to clean witnesses - there are tx patterns where that's not trivial (and indeed, the user may intentionally have more than one form of witness for the same txid)
< gmaxwell>
though without an enforced feefilter its a bit less than ideal.
< gmaxwell>
petertodd: sure any cleaning would always be a best effort attempt.
< petertodd>
I'm still of the opinion that asking for a wtxid is a way simpler overall conceptual design (obvs implementation level may suck due to historical baggage)
< sipa>
this is orthogonal to fetching by wtxid, of course
< petertodd>
it's not orthogonal at all: if I manage to clean a script significantly, I want my peers to get it, and then say "hey, this is way smaller, lets replace it in our mempool..."
< gmaxwell>
yea, I agree it's orthogonal; fetching by wtxid has an amplification attack vector that is kind of sad.
< petertodd>
gmaxwell: what's the vector?
< sipa>
gmaxwell: presenting multiple versions of the same transaction with different witness?
< gmaxwell>
The amplification attack vector is that I grab transactions and relay witness malleations to every node in the network, a different version to each node, so when I happen to get a txn early every node ends up with a different wtxid and you get N^2 bandwidth forwarding it around.
< petertodd>
gmaxwell: but you can do that anyway with your own txs
< sipa>
you could of course create invs with both txids and wtxids
< petertodd>
sipa: yeah, that'd be fine
< petertodd>
sipa: and obviously, our peer can tell us it's segwit and wants us to do that
< sipa>
except that also breaks your try to replace with smaller witness use case
< petertodd>
sipa: why? once you go down that rabbit hole, you can also advertise length
< sipa>
petertodd: yes, that was my suggestion in the beginning of the discussion :)
< sipa>
advertize sigops/size/fees in the inv
< petertodd>
sipa: no, you suggested having the tx _commit_ to that info, which is a very different thing; non-consensus critical code advertising length isn't a big deal
< sipa>
petertodd: reread my sentence
< petertodd>
sipa: ah, ok, I have no issues with that
< petertodd>
sipa: (I've been thinking about this exact kind of issue for my dex specification actually)
< sipa>
"... or even better, in the inv"
< petertodd>
sipa: yeah, for mempool I think invs advertising this stuff makes a lot of sense
< petertodd>
for starters, if we screw that up, it's relatively easy to fix :)
< sipa>
though i think it's not necessarily an urgent issue
< sipa>
the worst case is that a bad peer can poison your reject cache, preventing you from getting a legitimate transaction before it confirms
< petertodd>
so, is a reasonable game plan to release segwit with the current p2p design, and then add wtxids to invs later? (potentially with sigops/size in the inv)
< petertodd>
sipa: it's a good thing no-one relies on zeroconf anymore :P
< sipa>
i'm sure there are other ways that you can accomplish that
< sipa>
(like inving and not responding to getdata)
< gmaxwell>
wrt different kinds of relaying later, mostly I thought those improvements would go into mempool sync.
< gmaxwell>
and rather than wtxid relay, wtxid,feerate,txid tuples (probably truncated/quantized) may make more sense.
< sipa>
yeah, for mempool sync we can redesign things cleanly
< petertodd>
gmaxwell: note too how schemes like txo commitments allow - and expect - nodes to do significant modifications to txs
< gmaxwell>
(or even wtxid, feerate, vin id with lowest hash)
< gmaxwell>
in any case, optimal sync behavior in the presence of double spends (of any kind) isn't a nut I've cracked yet.
< sipa>
petertodd: yes, the priority should be to make sure no internal inconsistency or banning of unaware nodes occur
< gmaxwell>
I think I have constructions for schemes which are close to optimal absent doublespends.
< gmaxwell>
we already improved relay efficiency a LOT in 0.13, fwiw.
< sipa>
petertodd: rejectioncache behaviour can either degenerate into attackers preventing you from receiving transactions on the one hand
< sipa>
or to the old pre-rejectioncache behaviour of requesting failed txn from every peer (which is made much less bad due to feefilter already)
< gmaxwell>
there are already several trivially exploited ways where an attacker can massively delay you getting a transaction.
< petertodd>
sipa: well, again, an attacker can do that DoS attack without segwit by just sending multiple slightly different versions of the same tx
< gmaxwell>
(e.g. just inv the damn thing from many sockets and don't respond.
< gmaxwell>
)
< sipa>
yeah