< davec>
gmaxwell: I have noticed that happening to me a while ago as well. Unfortunately, with there being impls like BitcoinJ (and probably others) that rely on that behavior, we can't really easily stop it
< davec>
I haven't done any testing, but at one point I was considering doing some type of logic to only allow X unsolicited transactions in Y timeframe
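(For concreteness, a minimal sketch of the rate-limiting idea described above: count unsolicited transactions per peer inside a rolling window and flag the peer once it exceeds the limit. The names and constants are illustrative, not Bitcoin Core's actual code.)

    // Hypothetical per-peer limiter: allow at most MAX_UNSOLICITED_TX
    // unsolicited transactions per WINDOW_SECONDS, otherwise disconnect.
    #include <cstdint>

    static const int MAX_UNSOLICITED_TX = 10;   // "X" (illustrative)
    static const int64_t WINDOW_SECONDS = 60;   // "Y" (illustrative)

    struct UnsolicitedTxLimiter {
        int64_t windowStart = 0;
        int count = 0;

        // Returns true if the peer should be disconnected.
        bool RecordUnsolicitedTx(int64_t now) {
            if (now - windowStart >= WINDOW_SECONDS) {
                windowStart = now;  // start a new window
                count = 0;
            }
            return ++count > MAX_UNSOLICITED_TX;
        }
    };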
< gmaxwell>
davec: Part of the motivation for the DoS scoring in bitcoin core is that we could ban peers that do a lot of something while leaving others alone.
< gmaxwell>
though I'd prefer to get bitcoinj fixed first; and have that just clean up whatever remains.
< davec>
That is certainly preferred if the new maintainers are willing to change it
< CodeShark>
I think they would given sufficient pressure ;)
< gmaxwell>
We haven't asked the new maintainer yet, I think.
< gmaxwell>
I was thinking of seeing if I could sweet talk bluematt into just sending a patch. :)
< BlueMatt>
i know i know
< BlueMatt>
i told you i would....
< BlueMatt>
oh, this is unrelated to other things
< BlueMatt>
yea, i mean you should be able to ping andreas
< gmaxwell>
yea, actually I'm still waiting on the other thing. :)
< CodeShark>
too late - now you've committed yourself to this as well!
< CodeShark>
slick move, gmaxwell :)
< BlueMatt>
gmaxwell: i think i saw someone else starting to implement that
< gmaxwell>
BlueMatt: This is just the "bitcoinj sends unsolicited full transactions, with no inv" ... which by itself isn't that awful, but as a result of tolerating it there are clowns now relaying every transaction they get this way.
< BlueMatt>
but i havent looked closely
< BlueMatt>
yea, ok
< BlueMatt>
i mean we should really just disconnect people who relay something we've already seen
< BlueMatt>
dont ban them, but disconnect them
< BlueMatt>
if someone implements this in a broken way, they'll notice a ton of disconnects and stop
< gmaxwell>
Hm, something we already have and didn't request from them? hm. that should actually be okay.
< BlueMatt>
if its a wallet like bitcoinj doing it, the disconnects will almost never happen
< gmaxwell>
(won't actually rescue blocksonly nodes from the noise, but thats fine)
< BlueMatt>
blocksonly nodes can also disconnect
< gmaxwell>
I mean they'll never already have.
< BlueMatt>
yes, but they can also disconnect
< davec>
Well depending on the relay speed, that could cause BitcoinJ to get disconnected when the node they are relaying to saw it from elsewhere first (though given the time to process and re-relay, that should be exceedingly rare)
< davec>
assuming the new tx is relayed to all connected nodes immediately that is
< CodeShark>
race conditions are bad ;)
< gmaxwell>
well getting disconnected is something that happens which you must be able to handle.
< gmaxwell>
still best to get bitcoinj fixed.
< davec>
I'm definitely in favor of getting BitcoinJ fixed and then tightening the protocol
< davec>
I think it should be allowed for whitelisted nodes however
< davec>
if you have a relay node, proxy, etc setup locally, requiring the extra round trip is probably not desirable
< gmaxwell>
Well, it's probably irrelevant -- if your transactions are latency critical you're doing something wrong. But sure, I don't see a reason whitelisted hosts couldn't be exempted from that.
< gmaxwell>
though since a peer doesn't know it's whitelisted they shouldn't rely on that behavior.
< gmaxwell>
main argument for it I see is enabling very simplistic scripts.
< CodeShark>
yes, exempt whitelisted nodes...but in general, it would be best to send invs first if sending to multiple nodes
< davec>
Right, I was thinking more for specialized nodes (like relay nodes or proxies, which know they're only talking to a local node)
< davec>
of course, being local on what's probably a 1 or 10Gb link I doubt it would really make any difference anyways
< CodeShark>
latency isn't really the issue with that, as gmaxwell said - but it simplifies the propagation logic
< CodeShark>
disconnecting whitelisted nodes is probably a bad idea anyhow, though, since such incidents are more likely to be due to bugs or errors rather than DoS
< davec>
agreed
< CodeShark>
and you're not spamming anyone else
< CodeShark>
although perhaps we could have a "strict mode" for testing clients
< CodeShark>
use it for testing, not for production
< gmaxwell>
I'd like that but am not sure that it's worth the code overhead unless it was just a milepost on the way to enforcing those things for everyone.
< gmaxwell>
At some point just starting on a new, parallel, P2P protocol would be a better investment of time-- and of course that could be maximally strict from the start.
< CodeShark>
yeah, you're probably right
< gmaxwell>
doing so would allow replacing the inv process with something that actually scales, and so on.
< davec>
while dreaming it would be nice to design it so things like I2P can work too
< gmaxwell>
not just I2P, tor's next HS protocol has larger addresses.
< gmaxwell>
Though those would be not so hard to shoehorn into ye'ole p2p compatibly.
< davec>
yeah I think there was a patch running around for it at some point
< CodeShark>
is it feasible (or even desirable) to move entirely to a tx propagation protocol where txs can be directly routed to miners and recipients rather than flooded?
< gmaxwell>
I don't think it's particularly desirable.
< CodeShark>
reasons?
< gmaxwell>
makes mining more vulnerable to policy risk, DoS attacks, etc. Also forwarding ahead reduces propagation times tremendously (with no caching at all it takes seconds to verify blocks now)
< gmaxwell>
What I'd like to try doing is to do periodic set reconciliation of the top N mbytes of the mempool, and have that be the only relay mechanism in a p2p protocol.
< gmaxwell>
This can get efficiency beyond the peers*transactions bound, and has nice anti-DOS properties.
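(A deliberately naive sketch of the reconciliation flow being described: peers periodically compare the txids at the top of their mempools and request only the difference. A real design would exchange compact set sketches rather than full id lists to get below the peers*transactions bound; everything below is illustrative.)

    // Naive periodic reconciliation: exchange sorted txid sets, compute the
    // symmetric difference, then fetch only what is missing. A real protocol
    // would send a compact sketch of the set, not the whole set.
    #include <algorithm>
    #include <iterator>
    #include <set>
    #include <string>
    #include <vector>

    using TxId = std::string;  // placeholder for a 256-bit hash

    struct ReconcileDiff {
        std::vector<TxId> weNeed;    // in their set, not ours
        std::vector<TxId> theyNeed;  // in our set, not theirs
    };

    ReconcileDiff Reconcile(const std::set<TxId>& ours, const std::set<TxId>& theirs) {
        ReconcileDiff diff;
        std::set_difference(theirs.begin(), theirs.end(), ours.begin(), ours.end(),
                            std::back_inserter(diff.weNeed));
        std::set_difference(ours.begin(), ours.end(), theirs.begin(), theirs.end(),
                            std::back_inserter(diff.theyNeed));
        return diff;
    }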
< CodeShark>
assuming blocks are flooded, all transactions that confirm must get flooded sooner or later...but things like RBF mean that many transactions might get flooded that never confirm
< CodeShark>
whereas routing directly to miners (assuming it is practical) might allow direct fee bidding
< gmaxwell>
with that only things likely to get mined soon relay, naturally. ... and it means that replacements that happen between the batching intervals get nicely suppressed. You can scale your bandwidth by scaling how deep and often you reconcile.
< tulip>
it's effectively required that nodes in the network see transactions before they see them in blocks. 'blocksonly' with 'dbcache=100' frequently sees 20 second or higher times for block verification, and I don't think there's any way around that.
< gmaxwell>
CodeShark: in any case, I think you can take what you're thinking and relax a bit to public relay hubs and then we'd agree.
< gmaxwell>
but as tulip notes, you don't want any mempoolless nodes in your block propagation critical path.
< gmaxwell>
I want to get initial announcement off the p2p network in any case; because of privacy problems and the network attacks they're encouraging.
< gmaxwell>
though I consider that largely orthogonal to general relay changes.
< gmaxwell>
tulip: I'm surprised it's that bad. :( might this be on a VPS with poor IO?
< tulip>
gmaxwell: it's a lot better with a very large dbcache set (more like a normal node, though obviously still slower). that's on a 4+4HT core i7, 7200RPM 2.5" HDD.
< CodeShark>
and disabling signature validation doesn't significantly reduce the time, right?
< gmaxwell>
well default dbcache is 100. but okay, yea so signature checks won't have that big an effect (esp not since libsecp256k1; which I assume you're using if you're using blocksonly)
< tulip>
CodeShark: is there a patch around for doing that?
< CodeShark>
it's a one line change in the code, I believe
< CodeShark>
trying to remember which line :)
< tulip>
assuming I can just short circuit the checkpoints line.
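(For reference, the kind of one-line edit being discussed: the block connection path of that era's main.cpp has an fScriptChecks flag, normally derived from the checkpoint logic, that gates script/signature verification. The exact expression differs between versions, so this is a sketch rather than a drop-in patch, and it is benchmarking-only.)

    // In ConnectBlock() there is a flag along these lines (exact form varies by version):
    //     bool fScriptChecks = ...;  // normally derived from the checkpoint logic
    // Forcing it off skips script and signature verification entirely:
    bool fScriptChecks = false;  // benchmarking only; never run this on a node you rely on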
< davec>
I'm with gmaxwell here on the announcement bit. One of the things I've measured as a part of the multi-peer work is how much data transmission is avoided by keeping track of inflight data and avoiding re-requesting it and such. It's roughly been on the order of 30-100MB/hr (depending on number of txns, connected nodes, etc)
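(A rough sketch of the in-flight tracking being measured here: before sending a getdata in response to an inv, check whether the item is already known or already requested from some other peer. The container and function names are illustrative rather than any particular implementation's.)

    // Track announced items we have already requested so the same tx/block
    // is not fetched from several peers at once.
    #include <cstdint>
    #include <map>
    #include <string>

    using InvHash = std::string;   // placeholder for a 256-bit hash
    using NodeId  = int64_t;

    std::map<InvHash, NodeId> g_inflightRequests;  // hash -> peer we asked

    // Returns true if we should send a getdata for this inv to `peer`.
    bool ShouldRequest(const InvHash& hash, NodeId peer, bool alreadyHave) {
        if (alreadyHave) return false;                     // already in mempool/chain
        if (g_inflightRequests.count(hash)) return false;  // someone else is fetching it
        g_inflightRequests[hash] = peer;                   // remember who we asked
        return true;
    }

    // Called when the data arrives, the request times out, or the peer disconnects.
    void RequestCompleted(const InvHash& hash) {
        g_inflightRequests.erase(hash);
    }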
< gmaxwell>
wow, all the time in verify. interesting!
< gmaxwell>
hm. I thought we had more resolution in verify. I can't distinguish signature checks from txin lookups. darn.
< tulip>
guess more granularity wouldn't hurt.
< davec>
that said, the idea of set reconciliation is pretty attractive and would save a lot more
< midnightmagic>
ooookay, primary node validates itself just fine.
< GitHub188>
[bitcoin] CodeShark opened pull request #7074: Added a command line option -scriptchecks (master...disable_script_checks) https://github.com/bitcoin/bitcoin/pull/7074
< GitHub106>
[bitcoin] CodeShark closed pull request #7074: Added a command line option -checkscripts (master...disable_script_checks) https://github.com/bitcoin/bitcoin/pull/7074
< gmaxwell>
CodeShark: hah sorry!
< CodeShark>
ok, what about if disabling script checks also disables relay?
< CodeShark>
:)
< CodeShark>
or disables something else that makes it unusable as an actual node but can still be used for benchmarking
< tulip>
anecdotally there's some configurations out there people have copied onto their nodes with no real idea of what they do beyond being "suggested", one of them has gen=1 in it, another has par=1 (which is non-obviously named). things like that seem to get entrenched pretty easily.
< gmaxwell>
CodeShark: I think people will just bypass those other things. It's a bad route in general.
< CodeShark>
but a strictly benchmarking mode would mean people would need to recompile to bypass it
< CodeShark>
and that's already the case anyhow :)
< gmaxwell>
ya, but thats effectively what we have: you just add a comment in one line.
< CodeShark>
yes, but a runtime switch that allows benchmarking while making the node useless as a node seems...useful :)
< gmaxwell>
Though we don't promise that it won't dangerously break things, and in fact it does-- if someone wanted to do that to their mining op to 'make it faster' they'd end up mining soft-fork ops. We already had an issue with that this year, with 1% of the hashrate running on an rpi configured thusly. :(
< CodeShark>
disable GBT, disable the wallet, disable all relay
< gmaxwell>
CodeShark: then we'll get another "forced fee disabled fork" promoted by some idiot.
< gmaxwell>
I really don't think this is that useful. Also if all that stuff is off then it's more that you can't benchmark!
< tulip>
I'd like to break out the script portions of debug=bench.
< gmaxwell>
if we just want to know the time contribution; then thats addressable via making the timing more granular.
< gmaxwell>
which we really should do.
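(A sketch of what more granular verify timing could look like, assuming the GetTimeMicros()/LogPrint("bench", ...) pattern the existing debug=bench output already uses; the split points and labels are illustrative.)

    // Illustrative: time input lookups and script checks separately instead of
    // lumping them into one "Verify" number.
    int64_t nTimeStart = GetTimeMicros();

    // ... look up and check inputs (prevout fetches, amounts, etc.) ...

    int64_t nTimeInputs = GetTimeMicros();
    LogPrint("bench", "    - Input lookups: %.2fms\n", (nTimeInputs - nTimeStart) * 0.001);

    // ... run the script/signature checks (or queue them for the checker threads) ...

    int64_t nTimeScripts = GetTimeMicros();
    LogPrint("bench", "    - Script checks: %.2fms\n", (nTimeScripts - nTimeInputs) * 0.001);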
< tulip>
is there any reason debug=bench also dumps the peer timestamp offsets? I thought that would be in debug=net.
< gmaxwell>
tulip: so go try?
< gmaxwell>
tulip: no clue, sounds like an error, go look for the commit that did it?
< tulip>
am.
< tulip>
looks like the timezone offset thing gets printed for everybody (not specific to bench), my mistake.
< gmaxwell>
I /thought/ it did. But trusted the guy with the code open. :)
< tulip>
I've not run in non-bench mode for so long I've forgotten what log prints are normal.
< gmaxwell>
I ran without debugging options recently and I was super pleasantly surprised how much we've cut down the chattiness.
< tulip>
yes, it's a lot easier to glance at a log and get a good idea of what's happening.
< phantomcircuit>
gmaxwell, every time i've gone to fix the trickle logic i end up going down the rabbit hole
< gmaxwell>
phantomcircuit: when that happens it can be useful to look to see what the smallest possible change that makes it clearly better is.. rather than trying to redesign it.
< gmaxwell>
like ... non-trickle to a fixed number of peers rather than a fixed proportion.
< gmaxwell>
(and perhaps favor outbound peers)
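(A sketch of that "smallest possible change" direction: pick a fixed number of peers, preferring outbound connections, for immediate relay instead of a fixed proportion. Structure and names are illustrative.)

    // Choose up to nImmediate peers for immediate (non-trickled) inv relay,
    // preferring outbound connections; everyone else gets the trickled path.
    #include <algorithm>
    #include <random>
    #include <vector>

    struct Peer {
        int id;
        bool fInbound;
    };

    std::vector<Peer*> SelectImmediateRelayPeers(std::vector<Peer>& peers,
                                                 size_t nImmediate,
                                                 std::mt19937& rng) {
        std::vector<Peer*> candidates;
        for (Peer& p : peers) candidates.push_back(&p);
        std::shuffle(candidates.begin(), candidates.end(), rng);
        // Stable-partition so outbound peers come first, then take the first N.
        std::stable_partition(candidates.begin(), candidates.end(),
                              [](const Peer* p) { return !p->fInbound; });
        if (candidates.size() > nImmediate) candidates.resize(nImmediate);
        return candidates;
    }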
< phantomcircuit>
gmaxwell, there's trickle logic for the local peer's address as well
< phantomcircuit>
gmaxwell, it's really not about the minimal change, im not sure that the original logic really worked either tbh
< phantomcircuit>
oh and there's a race in the inv stuff
< phantomcircuit>
you cant just decide to disconnect from peers who give you a transaction w/o inv that you've seen before
< gmaxwell>
phantomcircuit: whats the race? if you get a transaction and you didn't getdata. They're naughty.
< phantomcircuit>
gmaxwell, yes but if we dont want to disconnect broken spv clients doing the same thing
< phantomcircuit>
then there's a race
< gmaxwell>
sure we can't do it yet.
< gmaxwell>
I suggest step (1) we ask bitcoinj to change, step (2) we do what matt suggests which is disconnect and don't ban on unsolicited tx when we already have the tx (usually not the case for SPV)
< gmaxwell>
(2) will usually not punt spv, but if it does they can just reconnect. And it will reliably hang up on the really dumb nodes.
< phantomcircuit>
gmaxwell, ok that's reasonable
< gmaxwell>
and (2) is enough to stop people from making more software that behaves like this.
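(A sketch of the step (2) rule as described: on receiving a tx we never asked for, disconnect without banning only if we already have it, so SPV wallets pushing genuinely new transactions are left alone. The handler shape is illustrative.)

    // Illustrative handler for an incoming "tx" message. `requested` means we
    // sent a getdata for this txid; `alreadyHave` means it is already in our
    // mempool or was recently seen.
    enum class TxAction { Accept, DisconnectNoBan };

    TxAction HandleIncomingTx(bool requested, bool alreadyHave) {
        if (requested) return TxAction::Accept;     // normal inv/getdata flow
        if (!alreadyHave) return TxAction::Accept;  // unsolicited but new (e.g. SPV wallet push)
        // Unsolicited and redundant: hang up, but do not add to the ban list,
        // so a rare race just costs the peer a reconnect.
        return TxAction::DisconnectNoBan;
    }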
< phantomcircuit>
gmaxwell, i actually suggest as the first step to fixing the trickle logic... to delete the trickle logic
< gmaxwell>
oh come on now, it's not that broken.
< phantomcircuit>
then mark specific peers as "instant relay" peers
< phantomcircuit>
gmaxwell, there's two goals with the trickle logic
< gmaxwell>
phantomcircuit: well if you're going that route, rather-- get rid of the idea of instant relay. And just trigger relay more often on some peers.
< phantomcircuit>
first is to prevent fingerprinting attacks that allow mapping the network
< gmaxwell>
and to try to stop duplicate data crossing in flight.
< phantomcircuit>
second is to prevent fingerprinting attacks on the wallet
< gmaxwell>
then three goals.
< gmaxwell>
oh you mean addr trickle vs tx.
< phantomcircuit>
sort of but not exactly
< phantomcircuit>
those two should definitely be strongly separated in the code as they have similar but different goals
< gmaxwell>
the mapping should be fixed by peer rotation resulting in little to no static topology.
< phantomcircuit>
today if you connect to all the nodes on the network you can get a pretty good idea about who is connected to who based on the timing of transaction invs
< phantomcircuit>
none of the peer rotation suggestions have been completely dynamic
< gmaxwell>
well, little, because we want the robustness to short lived eclipse attacks that comes from long lived connections.
< gmaxwell>
right
< phantomcircuit>
they all include some fixed set of nodes to make partitioning harder
< phantomcircuit>
those links would still be vulnerable to fingerprinting attacks
< tulip>
I don't think you're going to be able to get rid of the timing aspect unless you remove listening nodes altogether.
< gmaxwell>
well if you effectively firewall the results of communication between your long lived links and your short lived ones. I dunno if it can be done without overhead.
< gmaxwell>
e.g. if you effectively run two p2p stacks, one for the long lived connections one for everyone else.. and don't leak data between them except in hugely delayed batches. yadda yadda.
< phantomcircuit>
gmaxwell, the last time i did this i ended up basically writing a very shitty mixmaster thing
< phantomcircuit>
it would hold onto all the transactions it received for 1 minute and then send them all out at once to everybody
< phantomcircuit>
but well, all invs cross in flight now
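(A sketch of that batch-and-flush behaviour: queue incoming transactions and, once per interval, announce the whole batch to every peer at once so observers cannot infer arrival order or timing. The interval and names are illustrative.)

    // Hold incoming transactions for a fixed interval, then announce the whole
    // batch to all peers at once.
    #include <cstdint>
    #include <string>
    #include <vector>

    static const int64_t BATCH_INTERVAL_SECONDS = 60;  // illustrative

    struct TxBatcher {
        std::vector<std::string> pending;  // txids collected this interval
        int64_t lastFlush = 0;

        void AddTx(const std::string& txid) { pending.push_back(txid); }

        // Call periodically; returns the batch to announce to every peer
        // (empty if the interval has not elapsed yet).
        std::vector<std::string> MaybeFlush(int64_t now) {
            if (now - lastFlush < BATCH_INTERVAL_SECONDS) return {};
            lastFlush = now;
            std::vector<std::string> batch;
            batch.swap(pending);
            return batch;
        }
    };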
< GitHub191>
[bitcoin] tulip0 opened pull request #7075: Move time data log print to 'net' category to reduce noise (master...no-time-offset-logging) https://github.com/bitcoin/bitcoin/pull/7075
< gmaxwell>
phantomcircuit: they don't all cross in flight.. but a lot do.
< phantomcircuit>
gmaxwell, assuming most clocks are on ntp and that they're +-1 second
< phantomcircuit>
then they do mostly cross with that kind of simple mixing
< tulip>
phantomcircuit: most bitcoin nodes don't seem to have very accurate time.
< phantomcircuit>
tulip, well... i'd rather not write something that assumes that as it's starting point :P