< gmaxwell>
I realize that some might want a more extensive wallet redesign, but I think this would be good incremental progress and would enable a _lot_ of flexibility for minimal code impact.
< phantomcircuit>
gmaxwell, thoughts on a "-miner" mode that does things like reduce (or even remove) the sleep calls in some of the networking threads?
< phantomcircuit>
(ie please do everything possible to reduce latency even if you spin my cpu at 100% constantly)
< gmaxwell>
phantomcircuit: what sleep calls are there at all in networking? (other than in e.g. connect loops, which are needed to avoid dos attacking peers)
< Luke-Jr>
we probably should have a miner mode regardless. relay nodes have no real need for the mempool.
< Luke-Jr>
(they just need to relay and forget, plus some txid record so they can't be DoS'd)
< phantomcircuit>
Luke-Jr, please define relay node more specifically
< Luke-Jr>
phantomcircuit: a node that wishes to participate (non-leech) in the p2p network, but has no miners
< phantomcircuit>
Luke-Jr, they require a mempool if they are going to forward transactions otherwise they become a DoS amplifier
< phantomcircuit>
gmaxwell, there's a sleep on select() error (im not even sure why that's there actually)
< Luke-Jr>
phantomcircuit: why can't they just remember txids?
< phantomcircuit>
and there's messageHandlerCondition which can be replaced by a spin lock
< phantomcircuit>
there's also a bunch of other things like changing the defaults for various things to be much much larger
< gmaxwell>
Luke-Jr: forgetting things is not so useful for avoiding the 2x++ redundant transmission when the same data shows up again in a block.
< phantomcircuit>
gmaxwell, i'd also add a background thread that just calls CreateNewBlock in a loop trying to improve on the current cached block template
< Luke-Jr>
gmaxwell: hm, that's a point
< Luke-Jr>
phantomcircuit: merely calling CNB won't use/update the cache
< phantomcircuit>
Luke-Jr, yes i know
< * phantomcircuit>
hand waves details
< phantomcircuit>
Luke-Jr, it'll update every 5 seconds if the memory pool was modified, correct?
< phantomcircuit>
(or the tip changes obviously)
< Luke-Jr>
phantomcircuit: the cache is in the RPC code
< phantomcircuit>
yes i know
< phantomcircuit>
that's the logic though right? (just verifying i read it correctly)
< gmaxwell>
phantomcircuit: I was recommending before that we change the gbt behavior to _never_ run createnewblock at the time the rpc is called.
< phantomcircuit>
gmaxwell, yes that's what i would be implementing, a background thread only
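[editor's note] A minimal sketch of the "background thread only" idea discussed above: the RPC path just reads a cached template and never builds one itself. This is not Bitcoin Core code. `TemplateCache` and `fake_create_new_block` are hypothetical names, and the builder is a stand-in for CreateNewBlock.

```python
import threading
import time


class TemplateCache:
    """Serves the last template built in the background (hypothetical helper)."""

    def __init__(self, build_template):
        self._build = build_template      # stand-in for CreateNewBlock
        self._lock = threading.Lock()     # protects only the cached pointer
        self._template = None
        self._stop = threading.Event()

    def refresh(self):
        # Build outside the lock so readers are never blocked by a slow build.
        candidate = self._build()
        with self._lock:
            self._template = candidate

    def get(self):
        # The RPC path only reads the cache; it never triggers a build.
        with self._lock:
            return self._template

    def run(self, interval=0.01):
        # Background loop: keep rebuilding until asked to stop.
        while not self._stop.is_set():
            self.refresh()
            self._stop.wait(interval)

    def stop(self):
        self._stop.set()


counter = {"n": 0}


def fake_create_new_block():
    # Assumption: a real builder would select transactions from the mempool.
    counter["n"] += 1
    return {"height": 100, "build": counter["n"]}


cache = TemplateCache(fake_create_new_block)
cache.refresh()                      # prime once so get() has something to return
worker = threading.Thread(target=cache.run, daemon=True)
worker.start()
time.sleep(0.05)
cache.stop()
worker.join()
latest = cache.get()
print(latest)
```

The design choice here is that a slow build only delays how fresh the cached template is, not how long a caller waits for an answer.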
< phantomcircuit>
there's some interesting optimizations that can be done if we're willing to spend a bunch of cpu time on it
< phantomcircuit>
the obvious one is optimizing for the block that pays the highest fee
< phantomcircuit>
(or whatever)
< wumpus>
gmaxwell: concept ACK
< wumpus>
gmaxwell: I'd personally prefer not adding more 'invoke external programs', I know we already have walletnotify etc but it makes sandboxing the thing harder because we can't disable fork/exec
< wumpus>
gmaxwell: I know there is walletnotify, blocknotify etc but at least we have an alternative notification mechanism for that now, so they may be deprecated in some future version
< wumpus>
gmaxwell: anyhow so I'm not strongly against that, just wary of it, also with regard to argument shell injection. Although at least that can be easily avoided by executing a program directly instead of using ::system
< wumpus>
an alternative would be to use a little protocol instead for communicating with the signer
< gmaxwell>
wumpus: hm. we can limit to what we can exec (via seccomp) .. interesting I was thinking the separate process there was a sandboxing benefit, e.g. because it can have totally distinct selinux / seccomp settings.
< wumpus>
sure, just be really careful
< gmaxwell>
But the biggest reason for the separate process is to tear down development barriers. E.g. so someone can build a module without the scariness in bitcoin core. :)
< wumpus>
walletnotify/blocknotify use ::system that has always bugged me
< wumpus>
sure that's why I dont give "integrate it into bitcoin core" as an alternative, but say communicate through a pipe or socket
< gmaxwell>
wumpus: still need a way to start the thing that speaks the protocol.
< wumpus>
could be similar to e.g. torcontrol.cpp which speaks a simple protocol with Tor
< gmaxwell>
But okay, reasonable.
< wumpus>
true
< wumpus>
ok, let's leave that issue for later
< wumpus>
for initial version invoking a process is fine
< gmaxwell>
wumpus: do you have any advice for phantomcircuit on my DEFAULT_BLOCKSONLY complaint? My response to him was not terribly constructive... in part because I don't have any good advice.
< gmaxwell>
I'd have merged it already but for that issue.
< wumpus>
what is the issue exactly? can't find it in the comments
< wumpus>
I do see DEFAULT_BLOCKSONLY is not actually used (besides documentation)
< wumpus>
but what prevents him from doing that?
< gmaxwell>
oh he adds a DEFAULT_BLOCKSONLY, but then doesn't use it in the getargs because some of them are in net.cpp.
< wumpus>
move the constant to net.h?
< gmaxwell>
And he would pass it as an argument but apparently pushversion is called in a blinking constructor.
< * gmaxwell>
checks if he can do that easily
< wumpus>
in general the constants should be defined in the header of the cpp they are used in
< wumpus>
if they are used in multiple places that could be annoying, but at least init.cpp already includes net.h so that's not a problem
< gmaxwell>
:) I'd assumed there was a reason he couldn't do that, but looking now.
< gmaxwell>
yup he can. one is in main.cpp but it also pulls in net.h.
< wumpus>
otherwise define a global bool and assign that in init.cpp, instead of calling GetBoolArg every time, we do that for more settings (especially those used in inner loops where the overhead of argument-parsing-every-time hurts)
< wumpus>
ok!
< gmaxwell>
okay thanks for the cluestick. I'd assumed that he used them someplace that didn't have net.h but I didn't look.
< wumpus>
interesting, so this seccomp filter is a little program (in a restricted instruction set) that is uploaded to the kernel that checks system calls and arguments and returns whether allowed or not
< wumpus>
never expected it to be so dynamic
< wumpus>
and it should be possible to set it as non-privileged user, given that PR_SET_NO_NEW_PRIVS is set first to avoid tricking set-uid executables with it
< wumpus>
sounds ActuallyUseful, should do some experiments with it some time...
< gmaxwell>
Yes. It is. I tried to do it for bitcoin core eons ago, but the utility library was GPLed... they since changed the licensing.
< wumpus>
(somehow always avoided use of seccomp because I assumed you'd need root to set them, like chroot etc)
< wumpus>
cjdns uses it without utility library
< gmaxwell>
nope. The thing that makes me most sad right now is walletbackup and the import thing, since they require arbitrary file access. But I'd like to replace them with little utility programs, which we'd let ourselves exec and talk to over pipes.
< wumpus>
(but cjdns does recommend starting it as root, which was one of the things that put the idea in my head, but they need it for tun setup)
< wumpus>
(as well as other jail functionality, which, unfortunately, does need root)
< wumpus>
well my idea at least at first would be to have a restricted, secure mode, that disabled calls like that (and also walletnotify/blocknotify and everything that calls external processes)
< gmaxwell>
ACK
< wumpus>
but you're right about walletbackup/import, I would prefer if they just dumped/retrieved their data over HTTP instead of making files
< wumpus>
within the JSON RPC framework this is difficult to solve, but outside that w/ using the http server directly it'd be pretty easy to stream instead
< gmaxwell>
I think there is some utility for triggering local backups, at least. But instead of arbitrary file access it could simply be making it write out dated files; without giving the rpc caller huge freedom.
< wumpus>
one thing that would make sense is to write to the datadir only
< wumpus>
it also proves that the user using the calls has access to it
< wumpus>
on the other hand, of course people want to backup to mounted filesystems etc
< wumpus>
hence I'd really prefer it to stream from/to http so that the client can decide where to put it or take it from, anywhere *they* have access to
< wumpus>
this avoids using bitcoind in a confused deputy attack (e.g have it write /etc/passwd :-) )
< wumpus>
(or read, for that matter, although you'd be hard pressed to find a file that it likes to read and interpret)
< wumpus>
(except maybe *other* backups it happens to have access to)
< gmaxwell>
IMO if you want to backup to a mounted FS you can either symlink it into the datadir, or have something external copy there. but ::shrugs::
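[editor's note] A minimal sketch of the "dated files in the datadir only" idea from the exchange above, assuming the caller supplies no path at all, so a backup call cannot be pointed at something like /etc/passwd. The function names, the `backups` subdirectory, and the file name pattern are all hypothetical, not anything Bitcoin Core defines.

```python
import os
import tempfile
from datetime import datetime, timezone


def backup_path(datadir, now=None):
    """Return a dated backup path inside datadir; the caller cannot choose it."""
    now = now or datetime.now(timezone.utc)
    name = "wallet-backup-" + now.strftime("%Y%m%dT%H%M%SZ") + ".dat"
    return os.path.join(os.path.realpath(datadir), "backups", name)


def write_backup(datadir, wallet_bytes, now=None):
    path = backup_path(datadir, now)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # Mode "xb" refuses to overwrite an existing file.
    with open(path, "xb") as f:
        f.write(wallet_bytes)
    return path


# Demo against a throwaway directory with an arbitrary fixed timestamp.
datadir = tempfile.mkdtemp()
fixed = datetime(2000, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
written = write_backup(datadir, b"not a real wallet", fixed)
print(os.path.relpath(written, os.path.realpath(datadir)))
```

Someone who wants the file on a mounted filesystem could still symlink the `backups` directory there, which matches the suggestion above of leaving that choice to whoever controls the datadir.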
< morcos>
phantomcircuit: I wrote a prototype separate thread for running CreateNewBlock, but the naive approach turned out much worse than the existing implementation.
< morcos>
The issue was that the template creation code kept needing to hold cs_main, so you ended up holding that lock all the time and holding up things like block relay.
< morcos>
Actually it makes a lot of sense to be sure you are not holding that lock at all when a new block comes in so you can validate it as quickly as possible and decide whether you should now be building off that
< morcos>
Of course redesigning the template generation code to not need all the locking is probably what we need to do, but it turned out to be a much easier win just to speed up how long it takes
< morcos>
It's really a mempool lock it needs but thats unfortunately cs_main partly.
< morcos>
sipa and I discussed some ideas on having a transaction storage object, which the mempool and the mining code could separately refer to.
< gmaxwell>
a while back matt was also just suggesting that CNB could terminate early if there was contention for the lock from a new block coming in.
< gmaxwell>
which sounds a little hackish, but probably the kind of prudent hack that would work pretty well.
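[editor's note] A toy sketch of the early-termination hack described above: template building checks a "block is waiting" flag between transactions and gives up the lock if it is set. Everything here is hypothetical. `main_lock` only stands in for cs_main, and the real selection loop is far more involved.

```python
import threading


class TemplateBuilder:
    """Toy builder that gives up when a new block is waiting (hypothetical)."""

    def __init__(self):
        self.main_lock = threading.Lock()        # stand-in for cs_main
        self.block_waiting = threading.Event()   # set by the block-arrival path

    def build(self, mempool_txs, on_step=None):
        """Return the selected txs, or None if aborted because of contention."""
        selected = []
        with self.main_lock:
            for tx in mempool_txs:
                if self.block_waiting.is_set():
                    return None                  # leaving the with-block frees the lock
                selected.append(tx)
                if on_step:
                    on_step(len(selected))
        return selected


builder = TemplateBuilder()
full = builder.build(["tx1", "tx2", "tx3"])


def arrive_after_two(count):
    # Simulate a block arriving right after the second transaction is selected.
    if count == 2:
        builder.block_waiting.set()


aborted = builder.build(["tx1", "tx2", "tx3"], on_step=arrive_after_two)
print(full, aborted)
```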
< gmaxwell>
the 30 second rule of thumb is an amalgamation of total delays, in practice; including most pool server software being fairly slow. as far as hardware goes I'm not sure what the antminer s7 is like; S1 was very good (I beat on them pre-release); S2 was pretty terrible (and completely unusable with p2pool which has 30 second blocks); I think I heard S5 was better than S2, I haven't heard about S7
< gmaxwell>
. KNC devices have quite high latency (seconds to respond to new work in my experience), SP10/SP20 have quite low latency.
< gmaxwell>
ckpool uses blocknotify, many things (including p2pool) talk to bitcoind on the p2p port. Eloipool will use GBT longpoll I think, as well as p2p port.
< morcos>
shocking that there are so many different mining software implementations, when its essentially consensus critical too
< gmaxwell>
Even when things long poll they can take a really long time to get it out to the devices due to 'reasons'. (such as almost no one ever actually measuring)
< morcos>
and do the big miners use the existing implementations, or have custom code. and who maintains all those things
< morcos>
sorry for the barrage of ?'s
< gmaxwell>
morcos: well it's armored by bitcoin nodes in and out; not that there haven't been problems.
< gmaxwell>
morcos: so historically mining pools have mostly been custom code; sometimes hacked on top of parts openly available. Historically, public pools spend a lot of their development brainpower on dealing with DOS attacks; then data management for payout computation..
< morcos>
are "pools" still what dominate mining or is it mostly single entities using the pool infrastructure b/c thats what exists
< gmaxwell>
So there are still big public pools but many of them have their own hashpower (either directly or in commonly owned partner companies). The exact breakdown of things is unclear.
< gmaxwell>
There are, of course, big entities now and they have simpler mining challenges (e.g. not so much having to worry about DOS). They sometimes use existing public pools. (I mean some large entities are always on public pools; some only use them sometimes for various reasons)
< gmaxwell>
Mining ecosystem is weird; as you observe there are a bazillion separate implementations of this or that... and then no one goes and makes createnewblock fast.
< morcos>
yeah, thats what i was wondering about
< morcos>
but i suppose for our purposes (not that it would lead to a different design goal) what we should envision is i guess a p2pool implementation
< gmaxwell>
Only 'mining pool' people who've really been active in core are luke-jr and phantomcircuit (who used to run the gear for a big-mining entity before working at BS). And mostly, even for these guys the ideal strategy is to work around limitations; because doing so was easier/safer.
< morcos>
i mean the whole point is to make the small miner able to compete?
< gmaxwell>
E.g. eloipool (the mining pool software luke-jr wrote that runs eligius) compensates for CNB latency by constructing empty blocks on its own, until CNB responds. And most mining operations just run multiple bitcoinds: one for GBT and several others that they submit through.
< tulip>
morcos: many of the larger pools appear to run completely custom software, for the most part their quirks are completely unique. many of the smaller ones now use ckpool, which seems to be for the most part significantly faster than everybody else.
< morcos>
tulip: thanks. i'm trying to narrow the universe of pool/mining software to explore
< tulip>
morcos: ckpool, eloipool, p2pool, NOMP
< gmaxwell>
P2pool is nice idea with cute features, but has always struggled with the high startup cost of having to run a bitcoin node with it; and then high latency asics showed up (in particular, some of the early BFL products had 10%+ hashrate loss from 10 second retasking)
< gmaxwell>
and that pretty much killed it there, and when it wasn't yet dead enough; somewhat later antpool announced that they'd be converting to p2pool based (but with a friendly front end where they run it for you) and lots of p2pool users moved over, then they didn't actually do it.
< gmaxwell>
so now p2pool is too low of a hashrate for anyone who is variance twitchy to use. :(
< morcos>
oh no, thats sad
< gmaxwell>
Also through this time the developer burned out on Bitcoin, and only does life support maintenance on p2pool now.
< gmaxwell>
at the moment P2Pool's hashrate is 918 TH, says my local p2pool node.
< gmaxwell>
(it's pretty neat, has a built in webserver and graphs and such)
< gmaxwell>
(Forrestv seems to have flamed out due to a mixture of bitcoin drama, and then donations to support p2pool being relatively low (compared to life changing amounts of income centralized pool operators were getting), and then he got ripped off ... twice, I think, by mining hardware makers.)
< tulip>
morcos: there's also stratum proxies like bfgminer and stratum-mining-proxy which present work to downstream miners, based on either an upstream work server or a Bitcoin node. the latter was meant for use with early hardware miners which supported getwork but not stratum.
< gmaxwell>
then of course there is all these mining devices with embedded mips and arm linux systems that run usually very old versions of cgminer (or sometimes bfgminer)
< phantomcircuit>
morcos, generally speaking.. yes i dont care that cnb is slow because it's mostly irrelevant
< phantomcircuit>
and i really dont care that it holds a lock for ages since blocks are submitted to nodes dedicated for that
< gmaxwell>
it's much less irrelevant if you're not assuming a big miner farm.
< phantomcircuit>
gmaxwell, true
< morcos>
phantomcircuit: i think having getblocktemplate and createnewblock quickly return a valid block after a new tip is important to everyone
< morcos>
even better if it is with txs, but not critical
< phantomcircuit>
morcos, well making it return *something* immediately afterwards would certainly be nice
< phantomcircuit>
that would reduce the need for some of the comical hacks
< phantomcircuit>
but as gmaxwell said most of the hardware has queues that are very long and dont flush
< phantomcircuit>
(there's good reason for this but i wont get into that)
< morcos>
right so the way i look at it is assuming we don't want to code up validationless mining (which i am not yet convinced of) then we need 100ish ms to validate the new tip and about 100ish ms to generate a new block with txs (with the new code)
< morcos>
so at most we could carve out the second 100ms, but at this point we've gotten 95% of the improvement, so maybe not worth doing that unless we are going all the way
< phantomcircuit>
morcos, 100ms to return an empty but validated block template would be better :P
< morcos>
and before gmaxwell reaches through the tubes to strangle me, my primary argument for considering doing validationless mining is that if people are going to implement it anyway, we might as well make it as safe as possible
< morcos>
but i agree that if there are other bottlenecks to switching work, then its not something we can do... (validationless block header then 100-200ms later the validated version, if you can't switch after 200ms, then its not safe to start on the unvalidated version)
< morcos>
but that same argument applies much less importantly to empty blocks. if you're not going to switch after 100ms to the block with txs... then you probably don't want to start with the empty block... depends on the amount of fees and the minimum switching delay i guess
< gmaxwell>
If we want to implement validationless mining, then we need to do it right; which would mean doing something like signaling in the block header that we haven't verified prev, and so SPV clients should ignore the block for conf count purposes.
< gmaxwell>
But I'd rather get validation lower first. :)
< phantomcircuit>
morcos, there's basically no bottleneck in the validation
< morcos>
gmaxwell: sure agreed. phantomcircuit: no bottleneck? i'm talking about latency in sending new work to hardware outside of bitcoind
< morcos>
i'd guess that a significant chunk of the remaining 100ms is allocation (in the non degenerate block case)
< phantomcircuit>
morcos, yes... the issue is almost entirely not in validation but in selecting transactions
< gmaxwell>
it's an interesting point that the extra 100ms doesn't matter. But rather, I think getting the 100ms upfront matters, and then it doesn't really matter if CNB is slow (except for lock holding reasons)
< phantomcircuit>
the mempool indexing + limiting mostly fixes that
< morcos>
phantomcircuit: huh? no thats backwards. selecting txs is blazingly fast. (well until cpfp)
< morcos>
oh you're talking about in master
< morcos>
yeah sure.. i'm talking about the remaining 100ms... which is 10ms tx selection and 90 ms validation. tx selection is still only 20ms if you scan an entire 300mb mempool to also sort by priority (with a fast priority calc)
< phantomcircuit>
morcos, im not (and i dont think anybody paying attention) is worried about validation times of 100ms
< morcos>
i see 100 ms, and i think the units still need to change. :)
< phantomcircuit>
that's like 0.01% orphan rate
< phantomcircuit>
yes well faster is always better
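[editor's note] A back-of-the-envelope check of the orphan-rate figure above, under a simplified model that is an assumption and not something stated in the log: blocks arrive as a Poisson process with a 600 second mean, and any block found elsewhere during the delay causes a race. Under that model a 100ms delay gives roughly 0.017%, the same ballpark as the quoted "like 0.01%". `race_probability` is a hypothetical helper name.

```python
import math

BLOCK_INTERVAL_S = 600.0  # target mean time between blocks


def race_probability(delay_s, interval_s=BLOCK_INTERVAL_S):
    """Chance that at least one competing block appears within delay_s."""
    return 1.0 - math.exp(-delay_s / interval_s)


for delay in (0.1, 1.0, 10.0):
    print(f"{delay:>5.1f}s delay -> {race_probability(delay) * 100:.3f}%")
```

For small delays the result is close to delay / 600, which is why each extra 100ms costs about the same small fraction.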
< gmaxwell>
hah. I think 100ms is really slow, but people clearly don't mind that much. But they don't mind in part because they perform hacks; some of which are harmful and we'd like them to stop. :)
< morcos>
actually we are forgetting to measure some things there
< morcos>
i'm measuring from receive block to new tip = 100ms and new tip to new template = 100ms
< gmaxwell>
when I say they don't mind that much; a year ago there were two places in the network handling code that just inserted 100ms _sleeps_.
< morcos>
but we also need to worry about receive most work header to receive block
< morcos>
that's probably what we need to optimize now
< phantomcircuit>
yes the push head instead of inv patch is a huge win for that
< morcos>
or even someone else generates new tip to receiving most work header. direct headers will help with that
< gmaxwell>
yes, well we can remove a one way delay by nominating your favorite peer to relay uninvited; beyond that needs something like matt's relay protocol. (or better)
< morcos>
yeah i need to look into how the relay client works... if you generate a new tip... how do you communicate that to the relay network, same as p2p
< * phantomcircuit>
looks at brain clock *wow so late* goes to sleep
< gmaxwell>
morcos: the relay network has its own protocol, and a local proxy you run that speaks bitcoin p2p.
< morcos>
oh i suppose if you submitted to a node thats not mining, then it doesn't have the problem of that immediate cs_main lock from gbt after you generate a new tim
< morcos>
tip
< gmaxwell>
yea, thats why the earlier mentioned architecture of multiple nodes.
< gmaxwell>
The relay protocol is super simple. It can send loose transactions, and keeps a history of the X sent. When you want to send a block it sends the header and a list of two byte indexes into that transaction history, as well as any history-miss transactions.
< gmaxwell>
other end reconstructs the block and passes it on to bitcoind unsolicited.
< gmaxwell>
his server side, passes all transactions through a local bitcoind, and only relays what it spits out.. but for blocks it checks work and relays them on without more than minimal verification.
< gmaxwell>
This really simple approach frequently managed to turn 1MB blocks into 3.6KB. http://bitcoinrelaynetwork.org/stats.html and leaves matt either fighting sha256 speed ... or mempool spam that breaks his hitrates.
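[editor's note] A rough sketch of the scheme described above: both ends keep the same history of already-relayed transactions, and a block goes over the wire as its header plus two-byte indexes into that history, with history misses sent inline. The byte layout here (big-endian fields, a `MISS` sentinel, a length prefix) is invented for illustration and is not the relay network's actual wire format.

```python
import struct

MISS = 0xFFFF      # assumption: sentinel index meaning "full tx follows inline"
HEADER_LEN = 80    # size of a Bitcoin block header in bytes


def encode_block(header, txs, history):
    """Encode a block as header + tx count + 2-byte history indexes (or misses)."""
    # Assumption: history holds fewer than 0xFFFF entries, so no index equals MISS.
    index_of = {tx: i for i, tx in enumerate(history)}
    out = [header, struct.pack(">H", len(txs))]
    for tx in txs:
        i = index_of.get(tx)
        if i is None:
            out.append(struct.pack(">HI", MISS, len(tx)) + tx)
        else:
            out.append(struct.pack(">H", i))
    return b"".join(out)


def decode_block(data, history):
    """Rebuild (header, txs) from the wire form using the shared history."""
    header = data[:HEADER_LEN]
    (count,) = struct.unpack_from(">H", data, HEADER_LEN)
    pos = HEADER_LEN + 2
    txs = []
    for _ in range(count):
        (i,) = struct.unpack_from(">H", data, pos)
        pos += 2
        if i == MISS:
            (n,) = struct.unpack_from(">I", data, pos)
            pos += 4
            txs.append(data[pos:pos + n])
            pos += n
        else:
            txs.append(history[i])
    return header, txs


# 100 fake 250-byte transactions that both sides have already seen.
history = [bytes([i]) * 250 for i in range(100)]
header = b"\x00" * HEADER_LEN
block_txs = history[10:60] + [b"\xff" * 250]   # 50 known txs plus one miss
wire = encode_block(header, block_txs, history)
rebuilt = decode_block(wire, history)
full_size = HEADER_LEN + sum(len(t) for t in block_txs)
print(len(wire), "bytes on the wire vs", full_size, "bytes in full")
```

The saving depends entirely on the hit rate: every miss costs the full transaction again, which is why transactions that never reached the shared history hurt so much.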
< GitHub101>
[bitcoin] gmaxwell opened pull request #7016: Avoid a compile error on hosts with libevent too old for EVENT_LOG_WARN. (master...without_EVENT_LOG_WARN) https://github.com/bitcoin/bitcoin/pull/7016
< vev__>
we need people to support us #libreidea
< GitHub50>
[bitcoin] morcos closed pull request #6292: Rename and comment priority calculation in TxMemPoolEntry (master...PriorityComment) https://github.com/bitcoin/bitcoin/pull/6292
< morcos>
Can someone tag #6134 with 0.12 milestone.
< morcos>
I also think #6915 should go in for 0.12.