< Luke-Jr>
gmaxwell: perhaps. optimizing the sorted list would help in multiple places though
< gmaxwell>
as far as optimizing, how about adding one of the satoshi dice addreses to a wallet and rescanning?
< Luke-Jr>
(just the last item will actually change the algorithm btw)
< gmaxwell>
Luke-Jr: sure other optimizations are good too, but AFAIR all that needs to do is track a the extrema.
< Luke-Jr>
gmaxwell: it ignores any transactions timestamped >5m in the future from the local time, which means such txns would have to be reconsidered over time
< gmaxwell>
Luke-Jr: uh. how are they ever going to get learned then?
< Luke-Jr>
gmaxwell: ?
< Luke-Jr>
ignores in the context of looking at other txns timestamps to decide the timestamp for this one
< phantomcircuit>
what's the relative cost of checking an address vs pattern matching the output to start with?
< gmaxwell>
remind me precisely what it does?
< gmaxwell>
phantomcircuit: go read the pr
< gmaxwell>
(the text on the opening)
< Luke-Jr>
gmaxwell: it's complex enough that I'd just be rewriting the code in pseudo-code to explain it :/
< gmaxwell>
Luke-Jr: good well that means it can change behavior and not violate anyone's expectations because apparently no one can understand it. :P
< Luke-Jr>
it's not hard to understand, it's just complex :p
< phantomcircuit>
gmaxwell, i never understood why we even bothered to guess when an unconfirmed transaction happened
< Luke-Jr>
the point is that it meets expectations
< gmaxwell>
phantomcircuit: because we show them in listtransactions
< phantomcircuit>
gmaxwell, "this transaction happened now" seems reasonable for all unconfirmed transactions
< gmaxwell>
phantomcircuit: but then its inconsistent with what you get when it confirms.
< Luke-Jr>
phantomcircuit: no, definitely not when you're syncing a node that was off
< phantomcircuit>
Luke-Jr, how about null
< gmaxwell>
(the listtransaction output gives several different times when you care about details... and a smart time)
< gmaxwell>
phantomcircuit: bet peoples software will handle that super awesomely well.
< Luke-Jr>
phantomcircuit: … that is not useful
< phantomcircuit>
the timestamp we're returning now isn't very useful either
< gmaxwell>
phantomcircuit: It's usually pretty good.
< gmaxwell>
Luke-Jr: I think you might need to write out the algorithim again, we cannot continue with this being slow like this.
< phantomcircuit>
i guess...
< Luke-Jr>
gmaxwell: that's why I'm working on optimizing it.. if we want accounting to keep working, I can deal with that, just would have saved time to omit it
< Luke-Jr>
(this code actually seems to be somewhat broken with accounts, fwiw)
< gmaxwell>
Luke-Jr: I thought smarttime just returned blocktime for confirmed transactions that you didn't observe first?
< Luke-Jr>
not necessarily
< gmaxwell>
unless the blocktime is in the future, in which case?
< phantomcircuit>
this seems much smarter than it needs to be...
< Luke-Jr>
it's approximately max(last_txn_time, min(block_time, max(last_txn_time, now)))
< phantomcircuit>
also it seems like all of these values should be cachable
< gmaxwell>
Luke-Jr: than can be simplified to refer only to blocktimes in the context of a rescan.
< Luke-Jr>
gmaxwell: half the goal here is to avoid transactions ever having timestamps out of order
< gmaxwell>
which they cannot, in a rescan.
< Luke-Jr>
uh, why not?
< Luke-Jr>
block times are out of order all the time
< phantomcircuit>
what's the rational behind not setting the transaction timestamp to the block timestamp?
< gmaxwell>
Luke-Jr: as I just said, you can apply that filter to blocktimes only!
< phantomcircuit>
oh you want to avoid the timestamps disagreeing with the nOrdPos
< gmaxwell>
phantomcircuit: becuase they end up out of order which really screws with people.
< Luke-Jr>
gmaxwell: that sounds a lot more complex, and doesn't consider that we can ignore it when blocks don't have any txns of ours
< gmaxwell>
Luke-Jr: this code takes _9.5 hours of processing_ in the requesting users example.
< gmaxwell>
And you are going to complain about complex?
< gmaxwell>
Luke-Jr: having it use some really old time because your last transaction was 50 blocks ago isn't good.
< Luke-Jr>
I find it simpler to just optimise the ordered tx list
< gmaxwell>
e.g. if the latest block is in the future, okay you don't use the time. Then it should use the time of the prior block, even if there was no transaction there, not the time of your last transaction.
< gmaxwell>
at least in the context of a rescan where you do not learn times anywhere except blocks.
< gmaxwell>
I think the current behavior is just busted there.
< Luke-Jr>
gmaxwell: it would use `now` in that case
< Luke-Jr>
in a rescan, block times will basically never be in the future.
< gmaxwell>
yea, unlikely, indeed.
< Luke-Jr>
gmaxwell: will you NACK an optimization that preserves existing behaviour?
< gmaxwell>
if it's measurably slower, yes. This should not take any time at all. as I said, I would have NAKED smarttimes if I knew it caused a measurable slowdown.
< Luke-Jr>
gmaxwell: I *think* it is possible to optimise so it's reasonably close, but in case not, there is a possibility of rescan just skipping smart-time assignment (which currently causes "time" to fall back to received time = rescan time, but maybe that should change?)
< Luke-Jr>
side note: hmm, sync of my node confused BFGMiner as it got notified of old blocks O.o
< gmaxwell>
Luke-Jr: of course, if this function is slow during rescan its also slow during runtime... so I do support speeding it up generally too.
< gmaxwell>
Luke-Jr: would rather fall back to monotoneized blocktime than rescan time.
< Luke-Jr>
well, part of why it was unoptimised is probably because runtime is not time sensitive
< gmaxwell>
Luke-Jr: hope you're not mining from a node with a wallet...
< gmaxwell>
but really having a very big wallet (Even bigger than the ones there) would eventually be disruptive.
< Luke-Jr>
eh? it's just BFGMiner (running for testing) automatically picking up my hot wallet
< Luke-Jr>
it's using Eligius mainly, but the local node's block disagreements somewhat confused it
< Luke-Jr>
no big deal if mining is negatively impacted by it; maybe a slightly bigger deal if it were to compromise my hot wallet
< Luke-Jr>
but if that latter case is a risk, we should fix it. I'm not aware of such a problem.
< gmaxwell>
I wasn't commenting on your bfgminer comment.
< Luke-Jr>
oh
< gmaxwell>
I'm saying that you shouldn't dismiss that the performance of sorting the whole wallet on an add as being important. As an example, someone who is mining with a wallet cares about a 100ms delay, (which is the ballpark of what that user was reporting).. or very big wallets.
< phantomcircuit>
Luke-Jr, cant you just cache the oldest record timestamp?
< Luke-Jr>
phantomcircuit: newest*, but no, because of the 5 minute future rule
< Luke-Jr>
optimising OrderedTxList looks like an easy big win. let me finish that and we can see where we are and if we need more improvements?
< gmaxwell>
Luke-Jr: sounds like something to try.
< phantomcircuit>
bleh the real problem is the rpc api is wrong
< phantomcircuit>
probably cant reasonably fix that though
< gmaxwell>
on my not very huge wallet, the PR on github is a ~1% speedup.
< gmaxwell>
really we're doing something dumb if every wallet isn't approximately the same speed for rescan.
< kanzure>
rescan was killing all my rpcthreads a few weeks ago and i didn't have any insight into what was going on :-( maybe i'll submit an rpcthread status info command.
< kanzure>
s/killed/pegged? what's the right term
< dcousens>
jgarzik: not sure what users would use it tbh
< dcousens>
I know I would for an API perspective
< dcousens>
So, in my case, automated?
< dcousens>
And I guess for sites like webbtc, they lightly dress it up in some readable HTML
< dcousens>
hell, if an addrindex existed, a block explorer would solely be a HTML dress up of the bitcoind REST API ha
< jgarzik>
dcousens, yeah the 'GET /foo' case is more likely to be used manually, versus PUT/POST
< dcousens>
jgarzik: probably a more important question is, what is the default
< dcousens>
that will dictate how easy it is for curl users
< dcousens>
And even then, content type isn't that hard for curl users?
< dcousens>
I feel like most users might do something along the lines of 'bitcoin-tx ... | curl 'http://somepopularnode.com/tx' --data-raw -H "Content-Type:application/octet-stream"
< dcousens>
or application/json if `-json` was used
< dcousens>
Probably easiest to just specify the format to whatever the default is, and have `-json` or whatever on bitcoin-tx though
< wumpus>
kanzure: rescan blocks all your RPC threads, on aquiring cs_main
< wumpus>
kanzure: long-running RPC commands tend to block the rest, apart from gettxoutsetinfo which I changed at some point to not hold locks (as it runs on a snapshot of the database)
< wumpus>
probably in rescan the lock could be moved down too, and only aquired when block database structures are accessed
< Luke-Jr>
gmaxwell: caching ordered tx list is faster than removing smart time. good enough? :x
< gmaxwell>
Luke-Jr: so what happens when the wallet is huge?
< gmaxwell>
then we end up with this gigantic tx list cache in ram. :(
< Luke-Jr>
eh, right now we store all the tx anyway..
< Luke-Jr>
is storing a list of pointers all that much more?
< gmaxwell>
I guess not, probably only another three gigabytes of ram with the whole blockchain in your wallet.
< gmaxwell>
I'm not super keen on this whole issue existing though. :-/
< gmaxwell>
Re: 6844 I am twitching a bit about more things being added to the rest api, while little effort has been performed to gain strong confidence that it's not a remote code execution vector. IIRC, we were able to crash the node via it when it went in and we addressed that by disabling it by default. I absolutely don't want to crap on 6844 itself, as it looks like a great feature. But we are making
< gmaxwell>
an error if we continue to offer a potentially insecure external interface and make it more and more attractive to turn it on and expose it.
< gmaxwell>
(keep in mind that even if it is localhost only; a local webbrowser has almost free reign over it.)
< phantomcircuit>
whoa what?
< phantomcircuit>
you can remotely crash the rest api?
< gmaxwell>
phantomcircuit: Could initially. I'm unsure if it's fixed, it's a huge attack surface in any case-- we don't have any fuzzing harness for it either.
< gmaxwell>
(the prior crash I was aware of was just send it an infinityish sized request...)
< wumpus>
it's relatively safe as long as the JSON parser is not used for input data, and if there is an input argument it is just a simple linear string directly part of the URI, with straightforward validation
< wumpus>
the crash was a case where JSON spirit was used to parse input data query, which had a known out of memory crash if you make structures too deep
< gmaxwell>
heh. I found a (local) buffer overflow bug in wget while trying to crash it.
< wumpus>
IMO REST is for 'trivial, quickly answered, traightforward queries', not where you need to pass in JSON encoded monstriosities
< jonasschnelli>
wumpus, yes. Probably a good idea. What about maximum length of URLs in case of submit tx?
< wumpus>
gmaxwell: that could still be interesting; some scripts are bound to pass remote input to wget's parameters
< wumpus>
jonasschnelli: there definitely should be a limit!
< jonasschnelli>
wumpus: right. But it might be, that http clients or proxys are using limits lower then the user might expect.
< jonasschnelli>
I think for large tx /GET is not suitable.
< * wumpus>
wishes he had more time to hunt for vulnerabilities in bitcoind's codebase
< jonasschnelli>
POST with hex/bin should be supported IMO
< wumpus>
using GET to provide input for tx submission is very un-restful
< jonasschnelli>
most webservers (and probably proxys) have a limit of 8KB for URL (HTTP GET) length
< wumpus>
should be PUT?
< wumpus>
or yes, POST
< jonasschnelli>
wumpus: agree. PUT or POST would be better. But there comes in the question of supporting JSON... which is a security risk.
< wumpus>
no, no JSON
< jonasschnelli>
Maybe just support BIN/HEX
< wumpus>
just raw POST data
< jonasschnelli>
(or only BIN)
< jonasschnelli>
wumpus: agree.
< wumpus>
JSON is ok for returning data in some cases, but should not be *parsed* in REST
< Luke-Jr>
can we parse plain English language instead?
< gmaxwell>
please don't do the rest submittx at all if its only bin, if it won't take hex it will be massively less used, and we'll just have more code to hang around and bloat the codebase; while people will continue to eschew bitcoin core and use some web api that actually gives an interface people expect. :-/
< gmaxwell>
okay, I feel better now-- I have a whole bunchof fuzzers running against it at the moment and it's not immediately dying.
< gmaxwell>
\O/
< jonasschnelli>
gmaxwell: agree. Hex should be supported in the first place.
< wumpus>
hex is fine, just don't add a zillion kinds of input formats...
< wumpus>
each one adds more attack surface
< Luke-Jr>
tonal pls
< wumpus>
hah...
< Luke-Jr>
(no, tonal isn't used for binary data..)
< Luke-Jr>
is -connect=0 the best way to avoid any external interactions with p2p?
< wumpus>
-connect=0.0.0.0
< Luke-Jr>
k
< wumpus>
-connect=0 will work too, but will spam your debug.log with errors
< wumpus>
but yes I feel bad about adding more and more to the REST interface as well. Initially it was meant to just request data, making it possible to submit transactions unauthenticated is a different thing.
< wumpus>
I understand why it's useful of course...
< Luke-Jr>
IMO it will get horridly abused
< Luke-Jr>
when Eligius offered such a thing, it got hard-coded into spam scripts
< wumpus>
you're not *supposed* to offer REST interface to the open internet, but lacking auth it may be easy to do accidentally (same as zillions of other critical systems are accidentally on the internet)
< gmaxwell>
I dunno if there is any real reason to do anything but hex.
< gmaxwell>
I suppose if block submission was also supported then the hex overhead would be pretty unwelcome.
< gmaxwell>
and for that we'd want binary data. so there might be a uniformity argument.
< wumpus>
binary is the least risky to accept, I don't think there's anything to be won by not accepting that
< wumpus>
then again I don't really care, if you want to make it hex-only, be my guest, just don't add JSON parsing
< wumpus>
(or any encoding more involved than straightforward encode-two-characters-to-one like hex)
< GitHub21>
[bitcoin] luke-jr opened pull request #6851: Optimisation: Store transaction list order in memory rather than compute it every need (master...opti_txorder) https://github.com/bitcoin/bitcoin/pull/6851
< wumpus>
Luke-Jr: why is -zapwallettxes performance a goal?
< wumpus>
isn't that a troubleshooting option that should be quite rarely used?
< Luke-Jr>
wumpus: I don't know, gmaxwell was annoyed it was taking some guy with a big wallet 9 hours. Apparently it's more common because we don't handle malleability right still.
< Luke-Jr>
wumpus: but right now, it sorts the entire wallet's transactions every time it adds a new transaction to the wallet. so it's pretty bad.
< wumpus>
we had a RPC pull request at some point where it was possible to remove conflicted transactions one-by-one, I greatly preferred that to the hammer that is zapwallettxes
< wumpus>
it's pretty bad but does it warrant using more memory even in the cases where people don't use zapwallettxes?
< wumpus>
I don't think so - memory usage of bitcoind is already famously bad
< Luke-Jr>
for the wallet? I've only heard complaints when people fail to configure their mempool policy
< wumpus>
sorry to be grumpy about this but there's so many people complaining about that
< wumpus>
well the entire wallet is kept in memory at alll times
< Luke-Jr>
yes, that's why I didn't think the overhead from this would be significant ;)
< Luke-Jr>
(my hot wallet has been in regularish use since 2009 and hasn't reached 40 MB yet..)
< wumpus>
i don't like "it's already bad, so making it any worse doesn't matter" arguments, but ok...
< Luke-Jr>
2010*
< dcousens>
IMHO, I'd argue the REST API shouldn't exist at all
< dcousens>
Its trivial to expose it via the RPC if you want to do that
< wumpus>
that was my argument too; it could just as well have been an external script that wraps RPC and proides a new API
< dcousens>
For the same reasons Guest89849 (gmaxwell) mentioned before, its just a huge surface vector for limited gain
< wumpus>
but who listens to me...
< Luke-Jr>
wumpus: the benefits IMO significantly outweigh the costs. someone would need to be running some major service provider to use a lot of memory for the wallet, and that use case is just out of reach for bitcoind's wallet without any kind of major rewrite
< wumpus>
not everything has to be in the monolithic bitcoind executable
< Luke-Jr>
even aside from zap, it improves performance for listtransactions RPC etc
< wumpus>
Luke-Jr: oh if it helps perfomance in other places too you should definitely mention that!
< dcousens>
wumpus: Luke-Jr has mentioned a few times now people writing their own mempool policy, which is fine, and maybe including stricter default policies might be slightly controversial (too restrictive etc), but, maybe some pre-written policies that users can easily enable that have some sane defaults would be beneficial?
< wumpus>
Luke-Jr: I wasn't aware, and just 'it improves zapwallettx' is not a big selling point to me
< Luke-Jr>
k, added it to the summary. looks like it's *just* listtransactions
< wumpus>
dcousens: if someone wants to write and test them, I don't see why not, there are some pulls that factor out the mempool policy stuff which makes it easier to create custom ones, at least in theory
< dcousens>
wumpus: I mean, I guess that would be the ideal case right?
< dcousens>
Everyone configuring their policy to what they are happy with
< wumpus>
dcousens: the ideal case - from the network view - is that everyone runs the same, predictable policy
< dcousens>
wumpus: true, but varying hardware requirements mean that is difficult to cater for
< dcousens>
So, I guess, you could go the highly configurable policy route
< wumpus>
dcousens: if people insist on using their own mempool policy on their miner there's no way to prevent that, and arguably it shouldn't be made more difficult than necessary to prevent stupid issues from patching
< dcousens>
Or what seems to be the current direction, scalable policy which matches your hardware (if configured)
< wumpus>
dcousens: but I'm not sure it makes sense to talk about the ideal case there
< wumpus>
how does hardware determine the policy? just how much memory available?
< dcousens>
wumpus: yes
< wumpus>
if so, adding a parameter to set the amount of memory used (with as little impact otherwise) would be preferable
< Luke-Jr>
wumpus: everyone running the same predictable policy necessarily implies centralisation and censorship risk
< wumpus>
that's the whole point of the mempool limiting
< wumpus>
add limiting but don't mess up the network too much or introduce DDoSes
< wumpus>
I don't see the point of 'highly configurable policy' just to work around hardware limitations
< dcousens>
Luke-Jr: only if the default policy is restrictive right?
< Luke-Jr>
dcousens: which it must be
< dcousens>
Luke-Jr: or do you mean, encouraging people just to use the bundled policy is the centralizatoin risk?
< dcousens>
(as it seemed you implied to me last night)
< wumpus>
Luke-Jr: censorship risk? that's a strawman, I'm talking a about a *sane* default policy, not some blacklist hellscape
< Luke-Jr>
dcousens: I was responding to wumpus saying ideal is everyone on the same policy.
< wumpus>
if bitcoin core starts to include code that censors specific transactions/senders, start running very fast
< Luke-Jr>
wumpus: a unified policy means there is someone making the decisions
< dcousens>
Luke-Jr: you could say the same thing about a unified protocol
< wumpus>
right, it is about encouraging people to use the same or similar policy for relaying, there is no mechanism to force nodes to use a certain policy. And if someone powerful wanted to censor it'd be much more effective to approach the (heavily centralized) miners than trying to get it merged upstream.
< GitHub199>
[bitcoin] laanwj closed pull request #6162: Use 512-bit arithmetic for difficulty retarget calculation. (master...regtest_consensus_fix) https://github.com/bitcoin/bitcoin/pull/6162
< GitHub124>
[bitcoin] CodeShark opened pull request #6853: Added fNoRetargeting field to Consensus::Params (master...fNoRetargeting) https://github.com/bitcoin/bitcoin/pull/6853
< GitHub169>
bitcoin/0.10 fb818b6 Micha: Bring historical release notes up to date...
< GitHub169>
bitcoin/0.10 7e9a987 Wladimir J. van der Laan: Merge pull request #6836...
< wumpus>
the more I think about it the more I think we should have implemented REST as a separate script that wraps the RPC interface... I really don't like how another interface to the same things is converging more and more on RPC's functionality
< btcdrak>
^
< btcdrak>
wumpus: could still be changed that way now
< wumpus>
yes
< wumpus>
with libevent's http, and univalue-as-as-library it shouldn't be that difficult at all
< wumpus>
(even to do it in C wouldn't, I mean, in python it's obviously easy)
< btcdrak>
wumpus: so really, if you want this to be done you'll have to refuse to merge new duplicate REST methods until the wrapper is done.
< wumpus>
if I want it to be done I'll have to do it myself...
< GitHub130>
[bitcoin] morcos opened pull request #6856: Do not allow block file pruning during reindex. (master...noPruneDuringReindex) https://github.com/bitcoin/bitcoin/pull/6856
< GitHub123>
[bitcoin] morcos opened pull request #6857: Require nLastBlockFile to be the highest numbered file. (master...nLastBlockFile) https://github.com/bitcoin/bitcoin/pull/6857