< bitcoin-git>
[bitcoin] pstratem opened pull request #20751: [fuzzing] Use subprocess.Popen call directly, remove use of thread pool. (master...2020-12-22-fuzz-test-runner) https://github.com/bitcoin/bitcoin/pull/20751
< bitcoin-git>
[bitcoin] pstratem opened pull request #20752: [fuzzer] generate new seeds in an infinite loop with random parameters (master...2020-12-fuzz-test-runner-infinite-generate) https://github.com/bitcoin/bitcoin/pull/20752
< phantomcircuit>
sipa, ^ those seem to work nicely
< sipa>
mutate_depth uniformly up to 15, wow
< sipa>
i haven't experimented with that, but assumed that the default of 5 was pretty good
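(A minimal sketch of the idea behind #20752, assuming a libFuzzer binary at src/test/fuzz/fuzz selected via the FUZZ environment variable; the corpus path, target name, and time budget are illustrative, not the PR's actual code.)

    # Launch the fuzz target in an endless loop, randomizing -mutate_depth
    # each round (uniform up to 15, as discussed above).
    import os
    import random
    import subprocess
    import time

    FUZZ_BIN = "src/test/fuzz/fuzz"         # assumed fuzz binary
    CORPUS_DIR = "corpus/process_message"   # assumed corpus directory

    while True:
        depth = random.randint(1, 15)
        cmd = [FUZZ_BIN, f"-mutate_depth={depth}", "-max_total_time=300", CORPUS_DIR]
        subprocess.Popen(cmd, env={**os.environ, "FUZZ": "process_message"}).wait()
        time.sleep(1)                       # brief pause between rounds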
< sipa>
phantomcircuit: i believe it also makes sense to vary the -max_len parameter
< sipa>
giving proportionally more time to short inputs seems to help
< phantomcircuit>
sipa, there's already a strong bias towards smaller inputs with another parameter which is uhhh
< phantomcircuit>
len_control
< phantomcircuit>
i might need to upstream to clang a change to the max_total_time behaviour, for process_message even with 300 seconds it exits after like 100 tests
< sipa>
phantomcircuit: another possibility is just not starting with the full corpus, but just a random subset of limited size
< sipa>
phantomcircuit: the len_control isn't useful for us i think... yes, you want short and long, but you don't want that within one run... we run so long that hopefully there is good coverage on all lengths already
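(A rough sketch of sipa's random-subset idea, using the same hypothetical paths as above: copy a bounded sample of the corpus into a temporary directory, fuzz from that instead of the full corpus, then fold any new inputs back in.)

    import os
    import random
    import shutil
    import subprocess
    import tempfile

    FUZZ_BIN = "src/test/fuzz/fuzz"         # assumed fuzz binary
    CORPUS_DIR = "corpus/process_message"   # assumed full corpus

    seeds = os.listdir(CORPUS_DIR)
    subset = random.sample(seeds, min(500, len(seeds)))  # cap the starting set

    with tempfile.TemporaryDirectory() as tmp:
        for name in subset:
            shutil.copy(os.path.join(CORPUS_DIR, name), tmp)
        subprocess.run([FUZZ_BIN, "-max_total_time=300", tmp],
                       env={**os.environ, "FUZZ": "process_message"}, check=False)
        # libFuzzer names corpus entries by content hash, so copying back
        # anything not already present folds the new seeds into the main corpus.
        for name in os.listdir(tmp):
            dest = os.path.join(CORPUS_DIR, name)
            if not os.path.exists(dest):
                shutil.copy(os.path.join(tmp, name), dest)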
< phantomcircuit>
oh i forgot to add a sleep to the loop
< phantomcircuit>
heh
< phantomcircuit>
sipa, right so random max_len between like 100 and 4,000,000 ?
< sipa>
by default max_len is 4096
< sipa>
and len_control is on
< phantomcircuit>
uh no by default it guesses based on the corpus
< phantomcircuit>
which for process_message ends up being >1MiB
< sipa>
oh, ok, but if the corpus has nothing big, it's set to 4096 afaik
< phantomcircuit>
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1007323 bytes
< sipa>
longer term i think we want an absolute max_len configured for every target
< phantomcircuit>
it's based on what's in the corpus and afaict the only way for it to grow is to have lots of things which are near the max and restart the process
< phantomcircuit>
it'll then guess that max_len should be longer and bump it
< phantomcircuit>
sipa, yeah agree, i suggested up to 4MiB since that's the largest valid network message, i think?
< sipa>
but fuzzing with such large inputs is super slow
< sipa>
so i think you want the max_len parameter exponentially distributed between say 64 and 4096 for most targets, and maybe for a few exceptions with a higher max
< phantomcircuit>
does the max length need to be uniformly distributed within the range?
< phantomcircuit>
maybe just powers of 2 is ok?
< sipa>
that's not uniform
< phantomcircuit>
sipa, ?
< sipa>
2**random.uniform is exponential, not uniform
< phantomcircuit>
sipa, oh no i mean like instead of powers of 2 there will also be values in between
< phantomcircuit>
it's still distributed exponentially
< sipa>
yeah, why restrict to just powers of 2? :)
< sipa>
(no strong opinion, i'm sure it's all fine)
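(A small sketch of the sampling just described: max_len exponentially distributed between 64 and 4096 via 2**uniform over the exponents, not restricted to powers of two.)

    import math
    import random

    def pick_max_len(lo=64, hi=4096):
        """Exponentially distributed max_len in [lo, hi]."""
        return int(2 ** random.uniform(math.log2(lo), math.log2(hi)))

    # e.g. pass f"-max_len={pick_max_len()}" to the fuzz binary; a few
    # exceptional targets could use a higher upper bound.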
< phantomcircuit>
btw the largest corpus i have is now for the bloom_filter and asmap tests which is interesting
< phantomcircuit>
i think the process message one just explodes trying to load its corpus actually
< phantomcircuit>
sipa, yeah i tried specifying -runs instead of -max_total_time and i'm running into the problem i thought i would
< phantomcircuit>
all the processes are now expensive things to run cause they sort of clog up the works
< jonasschnelli>
what is the reason to extend the fuzzer seeds? Catching more potential issues?
< sipa>
jonasschnelli: that's literally what fuzzing is
< sipa>
finding more seeds
< sipa>
finding more inputs that cover more code, or more potential issues
< jonasschnelli>
I think my understanding of fuzzing is weak... I need to read more about it first
< jonasschnelli>
I always thought it's running a random stream against all possible APIs for a very long time
< sipa>
every fuzz target is a function that takes a byte array, and uses it to decide what things to call
< sipa>
some targets just call functions, and don't do anything else
< jonasschnelli>
okay. I get that.
< sipa>
others have full simulators of what the expected behaviour is going to be and compare with it
< sipa>
fuzzing is finding those input byte arrays, called seeds
< sipa>
it does that by applying random mutations to existing seeds
< sipa>
or trying new things
< sipa>
when people spend CPU time fuzzing, the output is finding more seeds, because it somehow found something that appears either more interesting than all seeds there were before, or because it does the same but with a smaller seed
< jonasschnelli>
so the seeds are actually not random-stream seeds used for pure fuzzing of a function (or API)? They decide what to call?
< sipa>
that's right, they are the exact input fed to the fuzz target
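(A toy illustration, in Python rather than Bitcoin Core's actual C++ harness, of what a fuzz target is: one function that takes a byte array and uses it to decide which operations to exercise, here checked against a trivial "simulator" as sipa describes. The deque/list pair is made up for the example.)

    from collections import deque

    def fuzz_target(data: bytes) -> None:
        real = deque()   # the structure under test (stand-in for real code)
        model = []       # a simple model of the expected behaviour
        for b in data:
            if b % 2 == 0:
                real.append(b)
                model.append(b)
            elif real:
                assert real.popleft() == model.pop(0)
        assert list(real) == model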
< sipa>
our CI infrastructure also runs these seeds as part of the tests, but without trying to generate more
< jonasschnelli>
you wrote "more interesting",.. does the fuzz target track to code paths (or percentage) beeing used?
< sipa>
yes
< sipa>
and more
< jonasschnelli>
I see
< sipa>
it has counters for various things: code blocks, branches, ...
< sipa>
if you run in -use_value_profile mode it counts specific values seen in comparisons too
< jonasschnelli>
so one can say there are "better" and "weaker" seeds?
< sipa>
it"s the combimed corpus that matters
< sipa>
but the goal is ultimately finding a small combined corpus with maximal coverage
< jonasschnelli>
the more counters a seed hits the better... but overall, you want all counters to be hit by a range of seeds?
< sipa>
right
< jonasschnelli>
thanks. TIL
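(For the "small combined corpus with maximal coverage" goal, libFuzzer's -merge=1 mode distills a corpus down to the inputs that add coverage; this is not discussed above, just a quick sketch using the same hypothetical paths.)

    import os
    import subprocess

    FUZZ_BIN = "src/test/fuzz/fuzz"   # assumed fuzz binary
    MERGED = "corpus_min/asmap"       # where the distilled corpus goes
    FULL = "corpus/asmap"             # existing full corpus

    os.makedirs(MERGED, exist_ok=True)
    # -merge=1 copies into MERGED only the inputs from FULL that add coverage.
    subprocess.run([FUZZ_BIN, "-merge=1", MERGED, FULL],
                   env={**os.environ, "FUZZ": "asmap"}, check=True)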
< sipa>
it's annoying that you need a separate build/configure for it
< sipa>
i have a separate worktree for that
< sipa>
it's fun to play with
< jonasschnelli>
Yes. That's the reason why I haven't looked closer into it. :)
< sipa>
just follow doc/fuzzing.md
< jonasschnelli>
will do
< sipa>
we have a test/fuzz/test_runner.py that lets you more easily do larger scale testing
< sipa>
but just running a fuzzer directly will probably be more informative
< sipa>
just compile with --with-sanitizers=fuzzer --enable-fuzz, and run e.g. FUZZ=asmap src/test/fuzz/fuzz
< jonasschnelli>
ok
< sipa>
it'll tell you when it finds new inputs, and show some metrics about coverage
< sipa>
you need to specify a dir name to let it save its seeds
< bitcoin-git>
[bitcoin] MarcoFalke opened pull request #20753: rpc: Allow to ignore specific policy reject reasons (master...2012-policyRpcIgnore) https://github.com/bitcoin/bitcoin/pull/20753
< luke-jr>
MarcoFalke: you know #7533 is still maintained, right? could just reopen that..
< gribble>
https://github.com/bitcoin/bitcoin/issues/7533 | RPC: sendrawtransaction: Allow the user to ignore/override specific rejections by luke-jr · Pull Request #7533 · bitcoin/bitcoin · GitHub
< MarcoFalke>
luke-jr: Sure, happy to reopen
< MarcoFalke>
It was tagged "up for grabs", so I implemented a minimal patch that still achieves a goal
< MarcoFalke>
you can rebase on top of mine, if you want
< luke-jr>
MarcoFalke: at this point, I guess it might be more productive if I just review yours and get that in first
< luke-jr>
just seemed strange to go about it that way *shrug*
< MarcoFalke>
Yeah, I wasn't aware that 7533 is still maintained
< luke-jr>
ah
< miketwen_>
looking for some clarity here.. the smartfee estimator is used by default for a bitcoin node, correct? if you are not synced with the blockchain it will use fallbackfee if enabled, correct? In terms of priority or the decision tree I'm wondering how this relates to paytxfee.. and then I would guess settxfee would follow the same logic as paytxfee but replace paytxfee?
< jonatack>
miketwenty1: yes iirc, see: ./src/bitcoin-cli help settxfee
< miketwenty1>
jonatack: there are many flags for targets / fees and such.. has anyone made a diagram or decision tree on what takes priority over what by any chance?
< miketwenty1>
in terms of what fee gets paid for a tx
< jonatack>
miketwenty1: i agree that it's a bit confusing. if you want to set an explicit fee rate, know that in 0.21 and master it is now possible to pass the fee rate (not absolute fee, but the fee rate) in sat/vB with all of the various send commands (fee_rate param, see the rpc help)
< jonatack>
miketwenty1: i've noted your suggestion to clarify this somewhere, thanks
< bitcoin-git>
[bitcoin] amitiuttarwar opened pull request #20755: [rpc] Remove deprecated fields from getpeerinfo (master...2020-12-getpeerinfo-deprecate) https://github.com/bitcoin/bitcoin/pull/20755
< jonatack>
miketwenty1: i've begun improving the settxfee help and (new rpc) setfeerate help in #20391 and will look at updating both helps with more info
< jonatack>
and will do the same for estimatesmartfee and the (new rpc) estimatefeerate helps (the new RPCs are in sat/vB)
< jonatack>
and work to make these RPCs more user-friendly and reassuring to use
< bitcoin-git>
[bitcoin] amitiuttarwar opened pull request #20756: [doc] Add missing field (permissions) to the getpeerinfo help (master...2020-12-getpeerinfo-permissions) https://github.com/bitcoin/bitcoin/pull/20756
< bitcoin-git>
[bitcoin] jonatack opened pull request #20757: doc: tor.md and -onlynet helpupdate -onlynet help in src/init.cpp (master...tor-md-doc-updates) https://github.com/bitcoin/bitcoin/pull/20757
< jonatack>
^ when github opens the pull before you finished typing the title
< bitcoin-git>
[bitcoin] ajtowns opened pull request #20758: net-processing refactoring -- lose globals, move implementation details from .h to .cpp (master...202012-netproc-refactor) https://github.com/bitcoin/bitcoin/pull/20758