#bitcoin-core-dev on 2021-04-07 — searchable irc log

02:26 < jamesob> wumpus: whenever is convenient, would you mind swapping my high-prio PR for #21523? I think it should be a pretty uncontroversial change

02:26 < gribble> https://github.com/bitcoin/bitcoin/issues/21523 | validation: run VerifyDB on all chainstates by jamesob · Pull Request #21523 · bitcoin/bitcoin · GitHub

02:27 < fanquake> jamesob: have done

02:27 < jamesob> fanquake: tyty

04:48 < bitcoin-git> [bitcoin] fanquake pushed 3 commits to master: https://github.com/bitcoin/bitcoin/compare/9be7fe484931...245a5cd5604a

04:48 < bitcoin-git> bitcoin/master 6965456 Andrew Chow: Introduce DeferringSignatureChecker and inherit with SignatureExtractor

04:48 < bitcoin-git> bitcoin/master a97a929 Andrew Chow: Test that signrawtx works when a signed CSV and CLTV inputs are present

04:48 < bitcoin-git> bitcoin/master 245a5cd fanquake: Merge #21166: Introduce DeferredSignatureChecker and have SignatureExtract...

04:48 < bitcoin-git> [bitcoin] fanquake merged pull request #21166: Introduce DeferredSignatureChecker and have SignatureExtractorClass subclass it (master...fix-sig-extractor-checker) https://github.com/bitcoin/bitcoin/pull/21166

05:25 < fanquake> Am I misremembering or did someone recently report seeing terrible performance with sqlite based wallets?

05:26 < sipa> i think phantomcircuit was complaining about that a while ago

05:27 < fanquake> I'm currently seeing a 100x performance difference in the wallet tests

05:28 < fanquake> 202536866us to run CreateWallet with sqlite vs 2037701us with bdb.

05:29 < sipa> that's no good

05:33 < bitcoin-git> [bitcoin] MarcoFalke merged pull request #21616: [0.21] build: link against -lsocket if required for *ifaddrs (0.21...backport_21486) https://github.com/bitcoin/bitcoin/pull/21616

05:34 < bitcoin-git> [bitcoin] MarcoFalke pushed 3 commits to master: https://github.com/bitcoin/bitcoin/compare/245a5cd5604a...41a8d2b96ff5

05:34 < bitcoin-git> bitcoin/master fa8fffe MarcoFalke: refactor: Prefer clean assert over UB in coinstats

05:34 < bitcoin-git> bitcoin/master fa9b74f MarcoFalke: Fix assumeutxo crash due to missing base_blockhash

05:34 < bitcoin-git> bitcoin/master 41a8d2b MarcoFalke: Merge #21582: Fix assumeutxo crash due to missing base_blockhash

05:35 < bitcoin-git> [bitcoin] MarcoFalke merged pull request #21582: Fix assumeutxo crash due to missing base_blockhash (master...2104-assumeutxoCrash01) https://github.com/bitcoin/bitcoin/pull/21582

06:01 < fanquake> Thought it might be some dumb macOS thing, but looks like it's happening on Linux too, although not as bad. Only 27x difference

06:18 < bitcoin-git> [bitcoin] fanquake pushed 3 commits to master: https://github.com/bitcoin/bitcoin/compare/41a8d2b96ff5...c0160ea52ea8

06:18 < bitcoin-git> bitcoin/master 9a36709 Sebastian Falbesoner: wallet: refactor: dedup sqlite statement preparations

06:18 < bitcoin-git> bitcoin/master ea19cc8 Sebastian Falbesoner: wallet: refactor: dedup sqlite statement deletions

06:18 < bitcoin-git> bitcoin/master c0160ea fanquake: Merge #21540: wallet: refactor: dedup sqlite statement preparations/deleti...

06:18 < bitcoin-git> [bitcoin] fanquake merged pull request #21540: wallet: refactor: dedup sqlite statement preparations/deletions (master...2021-wallet-dedup_setupsqlstatements) https://github.com/bitcoin/bitcoin/pull/21540

07:13 < bitcoin-git> [gui] hebasto opened pull request #274: [PoC] [do not merge]: Support runtime appearance adjustment on macOS (master...210407-dark-poc) https://github.com/bitcoin-core/gui/pull/274

08:41 < bitcoin-git> [bitcoin] MarcoFalke pushed 2 commits to master: https://github.com/bitcoin/bitcoin/compare/c0160ea52ea8...6154291cf9ab

08:41 < bitcoin-git> bitcoin/master 3333375 MarcoFalke: fuzz: Fix uninitialized read in test

08:41 < bitcoin-git> bitcoin/master 6154291 MarcoFalke: Merge #21617: fuzz: Fix uninitialized read in i2p test

08:41 < bitcoin-git> [bitcoin] MarcoFalke merged pull request #21617: fuzz: Fix uninitialized read in i2p test (master...2104-fuzzValgrind) https://github.com/bitcoin/bitcoin/pull/21617

08:48 < bitcoin-git> [bitcoin] fanquake opened pull request #21629: build: fix configuring when building depends with NO_BDB=1 (master...fixup_depends_sqlite_only_build) https://github.com/bitcoin/bitcoin/pull/21629

08:50 < bitcoin-git> [bitcoin] fanquake pushed 4 commits to master: https://github.com/bitcoin/bitcoin/compare/6154291cf9ab...2b3e5bf4c0dc

08:50 < bitcoin-git> bitcoin/master c6edcf1 fanquake: build: suppress libevent warnings if supressing external warnings

08:50 < bitcoin-git> bitcoin/master 3b0078f fanquake: doc: fixup -Wdocumentation issues

08:50 < bitcoin-git> bitcoin/master a4e970a fanquake: build: enable -Wdocumentation if suppressing external warnings

08:50 < bitcoin-git> [bitcoin] fanquake merged pull request #21613: build: enable -Wdocumentation (master...enable_wdocumentation) https://github.com/bitcoin/bitcoin/pull/21613

08:54 < bitcoin-git> [bitcoin] MarcoFalke pushed 2 commits to master: https://github.com/bitcoin/bitcoin/compare/2b3e5bf4c0dc...aa69471ecd55

08:54 < bitcoin-git> bitcoin/master 937fd4a Russell Yanofsky: Fix wrong wallet RPC context set after #21366

08:54 < bitcoin-git> bitcoin/master aa69471 MarcoFalke: Merge #21572: Fix wrong wallet RPC context set after #21366

08:55 < bitcoin-git> [bitcoin] MarcoFalke merged pull request #21572: Fix wrong wallet RPC context set after #21366 (master...pr/fixref) https://github.com/bitcoin/bitcoin/pull/21572

09:12 < bitcoin-git> [bitcoin] vasild opened pull request #21630: fuzz: split FuzzedSock interface and implementation (master...FuzzedSock_move) https://github.com/bitcoin/bitcoin/pull/21630

12:19 < bitcoin-git> [bitcoin] vasild opened pull request #21631: i2p: always check the return value of Sock::Wait() (master...SockWait_usage_fix) https://github.com/bitcoin/bitcoin/pull/21631

13:30 < bitcoin-git> [bitcoin] luke-jr reopened pull request #19573: Replace unused BIP 9 logic with draft BIP 8 (master...bip8) https://github.com/bitcoin/bitcoin/pull/19573

13:36 < bitcoin-git> [bitcoin] fanquake opened pull request #21633: refactor: add [[noreturn]] attribute where applicable (master...build_with_noreturn) https://github.com/bitcoin/bitcoin/pull/21633

15:27 < wumpus> fanquake: that sounds bad, i wonder what costs so much time, it's not like we're using any of sqlite's advanced features

15:35 < jeremyrubin> is this about fuzzing timeouts?

16:56 < bitcoin-git> [bitcoin] laanwj pushed 4 commits to master: https://github.com/bitcoin/bitcoin/compare/aa69471ecd55...cb79cabdd9d9

16:56 < bitcoin-git> bitcoin/master 3bb6e7b Jon Atack: rpc: add network field to rpc getnodeaddresses

16:56 < bitcoin-git> bitcoin/master 1b91898 Jon Atack: rpc: simplify/constify getnodeaddresses code

16:56 < bitcoin-git> bitcoin/master 5c44678 Jon Atack: rpc: improve getnodeaddresses help

16:57 < bitcoin-git> [bitcoin] laanwj merged pull request #21594: rpc: add network field to getnodeaddresses (master...getnodeaddresses-network) https://github.com/bitcoin/bitcoin/pull/21594

17:47 < wumpus> jeremyrubin: from what i understand, it is about simply running the wallet tests with sqlite, don't think fuzzing is involved here

20:02 < achow101> maybe it's a performance regression in some version? I don't see this issue on my system with sqlite 3.35.4

20:08 < sipa> achow101: perhaps it is related to disk i/o speed?

20:08 < achow101> I'm thinking it is related to that

20:08 < achow101> I just tried 3.34.1 and did not see the same performance issue

20:09 < sipa> if it's doing lots of fsyncs, it may be super slow on network-connected spinning disk storage, but barely noticeable on fast nvme ssd

20:09 < wumpus> i remember some issues with fsyncing being really slow when used inside VMs

20:09 < wumpus> yes that

20:09 < wumpus> maybe it's doing a lot of write transactions separately

20:09 < achow101> well the test datadir is /tmp, which IIRC is a ramdisk

20:09 < achow101> *is in /tmp

20:09 < sipa> not on my system

20:09 < achow101> so disk shouldn't matter there?

20:10 < achow101> hmm

20:10 < sipa> hmm

20:10 < wumpus> /tmp is definitely not a ramdisk on all linux distros

20:10 < sipa> as soon as descriptor wallets were merged, the parallelism i could run the tests with dramatically reduced

20:10 < sipa> went from 60 without problems to 10 or so

20:11 < sipa> and i often still see a few time out that i need to rerun manually

20:11 < achow101> I will try this on a slower drive

20:12 < sipa> it's always descriptor wallet tests that time out for me

20:13 < achow101> being disk i/o bound does make sense to me

20:13 < sipa> (not all of them, and not deterministically, so i never thought too much about it)

20:14 < phantomcircuit> sipa, the performance issue i saw wasn't in a vm btw

20:14 < phantomcircuit> wait actually it was im dumb

20:15 < sipa> so if this is related, i don't think it's a regression

20:15 < achow101> how do I make test_bitcoin use a different datadir?

20:15 < sipa> chroot? :p

20:16 < sipa> or just mount another fs over /tmp

20:17 < sipa> mkdir /non-tmpfs-tmp && sudo mount --bind /non-tmpfs-tmp /tmp

20:18 < sipa> achow101: oh it's the functional tests that time out for me; unit tests are never a problem

20:18 < achow101> I have other things appear to be using tmp, so I'd rather not mount over it. I think I'll just modify the code

20:19 < wumpus> or pass a different TMPDIR

20:19 < achow101> it seems like the unit tests can reliably show the performance issue, so I'll go with that

20:19 < wumpus> (assuming nothing is hardcoding /tmp which would be bad)

20:21 < wumpus> alternatively a chroot with everything bind-mounted except tmp, but that seems overkill in this case

20:21 < achow101> hard coding the path was easier

20:21 < wumpus> i think setting TMPDIR should work i had to do it once for a system with only a small /tmp partition
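
[editor's note] A minimal sketch of the TMPDIR idea wumpus describes, written in Python rather than against the C++ test binary. It only shows that a program which asks its standard library for the temp directory will follow TMPDIR; whether test_bitcoin behaves the same way is an assumption here, not something the log confirms. The helper name `tempdir_under` and the scratch directory prefix are hypothetical.

```python
import os
import tempfile


def tempdir_under(tmpdir_value):
    """Return the temp directory Python resolves when TMPDIR is set to tmpdir_value."""
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = tmpdir_value
        # gettempdir() caches its answer, so clear the cache to force a re-read of TMPDIR.
        tempfile.tempdir = None
        return tempfile.gettempdir()
    finally:
        # Restore the previous environment and cache so callers are not affected.
        if old_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = old_env
        tempfile.tempdir = old_cached


if __name__ == "__main__":
    # Stand-in for a directory on a real disk (hypothetical location).
    scratch = tempfile.mkdtemp(prefix="non-tmpfs-tmp-")
    print(tempdir_under(scratch))
```

The same effect from a shell would be setting TMPDIR in the environment of the process being tested; anything that hardcodes /tmp would ignore it, as wumpus notes.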

20:22 < achow101> definitely appears to be disk io

20:23 < achow101> 100% disk utilization and the CreateWallet test is now taking a very long time

20:23 < achow101> we may be using sqlite suboptimally

20:26 < jonatack> "it's always descriptor wallet tests that time out for me" --> same (apart from other timeouts/races)

20:27 < sipa> achow101: i can give you ssh access on my system if that helps, but it sounds like you're already able to reproduce?

20:27 < achow101> yes, I believe I have reproduced it

20:40 < jeremyrubin> can you just pass a ramdisk in for where to create the DB for testing lol

20:41 < jeremyrubin> ah i guess this is already discussed above my b

20:41 < achow101> the functional tests let you specify the tempdir the datadirs are made in

21:10 < wumpus> that does require a lot of memory though, the bitcoin functional tests (especially with high parallelism) tend to create many files

21:24 < sipa> with tmpfs /tmp, functional tests at -j60: 126 s runtime, all tests pass

21:29 < phantomcircuit> sipa, through the magic of fsync() {}

21:33 < sipa> with normal /tmp, the same takes 488 s, and 3 tests fail

21:33 < sipa> (this is a 16-core/32-thread system with 32G RAM)

21:33 < phantomcircuit> there's a bunch of functional tests that randomly timeout iirc

21:37 < achow101> We are currently enabling fullfsync (someone suggested we do this to ensure that everything is truly flushed), but the sqlite docs say this: "But the implementation of fullfsync involves resetting the disk controller. And so not only is it profoundly slow, it also slows down other unrelated disk I/O. So its use is not recommended."

21:38 < achow101> but that would only affect macs

21:42 < achow101> I would guess that the slow down is because sqlite does one (or two) fsyncs per write

21:43 < luke-jr> doesn't fsync on Linux also wait for everything else writing to get to disk too?

21:43 < luke-jr> or at least things queued before you

21:47 < phantomcircuit> achow101, performance sensitive sqlite requires using the write ahead log

21:47 < achow101> and a write ahead log is why we don't want bdb

21:48 < phantomcircuit> luke-jr, yes, fsync and fdatasync are for the entire filesystem despite being called on a file or directory

21:50 < phantomcircuit> achow101, which journal mode are we using?

21:50 < achow101> rollback

21:51 < phantomcircuit> achow101, you mean DELETE ?

21:52 < achow101> we use the rollback journal in PERSIST mode

21:52 < achow101> PERSIST is because that's what it does when locking_mode=EXCLUSIVE

21:53 < phantomcircuit> achow101, i guess the real issue is that there are basically two categories of 'written' we care about, key material and not-key material

21:54 < phantomcircuit> i don't think PERSIST requires using synchronous=FULL, it can be NORMAL instead

21:54 < sipa> if it's just creation that's painful, these paranoid modes could also just be enabled after creation

21:54 < phantomcircuit> there's also these flags https://sqlite.org/c3ref/c_sync_dataonly.html

21:55 < phantomcircuit> which are separate from the journal_mode and the synchronous pragmas
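
[editor's note] For readers unfamiliar with the pragmas being discussed, here is a minimal sketch using Python's built-in sqlite3 module. It sets the combination achow101 describes (rollback journal in PERSIST mode with locking_mode=EXCLUSIVE) and lets the caller choose between synchronous=FULL and NORMAL. This is not Bitcoin Core's wallet code; the function names and the file name are hypothetical.

```python
import os
import sqlite3
import tempfile


def open_with_settings(path, synchronous="FULL"):
    """Open a SQLite file with a PERSIST rollback journal and exclusive locking."""
    if synchronous not in ("FULL", "NORMAL"):
        raise ValueError("synchronous must be FULL or NORMAL")
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA locking_mode=EXCLUSIVE")
    conn.execute("PRAGMA journal_mode=PERSIST")
    conn.execute(f"PRAGMA synchronous={synchronous}")
    return conn


def current_settings(conn):
    """Read the pragmas back; synchronous is reported as an integer (1=NORMAL, 2=FULL)."""
    return {
        "locking_mode": conn.execute("PRAGMA locking_mode").fetchone()[0],
        "journal_mode": conn.execute("PRAGMA journal_mode").fetchone()[0],
        "synchronous": conn.execute("PRAGMA synchronous").fetchone()[0],
    }


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "example.sqlite")
    conn = open_with_settings(db_path, synchronous="NORMAL")
    print(current_settings(conn))
    conn.close()
```

Along the lines of sipa's suggestion, the stricter setting could in principle be applied only after the initial bulk of writes, by issuing the synchronous pragma again on the same connection once setup is done.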

21:55 < sipa> but i suspect it's a more general sign of a i/o boundness issue

21:55 < luke-jr> achow101: well, corruption is the reason we don't want bdb… is sqlite's WAL just as prone to loss?

21:56 < phantomcircuit> sipa, everywhere that key material is written could be sandwiched in a bunch of flushing logic, but also that's kind of asking for missing that somewhere

21:56 < phantomcircuit> luke-jr, if you lose the WAL you lose what was written, but don't corrupt the database

21:56 < achow101> luke-jr: WAL (in general) means we have to do the thing where we constantly flush the log back to the db file. this adds additional overhead, another thread, and is just kind of a pain. It's one of the things bdb does that results in a lot of complexity in that code

21:57 < achow101> sipa: I think if we did all of setup as a single transaction a large part of this would go away.

21:58 < achow101> and doing setup as a single transaction is probably safe

21:58 < luke-jr> achow101: but so long as it doesn't corrupt, it's tolerable IMO

21:58 < phantomcircuit> yes sqlite performance is almost entirely based on commits

21:59 < phantomcircuit> a thousand writes with one commit is going to be faster than hundreds of writes with 10 commits
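
[editor's note] The point about commits can be tried out with a small sketch in Python's built-in sqlite3 module. It writes the same number of rows either with one commit per row or with a single commit, under synchronous=FULL, and reports the wall-clock time of each. The table name `records`, the row contents, and the function names are hypothetical, and the timings depend entirely on the underlying storage (on tmpfs the gap may be small).

```python
import os
import sqlite3
import tempfile
import time


def timed_inserts(path, n_rows, rows_per_commit):
    """Insert n_rows records, committing every rows_per_commit rows; return seconds taken."""
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled by the explicit BEGIN/COMMIT below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA synchronous=FULL")
    conn.execute("CREATE TABLE IF NOT EXISTS records(key BLOB PRIMARY KEY, value BLOB)")
    start = time.perf_counter()
    for first in range(0, n_rows, rows_per_commit):
        conn.execute("BEGIN")
        for i in range(first, min(first + rows_per_commit, n_rows)):
            conn.execute(
                "INSERT INTO records VALUES (?, ?)",
                (i.to_bytes(4, "big"), b"x" * 32),
            )
        conn.execute("COMMIT")  # each COMMIT is where the journal and db file get synced
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


def count_rows(path):
    """Return how many rows ended up in the records table."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    per_row = timed_inserts(os.path.join(workdir, "per_row.sqlite"), 1000, 1)
    batched = timed_inserts(os.path.join(workdir, "batched.sqlite"), 1000, 1000)
    print(f"1000 commits: {per_row:.3f}s, 1 commit: {batched:.3f}s")
```

Both runs store identical data; only the number of commits differs, which is the variable phantomcircuit is pointing at.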

21:59 < achow101> luke-jr: WAL has the risk of losing data if that log is lost, and I think that isn't tolerable for a completely new db system

22:00 < phantomcircuit> achow101, wrapping the entire setup in a transaction has the issue of making the logic tied more closely to the data layer so

22:01 < phantomcircuit> iono options etc

22:02 < sipa> alright; is that hard to do?

22:03 < achow101> shouldn't be hard to transaction-ize large portions of setup

22:03 < achow101> especially the parts that write a few thousand records

22:04 < luke-jr> won't break abstractions?

22:04 < achow101> no, WalletBatch already has transaction create and commit

22:05 < achow101> although now that I think about it, descriptor wallets don't have a whole lot of records to write

22:05 < achow101> unless the test is making a legacy wallet with sqlite

22:06 < phantomcircuit> do they not pregenerate a bunch of keys?

22:06 < achow101> they're generated on the fly and kept in memory only

22:07 < sipa> what?

22:07 < sipa> the key cache isn't written?

22:07 < achow101> Only the parent xpub is written

22:07 < sipa> oh, right

22:07 < sipa> that's nice

22:07 < sipa> is that recent?

22:07 < achow101> that was in the original

22:08 < sipa> interesting

22:08 < sipa> what if you have a blabla/*h descriptor?

22:09 < achow101> then it stores all of the derived keys

22:09 < sipa> i see, ok

22:09 < achow101> but you have to import such a descriptor explicitly, so the CreateWallet test shouldn't be doing that

22:09 < sipa> right

22:11 < sipa> still, a 1000x slowdown... kind of suggests it's more than a few records?

22:11 < achow101> ... it's making a legacy wallet with sqlite

22:12 < achow101> sipa: which descriptor wallet functional tests timeout?

22:12 < achow101> the functional tests shouldn't be making legacy wallets with sqlite

22:18 < sipa> ha.

22:20 < sipa> rpc_fundrawtransaction.py --descriptors | ✖ Failed | 488 s

22:20 < sipa> wallet_create_tx.py --descriptors | ✖ Failed | 94 s

22:20 < sipa> wallet_keypool.py --descriptors | ✖ Failed | 116 s

22:21 < sipa> what about phantomcircuit's slow test?

22:22 < phantomcircuit> i was definitely using a descriptor wallet

22:23 < phantomcircuit> yeah format sqlite descriptors true

22:25 < achow101> phantomcircuit: what were you testing specifically that was slow?

22:26 < phantomcircuit> achow101, just creating the wallet took a very long time

22:27 < achow101> ok..

22:56 < achow101> sipa: do those timeout on a createwallet call or something else?

22:56 < achow101> phantomcircuit: were you using an older version of the descriptor wallet pr? there was an iteration that wrote a few thousand records.

23:06 < achow101> it seems like functional test timeout might be from writing all of the transactions that generatetoaddress produces

23:13 < wumpus> if it's only a performance problem in the tests, what about disabling the fsyncing in the tests?

23:14 < wumpus> i'm not sure there is any real life scenario where the wallet would generate so many writes

23:14 < wumpus> it's not like lightning: https://github.com/lightningnetwork/lnd/issues/5186

23:16 < achow101> wumpus: this could be an issue for wallets that receive a lot of transactions

23:17 < achow101> if they are all in one block, it can be optimized to use a db transaction there, but if a lot of unconfirmed txs, that might be a problem

23:17 < wumpus> nah, i doubt any single wallet will receive enough on-network transactions for that to really add up enough