< jamesob> wumpus: whenever is convenient, would you mind swapping my high-prio PR for #21523? I think it should be a pretty uncontroversial change
< gribble> https://github.com/bitcoin/bitcoin/issues/21523 | validation: run VerifyDB on all chainstates by jamesob · Pull Request #21523 · bitcoin/bitcoin · GitHub
< fanquake> jamesob: have done
< jamesob> fanquake: tyty
< bitcoin-git> [bitcoin] fanquake pushed 3 commits to master: https://github.com/bitcoin/bitcoin/compare/9be7fe484931...245a5cd5604a
< bitcoin-git> bitcoin/master 6965456 Andrew Chow: Introduce DeferringSignatureChecker and inherit with SignatureExtractor
< bitcoin-git> bitcoin/master a97a929 Andrew Chow: Test that signrawtx works when a signed CSV and CLTV inputs are present
< bitcoin-git> bitcoin/master 245a5cd fanquake: Merge #21166: Introduce DeferredSignatureChecker and have SignatureExtract...
< bitcoin-git> [bitcoin] fanquake merged pull request #21166: Introduce DeferredSignatureChecker and have SignatureExtractorClass subclass it (master...fix-sig-extractor-checker) https://github.com/bitcoin/bitcoin/pull/21166
< fanquake> Am I misremembering or did someone recently report seeing terrible performance with sqlite based wallets?
< sipa> i think phantomcircuit was complaining about that a while ago
< fanquake> I'm currently seeing a 100x performance difference in the wallet tests
< fanquake> 202536866us to run CreateWallet with sqlite vs 2037701us with bdb.
< sipa> that's no good
< bitcoin-git> [bitcoin] MarcoFalke merged pull request #21616: [0.21] build: link against -lsocket if required for *ifaddrs (0.21...backport_21486) https://github.com/bitcoin/bitcoin/pull/21616
< bitcoin-git> [bitcoin] MarcoFalke pushed 3 commits to master: https://github.com/bitcoin/bitcoin/compare/245a5cd5604a...41a8d2b96ff5
< bitcoin-git> bitcoin/master fa8fffe MarcoFalke: refactor: Prefer clean assert over UB in coinstats
< bitcoin-git> bitcoin/master fa9b74f MarcoFalke: Fix assumeutxo crash due to missing base_blockhash
< bitcoin-git> bitcoin/master 41a8d2b MarcoFalke: Merge #21582: Fix assumeutxo crash due to missing base_blockhash
< bitcoin-git> [bitcoin] MarcoFalke merged pull request #21582: Fix assumeutxo crash due to missing base_blockhash (master...2104-assumeutxoCrash01) https://github.com/bitcoin/bitcoin/pull/21582
< fanquake> Thought it might be some dumb macOS thing, but looks like it's happening on Linux too, althought not as bad. Only 27x difference
< bitcoin-git> [bitcoin] fanquake pushed 3 commits to master: https://github.com/bitcoin/bitcoin/compare/41a8d2b96ff5...c0160ea52ea8
< bitcoin-git> bitcoin/master 9a36709 Sebastian Falbesoner: wallet: refactor: dedup sqlite statement preparations
< bitcoin-git> bitcoin/master ea19cc8 Sebastian Falbesoner: wallet: refactor: dedup sqlite statement deletions
< bitcoin-git> bitcoin/master c0160ea fanquake: Merge #21540: wallet: refactor: dedup sqlite statement preparations/deleti...
< bitcoin-git> [bitcoin] fanquake merged pull request #21540: wallet: refactor: dedup sqlite statement preparations/deletions (master...2021-wallet-dedup_setupsqlstatements) https://github.com/bitcoin/bitcoin/pull/21540
< bitcoin-git> [gui] hebasto opened pull request #274: [PoC] [do not merge]: Support runtime appearance adjustment on macOS (master...210407-dark-poc) https://github.com/bitcoin-core/gui/pull/274
< bitcoin-git> [bitcoin] MarcoFalke pushed 2 commits to master: https://github.com/bitcoin/bitcoin/compare/c0160ea52ea8...6154291cf9ab
< bitcoin-git> bitcoin/master 3333375 MarcoFalke: fuzz: Fix uninitialized read in test
< bitcoin-git> bitcoin/master 6154291 MarcoFalke: Merge #21617: fuzz: Fix uninitialized read in i2p test
< bitcoin-git> [bitcoin] MarcoFalke merged pull request #21617: fuzz: Fix uninitialized read in i2p test (master...2104-fuzzValgrind) https://github.com/bitcoin/bitcoin/pull/21617
< bitcoin-git> [bitcoin] fanquake opened pull request #21629: build: fix configuring when building depends with NO_BDB=1 (master...fixup_depends_sqlite_only_build) https://github.com/bitcoin/bitcoin/pull/21629
< bitcoin-git> [bitcoin] fanquake pushed 4 commits to master: https://github.com/bitcoin/bitcoin/compare/6154291cf9ab...2b3e5bf4c0dc
< bitcoin-git> bitcoin/master c6edcf1 fanquake: build: suppress libevent warnings if supressing external warnings
< bitcoin-git> bitcoin/master 3b0078f fanquake: doc: fixup -Wdocumentation issues
< bitcoin-git> bitcoin/master a4e970a fanquake: build: enable -Wdocumentation if suppressing external warnings
< bitcoin-git> [bitcoin] fanquake merged pull request #21613: build: enable -Wdocumentation (master...enable_wdocumentation) https://github.com/bitcoin/bitcoin/pull/21613
< bitcoin-git> [bitcoin] MarcoFalke pushed 2 commits to master: https://github.com/bitcoin/bitcoin/compare/2b3e5bf4c0dc...aa69471ecd55
< bitcoin-git> bitcoin/master 937fd4a Russell Yanofsky: Fix wrong wallet RPC context set after #21366
< bitcoin-git> bitcoin/master aa69471 MarcoFalke: Merge #21572: Fix wrong wallet RPC context set after #21366
< bitcoin-git> [bitcoin] MarcoFalke merged pull request #21572: Fix wrong wallet RPC context set after #21366 (master...pr/fixref) https://github.com/bitcoin/bitcoin/pull/21572
< bitcoin-git> [bitcoin] vasild opened pull request #21630: fuzz: split FuzzedSock interface and implementation (master...FuzzedSock_move) https://github.com/bitcoin/bitcoin/pull/21630
< bitcoin-git> [bitcoin] vasild opened pull request #21631: i2p: always check the return value of Sock::Wait() (master...SockWait_usage_fix) https://github.com/bitcoin/bitcoin/pull/21631
< bitcoin-git> [bitcoin] luke-jr reopened pull request #19573: Replace unused BIP 9 logic with draft BIP 8 (master...bip8) https://github.com/bitcoin/bitcoin/pull/19573
< bitcoin-git> [bitcoin] fanquake opened pull request #21633: refactor: add [[noreturn]] attribute where applicable (master...build_with_noreturn) https://github.com/bitcoin/bitcoin/pull/21633
< wumpus> fanquake: that sounds bad, i wonder what costs so much time, it's not like we're using any of sqlite's advanced features
< jeremyrubin> is this about fuzzing timeouts?
< bitcoin-git> [bitcoin] laanwj pushed 4 commits to master: https://github.com/bitcoin/bitcoin/compare/aa69471ecd55...cb79cabdd9d9
< bitcoin-git> bitcoin/master 3bb6e7b Jon Atack: rpc: add network field to rpc getnodeaddresses
< bitcoin-git> bitcoin/master 1b91898 Jon Atack: rpc: simplify/constify getnodeaddresses code
< bitcoin-git> bitcoin/master 5c44678 Jon Atack: rpc: improve getnodeaddresses help
< bitcoin-git> [bitcoin] laanwj merged pull request #21594: rpc: add network field to getnodeaddresses (master...getnodeaddresses-network) https://github.com/bitcoin/bitcoin/pull/21594
< wumpus> jeremyrubin: from what i understand, it is about simply running the wallet tests with sqlite, don't think fuzzing is involved here
< achow101> maybe it's a performance regression in some version? I don't see this issue on my syste with sqlite 3.35.4
< sipa> achow101: perhaps it is related to disk i/o speed?
< achow101> I'm thinking it is related to that
< achow101> I just tried 3.34.1 and did not see the same performance issue
< sipa> if it"s doing lots of fsyncs, it may be super slow on network-connected spinning disk storage, but barely noticable on fast nvme ssd
< wumpus> i remember some issues with fsyncing being really slow when used inside VMs
< wumpus> yes that
< wumpus> maybe it's doing a lot of write transactions separately
< achow101> well the test datadir is /tmp, which IIRC is a ramdisk
< achow101> *is in /tmp
< sipa> not on my system
< achow101> so disk shouldn't matter there?
< achow101> hmm
< sipa> hmm
< wumpus> /tmp is definitely not a ramdisk on all linux distros
< sipa> as soon as descriotor wallets were merged, the parallellism i could run the tests with dramatically reduced
< sipa> went from 60 without problems to 10 or so
< sipa> and i often still see a few time out that i need to rerun manually
< achow101> I will try this on a slower drive
< sipa> it's always descriptor wallet tests that time out for me
< achow101> being disk i/o bound does make sense to me
< sipa> (not all of them, and not deterministically, so i nevrr thought too much about it)
< phantomcircuit> sipa, the performance issue i saw wasn't in a vm btw
< phantomcircuit> wait actually it was im dumb
< sipa> so if this is related, i don't think it's a regression
< achow101> how do I make test_bitcoin use a different datadir?
< sipa> chroot? :p
< sipa> or just moujt another fs over /tmp
< sipa> mkdir /non-tmpfs-tmp && sudo mount --bind /non-tmpfs-tmp /tmp
< sipa> achow101: oh it's the functional tests that time out for m; unit tests are never a problrm
< achow101> I have other things appear to be using tmp, so I'd rather not mount over it. I think I'll just modify the code
< wumpus> or pass a different TMPDIR
< achow101> it seems like the the unit tests can reliably show the performance issue, so I'll go with that
< wumpus> (assuming nothing is hardcoding /tmp which would be bad)
< wumpus> alternatively a chroot with everything bind-mounted execept tmp, but that seems overkill in this case
< achow101> hard coding the path was easier
< wumpus> i think setting TMPDIR should work i had to do it once for a system with only a small /tmp partition
< achow101> definitely appears to be disk io
< achow101> 100% disk utilization and the CreateWallet test is now taking a very long time
< achow101> we may be using sqlite subotimally
< jonatack> "it's always descriptor wallet tests that time out for me" --> same (apart from other timeouts/races)
< sipa> achow101: i can give you ssh access on my system if that helps, but it sounds like you're already able to reproduce?
< achow101> yes, I believe I have reproduced it
< jeremyrubin> can you just pass a ramdisk in for where to create the DB for testing lol
< jeremyrubin> ah i guess this is already discussed above my b
< achow101> the functional tests let you specify the tempdir the datadirs are made in
< wumpus> that does require a lot of memory though, the bitcoin functional tests (especially with high parallelism) tend to create many files
< sipa> with tmpfs /tmp, functional tests at -j60: 126 s runtime, all tests pass
< phantomcircuit> sipa, through the magic of fsync() {}
< sipa> with normal /tmp, the same takes 488 s, and 3 tests fail
< sipa> (this is a 16-core/32-thread system with 32G RAM)
< phantomcircuit> there's a bunch of functional tests that randomly timeout iirc
< achow101> We are currently enabling fullfsync (someone suggested we do this to ensure that everything is truly flushed), but the sqlite docs says this "But the implementation of fullfsync involves resetting the disk controller. And so not only is it profoundly slow, it also slows down other unrelated disk I/O. So its use is not recommended."
< achow101> but that would only affect macs
< achow101> I would guess that the slow down is because sqlite does an (or two) fsync per write
< luke-jr> doesn't fsync on Linux also wait for everything else writing to get to disk too?
< luke-jr> or at least things queued before you
< phantomcircuit> achow101, performance sensitive sqlite requires using the write ahead log
< achow101> and a write ahead log is why we don't want bdb
< phantomcircuit> luke-jr, yes, fsync and fdatasync are for the entire filesystem despite being called on a file or directory
< phantomcircuit> achow101, which journal mode are we using?
< achow101> rollback
< phantomcircuit> achow101, you mean DELETE ?
< achow101> we use the rollback journal in PERSIST mode
< achow101> PERSIST is because that's what it does when locking_mode=EXCLUSIVE
< phantomcircuit> achow101, i guess the real issue is that there basically two categories of 'written' we care about, key material and not-key material
< phantomcircuit> i don't think with PERSIST requires using synchronous=FULL, but can be NORMAL instead
< sipa> if it's just creation that's painful, these paranoid modes could also just be enabled after creation
< phantomcircuit> there's also these flags https://sqlite.org/c3ref/c_sync_dataonly.html
< phantomcircuit> which are separate from the journal_mode and the synchronous pragmas
< sipa> but i suspect it's a more general sign of a i/o boundness issue
< luke-jr> achow101: well, corruption is the reason we don't want bdb… is sqlite's WAL just as prone to loss?
< phantomcircuit> sipa, everywhere that key material is written could be sandwiched in a bunch of flushing logic, but also that's kinda of asking for missing that somewhere
< phantomcircuit> luke-jr, if you lose the WAL you lose what was written, but don't corrupt the database
< achow101> luke-jr: WAL (in general) means we have to do the thing where we constantly flush the log back to the db file. this adds additional overhead, another thread, and is just kind of a pain. It's one of the things bdb does that results in a lot of complexity in that code
< achow101> sipa: I think if we did all of setup as a single transaction a large part of this would go away.
< achow101> and doing setup as a single transaction is probably safe
< luke-jr> achow101: but so long as it doesn't corrupt, it's tolerable IMO
< phantomcircuit> yes sqlite performance is almost entirely based on commits
< phantomcircuit> a thousand writes with one commit is going to be faster than hundreds of writes with 10 commits
< achow101> luke-jr: WAL has the risk of losing data if that log is lost, and I think that isn't tolerable for a completely new db system
< phantomcircuit> achow101, wrapping the entire setup in a transaction has the issue of making the logic tied more closely to the data layer so
< phantomcircuit> iono options etc
< sipa> aright; is that hard to do?
< achow101> shouldn't be hard to transaction-ize large portions of setup
< achow101> especially the parts that write a few thousand records
< luke-jr> won't break abstractions?
< achow101> no, WalletBatch already has transaction create and commit
< achow101> although now that I think about it, descriptor wallets don't have a whole lot of records to write
< achow101> unless the test is making a legacy wallet with sqlite
< phantomcircuit> do they not pregenerate a bunch of keys?
< achow101> they're generated on the fly and kept in memory only
< sipa> what?
< sipa> the key cache isn't written?
< achow101> Only the parent xpub is written
< sipa> oh, right
< sipa> that's nice
< sipa> is that recent?
< achow101> that was in the original
< sipa> interesting
< sipa> what if you have a blabla/*h descriptor?
< achow101> then it stores all of the derived keys
< sipa> i see, ok
< achow101> but you have to import such a descriptor explicitly, so the CreateWallet test shouldn't be doing that
< sipa> right
< sipa> still, a 1000x slowdown... kind of suggests it's more than a few records?
< achow101> ... it's making a legacy wallet with sqlite
< achow101> sipa: which descriptor wallet functional tests timeout?
< achow101> the functional tests shouldn't be making legacy wallets with sqlite
< sipa> ha.
< sipa> rpc_fundrawtransaction.py --descriptors | ✖ Failed | 488 s
< sipa> wallet_create_tx.py --descriptors | ✖ Failed | 94 s
< sipa> wallet_keypool.py --descriptors | ✖ Failed | 116 s
< sipa> what about phantomcircuit's slow test?
< phantomcircuit> i was definitely using a descriptor wallet
< phantomcircuit> yeah format sqlite descriptors true
< achow101> phantomcircuit: what were you testing specifcally that was slow?
< phantomcircuit> achow101, just creating the wallet took a very long time
< achow101> ok..
< achow101> sipa: do those timeout on a createwallet call or something else?
< achow101> phantomcircuit: were you using an older version of the descriptor wallet pr? there was an iteration that wrote a few thousand records.
< achow101> it seems like functional test timeout might be from writing all of the transactions that generatetoaddress produces
< wumpus> if it's only a performance problem in the tests, what about disabling the fsyncing in the tests?
< wumpus> i'm not sure there is any real life scenario where the wallet would generate so many writes
< wumpus> it's not like lightning: https://github.com/lightningnetwork/lnd/issues/5186
< achow101> wumpus: this could be an issue for wallets that receive a lot of transactions
< achow101> if they are all in one block, it can be optimized to use a db transaction there, but if a lot of unconfirmed txs, that might be a problem
< wumpus> nah, i doubt any single wallet will receive enough on-network transactions for that to really add up enough