<stickies-v>
looks like corebot needs even more of a tuneup than ibd
<l0rinc>
andrewtoth rebased #31132 after the above changes and added new test cases. We also had our first Umbrel measurement: a full sync finished in 6.5 hours (31% faster than master) B-)
<corebot>
l0rinc: Error: That URL raised <HTTP Error 503: Service Unavailable>
<l0rinc>
stickies-v lol, indeed
<cfields>
Nice!
<l0rinc>
We are also making good progress reviewing #33657. Roman is very responsive, and once the remaining Windows test inconsistencies are fixed, I expect it to be ready for review (and merge) soon.
<kevkevin>
#31132
<corebot>
l0rinc: Error: That URL raised <Connection timed out.>
<dergoegge>
what are the specs on the umbrel machine?
<corebot>
l0rinc: Error: That URL raised <Connection timed out.>
<l0rinc>
Rob pushed a pruned prototype of SwiftSync in #34004. My understanding is that it is meant as a first iteration, not a final implementation; high-level reviews and benchmarks would be interesting.
<l0rinc>
Now that we have stable access to a few benchmarking servers, we noticed some memory anomalies: a -reindex-chainstate with the default 450 MB -dbcache was running 4x slower on an rpi5 with 8 GB of memory than on an rpi5 with 16 GB of memory.
<l0rinc>
We also observed that on the 16 GB system, runs with -dbcache values of 4 GB and higher were a lot slower than with -dbcache of 3 GB, and that an rpi5 with 16 GB of memory ran out of memory with -dbcache of 10 GB.
<l0rinc>
We are still investigating these, but so far it seems:
<l0rinc>
* heavy memory fragmentation overhead may play a role in the premature OOMs. We are benchmarking lowering `M_ARENA_MAX` on 64-bit systems (as is already done on 32-bit architectures) to see if that helps in lower-memory environments (for example, an rpi4 with 2 GB of memory, which currently IBDs in about 3 weeks); see the sketch after this list
<l0rinc>
* a high -dbcache value may crowd out the OS page cache (blocks + UTXO set), especially when the UTXO set size approaches the memory limit, which could be the cause of the reported sudden slowdown around block 800k (more reads from disk when the files cannot be cached in memory)
<stickies-v>
not sure if rob is here, but might be worth opening an issue for conceptual discussion of swiftsync?
<Novo>
I'll let Rob know
<l0rinc>
* SSDs (and HDDs with certain filesystems) can experience severe performance degradation when they are nearly full, which, paired with more frequent disk access, dramatically slows down validation for existing 1 TB drives
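As context for the `M_ARENA_MAX` experiment mentioned above: Bitcoin Core already caps glibc's malloc arena count at startup on 32-bit systems, and the benchmark described would extend the same cap to 64-bit builds. A minimal sketch under that assumption (the helper name and the unconditional cap are illustrative, not the actual patch; `HAVE_MALLOPT_ARENA_MAX` is the build check Bitcoin Core uses for this):

```cpp
#include <malloc.h> // glibc: mallopt(), M_ARENA_MAX

// glibc creates up to 8 * ncores malloc arenas on 64-bit systems by default.
// Each arena keeps its own free lists, so freed memory can remain stranded
// per arena and inflate resident memory on small machines (e.g. a 2 GB rpi4).
// Capping the arena count trades some multithreaded allocator throughput for
// lower fragmentation. Bitcoin Core currently applies this cap only when
// sizeof(void*) == 4, i.e. on 32-bit builds.
static void CapMallocArenas()
{
#if defined(HAVE_MALLOPT_ARENA_MAX)
    mallopt(M_ARENA_MAX, 1);
#endif
}
```

The same experiment can also be run without rebuilding by starting bitcoind with the glibc environment variable `MALLOC_ARENA_MAX=1`, which makes A/B testing on the affected devices straightforward.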
<l0rinc>
The good news is that #31132 seems to result in more than a 50% speedup on these pathological low-end devices.
<dergoegge>
do we also have bench results for higher end machines?
<corebot>
l0rinc: Error: That URL raised <Connection timed out.>
<l0rinc>
yes, on my M4 Max a -reindex-chainstate takes 1.5 hours with #31132 B-)
<stickies-v>
let's move on to the next topic, we're slowing down a bit
<stickies-v>
#topic Cluster Mempool WG Update (sdaftuar, sipa)
<sipa>
I have split off #34023 (which contains just optimizations and follow-ups) from #32545 (which switches cluster linearization to the SFL algorithm). Reviews on the PR have focused mainly on the big algorithm description comment added at the beginning, but I hope to see code review soon. There is also the much simpler #33335 still open.
<cfields>
Been cleaning up my chacha20 simd impl this week (PR incoming), so no net split update.
mudsip has quit []
<dergoegge>
any conclusions from the discussion with aj?
<cfields>
Not yet. I want to get a little further along and get some other opinions. I think we just pretty fundamentally disagree about shared memory and we'll have to pick one direction or the other...
<l0rinc>
cfields: were you able to generalize the simd instructions (i.e. convince the compilers via hints)?
<cfields>
But that decision doesn't have to be made yet, it's a good bit down the road.
<dergoegge>
👍
<eugenesiegel>
quick question
<cfields>
l0rinc: yes, it's generic. Still beats the linux kernel's hand-written asm though :)
<l0rinc>
sweet, looking forward to reviewing it
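For readers wondering what a "generic" SIMD ChaCha20 can look like: the usual technique is to process several independent blocks side by side as plain arrays, keeping the lanes independent so the compiler's auto-vectorizer can map the lane loops onto vector registers without intrinsics. A rough sketch of that technique (not cfields's actual code, which had not been published at this point):

```cpp
#include <cstdint>

static constexpr int LANES = 4;    // independent ChaCha20 blocks in flight
using Lane = uint32_t[LANES];      // one state word across all lanes

static inline void Rotl(Lane v, int c)
{
    for (int i = 0; i < LANES; ++i) v[i] = (v[i] << c) | (v[i] >> (32 - c));
}

// The standard ChaCha20 quarter round, applied to every lane at once. Each
// loop body performs the same operation on independent data, so with
// optimization enabled the compiler can typically emit vector adds, xors
// and rotates for these loops.
static inline void QuarterRound(Lane a, Lane b, Lane c, Lane d)
{
    for (int i = 0; i < LANES; ++i) a[i] += b[i];
    for (int i = 0; i < LANES; ++i) d[i] ^= a[i];
    Rotl(d, 16);
    for (int i = 0; i < LANES; ++i) c[i] += d[i];
    for (int i = 0; i < LANES; ++i) b[i] ^= c[i];
    Rotl(b, 12);
    for (int i = 0; i < LANES; ++i) a[i] += b[i];
    for (int i = 0; i < LANES; ++i) d[i] ^= a[i];
    Rotl(d, 8);
    for (int i = 0; i < LANES; ++i) c[i] += d[i];
    for (int i = 0; i < LANES; ++i) b[i] ^= c[i];
    Rotl(b, 7);
}
```

Whether this actually vectorizes depends on the compiler, optimization level and target flags (e.g. -O2/-O3 and an appropriate -march), which is presumably where the "hints" come in.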
<eugenesiegel>
I think it was brought up that process isolation was a good thing in the presence of an attacker who could take down our node. What happens if a DoS is found that takes down the process handling the message processing from peers? Does that just mean we get eclipsed since we can't process any more messages?
<darosior>
eugenesiegel: i think in this case the entire node would have to be stopped to let the user know. But that seems tangential to Cory's work
<eugenesiegel>
I thought this was one of the motivations for using shared memory? Or am I mistaken?
<darosior>
As far as i understand his separation of processes was merely a demonstration, not an actual suggestion
<sipa>
i don't think process separation between net and net processing is practically interesting - it's a nice demo that clean separation is possible, but i don't think it's useful to actually adopt
<cfields>
eugenesiegel: multi-process is a bit out of scope for the net split project. We'd still need to discuss whether or not splitting p2p into a separate process even makes sense. It's just something that becomes possible.
<cfields>
eugenesiegel: as for shared memory, there are 2 models for communicating between net and net_processing. Shared memory (e.g. via CNode, which both sides access) as we have now, or generic handles on both sides.
<cfields>
I prefer the latter as it's a cleaner separation, aj prefers the former as it's more performant.
<sipa>
cfields: i've been casually following the discussion, but haven't formed an opinion myself
<dergoegge>
i'd also lean towards cleaner separation
<dergoegge>
don't think the performance difference matters much
<stickies-v>
pinheadmz: are you here? otherwise we'll park it for next week
<vasild>
what is "generic handles on both sides"?
<cfields>
I think it's worth experimenting with both. Imo it just comes down to what the real-world performance penalty is.
<darosior>
stickies-v: i think he was here, i've also got something to ask about, maybe we can do that until pinheadmz comes back?
<cfields>
vasild: referencing NodeIds and performing lookups in CConnman/PeerMan
<vasild>
cfields: ok, I see, thanks for the clarification
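To make the two models concrete: `CNode` and `NodeId` are real Bitcoin Core types, but the interfaces below are hypothetical, sketched only to show the shape of each approach.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class CNode;              // net's per-connection object (exists in Bitcoin Core)
using NodeId = int64_t;   // opaque connection identifier (exists in Bitcoin Core)

// Model A, shared memory: net_processing holds a reference into net's state
// and mutates the same CNode that net sees. No lookups per call, but the two
// components share object lifetimes and locking.
void ProcessMessageShared(CNode& node);

// Model B, generic handles: net_processing only sees a NodeId and acts
// through an interface, paying a lookup in CConnman/PeerMan per call, but
// the two sides could in principle even live in separate processes.
class ConnectionInterface
{
public:
    virtual ~ConnectionInterface() = default;
    virtual bool SendBytes(NodeId id, std::vector<std::byte> payload) = 0;
    virtual void Disconnect(NodeId id) = 0;
};
void ProcessMessageHandle(NodeId id, ConnectionInterface& conn);
```

The disagreement above is essentially whether that extra indirection buys a separation clean enough to be worth its cost.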
<stickies-v>
cool. i'll let the net split convo finish and then move on
Emc99 has joined #bitcoin-core-dev
<sipa>
from what i understand from the discussion, it's not really about performance, but about whether introducing the handles is actually considered cleaner or not
<bitcoin-git>
bitcoin/master cdaf25f Ava Chow: test: Log IP of download server in get_previous_releases.py
<bitcoin-git>
bitcoin/master b31f786 merge-script: Merge bitcoin/bitcoin#34045: test: Log IP of download server in get_previo...
<bitcoin-git>
[bitcoin] glozow merged pull request #34045: test: Log IP of download server in get_previous_releases.py (master...log-prev-releases-ip) https://github.com/bitcoin/bitcoin/pull/34045
<stickies-v>
alright, looks like we're done with net split?
<cfields>
Hmm, as it's pretty fundamental (though nothing has to be decided just yet), I suppose it's worth having a discussion at some point. I'll tee that up for a future wg call.
<darosior>
I'd like to know if anyone else has an opinion about dropping the EOM (end of maintenance) status from our releases. We do not observe it, and as far as i can tell we never really have. This PR also updates a bunch of details about the release lifecycle and, in general, what users can expect from releases, to reflect current practices.
<sipa>
concept ack on simplifying things, but i've refrained from reviewing it because thinking about all the off-by-ones involved in these things hurts my brain
<stickies-v>
sorry, i was going to reply on the pr but lost track of it. on first thought i agree that the distinction is (at least currently) unnecessarily complicating things, so concept ack (but i'll do it on the pr too)
WizJin has joined #bitcoin-core-dev
<darosior>
sipa: :) this one does not involve a shift, merely removing one column in the table and clarifying things in the text
<Murch[m]>
I commented, and basically my point is: is something end-of-life when it got its last update, or when another update would be due and it won’t get it?
<sipa>
Murch[m]: that's the off-by-one i'm talking about
<darosior>
A major branch N becomes end of life when version N+3 is released
<Murch[m]>
Seems fine with me
<darosior>
But this is not changed by my PR; my PR just removes the unobserved "end of maintenance" middle state we had at N+2
<Murch[m]>
But would you call the branch “maintained” between N+2 release and N+3 release?
<darosior>
We already maintain all branches until N+3 is released
<darosior>
So, yes
<Murch[m]>
How so? I don’t think there are any backports going to N after the N.x that is released alongside the N+2 release.
<Murch[m]>
What does it mean to be “maintained”, then?
<darosior>
There may be if necessary
<Murch[m]>
If that’s so, fine. I just don’t think I’ve ever seen it
<darosior>
cc fanquake since you usually take care of backports
<darosior>
Murch[m]: really? That seems surprising to me
<darosior>
We did backport stuff to 27 before the 30 release for instance
eugenesiegel has quit [Quit: Client closed]
eugenesiegel has joined #bitcoin-core-dev
<Murch[m]>
Okay, then I stand corrected
<fanquake>
I think it's more a coincidence that point releases have started to "sync up" around major releases
<darosior>
For instance #33564
<sipa>
Murch[m]: there was no 27.x release with those backports however, but it seems "maintained" is about the branch, not necessarily a guarantee of point releases for it
<sipa>
also, backports are already a judgment call in any case, weighing the complexity of the backport against the severity
<fanquake>
Yea, my next Q was going to be, if we push backports, but don't cut a release, is that still maintained? I'd say yes
<fanquake>
And we are actively doing that, ad-hoc
<darosior>
Yes and the release page does mention that as branches age the threshold for something to be backported gets higher
<sipa>
ok i should just review it
<stickies-v>
fanquake: the only reason for not releasing it being that the backports aren't important enough? if so, yes i agree
<fanquake>
I agree that getting rid of a 3rd state of maintenance is good
<Murch[m]>
sipa: Oh okay
<Murch[m]>
fanquake: Yeah, I’d agree
<Murch[m]>
If there is still work on the branch, I’d call it maintained
<Murch[m]>
darosior: That covers my questions then!
<Murch[m]>
Yeah, I like the simplification
<fanquake>
stickies-v: Yea. Sometimes things are backported that aren't user-facing
<fanquake>
i.e. backport CI fixes in case we do need the CI to run later
<darosior>
The release page was also claiming a bunch of stuff that was imprecise at best and borderline misleading
l0rinc has quit [Quit: l0rinc]
<Murch[m]>
I’ll review again, then
<darosior>
Alright, that's it from me then. Thanks!
<stickies-v>
Anything else to discuss?
<Murch[m]>
Maybe just briefly
<sipa>
you added a topic yourself, stickies-v?
<sipa>
oh, that was pinheadmz's topic, i see
<Murch[m]>
I made a patch to BIP 3 to address the review since I motioned to activate. Would love me some review, especially by the people who chimed in.
<stickies-v>
yeah but we'll park pinheadmz's for next week as per his request
<sipa>
Murch[m]: cool, will do
<Murch[m]>
Thanks, sipa
<darosior>
Murch[m]: will take a look
<Murch[m]>
Okay thanks!
<stickies-v>
#endmeeting
<corebot>
stickies-v: Meeting ended at 2025-12-11T16:50+0000