< luke-jr>
I wonder if it would be better to prune debug.log with fallocate/FALLOC_FL_PUNCH_HOLE
< bitcoin-git>
[bitcoin] tjps closed pull request #10059: [trivial] Removed one Boost usage and headers in util.cpp (master...tjps_boost) https://github.com/bitcoin/bitcoin/pull/10059
< wumpus>
achow101: there's been discussion of having a signal to prune the debug log on command, none exists though. SIGHUP does reopen it, so you could rotate it using a stock log rotator
< wumpus>
with default settings the debug log grows only very slowly so this is not an issue for most users
< wumpus>
and developers that enable extra debug options tend to not want to automatically lose information
< wumpus>
e.g. to collect info over a complete sync cycle
< gmaxwell>
the way the prune thing works kinda stinks in any case. when I get log data from users it very frequently has omitted the interesting part, because they restart before asking for help.
< gmaxwell>
and so I can't see why their node has rejected a valid block.
< wumpus>
FALLOC_FL_PUNCH_HOLE wouldn't improve that; though maybe the aggressive name gives some outlet when there's yet another issue with useless debug logs
< wumpus>
fanquake: it's eating blocks instead of syncing them!
< wumpus>
(yes yes '100% progress' is a moving target, so staying in the same place moves you backwards, relatively, you have to run to stay in the same place)
< wumpus>
I guess we should add a max(0, progress)
< fanquake>
wumpus any schedule for 0.14.1 ? I'm going to remove 9464, not important enough to be fixed.
< wumpus>
fanquake: no precise schedule, though agreement seems to be that there should be a 0.14.1 'soon'
< fanquake>
Only two issues tagged now, are there any more sitting around that should be making it in?
< wumpus>
it's still blocked on the out of memory problems, we're going to PR some workarounds for 0.14
< wumpus>
yep I'm working on one and sipa should have one
< fanquake>
ok
< wumpus>
agree about #9464, as a display problem it's fairly low priority and should at least not block 0.14.1
< wumpus>
but if someone fixes it and the solution is not too invasive it should be backported
< bitcoin-git>
[bitcoin] fanquake closed pull request #10089: Fix shadowing of 'what' as described in #10080. (master...fix-what-shadowing) https://github.com/bitcoin/bitcoin/pull/10089
< wumpus>
hrm some RPCs such as getmemoryinfo could be available from the point the RPC server is launched, not just after initialization finishes
< wumpus>
also 'stop'
< bitcoin-git>
[bitcoin] laanwj opened pull request #10120: util: Work around (virtual) memory exhaustion on 32-bit w/ glibc (master...2017_03_address_space_exhaustion_workaround) https://github.com/bitcoin/bitcoin/pull/10120
< luke-jr>
wumpus: maybe debug.log should only be pruned when the node considers itself synced + some time has passed?
< luke-jr>
(the possible improvement with FALLOC_FL_PUNCH_HOLE is that it can prune without rewriting the kept data)
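For context, luke-jr's suggestion could be sketched like this (Linux/glibc only; `punch_log_head` is a hypothetical helper, not existing Bitcoin Core code). Punching a hole deallocates the blocks behind the old head of the log while the kept tail stays where it is, so nothing is rewritten:

```cpp
// Hypothetical sketch: free the disk blocks behind the first `keep_offset`
// bytes of debug.log without rewriting the data after them (Linux/glibc only;
// g++ defines _GNU_SOURCE so fcntl.h exposes fallocate and FALLOC_FL_*).
// The file's logical size is unchanged; the punched range reads back as zeros.
#include <fcntl.h>   // open, fallocate, FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE
#include <unistd.h>  // close

bool punch_log_head(const char* path, off_t keep_offset)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0) return false;
    // KEEP_SIZE is mandatory with PUNCH_HOLE; the range should be aligned to
    // the filesystem block size, or partial blocks are only zeroed, not freed.
    int rc = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                       0, keep_offset);
    close(fd);
    return rc == 0;
}
```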
< fanquake>
jonasschnelli no, 9464 is just an issue I created that covers a few different GUI issues/quirks, I should add the negative progress output there as well.
< wumpus>
could also be a test framework related issue
< fanquake>
jonasschnelli what are your thoughts on the OSX Growl notification code? I'm not sure that it's even being developed/supported anymore. They moved to selling it through the mac app store, and last update was in October 2013.
< wumpus>
test_framework.authproxy.JSONRPCException: non-JSON HTTP response with '503 Service Unavailable' from server (-342)
< wumpus>
while waiting for the node to come up... I think you'll get that if no handler is installed
< wumpus>
there is a small window before handlers are registered
< wumpus>
so not completely unexpected
< fanquake>
wumpus did you just restart the build? I was in the middle of reading the log heh
< wumpus>
yes I restarted the build, sorry :)
< wumpus>
may well be that travis is just being slow and brings this issue to the surface
< jnewbery>
Travis is good (too good!) at uncovering sync and timing issues
< wumpus>
yes
< BlueMatt>
jonasschnelli: ie the user might call bumpfee, get an error saying they need to pay more, then call bumpfee again with a higher number, and get another error saying they need to pay /even/ more
< BlueMatt>
which is ok, but kinda annoying
< BlueMatt>
if they're already checked back-to-back, might as well merge them into one error
< bitcoin-git>
[bitcoin] jnewbery opened pull request #10123: Allow debug logs to be excluded from specified component (master...debugexclude) https://github.com/bitcoin/bitcoin/pull/10123
< jonasschnelli>
Reminder for the european CEST people. Meeting will be in 2h 10' (summer time now!).
< kanzure>
i thought we had consensus to eliminate time zones?
< gmaxwell>
kanzure: we did, but the rest of the world hasn't yet.
< bitcoin-git>
[bitcoin] TheBlueMatt opened pull request #10125: Exit bitcoind immediately on shutdown instead of 200ms later (master...2017-03-faster-shutdown) https://github.com/bitcoin/bitcoin/pull/10125
< bitcoin-git>
[bitcoin] jnewbery closed pull request #9630: Add logging to GetAncestor if pindex->pprev == NULL (master...crashlogging) https://github.com/bitcoin/bitcoin/pull/9630
< sdaftuar>
wumpus: i think #9959 is ready for merge
< bitcoin-git>
[bitcoin] laanwj closed pull request #9959: Mining: Prevent slowdown in CreateNewBlock on large mempools (master...2017-03-cnb-optimizations) https://github.com/bitcoin/bitcoin/pull/9959
< sdaftuar>
thanks!
< BlueMatt>
wumpus: what restrictions are there on what code you can call from signal handlers?
< sipa>
BlueMatt: as far as I know, only modify simple variables
< wumpus>
BlueMatt: almost nothing can be invoked from signal handlers because signals arrive asynchronously, so the handler can run in any context on any thread
< BlueMatt>
mmm
< wumpus>
everything that has the slightest risk of crashing when called reentrantly can't be used
< wumpus>
so yes, in practice most programs restrict to setting variables
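The restriction being described is usually implemented with the pattern below: the handler only stores to a lock-free atomic flag and the main loop polls it at safe points (a generic sketch, not Bitcoin Core's actual handler):

```cpp
// Generic sketch of the only broadly safe signal-handler pattern: store to a
// lock-free flag and return. No allocation, locks, or I/O in the handler,
// since it may interrupt any thread at any point (even inside malloc).
#include <atomic>
#include <csignal>

static std::atomic<bool> g_shutdown_requested{false};

extern "C" void handle_shutdown_signal(int)
{
    g_shutdown_requested.store(true);
}

// Polled from the main loop / long-running work at safe points.
bool shutdown_requested()
{
    return g_shutdown_requested.load();
}
```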
< jtimon>
I don't think anyone commented in mine, at least I got review in others
< gmaxwell>
I forgot about the project tracker thing, but reviewed some of them anyways.
< wumpus>
doesn't matter, we'll only add pulls that were mentioned to have priority in the meeting to that project tracker, so if you pay attention at the meeting you don't need it :)
< wumpus>
some of those have already been worked on or even PRs open to make them faster
< gmaxwell>
jnewbery: if you're around, if 10106 doesn't move forward in a few hours, I recommend you create a new pr, cherry picking that one fix, with your new tests and the fix for the other function. (just because the PR wasn't opened by a regular so they may not be responsive or may not know how to use github well enough to pull your changes.)
< jnewbery>
gmaxwell: I'll do that this afternoon
< wumpus>
yes, tend to agree, they may not know how to cherry-pick them. I could cherry-pick them into his branch but meh
< MarcoFalke_lab>
wumpus: Yes, dropping GetRand() gave a 4* speedup.
< gmaxwell>
jnewbery: and thanks for the awesome response.
< jnewbery>
np
< wumpus>
MarcoFalke_lab: that'll at least move it down a bit in the top 20
< jeremyrubin>
The cuckoocache tests require a bit more discussion to decrease the parameters; I can pick something arbitrary
< jeremyrubin>
The checkqueue tests should also speed up a lot with #9938, but I'm preparing some tweaks to those
< jeremyrubin>
If anyone has high-core machines, they should try benching how that works
< wumpus>
jeremyrubin: alternatively we could have an -extended mode for the unit tests, like we have for the functional tests, to do extra-thorough tests that shouldn't run every time
< MarcoFalke_lab>
jeremyrubin: If the parameters matter, you could provide a switch to test against quick_run and an expensive one
< wumpus>
yes ^^
< gmaxwell>
I think we should have an extended test, and make running it part of the release process.
< jeremyrubin>
Agree
< wumpus>
but for the standard 'make check' there should be a guideline of max ~1 second per test case
< MarcoFalke_lab>
Everyone agrees! :)
< gmaxwell>
I don't care if the tests take an hour to run if it's only a mandatory once in release thing.
< cfields>
sgtm
< jeremyrubin>
no, 2 seconds!
< wumpus>
gmaxwell: yes or once per day or so on master
< jonasschnelli>
(which is failing currently)
< gmaxwell>
jonasschnelli: what is failing?
< wumpus>
yes, travis is broken again, but there's a pull to fix that
< cfields>
note to self: gitian should run whatever extended tests it can
< wumpus>
yes, jonasschnelli has a build server with a very nice web UI
< MarcoFalke_lab>
jonasschnelli: Are you running the tests?
< jonasschnelli>
(it's currently by manual trigger)
< jonasschnelli>
no.. just gitian
< gmaxwell>
somehow I missed this. cool.
< wumpus>
travis already runs the tests, this is to get executables for testing
< jnewbery>
travis is only broken now because we've set it to run the extended tests once per day, so we're currently flushing out all the extended tests that have always failed on travis. I think once #10114 and #10072 are merged the daily runs should succeed reliably
< jonasschnelli>
thanks jnewbery for the info (and the fixes)
< bitcoin-git>
[bitcoin] sdaftuar opened pull request #10127: [0.14 backport] Mining: Prevent slowdown in CreateNewBlock on large mempools (0.14...2017-03-backport-cnb-optimizations) https://github.com/bitcoin/bitcoin/pull/10127
< wumpus>
but I think we should be able to do 0.14.1 tomorrow so to have something to review for the rest of the week :D
< wumpus>
BlueMatt: yes good point
< BlueMatt>
ok, so morcos and sdaftuar get to propose new ones since they dont have ones up, anyone else want to propose ones?
< gmaxwell>
oh someone opened a PR to do something with debug log excludes, and it adds more insane string processing in the debugging critical path. Would anyone mind if I brought back the PR I shelved to make debug categories use an enum?
< cfields>
along the same lines, I'd like to do something similar for net messages
< wumpus>
then you can also map the enum values to strings and automate the help message generation
< gmaxwell>
#9424
< gribble>
https://github.com/bitcoin/bitcoin/issues/9424 | Change LogAcceptCategory to use uint32_t rather than sets of strings. by gmaxwell · Pull Request #9424 · bitcoin/bitcoin · GitHub
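The idea in #9424 — replacing the set-of-strings lookup in the logging hot path with a single bit test — might look roughly like this (illustrative names, not the PR's actual identifiers):

```cpp
// Sketch of bitflag debug categories: each category is one bit in a
// uint32_t, so the accept check is a single AND instead of a string-set
// lookup. Names are illustrative only.
#include <cassert>
#include <cstdint>
#include <string>

enum LogFlags : uint32_t {
    NET     = 1u << 0,
    MEMPOOL = 1u << 1,
    RPC     = 1u << 2,
    ALL     = ~0u,
};

static uint32_t g_log_categories = 0;  // set from -debug= at startup

inline bool LogAcceptCategory(uint32_t category)
{
    return (g_log_categories & category) != 0;
}

// Mapping bits back to strings lets the -debug help text be generated
// automatically, as suggested above.
inline std::string CategoryName(uint32_t flag)
{
    switch (flag) {
    case NET:     return "net";
    case MEMPOOL: return "mempool";
    case RPC:     return "rpc";
    default:      return "unknown";
    }
}
```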
< BlueMatt>
sdaftuar: want to bring it up now?
< jnewbery>
gmaxwell: +1. I'll happily review that one
< BlueMatt>
?
< wumpus>
cfields: yes, should be fairly easy to do, I already changed the syntax for net messages to be enum-like at some point
< jtimon>
since #8855 didn't get new review or re-review I think we should just leave that (just a reminder to potential reviewers that #8994 continues it, in case you "don't see the point")
< wumpus>
gmaxwell: will that work for unknown messages too?
< sdaftuar>
but basically matt brought up that we have some edge cases in our code if ConnectTip or DisconnectTip return false
< wumpus>
gmaxwell: does a perfect hash handle collisions? I don't remember
< cfields>
wumpus: perfect hash means no collisions :)
< wumpus>
cfields: for known values
< jeremyrubin>
wouldn't that not be backwards compatible?
< gmaxwell>
wumpus: gperf can check, it's an option.
< wumpus>
cfields: but what if 'moo' maps to the same 32-bit value as 'block'
< jeremyrubin>
someone sends you a new type of message that hashes to something else
< wumpus>
scary
< sipa>
sdaftuar: hmm, i'll review the PR
< sdaftuar>
the last issue i'm trying to resolve is if ConnectTip fails (but the block is not invalid), then we should be in an AbortNode() state. what consistency are we aiming for in this situation?
< cfields>
wumpus: ah, true
< gmaxwell>
wumpus: making it never have collisions for unknowns is just an extra comparison with the correct value.
< gmaxwell>
wumpus: which IIRC the gperf emitted code will do, if enabled.
< gmaxwell>
(actually, looks like it's the default)
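To illustrate the collision concern and the verification compare being discussed: in a toy (not gperf-generated) perfect-hash lookup, an unknown command like "moo" can land on a valid slot, and only the final strcmp against the stored key rejects it:

```cpp
// Toy stand-in for a gperf-style lookup: the hash is perfect for the
// known keys, but an unknown input may still hash to a valid slot, so
// the lookup ends with one string comparison against the stored key.
#include <cstring>

static const char* const kKnown[3] = {"inv", "ping", "tx"};

// Toy hash: happens to be collision-free over kKnown (lengths 3, 4, 2).
inline unsigned toy_hash(const char* s)
{
    return (unsigned)strlen(s) % 3;
}

// Returns the table index of a known command, or -1 for unknown ones.
inline int lookup_command(const char* s)
{
    unsigned slot = toy_hash(s);
    // The verification compare: without it, any string of the right
    // length would be misidentified as a known command.
    return strcmp(kKnown[slot], s) == 0 ? (int)slot : -1;
}
```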
< BlueMatt>
lol, ok, so lets pick a topic instead of two
< gmaxwell>
sdaftuar: we should be able to shut down and come back up with the underlying issue resolved and continue.
< wumpus>
yes, sorry, I was just curious about gperf
< gmaxwell>
sdaftuar: I don't care if we lose a bunch of recent blocks.
< sipa>
sdaftuar: it's fine if we lose progress in that case, but if at all avoidable, the on disk state should not be corrupted
< BlueMatt>
anyway, so I further observed that shutdown upon AbortNode can take up to 200ms (see recent PR), which is somewhat frightening, given that mempool and chainstate may not be consistent, which we assume they are in many places :/
< sdaftuar>
gmaxwell: sipa: what should we do with the mempool?
< sipa>
sdaftuar: who cares?
< BlueMatt>
so we keep running for a while normally and possibly have incorrect assumptions
< gmaxwell>
I don't care.
< wumpus>
just drop it
< sipa>
sdaftuar: at restart we'll try to reimport what is on disk
< sipa>
and re-run all necessary consistency checks
< sipa>
so mempool.dat doesn't need to be consistent with the chainstate
< BlueMatt>
and wallet?
< sdaftuar>
i think the concern matt was raising was if before shutdown, it's possible for eg an RPC call to get incorrect results
< sdaftuar>
or the wallet
< wumpus>
yes, no tight coupling, good design
< BlueMatt>
wumpus: point is we *have* tight coupling
< wumpus>
wallet doesn't need to be in sync either, though it should not be ahead
< sipa>
sdaftuar: ugh
< BlueMatt>
and continue running after realizing something is corrupted
< gmaxwell>
well abortnode should be instant-ish.
< gmaxwell>
BlueMatt: so you're fixing that, what is there to discuss?
< wumpus>
a bit behind (as long as it's not further than the prune distance) is ok, it will catch up in next start
< BlueMatt>
gmaxwell: i mean the new pr is an improvement, but its not a fix
< BlueMatt>
its not like you cant have something pending cs_main when coming out of Disconnect/ConnectTip
< wumpus>
but if the wallet is ahead of the chain it will trigger a rescan IIRC
< gmaxwell>
so perhaps abortnode needs to Abort()?
< BlueMatt>
gmaxwell: thats pretty much the question
< BlueMatt>
thats super shit, though, of course
< gmaxwell>
and not faff about with politely shutting off threads.
< wumpus>
I don't think so
< sdaftuar>
i inadvertently introduced an assert failure in one of these situations. maybe that's a feature not a flaw! :)
< gmaxwell>
Why is it shit? we should be durable across a power off, which is worse than aborting.
< wumpus>
the point of abortnode is to be able to show a GUI dialog to the user
< gmaxwell>
ah
< wumpus>
if you abort() the user will never know to check the debug.log
< BlueMatt>
wumpus: well we can show a gui dialog and abort() prior to unlocking cs_main
< BlueMatt>
:P
< jeremyrubin>
have a graceful shutdown flag
< gmaxwell>
wumpus: but it's not unlikely that we can't show a gui... if we can't allocate (likely cause).
< BlueMatt>
which would effectively be sufficient
< wumpus>
I always thought that AbortNode was for errors that could tolerate a graceful shutdown
< wumpus>
the really terrible errors we already assert() or abort() on
< jeremyrubin>
and if we're shutting down gracefully, write to a file "was graceful"
< BlueMatt>
gmaxwell: AbortNode() is usually disk corruption
< jeremyrubin>
On next start, show dialogue first thing?
< wumpus>
please don't make it routine to abort() for everything
< wumpus>
only for things that really warrant it
< wumpus>
not crash on every single error
< BlueMatt>
or too little disk space
< sdaftuar>
or too little disk space, not quite the same as corruption
< wumpus>
that's just a mess
< gmaxwell>
okay, sure.
< wumpus>
for some problems it's fine to try to clean up normally
< wumpus>
but we still need to exit with a message
< wumpus>
that's what abortnode is for
< wumpus>
if BlueMatt can make it work faster that's great, but don't silently kill the program on every error
< gmaxwell>
Y'all so worried about the dressings on the coffin. "It's already dead!" But sure. Sorry I was only thinking of OOM, it's just a recent subject.
< BlueMatt>
sipa: they dont, afaik, aside from there now being two structs that can be merged
< jeremyrubin>
I think deleting a file on graceful shutdown should work
< jeremyrubin>
And then starting when that file is present shows the dialogue rather than starting
< BlueMatt>
gmaxwell: yea, AbortNode isn't used for things like OOM and total death
< BlueMatt>
disk space also does it
< BlueMatt>
but it can also be db corruption
< jeremyrubin>
This means you don't do anything at all on the Abort, but requires the user to restart their node to see the message
< BlueMatt>
so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
< wumpus>
BlueMatt: yes, makes sense to me
< wumpus>
BlueMatt: or add a flag
< morcos>
I haven't really thought this all the way through, but doesn't it seem that as we move more and more towards things happening in other threads, that you'd rather let those threads finish what they are doing
< gmaxwell>
BlueMatt: you could also harden things up against your current concern by having the rpc stuff always check the shutdown flag right after taking cs_main (whenever it does)?
< morcos>
If you're committing wallet changes for instance
< gmaxwell>
BlueMatt: and if it finds out that its on its way down, just returning at that point.
< wumpus>
in the long run morcos we should move toward looser coupling of things
< wumpus>
I agree
< BlueMatt>
gmaxwell: yea, I'm worried that if we start down that path we have to check it everywhere
< BlueMatt>
eg wallet may make decisions based on some corrupted mempool state
< morcos>
Let's say you just got some new address from the wallet, and the wallet hasn't committed that state yet and then you Abort()
< BlueMatt>
(I havent thought all the way through all the potential issues here, just a potential concern)
< wumpus>
morcos: luckily we have the keypool to handle that specific contingency
< jeremyrubin>
Every thread could have their own unique ungraceful-close file that it should delete (via RAII) on clean exit. Starting with any present would show error. Uncoupled!
< gmaxwell>
morcos: well at least the worst you get there is replay (due to keypool/hdwallets).. though replay can be pretty bad.
< wumpus>
pretty bad but well at least they will never lose live keys
< BlueMatt>
jeremyrubin: I have a feeling we can be more aggressive on the super-strange issues; afaiu this stuff is pretty much hit just with out-of-disk, everything else is rare enough.....
< sipa>
maybe we should just have some std::atomic<bool> shits_on_fire_yo; which when set prevents RPCs etc
< wumpus>
jeremyrubin: that's not what I mean with uncoupling, what I mean is that if one thread messes up it doesn't need to take the others with it because it can operate more-or-less independently
< gmaxwell>
sipa: well shutdown can be more or less that.
< sipa>
that can even be set from a signal handler
< BlueMatt>
<BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
< jeremyrubin>
sipa: what do you do with an in progress RPC?
< BlueMatt>
in ShutdownSoon cases (eg out of disk) we're ok to continue and just shut down, i think
< sipa>
jeremyrubin: we can only do so much
< wumpus>
you can't generally break off an in-progress RPC
< BlueMatt>
in other cases, we can just assert(false)
< wumpus>
although some will pay attention to fShutdown
< BlueMatt>
because they're super rare anyway
< jeremyrubin>
wumpus: ^ yes, O
< gmaxwell>
but really, what probably should be done is having the rpcs being able to handle being told "No, you can't have access to [whatever chain state you wanted] right now." then these error conditions that could leave them unstable... could set them.
< BlueMatt>
and if your disk is corrupted we probably shouldnt be flushing wallet and chainstate as a part of shutdown anyway
< BlueMatt>
gmaxwell: yea, but thats a huge rewrite for cases that should be super rare
< gmaxwell>
Or we take a whole different approach to failure messages, e.g. wrap bitcoin-qt in another program so that when qt exits it can go through the logs and give you a useful error. (though this doesn't answer morcos' wallet flush, but really that should be in another process.)
< wumpus>
BlueMatt: I think there should be a clear separation between (A) I/O issues such as out of disk space, which just happen regularly, and (B) rare implementation issues such as internal consistency errors
< cfields>
iirc rpc has a IsRPCShuttingDown() or so, but only a few things (gbt only, maybe?) checks it
< jeremyrubin>
wumpus: * yes, I'm aware that isn't quite what you meant, just making noise about having a graceful shutdown file because I think it's a reasonable way to mark if a node shut down dirty
< wumpus>
(A) normal errors should just give the user an error as AbortNode does now and shut down properly
< wumpus>
(B) internal state corruption errors could break off the process immediately
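The (A)/(B) split could be sketched like this (illustrative names: `AbortNode` here is a simplification of the real function, and `FatalInternalError` is hypothetical, not Bitcoin Core's actual API):

```cpp
// Hypothetical sketch of the two error classes discussed above.
#include <atomic>
#include <cstdio>
#include <cstdlib>
#include <string>

static std::atomic<bool> g_request_shutdown{false};

// (A) Ordinary I/O failure (out of disk space, failed write): show the
// user a message, then wind down in an orderly way. Returns false so
// callers can propagate the failure.
inline bool AbortNode(const std::string& user_message)
{
    std::fprintf(stderr, "Error: %s\n", user_message.c_str());
    g_request_shutdown.store(true);  // main loop notices and shuts down
    return false;
}

// (B) Violated invariant / state corruption: don't keep running on bad
// state, and don't flush possibly-corrupt data on the way out.
[[noreturn]] inline void FatalInternalError(const std::string& what)
{
    std::fprintf(stderr, "Internal error: %s\n", what.c_str());
    std::abort();
}
```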
< gmaxwell>
It's obnoxious that we can't preallocate resources such that we can guarantee that we never take a failure at an inopportune time.
< gmaxwell>
having to slather the code in error handling to deal with that is not ideal.
< jeremyrubin>
gmaxwell: you can preallocate lots of things?
< gmaxwell>
wumpus: "I had to abort processing a block" is a weak kind of internal state corruption. It leaves assumed invariants violated.
< gmaxwell>
jeremyrubin: in particular we can't make leveldb's disk usage constant.
< BlueMatt>
wumpus: yes, thats my proposal
< BlueMatt>
wumpus: but, specifically, B would include things like chainstate-on-disk-corruption
< BlueMatt>
which it already partially does, just not completely
< wumpus>
having code to handle normal errors is perfectly normal and all code has that, I agree that paranoid memory/disk corruption errors tend not to be possible to handle in a sane way
< wumpus>
BlueMatt: yes
< BlueMatt>
ok, soooo, acks on:<BlueMatt> <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
< wumpus>
BlueMatt: as I said before, yes, that or add a flag/criticality level
< BlueMatt>
s/use make sure/make sure/
< jeremyrubin>
BlueMatt: maybe if you paste it again
< BlueMatt>
sure, that too
< BlueMatt>
jeremyrubin: ok, <BlueMatt> ok, soooo, acks on:<BlueMatt> <BlueMatt> so maybe the solution is AbortNode gets renamed to ShutdownSoon() and use make sure disk corruption is something different?
< gmaxwell>
wumpus: for example, if we run out of space while flushing our view of the blockchain will be corrupt, and information about the blockchain from rpcs will be wrong. In a world where error handling is free, every code path that gets access to the blockchain data would be able to gracefully handle being told it's just not available. But I think that would be an unreasonable and dangerous amount
< gmaxwell>
of complexity. :(
< BlueMatt>
wumpus: I missed that statement, but, yes
< gmaxwell>
BlueMatt: that sounds okay to me. but I don't know that disk-full is really a "shutdown soon", though; we really do need to give it a nice message. :(
< wumpus>
gmaxwell: note that the out-of-disk error happens *before* the disk is entirely full
< jeremyrubin>
BlueMatt: sgtm
< wumpus>
gmaxwell: there's a threshold
< wumpus>
gmaxwell: so that should already be mitigated
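The threshold being referred to works roughly like this; a sketch using std::filesystem rather than the code's actual implementation, with an illustrative 50 MB margin:

```cpp
// Sketch: refuse to continue well before the disk is actually full, so
// writes already in flight still have headroom. The 50 MB margin is
// illustrative; the check-then-write race gmaxwell notes still exists.
#include <cstdint>
#include <filesystem>
#include <system_error>

static const uint64_t MIN_DISK_SPACE = 50ull * 1024 * 1024;  // safety margin

inline bool CheckDiskSpace(const std::filesystem::path& dir,
                           uint64_t additional_bytes = 0)
{
    std::error_code ec;
    const auto info = std::filesystem::space(dir, ec);
    if (ec) return false;  // treat "can't tell" as "not enough"
    return info.available >= MIN_DISK_SPACE + additional_bytes;
}
```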
< gmaxwell>
wumpus: yes, but it's not atomic. you can't check and then successfully allocate space.
< wumpus>
no, it's not perfect, but it works pretty well
< wumpus>
I've never had corruption on full disk
< wumpus>
also leveldb write failing shouldn't generally be fatal
< gmaxwell>
on my desktop, which runs with debug=1 it almost always gets checked at the start of the flush. It doesn't corrupt things on disk, but as matt points out the rpc would return somewhat incorrect results during that time.
< wumpus>
it just means the last transaction is not committed
< jtimon>
it seems it's time to abort the meeting
< wumpus>
#endmeeting
< lightningbot>
Meeting ended Thu Mar 30 20:01:06 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
< BlueMatt>
wumpus: we need to change that to #abort()
< gmaxwell>
But I wanted to cleanly flush!
< bitcoin-git>
[bitcoin] theuni opened pull request #10129: scheduler: fix sub-second precision with boost < 1.50 (master...fix-scheduler-millisecs) https://github.com/bitcoin/bitcoin/pull/10129
< cfields>
BlueMatt: ^^ fixes the ramped-up cpu issue
< wumpus>
cfields: 10129 needs backport I suppose?
< BlueMatt>
cfields: ahh, lol, I dont have old boost
< BlueMatt>
wumpus: only if we also backport the wallet-flush-in-cscheduler, probably?
< BlueMatt>
(which i think we should, but....)
< BlueMatt>
questionable given boost issues
< wumpus>
eh no, let's not do that, sorry
< cfields>
wumpus: won't hurt, but doesn't currently matter
< BlueMatt>
wumpus: fair, yea
< cfields>
wumpus: probably best to go ahead and backport, just in case any other stuff is backported in the future that relies on the correct behavior
< BlueMatt>
yea, i mean if it turns out there are complications and its not dead simple, not backporting seems like the best idea
< gmaxwell>
wallet-flush-in-cscheduler removes a thread?
< wumpus>
cfields: I assumed that was the point because one of your other pulls changes the times to std::chrono instead
< wumpus>
this one is backportable at least
< cfields>
wumpus: yes, and BlueMatt was discussing possibly backporting other scheduler changes. Was just being proactive :)
< gmaxwell>
if wallet-flush-in-cscheduler reduces the thread count, then it's also perhaps interesting in a backport. (less memory / address space!)
< wumpus>
gmaxwell: it does, but imo it's too risky to backport, we're not entirely sure of it yet this might be only the first issue with it
< gmaxwell>
aww
< gmaxwell>
:P
< gmaxwell>
K
< BlueMatt>
gmaxwell: yea, thats why i was proposing backport....but its not simple, apparently, because boost :(
< wumpus>
I've always been kind of wary of cscheduler
< BlueMatt>
wumpus: yea.....
< gmaxwell>
yea... well we've had a lot of problems with it in the past.
< wumpus>
there's been lots of issues with it historically at least in the tests
< wumpus>
yes
< cfields>
wumpus: same. even moreso after diving in to debug that one.
< cfields>
in fairness though, i think many issues have been boost-related
< gmaxwell>
the general idea of having all the rare periodic stuff in a thread that handles them is perfectly reasonable.
< wumpus>
yes, and most bugs don't show up in our actual single-threaded usage
< cfields>
wumpus: right. I'd be very wary of adding a second thread. I'd expect to uncover a bug or two for sure.
< wumpus>
yes, the idea is reasonable
< gmaxwell>
oh we were thinking about adding another thread to service the schedueler?
< cfields>
gmaxwell: that's how it's designed, anyway
< wumpus>
gmaxwell: nonono, please not, it's just that it was designed to be able to work with multiple threads, but that functionality has never been properly verified
< gmaxwell>
yea, well I think thats a bad idea.
< wumpus>
there used to be a test that tested it with multiple threads, but that failed randomly and we never found out why
< gmaxwell>
At least it always seemed to me that the whole utility of such a thing is for periodic tasks that don't have strong realtime requirements.
< BlueMatt>
wumpus: sounds like its time to just rewrite the thing from scratch just because....
< wumpus>
BlueMatt: or just drop the multi-thread functionality
< BlueMatt>
well, ok, that too
< wumpus>
I think it could be shortened a lot with c++11 functionality like lambdas and tasks etc
< BlueMatt>
i mean whatever, leave it there, I plan on moving lots of things to cscheduler, just will have to make it sane before turning it on
< BlueMatt>
ie c++11 stuff, probably
< gmaxwell>
the periodic wallet flush is separate from the realtime wallet flush that happens after issuing a key or whatever, right?
< wumpus>
same for the closure stuff and worker queue in http_server
< gmaxwell>
well it should be a min-heap, :P I'm sure there is one of those in stl no?
< BlueMatt>
gmaxwell: by "wallet flush", I mean "wallet compaction"
< wumpus>
no, stl does not have heap types afaik
< BlueMatt>
wallet flush, indeed, is the thing that happens when you write something directly
< wumpus>
boost does *ducks*
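For what it's worth, the STL does have a heap adaptor: `std::priority_queue` with `std::greater` behaves as a min-heap, which is the natural queue for run-the-earliest-task-next scheduling (a toy sketch, not CScheduler's real interface):

```cpp
// Toy sketch: a min-heap of (run_time, task_id) pairs via std::priority_queue.
// std::greater flips the default max-heap, so top() is the earliest task.
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Task = std::pair<long, int>;  // (scheduled run time, task id)
using TaskQueue =
    std::priority_queue<Task, std::vector<Task>, std::greater<Task>>;

// Pops and returns the id of the task scheduled to run first.
inline int pop_next_task(TaskQueue& q)
{
    int id = q.top().second;
    q.pop();
    return id;
}
```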
< wumpus>
yes please call it compaction not flush
< gmaxwell>
BlueMatt: okay sounds fine then... not the stuff that we need a critical realtime response on.
< wumpus>
yes
< BlueMatt>
indeed, the function name changed to compaction from flush when it moved into cscheduler
< BlueMatt>
because thats a more accurate description
< bitcoin-git>
[bitcoin] gmaxwell reopened pull request #9424: Change LogAcceptCategory to use uint32_t rather than sets of strings. (master...log_category_simplify) https://github.com/bitcoin/bitcoin/pull/9424
< wumpus>
I don't think I remember ever seeing that pull before. Then again, so many pulls get opened, it's easy to lose track :/
< wumpus>
oh end of dec 2016, yea, wasn't too active at that time