< fanquake>
Also whether we are going to bump sdk & clang at the same time.
< Randolf>
I wonder if someone could take a quick look at PR 12393 at https://github.com/bitcoin/bitcoin/pull/12393 for me. The committer - practicalswift - indicated that they implemented a change that I suggested, but that change isn't appearing for me. Am I missing something, or is there something else that the committer needs to do? Thanks.
< fanquake>
Randolf The change is not on GitHub yet. practicalswift may have forgotten to push the changes after fixing.
< provoostenator>
Can someone answer my new questions at #12401? I need a little bit more context to understand my own bugfix :-) I can update the PR based on that feedback.
< bitcoin-git>
[bitcoin] MarcoFalke closed pull request #12394: gitian-builder.sh: fix --setup doc, since lxc is default (master...2018/02/gitian-build-setup-lxc) https://github.com/bitcoin/bitcoin/pull/12394
< provoostenator>
michagogo: regarding Ubuntu inside of Windows and scary instructions. I'm currently trying to upgrade a fresh install to 18.04 and then see if I can compile Bitcoin
< provoostenator>
If that works, I'll make a PR to point out that possibility.
< jamesob_>
is latestblock in rpc/blockchain.cpp supposed to correspond to the latest block on *any* chain, or on activeChain?
< michagogo>
provoostenator: Hm, does do-release-upgrade actually do anything?
< michagogo>
IIRC even upgrading 14.04-16.04 isn't supported, advice is to delete the environment and start fresh
< provoostenator>
michagogo do-release-upgrade -d
< provoostenator>
Probably not recommended of course. I just got a crash somewhere during the upgrade. We'll see...
< provoostenator>
Windows 10 uses 16.04, so that was my starting point.
< provoostenator>
And that in turn is running in a Virtual Box on my iMac, because: must go deeper:
< arubi>
what kernel is running in windows' linux subsystem? not linux right?
< provoostenator>
Their bug filing flow is pretty nice... It generates a report, uploads a bunch of stuff to some temporary place and gives you a short URL to open in a browser with most of the bug report filled out...
< bitcoin-git>
[bitcoin] MarcoFalke closed pull request #12120: Add dev guideline limiting auto usage. (master...2018-01-auto-devnotes) https://github.com/bitcoin/bitcoin/pull/12120
< provoostenator>
locales are always screwed up on the Windows 10 Ubuntu, this tends to help: sudo gunzip /usr/share/i18n/charmaps/UTF-8.gz
< bitcoin-git>
[bitcoin] jamesob opened pull request #12407: Ensure nStatus is set properly for all invalid blocks (master...jamesob/2018-02-mark-headers-invalid) https://github.com/bitcoin/bitcoin/pull/12407
< BlueMatt_>
we need to improve the at-tip relay latency
< provoostenator>
I managed to get my Windows-in-a-VirtualBox build to throw: ERROR: AcceptBlockHeader: block 0000000000000000004c746b087820b517771e136d6776b21586c93c333da523 is marked invalid
< provoostenator>
That node suffered some abuse before, so it's probably not a bug.
< BlueMatt_>
provoostenator: What sort of abuse?
< provoostenator>
I think it ran out of disk space last time I used it, about a month ago.
< sipa>
that still shouldn't happen
< provoostenator>
And now I copied a binary that I built using Gitian
< BlueMatt_>
That shouldn't leave the database in an inconsistent state
< provoostenator>
Right, I can upload debug.log, I think it still has previous stuff
< grafcaps>
=2
< BlueMatt>
provoostenator: note imposter
< BlueMatt_>
BlueMatt: ?
< spudowiar>
echeveria: The other BlueMatt has arrived
< BlueMatt>
I mean it wrote out that the block was invalid due to a script failure when it first hit that error and will never reconsider a block automatically
< provoostenator>
Parent machine runs all sorts of stuff, but seems fine.
< BlueMatt_>
.
< BlueMatt_>
..
< BlueMatt_>
...
< BlueMatt_>
..
< BlueMatt_>
.
< BlueMatt>
what about cpu temps?
< provoostenator>
Where do I find that?
< BlueMatt>
errr, on osx no idea
< spudowiar>
provoostenator: Open the case, stick hand inside
< provoostenator>
One does not simply ... open an iMac case
< BlueMatt_>
.
< BlueMatt_>
..
< BlueMatt_>
...
< BlueMatt_>
........
< BlueMatt_>
.........
< BlueMatt_>
........
< BlueMatt_>
...
< BlueMatt_>
..
< BlueMatt_>
.
< sipa>
?
< provoostenator>
Isn't a compiler error more likely? Is there a command to make it retry / redownload that block?
< BlueMatt>
sipa: plz ban imposter...
< BlueMatt>
provoostenator: I'd think not? we have an impressive ability to break people's hardware
< BlueMatt>
provoostenator: maybe try running a memtest?
< spudowiar>
provoostenator: You could add the raw block over RPC
< BlueMatt>
irc command is reconsiderblock
< spudowiar>
oh, well there's that :)
< provoostenator>
The thing is, this block is exactly the next one in line since I restarted it.
< provoostenator>
So I suspect if there was a CPU fire, it was last month.
< provoostenator>
I'll try the reconsider thing
< BlueMatt>
err, s/irc/rpc/
< BlueMatt>
huh? yes, your bitcoind rejected a block, it then stored on disk the fact that it thinks that block is invalid
< BlueMatt>
and then will always and forevermore refuse to reconsider that block
< BlueMatt>
unless you do so manually
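For reference, a minimal sketch of what the manual step looks like on the wire. `reconsiderblock` is the bitcoind RPC named above and takes the block hash as its only parameter. The helper name `build_reconsiderblock_request` and the request id are made up for illustration, and actually sending the body (URL, port, credentials) is left out because it depends on the local node setup.

```python
import json


def build_reconsiderblock_request(block_hash, request_id="reconsider"):
    """Build the JSON-RPC body asking bitcoind to reconsider a block.

    The helper name and request id are illustrative; only the method name
    and the single block-hash parameter come from bitcoind itself.
    """
    normalized = block_hash.lower()
    # Block hashes are 32 bytes, written as 64 hexadecimal characters.
    if len(normalized) != 64 or any(c not in "0123456789abcdef" for c in normalized):
        raise ValueError("block hash must be 64 hex characters")
    return json.dumps({
        "jsonrpc": "1.0",
        "id": request_id,
        "method": "reconsiderblock",
        "params": [normalized],
    })
```

The same call from a shell is `bitcoin-cli reconsiderblock <blockhash>`, using the hash from the "is marked invalid" log line.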
< provoostenator>
That definitely did something
< BlueMatt>
(which I'm really starting to think we need to start automatically doing....)
< provoostenator>
Yeah, reconsidering the last N blocks at launch if they're invalid, in cases like this?
< sipa>
we could make it retry a block a second time if validation fails
< sipa>
and if it then succeeds, tell the user they need non-broken hardware?
< BlueMatt>
no, just if you find a block to be invalid, eg due to script checks, always re-run the scripts single-threaded and if it passes then put a file on disk that says my-hardware-is-broken and refuse to start
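A rough sketch of that proposal, assuming the node exposes some way to repeat the script checks on one thread. `recheck_single_threaded` is a hypothetical callable standing in for that, and `handle_script_failure` and `may_start` are illustrative names. None of this is existing Bitcoin Core code; only the marker file name comes from the message above.

```python
import os

MARKER_NAME = "my-hardware-is-broken"  # file name suggested in the discussion


def handle_script_failure(block, recheck_single_threaded, datadir):
    """Decide what to do after the normal (parallel) script checks failed.

    recheck_single_threaded is a hypothetical callable that repeats the
    script checks for `block` on one thread and returns True if they pass.
    """
    if not recheck_single_threaded(block):
        # Same verdict twice: treat the block as genuinely invalid.
        return "invalid"
    # The two runs disagree, so the machine is suspect. Leave a marker on disk.
    with open(os.path.join(datadir, MARKER_NAME), "w") as marker:
        marker.write("script checks gave inconsistent results\n")
    return "hardware-suspect"


def may_start(datadir):
    """Refuse to start while the marker file is present."""
    return not os.path.exists(os.path.join(datadir, MARKER_NAME))
```

In this sketch the operator would have to delete the marker file by hand after checking the hardware, which matches the "refuse to start" part of the suggestion.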
< provoostenator>
It's chugging along nicely now, although QT doesn't update the count after Syncing Headers.
< spudowiar>
When will reconsidering a block work? (unless you invalidated it manually)
< spudowiar>
(except for hardware issues)
< BlueMatt>
yea, what sipa said
< sipa>
spudowiar: if the signature checks failed due to hardware failure the first time
< sipa>
in particular, it won't work if your UTXO set is corrupted
< BlueMatt>
problem is I've seen one report of a rejected block from a node on a server vm
< BlueMatt>
which is frightening
< provoostenator>
In fact, getblockchaininfo rpc still has the headers stuck at 499615, while blocks are now at 499686 and increasing
< spudowiar>
Would it be useful to log the reason the block validation failed?
< spudowiar>
I feel like one out of 9000 cases could be something scary :)
< BlueMatt>
well in this case we see the script check failure
< sipa>
spudowiar: there's nothing interesting to report
< sipa>
"script validation failed"
< BlueMatt>
it may be useful to log the scriptPubKey being spent from
< sipa>
(almost always)
< BlueMatt>
as it would indicate if the utxo set is corrupted vs hardware failed to run the script right
< sipa>
you'd need to report the full spending TX as well
< sipa>
(as it affects sighash)
< BlueMatt>
well that isn't a utxo thing
< BlueMatt>
I'm casually starting to worry there is some stupidly-rare race in leveldb or so...so at least we'd have better info on the possibility of that
< provoostenator>
I do remember from last month that that Windows VM managed to completely crash my iMac. I don't remember if that was while QT was running, but that's quite possible.
< sipa>
is it working now after the reconsider?
< provoostenator>
sipa: it's syncing yes, although the header count doesn't update unless I restart, after which it again doesn't update. Block count does
< sipa>
hmm, that sounds like an unrelated bug
< sipa>
but maybe specific to reconsiderblock
< provoostenator>
The "marked invalid" thing indeed happened last month.
< sipa>
yeah, that's just script validation failure
< provoostenator>
After that it started banning peers, though not sure if that's causal (BAN THRESHOLD EXCEEDED)
< sipa>
it may well be
< provoostenator>
That would make sense, lots of bans in a row, preceded by ERROR: AcceptBlockHeader: prev block invalid
< sipa>
yup
< provoostenator>
It should probably go into "maybe I'm insane" mode
< sipa>
but interestingly your UTXO set seems unaffected
< sipa>
just one signature validation that failed
< sipa>
so it seems likely it was just faulty RAM or overheating CPU
< provoostenator>
Also notably no crash around that time. Wouldn't a CPU issue kill the host machine?
< sipa>
no
< sipa>
it's just a bit flip somewhere, probably
< provoostenator>
And why has this never happened to my other bitcoind nodes on the same machine?
< sipa>
CPU bugs are rare, even in overheating systems
< provoostenator>
I've done almost a dozen full IBD's on it.
< sipa>
on normally functioning CPU bitflips occur as well
< sipa>
but only once per decade or so
< sipa>
(of continuous load)
< provoostenator>
Alright, I'm off to bed. I'll let the VM continue to sync. Will let you know if it still misbehaves when that's done.
< grubles>
anything in dmesg?
< provoostenator>
By the way, QT didn't give any useful hint that this was going on, other than just not making any progress in syncing and very little P2P activity.
< provoostenator>
Not sure what the UI could say, some translation of "Either you're being sybil attacked by a dozen nodes or something is wrong with your machine. Call sipa."
< BlueMatt>
provoostenator: if you're rejecting the chain and all your peers have a high-work chain which you think is invalid, it's more than just a sybil
< provoostenator>
One thing the node could try is to "reconsider" an older block that it already knows is valid. If that's suddenly invalid, it knows the machine is broken. If it's valid, it can "reconsider" the block it thought was invalid but all its peers insist is valid.
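The idea can be sketched as a small decision procedure. `revalidate` is a hypothetical callable that re-runs full validation for a block and returns True on success; the function name and result labels are invented for illustration and do not correspond to anything in Bitcoin Core. A single pass like this can only catch a fault that happens to reproduce at that moment.

```python
def diagnose(revalidate, known_good_block, disputed_block):
    """Re-check a known-valid block before retrying the disputed one.

    revalidate is a hypothetical callable: block -> bool.
    """
    if not revalidate(known_good_block):
        # A block that previously validated now fails: blame the machine.
        return "machine-broken"
    if revalidate(disputed_block):
        # The earlier rejection did not reproduce.
        return "disputed-block-valid"
    return "disputed-block-still-invalid"
```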
< BlueMatt>
sipa: yea, i once saw a very large miner get their node to reject their chain while they were sleeping, resulting in them building a like 4-block-long chain....their utxo set after a reconsiderblock was fine, but it seems that the utxo set in memory got corrupted
< BlueMatt>
provoostenator: it's too nondeterministic
< BlueMatt>
even if your ram is maybe a bit sketchy you'll still only fail blocks with very low probability
< provoostenator>
Well it should give up if the behavior is too non deterministic, but at least some retrying makes sense, right?
< provoostenator>
Especially if these types of issues can only lead to false negatives, though I don't know if that's true.
< provoostenator>
Meanwhile the windows node now says "Number of blocks left: 6000" and getblockchaininfo now shows the correct header count.
< sipa>
provoostenator: i think we'd be better off just continuously spending 1% of the time doing some elliptic curve operations with known results
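A minimal sketch of such a known-answer self-test, written in pure Python for readability. A real node would presumably use its own optimized secp256k1 code instead. The constants are the standard secp256k1 parameters. The checks rely on two facts about the curve: multiplying the generator G by the group order n gives the point at infinity, and (n-1)*G equals -G. All function names and the duty-cycle helper are illustrative assumptions.

```python
# secp256k1 domain parameters (field prime, group order, generator).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8


def point_add(a, b):
    """Add two affine points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) is the point at infinity
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, P - 2, P) % P  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P  # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mul(k, point):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def self_test():
    """Return True if the known-answer checks pass on this machine."""
    g = (GX, GY)
    on_curve = (GY * GY - GX**3 - 7) % P == 0
    return (on_curve
            and scalar_mul(N, g) is None
            and scalar_mul(N - 1, g) == (GX, P - GY))


def idle_seconds_for_duty_cycle(elapsed, fraction=0.01):
    """How long to wait so self-tests use about `fraction` of the time."""
    return elapsed * (1 - fraction) / fraction
```

A background loop could time one `self_test()` call, sleep for `idle_seconds_for_duty_cycle(elapsed)`, and raise an alarm the first time the test returns False.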