< luke-jr>
BlueMatt: re #15482 can you add something to the BIP on a rationale for using the witness-stripped size for <65 byte limit?
< gribble>
https://github.com/bitcoin/bitcoin/issues/15482 | Implement BIPXXXs new softfork rules (The Great Consensus Cleanup) by TheBlueMatt · Pull Request #15482 · bitcoin/bitcoin · GitHub
< sipa>
luke-jr: the witness-stripped size is what matters for the merkle tree vulnerability
< luke-jr>
sipa: only in one merkle tree, though?
< luke-jr>
I guess <64 bytes isn't possible with any sensible output anyway, even if witness is stripped
< luke-jr>
anyway, just a topic to cover in the rationale IMO
< kanzure>
just fyi, bitcoin-dev mailing list is temporarily broken
< kanzure>
i assume temporarily.
< sipa>
luke-jr: agree
< fanquake>
wumpus/sipa can you block 3ggerman29 from GH. They are spamming addresses/comments.
< sipa>
give me an irc command to run
< fanquake>
sipa pretty sure you'll have to go to GitHub, don't think we've got a ban bot
< sipa>
oh!
< sipa>
sorry, i thought you meant someone spamming on irc
< sipa>
can't do so right now
< luke-jr>
lol
< fanquake>
sipa no worries
< BlueMatt>
luke-jr: wait, I'm confused, you want me to add rationale for why we care only about non-witness size and why we dont have similar concerns in wtxid merkle tree?
< luke-jr>
BlueMatt: no.. why we don't ONLY protect the witness tree, basically
< luke-jr>
(and probably good to mention why it doesn't matter?)
< Muoi>
I run a Bitcoin node, there is no IP inserted to tried table since December, is this a problem?
< gmaxwell>
Muoi: thats interesting. it's supposed to generate feeler connections, are you using -connect?
< Muoi>
No, I run a default node, with only some additional customized logging messages.
< gmaxwell>
current version?
< Muoi>
There is no feeler connection was made since 18 Dec 2018
< Muoi>
0.17.0
< gmaxwell>
Muoi: how many connections do you have up?
< Muoi>
Outgoing connections are always 8 or 9.
< Muoi>
"Potential stale tip detected, will try using extra outbound peer" was printed out frequently also
< gmaxwell>
that sounds like you're actually deadlocked.
< Muoi>
6-12 times per day
< gmaxwell>
Muoi: what is your most recent block?
< Muoi>
ATM 564704
< gmaxwell>
Thats currentish.
< Muoi>
yes, stale tip happens, but not every time the node checks, I think.
< Muoi>
For example, on 25 Feb 2019:
< Muoi>
2019-02-25T06:36:35Z Potential stale tip detected, will try using extra outbound peer (last tip update: 1849 seconds ago) 2019-02-25T09:03:36Z Potential stale tip detected, will try using extra outbound peer (last tip update: 2138 seconds ago) 2019-02-25T09:14:06Z Potential stale tip detected, will try using extra outbound peer (last tip update: 2768 seconds ago) 2019-02-25T11:20:08Z Potential stale tip detected, will try us
< gmaxwell>
well that will happen from time to time normally (when blocks are far apart) so it's not shocking but 12 times per day sounds like too much.
< Muoi>
Yep, I agree.
< Muoi>
But isn't that feeler connections should still happen?
< gmaxwell>
it should, indeed. it shoulds like a bug.
< Muoi>
It happens with my another node also. But after I reboot the node, it starts having feeler connections again.
< Muoi>
Seems like the SetTryNewOutboundPeer() is incorrectly set somewhere.
< gmaxwell>
My first guess would be to look for interactions with the stale tip logic.
< luke-jr>
has the node been restarted since this began happening?
< Muoi>
No, I have been running it non-stop. To clarify more, I run few nodes actually. Some of them have this problem.
< Muoi>
I restart only one node and problem is gone, so I wouldn't want to do it on other nodes.
< Muoi>
To trigger feeler connection, tip must not be stale and the number of outgoing connections must be >=8, right?
< Muoi>
But stale tip are detected few times per day, and number of outbound conn. is always 8. But no feeler connection.
< Muoi>
I have looked at the stale tip logic in the last 12 hours, couldn't figure it out so I brought the problem here.
< Muoi>
Btw, I would be very gladful if you guys can help to take a look at this problem. I can provide detailed debug.log, node specs, etc..
< gmaxwell>
Muoi: thanks someone will. If you can hang around (or leave an email). It would be helpful to open an issue too.
< gmaxwell>
(I'm not looking at it right now since it's late for me and I should be asleep)
< Muoi>
Okay sure, I will open an issue. FYI, my email is muoitran@comp.nus.edu.sg. Please feel free to drop an email so we can discuss more. Good night!.
< bitcoin-git>
[bitcoin] darosior opened pull request #15483: Adding a 'logpath' entry to getrpcinfo, as discussed in #15438 (master...getrpcinfo_logpath) https://github.com/bitcoin/bitcoin/pull/15483
< sdaftuar>
Muoi: hi, i'd like to investigate the p2p issue you're seeing (i wrote the code that seems to be involved). can you post your debug.log somewhere? if there might be sensitive information in it (eg wallet related things, ip addresses, etc) i'm most interested in just a few lines that might appear:
< sdaftuar>
lines involving "UpdateTip", "Potential stale tip", "disconnecting extra outbound peer", "keeping outbound peer"
< sdaftuar>
also "setting try another outbound peer"
< bitcoin-git>
[bitcoin] MarcoFalke closed pull request #13972: Remove 16 bits from versionbits signalling system (BIP320) (master...reservedbits2) https://github.com/bitcoin/bitcoin/pull/13972
< bitcoin-git>
[bitcoin] MarcoFalke reopened pull request #13972: Remove 16 bits from versionbits signalling system (BIP320) (master...reservedbits2) https://github.com/bitcoin/bitcoin/pull/13972
< phantomcircuit>
Muoi, possibly your system clock is broken
< sdaftuar>
phantomcircuit: it seems to be a bug in the addrman/net interaction, see #15484. can someone tag that with the Bug label? we should fix it...
< * gmaxwell>
grumbles about testing on cloud services.
< BlueMatt>
fails on the same line, too
< gmaxwell>
try again... :(
< BlueMatt>
yea, I bumped the time limit, lets see if it fixes it
< gmaxwell>
Unrelated, do we need to move off linuxfoundation lists? AFAICT the mailing list has been dysfunctional for some time and now isn't working at all.
< BlueMatt>
I am curious why that generated a failure, though, given it also fails on the pre-fork-changes commit which I dont see how it could possibly have a performance effect aside from running a different set of tests in parallell at the time
< jnewbery>
what's the pre-fork-changes commit?
< gmaxwell>
performance of travis executors seems to be extremely inconsistent.
< BlueMatt>
jnewbery: the commit that only changed the getblocktemplate/miner.cpp behavior
< BlueMatt>
(and added a new test)
< BlueMatt>
ie #15481
< gribble>
https://github.com/bitcoin/bitcoin/issues/15481 | Restrict timestamp when mining a diff-adjustment block to prev-600 by TheBlueMatt · Pull Request #15481 · bitcoin/bitcoin · GitHub
< gmaxwell>
kanzure: do you know whats up with the mailing list?
< sipa>
admin interface is down, i heard
< BlueMatt>
last I heard he said he'd "escalated" it
< BlueMatt>
so....who knows
< jnewbery>
and it hasn't failed on master recently?
< luke-jr>
gmaxwell: PM
< BlueMatt>
jnewbery: what hasnt?
< BlueMatt>
also, processnewblock_signals_ordering is busted on windows, but if I restart it it works again
< BlueMatt>
soooo ummm...thats bad
< kanzure>
gmaxwell: no, i don't know what's going on. it's just "internal server error" all the time.
< kanzure>
gmaxwell: i escalated to warren, who knows the linux foundation people
< jnewbery>
the test that's failing - feature_assumevalid.py ?
< jnewbery>
ah, I think I see what's happening
< jnewbery>
feature_assumevalid.py always takes a long time on travis. usually around 150 seconds
< jnewbery>
for me locally, it runs in less than 10
< jnewbery>
which presumably is hitting storage in AcceptBlock()
< jnewbery>
any disk activity is super slow in Travis
< jnewbery>
I don't think there's any way to speed it up. The test is feeding a long chain of blocks to a node where one of the early blocks has an invalid signature, and checking that the chain is accepted when assumevalid is set
< jnewbery>
I can't see any reason your change would make a difference there, unless it just happens to be running a different test that has disk i/o in parallel
< jnewbery>
Actually, I guess we don't need to feed it the full chain of blocks. As long as it accepts the invalid block (height 102), then it's testing the assumevalid behaviour
< bitcoin-git>
[bitcoin] adamjonas opened pull request #15485: add rpc_misc.py, mv test getmemoryinfo, add test mallocinfo (master...test_rpc_misc) https://github.com/bitcoin/bitcoin/pull/15485
< BlueMatt>
jnewyea, the new test *also* accepts like 2k blocks at once, so understandably it may interfere
< BlueMatt>
it appears to have been fixed by further increasing the timeout
< BlueMatt>
jnewbery:
< jnewbery>
Consider moving assumevalid.py and the new test to EXTENDED_SCRIPTS
< jnewbery>
that way they wouldn't get run on Travis
< sipa>
do the extended tests get run anywhere automatically?
< jnewbery>
no :(
< sipa>
could we enable then for travis master builds?
< jnewbery>
Travis is too slow for any tests that hit disk too much
< sipa>
(but not PRs?)
< sipa>
could MarcoFalke's Drahtbot run then at some regular interval? I don't know what hardware it runs on
< jnewbery>
Previously, a bunch more of the tests were EXTENDED, and they got moved to BASIC
< jnewbery>
the EXTENDED tests were run nightly on a travis cron job, with the exception of pruning and dbcrash
< jnewbery>
those were too slow to be run even nightly, because the travis job would time out
< gmaxwell>
how long until we ship a serious bug that the extended tests would have caught?
< jnewbery>
It would be nice to have a non-cloud CI
< jnewbery>
I think a long time. Only pruning and dbcrash are extended tests now
< jnewbery>
There are probably more critical paths that don't have any test coverage at all
< bitcoin-git>
[bitcoin] sdaftuar opened pull request #15486: [net] Allow feeler connections to go to the same netgroup as existing outbound peers (master...2019-02-addrman-collisions) https://github.com/bitcoin/bitcoin/pull/15486
< gmaxwell>
sdaftuar: dud you see the report last night that feelers were stopping?
< gmaxwell>
did
< gmaxwell>
oh you are fixing it
< sdaftuar>
:)
< gmaxwell>
ohh nice
< gmaxwell>
thats subtle
< BlueMatt>
damn, sdaftuar is smart...
< gmaxwell>
sorry I haven't looked this morning, been dealing with meatspace stuff. but I probably wouldn't have figured that out anyways.
< sdaftuar>
luckily the reporter's logging made this easier to find... our default logging doesn't provide the ip you're colliding with
< gmaxwell>
Your change seems reasonable but it also seems like a really indirect fix.
< sdaftuar>
yeah i agree. i haven't quite wrapped my head around the addrman well enough to propose a fix there
< gmaxwell>
I think the tried buffer should have a maximum size and random forgetting if it overfills.
< gmaxwell>
I don't think addrman's behavior needs to change.
< sdaftuar>
i think it does? at least, from reading the original PR it seems to
< sdaftuar>
the issue is that you might not have any new collisions for a long time
< sdaftuar>
so you have just one tried entry, which you have to try each time
< sdaftuar>
one tried collision entry i mean
< sdaftuar>
but then we fail because it's in the same netgroup
< sdaftuar>
so we could bail out differently i guess and pull from regular Select
< gmaxwell>
making it unconditionally remove when it makes use of it would be sufficient
< sdaftuar>
well, that's trickier to write i think -- i believe we use it asynchronously, so if we drained it there we'd need to put it somewhere else so that we could resolve the collision later the right way
< gmaxwell>
I don't think it's critical that absolutely always retry an evicted try.
< gmaxwell>
just so long as an attacker can't cause us to never try.
< sdaftuar>
i actually don't recall why we do it this way in the first place... the concern is around an attacker getting us to remove things from the tried table that we shouldn't?
< sdaftuar>
don't we use some kind of salted hashing or something to make that hard to do?
< * sdaftuar>
hopes ethan will pop in and fix this in some simple way
< gmaxwell>
sdaftuar: kind of, basically it's just creating an anti-eviction bias. In simulation it helps a lot.
< gmaxwell>
we resist the attacker flooding our table, but can't block it completely.
< gmaxwell>
So we essentially give anything that was already in there a second chance.
< sdaftuar>
oh, so if we looked at collisions as "if we're not sure the existing entry is bad, keep it" that's probably fine?
< gmaxwell>
like: we want to evict old tried data in favor of new known to be working stuff... but only because the old tried data is old and may no longer be a working node. So the idea behind try before evict is to resolve that "we don't know".
< gmaxwell>
If we knew an entry in tried was working we'd prefer to never evict it in favor of something new.
< sdaftuar>
(from ethan, pasting from a slack channel at his request)
< sdaftuar>
Ethan Heilman [3:57 PM]
< sdaftuar>
We make the following assumption: The addresses in the tried table contain some non-malicious addresses because if they were all malicious controlled addresses the attack has already won and you would not learn about non-malicious addresses. Based on this assumption you don't want to evict an address from the tried table if that address is still up and connectable.
< sdaftuar>
Ethan Heilman [3:57 PM]
< sdaftuar>
So we never evict an address without trying it
< sdaftuar>
exactly
< sdaftuar>
Ethan Heilman [3:59 PM]
< sdaftuar>
I haven't read it closely but PR #15486 seems like a reasonable fix at face value.
< gribble>
https://github.com/bitcoin/bitcoin/issues/15486 | [net] Allow feeler connections to go to the same netgroup as existing outbound peers by sdaftuar · Pull Request #15486 · bitcoin/bitcoin · GitHub
< sdaftuar>
The second role of feeler connections is testing addresses in new and adding them to tried
< sdaftuar>
both of these purposes don't need to care about the group the address is from
< gmaxwell>
right but is the fix complete? I'm concerned that if there is a persistant issue with one of the addresses that we'll get stuck on it.
< gmaxwell>
like, do we correctly count it as unreachable under all possibility failure conditions... clearly not currently.
< gmaxwell>
An alternative to your fix would be counting the grouping failure as failed, that would be dumb because feelers shouldn't be obeying the network group rule regardless.. but it would also be a fix.
< ethan_>
We can enumerate the ways we believe it can fail and then cover all those cases or mark it with a number of attempts uint and if that number is greater than 3 remove it
< gmaxwell>
My thought was just anytime we attempt to try we should remove it from the list. Maybe this means some spurrious removals will happen, but I don't think that matters.
< gmaxwell>
like if 1:50 try before evicts didn't really try but evicted anyways, ... so what?
< gmaxwell>
network connections fail by chance sometimes too.
< sdaftuar>
my thought was that if something has been in the m_tried_collisions for too long, we just remove it?
< sdaftuar>
basically add one extra condition in ResolveCollisions() where after 20 minutes or something we give up on an eviction attempt.
< ethan_>
that seems reasonable
< sdaftuar>
or do the eviction anyway, not sure which way makes more sense
< ethan_>
you should evict if you are failing to connect or resolve the goodness of the to-be-evicted address
< ethan_>
if you can't test an IP and the reason isn't because of its group, then you aren't going to be able to connect to the IP anyway and it should be evicted
< sdaftuar>
ethan_: gmaxwell: i updated #15486 with a timeout in ResolveCollisions
< gribble>
https://github.com/bitcoin/bitcoin/issues/15486 | [net] Allow feeler connections to go to the same netgroup as existing outbound peers by sdaftuar · Pull Request #15486 · bitcoin/bitcoin · GitHub
< MarcoFalke>
So it is not recommended to post to the mailing list right now, because non-approved people can not post to it?
< sipa>
i think nobody is approved
< parmado>
i came in as a guest, which is an option
< luke-jr>
parmado: the channel is intentionally open to the public; however, if you have questions about Bitcoin, #bitcoin is usually a better place to ask
< parmado>
thanks. is spectating okay for the time being?
< sipa>
of course!
< parmado>
thanks.
< parmado>
is the consensus here that bitcoin will dominate amongst electronic assets, or is there substantial disagreement on that?
< sipa>
parmado: #bitcoin
< gwillen>
hmm, I just rebased a feature branch, and now loading my testnet wallet is warning me: “Warning: Error reading ! All keys read correctly, but transaction data or address book entries might be missing or incorrect.”
< gwillen>
I added some debug prints, and I am seeing errors when loading an object of strType="keymeta", and I am seeing lots of them, and all of them are caught exceptions (meaning that strErr="" so nothing useful prints)
< gwillen>
achow101: I think you touched this recently? Should I be worried either that I did something wrong, or that you did?
< gmaxwell>
ethan_: the only kind of case where I think it might make sense to not always remove a connection from that list is when we're offline.
< gmaxwell>
but we don't really know when we're offline in any case..
< gmaxwell>
sdaftuar: you should probably add a reported by credit to the feelerfix, it's esp to do when the report was good enough to make fixing it easier/possible.
< sipa>
credit?
< sipa>
oh!
< sipa>
a "reported by" credit, not a reported "by credit"
< gmaxwell>
indeed.
< promag_>
sipa: with #15402 it's possible to have concurrent invalidateblock right?