< GitHub4> [bitcoin] nomnombtc opened pull request #8608: Install manpages via make install, also add some autogenerated manpages (master...man_automake2) https://github.com/bitcoin/bitcoin/pull/8608
< btcdrak> is there a way to sign each commit during an interactive rebase?
< * luke-jr> peers at rebroad.
< roasbeef> btcdrak: signoff-rebase = "!GIT_SEQUENCE_EDITOR='sed -i -re s/^pick/e/' sh -c 'git rebase -i $1 && while git rebase --continue; do git commit --amend -S --no-edit; done' -"
< roasbeef> btcdrak: stuff that into an alias in your .gitconfig
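roasbeef's one-liner can be installed from the shell instead of editing .gitconfig by hand; a sketch (non-interactive shell assumed, since `!` inside double quotes would trigger history expansion in an interactive bash):

```shell
# Sketch of installing the alias above via git config.
# GIT_SEQUENCE_EDITOR rewrites every "pick" into "e" (edit) so the rebase stops
# at each commit; the loop then re-signs each stop with --amend -S and continues.
git config --global alias.signoff-rebase \
  "!GIT_SEQUENCE_EDITOR='sed -i -re s/^pick/e/' sh -c 'git rebase -i \$1 && while git rebase --continue; do git commit --amend -S --no-edit; done' -"
# usage (illustrative base commit): git signoff-rebase HEAD~3
```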
< luke-jr> hm, I would have thought verifying blocks was a read-only operation, but it seems to be writing about as much as it's reading?
< luke-jr> more actually
< gmaxwell> should be much more,
< gmaxwell> if it's reading your dbcache is too small.
< luke-jr> what's it writing?
< gmaxwell> the chainstate.
< wumpus> right, it's not the verification that causes writing, but *accepting* the blocks and updating the state
< wumpus> rejecting blocks should be a read-only operation
< wumpus> as well as verifying transactions outside blocks
< wumpus> gmaxwell: re: BU removing outward connection limit, sigh, it was to be expected somehow, you can't stop anti-social people by just telling them not to do a certain behavior, we'll need explicit anti-DoS measures for that
< wumpus> such a system could block spy-nodes connecting to everyone in one go
< wumpus> that there is no damn reason to connect to more nodes also won't stop anyone
< wumpus> 'look mom we're real anarchists, we can misbehave on an open P2P network!'
< midnightmagic> any such mechanism if successful would at least in the short term eradicate blockchain.info's spying, as well as the various chainalysis mechs.
< wumpus> their name, their whole raison d'être, is 'remove limits', so they removed another limit. It doesn't matter whether it had a purpose, they've removed a few lines of very limiting-feeling code, and now they feel happy
< wumpus> pre-emptively filling all connection slots would also prevent blockchain.info's spying :-)
< midnightmagic> :-)
< wumpus> if this escalates and there is no solution, in the longer run it will turn the P2P network into a ghetto, and I'd expect it to transition to more of a F2F network w/ authenticated connections
< wumpus> but also a few more mass-connectors likely won't break the network, it just depends on the scale
< midnightmagic> Bitcoin Classic -- DDoS'ing the Bitcoin network since 2016.
< midnightmagic> I wonder if someone could get gavin to scold them for that
< midnightmagic> or. Unlimited. Same diff.
< wumpus> it reminds me a bit of the seeders-versus-leechers discussion for bittorrent, with the difference that 'defecting' there is advantageous to the leecher; here it's just about being a jerk just because
< wumpus> though connecting to lots of nodes can get you blocks fractionally faster, which is useful for miners, you can accomplish the same thing by listening and advertising
< wumpus> you can set a virtually unlimited number of incoming connection slots, then by having a node with a high uptime, it will drift up in the DNS seeder rankings, which means its address will be dealt out more often, resulting in more connections to other nodes. It's a slower process but the social thing to do.
< luke-jr> shouldn't those blocks (and the chainstate) already be correct? O.o
< wumpus> but blocks are a delta to the chainstate
< wumpus> if you accept a block, by definition, you have to update the chainstate. And undo files are produced, too, a reverse-delta just in case.
< luke-jr> so it's actually rewinding and re-accepting them? somehow I thought it just verified the current state was sane
< gmaxwell> wumpus: what you say is true, but its much harder to deal with abusive behavior when you also have many more 'honest' people also being abusive out of ignorance. (in part, because the motivational structure is different; e.g. we can discourage spy nodes by reducing information leaks, moving more users into tor, etc.)
< wumpus> luke-jr: no, just applying the block results in writes to the utxo state, to mark outputs as spent, add new outputs. it rewinds only on reorg
< gmaxwell> (But ignorance, "I'm gonna help the network out by connecting to EVERYTHING!" ... is harder to resolve by rational measures)
< luke-jr> wumpus: I'm talking about the verifying blocks at startup.. checkblocks/checklevel
< wumpus> luke-jr: ohhh! that wasn't clear to me.
< wumpus> luke-jr: no, that should be read-only
< luke-jr> sorry
< luke-jr> hmm
< wumpus> gmaxwell: sure, at least ignorance can be improved by trying harder to inform people about things
< wumpus> people may assume 'moaaarrrr outgoing connections is better' unless it's explained, e.g. in user interfaces, blog posts, that it's bad for yourself as well as others, with alternatives for how to get more connections. Sure, some people will ignore it, or be jerks, but hopefully a minority, and most will heed it.
< wumpus> luke-jr: the rewinding with checklevel 3 completely happens in memory, if it is writing anything to disk that'd be very wrong
< wumpus> (there could be a bug of course....)
< luke-jr> having trouble reproducing now. flushed Linux's disk caches though, and even with just reading, it's going super-slow :|
< luke-jr> (as in 1% every 2 minutes or so, ETA 3 hours at this rate)
< * luke-jr> ponders if iotop would report writing if it was swapping other processes to disk to do caching for us
< wumpus> yes it reports swapping as writing, but wouldn't account swapping of *other* processes to disk to bitcoind
< wumpus> IIRC there's a kswapd that gets all the blame for swapping
< luke-jr> hmm
< luke-jr> it did pick up speed and finished in 15 mins (just now) fwiw
< luke-jr> didn't see any writing this time either
< wumpus> phew
< luke-jr> but I guess I did manage to reproduce https://www.reddit.com/r/Bitcoin/comments/4zrxs1/qtcore_client_taking_ages_to_start/ after all
< luke-jr> just a matter of dropping caches
< luke-jr> gmaxwell: can you confirm on your slow laptop? echo 3 > /proc/sys/vm/drop_caches
< wumpus> usually if the client takes ages to start there's a backlog of blocks to verify
< sipa> question: how many times has the initial verification at startup actually caught corruption
< sipa> nobody knows, of course
< wumpus> I don't know - but I think it would be just as effective to just check a few blocks
< luke-jr> sipa: not sure if it's still a problem, but every time when we had those powerfail-corrupts-db problem? (unless that was caught by something else?)
< wumpus> the latest blocks are the most likely to be corrupt and below that it drops off
< sipa> luke-jr: those result in a leveldb checksum error
< * luke-jr> didn't realise the slowish startup time was because he had checkblocks=4 checklevel=6 in his bitcoin.conf
< sipa> there are only 4 checklevels :)
< wumpus> luke-jr: same here - I think the default checkblocks should be much lower
< luke-jr> well, if there's ever more added, I'm ready! :P
< wumpus> a as-thorough-as-possible check on just a few blocks
< sipa> and 4 blocks is not very much
< wumpus> well, take 10 then
< luke-jr> sipa: my PC is on UPS ;)
< sipa> wumpus: jonasschnelli has a patch to switch to txcount based limiting
< luke-jr> hmm, that's an interesting idea.
< wumpus> that's good, but I think the effective default check depth should also be lowered, don't know if it does that
< wumpus> or maybe write a flag on 'clean' shutdown, and do a reduced check in that case?
< luke-jr> set it for the equivalent checkblocks back when it was introduced? :p
< sipa> wumpus: it sets the default to 100000 txn
< sipa> i guess it can be lower even
< wumpus> sipa: ok
< sipa> but it also insists on at least 6 blocks
< gmaxwell> we need to do something about the startup check.
< gmaxwell> unfortunately just checking a couple blocks is not much of a test, but anything more takes too long for most people.
< sipa> wumpus: interesting idea
< luke-jr> I wonder how difficult it would be to background it
< gmaxwell> wumpus: neat idea!
< gmaxwell> when we were having windows corruption, what was needed to reliably detect that?
< wumpus> just opening the leveldb IIRC
< btcdrak> roasbeef: thanks!!
< wumpus> or maybe the first access. No thorough checking was necessary
< * sipa> suggests: 10000 txn (which corresponds to ~6 blocks currently)
< wumpus> sounds good to me
< wumpus> it's very helpful that leveldb has its own corruption detection here
< gmaxwell> why bother with the txn count?
< gmaxwell> just set it to 6 blocks?
< wumpus> because that auto-adapts
< gmaxwell> so? other than segwit the txn count of blocks won't change much.
< wumpus> if blocks grow again, it won't become slower
< wumpus> heh yes that's a completely different discussion
< sipa> gmaxwell: if you are early in IBD you probably want more blocks
< gmaxwell> sipa: okay, I'll buy that.. but kind of a corner case.
< wumpus> but the fact is that transaction count is a better measure of run time
< gmaxwell> I was at the verge of suggesting this just do a single block.
< wumpus> and amount of data checked
< sipa> heh, fine by me as well
< gmaxwell> It's not yielding a high payoff in detected errors; the ones I know it detected (chainstate version corruption) need checking back to your last restart to reliably report
< wumpus> well the first priority is to make the insane wait time go away
< wumpus> whether the new number becomes low or ultra-low is less important :)
< gmaxwell> One or two block keeps the code live and working, I think this code is useful around refactorings and changes to the code.
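For reference, the startup-check knobs being discussed map directly to bitcoin.conf settings; a sketch with the values from this conversation (not the eventual defaults):

```
# bitcoin.conf sketch -- values from the discussion above, not authoritative
checkblocks=6    # number of recent blocks to verify at startup
checklevel=3     # 0-4; level 3 rewinds the checked blocks entirely in memory
```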
< btcdrak> i guess the reply to Roger's testnet response is no, BU runs in production and is a menace to the network
< btcdrak> BU has no peer review and no chance in hell of being used for real
< btcdrak> wumpus: rebased #7562
< sipa> btcdrak: what was the problem with the tests?
< btcdrak> gmaxwell: can you explain a bit more about your suggestion regarding my attempt at extracting the MacOSX sdk on linux?
< btcdrak> sipa: changing the default tx version affected a couple of tests that were comparing hash values, or sorting txs by hash. obviously the hashes change when tx version is bumped. Those seem innocuous. There is one test however that I don't understand, which I commented on. Not sure why it is affected by changing the version number.
< btcdrak> I don't know if it's the tests' fault and innocuous, or revealing an issue.
< gmaxwell> btcdrak: so, first someone extracts it via OSX. Then we take the extracted binary and find all the offsets for its data in the decompressed file.
< gmaxwell> we can distribute the offset list.
< gmaxwell> actually, a tool like xdelta or another binary diff tool might just handle it.
< sipa> but the dmg file is compressed
< sipa> can we decompress it?
< sipa> without "installing", that is
< gmaxwell> it's a 7z file according to drak
< gmaxwell> so I was assuming that.
< btcdrak> gmaxwell: one thing I am not sure about is if 7z is actually extracting it correctly. It seems to be, and the bug seems more like in the linux implementation of hfsplus; but it is feasible that the compression algo was tweaked and 7z is unaware
< gmaxwell> unlikely, 7z has checksums.
< sipa> so we'd xdelta the 7z-decompressed dmg with the resulting compressed .tar?
< gmaxwell> uncompressed tar.
< gmaxwell> oh what we want is a tgz.
< gmaxwell> yes.
< gmaxwell> so the file we normally get out of it.
< gmaxwell> and if the resulting delta is small, we call it done.
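The flow gmaxwell describes can be sketched end to end; a toy round trip with stand-in files for the (non-redistributable) DMG extract and SDK tar, assuming xdelta3 is installed:

```shell
# Hedged sketch of the xdelta plan; filenames and contents are illustrative.
command -v xdelta3 >/dev/null || { echo "xdelta3 not installed, skipping"; exit 0; }
cd "$(mktemp -d)"
head -c 65536 /dev/urandom > dmg-extract.bin               # stand-in for the 7z-extracted DMG
{ cat dmg-extract.bin; echo "sdk payload"; } > sdk.tar     # tar sharing most bytes with it
xdelta3 -e -f -s dmg-extract.bin sdk.tar sdk.xdelta        # encode: distributable (small) delta
xdelta3 -d -f -s dmg-extract.bin sdk.xdelta sdk.out        # decode: DMG + delta => the tar
cmp sdk.tar sdk.out && echo "roundtrip ok"
```

Anyone who already extracted the DMG themselves could then reconstruct the exact tar from the delta alone.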
< luke-jr> btcdrak: what version of 7zip did you use?
< sipa> btcdrak: where is hfsplus used?
< btcdrak> luke-jr: version 9.20
< luke-jr> hmm, it tells me: Error: Can not open file as archive
< luke-jr> oh, my DMG file is truncated
< btcdrak> so basically the files we want are in Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk, but using the method above you get a bunch of empty files. On a Mac you run "tar -C /Volumes/Xcode/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/ -czf MacOSX10.11.sdk.tar.gz MacOSX10.11.sdk"
< sipa> but we don't need all of the contents of that dmg, right?
< luke-jr> not nearly
< luke-jr> but it's copyrighted and non-redistributable :<
< btcdrak> sipa right, we just need the files from Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/* which we turn into a tar.gz
< sipa> i see
< btcdrak> That gives us 47MB tar.gz :-p
< btcdrak> we drop that in the gitian-builder/inputs/ folder when performing the Gitian rituals.
< sipa> got it
< * luke-jr> waits on copy over LAN.. Xcode_7.3.1.dmg 27% 1383MB 6.5MB/s 09:08 ETA
< sipa> 14m here
< luke-jr> you and your fast internet :x :p
< sipa> luke-jr: this is over t-mobile roaming data
< luke-jr> sadly, T-Mobile data here seems to only be about 5 Mbps
< gmaxwell> 2hr 49m remaining here.
< btcdrak> RIP sipa's roaming data charges
< sipa> btcdrak: it's free
< sipa> it's effectively cheaper than my swiss wired or wireless internet
< sipa> s/free/fixed price/
< gmaxwell> btcdrak: a while back tmobile ceo put out some open letter about "data abusers" and for some reason it said, 'mining bitcoin' -- for a while we wondered if perhaps he was talking about blockstream. https://newsroom.t-mobile.com/news-and-blogs/stopping-network-abusers.htm
< gmaxwell> apparently blockstream was using several TB a month on tmobile.
< gmaxwell> doesn't everyone run a full node on their phone?
< luke-jr> lol, xdelta is 45 MB
< sipa> :(
< sipa> luke-jr: how did you decompress the dmg?
< luke-jr> 7z
< gmaxwell> lol
< gmaxwell> might need more options for it to actually find the match
< luke-jr> maybe there's some tweaking to the tar we need to put it in a better order?
< gmaxwell> yes, that too.
< luke-jr> oh wait
< luke-jr> hmm, nm
< luke-jr> the uncompressed tar is 306 MB FWIW
< luke-jr> but xdelta's patch is compressed.
< gmaxwell> were you using the compressed tar as the input?
< luke-jr> no
< sipa> hpmount, mount -t hfsplus, 7zr... all fail to open the file
< gmaxwell> maybe we should just encrypt the damn tar with the whole dmg file as the key, and then call that sufficient for legal purposes. :P
< luke-jr> sipa: need 7z, not 7zr
< luke-jr> 7zr has formats removed
< luke-jr> gmaxwell: hmm, I wonder if that'd be okay
< luke-jr> I turned off xdelta's patch compression and it's 261 MB :/
< gmaxwell> smaller than 300 so it found something.
< gmaxwell> luke-jr: I think doing that would be defensible.
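gmaxwell's encrypt-with-the-DMG-as-key idea, as a toy sketch with stand-in files (the real inputs would be Xcode_7.3.1.dmg and MacOSX10.11.sdk.tar.gz; deriving the key as a SHA-256 of the DMG is an illustrative choice, not a settled design):

```shell
# Hedged sketch: only someone who already holds the DMG can derive the key.
cd "$(mktemp -d)"
echo "pretend dmg contents" > Xcode.dmg      # stand-in for the real DMG
echo "pretend sdk tar"      > sdk.tar.gz     # stand-in for the SDK tarball
KEY=$(sha256sum Xcode.dmg | cut -d' ' -f1)   # key = hash of the whole DMG
openssl enc -aes-256-cbc -pbkdf2 -pass "pass:$KEY" -in sdk.tar.gz -out sdk.enc
openssl enc -aes-256-cbc -pbkdf2 -d -pass "pass:$KEY" -in sdk.enc -out sdk.dec
cmp sdk.tar.gz sdk.dec && echo "roundtrip ok"
```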
< luke-jr> xdelta3: warning: input position 310378496 overflowed instruction buffer, needed 43091 (vs. 32768), consider changing -I
< gmaxwell> consider changing -I
< * luke-jr> sets -I to unlimited
< luke-jr> not doing much better :<
< luke-jr> -rw-r--r-- 1 luke-jr luke-jr 261M Aug 27 09:24 MacOSX.xdelta
< gmaxwell> there are some other binary patching tools, xdelta was the first that came to mind.
< sipa> xdelta and xdelta3 seem to be different things
< gmaxwell> open-vcdiff
< luke-jr> I'm writing something custom
< luke-jr> hmm
< luke-jr> I can't mmap this DMG :<
< btcdrak> well surely the better solution would be to fix the hfsplus driver?
< gmaxwell> luke-jr: upgrade to a darn 64 bit OS already luke!
< luke-jr> oh, if I build with -static I can
< sipa> i can't get it below 46 MB
< sipa> that's with
< sipa> xdelta3 encode -P $((2**30)) -I 0 -B 2147483648 -W 16777216
< sipa> (the highest values for each of the limits that work)
< sipa> it runs too fast, though
< sipa> this is an 8 GB source file, and it only takes a few seconds
< gmaxwell> "I demand my np-complete problems solving software be slow"
< sipa> i demand it actually uses the whole input
< gmaxwell> try open-vcdiff?
< sipa> 42 MB
< sipa> gmaxwell: seems that uses the same algorithm as xdelta3
< gmaxwell> failure?
< luke-jr> http://codepad.org/Yi5FEWML <-- any obvious bugs or possible optimizations?
< btcdrak> also, I tried the dmg2img tool, but that doesn't appear to have been updated to handle the altered compression
< GitHub180> [bitcoin] ajtowns closed pull request #8575: leveldb: generate lib independent of locale sort (master...leveldb-locale-reproducible) https://github.com/bitcoin/bitcoin/pull/8575
< luke-jr> welp, failed to find the first 82 byte file :/
< gmaxwell> luke-jr: well the needles may not be in the file in a contiguous chunk.
< luke-jr> gmaxwell: surely 82 bytes would be :<
< gmaxwell> unless the file system stores the first n bytes of files in the inode table.
< luke-jr> hmm
< gmaxwell> (so that magic is fast)
< gmaxwell> or just to prevent the indirection
< luke-jr> would it make sense to target the 2nd 4096 bytes of the file then?
< gmaxwell> to test your tool grab the second 4096 bytes of the haystack.
< gmaxwell> but yes, you could try that.
< luke-jr> http://codepad.org/VvfwXbqo isn't getting anything so far either
< luke-jr> you mean of the needle, right?
< gmaxwell> I mean grab a 4096 byte chunk out of the data you are searching in, just to verify your tool works.
< luke-jr> oh
< gmaxwell> dd if=5.hfs of=junk bs=4096 skip=1234 count=1
< luke-jr> "In Mac OS X Snow Leopard 10.6, HFS+ compression was added. In open source and some other areas this is referred to as AppleFSCompression. Compressed data may be stored in either an extended attribute or the resource fork."
< luke-jr> I bet this is what we're dealing with :|
< sipa> seems likely, indeed
< gmaxwell> bleh
< luke-jr> http://www.spinics.net/lists/linux-fsdevel/msg55545.html anyone want to implement? :/
< sipa> leveldb 1.19 was released
< sipa> 52 files changed, 1976 insertions(+), 429 deletions(-)
< sipa> (compared to 1.18)
< gmaxwell> doesn't sound too bad to review.
< sipa> they added a cache pruning
< gmaxwell> doesn't sound especially relevant then.
< sipa> it's been 2 years since their previous release
< sipa> glad to see activity :)
< sipa> ARM64 support for memory barriers seems relevant
< sipa> cache size estimation
< gmaxwell> oh the arm64 might fix performance on odroid c2
< sipa> this is interesting but not included: https://github.com/google/leveldb/pull/309
< gmaxwell> wumpus started benchmarking what that might look like.
< gmaxwell> odroid c2 has a crc32c accelerator that is suitable too.
< luke-jr> found implementation of HFS+ compression
< sipa> where?
< luke-jr> SleuthKit
< * luke-jr> suggests we drop manpages from the tarball
< luke-jr> xz -d <inodes.xz | while read inode filename; do filename="y/$filename"; p=$(dirname $filename); mkdir -vp $p; icat ../5.hfs $inode >"$filename"; done
< * luke-jr> tests result in gitian
< luke-jr> the tarball is missing a bunch of stuff that looks like dummy files (but still 66 MB)
< sipa> luke-jr: nice!
< luke-jr> FWIW, I generated the inode list using mount and stat :p
< * luke-jr> would be surprised if SleuthKit didn't have a way to get that info, but didn't see it
< luke-jr> fls seems to be the tool
< sipa> sleuthkit added support for HFS+ in version 3.1
< luke-jr> I have 4.0.2
< sipa> ubuntu 12.04 has sleuthkit 3.2.3
< sipa> so i think we're good
< luke-jr> oh.
< luke-jr> except gitian doesn't like it :<
< sipa> due to the missing manpages...?
< luke-jr> hmm
< luke-jr> I think symlinks
< sipa> does it do a checksum check perhaps on the .tgz?
< luke-jr> can't find the lib for c++
< luke-jr> it can't, everyone's .tgz is different ;p
< luke-jr> yeah, I think it's missing symlinks
< luke-jr> fls ../5.hfs -rpF 154283 | perl -nle 'm/^(r|l)\S*\s(\d+)\:\s*(.*$)/ && print "$1 $2 $3"' | while read type inode filename; do filename="MacOSX10.11.sdk/$filename"; mkdir -p "$(dirname "$filename")"; if [ "$type" = "l" ]; then ln -s $(icat ../5.hfs $inode) "$filename"; else icat ../5.hfs $inode >"$filename"; fi; done
< luke-jr> tempting to figure out some kind of trick to download data on demand for this :P
< luke-jr> hm, surprised there's no single-file curl FUSE fs
< luke-jr> gitian build matches
< sipa> \o/
< btcdrak> wow
< luke-jr> ?
< CodeShark> I think btcdrak just considers perl code to be inherently aesthetically pleasing :p
< luke-jr> lol
< luke-jr> latest draft has no perl sadly: http://codepad.org/1wMp5vse
< luke-jr> I'll wait until I'm actually awake to turn it into a PR
< luke-jr> -rw-r--r-- 1 luke-jr luke-jr 21M Aug 27 12:33 MacOSX10.11.sdk.tar.gz
< luke-jr> night
< phantomcircuit> luke-jr, the client can take ages to start if you closed it after you downloaded a bunch of blocks but before you processed them
< phantomcircuit> cause it processes all of them in AppInit2
< phantomcircuit> (actually it doesn't anymore so this shouldn't be an issue in 0.13.x)
< sipa> leveldb 1.19 should start up significantly faster
< phantomcircuit> oh and leveldb reads its journal which is potentially very slow also
< GitHub3> [bitcoin] sipa opened pull request #8610: Share unused mempool memory with coincache (master...sharemem) https://github.com/bitcoin/bitcoin/pull/8610
< GitHub62> [bitcoin] sipa opened pull request #8611: Reduce default number of blocks to check at startup (master...fastcheck) https://github.com/bitcoin/bitcoin/pull/8611
< phantomcircuit> sipa, #8610 instead of doing that can you add a framework for limiting memory globally?
< GitHub51> [bitcoin] sipa opened pull request #8612: Check for compatibility with download in FindNextBlocksToDownload (master...fixwitban) https://github.com/bitcoin/bitcoin/pull/8612
< sipa> phantomcircuit: i don't have 5 years
< phantomcircuit> it can probably even be as simple as a global memory limit goal and percentages for now
< sipa> percentages isn't good enough
< sipa> you need something that detects "oh, the mempool actually does not need its maximum usage... let's move some of its allocation elsewhere"
< sipa> but can change that back when the mempool grows
< phantomcircuit> percentages and an atomic of how much everything thinks it's using?
< sipa> what does that solve?
< phantomcircuit> "usage is below 90% of limit i can exceed my limit"
< phantomcircuit> i guess you need callbacks to apply memory pressure then
< sipa> yeah
< phantomcircuit> but you need that for sharing memory at all
< sipa> utxo set and mempool are things that are constantly checked anyway
< sipa> what this PR implements is mempool < maxmempool && coincache + mempool < maxtotal
< sipa> so arguably it already has something like a global limit (though it's just coincache + mempool)
< sipa> the mempool has complicated semantics wrt computing relay fee based on its size
< sipa> making its actual limit dynamic is harder, i think
< sipa> also, i don't think it's needed, for example, to give your mempool 10 GB of memory even if you have 64 GB available... that'll just make your mempool a sewer for spam that nobody else on the network accepts
< GitHub42> [bitcoin] sipa opened pull request #8613: [preview] LevelDB 1.19 (master...leveldb119) https://github.com/bitcoin/bitcoin/pull/8613
< gmaxwell> sipa: I think we should backport feeler connections. Beyond the security/robustness improvement, they'll help reduce the harm of network density loss w/ segwit.
< GitHub126> [bitcoin] luke-jr opened pull request #8617: Include instructions to extract Mac OS X SDK on Linux using 7zip and SleuthKit (master...gitian_osx_extractor) https://github.com/bitcoin/bitcoin/pull/8617
< luke-jr> sipa: so have you confirmed already that LevelDB 1.19 is reasonably certain to not have forking changes? or are you assuming we'll do that collectively before merging/releasing?
< sipa> luke-jr: i'm reasonably certain, yes, but i encourage review
< sipa> i'm just PRing it to bitcoin already to make testing easier