#bitcoin-core-dev on 2016-08-30 — searchable irc log

02:18 < luke-jr> 9mo old said her first non-mama/papa word: "dot" [dot dot]

03:06 < GitHub7> [bitcoin] isle2983 opened pull request #8625: [doc] - clarify statement about parallel jobs in rpc-tests.py (master...rpcTestsDoc) https://github.com/bitcoin/bitcoin/pull/8625

03:25 < jeremyrubin> luke-jr: ls

03:25 < jeremyrubin> oops

03:25 < * jeremyrubin> shameful

03:31 < isle2983> usually they start with 'grep' before getting into directory navigation...

03:35 < luke-jr> jeremyrubin: ls: cannot open directory .: Transport endpoint is not connected

05:51 < jeremyrubin> luke-jr: I was going to ask you a question because I thought it was something you had worked on, but it wasn't. Forgot to switch tabs before typing. At least I wasn't sudo'ing ;)

05:51 < luke-jr> :P

05:51 < jeremyrubin> Anyways; what I was going to ask generally is about how std::thread is used currently in core

05:52 < jeremyrubin> I can't seem to get it to properly link or something in wine

05:52 < jeremyrubin> (in use on my own code)

05:52 < jeremyrubin> but on master it is already in use in httpserver.h

06:02 < GitHub173> [bitcoin] netsafe opened pull request #8626: Berkeley DB v6 compatibility fix (master...netsafe-patch-1) https://github.com/bitcoin/bitcoin/pull/8626

06:06 < GitHub184> [bitcoin] MarcoFalke pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/89de1538ce1f...c01a6c48b982

06:06 < GitHub184> bitcoin/master 1467561 isle2983: [doc] - clarify statement about parallel jobs in rpc-tests.py

06:06 < GitHub184> bitcoin/master c01a6c4 MarcoFalke: Merge #8625: [doc] - clarify statement about parallel jobs in rpc-tests.py...

06:06 < GitHub194> [bitcoin] MarcoFalke closed pull request #8625: [doc] - clarify statement about parallel jobs in rpc-tests.py (master...rpcTestsDoc) https://github.com/bitcoin/bitcoin/pull/8625

11:38 < GitHub114> [bitcoin] laanwj pushed 5 new commits to master: https://github.com/bitcoin/bitcoin/compare/c01a6c48b982...7b9889586501

11:38 < GitHub114> bitcoin/master eda4cfb Andrew Chow: Create an easy to use gitian building script...

11:38 < GitHub114> bitcoin/master 498d8da Andrew Chow: Check for OSX SDK

11:38 < GitHub114> bitcoin/master 6ffd6b4 Andrew Chow: Create option to detach sign gitian builds and not commit the files in the script...

11:38 < GitHub123> [bitcoin] laanwj closed pull request #8566: Easy to use gitian building script (master...gitian-build-script) https://github.com/bitcoin/bitcoin/pull/8566

11:39 < GitHub4> [bitcoin] laanwj pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/7b9889586501...2b23dbaee5b8

11:39 < GitHub4> bitcoin/master 203f212 Pieter Wuille: Reduce default number of blocks to check at startup

11:39 < GitHub4> bitcoin/master 2b23dba Wladimir J. van der Laan: Merge #8611: Reduce default number of blocks to check at startup...

11:39 < GitHub82> [bitcoin] laanwj closed pull request #8611: Reduce default number of blocks to check at startup (master...fastcheck) https://github.com/bitcoin/bitcoin/pull/8611

11:40 < jonasschnelli> What does "2016-08-29 20:22:53 socket send error Bad file descriptor (9)" mean? Running out of file descriptors?

11:41 < jonasschnelli> One of my local nodes ran into this overnight

11:46 < wumpus> I don't think it's that, '9' isn't really a high number

11:46 < wumpus> could be a use-after-close of some kind

11:47 < jonasschnelli> It resulted in a shutdown at least

11:47 < jonasschnelli> Also has bad allocs on the same machine.. could be memory related, though it's a brand new computer (means nothing, I know)

11:47 < wumpus> ugh :/

11:48 < jonasschnelli> DDR3L ram

11:48 < wumpus> yes, memory corruption could definitely result in this, maybe the fd field was overwritten

11:48 < wumpus> what do you exactly mean by 'bad allocs'?

11:49 < jonasschnelli> bitcoind crashed with a std::exception bad alloc (I don't have the exact output right now)

11:50 < jonasschnelli> Here we go:

11:50 < jonasschnelli> EXCEPTION: St9bad_alloc

11:50 < wumpus> that's running out of memory, not memory corruption

11:50 < jonasschnelli> std::bad_alloc

11:50 < jonasschnelli> bitcoin in ProcessMessages()

11:50 < jonasschnelli> hmm....

11:50 < wumpus> (well it can be memory corruption if the heap's administrative structures are corrupted, however that's much more likely to result in a segmentation fault)
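
The two spellings quoted above name the same type: `St9bad_alloc` is the mangled form that GCC's `typeid(...).name()` returns for `std::bad_alloc`. A minimal sketch of turning one into the other, assuming a GCC or Clang toolchain that ships `<cxxabi.h>`; `demangle` and `exception_type_name` are hypothetical helper names, not Bitcoin Core functions.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang specific)

#include <cassert>
#include <cstdlib>
#include <exception>
#include <new>
#include <string>
#include <typeinfo>

// Convert a mangled type name such as "St9bad_alloc" to readable form.
// Falls back to the input string if demangling fails.
std::string demangle(const char* mangled)
{
    assert(mangled != nullptr);
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);  // the buffer is malloc'ed by __cxa_demangle
    return result;
}

// Readable dynamic type of a caught exception.
std::string exception_type_name(const std::exception& e)
{
    return demangle(typeid(e).name());
}
```

Calling `exception_type_name(e)` inside a `catch (const std::exception& e)` block gives the readable name. It still does not say which allocation failed.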

11:50 < jonasschnelli> free -h --> total 16GB

11:51 < wumpus> strange. Does it have swap enabled?

11:51 < jonasschnelli> Headless debian with only bitcoind running..

11:52 < wumpus> swap is extremely important in Linux, even if you have enough memory, otherwise (AFAIK) it won't overcommit virtual memory and such

11:52 < jonasschnelli> "free" tells me, mem: Total, 15GB, used 2.7GB (restarted node with -dbcache=4000), Swap: total 17GB, used 0GB

11:52 < wumpus> okay, that's not it then

11:52 < wumpus> really strange

11:52 < jonasschnelli> The machine has 16GB physical memory... I don't think it ran out of memory

11:52 < jonasschnelli> I'll keep an eye on that

11:52 < wumpus> did you change dbcache?

11:53 < jonasschnelli> Yes. Always ran with -dbcache=4000

11:53 < jonasschnelli> But codewise it's pure master

11:53 < jonasschnelli> at a5bb6387f751e630c329f34cac2d38bffa8ff9cf

11:53 < wumpus> ok... no, that won't be the issue I think

11:57 < jonasschnelli> Here's the debug log: http://paste.ubuntu.com/23111528/

11:57 < jonasschnelli> Line 839 is the std::bad_alloc

11:57 < jonasschnelli> then there are some socket send error Bad file descriptor (9)

11:58 < jonasschnelli> Really strange the "Misbehaving: 85.214.213.91:8333 (0 -> 100) BAN THRESHOLD EXCEEDED" ... I hope it's not an exploit.

12:01 < wumpus> well I think the bad_alloc causes that rejection/banning

12:01 < wumpus> it's unfortunate that we don't know which exact allocation failed

12:03 < wumpus> apparently it's somewhere in the ConnectBlock() inputs logic

12:04 < wumpus> 0000000000000000243ecc39a5c110fea174e34e4a2d00b5f2038ab2e2f5cf70 is the valid block at height 322006 - so if it was an exploit, it's not by sending a corrupted block

12:05 < wumpus> kind of bad that a bad_alloc causes block rejection though

12:05 < wumpus> after restarting you probably had to explicitly re-verify the block?

12:06 < jonasschnelli> wumpus: I had to reindex at some point... IIRC, I had to do it afterwards.

12:06 < jonasschnelli> But maybe the reindex was on a different datadir/run

12:07 < jonasschnelli> At L912 it looks like a valid restart without reindex

12:30 < wumpus> travis is misbehaving badly again: https://github.com/bitcoin/bitcoin/issues/8532#issuecomment-243419143

12:30 < wumpus> I doubt it can be the result of any of today's commits

12:50 < sipa> wumpus: i think 9 may be the errno code?

12:54 < wumpus> sipa: ah, yes, probably
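
As sipa suggests, the `(9)` in the log line is most likely the errno value and not a descriptor number. On Linux, errno 9 is `EBADF`, whose message is "Bad file descriptor". A small sketch that rebuilds the same "message (code)" shape; `describe_errno` is a hypothetical helper, not the formatting function Bitcoin Core uses.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Render an errno value as "<message> (<code>)".
// The message text comes from the C library, so it can vary by platform.
std::string describe_errno(int err)
{
    return std::string(std::strerror(err)) + " (" + std::to_string(err) + ")";
}
```

Passing `EBADF` here shows why the number says nothing about how many descriptors are open: it only identifies the kind of error.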

15:27 < jonasschnelli> The node above stalled at height 322005

15:27 < jonasschnelli> last 3000 lines of debug log: http://paste.ubuntu.com/23112229/

15:27 < jonasschnelli> getblockchaininfo: http://paste.ubuntu.com/23112227/

15:28 < jonasschnelli> No new LogPrintf output for 2h

15:28 < jonasschnelli> But bitcoind is running: jonassc+ 1000 89.6 8.0 1614436 1331624 pts/1 SLl+ 13:40 204:02 ./src/bitcoind --dbcache=4000

15:28 < jonasschnelli> deadlock?

15:29 < sipa> jonasschnelli: getchaintips

15:29 < sipa> jonasschnelli: getpeerinfo

15:30 < jonasschnelli> sipa: http://paste.ubuntu.com/23112236/

15:30 < jonasschnelli> peerinfo: http://paste.ubuntu.com/23112238

15:32 < jonasschnelli> attached gdb and bt is: http://paste.ubuntu.com/23112247/

15:32 < jonasschnelli> wait.. that's useless. nm

15:33 < jonasschnelli> RPC server works.. but network layer seems to be dead

15:34 < sipa> jonasschnelli: thread apply all bt

15:34 < jonasschnelli> sipa: was just doing this:

15:34 < jonasschnelli> http://pastebin.com/sWbcbz8U

15:38 < jtimon> now that we're C++11, what should I use instead of boost::scoped_ptr<> ?

15:40 < sipa> std::unique_ptr
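
A minimal sketch of that replacement, for reference. `Tracker` and `live_inside_scope` are hypothetical names used only to make the ownership visible. Like `boost::scoped_ptr`, `std::unique_ptr` deletes its object at scope exit and cannot be copied; unlike it, `std::unique_ptr` can be moved.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical type that counts how many instances are alive.
struct Tracker {
    int* counter;
    explicit Tracker(int* c) : counter(c) { ++*counter; }
    ~Tracker() { --*counter; }
};

// Same scoped-ownership behaviour as boost::scoped_ptr<Tracker>:
// the object is destroyed when `p` goes out of scope.
int live_inside_scope(int* counter)
{
    // C++11 spelling; std::make_unique only arrives in C++14.
    std::unique_ptr<Tracker> p(new Tracker(counter));
    return *p->counter;  // 1 while `p` is alive, assuming the counter started at 0
}

// Copying is rejected at compile time, as with scoped_ptr.
static_assert(!std::is_copy_constructible<std::unique_ptr<Tracker>>::value,
              "unique_ptr must not be copyable");
// Moving is allowed, which scoped_ptr does not offer.
static_assert(std::is_move_constructible<std::unique_ptr<Tracker>>::value,
              "unique_ptr is movable");
```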

15:41 < sipa> jonasschnelli: what is on net.cpp:1909

15:42 < * jonasschnelli> looking

15:42 < jonasschnelli> messageHandlerCondition.timed_wait(lock, boost::posix_time::microsec_clock::universal_time() + boost::posix_time::milliseconds(100));

15:42 < sipa> i don't see any deadlock

15:42 < sipa> or any lock at all, even

15:42 < jonasschnelli> sipa: https://github.com/bitcoin/bitcoin/blob/master/src/net.cpp#L1909
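
The `timed_wait` call quoted above is a 100 ms timed wait on a Boost condition variable, so a thread seen at that line is normally just idle until the timeout or a notify. A rough C++11 equivalent, as a sketch only: `g_mutex`, `g_cond`, `g_wake`, and `wait_for_wake` are hypothetical names, and this is not the code in net.cpp.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex g_mutex;
std::condition_variable g_cond;
bool g_wake = false;  // protected by g_mutex

// Block until g_wake is set or the timeout expires.
// Returns the value of g_wake at the time of return, so false means timeout.
bool wait_for_wake(std::chrono::milliseconds timeout)
{
    std::unique_lock<std::mutex> lock(g_mutex);
    // The predicate overload re-checks the flag after spurious wakeups.
    return g_cond.wait_for(lock, timeout, [] { return g_wake; });
}
```

`wait_for` takes a relative duration, so the `universal_time() + milliseconds(100)` arithmetic from the Boost call is not needed.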

15:57 < jonasschnelli> Is there a reason why a peer requests headers and compact blocks (sendheaders and sendcmpct) from nodes not signaling NODE_NETWORK?

15:58 < jonasschnelli> I guess an SPV node at 70014 can just ignore those..

16:53 < jtimon> sipa thanks!

17:33 < sipa> jonasschnelli: read the bip

17:33 < sipa> it explicitly explains that :)

18:08 < jonasschnelli> sipa: Thanks. I should do that.

18:34 < jeremyrubin> Can anyone run `make bench` on wine 32 bit build?

18:36 < cfields> jeremyrubin: i can in a little bit

18:36 < jeremyrubin> kk thanks

18:46 < cfields> jeremyrubin: actually, "teach a man to fish" and all that... :)

18:46 < cfields> jeremyrubin: have you tried building/running for win32?

18:50 < jeremyrubin> cfields: yes

18:51 < cfields> jeremyrubin: you had issues, or just want to compare results?

18:51 < jeremyrubin> cfields: wine: Unhandled page fault on read access to 0x00000004 at address 0x6117a9 (thread 0009), starting debugger...

18:52 < jeremyrubin> cfields: errors. Playing around with things it seems to be some kind of link time issue I suspect

18:53 < cfields> jeremyrubin: errors running? or running under gdb? 'cause wine+gdb is a different beast :)

18:54 < jeremyrubin> cfields: there are two main issues. The first is the sys/time.h depends. I removed that for a std::chrono solution (can send you code) then, removing all test code, and by removing all the boost dependencies (replacing with standard way), I can run just the benchmarking framework.

18:54 < jeremyrubin> cfields: not under gdb

18:54 < jeremyrubin> cfields: adding the benchmarks back I can run again, so i'm doing a "bisect" on which of the benchmarks is causing the loading fault now, but I think it's link time because it doesn't even run

18:55 < jeremyrubin> cfields: I tried adding "-static" to LDFLAGS

18:55 < cfields> jeremyrubin: win32 builds are already static

18:56 < * jonasschnelli> sets up mingw32 depends builds

18:56 < cfields> jeremyrubin: i'm afraid i'm missing some context, though. Does the current bench code not work in win32?

18:57 < jeremyrubin> cfields: I don't think so; let me test on master

18:57 < jeremyrubin> cfields: where can I see the static flags? I don't think they're set for bench

18:58 < cfields> jeremyrubin: ah, ok

18:58 < cfields> jeremyrubin: they're kinda a maze, sec

18:58 < jeremyrubin> cfields: `bench_bench_bitcoin_LDFLAGS = $(RELDFLAGS) $(AM_LDFLAGS) $(LIBTOOL_APP_LDFLAGS)`

18:58 < jeremyrubin> in Makefile.bench.include

18:58 < cfields> jeremyrubin: IIRC it's the LIBTOOL_APP_LDFLAGS that sets static

18:59 < jeremyrubin> in Makefile.test.include `test_test_bitcoin_LDFLAGS = $(RELDFLAGS) $(AM_LDFLAGS) $(LIBTOOL_APP_LDFLAGS) -static`

18:59 < cfields> # -static is interpreted by libtool, where it has a different meaning.

18:59 < cfields> # In libtool-speak, it's -all-static.

18:59 < cfields> AX_CHECK_LINK_FLAG([[-static]],[LIBTOOL_APP_LDFLAGS="$LIBTOOL_APP_LDFLAGS -all-static"])

19:00 < jeremyrubin> so as a minimal example; I'm failing with only the example bench included. I'm running on my branch, but let me try on master (I shouldn't have any changes that affect that tho)

19:01 < cfields> ok. trying here too.

19:01 < cfields> you're building with depends?

19:01 < cfields> or is that why you hacked it to be dependency-less?

19:02 < jeremyrubin> I'm building by this:

19:02 < jeremyrubin> (well, whatever it says in doc/build-windows)

19:02 < cfields> ok

19:03 < jeremyrubin> cd depends; make HOST=i686-w64-mingw32 -j4; cd ..; ./configure --prefix=`pwd`/depends/i686-w64-mingw32; make

19:03 < cfields> right

19:04 < jeremyrubin> also you may want this:

19:04 < jeremyrubin> std::chrono::duration<double> result {std::chrono::system_clock::now().time_since_epoch()};

19:04 < jeremyrubin> return result.count();

19:05 < jeremyrubin> for bench.cpp gettimedouble
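
Put together, the two lines above would make a function along these lines. This is a sketch of the suggestion, assuming the function keeps the name `gettimedouble` and returns seconds as a `double`; the exact signature in bench.cpp is not shown in this log.

```cpp
#include <chrono>

// Wall-clock time in seconds since the epoch, as a double.
// Uses only <chrono>, so no sys/time.h is needed.
double gettimedouble()
{
    // Converting to a floating-point duration is implicit and keeps
    // the sub-second part.
    std::chrono::duration<double> result{
        std::chrono::system_clock::now().time_since_epoch()};
    return result.count();
}
```

For measuring intervals, `std::chrono::steady_clock` may be the safer choice, since `system_clock` can jump when the wall clock is adjusted.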

19:08 < jeremyrubin> (not sure if the sys/time.h include is problematic)

19:08 < cfields> yea, we should aim to nuke those.

19:09 < jeremyrubin> Yeah I can separately PR nuking them; pretty easy to remove that & the boost depends as well

19:09 < cfields> (i'll be PR'ing my threading refactor in a few hours which will let us kill off a ton of boost stuff, chrono included)

19:10 < cfields> jeremyrubin: boost depends everywhere? or in bench?

19:10 < jeremyrubin> in bench

19:10 < jeremyrubin> I also had a theory that std::thread was the reason my builds were failing. Apparently std::thread support is shaky in wine?

19:10 < cfields> ah, ok. that'll be nice to have :)

19:10 < jeremyrubin> or rather in the x-compiler

19:11 < jeremyrubin> seems to be fixed now though

19:11 < jeremyrubin> can't wait to see the -death- removal of boost::thread

19:11 < cfields> jeremyrubin: that'd be libstdc++. Surely it just uses win primitives under the hood, though

19:12 < jeremyrubin> cfields: see https://github.com/meganz/mingw-std-threads

19:12 < jeremyrubin> cfields: seems to be addressed though now; as when I compiled there was a version of std::thread present

19:13 < jeremyrubin> cfields: also, forgot to mention that test_bitcoin.exe runs ok; so that was part of my inkling it was a build setting

19:13 < cfields> jeremyrubin: i'm not sure what to say there, we rely on std::thread for mingw64 already

19:14 < cfields> sounds like you're chasing all kinds of things :)

19:15 < jeremyrubin> cfields: indeed

19:23 < jeremyrubin> cfields: master segfaults as well

19:23 < jeremyrubin> just finished my build

19:23 < cfields> jeremyrubin: interesting

19:23 < cfields> jeremyrubin: ok, still building here. Had to setup a VM, current OS is wonky

19:26 < jeremyrubin> cfields: I can run it with WINEDEBUG=+all but I don't really know how to read that

20:49 < GitHub24> [bitcoin] jtimon opened pull request #8629: C++11: s/boost::scoped_ptr/std::unique_ptr/ (master...0.13-boost-scoped-ptr) https://github.com/bitcoin/bitcoin/pull/8629

21:16 < cfields> jeremyrubin: finally got it built, crashes here too

21:27 < jeremyrubin> cfields: Cool/not cool

21:28 < cfields> jeremyrubin: is it only win32, not win64?

21:28 < jeremyrubin> cfields: didn't try win64; I'll do a build and report back shortly

21:28 < cfields> ok

21:29 < jeremyrubin> cfields: I guess it's not the most critical thing to fix, but I wanted to make travis print out benchmarking info; in case tests are timing out due to poor performance, that will help debugging

21:29 < cfields> jeremyrubin: sure, sounds useful

21:29 < cfields> jeremyrubin: but since it's already busted in master, no need to make it a blocker for anything else you're working on

21:29 < jeremyrubin> cfields: Although looking at what's slow, it seems that PrevectorTestInt is really long on windows

21:30 < jeremyrubin> cfields: So I'm thinking about also changing the build_aux test driver to tee the log and print out the test messages so that it can see what it timed out on

21:31 < cfields> jeremyrubin: by all means. last time i poked at that, it fought me hard. printing that would be great.

21:32 < jeremyrubin> cfields: yeah I've spent the morning mucking through automake crap

21:33 < jeremyrubin> cfields: in any case; the current build system is functionally broken because if you add tests that make it go over 10 min it breaks :)

21:34 < cfields> heh, the test driver enforces that?

21:36 < jeremyrubin> cfields: travis does

21:36 < cfields> oh, sure

21:36 < jeremyrubin> cfields: it assumes tests failed if no output

21:36 < jeremyrubin> wait do you know where the build_aux/test_driver is generated?

21:38 < jeremyrubin> it looks like it comes from autogen

21:38 < cfields> comes from automake iirc

21:40 < jeremyrubin> ugh. yeah you're right

21:43 < * jeremyrubin> ponders just making the tests periodically put a '.' to stderr to solve it

21:49 < jeremyrubin> cfields: `err:seh:setup_exception stack overflow 2656 bytes in thread 0024 eip 00002b619`

21:53 < jeremyrubin> cfields: Think I should just open an issue?