< gmaxwell>
right. well there is also no need to discard bytes from K_2 but it does that too. the performance hit is especially gratuitous for lengths.
< gmaxwell>
sipa: oh geesh, for every message we do _another_ chacha20 run to derive the poly key.
< gmaxwell>
so encrypting a single small message requires 3 runs of the chacha20 function, one to encrypt the length, one to establish the poly1305 key, and one to encrypt the payload.
< gmaxwell>
this seems pants on head stupid.
< gmaxwell>
The polykey needs to be per packet for poly1305's requirements, so I suppose it's only throwing out 32 bytes of chacha output.
< gmaxwell>
and it seems that they didn't so that you could just use a RFC5116 implementation of it.
< * gmaxwell>
cries
< gmaxwell>
So encrypting a 12 byte message will run on the order of 109 cycles/byte... which means that for small messages a straightforward implementation of AES-GCM would likely be faster, even on hardware without AES instructions.
< gmaxwell>
(109 cycles/byte for the chacha20 part alone)
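A rough sketch of the arithmetic behind that figure. It assumes each of the three ChaCha20 runs produces one 64-byte block, and the per-block cost is a hypothetical value picked so that a 12-byte message lands on the quoted 109 cycles/byte; it is not a measurement.

```python
BLOCK_BYTES = 64                 # ChaCha20 produces keystream in 64-byte blocks
ASSUMED_CYCLES_PER_BLOCK = 436   # hypothetical cost (~6.8 cycles/byte of keystream)


def effective_cycles_per_byte(payload_len: int) -> float:
    """Cycles spent in ChaCha20 per payload byte under the assumptions above."""
    # Blocks needed for the payload itself (ceiling division, at least one).
    payload_blocks = max(1, -(-payload_len // BLOCK_BYTES))
    # Plus one block to encrypt the length and one to derive the Poly1305 key.
    total_blocks = 2 + payload_blocks
    return total_blocks * ASSUMED_CYCLES_PER_BLOCK / payload_len


small = effective_cycles_per_byte(12)     # three blocks for 12 bytes of payload
large = effective_cycles_per_byte(4096)   # fixed overhead amortised over 64 blocks
```

Under these assumptions the two extra runs dominate for small messages and mostly disappear for large ones.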
< sipa>
gmaxwell: but what is the average message length for us?
< sipa>
it seems we don't keep stats on message counts
< cfields>
luke-jr: hmm?
< cfields>
luke-jr: if qt's copy is missing the files that need patching, what's to patch?
< luke-jr>
cfields: libpng has optimisations for ARM and POWER in separate files missing in Qt, but Qt's copy of the normal files still tries to link them
< cfields>
luke-jr: armhf/aarch64 build fine, what's different about power?
< luke-jr>
cfields: I don't know how ARM works
< cfields>
(not arguing, just trying to understand)
< cfields>
luke-jr: anyway, breaking out libpng is fine with me. IIRC I didn't do that because it requires zlib, as does qt, so that would've meant 2 copies of zlib. But we've since broken zlib out anyway I believe.
< luke-jr>
yeah
< cfields>
luke-jr: while you're at it, feel free to flip -qt-jpeg to -disable-jpeg too
< cfields>
something like those options, anyway
< cfields>
I think we've had no need for jpegs for a long time
< gmaxwell>
sipa: the most common message by far is transaction inv.
< gmaxwell>
sipa: it's just so weird that it uses 3 chacha runs, the poly1305 run has 32 bytes totally unused.
< Jmabsd>
wait, so Bitcoin has the tendency to print (256 & 160bit) hashes in *reverse* order, right - block hashes, transaction hashes and merkle root hashes.
< Jmabsd>
What about pubkey hashes (20B), pubkeys (32B) and signatures (64B) - are those printed in normal or reverse byte order? so, I have a P2SH pubkey script, say. in there is a 20B hash of my redeemscript, right. when I use Bitcoin Core's script disassembly function, will it print that hash in byte or normal order? i mean there is an outer extent to what Core prints in reverse order - for instance, binary transaction dumps (in hex) are in
< Jmabsd>
*normal* order, not reverse.
< sipa>
Jmabsd: that's just printing the bytes one by one
< sipa>
it's only when a hash is interpreted as a number the printing gets reversed
< sipa>
because the bytes are interpreted as little-endian number, but then printed in big endian for human consumption (humans want to see numbers in big endian)
< sipa>
but a script is a number
< luke-jr>
isn't*
< Jmabsd>
gotcha.
< sipa>
*indeed, isn't
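A minimal sketch of the rule sipa describes: the digest bytes are read as a little-endian number, and that number is printed big-endian. The helper names and the input are made up for illustration; this is not Bitcoin Core's code.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash used for block and transaction ids."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def display_hex(digest: bytes) -> str:
    """Hex of the digest read as a little-endian number, printed big-endian."""
    return digest[::-1].hex()


digest = sha256d(b"example input")   # arbitrary illustrative input
raw_hex = digest.hex()               # bytes printed one by one, in stored order
shown_hex = display_hex(digest)      # byte-reversed form shown to humans
```

`raw_hex` is what a plain hex dump of the bytes gives; `shown_hex` is the same digest with its byte order reversed.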
< Jmabsd>
aha. so let's see - if you print a hex dump of a signature (71/72/73B), that's not a hash and hence printed in normal order
< Jmabsd>
a P2SH hash, for instance when printing the disassembly of a P2SH pubkey script - will the 20B hash there be printed in reverse order?
< Jmabsd>
also if a pubkey (32B) is printed out, could that ever be in reverse order?
< luke-jr>
why don't you just try it and see? -.-
< sipa>
Jmabsd: pubkeys are not 32 bytes, and they're not hashes
< Jmabsd>
sipa: so the hex printer never prints other byte structures in reverse order.
< sipa>
indeed
< sipa>
only for things that are internally treated as numbers
< Jmabsd>
but.. a P2SH 20B hash, that's a hash right. for printing purposes, is it considered a hash or a byte blob?
< sipa>
nope!
< sipa>
because the printer cannot know it is a hash
< sipa>
you'd need to execute the script to know it is treated as such
< sipa>
the script opcode is just "put some bytes on the stack"
< sipa>
so, not reversed there
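To illustrate why the printer has no basis for reversing, here is a toy disassembler, assuming only direct pushes (opcodes 0x01 to 0x4b) and two named opcodes. It is a sketch, not Bitcoin Core's disassembly code, and the 20-byte value is a placeholder rather than a real script hash.

```python
OPCODE_NAMES = {0xA9: "OP_HASH160", 0x87: "OP_EQUAL"}


def disassemble(script: bytes) -> str:
    """Print pushed data as plain hex in stored byte order, opcodes by name."""
    out = []
    i = 0
    while i < len(script):
        op = script[i]
        i += 1
        if 0x01 <= op <= 0x4B:
            # Direct push: the opcode value is the number of bytes that follow.
            data = script[i:i + op]
            if len(data) != op:
                raise ValueError("truncated push")
            out.append(data.hex())  # just bytes; nothing marks them as a hash
            i += op
        else:
            out.append(OPCODE_NAMES.get(op, f"OP_UNKNOWN_{op:#04x}"))
    return " ".join(out)


placeholder_hash = bytes(range(20))  # stand-in for a 20-byte script hash
p2sh_script = bytes([0xA9, 0x14]) + placeholder_hash + bytes([0x87])
text = disassemble(p2sh_script)
```

The push is handled the same way whether the bytes are a hash, a key, or a signature, so they come out in the order they are stored.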
< Jmabsd>
(sorry disconnect)
< Jmabsd>
last, > interesting. except for the HD wallet root seed (160b=20B), there is no instance ever where a 20B hash e.g. in P2SH pubkeyscript, is printed in reverse order.
< Jmabsd>
> sipa, right and when getting a disassembly printout in Bitcoin Core and related tools, those 20B:s are printed in normal order
< Jmabsd>
the proper way to phrase Core's reversing policy is something like, "any hash that is not part of another binary blob or produced as script data, is hex-serialized in reverse byte order."
< Jmabsd>
i'd hope any hash values introduced in the future will not be reversed though.
< sipa>
i don't see why not
< sipa>
we've always treated hash outputs as numbers and printed them as such
< sipa>
if byte swapping is the hardest problem to deal with, i'm not very worried :)
< jamesob>
re: memory usage increase: preliminary bisections are in and MarcoFalke and I are betting it's the leveldb changes. https://i.imgur.com/8aXRzwe.png
< gmaxwell>
jamesob: wait. how are we measuring memory usage in that benchmark?
< jamesob>
gmaxwell: time -f %M (ie resident set size)
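For reference, GNU time's %M is the maximum resident set size. A rough Python equivalent, as a sketch: it reads the same counter through getrusage, assumes a Unix-like system (kilobytes on Linux, bytes on macOS), and the helper name is made up.

```python
import resource
import subprocess
import sys


def peak_rss_of_children(cmd: list) -> int:
    """Run cmd, then return the peak RSS recorded for waited-for children."""
    subprocess.run(cmd, check=True)
    # ru_maxrss covers every child reaped so far, so this is the largest
    # peak among them, not necessarily the command just run.
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


peak = peak_rss_of_children([sys.executable, "-c", "pass"])
```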
< gmaxwell>
Also setting the maximum maps really low, like..2 might be interesting.
< gmaxwell>
but if this is the problem, MADV_RANDOM is probably the fix to the extent that it's an actual problem at all.
< gmaxwell>
Though we should do a reindex benchmark to make sure MADV_RANDOM doesn't hurt performance.
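A small sketch of what applying MADV_RANDOM to a read-only file mapping looks like, using Python's mmap module (madvise needs Python 3.8+ and a platform that defines the constant). This only illustrates the system call; it is not the leveldb change, and the helper name and temporary file are made up.

```python
import mmap
import os
import tempfile


def map_with_random_advice(path: str) -> mmap.mmap:
    """Map a file read-only and hint that accesses will be random."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # MADV_RANDOM asks the kernel not to read ahead around each access.
    if hasattr(mm, "madvise") and hasattr(mmap, "MADV_RANDOM"):
        mm.madvise(mmap.MADV_RANDOM)
    return mm


# Illustrative usage on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1 << 16))
    tmp_path = tmp.name

mm = map_with_random_advice(tmp_path)
first_byte = mm[0]
mm.close()
os.unlink(tmp_path)
```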
< wumpus>
PSA: if after the latest merge you get a linker error "/usr/local/include/boost/smart_ptr/shared_ptr.hpp:728: undefined reference to `translationInterface", you need to do a 'make clean' and re-do the make and it will work
< luke-jr>
wumpus: is there a reason the gitian linux yml has g++-riscv64-linux-gnu as a dep? seems to pull in GCC 7 when we're using GCC 8 now?
< luke-jr>
wumpus: if `make clean` ever fixes something, that means there's a bug in the build system :/
< sipa>
gmaxwell, jamesob: LMDB uses MADV_RANDOM it seems
< sipa>
(though its design is different, i don't know their access patterns)
< wumpus>
luke-jr: yes, it must be missing some changes in dependency detection between source and header files (another one is if you change something in univalue, it won't detect it)
< wumpus>
luke-jr: it pulls in both gcc 7 and 8, I think that's necessary due to some strangeness with the packages (some symlink will only exist when g++-riscv64-linux-gnu is also installed)
< luke-jr>
DONTNEED sounds wrong?
< luke-jr>
wumpus: ah, weird
< wumpus>
luke-jr: you might be able to get around it, but I noticed and tried as well and ran into a dead end
< sipa>
luke-jr: to diagnose
< sipa>
luke-jr: it would be interesting to see what the effect on RSS is with DONTNEED, to have an idea to what extent our memory usage is due to mmap caching
< jamesob>
sipa: giving it a shot now
< wumpus>
luke-jr: at least it's not MADV_HWPOISON!
< luke-jr>
wumpus: lol
< gmaxwell>
It's plausible to me that MADV_RANDOM will help performance or at least be neutral.
< wumpus>
yes, to me too
< wumpus>
our access pattern is more or less random
< gmaxwell>
I don't recall now, though I know I researched this before... does leveldb's bisection interpolate assuming keys are uniform and that their values are uniformly sized or does it plain bisect?
< wumpus>
(except in the rare times it's iterating over the whole utxo set in order, like when computing statistics)
< gmaxwell>
esp in the case of plain bisection, prefetching is a bad behavior.
< wumpus>
I don't know
< sipa>
gmaxwell: there is an index at the end of each ldb file
< sipa>
i assume it bisects in that index in a naive way, but i'm not sure
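To make the distinction in gmaxwell's question concrete, here is a sketch of the two lookup strategies over a sorted list of integer keys, recording which positions each one probes. The data and function names are invented for illustration, and this says nothing about what leveldb actually does.

```python
def bisect_probes(keys, target):
    """Plain bisection: always probe the middle of the remaining range."""
    lo, hi, probes = 0, len(keys) - 1, []
    while lo <= hi:
        mid = (lo + hi) // 2
        probes.append(mid)
        if keys[mid] == target:
            break
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes


def interpolation_probes(keys, target):
    """Interpolation: guess the position assuming uniformly spread keys."""
    lo, hi, probes = 0, len(keys) - 1, []
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        span = keys[hi] - keys[lo]
        mid = lo if span == 0 else lo + (target - keys[lo]) * (hi - lo) // span
        probes.append(mid)
        if keys[mid] == target:
            break
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes


keys = list(range(0, 1_000_000, 7))   # perfectly uniform toy keys
target = keys[12345]
plain = bisect_probes(keys, target)
guided = interpolation_probes(keys, target)
```

With plain bisection the early probes land far apart, so whatever the kernel reads ahead around each one is mostly wasted, which is the prefetching concern raised above.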
< jamesob>
sipa: bench is running; we'll know how your change works in about six hours
< sipa>
jamesob: awesome
< luke-jr>
hm, I think I will regret relatime when I try to prune gitian caches