< sipa> gmaxwell: if we use an encoding for the checksum which maps all hex characters + punctuation into the same "high 5bits" of the 2-symbol encoding, we essentially can ignore the 1-char-2-symbol-error blowup
< sipa> as everything else (uppercase characters, lowercase above f) only occurs inside base58 things, which have additional protection already
< gmaxwell> sipa: nice!
< sipa> gmaxwell: in theory a gf(25) code would suffice for this
< sipa> we have exactly 25 characters that occur "unprotected" i think
< sipa> ()[]*/,'0123456789abcdefh
< sipa> though base32 is a bit easier to implement :)
< gmaxwell> sipa: and just alias the other characters near uniformly down to the unprotected ones?
< gmaxwell> You don't want ()[]*/,' in the checksum so you'd want to have alternative ones for those.
< gmaxwell> e.g. the checksum's charset would be different from the rest.
< sipa> gmaxwell: you can expand all data characters into two symbols
< sipa> you just don't care about cases where the second one differs
< sipa> and indeed, for the checksum we can just use the bech32 charset
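The two-symbol expansion sipa describes could be sketched roughly as follows. This is only an illustration: the names `UNPROTECTED`, `OTHERS`, `to_symbols` and `symbol_errors` are hypothetical, and the ordering of the remaining characters is an assumption, not something settled in the discussion. The point it shows is that when all 25 "unprotected" characters share one high symbol, swapping one of them for another changes a single symbol rather than two.

```python
# The 25 characters sipa lists as occurring "unprotected" in descriptors.
UNPROTECTED = "()[]*/,'0123456789abcdefh"

# Every other printable, non-space ASCII character.
# ASSUMPTION: plain ASCII order; the discussion does not fix an ordering.
OTHERS = "".join(chr(c) for c in range(33, 127) if chr(c) not in UNPROTECTED)


def to_symbols(ch):
    """Expand one character into a (high, low) pair of 5-bit symbols."""
    if ch in UNPROTECTED:
        index = UNPROTECTED.index(ch)   # 0..24, so the high symbol is always 0
    else:
        index = 32 + OTHERS.index(ch)   # 32..100, so the high symbol is 1..3
    return index >> 5, index & 31


def symbol_errors(a, b):
    """Count the symbols that differ between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(
        x != y
        for ca, cb in zip(a, b)
        for x, y in zip(to_symbols(ca), to_symbols(cb))
    )


# A typo between two unprotected characters costs one symbol error, not two.
print(symbol_errors("a1", "b1"))
```

Characters outside the unprotected set land in other high-symbol classes, which matches the remark that they only occur inside base58 strings that carry their own protection.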
< fanquake> sipa while testing #15250, I saw a single failure like https://gist.github.com/fanquake/c34aef5f4adc02d6bffec4143dcf08bc, but haven't been able to reproduce. Any thoughts?
< gribble> https://github.com/bitcoin/bitcoin/issues/15250 | Use RdSeed when available, and reduce RdRand load by sipa · Pull Request #15250 · bitcoin/bitcoin · GitHub
< fanquake> I see you've pushed new changes, so may no longer be relevant.
< sipa> fanquake: what kind of failure?
< sipa> oh
< sipa> fanquake: yes, that should be fixed
< sipa> gmaxwell just pointed that out
< fanquake> sipa ok, thanks.
< provoostenator> sipa: Sounds like you have 7 characters to spare then, let's bike shed! ("-", ";" and "$", "%", "&", etc would be good for future extensions)
< provoostenator> For example ranges normally don't make sense in descriptors, but one might have a setup with hardened derivations and only a limited range of hot private keys.
< provoostenator> A future extension could support ranges for that reason, so it's nice to have room for "-" in the checksum mechanism.
< gmaxwell> those sound okay, though % and & are less likely to survive being passed around on the web, and get mangled in html documents...
< gmaxwell> # is another candidate.
< gmaxwell> or ! (not very shell friendly, though # isn't perfect in that respect either)
< gmaxwell> | is a fine character too.
< booyah> gmaxwell: maybe not a big concern, but "!" is absolute bitch to use in cli/bash
< bitcoin-git> [bitcoin] d3spwn opened pull request #15268: doc: suggest using timeoutstopsec in systemd file during IBD (master...systemd-tweaks) https://github.com/bitcoin/bitcoin/pull/15268
< provoostenator> "$" is also not ideal in shell, so yeah, being html and bash friendly adds some constraints.
< bitcoin-git> [bitcoin] MarcoFalke opened pull request #15270: Pull leveldb subtree (master...Mf1901-subtreeLeveldb) https://github.com/bitcoin/bitcoin/pull/15270
< phantomcircuit> gmaxwell, ; will get escaped as well
< bitcoin-git> [bitcoin] MarcoFalke pushed 3 new commits to master: https://github.com/bitcoin/bitcoin/compare/ab46fe6ec1b3...b78f6c61c452
< bitcoin-git> bitcoin/master 2434ab5 Ben Woosley: Scripts and tools: Fix devtools/copyright_header.py to always honor exclusions...
< bitcoin-git> bitcoin/master ad5e5a1 Ben Woosley: Scripts and tools: Drop no-longer-relevant copyright holder names...
< bitcoin-git> bitcoin/master b78f6c6 MarcoFalke: Merge #15258: Scripts and tools: Fix devtools/copyright_header.py to always honor exclusions...
< bitcoin-git> [bitcoin] MarcoFalke closed pull request #15258: Scripts and tools: Fix devtools/copyright_header.py to always honor exclusions (master...copyright-header-abs) https://github.com/bitcoin/bitcoin/pull/15258
< sipa> gmaxwell: for codes with length >24000 (about what we'd need for something containing 100 xpubs), 7 characters for distance 4, 10 characters for distance 5
< sipa> (this is algebraic distance, i can't analyze things exhaustively for this length)
< sipa> (and 1 character for distance 2, 4 characters for distance 3)
< sipa> i think 7 characters is fine; it will detect any 3 errors within the "basic 32 characters" or 1 error in and 1 error out, and has a random fail chance of less than 1 in 34 billion
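The "1 in 34 billion" figure can be checked with a line of arithmetic. The sketch below assumes each checksum character carries 5 bits (as with the bech32 charset mentioned earlier) and that a random corruption is equally likely to produce any checksum value; the constant names are hypothetical.

```python
CHECKSUM_CHARS = 7   # checksum length sipa proposes
BITS_PER_CHAR = 5    # ASSUMPTION: one base32/bech32-style character = 5 bits

# Number of distinct checksum values; a random error passes with chance 1/this.
possible_checksums = 2 ** (CHECKSUM_CHARS * BITS_PER_CHAR)
print(possible_checksums)  # 34359738368
```

That is 32^7 = 2^35, a little over 34 billion, so a randomly corrupted string passes with probability slightly below 1 in 34 billion.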
< sipa> actually, we can have a conversion that maps 2 characters to 3 symbols, increasing the maximum length
< sipa> oh, or even 3 characters to 4 symbols
< sipa> you can partition all non-whitespace ascii characters into 3 groups of 32 each, and then encode 3 group numbers into 5 bits
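A rough sketch of the 3-characters-to-4-symbols idea: each character contributes its position within its group of 32 as one symbol, and the three group numbers of a triple are packed base 3 into a fourth symbol (3^3 = 27 fits in 5 bits). The function names, the plain ASCII ordering, and the handling of a trailing partial triple are all assumptions made for illustration; a real design would presumably order the characters so the unprotected ones share a group.

```python
def pack_groups(groups):
    """Pack up to three group numbers (each 0..2) into one value, base 3."""
    value = 0
    for g in groups:
        value = value * 3 + g
    return value  # at most 2*9 + 2*3 + 2 = 26, so it fits in 5 bits


def expand(text):
    """Expand printable non-space ASCII text into 5-bit symbols.

    Every 3 characters yield 4 symbols: three in-group positions
    followed by one symbol holding the three group numbers.
    """
    symbols = []
    groups = []
    for ch in text:
        index = ord(ch) - 33           # ASSUMPTION: plain ASCII order, '!' = 0
        if not 0 <= index < 94:
            raise ValueError(f"unsupported character: {ch!r}")
        symbols.append(index & 31)     # position within the group of 32
        groups.append(index >> 5)      # group number 0, 1 or 2
        if len(groups) == 3:
            symbols.append(pack_groups(groups))
            groups = []
    if groups:                         # ASSUMPTION: pack a trailing partial triple
        symbols.append(pack_groups(groups))
    return symbols


print(expand("abc"))
```

Compared with two symbols per character, this yields 4/3 symbols per character, which is why the same code length covers a longer descriptor.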
< gmaxwell> you can exhaustively analyze to pick between codes for shorter lengths, so you should do that once you've found parameters that are okay for the longer lengths.
< sipa> right
< sipa> though up to what length?
< gmaxwell> (and at least pick a code that doesn't have a threshold effect hump-- which is less of an issue for longer lengths anyways)
< gmaxwell> I dunno, you've got a bunch of descriptor examples.
< gmaxwell> 2 of 3 multisigs are probably interesting.