< MarcoFalke>
sipa: cfields: wumpus: re fuzz tests. Replied here #19366
< gribble>
https://github.com/bitcoin/bitcoin/issues/19366 | tests: Provide main(...) function in fuzzer. Allow building uninstrumented harnesses with --enable-fuzz. by practicalswift · Pull Request #19366 · bitcoin/bitcoin · GitHub
< shesek>
is there a customary notation for referring to the nth script of a descriptor identified by its checksum, akin to <bip32-fingerprint>/<index>?
< hebasto>
sipa: thanks for clarification!
< sipa>
yw!
< sipa>
shesek: not sure what you mean
< sipa>
ah, i see
< sipa>
no
< sipa>
(i kind of think that using fingerprints for bip32 was a bad idea too... they're too small to guarantee no collisions)
< shesek>
would it make sense to invent one? this could also use a `/` separator like bip32 key origins, though a different character might be better because this doesn't have multiple levels of nesting like key origins
< shesek>
yeah, I recently realized that using fingerprints as the unique wallet identifier in bwt wasn't a good idea either... https://github.com/shesek/bwt/issues/42
< shesek>
I'm planning to switch to descriptor checksums as the primary unique identifier, once bwt switches over to being descriptor-based
< sipa>
descriptor checksums have a range of 2^40, so somewhat better than bip32 fingerprints (which are only 2^32)
< sipa>
but they're also not really intended as an identifier, as they're protecting the text representation of a descriptor (which can change, as there can be multiple ways to refer to the same descriptor; h vs ' for hardened paths, lowercase/uppercase in hex, stripping of private keys or not, ...)
< shesek>
wouldn't using getdescriptorinfo to canonicalize them (or implementing the same canonicalization externally) solve that?
< sipa>
that assumes there is some standardization of descriptors in the first place :)
< sipa>
but yes, getdescriptorinfo does that, but i'm not sure we want to commit to it never changing
< shesek>
would you consider 40 bits sufficient to avoid collisions? say in a setup where you're dynamically tracking many descriptors, for example one xpub for the deposits of each customer
< shesek>
I do need some way to uniquely identify descriptors which will remain constant. would you suggest using the hash of the user-provided descriptor instead of relying on getdescriptorinfo's canonicalization for that?
< sipa>
change the 100000 to the number of descriptors involved
< sipa>
and it'll give you the probability of a collision
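[note] The calculation sipa refers to is not preserved in the log. A minimal sketch of the kind of estimate being discussed, assuming the standard birthday-bound approximation p ≈ 1 - exp(-n(n-1) / 2^(bits+1)) and assuming identifiers behave like uniformly random values. The function name `collision_probability` is made up for illustration.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximate chance that at least two of n random `bits`-bit values collide.

    Uses the birthday-bound approximation 1 - exp(-n(n-1) / (2 * 2**bits)),
    which assumes the identifiers are uniformly distributed.
    """
    space = 2.0 ** bits
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n * (n - 1) / (2.0 * space))


if __name__ == "__main__":
    # Compare bip32 fingerprints (32 bits), descriptor checksums (40 bits)
    # and a 160-bit hash for 100000 tracked descriptors.
    for bits in (32, 40, 160):
        print(bits, collision_probability(100000, bits))
```

Change the 100000 to the number of descriptors involved, as in the original suggestion.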
< shesek>
nice, that's very useful, thank you!
< shesek>
so definitely does not seem sufficient :)
< shesek>
wouldn't work well at today's bitstamp scale
< shesek>
or even much less really
< shesek>
sipa, what probability would you consider to be acceptable?
< sipa>
up to your application
< shesek>
I guess I could just take something like 160 bits, which is probably far more than actually necessary, but the performance/storage cost seems insignificant enough that there's no real reason not to
< shesek>
what does core use internally?
< sipa>
achow101: ^
< achow101>
for identifying descriptors in a DescriptorScriptPubKeyMan, we sha256 the canonicalized string representation
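[note] A rough sketch of the idea achow101 describes: hash the canonical descriptor string with SHA-256 and use the digest as the identifier. How Core serializes the string before hashing is not stated here, so hashing the raw UTF-8 bytes is an assumption, and `descriptor_id` is a hypothetical helper name. The descriptors below use a placeholder `XPUB` rather than a real key.

```python
import hashlib


def descriptor_id(canonical_descriptor: str) -> str:
    """Return a hex SHA-256 digest of an already-canonicalized descriptor string.

    Assumption: the caller has canonicalized the descriptor first (for example
    via getdescriptorinfo); this function does no normalization of its own.
    """
    return hashlib.sha256(canonical_descriptor.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two spellings of the same hardened path hash differently, which is why
    # canonicalization has to happen before hashing.
    print(descriptor_id("wpkh(XPUB/0h/*)"))
    print(descriptor_id("wpkh(XPUB/0'/*)"))
```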
< shesek>
I've been thinking of making it possible to use shorter scripthash/txid hashes in the bwt indexes for low memory/storage environments. electrs does something similar to make its indexes more compact, truncating to 8 bytes. maybe the wallet id could be similarly user configurable, according to their resource constraints and scale of use
< shesek>
it's hard to pick the right numbers myself, I can't really tell how people are going to use this
< achow101>
shesek: I think the "customary notation" for the nth script is to just replace the variable indexes with the actual index
< achow101>
e.g. instead of /0/*, you would put /0/1
< achow101>
but you would lose that it was ranged to begin with and which index was ranged
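[note] A small sketch of the substitution achow101 describes, replacing the `*` wildcard with a concrete index. `descriptor_at_index` is a hypothetical helper, `XPUB` is a placeholder rather than a real key, and the sketch assumes `*` only ever appears as the range wildcard. Any `#checksum` suffix is dropped because it no longer matches the edited string.

```python
def descriptor_at_index(descriptor: str, index: int) -> str:
    """Replace every range wildcard (*) in a descriptor with a fixed index.

    The result no longer records that the descriptor was ranged, or which
    path element was the ranged one.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    # Strip the checksum: it covers the original text, not the edited one.
    body = descriptor.split("#", 1)[0]
    if "*" not in body:
        raise ValueError("descriptor is not ranged")
    return body.replace("*", str(index))


if __name__ == "__main__":
    print(descriptor_at_index("wpkh(XPUB/0/*)", 1))
```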
< shesek>
achow101, it's also pretty big, this is something that I'm providing in the tx json next to each wallet input/output. and it's also hard to associate with the descriptor wallet this came from, for example if you want to know which user to credit for a deposit
< shesek>
it's comparable to how bip 32 key origins are used as identifiers (which is what I'm currently doing)
< cfields>
sipa: sorry for hammering on the Span questions. Still trying to convince myself it's safe.