Why Nostr? What is Njump?
2024-06-29 22:24:31

mleku on Nostr:

i'm getting quite good at writing JSON parsers now... i write them so they "consume" the input bytes and return what comes after each element is decoded and converted to the runtime format

though i've done them about 3 different ways... for the most part you create a simple unmarshal function for each element, like a JSON string, or a decimal number, and then you write simple state machines that recognise stuff
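a minimal sketch of what one of those per-element consuming parsers can look like (the function name and signature here are illustrative, not the actual code): it reads one JSON string off the front of the input and hands back whatever bytes come after it, so the next state machine can pick up where this one left off.

```go
package main

import (
	"errors"
	"fmt"
)

// unmarshalString consumes one JSON string from the front of b and returns
// the decoded bytes plus the remainder of the input after the closing quote.
// escapes are skipped over but not decoded, since nostr keys and labels are
// plain lower case and don't need unescaping.
func unmarshalString(b []byte) (s, rem []byte, err error) {
	if len(b) < 2 || b[0] != '"' {
		return nil, b, errors.New("expected opening quote")
	}
	for i := 1; i < len(b); i++ {
		switch b[i] {
		case '\\':
			i++ // skip the escaped character
		case '"':
			return b[1:i], b[i+1:], nil
		}
	}
	return nil, b, errors.New("unterminated string")
}

func main() {
	s, rem, err := unmarshalString([]byte(`"hello",123`))
	fmt.Printf("%s | %s | %v\n", s, rem, err)
}
```

the point of returning `rem` is that parsers compose: each one eats its piece and the caller keeps threading the leftover bytes into the next one.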

the objects are the harder ones, they have several states as the parser passes through them; the keys are strings, but they don't need unescaping because the spec uses simple lower case in most cases. this covers the events and filters

then you have envelopes, which all share a common initial part; you can implement that as a preliminary pass that returns the label EVENT or REQ or whatever as a string, and then write a switch on it to invoke the different envelope unmarshalers
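that preliminary-pass-then-switch shape can be sketched like this (a hedged example, not the real implementation; `identifyEnvelope` is a made-up name, and it assumes compact wire JSON with no leading whitespace inside the array):

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// identifyEnvelope consumes the opening '[' and the label string of a nostr
// envelope such as ["EVENT",...] or ["REQ",...], returning the label and the
// remaining bytes so a per-label unmarshaler can take over from there.
func identifyEnvelope(b []byte) (label string, rem []byte, err error) {
	b = bytes.TrimSpace(b)
	if len(b) == 0 || b[0] != '[' {
		return "", b, errors.New("expected '['")
	}
	b = b[1:]
	if len(b) == 0 || b[0] != '"' {
		return "", b, errors.New("expected label string")
	}
	end := bytes.IndexByte(b[1:], '"')
	if end < 0 {
		return "", b, errors.New("unterminated label")
	}
	return string(b[1 : 1+end]), b[2+end:], nil
}

func main() {
	label, rem, err := identifyEnvelope([]byte(`["EVENT",{"kind":1}]`))
	if err != nil {
		panic(err)
	}
	// the switch dispatches to the envelope-specific unmarshaler,
	// each of which continues consuming from rem
	switch label {
	case "EVENT":
		fmt.Println("event envelope, rest:", string(rem))
	case "REQ":
		fmt.Println("req envelope")
	default:
		fmt.Println("unknown label:", label)
	}
}
```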

i'm near the end of it now... just finally drafted a "filters" parser, which is constructed similarly to the other array parsers, except instead of strings or numbers like with tags or kinds it parses filter objects out

i'm looking at it all and i think after i've got all the envelopes done i'm going to try to generalise the array and object parsing code so it all calls a single simple function that is passed an array of the expected types and the unmarshalers for the fields of each object
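one way that generalisation could look (purely a sketch under my own naming, not the author's design; it assumes compact JSON with no whitespace between tokens, which is what nostr relays emit): a single object-parsing loop that reads each key and dispatches to a per-field consuming function registered in a table.

```go
package main

import (
	"errors"
	"fmt"
)

// fieldFn consumes one value from the front of b and returns the remainder.
type fieldFn func(b []byte) (rem []byte, err error)

// unmarshalObject is one generic loop for all objects: read a key, look up
// its unmarshaler in the table, let it consume the value, repeat until '}'.
func unmarshalObject(b []byte, fields map[string]fieldFn) (rem []byte, err error) {
	if len(b) == 0 || b[0] != '{' {
		return b, errors.New("expected '{'")
	}
	b = b[1:]
	for {
		if len(b) == 0 {
			return b, errors.New("unexpected end of input")
		}
		switch b[0] {
		case '}':
			return b[1:], nil
		case ',':
			b = b[1:]
			continue
		}
		// read the key; nostr keys are plain lower case, no unescaping needed
		if b[0] != '"' {
			return b, errors.New("expected key")
		}
		i := 1
		for i < len(b) && b[i] != '"' {
			i++
		}
		key := string(b[1:i])
		b = b[i+1:]
		if len(b) == 0 || b[0] != ':' {
			return b, errors.New("expected ':'")
		}
		b = b[1:]
		fn, ok := fields[key]
		if !ok {
			return b, fmt.Errorf("unknown key %q", key)
		}
		if b, err = fn(b); err != nil {
			return b, err
		}
	}
}

func main() {
	var kind []byte
	fields := map[string]fieldFn{
		"kind": func(b []byte) ([]byte, error) {
			i := 0
			for i < len(b) && b[i] >= '0' && b[i] <= '9' {
				i++
			}
			kind, b = b[:i], b[i:]
			return b, nil
		},
	}
	rem, err := unmarshalObject([]byte(`{"kind":1}rest`), fields)
	fmt.Println(string(kind), string(rem), err)
}
```

the table is the only per-object part; events, filters, and the objects inside envelopes could all share the same loop with different tables.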

it's not a hell of a lot of code, altogether, but by collapsing repetition down it becomes easier to eliminate any bugs

the code is way, way faster, in large part because i am switching from strings to bytes in all the parsing; the Go string type is a force of garbage production, and the entire set of marshaling and unmarshaling interfaces in the standard library is not designed with buffer reuse in mind

so, it's a case of short-sighted architectural decisions, and most people just use these pre-made things and then cry when the C++ and Rust and C coders write far more efficient handlers for this data

not the language, the goddamn standard library

one of the things that is happening, though, is that increasing amounts of the standard library are being extended and reworked to use more memory-conserving algorithms; notably, there are now very few things you can't do with byte slices that previously could only be done with strings

personally, i'd eradicate the string type from the language altogether, and eliminate the whole existence of the equality operator on a structured type like this. i'm betting that before another 10 version bumps, much of the Go standard library will be written with reusable buffers in mind

it's not even hard to do buffer reuse, using sync.Pool as it already exists; it's just that so many of the necessary functions for things like decoding numbers from text form, or encoding them, use strings, which precludes any use of a pool or reusing buffers or avoiding allocations
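for what the sync.Pool side of that looks like, here is a minimal sketch (my own illustrative names, using the stock *bytes.Buffer idiom rather than any particular codebase): marshalers grab a buffer from the pool, append into it, and return it when done, so steady-state encoding allocates almost nothing.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers; New only runs when the pool is empty.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// encodeLabel writes the start of an envelope into a pooled buffer.
func encodeLabel(label string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(buf) // hand the buffer back for the next caller
	buf.Reset()
	buf.WriteString(`["`)
	buf.WriteString(label)
	buf.WriteString(`"]`)
	return buf.String() // one copy out; the buffer itself is reused
}

func main() {
	fmt.Println(encodeLabel("EVENT"))
}
```

the catch the post is pointing at: this only pays off if everything downstream also takes byte slices or writers; one string-only function in the chain forces an allocation anyway.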

i think that garbage collection is a very handy feature, but overall, the more your code relies on it, the more you cede control over the scheduling of processing to random stop-the-world pauses that can kill synchronisation and throughput in subtle, impossible-to-debug ways that can only be properly fixed by switching to APIs that enable buffer reuse and eliminate immutable, garbage-bomb strings
Author Public Key
npub1fjqqy4a93z5zsjwsfxqhc2764kvykfdyttvldkkkdera8dr78vhsmmleku