Performance woes. I'm appalled.
-
Don't I feel stupid.
Approx stack size of local JSON stuff is 160 bytes
Read 1290495 nodes and 20383269 characters in 416.631000 ms at 45.603904MB/s
Skipped 1290495 nodes and 20383269 characters in 184.131000 ms at 103.187405MB/s
utf8 scanned 20383269 characters in 146.422000 ms at 129.761921MB/s
raw ascii i/o 20383269 characters in 58.902000 ms at 322.569692MB/s
raw ascii block i/o 19 blocks in 3.183000 ms at 5969.211436MB/s

Much better. I was using the wrong gcc options. I'm used to msvc.
Real programmers use butterflies
Are you familiar with the Compiler Explorer[^]? It's a very useful tool for looking at the assembly generated by gcc and other compilers.
-
Are you familiar with the Compiler Explorer[^]? It's a very useful tool for looking at the assembly generated by gcc and other compilers.
I like to do broad, algorithmic optimizations before I try to outsmart the compiler. I've gotten at least a 3 times speed improvement by changing my parsing to use strpbrk() over a memory mapped file. :-D
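For anyone curious, the mmap + strpbrk idea boils down to something like this. This is only a sketch of the technique, not my actual LexSource or parser code:

```c
// Simplified sketch of the mmap + strpbrk idea: let libc's optimized scanner
// hop between the structural characters of the document.
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc < 2) return 1;
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) return 1;
    struct stat st;
    if (fstat(fd, &st) < 0) return 1;

    // Map the whole file read-only; the kernel pages it in as it's touched.
    char *data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) return 1;

    // Caveat: strpbrk() wants a NUL terminator. This sketch leans on the
    // zero-filled tail of the last mapped page; a real reader would bound
    // the scan explicitly instead.
    size_t structural = 0;
    const char *p = data;
    const char *end = data + st.st_size;
    while ((p = strpbrk(p, "{}[]\"")) != NULL && p < end) {
        ++structural;
        ++p;
    }
    printf("%zu structural characters\n", structural);

    munmap(data, (size_t)st.st_size);
    close(fd);
    return 0;
}
```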
Approx stack size of local JSON stuff is 176 bytes
Read 1231370 nodes and 20383269 characters in 268.944000 ms at 70.646677MB/s
Skipped 1231370 nodes and 20383269 characters in 35.784000 ms at 530.963559MB/s
utf8 scanned 20383269 characters in 78.679000 ms at 241.487563MB/s
raw ascii i/o 20383269 characters in 58.141000 ms at 326.791765MB/s
raw ascii block i/o 19 blocks in 3.369000 ms at 5639.655684MB/s

The "Skipped" line is the relevant one here. That's doing a parse of the bones of the document (looking for {}[]") in order to skip over it in a structured way. That style of parsing is used for searching, for example, when you're trying to find all ids in a document. It's using the mmap technique I mentioned. Here's snagging all "id" fields out of a 20MB file and reading their values:
Approx stack size of local JSON stuff is 152 bytes
Found 40008 fields and scanned 20383269 characters in 34.664000 ms at 548.119086MB/s

The "approx stack size" figure is roughly how much memory the query takes - including the sizes of the JsonReader and LexSource member variables.
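To give a rough idea of what that kind of query does, here's a brute-force illustration over an already-mapped buffer. It's not the JsonReader code - it ignores escapes and strings that merely contain "id" - just the shape of the thing:

```c
// Naive illustration only: scan a buffer for "id" keys and print the value
// text that follows. Ignores escape sequences, nesting, and other edge cases.
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>

static void dump_id_values(const char *buf, size_t len) {
    const char *p = buf, *end = buf + len;
    const char key[] = "\"id\"";
    while ((p = memmem(p, (size_t)(end - p), key, sizeof key - 1)) != NULL) {
        p += sizeof key - 1;
        // Skip whitespace and require a ':' so we only match actual keys.
        while (p < end && (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')) ++p;
        if (p >= end || *p != ':') continue;
        ++p;
        while (p < end && (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')) ++p;
        // Print the raw value text up to the next delimiter.
        const char *v = p;
        while (p < end && *p != ',' && *p != '}' && *p != ']' && *p != '\n') ++p;
        printf("%.*s\n", (int)(p - v), v);
    }
}
```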
Real programmers use butterflies
-
Read 1231388 nodes and 20383269 characters in 1069.479000 ms at 17.765660MB/s
Skipped 1231388 nodes and 20383269 characters in 534.699000 ms at 35.534011MB/s
utf8 scanned 20383269 characters in 377.561000 ms at 50.322994MB/s
raw ascii i/o 20383269 characters in 62.034000 ms at 306.283651MB/s
raw ascii block i/o 19 blocks in 49.023000 ms at 387.573180MB/s

The first line is full JSON parsing. The second line is JSON "skipping" - a minimal read where it doesn't normalize anything; it just moves as fast as possible through the document. The third line is utf8 reading through my input source class, but without doing anything JSON related. The fourth line is calling fgetc() in a loop. The fifth line is calling fread() in a loop and then scanning over the characters in each block (so I'm not totally cheating by not examining characters).

The issue here is the difference between my third line and the fourth line (utf8 scan vs fgetc). The trouble is even when I removed the encoding it made no measurable difference in speed. Underneath everything both are using fgetc. Even when I changed mine to block read using fread() it didn't speed things up. I'm at a loss. I'm not asking a question here, mostly just expressing frustration because I have not a clue how to optimize this.
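For reference, the two raw baselines (the fourth and fifth lines) are nothing fancier than this - a simplified sketch, with the block size just a guess:

```c
#include <stdio.h>

// Baseline "raw ascii i/o": fgetc() in a loop, touching every character.
static long count_fgetc(FILE *f) {
    long n = 0;
    int ch;
    while ((ch = fgetc(f)) != EOF)
        ++n;
    return n;
}

// Baseline "raw ascii block i/o": fread() into a big block, then walk the
// block so every character still gets examined (no cheating).
static long count_blocks(FILE *f) {
    static char block[1024 * 1024];   // block size is a guess here
    long n = 0;
    size_t got;
    while ((got = fread(block, 1, sizeof block, f)) > 0)
        for (size_t i = 0; i < got; ++i)
            if (block[i] != '\0')     // look at each character
                ++n;
    return n;
}
```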
Real programmers use butterflies
-
I suspect that the utf8 scanning is using fgetc underneath to return one character at a time. This would greatly simplify the implementation of the utf8 scanner.
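Something like this is what I mean - pulling one byte at a time with fgetc() keeps the decoder close to trivial (no validation here, just the shape of it):

```c
#include <stdio.h>

// Minimal illustration: decode one UTF-8 code point using fgetc().
// Returns the code point, or -1 on EOF. No validation of continuation
// bytes or overlong forms - just enough to show why fgetc() keeps it simple.
static long read_utf8(FILE *f) {
    int c = fgetc(f);
    if (c == EOF) return -1;
    int extra;
    long cp;
    if ((c & 0x80) == 0x00)      { cp = c;        extra = 0; }
    else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
    else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
    else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
    else return 0xFFFD;          // invalid lead byte
    while (extra--) {
        c = fgetc(f);
        if (c == EOF) return -1;
        cp = (cp << 6) | (c & 0x3F);
    }
    return cp;
}
```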
What I use under the covers depends on what kind of LexSource you use. Mainly I use memory mapped files now, for speed, but I'm implementing one using fread and buffered access and we'll see how that stacks up. I'm very nearly breaking 600MB/s of JSON searching on my machine. :)
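The buffered one will be something along these lines - a rough sketch of the idea, not the actual LexSource interface:

```c
#include <stdio.h>

// Rough sketch of a buffered source: fill a block with fread(), hand out
// characters from the block, refill when it runs dry.
typedef struct {
    FILE *file;
    unsigned char buf[65536];   // buffer size picked arbitrarily here
    size_t pos;
    size_t len;
} buffered_source;

static void bs_init(buffered_source *s, FILE *f) {
    s->file = f;
    s->pos = 0;
    s->len = 0;
}

static int bs_getc(buffered_source *s) {
    if (s->pos >= s->len) {
        s->len = fread(s->buf, 1, sizeof s->buf, s->file);
        s->pos = 0;
        if (s->len == 0) return EOF;   // end of file or read error
    }
    return s->buf[s->pos++];
}
```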
Real programmers use butterflies