A little light reading

  • Lost User wrote:

    SSE 4.2 is overrated though. The instructions are neat, but they don't execute very quickly. There are also no AVX2 equivalents, just a VEX-encoded version of the 128-bit operations. Overall, SSE 4.2 usually doesn't work out that well, though it has niche uses, and [it turns out that the PCMPEQB & PMOVMSKB combo wins](http://0x80.pl/articles/simd-strfind.html). That's a bit more boring perhaps, but just because something is made for a particular purpose doesn't make it the best for that purpose. glibc also [uses generic instructions](https://code.woboq.org/userspace/glibc/sysdeps/x86_64/multiarch/strlen-avx2.S.html) instead of the SSE 4.2 special stuff.
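
For anyone who hasn't seen the PCMPEQB & PMOVMSKB pattern, here is a minimal sketch in C using SSE2 intrinsics. It only searches for a single byte, the name `find_byte_sse2` is made up for illustration, and `__builtin_ctz` assumes GCC or Clang. It is not the code from the linked article or from glibc, just the basic shape of the idea.

```c
#include <emmintrin.h>  /* SSE2: _mm_cmpeq_epi8 (PCMPEQB), _mm_movemask_epi8 (PMOVMSKB) */
#include <stddef.h>

/* Hypothetical helper: index of the first `needle` in buf[0..len), or len if absent. */
static size_t find_byte_sse2(const unsigned char *buf, size_t len, unsigned char needle)
{
    const __m128i pattern = _mm_set1_epi8((char)needle); /* needle in all 16 lanes */
    size_t i = 0;

    /* Compare 16 bytes at a time. */
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i)); /* unaligned load */
        __m128i eq    = _mm_cmpeq_epi8(chunk, pattern); /* 0xFF in every matching lane */
        int mask      = _mm_movemask_epi8(eq);          /* one bit per lane */
        if (mask != 0) {
            /* Lowest set bit = first matching byte (GCC/Clang builtin, an assumption). */
            return i + (size_t)__builtin_ctz((unsigned)mask);
        }
    }

    /* Scalar tail for the last len % 16 bytes. */
    for (; i < len; ++i) {
        if (buf[i] == needle)
            return i;
    }
    return len;
}
```

A multi-character search would OR several such comparisons together before taking the mask; whether that beats the SSE 4.2 string instructions on a given CPU is something to measure rather than assume.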

    honey the codewitch wrote (#8):

    In the end I didn't have to worry about it. I found out how optimized strpbrk() is, and I'm using it over a memory-mapped file. I'm searching through JSON, picking out fields at about 560MB/s now :-D That's satisfying enough, and more portable (the memory-mapped stuff isn't 100%, but I have code for Windows and I think either POSIX or Linux, so it works with both and falls back otherwise)
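
A rough sketch of what scanning with strpbrk() can look like, not the actual code from this post. `count_delims` and the delimiter set are hypothetical, and it assumes the text is NUL-terminated, which a memory-mapped file does not guarantee in general, so a real version would have to deal with that. It also ignores the fact that these characters can appear inside JSON strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the bytes of NUL-terminated `text` that appear in `set`. */
static size_t count_delims(const char *text, const char *set)
{
    size_t n = 0;
    const char *p = text;

    /* strpbrk() returns a pointer to the next byte that is in `set`, or NULL. */
    while ((p = strpbrk(p, set)) != NULL) {
        ++n;
        ++p; /* step past the hit and keep scanning */
    }
    return n;
}
```

For example, `count_delims(json, "{}[]:,")` walks from one structural character to the next, letting the C library's tuned strpbrk() do the skipping in between.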

    Real programmers use butterflies

    • Lost User wrote:

      Hmmm, Peter is an old CodeProject member. About 15 years ago he wrote the fastest Mandelbrot/Julia rendering engine. Best Wishes, -David Delaune

      Rick York wrote (#9):

      That is pretty interesting stuff! I used to be a fractal fanatic and spent a lot of time optimizing algorithms and investigating alternatives. Then I came across GPUs and CUDA and the search was over.

      "They have a consciousness, they have a life, they have a soul! Damn you! Let the rabbits wear glasses! Save our brothers! Can I get an amen?"

      • Chris Maunder wrote:

        Interesting that you post this since I was going to suggest, in your previous thread about perf, that you dip down into ASM to wring out some speed. And here you are. (I've always admired those fluent in any dialect of ASM, but have never actually bothered trying to learn a single instruction. Maybe it's about time I spent a weekend diving in.)

        cheers Chris Maunder

        theoldfool wrote (#10):

        Too high level. Anyone remember:

        C:\>debug
        -D
        0B06:0100 75 60 C6 46 00 00 8A 7E-04 F6 C7 04 74 E6 C6 46 u`.F...~....t..F
        0B06:0110 00 02 8B 76 02 80 3C 00-74 4B B3 2E 34 00 F5 0A ...v..<.tK..4...
        0B06:0120 B3 3A 38 5C FE 74 05 C6-46 00 01 4E 32 DB 86 1C .:8\.t..F..N2...
        0B06:0130 E8 39 EB 3B D6 73 1B 56-51 8B CE 8B F2 AC E8 B2 .9.;.s.VQ.......
        0B06:0140 E1 74 09 AC 3B F1 72 F5-59 5E EB 0B 3B F1 72 ED .t..;.r.Y^..;.r.
        0B06:0150 59 5E 3A 5C FF 74 0E B4-3B CD 21 86 1C 73 95 E8 Y^:\.t..;.!..s..
        0B06:0160 9B DA E9 C9 D7 E9 C3 D7-89 7E 02 80 46 01 0C B8 .........~..F...
        0B06:0170 3F 2E B9 08 00 F3 AA 86-C4 AA 86 C4 B1 03 F3 AA ?...............

        Now, those were the (so-called) good old days. :)

        If you can keep your head while those about you are losing theirs, perhaps you don't understand the situation.

          honey the codewitch wrote (#11):

          Yes. That reminds me of when I learned 6502 machine code before I realized I had a built-in mini-assembler.

            trønderen wrote (#12):

            The first assembler I used worked by simply adding symbols. The instruction set was very regular (the CPU architecture dated from long before microcode), so opcode, modifiers and offsets all had their fixed place in the instruction word. We played around with this: to generate a MUL (multiply) instruction, you could instead write ADD ADD, as the opcode for MUL was twice the opcode of ADD :-)

              Jorgen Andersson wrote (#13):

              If you consider one weekend enough... :omg:

              Wrong is evil and must be defeated. - Jeff Ello Never stop dreaming - Freddie Kruger

                W Balboos GHB wrote (#14):

                Chris Maunder wrote:

                but have never actually bothered trying to learn a single instruction.

                Allow me to get you started:

                MOV Chris, Good_Book;
                JMP ASM_PRO;

                Ravings en masse^

                "The difference between genius and stupidity is that genius has its limits." - Albert Einstein

                "If you are searching for perfection in others, then you seek disappointment. If you seek perfection in yourself, then you will find failure." - Balboos HaGadol Mar 2010

                  W Balboos GHB wrote (#15):

                  You need to check out FRACTINT - more fractal than even a fanatical fanatic can handle. It just seems to have more and more features. The Wikipedia link hardly scratches the surface.

                    Rick York wrote (#16):

                    Yes, I have it and it is quite good. I got lots of ideas from it. For the highest-performing fractal program I have ever seen, check out the Mandelbrot sample that comes with the CUDA SDK. It calculates in real time. You can pan and zoom and updates are instantaneous. It is really fast.

                    "They have a consciousness, they have a life, they have a soul! Damn you! Let the rabbits wear glasses! Save our brothers! Can I get an amen?"

                      Chris Maunder wrote (#17):

                      I have a short attention span. Does it contain pictures and large fonts?

                      cheers Chris Maunder

                        W Balboos GHB wrote (#18):

                        Does what?
