Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. The Lounge
  3. Talk about demoralizing

Talk about demoralizing

Scheduled Pinned Locked Moved The Lounge
algorithms
19 Posts 8 Posters 1 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • honey the codewitchH honey the codewitch

    It's difficult to profile because it's cross compiled to run on an IoT device, where I don't have access to profiling. I *can* do it but it involves setting up a lot of test code in order to use GFX from my PC. Besides, I know where it's taking the time, and there's not much I can do about it. The source machine is running at between 160MHz and 240MHz and the operation cannot be readily parallelized without using RAM i don't have. Algorithmically, I've optimized it about as much as I can. The mixing plan for finding two dither colors is like O(log N) which isn't that bad. The problem is I have to do it for 200x200 pixels. And for my color display I have to do it twice per frame because it's a 3 color display organized in two monochrome planes - one white, and one red, and I don't have the RAM to store the frame between rendering the two planes. On a superscalar PC running at GHz speeds this is no problem, and I *thought* it shouldn't be a problem even for this machine, but I guess I dramatically underestimated the time it takes this algorithm to run. I actually have two algos. The first one is similar to the one used by photoshop (but different enough that it avoids patent infringement) The second one is a much faster, simpler algorithm. Color dithers are apparently not as easy as I thought.

    Real programmers use butterflies

    Greg UtasG Offline
    Greg UtasG Offline
    Greg Utas
    wrote on last edited by
    #5

    I knew nothing of dithering, so I just read about it. Interesting. Eventually you'll be a numerical analysis weenie. :-D It also means that what I was going to ask, namely whether you could cache frequently used results, seems to make no sense.

    Robust Services Core | Software Techniques for Lemmings | Articles
    The fox knows many things, but the hedgehog knows one big thing.

    <p><a href="https://github.com/GregUtas/robust-services-core/blob/master/README.md">Robust Services Core</a>
    <em>The fox knows many things, but the hedgehog knows one big thing.</em></p>

    honey the codewitchH 1 Reply Last reply
    0
    • Greg UtasG Greg Utas

      I knew nothing of dithering, so I just read about it. Interesting. Eventually you'll be a numerical analysis weenie. :-D It also means that what I was going to ask, namely whether you could cache frequently used results, seems to make no sense.

      Robust Services Core | Software Techniques for Lemmings | Articles
      The fox knows many things, but the hedgehog knows one big thing.

      honey the codewitchH Offline
      honey the codewitchH Offline
      honey the codewitch
      wrote on last edited by
      #6

      Yeah. My current plan is to stick with nearest matching color only for color displays. I'm going to try dithering once again on black and white since those algorithms run much faster. Edit: Black and white dithering is fast fast fast, so I've added support for it to b&w e-paper displays. I don't do it for monochrome displays yet, and I'm not sure I will since they can update in real time and dithering interferes with that, but since you can turn it off maybe I'll add support for it. I only do nearest color matching for color e-ink displays. The color dithering as I said, was just too expensive.

      Real programmers use butterflies

      1 Reply Last reply
      0
      • honey the codewitchH honey the codewitch

        I spent all day and night working on dithering routines. I finally got it all compiling, and it worked the first time it ran. And it only took 5 minutes to render a frame. :(( :(( :(( So I replaced it with an algorithm that's supposed to be much faster. I got it down to 3 minutes. I cannot use this code that ran myself ragged developing. *headdesk*

        Real programmers use butterflies

        S Offline
        S Offline
        Super Lloyd
        wrote on last edited by
        #7

        Well.. look at it this way, the code you took 1 day to write is almost as good as this PhD code the guy spent a few years perfecting! :O :)

        A new .NET Serializer All in one Menu-Ribbon Bar Taking over the world since 1371!

        1 Reply Last reply
        0
        • honey the codewitchH honey the codewitch

          I spent all day and night working on dithering routines. I finally got it all compiling, and it worked the first time it ran. And it only took 5 minutes to render a frame. :(( :(( :(( So I replaced it with an algorithm that's supposed to be much faster. I got it down to 3 minutes. I cannot use this code that ran myself ragged developing. *headdesk*

          Real programmers use butterflies

          E Offline
          E Offline
          Espen Harlinn
          wrote on last edited by
          #8

          I guess you do a bit of rounding then? If you do, it may be that it is possible to improve that by a lot of bit-fiddling … On x64 I've seen significant performance improvement, between 20% and 300%, for the following: isnan, isinf, signbit, frexp, min, max, trunc, round, clamp and lerp I have no idea about how well this will work out for an ARM cpu, but here is the core of my implementation (Sorry about the formatting, paste and encode as HTML doesn't work well for C++ code anymore :(():

          template
          struct FractionWidth;

          template <>
          struct FractionWidth
          {
          static constexpr UInt32 value = 23;
          };
          template <>
          struct FractionWidth
          {
          static constexpr UInt32 value = 52;
          };

          template
          struct ExponentWidth;
          template <>
          struct ExponentWidth
          {
          static constexpr UInt32 value = 8;
          };
          template <>
          struct ExponentWidth
          {
          static constexpr UInt32 value = 11;
          };

          template
          struct ExponenBias;
          template <>
          struct ExponenBias
          {
          static constexpr UInt32 value = _FBIAS;
          };
          template <>
          struct ExponenBias
          {
          static constexpr UInt32 value = _DBIAS;
          };

          template
          struct InfinityUnsignedValue;
          template <>
          struct InfinityUnsignedValue
          {
          static constexpr UInt32 value = 0X7F800000UL;
          };
          template <>
          struct InfinityUnsignedValue
          {
          static constexpr UInt64 value = 0x7FF0000000000000ULL;
          };

          template
          struct NegativeInfinityUnsignedValue;
          template <>
          struct NegativeInfinityUnsignedValue
          {
          static constexpr UInt32 value = 0xFF800000UL;
          };
          template <>
          struct NegativeInfinityUnsignedValue
          {
          static constexpr UInt64 value = 0xFFF0000000000000ULL;
          };
          template
          struct QuietNaNUnsignedValue;
          template <>
          struct QuietNaNUnsignedValue
          {
          static constexpr UInt32 value = 0XFFC00001UL;
          };
          template <>
          struct QuietNaNUnsignedValue
          {
          static constexpr UInt64 value = 0x7FF0000000000001ULL;
          };

          #pragma pack(push,1)
          template
          struct FloatingPoint
          {
          using ValueType = std::remove_cvref_t;
          using UIntType = MakeUnsigned;

          static constexpr Int32 FractionWidth = static\_cast( Internal::FractionWidth::value );
          static constexpr Int32 ExponentWidth = static\_cast( Internal::ExponentWidth::value );
          
          static constexpr Int32 ExponentBias = ( 1 << ( ExponentWidth - 1 ) ) - 1;
          
          st
          
          honey the codewitchH D 2 Replies Last reply
          0
          • E Espen Harlinn

            I guess you do a bit of rounding then? If you do, it may be that it is possible to improve that by a lot of bit-fiddling … On x64 I've seen significant performance improvement, between 20% and 300%, for the following: isnan, isinf, signbit, frexp, min, max, trunc, round, clamp and lerp I have no idea about how well this will work out for an ARM cpu, but here is the core of my implementation (Sorry about the formatting, paste and encode as HTML doesn't work well for C++ code anymore :(():

            template
            struct FractionWidth;

            template <>
            struct FractionWidth
            {
            static constexpr UInt32 value = 23;
            };
            template <>
            struct FractionWidth
            {
            static constexpr UInt32 value = 52;
            };

            template
            struct ExponentWidth;
            template <>
            struct ExponentWidth
            {
            static constexpr UInt32 value = 8;
            };
            template <>
            struct ExponentWidth
            {
            static constexpr UInt32 value = 11;
            };

            template
            struct ExponenBias;
            template <>
            struct ExponenBias
            {
            static constexpr UInt32 value = _FBIAS;
            };
            template <>
            struct ExponenBias
            {
            static constexpr UInt32 value = _DBIAS;
            };

            template
            struct InfinityUnsignedValue;
            template <>
            struct InfinityUnsignedValue
            {
            static constexpr UInt32 value = 0X7F800000UL;
            };
            template <>
            struct InfinityUnsignedValue
            {
            static constexpr UInt64 value = 0x7FF0000000000000ULL;
            };

            template
            struct NegativeInfinityUnsignedValue;
            template <>
            struct NegativeInfinityUnsignedValue
            {
            static constexpr UInt32 value = 0xFF800000UL;
            };
            template <>
            struct NegativeInfinityUnsignedValue
            {
            static constexpr UInt64 value = 0xFFF0000000000000ULL;
            };
            template
            struct QuietNaNUnsignedValue;
            template <>
            struct QuietNaNUnsignedValue
            {
            static constexpr UInt32 value = 0XFFC00001UL;
            };
            template <>
            struct QuietNaNUnsignedValue
            {
            static constexpr UInt64 value = 0x7FF0000000000001ULL;
            };

            #pragma pack(push,1)
            template
            struct FloatingPoint
            {
            using ValueType = std::remove_cvref_t;
            using UIntType = MakeUnsigned;

            static constexpr Int32 FractionWidth = static\_cast( Internal::FractionWidth::value );
            static constexpr Int32 ExponentWidth = static\_cast( Internal::ExponentWidth::value );
            
            static constexpr Int32 ExponentBias = ( 1 << ( ExponentWidth - 1 ) ) - 1;
            
            st
            
            honey the codewitchH Offline
            honey the codewitchH Offline
            honey the codewitch
            wrote on last edited by
            #9

            If it was closer to performing like I need I'd twiddle with optimizations like this, but this sort of improvement isn't going to change the code from taking minutes to taking seconds, and that's what I need. I've basically abandoned color dithering for this project.

            Real programmers use butterflies

            1 Reply Last reply
            0
            • honey the codewitchH honey the codewitch

              I spent all day and night working on dithering routines. I finally got it all compiling, and it worked the first time it ran. And it only took 5 minutes to render a frame. :(( :(( :(( So I replaced it with an algorithm that's supposed to be much faster. I got it down to 3 minutes. I cannot use this code that ran myself ragged developing. *headdesk*

              Real programmers use butterflies

              L Offline
              L Offline
              lmoelleb
              wrote on last edited by
              #10

              I am surprised it is that bad. I thought something like the ESP32 at 240Mhz would do a Floyd-Steinberg reasonable well, at least keeping up with the 680x0's and 86x86's I used to run Floyd-Steinberg on 25 odd years ago. I know it is a different architecture, but it is also a 200Mhz speed advantage. But maybe it was just a lot slower than I recall it - back then we where amazed it displayed something at all. I am quite sure it never took a minute though - but if it was ½ or 20 seconds who knows. I wonder if it is memory access or something slowing it down. Takes ages to develop these kind of things though. As soon as it displays something, you loose the next couple of hours looking at it before moving on. :)

              honey the codewitchH 2 Replies Last reply
              0
              • L lmoelleb

                I am surprised it is that bad. I thought something like the ESP32 at 240Mhz would do a Floyd-Steinberg reasonable well, at least keeping up with the 680x0's and 86x86's I used to run Floyd-Steinberg on 25 odd years ago. I know it is a different architecture, but it is also a 200Mhz speed advantage. But maybe it was just a lot slower than I recall it - back then we where amazed it displayed something at all. I am quite sure it never took a minute though - but if it was ½ or 20 seconds who knows. I wonder if it is memory access or something slowing it down. Takes ages to develop these kind of things though. As soon as it displays something, you loose the next couple of hours looking at it before moving on. :)

                honey the codewitchH Offline
                honey the codewitchH Offline
                honey the codewitch
                wrote on last edited by
                #11

                I can't do floyd steinberg because of the memory requirements. I do a similar style as Thomas Knoll's adobe photoshop grid dithering method for my "slow" dithering, and an optimized Yliluoma algorithm for my "fast" dithering. Both are far too slow.

                Real programmers use butterflies

                1 Reply Last reply
                0
                • L lmoelleb

                  I am surprised it is that bad. I thought something like the ESP32 at 240Mhz would do a Floyd-Steinberg reasonable well, at least keeping up with the 680x0's and 86x86's I used to run Floyd-Steinberg on 25 odd years ago. I know it is a different architecture, but it is also a 200Mhz speed advantage. But maybe it was just a lot slower than I recall it - back then we where amazed it displayed something at all. I am quite sure it never took a minute though - but if it was ½ or 20 seconds who knows. I wonder if it is memory access or something slowing it down. Takes ages to develop these kind of things though. As soon as it displays something, you loose the next couple of hours looking at it before moving on. :)

                  honey the codewitchH Offline
                  honey the codewitchH Offline
                  honey the codewitch
                  wrote on last edited by
                  #12

                  Adding were you doing color dithering? My black and white bayer dithering is quite fast. Also if you were doing color dithering there are much faster algos you can use when simply simulating a higher bit depth, but I actually have to do color matching to a palette. Here's how I have to choose two colors to blend:

                  template
                  gfx_result dither_mixing_plan_fast(const PaletteType* palette, typename PaletteType::mapped_pixel_type color, dither_mixing_plan_data_fast* plan) {
                  gfx_result rr ;
                  if(nullptr==plan || nullptr==palette) {
                  return gfx_result::invalid_argument;
                  }
                  rgb_pixel<24> rgb888;
                  rr = convert(color,&rgb888);
                  if(gfx_result::success!=rr) {
                  return rr;
                  }
                  const unsigned r= rgb888.template channel(),
                  g=rgb888.template channel(),
                  b=rgb888.template channel();

                  \*plan = { {0,0}, 0.5 };
                  double least\_penalty = 1e99;
                  for(unsigned index1 = 0; index1 < 16; ++index1)
                  for(unsigned index2 = index1; index2 < 16; ++index2)
                  {
                      // Determine the two component colors
                      typename PaletteType::mapped\_pixel\_type mpx1;
                      rr=palette->map(typename PaletteType::pixel\_type(index1),&mpx1);
                      if(gfx\_result::success!=rr) {
                          return rr;
                      }
                      typename PaletteType::mapped\_pixel\_type mpx2;
                      rr=palette->map(typename PaletteType::pixel\_type(index2),&mpx1);
                      if(gfx\_result::success!=rr) {
                          return rr;
                      }
                      rr = convert(mpx1,&rgb888);
                      if(gfx\_result::success!=rr) {
                          return rr;
                      }   
                      unsigned r1= rgb888.template channel(), 
                              g1=rgb888.template channel(), 
                              b1=rgb888.template channel();
                      rr = convert(mpx2,&rgb888);
                      if(gfx\_result::success!=rr) {
                          return rr;
                      }
                      unsigned r2= rgb888.template channel(), 
                              g2=rgb888.template channel(), 
                              b2=rgb888.template channel();
                      int ratio = 32;
                      if(mpx1.native\_value != mpx2.native\_value)
                      {
                          // Determine the ratio of mixing for each channel.
                          //   solve(r1 + ratio\*(r2-r1)/64 = r, ratio)
                          // Take a weighed average of these three ratios according to the
                          // perceived luminosity of each channel (accordin
                  
                  L 1 Reply Last reply
                  0
                  • E Espen Harlinn

                    I guess you do a bit of rounding then? If you do, it may be that it is possible to improve that by a lot of bit-fiddling … On x64 I've seen significant performance improvement, between 20% and 300%, for the following: isnan, isinf, signbit, frexp, min, max, trunc, round, clamp and lerp I have no idea about how well this will work out for an ARM cpu, but here is the core of my implementation (Sorry about the formatting, paste and encode as HTML doesn't work well for C++ code anymore :(():

                    template
                    struct FractionWidth;

                    template <>
                    struct FractionWidth
                    {
                    static constexpr UInt32 value = 23;
                    };
                    template <>
                    struct FractionWidth
                    {
                    static constexpr UInt32 value = 52;
                    };

                    template
                    struct ExponentWidth;
                    template <>
                    struct ExponentWidth
                    {
                    static constexpr UInt32 value = 8;
                    };
                    template <>
                    struct ExponentWidth
                    {
                    static constexpr UInt32 value = 11;
                    };

                    template
                    struct ExponenBias;
                    template <>
                    struct ExponenBias
                    {
                    static constexpr UInt32 value = _FBIAS;
                    };
                    template <>
                    struct ExponenBias
                    {
                    static constexpr UInt32 value = _DBIAS;
                    };

                    template
                    struct InfinityUnsignedValue;
                    template <>
                    struct InfinityUnsignedValue
                    {
                    static constexpr UInt32 value = 0X7F800000UL;
                    };
                    template <>
                    struct InfinityUnsignedValue
                    {
                    static constexpr UInt64 value = 0x7FF0000000000000ULL;
                    };

                    template
                    struct NegativeInfinityUnsignedValue;
                    template <>
                    struct NegativeInfinityUnsignedValue
                    {
                    static constexpr UInt32 value = 0xFF800000UL;
                    };
                    template <>
                    struct NegativeInfinityUnsignedValue
                    {
                    static constexpr UInt64 value = 0xFFF0000000000000ULL;
                    };
                    template
                    struct QuietNaNUnsignedValue;
                    template <>
                    struct QuietNaNUnsignedValue
                    {
                    static constexpr UInt32 value = 0XFFC00001UL;
                    };
                    template <>
                    struct QuietNaNUnsignedValue
                    {
                    static constexpr UInt64 value = 0x7FF0000000000001ULL;
                    };

                    #pragma pack(push,1)
                    template
                    struct FloatingPoint
                    {
                    using ValueType = std::remove_cvref_t;
                    using UIntType = MakeUnsigned;

                    static constexpr Int32 FractionWidth = static\_cast( Internal::FractionWidth::value );
                    static constexpr Int32 ExponentWidth = static\_cast( Internal::ExponentWidth::value );
                    
                    static constexpr Int32 ExponentBias = ( 1 << ( ExponentWidth - 1 ) ) - 1;
                    
                    st
                    
                    D Offline
                    D Offline
                    David ONeil
                    wrote on last edited by
                    #13

                    You made me curious. It looks like this worked. There were a couple quirk texts in your paste I had to eliminate, but the main thing was using Notepad++ to convert it to ANSI before pasting here. Weird - it took a bit of work, like 5 mins puttering... For my own code, I do a two-step process. First paste as HTML and encode, then copy everything, delete it, and repaste as C++.

                    template <typename T>
                    struct FractionWidth;

                    template <>
                    struct FractionWidth<float>
                    {
                    static constexpr UInt32 value = 23;
                    };
                    template <>
                    struct FractionWidth<double>
                    {
                    static constexpr UInt32 value = 52;
                    };

                    template <typename T>
                    struct ExponentWidth;
                    template <>
                    struct ExponentWidth<float>
                    {
                    static constexpr UInt32 value = 8;
                    };
                    template <>
                    struct ExponentWidth<double>
                    {
                    static constexpr UInt32 value = 11;
                    };

                    template <typename T>
                    struct ExponenBias;
                    template <>
                    struct ExponenBias<float>
                    {
                    static constexpr UInt32 value = _FBIAS;
                    };
                    template <>
                    struct ExponenBias<double>
                    {
                    static constexpr UInt32 value = _DBIAS;
                    };

                    template <typename T>
                    struct InfinityUnsignedValue;
                    template <>
                    struct InfinityUnsignedValue<float>
                    {
                    static constexpr UInt32 value = 0X7F800000UL;
                    };
                    template <>
                    struct InfinityUnsignedValue<double>
                    {
                    static constexpr UInt64 value = 0x7FF0000000000000ULL;
                    };

                    template <typename T>
                    struct NegativeInfinityUnsignedValue;
                    template <>
                    struct NegativeInfinityUnsignedValue<float>
                    {
                    static constexpr UInt32 value = 0xFF800000UL;
                    };
                    template <>
                    struct NegativeInfinityUnsignedValue<double>
                    {
                    static constexpr UInt64 value = 0xFFF0000000000000ULL;
                    };
                    template <typename T>
                    struct QuietNaNUnsignedValue;
                    template <>
                    struct QuietNaNUnsignedValue<float>
                    {
                    static constexpr UInt32 value = 0XFFC00001UL;
                    };
                    template <>
                    struct QuietNaNUnsignedValue<double>
                    {
                    static constexpr UInt64 value = 0x7FF0000000000001ULL;
                    };

                    pragma pack(push,1);

                    template<typename T>
                    struct FloatingPoint
                    {
                    using ValueType = std::remove_cvref_t<T>;
                    using UIntType = MakeUnsigned<ValueType>;

                    <pre>
                    static constexpr Int32 FractionWidth = static_cast<Int32>( Internal::FractionWidth<ValueType>::value );
                    static constexpr Int32 ExponentWidth = static_cast<Int32>( Internal::ExponentWidth<ValueTyp

                    1 Reply Last reply
                    0
                    • honey the codewitchH honey the codewitch

                      I spent all day and night working on dithering routines. I finally got it all compiling, and it worked the first time it ran. And it only took 5 minutes to render a frame. :(( :(( :(( So I replaced it with an algorithm that's supposed to be much faster. I got it down to 3 minutes. I cannot use this code that ran myself ragged developing. *headdesk*

                      Real programmers use butterflies

                      D Offline
                      D Offline
                      Daniel Pfeffer
                      wrote on last edited by
                      #14

                      :confused: A N^2-level grey scale dithering for a 2-colour screen (B&W) may be performed with an NxN integer matrix. If N is a power of 2, the most expensive operation would be masking the X, Y ordinates to the range 0 .. N-1, and reading the dither threshold value. Unless you are dealing with giant pictures, how would the dithering take that long? (You could even speed the rendering by dithering a partial block of the picture in memory, and then bitblt-ing the block to the screen. The size of the block would depend on the memory available.) As for colour dithering, assuming that you have an 8-colour screen (on/off for each of R,G,B), perhaps you could use a B&W dithering algorithm on each level - R, G, B, and then combine them. I do not know about the quality of the colours, though...

                      Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.

                      honey the codewitchH 1 Reply Last reply
                      0
                      • D Daniel Pfeffer

                        :confused: A N^2-level grey scale dithering for a 2-colour screen (B&W) may be performed with an NxN integer matrix. If N is a power of 2, the most expensive operation would be masking the X, Y ordinates to the range 0 .. N-1, and reading the dither threshold value. Unless you are dealing with giant pictures, how would the dithering take that long? (You could even speed the rendering by dithering a partial block of the picture in memory, and then bitblt-ing the block to the screen. The size of the block would depend on the memory available.) As for colour dithering, assuming that you have an 8-colour screen (on/off for each of R,G,B), perhaps you could use a B&W dithering algorithm on each level - R, G, B, and then combine them. I do not know about the quality of the colours, though...

                        Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.

                        honey the codewitchH Offline
                        honey the codewitchH Offline
                        honey the codewitch
                        wrote on last edited by
                        #15

                        Grayscale dithering is not color dithering. Also it's much faster to dither when you have a palette where the colors are evenly distributed/gradiated. When you're doing color matching to dither between, say 7 available colors while loading a 24-bit color JPEG, that's a different story. You can't color dither like you suggest. You have to mix two colors, not just one color and a fixed color like black This is what finding the two colors to mix looks like - or at least one of the ways I use:

                        template
                        gfx_result dither_mixing_plan_fast(const PaletteType* palette, typename PaletteType::mapped_pixel_type color, dither_mixing_plan_data_fast* plan) {
                        gfx_result rr ;
                        if(nullptr==plan || nullptr==palette) {
                        return gfx_result::invalid_argument;
                        }
                        rgb_pixel<24> rgb888;
                        rr = convert(color,&rgb888);
                        if(gfx_result::success!=rr) {
                        return rr;
                        }
                        const unsigned r= rgb888.template channel(),
                        g=rgb888.template channel(),
                        b=rgb888.template channel();

                        \*plan = { {0,0}, 0.5 };
                        double least\_penalty = 1e99;
                        for(unsigned index1 = 0; index1 < PaletteType::size; ++index1)
                        for(unsigned index2 = index1; index2 < PaletteType::size; ++index2)
                        {
                            // Determine the two component colors
                            typename PaletteType::mapped\_pixel\_type mpx1;
                            rr=palette->map(typename PaletteType::pixel\_type(index1),&mpx1);
                            if(gfx\_result::success!=rr) {
                                return rr;
                            }
                            typename PaletteType::mapped\_pixel\_type mpx2;
                            rr=palette->map(typename PaletteType::pixel\_type(index2),&mpx1);
                            if(gfx\_result::success!=rr) {
                                return rr;
                            }
                            rr = convert(mpx1,&rgb888);
                            if(gfx\_result::success!=rr) {
                                return rr;
                            }   
                            unsigned r1= rgb888.template channel(), 
                                    g1=rgb888.template channel(), 
                                    b1=rgb888.template channel();
                            rr = convert(mpx2,&rgb888);
                            if(gfx\_result::success!=rr) {
                                return rr;
                            }
                            unsigned r2= rgb888.template channel(), 
                                    g2=rgb888.template channel(), 
                                    b2=rgb888.template channel();
                            int ratio = 32;
                            if(mpx1.native\_value != mpx2.native\_value)
                            {
                                // Determine the ratio of mixing for
                        
                        1 Reply Last reply
                        0
                        • honey the codewitchH honey the codewitch

                          Adding were you doing color dithering? My black and white bayer dithering is quite fast. Also if you were doing color dithering there are much faster algos you can use when simply simulating a higher bit depth, but I actually have to do color matching to a palette. Here's how I have to choose two colors to blend:

                          template
                          gfx_result dither_mixing_plan_fast(const PaletteType* palette, typename PaletteType::mapped_pixel_type color, dither_mixing_plan_data_fast* plan) {
                          gfx_result rr ;
                          if(nullptr==plan || nullptr==palette) {
                          return gfx_result::invalid_argument;
                          }
                          rgb_pixel<24> rgb888;
                          rr = convert(color,&rgb888);
                          if(gfx_result::success!=rr) {
                          return rr;
                          }
                          const unsigned r= rgb888.template channel(),
                          g=rgb888.template channel(),
                          b=rgb888.template channel();

                          \*plan = { {0,0}, 0.5 };
                          double least\_penalty = 1e99;
                          for(unsigned index1 = 0; index1 < 16; ++index1)
                          for(unsigned index2 = index1; index2 < 16; ++index2)
                          {
                              // Determine the two component colors
                              typename PaletteType::mapped\_pixel\_type mpx1;
                              rr=palette->map(typename PaletteType::pixel\_type(index1),&mpx1);
                              if(gfx\_result::success!=rr) {
                                  return rr;
                              }
                              typename PaletteType::mapped\_pixel\_type mpx2;
                              rr=palette->map(typename PaletteType::pixel\_type(index2),&mpx1);
                              if(gfx\_result::success!=rr) {
                                  return rr;
                              }
                              rr = convert(mpx1,&rgb888);
                              if(gfx\_result::success!=rr) {
                                  return rr;
                              }   
                              unsigned r1= rgb888.template channel(), 
                                      g1=rgb888.template channel(), 
                                      b1=rgb888.template channel();
                              rr = convert(mpx2,&rgb888);
                              if(gfx\_result::success!=rr) {
                                  return rr;
                              }
                              unsigned r2= rgb888.template channel(), 
                                      g2=rgb888.template channel(), 
                                      b2=rgb888.template channel();
                              int ratio = 32;
                              if(mpx1.native\_value != mpx2.native\_value)
                              {
                                  // Determine the ratio of mixing for each channel.
                                  //   solve(r1 + ratio\*(r2-r1)/64 = r, ratio)
                                  // Take a weighed average of these three ratios according to the
                                  // perceived luminosity of each channel (accordin
                          
                          L Offline
                          L Offline
                          lmoelleb
                          wrote on last edited by
                          #16

                          It was color. As far as I recall, we had a 16 color fixed palette, and anything available above that would be "allocated" as images where displayed (I think the Amiga supported 32, 64, 128, 256 - while our Windows support would go directly to 256). We also supported 16/24 bit, but I believe we just did nearest color on 16 bit. Palette entries where allocated based on one or another algorithm based on distance to nearest "existing" color, the number of pixels using the color, and the number of free palette entries. But most likely our images where just small enough that we did not encounter memory issues. I recall we did support Amiga 500, but we might have had restrictions on features there. Most systems would have had at least 1MB, and then it is no problem keeping an extra scanline or two in memory for dithering.

                          honey the codewitchH 1 Reply Last reply
                          0
                          • L lmoelleb

                            It was color. As far as I recall, we had a 16 color fixed palette, and anything available above that would be "allocated" as images where displayed (I think the Amiga supported 32, 64, 128, 256 - while our Windows support would go directly to 256). We also supported 16/24 bit, but I believe we just did nearest color on 16 bit. Palette entries where allocated based on one or another algorithm based on distance to nearest "existing" color, the number of pixels using the color, and the number of free palette entries. But most likely our images where just small enough that we did not encounter memory issues. I recall we did support Amiga 500, but we might have had restrictions on features there. Most systems would have had at least 1MB, and then it is no problem keeping an extra scanline or two in memory for dithering.

                            honey the codewitchH Offline
                            honey the codewitchH Offline
                            honey the codewitch
                            wrote on last edited by
                            #17

                            lmoelleb wrote:

                            Palette entries where allocated based on one or another algorithm based on distance to nearest "existing" color, the number of pixels using the color, and the number of free palette entries.

                            I'm not sure what you mean by nearest existing color, as my algo has to find *two* colors in order to determine what to blend with what. I have a KD tree implementation waiting in the wings for larger palettes since it sorts in such a way as to speed up distance based matching, but it's not helpful for say, 16 colors. I may "pre-expand" the palette, mixing colors beforehand, so a 16 color palette becomes (16*15)/2 colors, and then trying throwing that into a kd_tree and see what happens. But that's my biggest issue, is finding the two colors to blend. The rest is fast.

                            Real programmers use butterflies

                            L 1 Reply Last reply
                            0
                            • honey the codewitchH honey the codewitch

                              lmoelleb wrote:

                              Palette entries where allocated based on one or another algorithm based on distance to nearest "existing" color, the number of pixels using the color, and the number of free palette entries.

                              I'm not sure what you mean by nearest existing color, as my algo has to find *two* colors in order to determine what to blend with what. I have a KD tree implementation waiting in the wings for larger palettes since it sorts in such a way as to speed up distance based matching, but it's not helpful for say, 16 colors. I may "pre-expand" the palette, mixing colors beforehand, so a 16 color palette becomes (16*15)/2 colors, and then trying throwing that into a kd_tree and see what happens. But that's my biggest issue, is finding the two colors to blend. The rest is fast.

                              Real programmers use butterflies

                              L Offline
                              L Offline
                              lmoelleb
                              wrote on last edited by
                              #18

                              We where mainly (maybe only) loading images with a palette. So we where dithering one palette to another, meaning there was a max of 256 colors in the source image (and often less than that). I This allowed us to quickly calculate the total number of a given color, and for each color calculate how close it as to existing colors in our palette. I guess it is pretty useless these days as no one works with palettes (I think we had a jpg decoder for fun, but probably was sticking to gif and various other formats for images we really needed). Once the palette was locked in for an image, we could just run the Floyd-Steinberg. It only requires finding the nearest color per pixel as "blending" is done by pushing the error ahead of the calculations (at the cost of one scanline extra memory consumed - though I guess you could stamp it into the bitmap data directly). No problem on a 500KB system with relative small images. We had the advantage advantage that low CPU spec systems where typically also running lower colors (so we only had to search for nearest color in 16 or 32 target colors), while systems running 256 colors typically also had more CPU power.

                              honey the codewitchH 1 Reply Last reply
                              0
                              • L lmoelleb

                                We where mainly (maybe only) loading images with a palette. So we where dithering one palette to another, meaning there was a max of 256 colors in the source image (and often less than that). I This allowed us to quickly calculate the total number of a given color, and for each color calculate how close it as to existing colors in our palette. I guess it is pretty useless these days as no one works with palettes (I think we had a jpg decoder for fun, but probably was sticking to gif and various other formats for images we really needed). Once the palette was locked in for an image, we could just run the Floyd-Steinberg. It only requires finding the nearest color per pixel as "blending" is done by pushing the error ahead of the calculations (at the cost of one scanline extra memory consumed - though I guess you could stamp it into the bitmap data directly). No problem on a 500KB system with relative small images. We had the advantage advantage that low CPU spec systems where typically also running lower colors (so we only had to search for nearest color in 16 or 32 target colors), while systems running 256 colors typically also had more CPU power.

                                honey the codewitchH Offline
                                honey the codewitchH Offline
                                honey the codewitch
                                wrote on last edited by
                                #19

                                Actually palettes are very useful these days for e-paper displays, which are either monochrome, or have a *fixed palette* of a handful of colors. That's primarily why my GFX library supports it.

                                Real programmers use butterflies

                                1 Reply Last reply
                                0
                                Reply
                                • Reply as topic
                                Log in to reply
                                • Oldest to Newest
                                • Newest to Oldest
                                • Most Votes


                                • Login

                                • Don't have an account? Register

                                • Login or register to search.
                                • First post
                                  Last post
                                0
                                • Categories
                                • Recent
                                • Tags
                                • Popular
                                • World
                                • Users
                                • Groups