Code Project
Incorporating values directly (as is/bytewise) into a string

Category: .NET (Core and Framework)
Tags: csharp, c++, performance, help, question
    [#1] primem0ver

    I know this may seem arbitrary to some, but the nature of the project that I am currently working on makes this capability an easy way to solve several problems. I want an efficient, seamless way to integrate values into a stream of characters. I have done this in the past by converting byte values into chars, or even by using value.ToString(), but it would be more efficient if the characters could be read directly from the memory containing the values. (Examples, with 16-bit chars: a 4-byte integer is read as 2 characters, an 8-byte long as 4 characters, and a 16-byte GUID as 8 characters.) This would be really simple in C++. In C# it is turning out to be a huge challenge. I tried using a generic with StructLayout.Explicit, but I got a runtime error stating that this is not supported. Using non-generics doesn't work either, because strings and arrays are classes while values are structs, and mixing them also doesn't work, for the simple reason that the non-zero-terminated strings used in .NET track data that is probably inconsistent with this layout. Is there any way to directly read (or even copy) data from numeric value bytes into a string? I can think of one possibility using unsafe copying, but I was wondering whether I had any better (more efficient) options.

      [#2] primem0ver (in reply to #1, primem0ver)

      I was unable to create a structure that did this. However, I was able to create a generic extension method that uses "unsafe" copying to perform the task. The code is below. I am still not sure this is the most efficient way to accomplish the task, but it works. The code requires using System.Runtime.InteropServices and System.Text, must live in a static class, and must be compiled with unsafe code enabled.

      public static unsafe string ToBytewiseString<T>(this T item) where T : unmanaged
      {
          // sizeof(T) needs the unmanaged constraint (C# 7.3 or later); it stands in
          // for the GetSize() helper, which is not a member of System.Type.
          int size = sizeof(T);
          // Box the value and pin the box so the GC cannot move it while it is read.
          GCHandle pinnedHandle = GCHandle.Alloc(item, GCHandleType.Pinned);
          try
          {
              byte* bytes = (byte*)pinnedHandle.AddrOfPinnedObject();
              StringBuilder result = new StringBuilder((size + 1) / 2);
              int offset = 0;
              if ((size % 2) == 1)
              {
                  // Odd size: emit the first byte as a char of its own,
                  // then read the remaining bytes in pairs.
                  result.Append((char)bytes[0]);
                  offset = 1;
              }
              for (; offset < size; offset += 2)
                  result.Append(*(char*)(bytes + offset));
              return result.ToString();
          }
          finally
          {
              // Release the handle even if an exception is thrown.
              pinnedHandle.Free();
          }
      }
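
For comparison, the same bytewise reinterpretation can be sketched without unsafe code or pinning. This is only a sketch under stated assumptions: it assumes a runtime where MemoryMarshal.CreateReadOnlySpan and the string(ReadOnlySpan<char>) constructor are available (.NET Core 2.1 or later), it handles even-sized values only, and the names BytewiseSketch and ToBytewiseStringSafe are hypothetical, not part of the code posted above.

```csharp
using System;
using System.Runtime.InteropServices;

public static class BytewiseSketch
{
    // Hypothetical helper: reinterprets the raw bytes of an even-sized
    // unmanaged value as UTF-16 code units and copies them into a string.
    public static string ToBytewiseStringSafe<T>(this T item) where T : unmanaged
    {
        // View the local copy of the value as a one-element span, then as bytes.
        ReadOnlySpan<T> one = MemoryMarshal.CreateReadOnlySpan(ref item, 1);
        ReadOnlySpan<byte> bytes = MemoryMarshal.AsBytes(one);
        if (bytes.Length % 2 != 0)
            throw new ArgumentException("This sketch only handles even-sized values.");

        // Reinterpret each byte pair as one char; the string constructor copies them.
        ReadOnlySpan<char> chars = MemoryMarshal.Cast<byte, char>(bytes);
        return new string(chars);
    }
}
```

On a little-endian machine, (0x00420041).ToBytewiseStringSafe() yields "AB", and a Guid yields 8 chars. Note that arbitrary 16-bit values can be unpaired surrogates, so such a string is not guaranteed to survive a round trip through a text encoding such as UTF-8.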

        [#3] trønderen (in reply to #1, primem0ver)

        I am tempted to suggest that you switch to C/C++. If I understand your question right, you are asking for a C# equivalent of the C/C++ 'union'. In my opinion, not offering unions is one of the strong arguments for C# over C/C++. For the oldtimers: union is a C variant of FORTRAN COMMON blocks, which is one of the craziest ideas of language design! Also, it was one of the greatest threats ever to software robustness. Doing a simple search for 'C# union', I hit upon 'C# equivalent to C "union"'. I was never aware of StructLayout(LayoutKind.Explicit) and FieldOffset(). Honestly: I haven't been missing out on anything valuable. I may try to forget that I have ever seen it. But maybe you can make use of it.
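
For readers unfamiliar with the attributes named above, here is a minimal sketch of a union-like struct for one fixed, non-generic type. The type and member names (Int32Chars, First, Second, UnionDemo) are hypothetical and chosen for illustration. As the original post reports, the runtime rejects explicit layout on generic types, so a struct like this has to be written once per value type.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical union-like struct: one int overlaid with two UTF-16 code units.
[StructLayout(LayoutKind.Explicit)]
public struct Int32Chars
{
    [FieldOffset(0)] public int Value;    // all four bytes
    [FieldOffset(0)] public char First;   // bytes 0-1
    [FieldOffset(2)] public char Second;  // bytes 2-3

    public override string ToString() => new string(new[] { First, Second });
}

public static class UnionDemo
{
    public static string Demo()
    {
        // Writing Value also changes First and Second, because the fields overlap.
        var u = new Int32Chars { Value = 0x00420041 };
        return u.ToString(); // "AB" on a little-endian machine
    }
}
```

Overlapping fields are only allowed here because all of them are value types; overlaying a reference type such as string with value-type fields is not permitted by the runtime.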

          [#4] Lost User (in reply to #3, trønderen)

          trønderen wrote:

          union is a C variant of FORTRAN COMMON blocks

          No, COMMON blocks were there so you could share memory between modules. A C union does not provide any sharing capability.

            [#5] trønderen (in reply to #4, Lost User)

            The common property between unions and common blocks is that they both provide a common blob to different users, saying: Here is a binary blob, interpret it any way you want! Details in accessibility are different; in C you may have somewhat better control over which modules have access to the union definition. And you have collected all the different interpretations of that binary blob in one place (at least until you start casting pointer types). Yet, the basic concept is the same: A binary blob that can be interpreted in any way that the accessor would like to.

            Sure, it must be one of the alternatives in the union definition. Just like an interpretation of a Fortran COMMON block must be according to one of the alternative source code definitions of that COMMON block in the modules that may access it. You may argue that collecting the COMMON block definitions / union alternatives in a single place is an improvement. Yes, it is, but the basic idea of a binary blob providing multiple interpretations is the same.

            You may argue that while any Fortran module might access that binary COMMON blob, only those C modules that include the definition of the union and know the address of (if you like: a pointer to) a union instance can access it. But this doesn't give any sort of protection against the uncontrolled interpretation of the binary blob. Even Fortran COMMON blocks had some accessibility control: You had not only the plain, anonymous COMMON blocks but also named COMMON blocks, sort of comparable to letting only selected modules #include the union type definition: If you didn't know the name of the blob, you didn't have access to it. Sure, it was a poor kind of protection, but lots of protection is based on the (lack of) knowledge of how to access it. C/C++ provides a somewhat better protection. In this case, considering how easily any C pointer can be cast into a pointer of any other type (and note: the definition of the target type is arbitrary; it doesn't have to be any centrally managed type definition), the type control of C/C++ lies much closer to the weak Fortran type check than to that of C#.

              [#6] primem0ver (in reply to #3, trønderen)

              Geesh. Did you read my OP? Considering the rest of your post, I guess I shouldn't be that surprised. I already tried an Explicit layout. From my OP: "I tried using a generic with using StructLayout.Explicit but I got a runtime error stating that this is not supported". In other words, .NET currently doesn't allow explicit layouts on generic types. And just because YOU don't find anything valuable in it doesn't mean it isn't valuable. You apparently don't understand the nature of efficient memory management; nor "robustness"; and your disregard for "oldtimers" is very revealing about both your experience and nature. I would change that attitude before you spend your life eating your words. FYI, my reply above to my own topic is a solution to the problem. What is more robust than not having to introduce a new data type?

                [#7] Lost User (in reply to #1, primem0ver)

                I use MemoryStreams and BinaryReaders / BinaryWriters for handling "binary" data; "characters" being something that depends on the encoding.
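
A minimal sketch of that approach, assuming a record made of an int, a long, a Guid and a UTF-8 string. The class and method names (BinaryStreamSketch, Pack, Unpack) and the choice of fields are hypothetical and only illustrate the MemoryStream / BinaryWriter / BinaryReader pattern; they are not taken from any post in this thread.

```csharp
using System;
using System.IO;
using System.Text;

public static class BinaryStreamSketch
{
    // Writes the values in their binary form; only the string goes through an encoding.
    public static byte[] Pack(int id, long stamp, Guid key, string text)
    {
        using var ms = new MemoryStream();
        using (var writer = new BinaryWriter(ms, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(id);                 // 4 bytes
            writer.Write(stamp);              // 8 bytes
            writer.Write(key.ToByteArray());  // 16 bytes
            writer.Write(text);               // length prefix + UTF-8 bytes
        }
        return ms.ToArray();
    }

    // Reads the values back in the same order they were written.
    public static (int Id, long Stamp, Guid Key, string Text) Unpack(byte[] data)
    {
        using var reader = new BinaryReader(new MemoryStream(data), Encoding.UTF8);
        return (reader.ReadInt32(), reader.ReadInt64(),
                new Guid(reader.ReadBytes(16)), reader.ReadString());
    }
}
```

The design point is that the numeric values stay as bytes in a byte[] rather than being forced into chars, so no encoding can alter them on the way through.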

                "Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I

                  [#8] jschell (in reply to #3, trønderen)

                  trønderen wrote:

                  I am tempted to suggest that you switch to C/C++.

                  That is what I was thinking also.

                  trønderen wrote:

                  I may try to forget that I have ever seen it. But maybe you can make use of it.

                  lol. Probably a good idea. Tricks like that in C/C++ were, presumably, meant to squeeze extra performance out of some small system. Large systems are not impacted by micro-optimizations, so one should focus on real performance solutions rather than attempting stuff like this.

                    [#9] jschell (in reply to #6, primem0ver)

                    primem0ver wrote:

                    You apparently don't understand the nature of efficient memory management; nor "robustness"; and your disregard for "oldtimers" is very revealing about both your experience and nature.

                    Hmmmm... I have 15 years in C/C++, and probably 10 in assembly. I worked on systems with 4k memory. I have written heap management replacement systems for C and C++, specifically implemented to improve performance for specific applications. I have spent decades doing bit twiddling, and I have used the union mechanism in C/C++. I also have more than a decade in Java and more than a decade in C#. Each. In contrast to that, I specialize in large systems built to handle millions of customers with sustained throughput of thousands of TPS. I have friends who work with hundreds of thousands of sustained TPS. I have profiled applications extensively in C++, C# and Java. Not to mention decades of designing applications. I just wanted to establish what my actual experience is before commenting on what you said.

                    Presumably your request is based on an actual documented design, or on actual profiling of an existing application under realistic loads and on realistic hardware. Given that this is the case, I would suggest that if you have a project which actually requires a micro-optimization at that level, you should seriously think about using a different language than C#, such as C/C++. You can create a library with the required functionality and link it in to your C# application, or even create a stand-alone application which handles requests from the C# app. If I had that actual need, I would go with the stand-alone server. It is going to make maintenance and implementation a lot easier.

                      [#10] jsc42 (in reply to #5, trønderen)

                      Whilst you could use COMMON (in FORTRAN) to 'union' data in separate routines, you could not redefine the same COMMON block within the same routine. However, you could use another statement called EQUIVALENCE (IIRC; it is some decades since I last wrote any FORTRAN), which does do 'union's. It was often used to remap data in COMMON blocks, but it was not restricted solely to data in COMMON blocks, although that was its most frequent use in the programs that I inherited in the 1980s.

                        [#11] arshad hussain 2022 (in reply to #2, primem0ver)

                        The simplest way to perform this task is to convert the integer explicitly to the string data type using basic type-conversion code, like codeprozone, and to add the result at the appropriate position.

                          [#12] jschell (in reply to #7, Lost User)

                          I believe the question is about directly mapping from memory into a value, not about how to store it in a stream.

                            [#13] jschell (in reply to #1, primem0ver)

                            Efficient is a subjective word. It is often used in place of fast, or sometimes associated with less memory or with throughput. Performance is often used the same way. None of those mean anything without a context. A medical monitor, a CRC controller and a Facebook page are vastly different things, and performance means something different for each of them. In general, and almost always, the following levels are what impact it: 1. Business requirements (highest) 2. Architecture 3. Design 4. Implementation (lowest). This also includes ad hoc designs that were done without thinking. It has been proven that developers do not predict what impacts performance when they work solely at the implementation level and without profiling. The other levels require human skill. Optimizations at the first level are capable of having orders-of-magnitude impacts on the performance of systems. The impact goes down significantly at each level. At the bottom level, implementation improvements are unlikely to improve the system by more than 10%, unless a factor comes into play that is actually a failure in one of the levels above it.

                            primem0ver wrote:

                            I can think of one possibility using unsafe copying but was wondering if I had any other better (more efficient) options.

                            The memory mapped variables that you are referring to are "unsafe" because, as proven by decades of C/C++, programmers use them wrong, especially over time. Those failures lead to application crashes: not just small annoyances, but problems that make the OS terminate the application immediately, often in ways that seemingly have nothing to do with where the actual bad code is. So presumably the need for efficiency is a real one. One that has been measured. One that is not actually a failure at one of the other levels. So if a real need exists, one that has been localized, measured, and designed such that this optimization can improve something in the enterprise, then, as suggested elsewhere, use C/C++. Then map away. And if it were me, I would create a separate executable with just that code. Then when the exe crashes, it will not take the rest of the enterprise down.
