Case closed: NULLs are somehow valid in strings now?

Tags: perl, csharp, com, data-structures, tools
    djdanlib 0 wrote (#1):

    I was beating my head on the desk all day over a stupid issue of a Perl script not getting all of its arguments. You see... .NET strings can happily contain arbitrary NULL characters. (I didn't know that until today. Beware of this behavior.) Microsoft decided in their infinite wisdom that we programmers would just love to abandon the old NULL termination method where NULL signaled the end of a string. So that means .NET considers NULL a valid character in our strings, even though the definition of NULL is "THIS DATA DOES NOT EXIST and cannot be compared or evaluated."

    I was building a command line by concatenating multiple strings together. You can probably guess where this is going. Yes, an interop issue! Command lines ARE in fact NULL terminated. I use a WINAPI call that gets the short name for a long filename because somebody wrote a Perl script that can't handle filenames with spaces. Being a venerable old WINAPI function that's been with us since the Win95 days, it returns the result as a NULL terminated char array. So I said "return new string(buf)" to get the result as a string, and good old .NET said "Oh, that's okay, we can have NULLs in strings now!"

    So guess what happened? After I appended that short name to the argument string, everything after it got truncated when I passed it to a command line, and it rained on elementary school playgrounds around the world. Oh, the sadness was overwhelming. So I had to do a .Replace("\0", "") on that string before giving it to the ProcessStartInfo and all was golden and happy and the birds sang and rainbows issued forth from the heavens. Case closed.
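For anyone hitting the same thing, here is a minimal C# sketch of the pitfall and one way around it. It is not the original code: the buffer size, the sample path, and the names `NulTrimDemo`, `FakeNativeBuffer` and `FromNulTerminated` are made up for illustration, and the native call (presumably something like `GetShortPathName`) is only simulated by a zero-filled `char[]`.

```csharp
using System;

public static class NulTrimDemo
{
    // Simulates a fixed-size buffer filled by a native call: the text,
    // then a NUL terminator, then unused (zeroed) space.
    // The path and the buffer size are hypothetical.
    public static char[] FakeNativeBuffer()
    {
        char[] buf = new char[16];
        "C:\\PROGRA~1".CopyTo(0, buf, 0, 11);
        return buf;
    }

    // Keeps only the characters before the first NUL, the way a C caller
    // would read the buffer. Falls back to the whole buffer if no NUL exists.
    public static string FromNulTerminated(char[] buf)
    {
        int end = Array.IndexOf(buf, '\0');
        return end < 0 ? new string(buf) : new string(buf, 0, end);
    }
}
```

With this buffer, `new string(buf)` has length 16 (the text plus five NUL characters), while `FromNulTerminated(buf)` has length 11. Cutting at the first NUL mirrors what the native side means by "end of string", whereas `.Replace("\0", "")` removes every NUL and would glue on any leftover characters that happen to sit after the terminator. If the native function returns the number of characters it wrote, `new string(buf, 0, len)` is probably the more direct option.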

      AspDotNetDev wrote (#2):

      djdanlib wrote:

      the definition of NULL is "THIS DATA DOES NOT EXIST and cannot be compared or evaluated."

      To me, that is more an indicator that the "NULL character" is named poorly, not that .NET should exclude it from allowed characters in strings. NULL terminating strings seems silly to me.

        PIEBALDconsult wrote (#3):

        aspdotnetdev wrote:

        NULL terminating strings seems silly to me.

        I agree. I've had trouble when receiving data from a serial connection.

          Luc Pattyn wrote (#4):

          IMO one needs to choose one of two extremes for comfort:

          1. Use a "printable protocol", i.e. only transmit printable characters, i.e. the ASCII range [0x20,0x7E], and ignore everything else (including tabs, CR, LF, NULL). Every part in the serial chain (drivers, protocol stacks, modems, ...) will let them through unmodified.
          2. Use binary data, i.e. make sure your serial path is fully binary and doesn't touch any byte, does not replace CR by CRLF, does not swallow NULL, etc.

          I normally start out with #1 as it tends to work right away. When performance would become important, I'd consider switching to #2. :)
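As a rough illustration of option 1, the receiving side could simply drop every byte outside [0x20,0x7E] before interpreting the data. This is only a sketch under that assumption, not code from the post; `PrintableProtocol`, `IsPrintable` and `KeepPrintable` are hypothetical names.

```csharp
using System;
using System.Text;

public static class PrintableProtocol
{
    // True for bytes in the printable ASCII range [0x20, 0x7E].
    public static bool IsPrintable(byte b) => b >= 0x20 && b <= 0x7E;

    // Receiver side of a "printable protocol": ignore every byte outside
    // the printable range (tabs, CR, LF, NUL, ...) and keep the rest as text.
    public static string KeepPrintable(byte[] received)
    {
        var sb = new StringBuilder(received.Length);
        foreach (byte b in received)
        {
            if (IsPrintable(b))
                sb.Append((char)b);
        }
        return sb.ToString();
    }
}
```

For example, feeding it the bytes `48 00 69 0D 0A 09 21` yields the string `Hi!`: the NUL, CR, LF and tab are ignored, so it does not matter whether something in the serial chain inserted or swallowed them.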

            PIEBALDconsult wrote (#5):

            I was communicating with some sort of device -- I think the terminal server was causing the trouble.

              Lost User wrote (#6):

              .NET strings are Unicode, so the encoding matters. Could it be that, somewhere along the way, the strings are read with the wrong encoding? That could easily produce some strange results.

                Dave Calkins wrote (#7):

                That was my thought too. Unicode encoded with UCS-2 will commonly have every other byte NULL. So, yes, you can have NULLs in a string and it's perfectly valid in that encoding.
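A quick way to see the "every other byte" effect is to dump the bytes of a short ASCII string. The sketch below uses `Encoding.Unicode`, which is .NET's UTF-16 little-endian encoding and matches UCS-2 for characters in the Basic Multilingual Plane; the class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class Utf16Bytes
{
    // Hex dump of a string's UTF-16 little-endian bytes, e.g. "41-00-42-00".
    public static string Dump(string s) =>
        BitConverter.ToString(Encoding.Unicode.GetBytes(s));
}
```

`Utf16Bytes.Dump("AB")` gives `41-00-42-00`, so the high byte of each ASCII character is zero. Note that these are zero bytes inside 16-bit code units, which is not quite the same as a NUL character in the string: `Utf16Bytes.Dump("A\0B")` gives `41-00-00-00-42-00`, where the NUL character shows up as a full `00-00` unit.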

                  Stephen Hewitt wrote (#8):

                  Nothing's wrong with strings that can contain embedded NULLs. The fact that NULL terminated strings are so popular is mainly historical. There are times when it's handy or even necessary for a string to contain embedded NULLs, for example the Win32 SHFileOperation function uses strings that have embedded NULLs.
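To make that concrete: SHFileOperation takes its file lists as a single buffer in which each name ends with a NULL and the whole list ends with one more NULL. A .NET string can hold such a list directly. The sketch below only builds and splits the string; the names `MultiString`, `Build` and `Parse` are hypothetical, and how the string is then marshalled to the native call is outside its scope.

```csharp
using System;
using System.Collections.Generic;

public static class MultiString
{
    // Packs paths into one string: a NUL after each entry, plus one extra
    // NUL that marks the end of the list ("double-NUL-terminated").
    public static string Build(IEnumerable<string> paths)
    {
        return string.Join("\0", paths) + "\0\0";
    }

    // Splits a packed list back into its entries, dropping the empty
    // pieces produced by the trailing NULs.
    public static string[] Parse(string packed)
    {
        return packed.Split(new[] { '\0' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

`MultiString.Build(new[] { "a.txt", "b.txt" })` produces the 13-character string `a.txt\0b.txt\0\0`, and `Parse` turns it back into the two names. With NULL-terminated-only strings this kind of list could not be represented as a single string value at all.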

                  Steve
