Binary or Non-Binary File

    DotNetDominator
    wrote on last edited by
    #1

    Is there a way in C# or C++ to determine whether a given file is binary or non-binary? Some forums suggest checking the bytes of the file and looking for a null byte. Is there any other way? Any help would be appreciated.

      Judah Gabriel Himango
      wrote on last edited by
      #2

      Perhaps this API will help: IsTextUnicode


        Luc Pattyn
        wrote on last edited by
        #3

        Hi, all files are binary; some contain text (in ASCII, Unicode, whatever), some contain an image or some other kind of data. So you might have to clarify your question. :)

        Luc Pattyn [My Articles] [Forum Guidelines]

          DotNetDominator
          wrote on last edited by
          #4

          Of course, all files are binary, but I want to differentiate files based on the printable characters they contain. Basically, I need it for a utility that compares two files and writes the differences between them to a third file, or updates one file by comparing it to the others. I can only tell this much. Since such a comparison is meaningless for "binary" files like DLL, JAR, etc., I want to identify them before I compare them. I can't change the utility I will use for the comparison.

          I wrote the following method, which I think would work fine. Do you think it would work across all character sets? I am just reading the file byte by byte and looking for a byte which is zero. Then I know that the file is binary.

              static bool isBinary(ref BinaryReader binaryReader)
              {
                  bool nullByteFound = false;
                  long i = 0; // long, because BaseStream.Length is a long
                  byte unsignedByte;
                  while (i < binaryReader.BaseStream.Length)
                  {
                      unsignedByte = binaryReader.ReadByte();
                      if (unsignedByte == 0)
                      {
                          // A zero byte is taken as the sign of a binary file.
                          nullByteFound = true;
                          break;
                      }
                      i++;
                  }
                  Console.WriteLine("Null= " + nullByteFound);
                  return nullByteFound;
              }

          The other API, IsTextUnicode, may also help solve the problem if I retrieve the IS_TEXT_UNICODE_NULL_BYTES flag. Thanks all for your help on this.

            Luc Pattyn
            wrote on last edited by
            #5

            Hi, if a text file is encoded using ASCII, ANSI, or some other 8-bit character set, then zero-testing looks acceptable. If a text file is encoded using some 16-bit encoding scheme, then zero bytes can occur in text files (e.g. the chars 0x0100, 0x0200, etc.). You could check the first few bytes of the file; Unicode encodings such as UTF-8 and UTF-16 may use special values there (a byte order mark). If these match, you might assume it is text and skip further testing (and once in a while such an assumption will be wrong); if they don't match, you could assume it is an 8-bit encoding and do the zero test. Whatever you do, since 100% confidence will not be achievable, I see no point in checking more than a few hundred bytes before deciding text/no text. :)
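
The heuristic described in this reply could be sketched in C++ roughly as follows (the thread asks for C# or C++). This is only an illustration under stated assumptions, not a definitive implementation: the function names `hasUnicodeBom`, `looksBinary`, and `fileLooksBinary` and the 512-byte sample size are hypothetical choices, not something from the thread. Only the UTF-8 and UTF-16 byte order marks are checked, and text without a byte order mark in a 16-bit encoding would still be reported as binary.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: true if the sample starts with a UTF-8 or UTF-16
// byte order mark (BOM).
bool hasUnicodeBom(const std::vector<unsigned char>& b) {
    if (b.size() >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
        return true;  // UTF-8 BOM
    if (b.size() >= 2 && b[0] == 0xFF && b[1] == 0xFE)
        return true;  // UTF-16 little-endian BOM
    if (b.size() >= 2 && b[0] == 0xFE && b[1] == 0xFF)
        return true;  // UTF-16 big-endian BOM
    return false;
}

// Heuristic from the reply: a BOM means "assume text"; otherwise assume an
// 8-bit encoding and treat any zero byte as a sign of binary content.
bool looksBinary(const std::vector<unsigned char>& sample) {
    if (hasUnicodeBom(sample))
        return false;  // assumed text; occasionally this will be wrong
    for (unsigned char c : sample) {
        if (c == 0)
            return true;
    }
    return false;
}

// Reads at most maxBytes from the start of the file and applies the
// heuristic. The default of 512 is an assumed stand-in for "a few hundred".
bool fileLooksBinary(const std::string& path, std::size_t maxBytes = 512) {
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open file: " + path);
    std::vector<unsigned char> sample(maxBytes);
    in.read(reinterpret_cast<char*>(sample.data()),
            static_cast<std::streamsize>(maxBytes));
    // gcount() is the number of bytes actually read (short files are fine).
    sample.resize(static_cast<std::size_t>(in.gcount()));
    return looksBinary(sample);
}
```

A caller would run `fileLooksBinary(path)` on each file and skip the comparison when it returns true. Because only the first bytes are sampled, a file whose first zero byte appears later than the sample size is treated as text, which is the trade-off the reply accepts.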
