Unicode streams and ASCII files

Patje wrote (#1):

In my application I want to access files in such a way that the underlying character type (ASCII or Unicode) is transparent to the application. Suppose a reporting module writes a report to a file like this:

    myfile << reportHeader << std::endl;
    for (all columns) myfile << columnname;
    myfile << std::endl;
    for (all records)
    {
        for (all columns) myfile << data;
        myfile << std::endl;
    }

Now I want my application to transparently write the report to ASCII or Unicode files, depending on what the user specifies. Currently, you have to do it like this:

    std::ofstream asciiFile;
    std::wofstream unicodeFile;
    if (user wants Unicode)
    {
        unicodeFile.open ("output.txt");
        unicodeFile << reportHeader << std::endl;
    }
    else
    {
        asciiFile.open ("output.txt");
        asciiFile << reportHeader << std::endl;
    }
    for (all columns)
    {
        if (user wants Unicode) unicodeFile << columnname;
        else asciiFile << columnname;
    }
    ... and so on

The disadvantages seem obvious:
- clumsy, unreadable code (especially if the write-to-file logic is spread over several methods)
- writing Unicode strings (std::wstring) to an ASCII stream doesn't even work; it produces garbage

Therefore, I need a kind of transparent stream, so that I can write the following:

    transparentstream myfile;
    myfile.setMode (ascii or unicode);
    myfile.open ("output.txt");
    myfile << reportHeader << std::endl;
    for (all columns) myfile << columnname;
    myfile << std::endl;
    for (all records)
    {
        for (all columns) myfile << data;
        myfile << std::endl;
    }

Problems are:
- from which stream do I inherit the transparentstream?
- how do I define the transparentstream so that all defined output operators keep working?
- where do I put the conversion logic? (basic_streambuf? basic_filebuf?)

Of course I want something similar for input, where the stream can find out by itself whether the file is Unicode (starts with 0xfffe or 0xfeff) or plain ASCII.

And as an additional challenge: is it possible to have such a transparent stream that can do something like this?

    transparentstream mystream;
    mystream.open("http://www.mywebsite.com/mypage.html");
    mystream >> ...;

And if this is possible, how do I implement such a stream or buffer class?

Have any of you already encountered problems trying to mix the STL and Unicode files? How did you solve them? Thanks for your suggestions.

Enjoy life, this is not a rehearsal !!!
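The input-side detection described above (checking whether the file starts with a byte order mark) could be sketched roughly as follows. This is only an illustration, not part of the original question: the names `FileEncoding` and `detectBom` are made up for the example, the stream is assumed to be opened in binary mode and to be seekable, and a UTF-32 little-endian file (which starts with FF FE 00 00) would be misreported as UTF-16 by this simple check.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Illustrative names: the encodings this sketch can tell apart.
enum class FileEncoding { NoBom, Utf8, Utf16LittleEndian, Utf16BigEndian };

// Classify a byte sequence by its leading byte order mark, if any.
FileEncoding detectBom(const std::string& bytes)
{
    // Fetch byte i as an unsigned value (0 if the sequence is too short).
    const auto b = [&](std::size_t i) -> unsigned char {
        return i < bytes.size() ? static_cast<unsigned char>(bytes[i]) : 0;
    };
    if (bytes.size() >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF)
        return FileEncoding::Utf8;
    if (bytes.size() >= 2 && b(0) == 0xFF && b(1) == 0xFE)
        return FileEncoding::Utf16LittleEndian;
    if (bytes.size() >= 2 && b(0) == 0xFE && b(1) == 0xFF)
        return FileEncoding::Utf16BigEndian;
    return FileEncoding::NoBom;
}

// Peek at the first bytes of a binary, seekable stream, then rewind it.
FileEncoding detectBom(std::istream& in)
{
    char buf[3] = {};
    in.read(buf, 3);
    const std::streamsize n = in.gcount();
    in.clear();   // a file shorter than three bytes sets eofbit/failbit
    in.seekg(0);  // leave the stream positioned at the start again
    return detectBom(std::string(buf, static_cast<std::size_t>(n)));
}
```

A caller would open the file with `std::ios::binary`, call `detectBom`, and only then decide which kind of stream or conversion to use for the actual read.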

valikac wrote (#2):

Are you referring to overloaded << and >> operators?

    ofstream &operator<<(ofstream &out, const CData &data)
    ifstream &operator>>(ifstream &in, CData &data)

Kuphryn

Patje wrote (#3):

Not exactly. I want to write to (or read from) a stream without having to worry about whether the file is ASCII or Unicode. AFAIK, the STL forces me to know beforehand what the file is and to use either an ofstream (for ASCII) or a wofstream (for Unicode), which means that every output statement has to be duplicated in my application.

In my original post I gave a simple example for a reporting module.
The first code example shows how to do it in ASCII only.
The second example shows how the STL currently forces me to write this module if I want to support both ASCII and Unicode in my application.
The third example shows how I would hope to write it, at least if somebody has a brilliant idea.

Enjoy life, this is not a rehearsal !!!
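One way to avoid duplicating every output statement, different from the single run-time stream asked for above, is to write the report code once as a template over the character type. This is only a sketch under assumptions: `writeReport` and its parameters are invented for the illustration, and the caller still has to pick the narrow or wide stream once, at the point where the file is opened.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical report writer: one body that works for both
// std::ostream (char) and std::wostream (wchar_t).
template <typename CharT>
void writeReport(std::basic_ostream<CharT>& out,
                 const std::basic_string<CharT>& header,
                 const std::vector<std::basic_string<CharT>>& columns)
{
    // widen() turns a narrow literal into the stream's character type.
    out << header << out.widen('\n');
    for (const auto& name : columns)
        out << name << out.widen('\t');
    out << out.widen('\n');
}
```

The same function can then be called with a `std::ofstream` and `std::string` data, or with a `std::wofstream` and `std::wstring` data, so the if/else only appears where the stream is created.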

valikac wrote (#4):

Do you determine the file type via input? One solution is to overload the operators as mentioned. There could be multiple objects (CObject and CObjectUnicode).

Kuphryn

Bobby Mihalca wrote (#5):

Streams are transparent: you should use a locale codecvt facet so that the buffer is converted before saving and after loading. Your code should look like this:

    std::wofstream myfile;
    // replaces myfile.setMode (ascii or unicode): the facet is chosen at run time,
    // because a template argument cannot depend on a run-time flag
    if (ascii)
        myfile.imbue (std::locale (std::locale(), new std::codecvt<wchar_t, char, std::mbstate_t>()));
    else
        myfile.imbue (std::locale (std::locale(), new UnicodeCodecvt()));  // hypothetical facet that emits the Unicode encoding you want
    myfile.open ("output.txt");
    myfile << reportHeader << std::endl;
    for (all columns) myfile << columnname;
    myfile << std::endl;
    for (all records)
    {
        for (all columns) myfile << data;
        myfile << std::endl;
    }

Note that myfile is always wchar_t based and the codecvt facet converts the characters to char. If your internal data is char based, you should use a char-based stream and imbue it with a codecvt that converts from char to wchar_t if you need to save Unicode.

Also, when saving Unicode you should write a byte order mark first to indicate that the file is Unicode: 0xEF 0xBB 0xBF for UTF-8 (this is what you will find at the beginning of many UTF-8 XML files), or 0xFF 0xFE / 0xFE 0xFF for UTF-16. When reading the file, read the first bytes, test them, and then use the right codecvt facet to convert to your internal data.
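The imbue-before-open pattern from this reply can be tried out with the one wide-to-narrow facet the standard library is guaranteed to provide, `std::codecvt<wchar_t, char, std::mbstate_t>`. The sketch below is an illustration under assumptions: the function names and the file name are invented, the classic "C" locale is used as the base, and only plain ASCII text is written, since what this facet does with other characters is implementation-defined. Producing real UTF-8 or UTF-16 output would need a different facet (for example `std::codecvt_utf8` from `<codecvt>`, which is deprecated since C++17, or a user-written one).

```cpp
#include <cwchar>
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Write wide text through a wchar_t-to-char codecvt facet.
// Returns false if the file cannot be opened or a character cannot be converted.
bool writeNarrowReport(const std::string& path, const std::wstring& header)
{
    std::wofstream out;
    // Imbue before open(), so the file buffer uses the facet from the start.
    out.imbue(std::locale(std::locale::classic(),
                          new std::codecvt<wchar_t, char, std::mbstate_t>()));
    out.open(path.c_str(), std::ios::binary);
    if (!out)
        return false;
    out << header << L'\n';
    out.flush();
    return out.good();
}

// Read the file back as raw bytes to inspect what the facet produced.
std::string readBytes(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

The locale takes ownership of the facet created with `new`, so there is no matching `delete` in the calling code.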

valikac wrote (#6):

Very interesting.

Kuphryn
