STL std::string help needed

  • L Lost User

    Good morning, sir. Can we use std::string to handle names containing Chinese or Japanese characters? For example, will a string comparison such as "some chinese stuff" == "some chinese stuff" work? Thank you for your time, sir, and please excuse my English.
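As a rough illustration of the question above: std::string compares raw bytes, so two strings that hold the same text in the same encoding compare equal. The sketch below assumes UTF-8 and writes the bytes of 检查 as escapes so the result does not depend on the source-file encoding. The names sample_name and same_text are hypothetical.

```cpp
#include <string>

// UTF-8 bytes for the two characters 检查 (U+68C0, U+67E5), written as
// escapes so the compiler's source character set does not matter.
inline std::string sample_name() {
    return "\xE6\xA3\x80\xE6\x9F\xA5";
}

// std::string::operator== compares bytes; it knows nothing about characters.
inline bool same_text(const std::string& a, const std::string& b) {
    return a == b;
}
```

Note that sample_name().size() counts bytes (six here), not characters, and that the comparison fails if the two strings were produced with different encodings.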

    Daniel Pfeffer
    wrote on last edited by
    #8

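    Inside your program, the best way to represent characters is using the wchar_t-based types (e.g. std::wstring). This keeps processing simple, because most characters are represented by a single wchar_t value (on Windows, where wchar_t is a 16-bit UTF-16 code unit, characters outside the Basic Multilingual Plane take two). If you wish to call a library that only supports char-based types (e.g. std::string), you must convert the wchar_t types to char types, call the library, and convert the results back. In C++11, the standard way to do this is something like this:

    #include <codecvt>
    #include <locale>
    #include <string>

    // UTF-8 <-> wide-string converter. The std::codecvt_utf8<wchar_t> facet
    // is an assumption here; wstring_convert is deprecated since C++17.
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;

    // Wide to narrow
    std::wstring wide_source;
    std::string narrow_target = converter.to_bytes(wide_source);

    // Narrow to wide
    std::string narrow_source;
    std::wstring wide_target = converter.from_bytes(narrow_source);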

    If you have an important point to make, don't try to be subtle or clever. Use a pile driver. Hit the point once. Then come back and hit it again. Then hit it a third time - a tremendous whack. --Winston Churchill

      Lost User
      wrote on last edited by
      #9

      Thank you for your solution, sir. My wide string looks like this:

      L"F:\\dupelicateFinder\\New folder\\New folder\\检查.jpg"

      I convert it to a std::string using the method you described and get:

      "F:\\dupelicateFinder\\New folder\\New folder\\检查.jpg"

      I have also found a way to convert the std::string to char* using strcpy, which gives the same value as the string. I need this because a function from a third-party library takes a char* argument. However, when I pass it F:\\dupelicateFinder\\New folder\\New folder\\检查.jpg, the function returns -1 (file not found), apparently because the Unicode characters in 检查.jpg did not come through in a form the function can use. How can I open the file using that function? I have traced the function in the debugger with breakpoints and checked the values in the Immediate window. Below is my code:

      #include <codecvt>
      #include <cstring>
      #include <iostream>
      #include <locale>
      #include <string>

      // The template argument is std::wstring, i.e. duplicates is std::wstring
      template <typename duplicates>
      std::string Duplicates::compute_hash(duplicates file_loc)
      {
          // Convert the wide path to a narrow string (facet assumed to be
          // std::codecvt_utf8<wchar_t>)
          std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
          std::string narrow_target = converter.to_bytes(file_loc);
          // Copy into a writable, null-terminated buffer
          char *cstr = new char[narrow_target.length() + 1];
          strcpy(cstr, narrow_target.c_str());
          // This function takes char* as an argument
          std::string hash = CALL_MD5_Function(cstr);
          delete[] cstr;
          std::cout << hash;
          return hash;
      }

      • J Jochen Arndt

        It depends on your project settings (Unicode or ANSI / multi byte), the used encoding / code page if not a Unicode project, and what the library supports. If you have a Unicode build (recommended), there is probably also a Unicode version for the library. If the library does not use the same encoding as your project you have to convert the strings.

        Lost User
        wrote on last edited by
        #10

        Thank you for your solution, sir. My wide string looks like this:

        L"F:\\dupelicateFinder\\New folder\\New folder\\检查.jpg"

        I convert it to a std::string using the method described above and get:

        "F:\\dupelicateFinder\\New folder\\New folder\\检查.jpg"

        I have also found a way to convert the std::string to char* using strcpy, which gives the same value as the string. I need this because a function from a third-party library takes a char* argument. However, when I pass it F:\\dupelicateFinder\\New folder\\New folder\\检查.jpg, the function returns -1 (file not found), apparently because the Unicode characters in 检查.jpg did not come through in a form the function can use. How can I open the file using that function? I have traced the function in the debugger with breakpoints and checked the values in the Immediate window. Below is my code:

        #include <codecvt>
        #include <cstring>
        #include <iostream>
        #include <locale>
        #include <string>

        // The template argument is std::wstring, i.e. duplicates is std::wstring
        template <typename duplicates>
        std::string Duplicates::compute_hash(duplicates file_loc)
        {
            // Convert the wide path to a narrow string (facet assumed to be
            // std::codecvt_utf8<wchar_t>)
            std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
            std::string narrow_target = converter.to_bytes(file_loc);
            // Copy into a writable, null-terminated buffer
            char *cstr = new char[narrow_target.length() + 1];
            strcpy(cstr, narrow_target.c_str());
            // This function takes char* as an argument
            std::string hash = CALL_MD5_Function(cstr);
            delete[] cstr;
            std::cout << hash;
            return hash;
        }

        • L Lost User

          It depends what encoding you use for the Chinese characters. If it is Unicode then you need wstring, but if it's UTF8 then you will need to create a new type based on basic_string. Look at the samples in the link I gave you.

          Lost User
          wrote on last edited by
          #11

          Thank you for your solution, sir. My wide string looks like this:

          L"F:\\dupelicateFinder\\New folder\\New folder\\检查.jpg"

          I convert it to a std::string using the method described above and get:

          "F:\\dupelicateFinder\\New folder\\New folder\\检查.jpg"

          I have also found a way to convert the std::string to char* using strcpy, which gives the same value as the string. I need this because a function from a third-party library takes a char* argument. However, when I pass it F:\\dupelicateFinder\\New folder\\New folder\\检查.jpg, the function returns -1 (file not found), apparently because the Unicode characters in 检查.jpg did not come through in a form the function can use. How can I open the file using that function? I have traced the function in the debugger with breakpoints and checked the values in the Immediate window. Below is my code:

          #include <codecvt>
          #include <cstring>
          #include <iostream>
          #include <locale>
          #include <string>

          // The template argument is std::wstring, i.e. duplicates is std::wstring
          template <typename duplicates>
          std::string Duplicates::compute_hash(duplicates file_loc)
          {
              // Convert the wide path to a narrow string (facet assumed to be
              // std::codecvt_utf8<wchar_t>)
              std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
              std::string narrow_target = converter.to_bytes(file_loc);
              // Copy into a writable, null-terminated buffer
              char *cstr = new char[narrow_target.length() + 1];
              strcpy(cstr, narrow_target.c_str());
              // This function takes char* as an argument
              std::string hash = CALL_MD5_Function(cstr);
              delete[] cstr;
              std::cout << hash;
              return hash;
          }

            Lost User
            wrote on last edited by
            #12

            As you have discovered, converting the Unicode string to ASCII does not work.

              Jochen Arndt
              wrote on last edited by
              #13

              The function name CALL_MD5_Function indicates that it calculates an MD5 hash sum. But that algorithm operates on binary data and usually requires a byte array and a length. In C/C++, char* pointers are often used to pass byte arrays (uint8_t* would be better), so a char* parameter does not always indicate a string type.

              You are calculating the hash of file names, which use different encodings on different platforms (e.g. UTF-16LE on Windows and UTF-8 on Linux). In such cases you have to know (or define) which encoding is used for calculating the hash sum, and then convert the file name strings to that encoding before hashing. If the code is used only on a single platform, just cast the wide string pointer and pass the length in bytes (the length is missing in your function call; I assume the function is just a wrapper that calls the real function with strlen).

              Finally, why do you want the MD5 sum of file names? It is usually calculated for file content, which is just binary data.
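The "cast the wide string pointer and pass the length in bytes" suggestion can be sketched as follows. This is only an illustration under stated assumptions: hash_bytes is a hypothetical stand-in that uses 64-bit FNV-1a, not MD5, and hash_wide_name is likewise a hypothetical helper. Neither is part of the library discussed in the thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Stand-in for a real digest: 64-bit FNV-1a over a byte range.
// This is NOT MD5; it only illustrates the (pointer, length) interface.
inline std::uint64_t hash_bytes(const unsigned char* data, std::size_t len) {
    std::uint64_t h = 0xcbf29ce484222325ULL;  // FNV-1a offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 0x100000001b3ULL;                // FNV-1a prime
    }
    return h;
}

// Hash the in-memory bytes of a wide string. The result depends on the
// platform's wchar_t size and byte order, so it is only comparable on
// a single platform.
inline std::uint64_t hash_wide_name(const std::wstring& name) {
    const auto* bytes = reinterpret_cast<const unsigned char*>(name.data());
    return hash_bytes(bytes, name.size() * sizeof(wchar_t));
}
```

The length is passed explicitly in bytes because a strlen-based wrapper would stop at the first zero byte, which in a wide string typically appears right after the first ASCII character.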

                Lost User
                wrote on last edited by
                #14

                No, sir, the function computes the MD5 of the file itself, not of the file name. I am on Windows, and this function is not going to be used on *nix platforms. What should I do, sir?

                  Lost User
                  wrote on last edited by
                  #15

                  Thank you so much sir for your kind help, I finally found a way. Thank you once again for your time sir.

                    Jochen Arndt
                    wrote on last edited by
                    #16

                    Use a wide string version of that function. If you have the sources, change the file name parameter to be a wide string and call the wide string version of the used file open function.
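One possible shape for such a wide-string file open, assuming C++17 is available, is to go through std::filesystem::path, which accepts a std::wstring and passes the platform its native form. The function name read_file_bytes is hypothetical and is not the poster's actual modification; it only sketches the idea of reading file content through a wide path.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <optional>
#include <string>

// Read a whole file in binary mode through a wide-string path.
// Returns std::nullopt if the file cannot be opened.
inline std::optional<std::string> read_file_bytes(const std::wstring& file_loc) {
    const std::filesystem::path p(file_loc);  // wchar_t is native on Windows
    std::ifstream in(p, std::ios::binary);    // C++17 overload taking a path
    if (!in) {
        return std::nullopt;
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

The returned bytes could then be handed to a hashing routine together with their length, so the file name never has to be squeezed into a narrow code page.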

                      Lost User
                      wrote on last edited by
                      #17

                      Thank you for your kind help sir, I have modified the function and now it is working! Thank you once again for your time!

                        Lost User
                        wrote on last edited by
                        #18

                        Thank you for your kind help sir, I have modified the function and now it is working! Thank you once again for your time!

                          Daniel Pfeffer
                          wrote on last edited by
                          #19

                          Filenames, unfortunately, can be a problem. In order to work with multi-byte character filenames (rather than Unicode), you must convert them according to your operating system's requirements. For Windows, this typically means using the correct code page for your system. See the WideCharToMultiByte() and MultiByteToWideChar() APIs for details.

                            Lost User
                            wrote on last edited by
                            #20

                            Sir, if I convert it according to my OS's requirements, does that mean we cannot guarantee it will work on other operating systems? Thank you :confused:

                              Daniel Pfeffer
                              wrote on last edited by
                              #21

                              If you are converting filenames from Unicode to multi-byte, then you must do this according to the rules of the O/S. However, this can be encapsulated in a single class. Use conditional compilation (e.g. #ifdef _WIN32 or #ifdef __linux__, the macros that compilers predefine on those platforms) to choose the correct version of the class.
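A minimal sketch of that encapsulation, written here as a single function rather than a class, might look like the following. The name to_native_narrow is hypothetical. The Windows branch assumes the system ANSI code page (CP_ACP) is the desired target, and the other branch assumes wchar_t holds UTF-32 code points and that the platform expects UTF-8.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Convert a wide string to the narrow form the current platform expects.
inline std::string to_native_narrow(const std::wstring& wide) {
    if (wide.empty()) return {};
#ifdef _WIN32
    // Windows: ask for the required size first, then convert. Characters the
    // code page cannot represent are replaced by a default character.
    const int len = WideCharToMultiByte(CP_ACP, 0, wide.data(),
                                        static_cast<int>(wide.size()),
                                        nullptr, 0, nullptr, nullptr);
    if (len <= 0) return {};
    std::string out(static_cast<std::size_t>(len), '\0');
    WideCharToMultiByte(CP_ACP, 0, wide.data(), static_cast<int>(wide.size()),
                        &out[0], len, nullptr, nullptr);
    return out;
#else
    // Elsewhere: encode each UTF-32 code point as UTF-8 (no validation).
    std::string out;
    for (wchar_t wc : wide) {
        const unsigned long cp = static_cast<unsigned long>(wc);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
#endif
}
```

Callers only see to_native_narrow; the platform-specific rules stay behind the #ifdef. On Windows the result is lossy whenever the active code page lacks a character, which is one reason a wide-string API is often the safer route there.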

                                Lost User
                                wrote on last edited by
                                #22

                                Thank you, sir, for your kind help and time; I now understand. However, I converted some of the functions in that library to accept std::wstring, which made it easy.
