MultiByteToWideChar crashes out on longer strings [modified]

    RichardBrock wrote (#1):

    I have the code segment below, which takes a block of chars read from an RSS XML file and converts it to Unicode using a code page (the code page ID is determined beforehand by scanning the XML for the encoding ID). Sometimes it works; sometimes it crashes the application. I'm using the size returned by MultiByteToWideChar to allocate the buffer (+1 because the block of chars I'm passing does not include the 0 terminator). What's really odd is that it works fine on small strings but crashes on a 216-character string. If I extend the buffer allocation by 1, it works every time. I should not have to kludge the buffer allocation to make things work.

        int nSizeReq = MultiByteToWideChar(m_nCPID, 0, (const char*)m_pChars, nBlockLength, 0, 0);
        TCHAR* pszConverted = new TCHAR[nSizeReq+1];
        _tcsnset(pszConverted, 0, nSizeReq+1);
        MultiByteToWideChar(m_nCPID, 0, (const char*)m_pChars, nBlockLength, pszConverted, nSizeReq);
        // Up to here it always works, but the next step crashes because
        // pszConverted has damage past its memory allocation.
        CString strConverted = pszConverted;

    Any ideas why this could be happening?

    P.S. It runs just great in debug, and it also works if I run the release build from within Visual Studio. Run the release build by itself and *boom*.

    [edit] Coded in C++ (MFC application) using Visual Studio 2008. Unicode is defined. Testing on Vista.

    modified on Friday, March 13, 2009 10:51 AM

      led mike wrote (#2):

      First thing I notice is the code you posted does not check the return value of MultiByteToWideChar. Therefore your next operations are running on blind faith. This is not usually considered a Software Development Best Practice.
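A minimal sketch of what checking both return values could look like. `ToWide` and `ConvertBlock` are hypothetical names that do not appear in the thread. On Windows, `ToWide` simply forwards to MultiByteToWideChar; elsewhere it falls back to a byte-widening stand-in (an assumption, added only so the sketch compiles off Windows). With an explicit source length, the function writes no terminator, so the sketch sizes the string from the count actually written rather than relying on a trailing 0.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper. Returns the number of wide chars needed (dstLen == 0)
// or written (dstLen > 0); returns 0 on failure.
static int ToWide(unsigned codePage, const char* src, int srcLen,
                  wchar_t* dst, int dstLen)
{
#ifdef _WIN32
    return MultiByteToWideChar(codePage, 0, src, srcLen, dst, dstLen);
#else
    // Stand-in for non-Windows builds (assumption): widen each byte as-is.
    (void)codePage;
    if (srcLen <= 0) return 0;
    if (dstLen == 0) return srcLen;      // size query
    if (dstLen < srcLen) return 0;       // buffer too small
    for (int i = 0; i < srcLen; ++i)
        dst[i] = static_cast<unsigned char>(src[i]);
    return srcLen;
#endif
}

// Converts srcLen bytes (no terminator expected) and checks both calls.
std::wstring ConvertBlock(unsigned codePage, const char* src, int srcLen)
{
    if (srcLen <= 0)
        return std::wstring();

    const int needed = ToWide(codePage, src, srcLen, NULL, 0);
    if (needed <= 0)
        throw std::runtime_error("size query failed");

    // The string owns needed characters plus its own terminator.
    std::wstring out(static_cast<std::size_t>(needed), L'\0');

    const int written = ToWide(codePage, src, srcLen, &out[0], needed);
    if (written <= 0)
        throw std::runtime_error("conversion failed");

    assert(written <= needed);
    out.resize(static_cast<std::size_t>(written));
    return out;
}
```

In an MFC project the result could be handed to CString together with its length, which likewise avoids depending on a terminator inside the converted data.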

        Akt_4_U wrote (#3):

        How are you calculating this nBlockLength?

        prvn

          RichardBrock wrote (#4):

          Yep, you make a good point. I updated the code to check the value:

              int nSizeReq = MultiByteToWideChar(m_nCPID, 0, (const char*)m_pChars, nBlockLength, 0, 0);
              TCHAR* pszConverted = new TCHAR[nSizeReq+1];
              _tcsnset(pszConverted, 0, nSizeReq+1);
              int nConverted = MultiByteToWideChar(m_nCPID, 0, (const char*)m_pChars, nBlockLength, pszConverted, nSizeReq);
              int nTest = wcslen(pszConverted);

          The results: nConverted = 214, nSizeReq = 214, but nTest = 206. Weird.
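One general point that may help when reading numbers like these: wcslen does not report how many characters were converted. It scans for the first 0, so it can come out shorter than the converted count (if a 0 sits inside the data) or longer (if no 0 follows the data). Whether either applies to this feed is not established in the thread. The sketch below uses made-up data and hypothetical helper names to show the difference between a counted length and a scanned one.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper: builds a string from a pointer plus an explicit count,
// so embedded zeros are kept and no terminator is required in the source.
std::wstring FromCounted(const wchar_t* data, std::size_t count)
{
    assert(data != NULL || count == 0);
    return count ? std::wstring(data, count) : std::wstring();
}

// Hypothetical helper: scans for the first L'\0'. Only safe when a terminator
// is known to exist somewhere inside the buffer.
std::size_t ScannedLength(const wchar_t* terminated)
{
    return std::wcslen(terminated);
}
```

For example, given the six-element array { L'a', L'b', L'\0', L'c', L'd', L'\0' }, FromCounted with a count of 5 keeps all five characters, while ScannedLength stops at the first 0.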

            RichardBrock wrote (#5):

            I have a file open using CreateFile, and I use ReadFile to locate the start and end tags in the XML file for the field, e.g. <description>.....</description> (an Internet RSS news feed in Arabic). nBlockLength is the number of characters extracted between > and <; the m_pChars buffer holds the actual character data.

              led mike wrote (#6):

              RichardBrock wrote:

              Weird.

              What are you compiling to? Try int nTest = _tcslen(pszConverted);

                RichardBrock wrote (#7):

                The project outputs a 32-bit Windows executable (target platforms are XP and Vista), with MFC statically linked. I tried _tcslen as you suggested; same result. I'm testing against a live RSS feed, so the news item length has changed, but here's the latest output from my OutputDebugString call placed just after the second MultiByteToWideChar call:

                    'return value = 140 (wcslen is 308) nSizeReq = 140 nBlockLength = 140'

                So you can see the function call returns 140, the buffer allocated was 140, and the block length read from the file is 140. But the converted string is 308 in length, obviously overrunning the memory allocated to it. Do you think compiler optimization could be causing a problem? I'm compiling with 'Enable link-time code generation (/GL)'.

                Btw, a previous call for the preceding news item yields:

                    'return value = 121 (wcslen is 121) nSizeReq = 121 nBlockLength = 121'
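One thing that may be worth checking, though nobody in the thread confirms it as the cause: in a Unicode build _tcsnset maps to _wcsnset, and the CRT documentation describes the _strnset family as setting at most count characters and using the length of the existing string when count is larger. Applied to freshly allocated, uninitialized memory, it therefore stops at the first 0 it happens to find and may leave the final element unset. The sketch below is a portable imitation of that rule (`wcsnset_like` is a hypothetical name, not a CRT function) next to a fill that always clears every element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical portable imitation of the documented _wcsnset rule:
// set at most count characters, but stop at the first existing terminator.
wchar_t* wcsnset_like(wchar_t* str, wchar_t c, std::size_t count)
{
    assert(str != NULL);
    for (std::size_t i = 0; i < count && str[i] != L'\0'; ++i)
        str[i] = c;
    return str;
}

// Zeroes exactly count elements, whatever the buffer currently holds.
void ZeroAll(wchar_t* buf, std::size_t count)
{
    assert(buf != NULL || count == 0);
    std::fill_n(buf, count, L'\0');
}
```

Under this reading, a debug heap that fills new allocations with a nonzero pattern would let the whole buffer be cleared, while a release build could stop early on whatever the memory happened to contain. Writing the terminator explicitly at the index returned by the conversion call sidesteps the question altogether.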

                  led mike wrote (#8):

                  You did not answer my question, I guess I didn't make it clear but I assumed you had some knowledge about what you were doing.

                  led mike wrote:

                  What are you compiling to?

                  Since the subject of this discussion is character sets, that's what the question is about. Are you compiling with _UNICODE defined, or what? You should check out the example of MultiByteToWideChar in this article[^]. Look for the differences from your code; there is an obvious one. Next, I strongly urge you to study this subject thoroughly before you attempt your implementation. My experience is that working with conversion requires a sound understanding of the subject. I believe there are great articles here on Code Project that cover this topic well.
