Funny strings

Tags: c++, csharp, visual-studio, debugging
    Vladimir Svrkota wrote (#1):

    Hello guys. Recently I've been working on some relatively complex regular expressions. It took me a while before I finally figured out what was wrong with an obviously correct (but very long) regular expression. To illustrate the issue I'm talking about, fire up your Visual Studio 2008 SP1 Express/Pro/Team, create a new C++ Windows console application and replace the contents of your main cpp file with the following code:

    #include "stdafx.h"

    #include <stdio.h>
    #include <iostream>
    #include <string>

    using namespace std;

    int _tmain(int argc, _TCHAR* argv[])
    {
        // Fun starts here. Check the output. It quite differs from "(??)".
        printf("(??)\n");

        string str1 = "(??)";
        cout << str1 << endl;

        // An additional backslash ends the fun...
        printf("(?\?)\n");

        string str2 = "(?\?)";
        cout << str2 << endl;

        // ...and so does the closing parenthesis removal.
        printf("(??\n");

        string str3 = "(??";
        cout << str3 << endl;

        // The same thing happens to wstring too.
        wstring str4 = L"(??)"; // bad one
        wcout << str4 << endl;

        wstring str5 = L"(?\?)"; // good one
        wcout << str5 << endl;

        wstring str6 = L"(??"; // good one
        wcout << str6 << endl;

        return 0;
    }

    As you can see, the output differs quite a bit from the expected one for the "(??)" string. Since my regular expression contained a lot of '*', '[', ']', '\', '?', '(', and ')' characters, it was quite difficult to figure out that the value of a variable displayed in the debug window differed from the one assigned to it in the source code (actually, I figured it out by accident). So be careful when using regular expressions ;-) Oh, one more thing. I just tried this example with the GCC compiler. In all cases the output is as expected. Hmmm...

    -- Vladimir Svrkota, AlfaNum Novi Sad, Serbia.
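
A minimal sketch of the two workarounds the test program hints at: escaping one of the question marks, or splitting the literal in two. The function names are made up for illustration. Neither spelling puts two adjacent question marks in the source text, so both should produce the same four characters whether or not the compiler honors trigraphs.

```cpp
#include <string>

// Both functions return the four characters ( ? ? ) without ever placing
// two adjacent question marks in the source text.
// The names are illustrative only; they come from no library.

// \? is a standard escape sequence for a plain question mark.
std::string literal_with_escape()
{
    return "(?\?)";
}

// Adjacent string literals are concatenated in a later translation phase
// than trigraph replacement, so the split hides the sequence from it.
std::string literal_with_split()
{
    return "(?" "?)";
}
```

For a long regular expression, the escape form is probably the easier one to apply mechanically.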

    oggenok64 wrote (#2):

    Ye olde trigraph strikes again. It's well documented in my old Kernighan & Ritchie: "The C Programming Language", 2nd edition from 1988.

      oggenok64 wrote (#3):

      The trigraph "??)" is replaced with the single character ']', which is consistent with what you've seen. I don't have VS around, so I can't immediately reproduce the output. By default gcc ignores trigraphs, so it's no surprise you don't see the "problem" there. Try compiling with the -trigraphs switch.
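
To make the replacement concrete, here is a rough sketch that mimics the substitution on an ordinary runtime string. `replace_trigraphs` is a hypothetical helper, not a compiler or library function. The table holds the nine standard trigraphs, and the literals are written so that the sketch itself contains no trigraph.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper that imitates, at run time, the trigraph replacement
// a compiler applies to source text before tokenizing it.
std::string replace_trigraphs(const std::string& src)
{
    // Third character of each trigraph, and the character it stands for.
    static const std::string from = "=(/)'<!>-";
    static const std::string to   = "#[\\]^{|}~";

    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        // A trigraph is two question marks followed by one of the nine
        // characters listed in `from`.
        if (i + 2 < src.size() && src[i] == '?' && src[i + 1] == '?') {
            const std::size_t k = from.find(src[i + 2]);
            if (k != std::string::npos) {
                out += to[k];
                i += 2; // skip the rest of the trigraph
                continue;
            }
        }
        out += src[i];
    }
    return out;
}
```

Under these assumptions, `replace_trigraphs("(?\?)")` yields `(]`, while a string that ends right after the two question marks comes back unchanged, which matches the three cases in the original test program.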

        PIEBALDconsult wrote (#4):

        They who don't learn from history are doomed to be bitten on the bottom by it. :-D

          Vladimir Svrkota wrote (#5):

          OMG! :-D I have the same book. I dusted it off and there they were (trigraphs), staring right back at me, making me just stand there in shame :-D. Thanks, Søren.

            oggenok64 wrote (#6):

            No, you are not standing in shame. A great many people have been bitten by trigraphs because they are so counterintuitive.

              iam123 wrote (#7):

              lol :)

                Lost User wrote (#8):

                Trigraphs should be dropped. No one needs them. They're only causing bugs these days.

                  John R Shaw wrote (#9):

                  I forgot about that! :laugh: I'll be adding a warning to my personal regular expression library documentation, because somehow that never showed up in the tests. :doh: Note: g++ gives a warning when it sees a trigraph.

                  INTP "Program testing can be used to show the presence of bugs, but never to show their absence." - Edsger Dijkstra "I have never been lost, but I will admit to being confused for several weeks. " - Daniel Boone

                    Dave Calkins wrote (#10):

                    I don't get it. What's going on there? I get the same results you seemed to indicate with Visual C++ 2005.

                      Hal Angseesing wrote (#11):

                      VS has a warning (C4837) for this as well; before VS2010 it was off by default. As part of our standard headers, which apply to all compilation units, we have a set of warnings we always turn on (and some we turn off). Somebody had slipped that one in a while back :)
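
A sketch of what such a standard header might contain, assuming (as the post says) that C4837 is off by default on the compiler in use. The file name is hypothetical. The `#pragma warning(default : n)` form is the documented MSVC way to restore a warning's default behavior, which also switches on a warning that is off by default.

```cpp
// project_warnings.h: hypothetical header included by every compilation unit.

#if defined(_MSC_VER)
// C4837 reports each trigraph the compiler replaces. "default" restores the
// warning's documented behavior, which also enables a warning that is
// normally off.
#pragma warning(default : 4837)
#endif
```

The `_MSC_VER` guard keeps other compilers from seeing a pragma they do not understand.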
