Funny strings

    Vladimir Svrkota wrote (#1):

    Hello guys. Recently I've been working on some relatively complex regular expressions. It took me a while before I finally figured out what was wrong with an obviously correct (but very long) regular expression. To illustrate the issue I'm talking about, fire up your Visual Studio 2008 SP1 Express/Pro/Team, create a new C++ Windows console application and replace the contents of your main cpp file with the following code:

    #include "stdafx.h"

    #include <stdio.h>
    #include <iostream>
    #include <string>

    using namespace std;

    int _tmain(int argc, _TCHAR* argv[])
    {
        // Fun starts here. Check the output. It quite differs from "(??)".
        printf("(??)\n");

        string str1 = "(??)";
        cout << str1 << endl;

        // An additional backslash ends the fun...
        printf("(?\?)\n");

        string str2 = "(?\?)";
        cout << str2 << endl;

        // ...and so does the closing parenthesis removal.
        printf("(??\n");

        string str3 = "(??";
        cout << str3 << endl;

        // The same thing happens to wstring too.
        wstring str4 = L"(??)"; // bad one
        wcout << str4 << endl;

        wstring str5 = L"(?\?)"; // good one
        wcout << str5 << endl;

        wstring str6 = L"(??"; // good one
        wcout << str6 << endl;

        return 0;
    }

    As you can see, the output differs considerably from the expected one in the case of the "(??)" string. Since my regular expression contained a lot of '*', '[', ']', '\', '?', '(', and ')' characters, it was quite difficult to figure out that the value of the variable displayed in the debug window differed from the one assigned to it in the source code (actually, I figured it out by accident). So be careful when using regular expressions ;-) Oh, one more thing: I just tried this example with GCC. In all cases the output is as expected. Hmmm...

    -- Vladimir Svrkota, AlfaNum Novi Sad, Serbia.

      oggenok64 wrote (#2, in reply to #1):

      Ye olde trigraph strikes again. It's well documented in my old Kernighan & Ritchie: "The C Programming Language", 2nd edition from 1988.

        oggenok64 wrote (#3, in reply to #2):

        The trigraph sequence "??)" is replaced with the single character ']' before the string literal is even parsed, so "(??)" becomes "(]", which looks consistent with what you've seen. I don't have VS around, so I can't immediately reproduce the output. By default gcc ignores trigraphs, so it's no surprise you don't see the "problem" there. Try compiling with the -trigraphs switch.
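
To make the replacement concrete, here is a minimal sketch that mimics on an ordinary string what a trigraph-enabled compiler does to raw source text. The function names `trigraph_replacement` and `replace_trigraphs` are made up for this illustration and are not part of any library; the nine mappings are the standard trigraph set. The question marks in the examples are written as `?\?` so the sketch itself is never altered by the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the character a trigraph ending in c stands for,
// or '\0' if two question marks followed by c is not a trigraph.
inline char trigraph_replacement(char c) {
    switch (c) {
        case '=':  return '#';
        case '/':  return '\\';
        case '\'': return '^';
        case '(':  return '[';
        case ')':  return ']';
        case '!':  return '|';
        case '<':  return '{';
        case '>':  return '}';
        case '-':  return '~';
        default:   return '\0';
    }
}

// Scans left to right and replaces every trigraph sequence,
// leaving all other characters (including lone question marks) untouched.
inline std::string replace_trigraphs(const std::string& source) {
    std::string out;
    std::size_t i = 0;
    while (i < source.size()) {
        if (i + 2 < source.size() && source[i] == '?' && source[i + 1] == '?') {
            const char r = trigraph_replacement(source[i + 2]);
            if (r != '\0') {
                out += r;
                i += 3;
                continue;
            }
        }
        out += source[i];
        ++i;
    }
    return out;
}
```

Fed the text of str1 from the first post, `replace_trigraphs("(?\?)")` returns `(]`, while the text of str3, which has no closing parenthesis, comes back unchanged.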

          PIEBALDconsult wrote (#4, in reply to #1):

          They who don't learn from history are doomed to be bitten on the bottom by it. :-D

            Vladimir Svrkota wrote (#5, in reply to #2):

            OMG! :-D I have the same book; I dusted it off and there they were (trigraphs), staring right back at me, making me just stand there in shame :-D. Thanks, Søren.

            -- Vladimir Svrkota, AlfaNum Novi Sad, Serbia.

              oggenok64 wrote (#6, in reply to #5):

              No, you are not standing in shame. A great many people have been bitten by trigraphs because they are so counterintuitive.

                iam123 wrote (#7, in reply to #1):

                lol :)

                  Lost User wrote (#8, in reply to #1):

                  Trigraphs should be dropped. No one needs them. They're only causing bugs these days.

                    John R Shaw wrote (#9, in reply to #1):

                    I forgot about that! :laugh: I'll be adding a warning to my personal regular expression library documentation, because somehow that never showed up in the tests. :doh: Note: g++ gives a warning when it sees a trigraph.

                    INTP "Program testing can be used to show the presence of bugs, but never to show their absence." - Edsger Dijkstra "I have never been lost, but I will admit to being confused for several weeks. " - Daniel Boone
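
For a documentation note like that, here is a minimal sketch of two spellings that keep the pattern intact whether or not the compiler processes trigraphs. The helper names `escaped_pattern` and `split_pattern` are made up for illustration. The first is the escape already used for str2 in the original post; the second relies on adjacent string literals being joined in a later translation phase than trigraph replacement.

```cpp
#include <cassert>
#include <string>

// Both helpers return the four characters ( ? ? ) without ever placing
// two question marks directly before the closing parenthesis in the
// source text, so trigraph replacement has nothing to match.

// Option 1: escape the second question mark; \? is a standard escape for '?'.
inline std::string escaped_pattern() { return "(?\?)"; }

// Option 2: split the literal; the compiler concatenates adjacent literals
// only after trigraphs have already been processed.
inline std::string split_pattern() { return "(?" "?)"; }
```

Both functions yield the same four-character string, so either spelling can be used wherever a regular expression needs two question marks followed by one of the trigraph characters.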

                      Dave Calkins wrote (#10, in reply to #1):

                      I don't get it. What's going on there? I get the same results you seemed to indicate with Visual C++ 2005.

                        Hal Angseesing wrote (#11, in reply to #9):

                        VS has a warning (C4837) for this as well; before VS2010 it was off by default. We have (as part of our standard headers that apply to all compile units) a set of warnings we always turn on (and some we turn off). Somebody had slipped that one in a while back :)
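
A shared header of that kind might look roughly like the sketch below. The file name `project_warnings.h` and the include guard are hypothetical, and this assumes the compiler honors the pragma for C4837 the way the post describes; it is not the poster's actual header.

```cpp
// project_warnings.h (hypothetical name): included by every compile unit.
#ifndef PROJECT_WARNINGS_H
#define PROJECT_WARNINGS_H

#ifdef _MSC_VER
// C4837 reports a trigraph that the compiler has replaced.
// The specifier 1 turns the warning on and reports it at warning level 1.
#pragma warning(1 : 4837)
#endif

#endif // PROJECT_WARNINGS_H
```

The `_MSC_VER` check keeps the pragma away from other compilers, which have their own switches for trigraph diagnostics.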
