Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. Algorithms
  4. Pattern matching in trees, lists and strings alike

Pattern matching in trees, lists and strings alike

Scheduled Pinned Locked Moved Algorithms
regexc++htmlcomhardware
6 Posts 3 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • B Offline
    B Offline
    bjongejan
    wrote on last edited by
    #1

    There are many programming languages that do string pattern matching, either using regular expressions or using more elegant and powerful mechanisms, like SNOBOL[^]. There are also many programming languages that do tree pattern matching, like Haskell, but you cannot easily write a pattern that attempts to match an element in a list (a right descending tree), unless it is the first element. Many tasks involve complex, irregular and not very large structured data. The most appropriate programming languages for tackling such tasks should have this kind of pattern matching in their tool box. Examples of application domains for such a language:

    • HTML/XML validation and correction
    • C++ source code analysis
    • Analysis of email chains
    • Automatic chaining of tools in a workflow
    • Transformation from syntax to semantics in natural language processing
    • Computer algebra

    Starting in the eighties, I have made and maintained such a language, Bracmat (GitHub[^],Rosetta Code[^]), and I have used it for all purposes listed above. Problem is, it is not an established programming language and therefore not the right language for production software, so I'm looking for a replacement. I hope that I am too pessimistic and that there are in fact programming languages with seizable user communities that support the pattern matching facilities that I am looking for. The programming language that I am looking for should have a pattern matching mechanism that:

    1. allows wild cards not only in the tail part of a pattern, but anywhere in a pattern,
    2. in the case of matching against a list, is capable of matching any number of consecutive elements anywhere in the list, and not just the 'head' of the list,
    3. is visually clean, attractive and readable,
    4. allows 'normal' instructions to be embedded in patterns (evaluation of variables, writing to files, function calls, other pattern match operations, etc.),
    5. can assign a matched part of the subject to a variable,
    6. c
    D 1 Reply Last reply
    0
    • B bjongejan

      There are many programming languages that do string pattern matching, either using regular expressions or using more elegant and powerful mechanisms, like SNOBOL[^]. There are also many programming languages that do tree pattern matching, like Haskell, but you cannot easily write a pattern that attempts to match an element in a list (a right descending tree), unless it is the first element. Many tasks involve complex, irregular and not very large structured data. The most appropriate programming languages for tackling such tasks should have this kind of pattern matching in their tool box. Examples of application domains for such a language:

      • HTML/XML validation and correction
      • C++ source code analysis
      • Analysis of email chains
      • Automatic chaining of tools in a workflow
      • Transformation from syntax to semantics in natural language processing
      • Computer algebra

      Starting in the eighties, I have made and maintained such a language, Bracmat (GitHub[^],Rosetta Code[^]), and I have used it for all purposes listed above. Problem is, it is not an established programming language and therefore not the right language for production software, so I'm looking for a replacement. I hope that I am too pessimistic and that there are in fact programming languages with seizable user communities that support the pattern matching facilities that I am looking for. The programming language that I am looking for should have a pattern matching mechanism that:

      1. allows wild cards not only in the tail part of a pattern, but anywhere in a pattern,
      2. in the case of matching against a list, is capable of matching any number of consecutive elements anywhere in the list, and not just the 'head' of the list,
      3. is visually clean, attractive and readable,
      4. allows 'normal' instructions to be embedded in patterns (evaluation of variables, writing to files, function calls, other pattern match operations, etc.),
      5. can assign a matched part of the subject to a variable,
      6. c
      D Offline
      D Offline
      dusty_dex
      wrote on last edited by
      #2

      Perhaps I'm being facetious, but shouldn't you at least study parsers/compilers? flex/yacc/bison compilerdesigndetails.blogspot.co.uk/2012/02/compiler-design.html

      "It's true that hard work never killed anyone. But I figure, why take the chance." - Ronald Reagan That's what machines are for. Got a problem? Sleep on it.

      B 1 Reply Last reply
      0
      • D dusty_dex

        Perhaps I'm being facetious, but shouldn't you at least study parsers/compilers? flex/yacc/bison compilerdesigndetails.blogspot.co.uk/2012/02/compiler-design.html

        "It's true that hard work never killed anyone. But I figure, why take the chance." - Ronald Reagan That's what machines are for. Got a problem? Sleep on it.

        B Offline
        B Offline
        bjongejan
        wrote on last edited by
        #3

        Indeed. Bracmat would have been a better programming language if I had been a computer scientist. For software that is going to be maintained by others, Bracmat is currently not a solution. That is why I am looking for a community driven programming language that has pattern matching (on strings, lists, trees, see the list above) in its tool box. Clarify your answer if I misunderstood you.

        Bart Jongejan

        D 1 Reply Last reply
        0
        • B bjongejan

          Indeed. Bracmat would have been a better programming language if I had been a computer scientist. For software that is going to be maintained by others, Bracmat is currently not a solution. That is why I am looking for a community driven programming language that has pattern matching (on strings, lists, trees, see the list above) in its tool box. Clarify your answer if I misunderstood you.

          Bart Jongejan

          D Offline
          D Offline
          dusty_dex
          wrote on last edited by
          #4

          If you posted in Algorithms then it gave the wrong impression that you are looking for help coding a solution of your own. Try Perl ..www.perl.org Also have a look at Scala which has a very small syntax to learn, and is suitable for creating a Domain Specific Language of your own. scala-lang.org

          "It's true that hard work never killed anyone. But I figure, why take the chance." - Ronald Reagan That's what machines are for. Got a problem? Sleep on it.

          B 1 Reply Last reply
          0
          • D dusty_dex

            If you posted in Algorithms then it gave the wrong impression that you are looking for help coding a solution of your own. Try Perl ..www.perl.org Also have a look at Scala which has a very small syntax to learn, and is suitable for creating a Domain Specific Language of your own. scala-lang.org

            "It's true that hard work never killed anyone. But I figure, why take the chance." - Ronald Reagan That's what machines are for. Got a problem? Sleep on it.

            B Offline
            B Offline
            bjongejan
            wrote on last edited by
            #5

            Neither Perl or Scala can solve this simple problem with a simple pattern: Given a list of pairs, each pair consisting of a first name and a family name, give me two names of different persons that have the same family name. I think [Tom](<a href=)[^] (a cousin of Scala, I think) can do it, but the syntax of patterns in that language is very verbose. In Bracmat, you would do: (Abel.Hirst) (Benjamin.Foster) (Letty.Johnson) (George.Hanson) (Chris.Johnson) (Priscilla.Stein):?list; !list:? (?name1.?familyName) ? (?name2.!familyName) ? & out$(!name1 and !name2 "have the same family name (" !familyName ")") | out$"No two persons have the same family name"; which outputs Letty and Chris have the same family name ( Johnson ) The pattern is just ? (?name1.?familyName) ? (?name2.!familyName) ?. Scala only allows a wild card in the tail of a pattern. Perl 6 seems to have modernized its pattern matching, but the subject is still a string, never a list. Correct me if I'm wrong. For a perhaps more convincing, but longer, example, see Dinesman's multiple-dwelling problem[^].

            K 1 Reply Last reply
            0
            • B bjongejan

              Neither Perl or Scala can solve this simple problem with a simple pattern: Given a list of pairs, each pair consisting of a first name and a family name, give me two names of different persons that have the same family name. I think [Tom](<a href=)[^] (a cousin of Scala, I think) can do it, but the syntax of patterns in that language is very verbose. In Bracmat, you would do: (Abel.Hirst) (Benjamin.Foster) (Letty.Johnson) (George.Hanson) (Chris.Johnson) (Priscilla.Stein):?list; !list:? (?name1.?familyName) ? (?name2.!familyName) ? & out$(!name1 and !name2 "have the same family name (" !familyName ")") | out$"No two persons have the same family name"; which outputs Letty and Chris have the same family name ( Johnson ) The pattern is just ? (?name1.?familyName) ? (?name2.!familyName) ?. Scala only allows a wild card in the tail of a pattern. Perl 6 seems to have modernized its pattern matching, but the subject is still a string, never a list. Correct me if I'm wrong. For a perhaps more convincing, but longer, example, see Dinesman's multiple-dwelling problem[^].

              K Offline
              K Offline
              Kosta Cherry
              wrote on last edited by
              #6

              You can do anything in C++. your example can be done as this:

              std::string mystdarr[][2] = {{"Abel","Hirst"},{"Benjamin","Foster"},{"Letty","Johnson"},{"George","Hanson"},{"Chris","Johnson"},{"Priscilla","Stein"}, {"",""}};
              int arrsize = sizeof(mystdarr)/sizeof(mystdarr[0]);
              for (auto it=0; it < arrsize - 1; ++it) {
              auto itFound = std::find_if(&mystdarr[it+1], &mystdarr[arrsize-1], [&](const std::string (&cmp)[2]) { return !cmp[1].compare(mystdarr[it][1]); });
              if (itFound != (std::string (*)[2])(mystdarr[arrsize-1]))
              std::cout << mystdarr[it][0].c_str() << " and " << (*itFound)[0].c_str() << " have the same family name (" << (*itFound)[1].c_str() << ")\n";
              }

              Or, if you'd like less cryptic code (it's cryptic because built-in arrays in C/C++ are by no means flexible), you can do this:

              struct SS
              {
              std::string firstName;
              std::string lastName;
              SS(const std::string& fname, const std::string& lname) : firstName(fname), lastName(lname) {};
              };
              std::vector myarr;
              myarr.push_back(SS("Abel", "Hirst"));
              myarr.push_back(SS("Benjamin", "Foster"));
              myarr.push_back(SS("Letty", "Johnson"));
              myarr.push_back(SS("George", "Hanson"));
              myarr.push_back(SS("Chris", "Johnson"));
              myarr.push_back(SS("Priscilla", "Stein"));
              for (auto it=myarr.begin(); it!=myarr.end(); ++it)
              {
              auto itFound = std::find_if(it + 1, myarr.end(), [&](const SS& cmp) {return !cmp.lastName.compare(it->lastName);});
              if (itFound != myarr.end())
              std::cout << it->firstName.c_str() << " and " << itFound->firstName.c_str() << " have the same family name (" << it->lastName.c_str() << ")\n";
              }

              1 Reply Last reply
              0
              Reply
              • Reply as topic
              Log in to reply
              • Oldest to Newest
              • Newest to Oldest
              • Most Votes


              • Login

              • Don't have an account? Register

              • Login or register to search.
              • First post
                Last post
              0
              • Categories
              • Recent
              • Tags
              • Popular
              • World
              • Users
              • Groups