Code Project

Where to start - need to extract...using regular expression

Tags: help, c++, linux, regex, tutorial
  • Lost User wrote (#1):

    I am asking for help as far as where to start looking to resolve this issue. I really do not want a solution without knowing how it was formed - been there, done that, and did not learn much from that approach. I do not need RTFM instructions either...

    Here is my task: I am using C++ "system calls" to run what is normally a command run in a "terminal" in Linux (do not ask for reasons...). I get what I call "raw" output, which includes "control characters"; these are NOT visible while the command is run in a "terminal". I have had some success using regular expressions to remove these control characters...

    Now I want to use a regular expression to extract the first word on each line which does NOT have option(s), and eventually retrieve the command description located on the same line, hence I would like to build a "dictionary of commands without options"...

    Here is an example: the command "list" does not have "options"; "system-alias" expects an option.

    admin Admin Policy Submenu
    list List available controllers
    show [ctrl] Controller information
    select Select default controller
    devices List available devices
    paired-devices List paired devices
    system-alias Set controller alias
    reset-alias Reset controller alias

    I would appreciate a reply in the style of "try this xyz resource and pay attention to chapter such and such..." Thanks, any help will be greatly appreciated. PS: I know how to build a regular expression using Internet resources...
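
    For reference, the control-character cleanup mentioned in this post might look roughly like the sketch below. It is only an assumption about what the "raw" output contains (ANSI/VT100 CSI sequences such as ESC[1m, plus stray control bytes like carriage returns); the function name strip_control is made up for illustration and is not from the original code.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: remove ANSI/VT100 CSI escape sequences and
// remaining control bytes, keeping tab (0x09) and newline (0x0A).
std::string strip_control(const std::string& raw)
{
    // ESC '[' + parameter bytes + intermediate bytes + one final byte.
    // The ESC byte is embedded literally via the C++ escape \x1B.
    static const std::regex csi("\x1B\\[[0-9;?]*[ -/]*[@-~]");
    // Any other control byte except tab and newline (literal bytes in the class).
    static const std::regex ctrl("[\x01-\x08\x0B-\x1F\x7F]");
    return std::regex_replace(std::regex_replace(raw, csi, ""), ctrl, "");
}
```

    Calling strip_control on a line such as ESC[1mlist ESC[0m would leave only the visible text. Sequences other than CSI (for example OSC title strings) are not covered by this sketch.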


      k5054 wrote (#2):

      First off, have you tried prepending TERM=dumb to your command string? That *should* remove all the control chars from the command output, e.g.

      [k5054@localhost ~]$ TERM=vt100 infocmp

      Reconstructed via infocmp from file: /usr/share/terminfo/v/vt100

      vt100|vt100-am|DEC VT100 (w/advanced video),
      am, mc5i, msgr, xenl, xon,
      cols#80, it#8, lines#24, vt#3,
      acsc=``aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,
      bel=^G, blink=\E[5m$<2>, bold=\E[1m$<2>,
      clear=\E[H\E[J$<50>, cr=\r, csr=\E[%i%p1%d;%p2%dr,
      cub=\E[%p1%dD, cub1=^H, cud=\E[%p1%dB, cud1=\n,
      cuf=\E[%p1%dC, cuf1=\E[C$<2>,
      cup=\E[%i%p1%d;%p2%dH$<5>, cuu=\E[%p1%dA,
      cuu1=\E[A$<2>, ed=\E[J$<50>, el=\E[K$<3>, el1=\E[1K$<3>,
      enacs=\E(B\E)0, home=\E[H, ht=^I, hts=\EH, ind=\n, ka1=\EOq,
      ka3=\EOs, kb2=\EOr, kbs=^H, kc1=\EOp, kc3=\EOn, kcub1=\EOD,
      kcud1=\EOB, kcuf1=\EOC, kcuu1=\EOA, kent=\EOM, kf0=\EOy,
      kf1=\EOP, kf10=\EOx, kf2=\EOQ, kf3=\EOR, kf4=\EOS, kf5=\EOt,
      kf6=\EOu, kf7=\EOv, kf8=\EOl, kf9=\EOw, lf1=pf1, lf2=pf2,
      lf3=pf3, lf4=pf4, mc0=\E[0i, mc4=\E[4i, mc5=\E[5i, rc=\E8,
      rev=\E[7m$<2>, ri=\EM$<5>, rmacs=^O, rmam=\E[?7l,
      rmkx=\E[?1l\E>, rmso=\E[m$<2>, rmul=\E[m$<2>,
      rs2=\E<\E>\E[?3;4;5l\E[?7;8h\E[r, sc=\E7,
      sgr=\E[0%?%p1%p6%|%t;1%;%?%p2%t;4%;%?%p1%p3%|%t;7%;%?%p4%t;5%;m%?%p9%t\016%e\017%;$<2>,
      sgr0=\E[m\017$<2>, smacs=^N, smam=\E[?7h, smkx=\E[?1h\E=,
      smso=\E[7m$<2>, smul=\E[4m$<2>, tbc=\E[3g,

      [k5054@localhost ~]$ TERM=dumb infocmp

      Reconstructed via infocmp from file: /usr/share/terminfo/d/dumb

      dumb|80-column dumb tty,
      am,
      cols#80,
      bel=^G, cr=\r, cud1=\n, ind=\n,
      [k5054@localhost ~]$
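
      From C++, prefixing the variable onto the command string could look something like the sketch below. It assumes the command is run through popen (the original post only says "system calls"), and run_dumb is a made-up name. Whether the control characters actually disappear depends on the program honoring TERM.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper: run a shell command with TERM=dumb prefixed
// and return everything it wrote to stdout.
std::string run_dumb(const std::string& command)
{
    const std::string full = "TERM=dumb " + command;
    FILE* pipe = popen(full.c_str(), "r");  // POSIX; runs the string via /bin/sh
    if (!pipe)
        throw std::runtime_error("popen failed");

    std::string output;
    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe))
        output += buffer;  // fgets NUL-terminates each chunk

    pclose(pipe);
    return output;
}
```

      The assignment only affects the child process started for that one command; the environment of the calling program is left alone.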

      In general you can set any environment variable this way, so you might do something like

      LD_LIBRARY_PATH=/home/k5054/lib DEBUG=1 ./foo

      Which would add LD_LIBRARY_PATH and DEBUG variables to the environment, but only for the duration of the given command.

      But on to your problem. Assuming you've managed to remove your control characters, what it looks like you want to do is to match any line that does not have an option to it. Based on what you have here, you could match on any line that does not contain either a '[' (i.e. an optional argument) or a '<' (i.e. a required argument). So the regex for that would be built on the character class [^<\[], which matches one character that is neither '<' nor '['; to cover a whole line it has to be repeated and anchored, as in ^[^<\[]*$. Note we need to esc
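
      As a rough sketch of how that character class could be used from C++ with std::regex: the function below keeps only lines containing neither '<' nor '[', and splits each into first word and description. The name commands_without_options and the exact pattern are illustrative assumptions, not the method from this reply, and the input is assumed to be already free of control characters.

```cpp
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper: build a "command -> description" dictionary
// from lines that contain no '<' and no '['.
std::map<std::string, std::string>
commands_without_options(const std::string& text)
{
    // group 1: first word; group 2: rest of the line (the description).
    // Neither group may contain '<' or '[', so lines with options fail to match.
    static const std::regex line_re(
        R"(^\s*([^\s<\[]+)\s+([^<\[]*[^\s<\[])\s*$)");

    std::map<std::string, std::string> dict;
    std::istringstream in(text);
    std::string line;
    std::smatch m;
    while (std::getline(in, line)) {
        if (std::regex_match(line, m, line_re))
            dict[m[1].str()] = m[2].str();
    }
    return dict;
}
```

      Fed the sample list from the question, this would keep "list" with "List available controllers" and skip the "show [ctrl]" line. Lines with only a single word and no description are skipped too, which may or may not be what you want.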


        Lost User wrote (#3):

        Thanks for the prompt reply. Unfortunately I need to limit my reply... I had eye surgery and am having a heck of a time reading small fonts... and there is no easy way to set EVERYTHING to a larger font... each app has its own setting... I should have thought about that BEFORE getting my eyeballs refurbished... Now if I use CAPS some people will get offended.... again... CHEERS
