Regular expression to return all keys in a Bibtex file

4 Posts 3 Posters 19 Views 1 Watching
    Member_15942356 wrote (#1):

    A BibTeX file is a structured text file (two example records are shown at the end). I would like to extract the 'keys': the text between an '@' and the following comma, but only the part AFTER the '{'. So, for the line @Article{m2023a, it would return 'm2023a'. Failing that, I could just match all of those lines and then apply a second regex to refine the result. The best I have come up with so far is /@([^,]*)\,/ but I can't help feeling that there is a better way, and even this is not quite right. An example of a BibTeX file follows (this is two records; there could be hundreds):

    @Article{m2023a,
    author = {S. Macdonald},
    journal = {Social Science Information},
    title = {The gaming of citation and authorship in academic journals: a warning from medicine},
    year = {2023},
    doi = {10.1177/05390184221142218},
    issue = {In Press},
    }

    @Misc{b2017a,
    author = {S. Buranyi},
    title = {Is the staggeringly profitable business of scientific publishing bad for science?},
    year = {2017},
    journal = {The Guardian, 27 June 2017},
    url = {https://www.theguardian.com/science/2017/jun/27/profitable-business-scientific-publishing-bad-for-science},
    }

      Richard Deeming wrote (#2):

      So you just want the value between the { and the , on the lines starting with @? That seems simple enough:

      ^@[^{]+\{([^,]+),
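
The pattern above relies on ^ matching at the start of every line, which in most regex engines requires a multiline flag. As a minimal sketch of how it might be applied, assuming Python's standard `re` module (the thread does not say which language or engine the poster uses), the names `KEY_PATTERN`, `SAMPLE`, and `extract_keys` below are illustrative only, and the sample is a shortened version of the two records from the question:

```python
import re

# Pattern from the reply above. re.MULTILINE makes ^ match at the start of
# every line rather than only at the start of the whole string.
KEY_PATTERN = re.compile(r"^@[^{]+\{([^,]+),", re.MULTILINE)

# Shortened copy of the two records in the question (illustrative data).
SAMPLE = """\
@Article{m2023a,
author = {S. Macdonald},
year = {2023},
}

@Misc{b2017a,
author = {S. Buranyi},
year = {2017},
}
"""


def extract_keys(text: str) -> list[str]:
    """Return the citation key of every entry, in file order."""
    return [key.strip() for key in KEY_PATTERN.findall(text)]


print(extract_keys(SAMPLE))  # ['m2023a', 'b2017a']
```

Because the pattern is anchored with ^, an '@' that appears inside a field value (an e-mail address, for instance) is not mistaken for the start of an entry, provided the entries themselves start at the beginning of a line as they do in the sample.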



      "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

        Member_15942356 wrote (#3):

        Thank you - it might seem simple to you :-), but I really appreciate the help.

          jschell wrote (#4):

          Member 15942356 wrote:

          but I can't help feeling that there is a better way

          Probably there is a better way, unless you really only want the key and nothing else. If you are going to want something else (or several things), then the better way is to write (or find) an actual parser: code, not just a regex, that parses the file according to the structure laid out in the format's specification.
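
To illustrate what "code, not just a regex" could look like, here is a rough sketch of a small hand-written parser in Python. It is not a complete BibTeX parser and is not taken from any library: it assumes well-formed input in which every field value is wrapped in braces, as in the sample from the question, and it does not handle quoted values, bare numbers, `@string` or `@comment` blocks, or `#` concatenation. The function names `read_braced` and `parse_entries` are hypothetical.

```python
def read_braced(text: str, start: int) -> tuple[str, int]:
    """Read a brace-delimited value beginning at text[start] == '{'.

    Returns the content without the outer braces and the index just past
    the closing brace. Nested braces are kept as part of the value.
    """
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
    raise ValueError("unbalanced braces")


def parse_entries(text: str) -> list[dict]:
    """Parse '@Type{key, name = {value}, ...}' entries (assumed well-formed)."""
    entries = []
    pos = 0
    while (at := text.find("@", pos)) != -1:
        brace = text.index("{", at)
        comma = text.index(",", brace)
        entry = {
            "type": text[at + 1:brace].strip(),
            "key": text[brace + 1:comma].strip(),
            "fields": {},
        }
        pos = comma + 1
        while True:
            # Skip whitespace and the commas that separate fields.
            while text[pos] in " \t\r\n,":
                pos += 1
            if text[pos] == "}":  # closing brace of the entry
                pos += 1
                break
            equals = text.index("=", pos)
            name = text[pos:equals].strip().lower()
            value, pos = read_braced(text, text.index("{", equals))
            entry["fields"][name] = value
        entries.append(entry)
    return entries


SAMPLE = """\
@Misc{b2017a,
author = {S. Buranyi},
journal = {The Guardian, 27 June 2017},
year = {2017},
}
"""

for entry in parse_entries(SAMPLE):
    print(entry["type"], entry["key"], entry["fields"]["journal"])
```

Tracking brace depth is what lets the comma inside `{The Guardian, 27 June 2017}` stay part of the value, something a single pattern struggles with once more than the key is needed. For real bibliographies, an existing, maintained BibTeX parser is likely the safer choice.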
