Regular expression to return all keys in a Bibtex file
-
A BibTeX file is a structured file (I have shown an example of two records at the end). I would like to extract the 'keys': the text between an '@' and a comma, but only the part AFTER the '{'. So, for the line @Article{m2023a, it would return 'm2023a'. Failing that, I could just get all those lines and then run another regex to refine them further. The best I have come up with so far is /@([^,]*)\,/ but I can't help feeling that there is a better way, and even this is not quite right. An example of a BibTeX file follows (this is two records; there could be hundreds):
@Article{m2023a,
author = {S. Macdonald},
journal = {Social Science Information},
title = {The gaming of citation and authorship in academic journals: a warning from medicine},
year = {2023},
doi = {10.1177/05390184221142218},
issue = {In Press},
}

@Misc{b2017a,
author = {S. Buranyi},
title = {Is the staggeringly profitable business of scientific publishing bad for science?},
year = {2017},
journal = {The Guardian, 27 June 2017},
url = {https://www.theguardian.com/science/2017/jun/27/profitable-business-scientific-publishing-bad-for-science},
}
-
Thank you - it might seem simple to you :-), but I really appreciate the help.
-
Member 15942356 wrote:
but I can't help feeling that there is a better way
There probably is a better way, unless you really only want the key and will never want anything else. If you are going to want something else (or several things), then the better way is to write (or find) an actual parser: code, not just a regex, that parses the file according to the structure laid out in the format's specification.
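If the key really is all you need, a regex along the following lines may be enough. Note that the pattern from the question, /@([^,]*)\,/, captures everything after the '@', so it returns 'Article{m2023a' rather than 'm2023a'. This is only a sketch in Python: the names KEY_PATTERN and extract_keys are made up for illustration, and it assumes every record looks like @Type{key, as in the sample. It does not handle every BibTeX construct (an @Comment block, for instance, could produce a false match).

```python
import re

# Hypothetical pattern (an assumption, not taken from any BibTeX spec):
# '@', an entry type, '{', then the key, which runs up to the first comma.
# Group 1 is the entry type (e.g. Article), group 2 is the key (e.g. m2023a).
KEY_PATTERN = re.compile(r"@\s*(\w+)\s*\{\s*([^,\s{}]+)\s*,")


def extract_keys(bibtex_text: str) -> list[str]:
    """Return the citation keys in order of appearance."""
    return [match.group(2) for match in KEY_PATTERN.finditer(bibtex_text)]


# Shortened version of the two records from the question.
sample = """@Article{m2023a,
  author = {S. Macdonald},
  year = {2023},
}

@Misc{b2017a,
  author = {S. Buranyi},
  journal = {The Guardian, 27 June 2017},
}"""

print(extract_keys(sample))
```

Because the key group stops at the first comma and excludes braces, commas inside field values such as {The Guardian, 27 June 2017} are not picked up. Anything beyond keys (fields, nested braces, @String definitions) is where a real parser becomes the safer choice.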