Regular Expression to find parts of a <script/img src=""> or <link href=""> attribute value

Tags: regex, javascript, html, css, sharepoint
    Steve M Penner

Been using my go-to [regex101.com](https://regex101.com/) editor to work this out, but I always have problems with URLs and filesystem paths. I generally have the 'https' URL/resource in order.

I am trying to read and parse the _link_ __'href'__ and _img/script_ __'src'__ attribute values from the elements extracted in the markup. The groupings/captures I want are:

1. The "path provider" (PowerShell terminology), basically the drive.
2. The path leading to the file part. I would prefer one group per segment between the path separators "\" or "/" (both must be accounted for), but will accept one long string. Thus, suppose `D:\a\b\c\file.ext`: this part can be grouped as '\a\b\c', but if it can be multiple groups '\a', '\b', '\c', even better. One or more path separators are required.
3. The file basename, without path separator.
4. The file extension with the leading '.', which is the last '.' of the path.

My working pattern/RE is:

`^([a-zA-Z]:)?(([\/\\]?[^\/\\]+)*)[\/\\]([^\.]+)\.(\S+)$`

The solution might instead be several more specific regular expressions joined by the alternation operator (|), rather than a single expression that tries to match every string. I specifically include the '^' and '$' start and end assertions for the markup attribute value.
## Test string #1

`${SPREST_JS_FolderPath}/SPListREST.js`

- No path provider/drive, so no Group 1 (OK)
- Group 2: `${SPREST_JS_FolderPath}` (item 2)
- Group 3: `${SPREST_JS_FolderPath}` (repeat of Group 2, not wanted)
- Group 4: `SPListREST` (file basename, item 3)
- Group 5: `js` (file type/extension, item 4)

## Test string #2

`D:\dev\SharePoint\SPTools\src\pagestyle.css`

- Group 1: `D:` (item 1)
- Group 2: `\dev\SharePoint\SPTools\src` (item 2, exactly as required if groupings by '\pathseg' are not possible)
- Group 3: `\src` (the last path segment, unwanted)
- Group 4: `pagestyle` (file basename, item 3)
- Group 5: `css` (file type/extension, item 4)

## Test string #3

`./js/SPREST/SPRestEmail.js`

- No path provider/drive, so no Group 1
- Group 2: `./js/SPREST` (item 2, exactly as required if groupings by '\pathseg' are not possible)
- Group 3: `/SPREST` (the last path segment, unwanted)
- Group 4: `SPRestEmail` (file basename, item 3)
- Group 5: `js` (file type/extension, item 4)

[composed in Markdown, so presentation affected by y
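
The unwanted Group 3 in each test appears because a capture group that sits inside a repetition keeps only its last iteration, so a single pattern cannot return one group per path segment. One possible way around it, sketched here in JavaScript, is to capture the whole directory part as one group and split it on the separators afterwards. The names `PATH_RE` and `parseResourcePath` are made up for this illustration, and the sketch assumes an engine with named capture groups; the extension is captured with its leading '.', as item 4 asks.

```javascript
// Hypothetical sketch: one regex for the outer structure, then a split for the segments.
//   drive : optional single-letter "path provider" such as "D:"
//   dir   : everything up to (not including) the last path separator
//   base  : file basename, no separators
//   ext   : the last '.' of the path plus whatever follows it
const PATH_RE =
  /^(?<drive>[a-zA-Z]:)?(?<dir>.*)[\/\\](?<base>[^\/\\]+)(?<ext>\.[^.\/\\]+)$/;

function parseResourcePath(value) {
  const m = PATH_RE.exec(value);
  if (m === null) {
    return null; // no separator or no extension: not a path this sketch handles
  }
  const { drive, dir, base, ext } = m.groups;
  return {
    drive: drive ?? null,
    dir,
    // Split on either separator; drop the empty string produced by a leading separator.
    segments: dir.split(/[\/\\]/).filter((s) => s.length > 0),
    base,
    ext,
  };
}

// The three test strings from this post (backslashes doubled for the JS string literal).
const samples = [
  "${SPREST_JS_FolderPath}/SPListREST.js",
  "D:\\dev\\SharePoint\\SPTools\\src\\pagestyle.css",
  "./js/SPREST/SPRestEmail.js",
];

for (const s of samples) {
  console.log(s, parseResourcePath(s));
}
```

The split step is what yields the separate 'dev', 'SharePoint', 'SPTools', 'src' pieces; the regex itself only ever sees the directory part as one string. Full 'https' URLs are outside what this sketch tries to cover.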
