RegEx: Get Values from HTML Attribute Tags
-
I need to extract values from the HTML snippet below. So far I have come up with two regex statements that, applied one after the other, trim the text down to the value I need. To automate this I need to combine them into a single regex that returns "18", and that is where I am stuck. Alternatively, please suggest a better way to get the values. I am using the WebHarvy scraping tool. The program is based on .NET, but it doesn't support inserting .NET code, so I can only use a regex.
First Regex Statement
(?s)(?<=attribute bathroom).+?(?=\/span)
Result:
" title="Bathrooms" style=" ">
18<
Second Regex Statement
(?s)(?<=).+?(?=<)
Result: 18
HTML Snippet
* xxx1
* Factory
* 18
* 18
* 5,010m² | 9,270m²
Don't try to use regex to parse an HTML document. You'll end up with an extremely fragile solution, where even the slightest change to the source document will cause it to break. Use a proper HTML parsing library instead, for example AngleSharp.
In my question I mentioned: "I am using the WebHarvy scraping tool. The program is based on .NET, but it doesn't support inserting .NET code, so I can only use a regex." I cannot use anything except regex in this tool. Since my two regex statements together already produce the result I want, I am fairly sure a single regex can do it, but I don't know enough to combine them. I understand that parsing HTML with regex is not best practice, but I am willing to take the risk. Please suggest a solution.
He was suggesting that you use AngleSharp instead of WebHarvy.
Dave Kreskowiak
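For the regex-only constraint described in the question, the two statements could probably be folded into one pattern: match "attribute bathroom", skip the rest of the opening tag, skip whitespace, and capture the digits. The sketch below checks that idea in Python, purely for illustration. The markup is an assumption reconstructed from the first regex's output (the original snippet did not survive), and whether WebHarvy returns the first capture group or the whole match is also an assumption to verify in the tool. If it returns the whole match, the .NET flavor allows a variable-length lookbehind, so (?<=attribute bathroom[^>]*>\s*)\d+ may work there; Python's re module does not accept that form.

```python
import re

# Hypothetical markup, reconstructed from the first regex's output in the
# question; the real page may differ.
html = '''<li><span class="attribute bedroom" title="Bedrooms" style=" ">
18</span></li>
<li><span class="attribute bathroom" title="Bathrooms" style=" ">
18</span></li>'''

# [^>]*>  skips the remainder of the opening tag,
# \s*     skips the line break before the value,
# (\d+)   captures the number itself.
pattern = re.compile(r'attribute bathroom[^>]*>\s*(\d+)')

match = pattern.search(html)
bathrooms = match.group(1) if match else None
print(bathrooms)
```

Because [^>] and \s already match line breaks, this version does not need the (?s) flag. It remains as fragile as any regex over HTML: a change in attribute order or class name will break it.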