Thanks, Original Griff. That is beyond my technical know-how at this point, but I am looking to learn. I am using this within Octoparse which, from what I have learnt so far, can only use regex to make the fields absolute / more accurate. So I think I am stuck with trying to make it work using regex, unless anyone knows differently or can help with the regex, please?
Member_15883893
Sometimes it's a bullet item, sometimes it is not, sometimes it is multiple bullet items
I am trying to write some regex to pull out fields from a set of web pages. The information in them varies: a page can have all or only some of the fields (I think I have identified all the possibilities). I think I can deal with this by including all the potential options and returning data only when a field is present, as long as I can figure out how to make them absolute references. The other challenge is that these fields sometimes contain bullet lists with one or more bullet items, which I don't know how to handle. An example is below. I am trying to identify the details associated with "Type of surveyor", "Works for", "Business type", "Surveying services", "Partners and directors", "Accreditations" and "Registered valuer". If anyone can help, that would be greatly appreciated.
Patterson Surveying
Patterson Surveying is an independent surveying firm run by Paul Patterson
Type of surveyor
* Chartered Valuation Surveyor
Works for
* Residential customers * Commercial contracts
Business type
Private Practice
Surveying services
* Building surveying * RICS Home Survey – Level 2
Partners and Directors
* Mr P M Patterson MRICS <
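For what it's worth, here is a rough sketch of one way the "optional field, zero or more bullets" problem could be approached. It is written in Python only to show the regex idea; whether Octoparse's regex tool accepts lookahead or these flags is an assumption that would need checking. The names `LABELS`, `SAMPLE` and `extract_fields` are made up for this illustration, and the sample text stops before the "Partners and Directors" entry because that line is cut off above. Each pattern anchors on a field label and captures everything up to the next known label (or the end of the text), and the captured value is then split on `*` to get the bullet items.

```python
import re

# Field labels from the question; matching is case-insensitive.
LABELS = [
    "Type of surveyor",
    "Works for",
    "Business type",
    "Surveying services",
    "Partners and directors",
    "Accreditations",
    "Registered valuer",
]

# Sample text copied from the example in the post (complete lines only).
SAMPLE = """Patterson Surveying
Patterson Surveying is an independent surveying firm run by Paul Patterson
Type of surveyor
* Chartered Valuation Surveyor
Works for
* Residential customers * Commercial contracts
Business type
Private Practice
Surveying services
* Building surveying * RICS Home Survey – Level 2
"""


def extract_fields(text):
    """Return {label: [items]} for every label that appears in the text."""
    alternation = "|".join(re.escape(label) for label in LABELS)
    fields = {}
    for label in LABELS:
        # Label on its own line, then lazily capture up to the next
        # label line (lookahead) or the end of the text.
        pattern = re.compile(
            r"^" + re.escape(label) + r"[ \t]*\n(.*?)"
            r"(?=^(?:" + alternation + r")[ \t]*$|\Z)",
            re.IGNORECASE | re.MULTILINE | re.DOTALL,
        )
        match = pattern.search(text)
        if match is None:
            continue  # field not present on this page
        # Split on the bullet marker; a plain value yields a single item.
        items = [item.strip() for item in match.group(1).split("*")]
        fields[label] = [item for item in items if item]
    return fields


if __name__ == "__main__":
    for label, items in extract_fields(SAMPLE).items():
        print(label, "->", items)
```

Fields that are missing from a page are simply left out of the result, and a value with no bullets (like "Business type") comes back as a one-item list, so all three cases are handled the same way.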