Getting data from a web site using Jsoup

    David Crow wrote (#1):

    The site's URL is https://www.lowes.com/pd/7-16-CAT-PS2-10-OSB-Sheathing-Application-as-4-x-8/50382768 and the information I am trying to obtain is the price. Using Chrome's Inspect view, I was able to drill down to the price element in question. The XPath of that element looks like:

    /html/body/div[3]/section/div/div[7]/div[1]/div[2]/div[1]/span/div[1]

    I really wanted to see that in some "tree" form, so I walked the HTML backward from that element back to the top. That looks like:

                  $30.65 
    

    I was able to throw some Java code together (read: ugly) to drill down to that level but got stopped at the next-to-last DIV element.

    // a bunch of other calls to getElementsByAttributeValue()
    ...
    Elements elements = element.getElementsByAttributeValue("class", "style__ProductTitleWrapper-PDP__sc-1a0l1ro-11 fUmbqY");

    This returns a 1-item collection as it should. If I then follow that with:

    element = elements.get(0); // reuse the existing variable; declaring it again would not compile
    elements = element.getElementsByAttributeValue("class", "styles__PriceWrapper-sc-1c3t51u-0 ZQLLV priceWrapper");

    It returns an empty collection. I assume this is because no such class could be found. However, I can plainly see that DIV element in the Inspect view, and I can get the sibling elements of the one in question. I even tried searching with getElementsByAttributeValue("tabindex", "0"); while it correctly found 2 elements, neither is the one I want. Any idea as to what I'm missing (or not understanding)? Thanks. DC


      englebart wrote (#2):

      Did you try skipping the random-looking class names in the middle? Go straight for "class", "finalPrice". Then extract the nested text node to skip the last div.
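A minimal sketch of that suggestion, assuming the price sits inside an element whose class list contains finalPrice (the name comes from the reply above, not from a verified copy of the page). The markup in main is hypothetical, and PriceLookup and extractPrice are illustrative names. The sketch uses getElementsByClass, which matches a single class token, rather than getElementsByAttributeValue, which compares the whole class attribute value and so misses an element if the generated class string differs in any way.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PriceLookup {

    // Returns the text inside the first element whose class list contains
    // "finalPrice", or null if the parsed HTML has no such element.
    public static String extractPrice(String html) {
        Document doc = Jsoup.parse(html);
        // Matches one class token, so generated names and hashes next to it do not matter.
        Element price = doc.getElementsByClass("finalPrice").first();
        // text() collects the text of nested children, so the innermost DIV
        // does not need to be located separately.
        return price == null ? null : price.text();
    }

    public static void main(String[] args) {
        // Hypothetical markup for illustration only, NOT copied from the real page.
        String html = "<div class=\"styles__PriceWrapper-abc123 priceWrapper\">"
                + "<div class=\"xyz finalPrice\"><div>$30.65</div></div>"
                + "</div>";
        System.out.println(extractPrice(html));
    }
}
```

If this still returns null against the live page, it may be worth checking whether the price is present in the HTML that Jsoup downloads at all. Jsoup does not execute JavaScript, so anything the browser adds after the page loads shows up in the Inspect view but not in the parsed document.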
