Trouble dealing with table data while scraping

Shahin

Why not call the TGA and ask for a full copy of the data?
If you have a legitimate reason to have it, I'm sure they'll oblige you.
 
@Sir Hui,
Thanks for your response and for pointing out how I could get the data. However, the data itself is not what I'm after; I want to work through the difficulty of getting to it so that I can deal with similar sites.
 
You may want to have a quick look at this video by Paul Kelly of Macro Mastery, where he describes the web scraping process.
 
@Sir Hui,
I watched the whole video while the webinar was running. He focused on the basics of scraping table data stored within HTML elements. I e-mailed him asking him to cover JavaScript-driven pages as well, because that is the growing trend. He replied that he will consider it for his next video.
 
I found a solution that gets all the data from the target table. There are around 300 rows of data spread across 7 columns, so it takes a while to parse them all. The script works through seven steps to reach the target page. Here is the working code. Just run it, then sit back and relax until the browser closes. Btw, before executing the script, make sure to add "Selenium Type Library" under Tools > References in the VBA editor.
Code:
Sub Table_Data()

' Requires a reference to "Selenium Type Library"
Dim driver As New WebDriver
Dim posts As Object, post As Object, t_data As Object
Dim x As Long, y As Long

' Walk through the search form until the full results table is shown
With driver
    .Start "chrome", "http://apps.tga.gov.au/Prod/devices"
    .Get "/daen-entry.aspx"
    .FindElementById("disclaimer-accept").Click
    .Wait 3000
    .FindElementById("medicine-name").SendKeys "pump"
    .Wait 5000
    .FindElementByClass("medicines-check-all").Click
    .Wait 3000
    .FindElementById("submit-button").Click
    .Wait 5000
    .FindElementById("ctl00_body_MedicineSummaryControl_cmbPageSelection").Click
    .Wait 5000
    .FindElementByXPath("//option[@value='all']").Click
    .Wait 5000
End With

x = 1   ' Cells() is 1-based, so start writing at row 1
For Each posts In driver.FindElementsByXPath("//table[contains(@class,'daen-report')]")
    For Each post In posts.FindElementsByXPath(".//tr")
        y = 0
        ' ".//td" already matches the 'row-odd' cells as well
        For Each t_data In post.FindElementsByXPath(".//td")
            y = y + 1
            Cells(x, y) = t_data.Text
        Next t_data
        ' Move down only if the row had data cells (skips header rows)
        If y > 0 Then x = x + 1
    Next post
Next posts

driver.Quit   ' close the browser once everything is written
End Sub
 
Btw, one small change makes the script much less likely to hit the "element not found or visible" error: increase the wait from 5000 to 10000 ms at the spot shown below:

Code:
    .FindElementById("medicine-name").SendKeys ("pump")
    .Wait 10000
    .FindElementByClass("medicines-check-all").Click
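
A fixed wait either wastes time or is still too short on a slow day. As a rough sketch of an alternative, and assuming (as far as I know for the SeleniumBasic "Selenium Type Library") that the FindElementBy... methods accept an optional timeout in milliseconds as their second argument, the element lookup itself can do the waiting. The Sub name Wait_For_Checkbox is made up for illustration, and I have not tested this against the TGA site; note too that an element being found does not guarantee it is already clickable.

```vba
' Sketch only: assumes a reference to "Selenium Type Library" and that
' FindElementById / FindElementByClass take an optional timeout (ms).
Sub Wait_For_Checkbox()

Dim driver As New WebDriver

With driver
    .Start "chrome", "http://apps.tga.gov.au/Prod/devices"
    .Get "/daen-entry.aspx"
    ' Each lookup retries for up to 10 seconds and returns
    ' as soon as the element is found, instead of always sleeping
    .FindElementById("disclaimer-accept", 10000).Click
    .FindElementById("medicine-name", 10000).SendKeys "pump"
    .FindElementByClass("medicines-check-all", 10000).Click
End With

driver.Quit
End Sub
```

If this works as assumed, the script only waits as long as the page actually needs, and an error is raised only when the element is still missing after the full 10 seconds.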
 