• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

VBA Web Scrap

Dushyant Gupta

New Member
Hi all,

I am trying to scrap list of VBA course items given at the right pane of the following link "www.tutorialspoint.com//vba/index.htm"

But I am unable to scrap list due to some error.

the code I have written is:

Code:
Sub tutorailpointsscrap()
  Dim ie As InternetExplorer
  
  Set ie = New InternetExplorer
  
  With ie
  .navigate "https://www.tutorialspoint.com//vba/index.htm"
  .Visible = True
  Do While ie.readyState <> READYSTATE_COMPLETE
  DoEvents
  Loop
  End With
  
  Dim html As HTMLDocument
  Set html = ie.document
  
   
  Dim ele As IHTMLElement
  
  Dim lists As IHTMLElementCollection
  Dim row As Long
  
  Set ele = html.getElementsByClassName("nav nav-list primary left-menu")
  
  Set lists = ele.getElementsByTagName("li")
  row = 1

  
  For Each li In lists
  Cells(row, 1) = li.innerText
  row = row + 1
  Next
  
  ie.Quit

End Sub

Thanks in advance
 
Hi !

Logic error ! Use getElementsByTagName on "A" instead of "li"

Tried that, but it gave the "Type mismatch" error.

However by removing " Set ele = html.getElementsByClassName("nav nav-list primary left-menu")"

and changing "Set lists = ele.getElementsByTagName("a")" to "Set lists = html.getElementsByTagName("a")"

it scrapes the full website with tagname "a"

But I want to scrape the list given in the following html:

HTML:
<ul class="nav nav-list primary left-menu">
<li class="heading">VBA Tutorial</li>
<li><a href="/vba/index.htm" style="background-color: rgb(214, 214, 214);">VBA - Home</a></li>
<li><a href="/vba/vba_overview.htm">VBA - Overview</a></li>
<li><a href="/vba/vba_excel_macros.htm">VBA - Excel Macros</a></li>
<li><a href="/vba/vba_excel_terms.htm">VBA - Excel Terms</a></li>
<li><a href="/vba/vba_macro_comments.htm">VBA - Macro Comments</a></li>
<li><a href="/vba/vba_message_box.htm">VBA - Message Box</a></li>
<li><a href="/vba/vba_input_box.htm">VBA - Input Box</a></li>
<li><a href="/vba/vba_variables.htm">VBA - Variables</a></li>
<li><a href="/vba/vba_constants.htm">VBA - Constants</a></li>
<li><a href="/vba/vba_operators.htm">VBA - Operators</a></li>
<li><a href="/vba/vba_decisions.htm">VBA - Decisions</a></li>
<li><a href="/vba/vba_loops.htm">VBA - Loops</a></li>
<li><a href="/vba/vba_strings.htm">VBA - Strings</a></li>
<li><a href="/vba/vba_date_time.htm">VBA - Date and Time</a></li>
<li><a href="/vba/vba_arrays.htm">VBA - Arrays</a></li>
<li><a href="/vba/vba_functions.htm">VBA - Functions</a></li>
<li><a href="/vba/vba_sub_procedure.htm">VBA - SubProcedure</a></li>
<li><a href="/vba/vba_events.htm">VBA - Events</a></li>
<li><a href="/vba/vba_error_handling.htm">VBA - Error Handling</a></li>
<li><a href="/vba/vba_excel_objects.htm">VBA - Excel Objects</a></li>
<li><a href="/vba/vba_text_files.htm">VBA - Text Files</a></li>
<li><a href="/vba/vba_programming_charts.htm">VBA - Programming Charts</a></li>
<li><a href="/vba/vba_userforms.htm">VBA - Userforms</a></li>
</ul>
 

Seems your error is a typo in code !

Or behind the initial class name object,
point out within structure the object containing only desired "A" tags …
 
First find "ul" Element Collection that you want, by looking for "ul" with .className("nav nav-list primary left-menu"). Then set "li" Element Collection and loop.

Code:
Sub tutorailpointsscrap()
  Dim ie As InternetExplorer
 
  Set ie = New InternetExplorer
 
  With ie
  .navigate "https://www.tutorialspoint.com//vba/index.htm"
  .Visible = True
  Do While ie.readyState <> READYSTATE_COMPLETE
  DoEvents
  Loop
  End With
 
  Dim html As HTMLDocument
  Set html = ie.document
 
  Dim ele As IHTMLElementCollection
  Dim lists As IHTMLElementCollection
  Dim row As Long
 
  Set ele = html.getElementsByTagName("ul")
  For Each e In ele
      If e.className = "nav nav-list primary left-menu" Then
          Set lists = e.getElementsByTagName("li")
          row = 1
          For Each li In lists
            Cells(row, 1) = li.innerText
            row = row + 1
          Next
      End If
  Next e
  ie.Quit

End Sub

FYI - getElementsByClassName is finicky to work with. Above is workaround to specify class without using that method.
 
In fact the issue is a lack of observation of what returns
getElementsByClassName and an uncomplete initial post
as OP must crystal clear explain which error occurs on which codeline !

As there are several items with this class name,
the desired one must be pointed out within code !

As piloting IE is slowest way, faster is to use a request …

Anyway under IE without any reference to activate :​
Code:
Sub DemoIE()
         Dim oElt As Object, R&
         [A1].CurrentRegion.Clear
    With CreateObject("InternetExplorer.Application")
        .Navigate "https://www.tutorialspoint.com/vba/index.htm"
         While .Busy Or .ReadyState < 4:  DoEvents:  Wend
    For Each oElt In .Document.getElementsByClassName("nav nav-list primary left-menu")(0).getElementsByTagName("A")
         R = R + 1
         Cells(R, 1).Value = oElt.innerText
    Next
        .Quit
    End With
End Sub
Do you like it ? So thanks to click on bottom right Like !
 
Back
Top