• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

How can I run loop in my script?

shahin

Active Member
Can't incorporate an idea frequenting in my mind. While working with python I noticed that there is a rarely used method applied when it comes to run loops. However, the method is, if a loop starts from 1 then it will run until the "http.status=404", which indicates that a loop will run until it finds the url with the looping number in it which doesn't exist and the code will break. I don't know whether it is applicable in vba. Took a chance, though! Tried initially with:

Code:
Sub movie_name()
Const mlink = "https://www.yify-torrent.org/genres/western/"
Dim http As New XMLHTTP60, html As New HTMLDocument
Dim post As Object

With http
    .Open "GET", mlink, False
    .send
    html.body.innerHTML = .responseText
End With

For y = 1 To http.Status <> 200
    With http
        .Open "GET", mlink & "p-" & y & "/", False
        .send
        html.body.innerHTML = .responseText
    End With
For Each post In html.getElementsByClassName("mv")
    With post.getElementsByTagName("h3")
        x = x + 1
        If .Length Then Cells(x, 1) = .Item(0).innerText
    End With
    Next post
Next y
End Sub
 
Hi !

(http.Status <> 200) : if True equals -1 and False is 0 !

Instead of a loop just use a simple If statement …
 
Hi ,

Try this :
Code:
Sub movie_name()
    Const mlink = "https://www.yify-torrent.org/genres/western/"
    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim post As Object

    With http
        .Open "GET", mlink, False
        .send
        html.body.innerHTML = .responseText
    End With

    y = 1
    Do
      previousx = x
      With http
            .Open "GET", mlink & "p-" & y & "/", False
            .send
            html.body.innerHTML = .responseText
      End With
   
      DoEvents

      For Each post In html.getElementsByClassName("mv")
          With post.getElementsByTagName("h3")
                x = x + 1
                If .Length Then Cells(x, 1) = .Item(0).innerText
          End With
      Next post
      y = y + 1
    Loop Until previousx >= x
End Sub
Narayan
 
Thanks Narayan, for your robust solution. It seems that whatever the upper bound number is, the script you provided will fetch the whole data until it reaches there. Magical code. New ideas are mushrooming. Might get back on anything if I find it difficult to understand your code. Thanks a trillion.
 
Last edited:
Ain't it doing the same job? May I have a one-liner clarification how previousx thing works?

Code:
Sub movie_name()
Const mlink = "https://www.yify-torrent.org/genres/western/"
Dim http As New XMLHTTP60, html As New HTMLDocument
Dim post As Object

y = 1

Do
  previousx = x
  With http
        .Open "GET", mlink & "p-" & y & "/", False
        .send
        html.body.innerHTML = .responseText
  End With

  For Each post In html.getElementsByClassName("mv")
      With post.getElementsByTagName("h3")
            x = x + 1
            If .Length Then Cells(x, 1) = .Item(0).innerText
      End With
  Next post
  y = y + 1
Loop Until previousx >= x
End Sub
 
Hi ,

x is a counter which increments each time an entry is made in the worksheet for a movie.

Thus , with a valid new page , x will have incremented a number of times , 18 according to what I have seen in my browser.

Thus , each time a pass through the Do loop completes , x will be greater than previousx ; only when a new page number delivers no new movies , will x not be incremented , and in this case , previousx will equal x , and the loop will be exited.

Narayan
 
The same concept if I apply here, the loop continues on and on. Clearly I've messed something up but can't figure out:
Code:
Sub Aoty_Data()
Dim http As New XMLHTTP60
Dim html As New HTMLDocument, topic As HTMLHtmlElement
Dim inputyear As Long

inputyear = InputBox("Input the year you're scraping")
If inputyear < 1900 Then Exit Sub

y = 1
Do
    prex = x
    With http
        .Open "GET", "http://www.albumoftheyear.org/ratings/6-highest-rated/" & inputyear & "/" & y, False
        .send
        html.body.innerHTML = .responseText
    End With
   
    For Each topic In html.getElementsByClassName("albumListRow")
        x = x + 1
        With topic.getElementsByClassName("listLargeTitle")(0).getElementsByTagName("a")
            If .Length Then Cells(x, 1) = Split(.Item(0).innerText, "-")(0)
        End With
            With topic.getElementsByClassName("listLargeTitle")(0).getElementsByTagName("a")
            If .Length Then Cells(x, 2) = Split(.Item(0).innerText, "-")(1)
        End With
        With topic.getElementsByClassName("listScoreValue")
            If .Length Then Cells(x, 3) = .Item(0).innerText
        End With
            With topic.getElementsByClassName("listScoreText")
            If .Length Then Cells(x, 4) = Split(.Item(0).innerText, " ")(0)
        End With
    Next topic
    y = y + 1
Loop Until prex >= x

End Sub
 
Hi ,

Try this :
Code:
Sub Aoty_Data()
    Dim http As New XMLHTTP60
    Dim html As New HTMLDocument, topic As HTMLHtmlElement
    Dim inputyear As Long
    Dim done As Boolean, firsttime As Boolean

    inputyear = InputBox("Input the year you're scraping")
    If inputyear < 1900 Then Exit Sub

    y = 1
    firsttime = True
    Do
      With http
            .Open "GET", "http://www.albumoftheyear.org/ratings/6-highest-rated/" & inputyear & "/" & y, False
            .send
            html.body.innerHTML = .responseText
      End With
  
      DoEvents
  
      For Each topic In html.getElementsByClassName("albumListRow")
          x = x + 1
          With topic.getElementsByClassName("listLargeTitle")(0).getElementsByTagName("a")
                If .Length Then Cells(x, 1) = Split(.Item(0).innerText, "-")(0)
          End With
          
          With topic.getElementsByClassName("listLargeTitle")(0).getElementsByTagName("a")
                If .Length Then Cells(x, 2) = Split(.Item(0).innerText, "-")(1)
          End With
      
          With topic.getElementsByClassName("listScoreValue")
                If .Length Then Cells(x, 3) = .Item(0).innerText
          End With
      
          With topic.getElementsByClassName("listScoreText")
                If .Length Then Cells(x, 4) = Split(.Item(0).innerText, " ")(0)
          End With
      
          If Not firsttime Then
              If ((Cells(x, 1) = Cells(1, 1)) And (Cells(x, 2) = Cells(1, 2)) And (Cells(x, 3) = Cells(1, 3)) And (Cells(x, 4) = Cells(1, 4))) Then
                done = True
                Cells(x, 1).Resize(, 4).ClearContents
                Exit For
              End If
          End If
          firsttime = False
      Next topic
      y = y + 1
    Loop Until done
End Sub
Narayan
 
Dear Narayan, your method works perfectly. But, I tried to finish the way I started. If I give it a little twitch, it loops once. I might be wrong but there are always alternatives which I'm after. I tried with this:

Code:
Sub Aoty_Data()
Dim http As New XMLHTTP60
Dim html As New HTMLDocument, topic As HTMLHtmlElement

y = 1
Do

With http
    .Open "GET", "http://www.albumoftheyear.org/ratings/6-highest-rated/2000/" & y, False
    .send
    html.body.innerHTML = .responseText
End With

For Each topic In html.getElementsByClassName("albumListRow")
    x = x + 1
    With topic.getElementsByClassName("listLargeTitle")(0).getElementsByTagName("a")
        If .Length Then Cells(x, 1) = Split(.Item(0).innerText, "-")(0)
    End With
        With topic.getElementsByClassName("listLargeTitle")(0).getElementsByTagName("a")
        If .Length Then Cells(x, 2) = Split(.Item(0).innerText, "-")(1)
    End With
Next topic
y = y + 1
Loop Until y <> ""

End Sub
 
Hi ,

Different pages behave differently.

Here , when the number of pages goes beyond 3 , the same pages are displayed over and over again.

Thus , after the 68th entry has been displayed on page 3 , when y is incremented to 4 , the first entry from the first page is displayed once more ; incrementing y to 5 displays page 2 , and incrementing y to 6 displays page 3. This repeats indefinitely.

Narayan
 
Oh I see!!! I thought that it would be easier for me to do stuffs with how you started in your first post but I'm wrong. Sorry for any inconvenience.
 
Back
Top