I wanted to write a recursive web crawler in VBA. I don't have much VBA experience, so it took me a while to work out what the pattern should look like, but I finally have one that works. It starts from the first page of a torrent site, scrapes the movie names on that page, then follows the site's "Next" page link and repeats until no next-page link remains. Any suggestions for making it more robust would be a great help. Thanks in advance.
Here is what I've written:
Sub yify(dynamic_link As String)
    ' Requires references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library"
    Const main_link As String = "https://yts.ag"
    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim movie As Object, link As Object

    ' Download the page synchronously and parse the response as HTML
    With http
        .Open "GET", dynamic_link, False
        .send
        html.body.innerHTML = .responseText
    End With

    ' Scrape movie names into consecutive cells, starting at the active cell
    For Each movie In html.getElementsByClassName("browse-movie-title")
        ActiveCell.Value = movie.innerText
        ActiveCell.Offset(1, 0).Select
    Next movie

    ' Find the "Next" link in the first pagination block and recurse into it
    For Each link In html.getElementsByClassName("tsc_pagination")(0).getElementsByTagName("a")
        If InStr(link.innerText, "Next") > 0 Then
            ' Take the part of href after the first ":" and prepend the site root
            yify main_link & Split(link.href, ":")(1)
            Exit For ' follow only one "Next" link per page
        End If
    Next link
End Sub

Sub RecursiveCrawler()
    ' Toggle screen updating once here rather than in every recursive call
    Application.ScreenUpdating = False
    Range("A1").Select
    yify "https://yts.ag/browse-movies/0/all/documentary/0/latest" ' Crawling process starts here
    Application.ScreenUpdating = True
End Sub