2

I need to crawl the price values from the price comparison website (product link: https://www.toppreise.ch/prod_488002.html). I am not able to scrape. see the highlighted price in the image that I want to capture:

click to see image

Please help me how to crawl this page.

PS: toppreise.ch will not be accessible in many countries so use VPN

I am using the below code:

Private Sub SiteInfo_Click()
Dim strhtml
On Error Resume Next
ThisWorkbook.Sheets("Data Mining").Activate
Sheets("Data Mining").Range("B1").Select
Set xmlHttp = Nothing
Set xmlHttp = CreateObject("MSXML2.ServerXMLHTTP.6.0")

    StrUrl = ""
    StrUrl = Sheets("Data Mining").Range("B1").Value
    xmlHttp.Open "GET", StrUrl, False
    xmlHttp.Send
    strhtml =xmlHttp.responseText
    END Sub

When I run above code I am only getting below response text . It doesn't gives the whole page. (You can check the source code by using the product link or view here https://www.dropbox.com/s/ah80jt7a25xcicp/source%20code.txt?dl=0 )

<html><head>
        <script type="text/javascript" src="//en.toppreise.ch/js/tpjs.js"></script>
        <script type="text/javascript" src="//en.toppreise.ch/js/afxp.js"></script>
        <script type="text/javascript" src="//en.toppreise.ch/js/jquery.min.js"></script>
        <script type="text/javascript" src="//en.toppreise.ch/js/jquery-ui-autocomplete.min.js"></script>
    </head><body>...   
4
  • Looks like the page is dynamic, so you'll have to use a different approach, such as browser automation. Commented Feb 15, 2018 at 6:05
  • Thank you so much @TimWilliams. Will you be able to help on that? BTW How to find whether the page is dynamic or not?
    – Prasath
    Commented Feb 15, 2018 at 6:39
  • Remember to close On Error Resume Next with On Error GoTo 0 (as soon as possible ). You will be hiding errors all the time otherwise.
    – QHarr
    Commented Feb 15, 2018 at 6:39
  • Try googling "VBA automate IE" and you'll get plenty of examples. Commented Feb 15, 2018 at 6:42

1 Answer 1

0

This code works, Thanks SIM

Sub Get_Price()
Dim HTTP As New XMLHTTP60, HTML As New HTMLDocument
Dim post As HTMLDivElement

With HTTP
.Open "GET", "https://www.toppreise.ch/index.php?a=488002", False
.send
HTML.body.innerHTML = .responseText
End With

For Each post In HTML.getElementsByClassName("altLinesOdd")
With post.getElementsByTagName("a")
     If .Length Then R = R + 1: Cells(R, 1) = .Item(0).innerText
End With
With post.getElementsByClassName("spaceVert nobreak")
     If .Length Then Cells(R, 2) = .Item(0).innerText
End With
Next post
End Sub
1
  • Welcome to Stack Overflow! Thank you for this code snippet, which might provide some limited short-term help. A proper explanation would greatly improve its long-term value by showing why this is a good solution to the problem, and would make it more useful to future readers with other, similar questions. Please edit your answer to add some explanation, including the assumptions you've made. Commented Feb 16, 2018 at 9:43

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.