
I am trying to scrape data from this page (using Python 2.7):

http://financials.morningstar.com/valuation/earnings-estimates.html?t=AMD

When I right-click and choose "View Page Source" in Chrome, the content I am looking for is not there. For example, I am looking for "Average Rating".

I searched Stack Overflow and found this question and answer: "Python 3, Web-scraping, and Javascript [Oh My]". But when I tried the approach from the main answer, I could not find any XMLHttpRequest function.

I would appreciate any help with this.

  • In the Firefox network inspector I see 3 AJAX requests (click "XHR" at the bottom).
    Commented Feb 28, 2015 at 23:39
  • Thanks, Carpetsmoker. I tried Firefox and now I see a number of GET and POST requests. How can I use this information?
    – TJ1
    Commented Feb 28, 2015 at 23:46
  • It is similar in Chrome's network inspector. Open the Network tab, click the XHR filter, and load the M* page; you'll see 3 XHR items. Click one of them in the left column (Name) and you'll see a URL. Copy that and open it in your browser.
    – foosion
    Commented Feb 28, 2015 at 23:56
  • Thanks, foosion. Strangely, I cannot see this in Chrome!
    – TJ1
    Commented Mar 1, 2015 at 0:04

1 Answer

It looks like the data you want is pulled from these three URLs:

http://financials.morningstar.com/valuation/annual-estimate-list.action?&t=XNAS:AMD&region=usa&culture=en-US&cur=&r=1425167484279.9668&_=1425167484280
http://financials.morningstar.com/valuation/analyst-opinion-list.action?&t=XNAS:AMD&region=usa&culture=en-US&cur=&r=1425167484282.3906&_=1425167484282
http://financials.morningstar.com/valuation/forward-comparisons-list.action?&t=XNAS:AMD&region=usa&culture=en-US&cur=&r=1425167484284.5396&_=1425167484284

You should be able to scrape these URLs directly.
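A minimal sketch of what scraping them directly could look like, written to run on both Python 2.7 and Python 3. The helper names (`build_url`, `fetch`, `scrape_all`) are made up for illustration. It also assumes the `r` and `_` query parameters are only cache-busting timestamps and can be left out, and that `XNAS` is the right exchange prefix for the ticker; neither is confirmed here.

```python
try:
    from urllib2 import urlopen            # Python 2.7
except ImportError:
    from urllib.request import urlopen     # Python 3

BASE = "http://financials.morningstar.com/valuation/"

# The three endpoints seen in the network inspector.
ENDPOINTS = (
    "annual-estimate-list.action",
    "analyst-opinion-list.action",
    "forward-comparisons-list.action",
)

# Query parameters copied from the captured URLs.
# Assumption: the "r" and "_" values are cache busters and are omitted.
QUERY = (("region", "usa"), ("culture", "en-US"), ("cur", ""))


def build_url(endpoint, ticker, exchange="XNAS"):
    """Build the request URL for one endpoint and one ticker."""
    params = (("t", "%s:%s" % (exchange, ticker)),) + QUERY
    query = "&".join("%s=%s" % pair for pair in params)
    return "%s%s?%s" % (BASE, endpoint, query)


def fetch(url, timeout=10):
    """Download one URL and return the raw response body."""
    response = urlopen(url, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()


def scrape_all(ticker, fetcher=fetch):
    """Return a dict mapping each endpoint name to its response body."""
    return dict((name, fetcher(build_url(name, ticker))) for name in ENDPOINTS)
```

Calling `scrape_all("AMD")` would issue the three requests. The responses still need to be parsed (for example, to locate "Average Rating"), and this sketch does not check whether the endpoints still respond.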

  • Thanks, that is correct. Did you use Chrome to find these?
    – TJ1
    Commented Mar 1, 2015 at 0:25
  • No, I used Firefox with the HttpFox toolbar (from addons.mozilla.org/en-us/firefox/addon/httpfox).
    Commented Mar 1, 2015 at 0:35
  • Thanks again. I installed HttpFox, but when I run it I see a lot of URLs. How do you know which one to pick?
    – TJ1
    Commented Mar 1, 2015 at 1:56
  • In HttpFox, look at the Type column. You can ignore text/css (formatting), text/javascript (JavaScript, which could include dynamic data but usually does not), and image/gif (pictures). This leaves the main page (which we have already established does not hold the data you want), a favicon file with the wrong data type (it should be image/x-icon), the three files listed above, and an application/json file (data for the commodity quotes ticker at the top of the page).
    Commented Mar 1, 2015 at 2:35
