
I have to scrape a website that uses JavaScript to display its content. I can only use standard libraries, because the script will run on a server where no browser can be installed. I found Selenium, but it requires a browser, which in my case is not possible to install.

Any ideas or solutions?

  • Why don't you rely on Scrapy for doing the task? Avoid reinventing the wheel.
    – narko
    Commented Sep 18, 2015 at 7:11
  • You can use Requests library.
    – Vikas Ojha
    Commented Sep 18, 2015 at 7:12
  • Scrapy and BeautifulSoup are pretty good libraries for this.
    Commented Sep 18, 2015 at 7:41
  • These modules (Requests, BeautifulSoup) could not do it.
    Commented Sep 18, 2015 at 7:59
  • @Shafiq Do you mind if I ask why requests and bs4 couldn't complete the task? These would have been my first go-to solutions.
    – pmccallum
    Commented Sep 18, 2015 at 8:09

2 Answers


Have a look at Ghost.py (http://jeanphix.me/Ghost.py/). It doesn't require a browser.

pip install Ghost.py

from ghost import Ghost

ghost = Ghost()
# open() loads the page and executes its JavaScript via headless WebKit,
# returning the page object and the resources it fetched
page, resources = ghost.open('http://stackoverflow.com/')

You didn't mention how the website uses JavaScript, but if it makes AJAX requests triggered by user interaction, you will need something like Selenium to automate that behaviour. Here you can find a short tutorial on how to scrape with Scrapy + Selenium. This of course requires a browser installed on your machine.
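If the content arrives via AJAX, an option that stays within the standard library (as the question requires) is to skip the JavaScript entirely and call the underlying JSON endpoint yourself: open the browser's developer tools on another machine, watch the Network tab for XHR/fetch requests, and reproduce them with `urllib`. A minimal sketch, assuming Python 3; the endpoint URL below is hypothetical and stands in for whatever URL you discover:

```python
import json
import urllib.request

def fetch_json(url, timeout=10):
    """Fetch a URL and decode its body as JSON, using only the standard library."""
    # Some sites reject requests without a browser-like User-Agent header
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Hypothetical endpoint -- replace with the real URL found in the Network tab:
# data = fetch_json("http://example.com/api/items?page=1")
# for item in data["items"]:
#     print(item["title"])
```

This only works when the dynamic content comes from a discoverable HTTP endpoint; if the page builds its data purely client-side, you are back to a headless engine like Ghost.py.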
