All Questions
899 questions
2 votes, 1 answer, 95 views
Puppeteer Scraping: See XHR response data before request completes for real time data
I am using Puppeteer to scrape a website for real-time data in Node.js. Instead of scraping the page, I am watching the backend requests and capturing the JSON/text responses so I have more structured ...
0 votes, 0 answers, 47 views
Extracting tables through Puppeteer
I am working on my university project. I want to extract data from my university LMS; I cracked the login authentication, and now I want to extract the HTML of my subjects' attendance and sessional marks ...
0 votes, 0 answers, 44 views
<img> Selector Returns Null in Headless Mode but Works in Non-Headless Mode
I'm working on a Puppeteer-based scraper that extracts product details (image, title, and price) from a webpage. The scraper works perfectly in non-headless mode, but when I switch to headless mode, ...
1 vote, 2 answers, 93 views
Getting correct selector using Puppeteer
Here's the HTML code.
<div class="list-row">
<div class="list-item">
<div class="imgframe">
<div class="img-wrap">...
0 votes, 0 answers, 130 views
Close a modal window with Puppeteer/NodeJS
I'm trying to close a modal window that appears when a record is invalid.
I've already analyzed the HTML a lot, but still can't figure out why it isn't closing.
I can continue the processing even if ...
-1 votes, 1 answer, 42 views
Parallel execution with NodeJs and Puppeteer
I'm using Node.js and Puppeteer to scrape data from a website.
My goal is to run multiple instances of Puppeteer when scraping data.
However, I may be doing something wrong, because it is running only ...
0 votes, 0 answers, 38 views
Puppeteer with Proxy Shows Original IP in Chromium Window
I'm working on a Node.js project where I’m trying to change my IP address using proxies with Puppeteer. However, when I open the Chromium browser window (in script), I still see my original IP address ...
-1 votes, 1 answer, 142 views
NodeJS/Puppeteer - can't click an element
I'm trying to click on an element, but whenever I click, the focus seems to move to an input.
Let me rephrase: it is a clickable element that appears after I do a search. I want to click on it ...
1 vote, 1 answer, 38 views
Getting unexpected/not_present elements/tags while scraping in node js with cheerio
I am scraping and parsing the content of a web page (https://www.mydealz.de/new). The structure is as follows.
<div class="threadGrid-title">
<strong><a href="">...
3 votes, 2 answers, 724 views
puppeteer scraping dynamic content
I'm trying to scrape data from a Looker Studio web page report using Puppeteer in Node.js, but I'm encountering issues because the report is dynamic. When I fetch the data, the body is empty. Here's
...
4 votes, 2 answers, 3k views
XPath Selector in Puppeteer 22.x
I have read the newest Puppeteer v22.x documentation about XPath, but I still don't know how to use XPath in Puppeteer 22.x.
I want to click an element containing the text 'Next'. Here is the HTML that has the ...
0 votes, 1 answer, 425 views
What are the differences between requests via Node's fetch() and requests via browsers? [closed]
I'm trying to scrape some APIs for public data. I'm sometimes getting blocked when using Node's fetch, but I'm not blocked when requesting the same API using my browser. Usually, if I'm blocked, I'd ...
1 vote, 1 answer, 290 views
Cannot use external function inside page.evaluate()
I am scraping a dynamic website with Puppeteer. My goal is to create scraping logic that is as generic as possible, which will also remove a lot of boilerplate code. So for that reason, I ...
0 votes, 2 answers, 1k views
Increasing Web Scraping speed with puppeteer
I am trying to create a Node.js API that scrapes websites (I started with only Goodreads and will expand further once I have optimized the approach) and provides the ...
1 vote, 1 answer, 218 views
404 response while using axios.get on a live-server
I'm learning web scraping with JavaScript, and while trying to log a simple web page to the console, I'm getting a weird 404 error:
Failed to load resource: the server responded with a status of 404
(...