All Questions

0 votes
1 answer
63 views

Parse WhatsApp message read status [closed]

My question is more about HTML layout and parsing dynamic content. My task: parse the contacts who read a particular message of mine in a group. I tried to inspect the DOM structure of the DIV block that holds that ...
Jeffrey Rasmussen's user avatar
2 votes
1 answer
107 views

How to handle self-closing tags without end-slash in html.parser.HTMLParser

By default, it seems that html.parser.HTMLParser cannot handle self-closing tags correctly if they are not terminated with /. E.g. it handles <img src="asfd"/> fine, but it ...
flawr's user avatar
  • 11.6k
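
For the html.parser question above, here is a minimal sketch of one possible approach, not the accepted answer. It assumes the goal is to treat void elements such as img the same way whether or not they end in "/". The names VOID_ELEMENTS, VoidAwareParser, and parse_events are hypothetical; only HTMLParser and its handle_* methods come from the standard library.

```python
from html.parser import HTMLParser

# Elements that never take an end tag in HTML (assumed list, per the HTML standard).
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class VoidAwareParser(HTMLParser):
    """Records one 'empty' event for void elements, with or without a trailing slash."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called for <img src="x"> (no slash). Void elements are recorded as empty.
        kind = "empty" if tag in VOID_ELEMENTS else "start"
        self.events.append((kind, tag, dict(attrs)))

    def handle_startendtag(self, tag, attrs):
        # Called for <img src="x"/>. Overriding it keeps the default
        # implementation from emitting separate start and end events.
        self.events.append(("empty", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def parse_events(html):
    """Feed a complete HTML string and return the recorded events."""
    parser = VoidAwareParser()
    parser.feed(html)
    parser.close()
    return parser.events
```

Calling parse_events('<p><img src="a"><img src="b"/></p>') records both img tags as the same kind of event, which is the behavior the question appears to be after.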
1 vote
1 answer
117 views

python: parse html document with UNNESTED div tags into dataframe (using beautifulsoup)

Long-time user, but I never had to ask my own question. I want to use Python to parse a table from an HTML document into a DataFrame. The table is NOT an HTML table; I think it is created by JavaScript ...
tailor's user avatar
  • 15
-2 votes
1 answer
60 views

I don't understand web parsing completely

I tried using this code below to extract hyperlinks, and it worked for 1 website I tried it with: import requests from bs4 import BeautifulSoup import time def timedelay(amount): print("...&...
bacon_man284's user avatar
2 votes
1 answer
49 views

Replace an HTML tag in an HTML document using Python without modifying the rest of the document

I'm making a simple Python + HTML website (as part of my study). The website menu looks like this: <ul> <li><a href="/">Home</a></li> <li><a ...
Irina Shishilova's user avatar
0 votes
2 answers
90 views

Why does my code print the same HTML link many times?

I'm doing a link-following exercise in Python (it's an assignment from Python Web Access Data on Coursera). Here is the problem: In this assignment you will write a Python program that expands on http://...
Vinh Nguyễn Thành's user avatar
0 votes
1 answer
88 views

Why does requests-html return only partial content?

I know it's because the content is rendered by JS, but requests-html supports JS, so that's strange. The code itself: from requests_html import HTMLSession session = HTMLSession() session....
GayLord's user avatar
1 vote
1 answer
33 views

Python: How can I get a list of li tags in BeautifulSoup4

I'm trying to scrape a Persian webpage and I want to get 3 li tags from a ul containing 6 of them. My problem is that every li has nested li tags in it, and when I use soup.find_all('li'), it finds ...
Seyedmahdi moosavyan's user avatar
0 votes
2 answers
102 views

How to find multiple tags at once along with attributes using BeautifulSoup in Python 3?

I am trying to find different tags at once using the find_all() method of BeautifulSoup. I found a way to include all tags in the list to get the respective tags. But I am trying to get tags along with ...
David's user avatar
  • 379
0 votes
1 answer
109 views

Removing Specific Span Tags from a CSV file

I am trying to remove specific span tags from a CSV file, but my code is deleting all of them. I just need to single out certain ones to be removed, for example '<span style="font-family: verdana,...
3ndurance's user avatar
1 vote
2 answers
174 views

HTML parser find tag info

I have a project that uses HTMLParser(). I have never worked with this parser, so I read the documentation and found two useful methods I can override to extract information from the site: handle_starttag ...
Beginner's user avatar
2 votes
1 answer
56 views

How to dynamically find the nearest specific parent of a selected element?

I want to parse many HTML pages and remove a div that contains the text "Message", using BeautifulSoup with html.parser and Python. The div has no name or id, so pointing to it is not possible. I ...
Newbie's user avatar
  • 590
0 votes
1 answer
65 views

Selenium. NoSuchElementException

Can someone help me understand what the problem with this code is? I understand that the question is not new, but what I found didn't help me; maybe I searched badly. wd = webdriver.Chrome('...
exPriceD's user avatar
0 votes
0 answers
45 views

Chrome user agent doesn't work for a scraper

I have the following code to scrape images: import os, requests, lxml, re, json, urllib.request from bs4 import BeautifulSoup from os.path import expanduser headers = { "User-Agent": &...
v_head's user avatar
  • 815
-1 votes
1 answer
46 views

BeautifulSoup find_all when a tag is not inside another tag

html = """ <html> <h2>Top Single Name</h2> <table> <tr> <p>hello</p> </tr> </table> <div> ...
hit's user avatar
  • 87
