All Questions
425 questions
0
votes
1
answer
114
views
How to Scrape Google RssFeed Links?
I'm trying to scrape the links found in Google RSS feeds for a given country.
The links appear in XML format when you visit this URL: https://news.google.com/rss/search?q={"...
0
votes
1
answer
32
views
I would like to implement case-sensitive search in Elasticsearch and I don't know how
I have made a kind of RSS aggregator that stores news RSS feeds in Elasticsearch, and I would like to perform case-sensitive searches. From what I have read (I am new to ES) I've seen that by default ES is ...
0
votes
1
answer
31
views
Rumble RSS feed - gives 403 when called with a python urllib.request
The line of code failing is
req = urllib.request.Request(url)
A Rumble RSS feed address that works fine when typed into a browser address bar returns a '403 - authorisation will not help'.
A ...
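A 403 on a URL that loads fine in a browser is often caused by the server rejecting the default `Python-urllib` User-Agent. That is only an assumption here, since the excerpt does not show the feed URL or the full traceback. A minimal sketch of sending a browser-like header with `urllib.request` follows; the URL and the User-Agent string are placeholders, not values from the question.

```python
import urllib.request


def build_request(url):
    """Return a Request carrying a browser-like User-Agent header.

    Assumption: the server refuses the default "Python-urllib/x.y" agent.
    The header value below is a made-up placeholder.
    """
    headers = {"User-Agent": "Mozilla/5.0 (compatible; example-feed-reader/1.0)"}
    return urllib.request.Request(url, headers=headers)


# Placeholder URL; substitute the real feed address.
req = build_request("https://example.com/feed.rss")

# The network call itself happens in urlopen, not in Request(...):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     xml_bytes = resp.read()
```

If the 403 persists with the header set, the block is probably happening at another layer (for example bot protection), and a header change alone will not help.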
0
votes
0
answers
58
views
Using Python to create an RSS feed - but would like to display the feed locally
I've created a project that creates an RSS feed. This RSS feed will be sent to a company that will wrap it and create their own prettier RSS feed with it, but is there a tool or extension that devs ...
2
votes
3
answers
660
views
How to handle Google consent page when scraping article data using Google News RSS links?
I have a list of Google News links from a Google RSS feed and I would like to get the full text of those articles. I use the BeautifulSoup library to scrape the data, however, it seems that Google redirects to ...
0
votes
3
answers
777
views
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud83c' in position 0: surrogates not allowed
I am trying to parse "https://tre.tbe.taleo.net/tre01/ats/servlet/Rss?org=arobpers2&cws=42" but I am getting the error "UnicodeEncodeError: 'utf-8' codec can't encode character '\...
0
votes
0
answers
26
views
Extracting images out of multiple different RSS feeds
I'm working on a project that pulls articles from 20+ RSS feeds, which use a variety of feed formats for where an article's image is.
I'm happy for the article image to be missing in some cases, and I'll fall ...
0
votes
0
answers
27
views
Serving a webpage content by running a php/python script
I'm trying to set up an RSS feed for my site. So I would like to make a link that takes in a keyword and produces an RSS feed.
I have a Python script (script.py) to generate this XML, but I don't know how to ...
2
votes
1
answer
1k
views
How to web scrape google news headline of a particular year (e.g. news from 2020)
I've been exploring web scraping techniques using Python and RSS feeds, but I'm not sure how to narrow down the search results to a particular year on Google News. Ideally, I'd like to retrieve ...
1
vote
1
answer
40
views
Getting None when trying to parse description tag of RSS feed
So I'm accessing this RSS feed.
As you can see, there is a description tag. When I parse the feed, it returns None for the description tag,
and this is the error message I get:
AttributeError: '...
1
vote
1
answer
65
views
Avoid image div while parsing description tag
I'm parsing an RSS feed with this code:
resp=requests.get(url)
soup = BeautifulSoup(resp.content, features="xml")
soup.prettify()
items = soup.findAll('item')
news_items = []
for item in items:
...
0
votes
0
answers
320
views
Looping through URLs in Python and generating feed with Feedgen
I'm pulling together a bunch of RSS feeds and grabbing items from them based on matching keywords. In the process I'm pulling certain fields (title, description, etc.) but I'd like to pull the title of ...
0
votes
1
answer
69
views
How to get data from website via API or RSS?
I need to extract data from this website: https://nasdaqbaltic.com/statistics/en/news, from 2010-01-01 until now, for the company TeliaLietuva AB. The RSS feed gives only 100 items per page, so it covers just part of that range. How to get ...
-1
votes
1
answer
178
views
Correctly sort RSS items by time
I'm getting RSS items from different RSS channels, and I'd like to sort them correctly by time, taking the time zone into account, from the latest to the oldest. So far, I have the following code:
...
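The asker's code is not shown in the excerpt, so the following is only a sketch of one way to do this kind of sort, not their approach. It assumes each item carries an RFC 822 style `pubDate` string with an explicit UTC offset (the usual RSS 2.0 form). The sample items and the `published_at` helper are made up for illustration.

```python
from email.utils import parsedate_to_datetime

# Hypothetical sample items; real feeds would supply the pubDate strings.
items = [
    {"title": "A", "pubDate": "Mon, 06 Sep 2021 10:00:00 +0200"},
    {"title": "B", "pubDate": "Mon, 06 Sep 2021 09:30:00 +0000"},
    {"title": "C", "pubDate": "Mon, 06 Sep 2021 05:00:00 -0500"},
]


def published_at(item):
    """Parse pubDate into a timezone-aware datetime."""
    return parsedate_to_datetime(item["pubDate"])


# Aware datetimes compare by absolute instant, so differing offsets
# are handled without converting everything to UTC by hand.
newest_first = sorted(items, key=published_at, reverse=True)

for item in newest_first:
    print(item["title"], published_at(item).isoformat())
```

Sorting on the parsed datetimes rather than on the raw strings is what makes the offsets count. Dates without any offset would parse as naive datetimes, and mixing those with aware ones raises a `TypeError` on comparison, so such items would need a default time zone assigned first.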
1
vote
0
answers
123
views
Is there any way to generate links with rel="hub" with the feedgen package in Python?
I generate an RSS feed with the Python package feedgen and I am now trying to implement the PubSubHubbub protocol.
From what I understand here
https://indieweb.org/How_to_publish_and_consume_WebSub
I ...