All Questions

0 votes
1 answer
114 views

How to Scrape Google RssFeed Links?

I'm trying to scrape the links found in Google RSS feeds for a given country. The links are in the XML returned when you visit this URL: https://news.google.com/rss/search?q={"...
Jack Holly's user avatar
0 votes
1 answer
32 views

I would like to implement case-sensitive search in Elasticsearch and I don't know how

I have made a kind of RSS aggregator that stores news RSS feeds in Elasticsearch, and I would like to perform case-sensitive searches. From what I have read (I am new to ES), I've seen that by default ES is ...
msp12's user avatar
  • 5
0 votes
1 answer
31 views

Rumble RSS feed - gives 403 when called with a python urllib.request

The line of code failing is req = urllib.request.Request(url). A Rumble RSS feed address that works fine when typed into a browser address bar returns a '403 - authorisation will not help'. A ...
pperrin's user avatar
  • 1,497
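
A 403 that appears only from urllib.request is often caused by the default Python User-Agent being rejected, though that is an assumption here since the excerpt is cut off. A minimal sketch, where the header value and the helper names build_request and fetch are hypothetical:

```python
import urllib.request

# Assumption: a browser-like User-Agent is what the server wants.
# The exact string is illustrative, not a documented requirement.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_request(url: str) -> urllib.request.Request:
    """Build a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": BROWSER_USER_AGENT})


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download the feed body; raises urllib.error.HTTPError on 4xx/5xx."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

If the server still answers 403 with this header, the block is probably based on something else (cookies, IP address, bot protection), and a header change alone will not help.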
0 votes
0 answers
58 views

Using Python to create an RSS feed - but would like to display the feed locally

I've created a project that creates an RSS feed. This RSS feed will be sent to a company that will wrap it and create their own prettier RSS feed with it, but is there a tool or extension that devs ...
Akkkk's user avatar
  • 1
2 votes
3 answers
660 views

How to handle Google consent page when scraping article data using Google News RSS links?

I have a list of Google News links from a Google RSS feed and I would like to get the full text of those articles. I use the BeautifulSoup library to scrape the data; however, it seems that Google redirects to ...
Jurgita-ds's user avatar
0 votes
3 answers
777 views

UnicodeEncodeError: 'utf-8' codec can't encode character '\ud83c' in position 0: surrogates not allowed

I am trying to parse "https://tre.tbe.taleo.net/tre01/ats/servlet/Rss?org=arobpers2&cws=42" but I am getting the error "UnicodeEncodeError: 'utf-8' codec can't encode character '\...
Asher Ross's user avatar
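
This error usually means the string holds UTF-16 surrogate halves (such as '\ud83c') as separate code points instead of one real character. A minimal sketch of one common repair, assuming the halves arrive in valid pairs; the helper name merge_surrogate_pairs is hypothetical:

```python
def merge_surrogate_pairs(text: str) -> str:
    """Join high/low surrogate halves into the characters they encode.

    'surrogatepass' lets the UTF-16 encoder write the halves as-is, and the
    decoder then reads each pair back as a single code point. Unpaired halves
    are replaced rather than raising (errors='replace').
    """
    return text.encode("utf-16", "surrogatepass").decode("utf-16", "replace")


broken = "\ud83c\udf89 party"          # two surrogate halves, then plain text
fixed = merge_surrogate_pairs(broken)  # one emoji code point, then plain text
encoded = fixed.encode("utf-8")        # no longer raises UnicodeEncodeError
```

Whether the feed really contains paired surrogates cannot be confirmed from the truncated excerpt, so treat this as a starting point for inspection.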
0 votes
0 answers
26 views

Extracting images out of multiple different RSS feeds

I'm working on a project that pulls articles from 20+ RSS feeds, which use a variety of formats for where an article's image is stored. I'm happy for the article image to be missing in some cases, and I'll fall ...
AlexAntra's user avatar
0 votes
0 answers
27 views

Serving a webpage content by running a php/python script

I'm trying to set up an RSS feed for my site, so I would like to make a link that takes in a keyword and produces an RSS feed. I have a Python script (script.py) to generate this XML, but I don't know how to ...
Ben Fishbein's user avatar
2 votes
1 answer
1k views

How to web scrape Google News headlines from a particular year (e.g. news from 2020)

I've been exploring web scraping techniques using Python and RSS feeds, but I'm not sure how to narrow down the search results to a particular year on Google News. Ideally, I'd like to retrieve ...
Charmi Divecha's user avatar
1 vote
1 answer
40 views

Getting None when trying to parse description tag of RSS feed

I'm accessing this RSS feed and, as you can see, there is a description tag. When I parse the feed, it returns None for the description tag, and this is the error message I get: AttributeError: '...
NoIdea's user avatar
  • 33
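
A None result for a tag usually means the lookup did not find that element in a given item, and calling an attribute on None then raises AttributeError. A minimal sketch of a guarded lookup, using the standard library's xml.etree.ElementTree rather than the asker's parser; SAMPLE_FEED and item_descriptions are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the second item has no <description> element.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>With description</title><description>Hello</description></item>
<item><title>No description</title></item>
</channel></rss>"""


def item_descriptions(xml_text: str) -> list[tuple[str, str]]:
    """Return (title, description) pairs, using '' when a tag is missing."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        # findtext returns the default instead of None when the tag is absent.
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        results.append((title, description))
    return results
```

Calling item_descriptions(SAMPLE_FEED) yields an empty string for the item without a description instead of failing.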
1 vote
1 answer
65 views

Avoid image div while parsing description tag

I'm parsing an RSS feed with this code: resp = requests.get(url) soup = BeautifulSoup(resp.content, features="xml") soup.prettify() items = soup.findAll('item') news_items = [] for item in items: ...
NoIdea's user avatar
  • 33
0 votes
0 answers
320 views

Looping through URLs in Python and generating feed with Feedgen

I'm pulling together a bunch of RSS feeds and grabbing items from them based on matching keywords. In the process I'm pulling certain fields (title, description, etc.), but I'd like to pull the title of ...
Remi Castonguay's user avatar
0 votes
1 answer
69 views

How to get data from website via API or RSS?

I need to extract data from this website: https://nasdaqbaltic.com/statistics/en/news, from 2010-01-01 until now. The RSS feed gives 100 items per page, so it covers only part of the data. The company is TeliaLietuva AB. How to get ...
jency's user avatar
  • 1
-1 votes
1 answer
178 views

Correctly sort RSS items by time

I'm getting RSS items from different RSS channels, and I'd like to sort them correctly by time, taking the time zone into account, from the latest to the oldest. So far, I have the following code: ...
xralf's user avatar
  • 3,342
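
Sorting by time across zones can be sketched by parsing each pubDate into a timezone-aware datetime and comparing those. This is a minimal sketch, not the asker's code: it assumes RFC 822-style pubDate strings with an explicit numeric offset, and the item dicts and the name sort_newest_first are hypothetical:

```python
from email.utils import parsedate_to_datetime


def sort_newest_first(items: list[dict]) -> list[dict]:
    """Sort items by their 'pubDate' string, newest first.

    parsedate_to_datetime returns an aware datetime when the string carries
    a numeric offset, so items from different zones compare correctly.
    """
    return sorted(
        items,
        key=lambda item: parsedate_to_datetime(item["pubDate"]),
        reverse=True,
    )


# Hypothetical items from three feeds in different time zones.
items = [
    {"title": "A", "pubDate": "Mon, 01 Jan 2024 10:00:00 +0200"},  # 08:00 UTC
    {"title": "B", "pubDate": "Mon, 01 Jan 2024 09:00:00 +0000"},  # 09:00 UTC
    {"title": "C", "pubDate": "Mon, 01 Jan 2024 03:30:00 -0500"},  # 08:30 UTC
]
ordered_titles = [item["title"] for item in sort_newest_first(items)]
```

Feeds that omit the offset or use another date format (for example Atom's ISO 8601 timestamps) would need a different parser before sorting.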
1 vote
0 answers
123 views

Is there any way to generate links with rel="hub" with the feedgen package in Python?

I generate an RSS feed with the Python package feedgen and I am now trying to implement the PubSubHubbub protocol. From what I understand here https://indieweb.org/How_to_publish_and_consume_WebSub I ...
Loan75's user avatar
  • 27
