Export Excel Spreadsheet From Website - Python

Question

I am trying to find a way to export a Microsoft Excel spreadsheet (.xlsx) from a website and store it locally (on my desktop) or in a database. I can parse a URL with tabular content and display it or write it to a file, but I need a way to retrieve spreadsheet content that is only available by clicking a download button. More importantly, I need to be able to retrieve spreadsheet data embedded within multiple separate pages, as displayed on a webpage. Below is a sample script that displays tabular data from a website.

import urllib3
from bs4 import BeautifulSoup

url = 'https://www.runnersworld.com/races-places/a20823734/these-are-the-worlds-fastest-marathoners-and-marathon-courses/'

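# Fetch the page and parse its HTML
http = urllib3.PoolManager()
response = http.request('GET', url)
# Name the parser explicitly; omitting it triggers a BeautifulSoup warning
soup = BeautifulSoup(response.data.decode('utf-8'), 'html.parser')
print(soup)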

I have inspected the JavaScript that runs when the data is exported manually through a button click, but I need a way to automate this with a Python script. Any assistance is most appreciated.

If you need to click things, you need to use a browser automation tool like Selenium, Puppeteer, or Playwright — Bindestrich, Commented Jul 16, 2024 at 21:35
@Bindestrich - Thanks for the suggestion. After some more research, it appears that Selenium and its web driver could be particularly useful. I will keep digging into this, but this helps! — mdl518, Commented Jul 16, 2024 at 21:58
@SergeyK - I cannot provide the exact URL since it requires a specific certificate, but I am just trying to demonstrate how to click on an option to download a file/store locally. As an example, here is a link to save/download a .mp4 file through a button click (online-video-cutter.com/crop-video) so I am just trying to configure how to use Python to save data stored on websites. — mdl518, Commented Jul 17, 2024 at 16:25
@mdl518 I need this information to help with your question; maybe the button only appears after authentication — Sergey K, Commented Jul 18, 2024 at 19:18

Sergey K · Accepted Answer · 2024-07-23 11:30:39Z

Based on your comment:

@SergeyK - Here is a link to the website with the data. I need to find a way to download the CSV listed under the "Starting Up" section of this URL: browserstack.com/test-on-the-right-mobile-devices

There are three download buttons on the site you mentioned. They all point to the same file, so only one file actually ends up on disk, but the loop below handles every button as an example.

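import urllib.parse

import requests
from bs4 import BeautifulSoup

response = requests.get('https://www.browserstack.com/test-on-the-right-mobile-devices')
# Each download button sits next to a div with the class "download-csv"
for csv_div in BeautifulSoup(response.text, 'lxml').find_all('div', class_='download-csv'):
    # urljoin avoids a doubled slash if the href already starts with "/"
    link = urllib.parse.urljoin('https://www.browserstack.com/', csv_div.find_next('a').get('href'))
    # Decode %20 escapes, strip spaces, and keep the last path segment as the file name
    file_name = urllib.parse.unquote(link).replace(" ", "").split('/')[-1]
    data = requests.get(link)
    with open(file_name, 'wb') as file:
        file.write(data.content)
    # Report only after the file has been written
    print(f'{file_name} saved from {link}')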

OUTPUT

BrowserStack-Listofdevicestoteston.csv saved from https://www.browserstack.com/downloads/BrowserStack%20-%20List%20of%20devices%20to%20test%20on.csv
BrowserStack-Listofdevicestoteston.csv saved from https://www.browserstack.com/downloads/BrowserStack%20-%20List%20of%20devices%20to%20test%20on.csv
BrowserStack-Listofdevicestoteston.csv saved from https://www.browserstack.com/downloads/BrowserStack%20-%20List%20of%20devices%20to%20test%20on.csv
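
Because all three buttons resolve to the same URL, the loop fetches the same file three times. A minimal sketch of collecting the distinct links before downloading is shown here. The helper names `unique_links` and `file_name_from` are illustrative and not part of the original answer, and the sample hrefs simply mirror the output above (the exact form of the href attributes on the page is an assumption).

```python
from urllib.parse import unquote, urljoin, urlsplit

BASE_URL = 'https://www.browserstack.com/'


def unique_links(hrefs, base_url=BASE_URL):
    """Resolve hrefs against base_url and drop duplicates, keeping first-seen order."""
    seen = {}
    for href in hrefs:
        seen.setdefault(urljoin(base_url, href), None)
    return list(seen)


def file_name_from(link):
    """Decode the URL path, strip spaces, and keep the last segment as the file name."""
    path = urlsplit(link).path
    return unquote(path).replace(' ', '').split('/')[-1]


# Assumed sample input: three identical hrefs, as the three buttons would yield.
hrefs = ['downloads/BrowserStack%20-%20List%20of%20devices%20to%20test%20on.csv'] * 3
links = unique_links(hrefs)
names = [file_name_from(link) for link in links]
print(links)
print(names)
```

With the links de-duplicated this way, each file would be requested and written only once.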

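Or, to fetch only the "Starting Up" section, without a loop:

# Locate the "Starting Up" section by its data-trigger attribute
section = BeautifulSoup(response.text, 'lxml').find('div', {'data-trigger': 'startingup'})
link = urllib.parse.urljoin('https://www.browserstack.com/', section.find_next('a').get('href'))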
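
The question also mentions storing the data in a database. As a rough sketch of that step, the downloaded CSV text could be loaded into SQLite with only the standard library. The function name `load_csv_into_sqlite`, the table name `devices`, and the sample rows are all hypothetical. The sketch assumes the file has a header row, that every row has the same number of fields, and that all values can be stored as text.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_text, connection, table='devices'):
    """Create `table` from the CSV header row and insert every data row as text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]
    # Quote identifiers so headers containing spaces remain valid column names.
    columns = ', '.join('"{}" TEXT'.format(name.replace('"', '""')) for name in header)
    placeholders = ', '.join('?' for _ in header)
    connection.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    connection.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', records)
    connection.commit()
    return len(records)


# Hypothetical sample content; the real file's columns may differ.
sample = 'Device,OS Version\nPhone A,14\nPhone B,17\n'
conn = sqlite3.connect(':memory:')
inserted = load_csv_into_sqlite(sample, conn)
print(inserted, conn.execute('SELECT * FROM devices').fetchall())
```

For a real download, the text returned by `requests.get(link).text` could be passed in place of `sample`, and a file path could replace `':memory:'` to keep the database on disk.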

@SergeiK - This solution works great! I also need to extend the concept to clicking a button that opens a pop-up window with three options, including one for "Export Spreadsheet", which prompts the download. Can the script be extended to handle this pop-up? I will otherwise accept your answer as the correct solution, thanks again! — mdl518, Commented Jul 23, 2024 at 20:27
@mdl518 I need to look at the site to tell you how to do it correctly; I can't just say "yes, the script can be extended". Maybe it's impossible without Selenium, or maybe there is an API, or something else… — Sergey K, Commented Jul 24, 2024 at 5:56
@SergeiK - I'm unable to provide the exact URL since it requires a custom certificate to access the page, but I can try to find a similar website that requires the same functionality: clicking a button that opens a pop-up for downloading data. I have otherwise accepted your answer as the correct solution for my initial post, thanks again! — mdl518, Commented Jul 25, 2024 at 11:16
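
For the pop-up case discussed in these comments, a browser automation sketch with Selenium might look roughly like the following. Since the real site was never shared, everything site-specific is an assumption: the two XPath arguments are placeholders for whatever locates the pop-up button and the "Export Spreadsheet" option, Chrome is assumed as the browser, and client-certificate handling is not covered. `wait_for_download` and `export_spreadsheet` are illustrative names.

```python
import time
from pathlib import Path


def wait_for_download(directory, suffix='.xlsx', timeout=30.0, poll=0.5):
    """Return the first file in `directory` ending with `suffix`, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        # Chrome's in-progress files end in ".crdownload", so they are not matched.
        matches = sorted(p for p in Path(directory).iterdir() if p.name.endswith(suffix))
        if matches:
            return matches[0]
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


def export_spreadsheet(url, download_dir, open_popup_xpath, export_xpath):
    """Click the button that opens the pop-up, then the export option (Chrome)."""
    # Imported lazily so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    # Chrome preference that sets where downloads are saved.
    options.add_experimental_option(
        'prefs', {'download.default_directory': str(Path(download_dir).resolve())})
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        wait = WebDriverWait(driver, 15)
        # Both XPaths are placeholders supplied by the caller for the real page.
        wait.until(EC.element_to_be_clickable((By.XPATH, open_popup_xpath))).click()
        wait.until(EC.element_to_be_clickable((By.XPATH, export_xpath))).click()
        return wait_for_download(download_dir)
    finally:
        driver.quit()
```

A call would pass the page URL, a download folder, and the two locators found by inspecting the page, for example `export_spreadsheet(url, 'downloads', "//button[@id='export']", "//a[text()='Export Spreadsheet']")`, where both XPath strings are made up for illustration.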
