
I'm trying to log in to this website and retrieve fund status data: https://fundfinder.panfoundation.org

I need to log in to the website (it is open to the public; anybody can create a username and password to sign in), scroll down, click on 'Prostate Cancer', and then retrieve the status of every fund (locked or unlocked). I'm using Selenium for the web scraping.

This is my code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time

# Set up Chrome options for headless mode
chrome_options = Options()
chrome_options.add_argument("--headless") 
chrome_options.add_argument("--no-sandbox") 
chrome_options.add_argument("--disable-dev-shm-usage")  

# Set up the WebDriver
service = Service('/usr/local/bin/chromedriver')  # Update with the path to your ChromeDriver
driver = webdriver.Chrome(service=service, options=chrome_options)

# Open the website
HOME_PAGE = 'https://fundfinder.panfoundation.org'
driver.get(HOME_PAGE)
# Wait for the page to load
time.sleep(2)

# Find the username and password fields and log in
username_field = driver.find_element(By.NAME, 'email')  
password_field = driver.find_element(By.NAME, 'phrase')  

username_field.send_keys('Your_UserName')  # Replace with your username
password_field.send_keys('Your_Password')    # Replace with your password
password_field.send_keys(Keys.RETURN)        # Press Enter to log in

What I'm struggling with is the next step. After logging in, I need to scroll down and click the 'Prostate Cancer' link. I tried many different approaches, such as the XPath below, but nothing works.

data_element = driver.find_element(By.XPATH, '//div[@class="data"]')

I'm new to Selenium. Does anyone know how I can achieve the tasks below?

  1. Click the 'Prostate Cancer' link after logging in

  2. Retrieve the 5 fund statuses (currently all locked, meaning unavailable)

Thank you so much in advance!

  • Is fundfinder.panfdoundation.org the correct link within the question or fundfinder.panfoundation.org within your code block or something else? However none opens up from my system (APAC) Commented Jan 12 at 22:30
  • click on link 'Prostate Cancer' after log in & retrieve the 5 fund status (currently all locked - means unavailable) : Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question. Commented Jan 12 at 22:36
  • sorry, there was a spelling error. I corrected it. fundfinder.panfoundation.org Commented Jan 12 at 22:37
  • Some of the places where you have to click are cryptic. Think of it like a layered image in Photoshop, with several layers of transparent paper over the sheet that holds the drawing (the input or button tag). Even though a user sees themselves clicking the button or the input field, it may be one of the outer tags wrapped around it for styling that actually triggers the button click or focuses the input field. Commented Jan 14 at 11:55

1 Answer

The login approach might be the problem.

  • Submitting the form with Keys.RETURN or Keys.ENTER, as in password_field.send_keys(Keys.RETURN), may not work. When I tried it, it did not log me in, and I had to click the Log In button instead.

I used:

wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[onclick='return valLoginForm();']"))).click()

Here's the working solution:

from selenium.webdriver import Chrome, ChromeOptions
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

options = ChromeOptions()

options.add_argument("--start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)

driver = Chrome(options=options)
wait = WebDriverWait(driver, 10)

HOME_PAGE = 'https://fundfinder.panfoundation.org'
driver.get(HOME_PAGE)
# Wait for login form
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.login-wrap")))
# email_address
driver.find_element(By.CSS_SELECTOR, 'input#email').send_keys("your_email_id")
# password
driver.find_element(By.CSS_SELECTOR, 'input#phrase').send_keys("your_password")
# Click login when clickable
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[onclick='return valLoginForm();']"))).click()

# Wait for fund list container after login
fund_list_container = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div#funds-tab>div>ul#fundList")))
# Click a fund item 'prostate cancer'(example)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'li[data-srch="prostate cancer"]'))).click()
# Wait for detail wrapper to render
fund_details = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.fund-detail-wrapper")))

# Extract data
data = []
for detail in fund_details.find_elements(By.CSS_SELECTOR, 'span.fund-detail'):
    fund_link = detail.find_element(By.CSS_SELECTOR, 'span.fund-link a')
    # The icon's class attribute encodes the status: "fa fa-lock" or "fa fa-lock-open"
    fund_status = detail.find_element(By.CSS_SELECTOR, 'span.fund-open i').get_attribute("class")
    fund_name = detail.find_element(By.CSS_SELECTOR, 'span.fund-name').text

    # find_element raises NoSuchElementException when nothing matches (it never
    # returns None), so the link is read directly without a truthiness check
    row = {
        "fund_link": fund_link.get_attribute("href"),
        "fund_link_title": fund_link.text.strip(),
        "fund_status": "open" if "open" in fund_status else "lock",
        "fund_name": fund_name,
    }
    data.append(row)

print(data)

Output:

[
  {
    'fund_link': 'https://www.panfoundation.org/find-disease-fund/',
    'fund_link_title': 'PAN Foundation',
    'fund_status': 'lock',
    'fund_name': 'Prostate cancer - Copay'
  },
  {
    'fund_link': 'https://www.cancercare.org/copayfoundation#',
    'fund_link_title': 'CancerCare Co-Payment Assistance Foundation',
    'fund_status': 'lock',
    'fund_name': 'Prostate Cancer - Copay'
  },
  {
    'fund_link': 'https://www.healthwellfoundation.org/disease-funds/',
    'fund_link_title': 'HealthWell Foundation',
    'fund_status': 'lock',
    'fund_name': 'Prostate Cancer - Medicare Access - Copay'
  },
  {
    'fund_link': 'https://www.copays.org/funds',
    'fund_link_title': 'Patient Advocate Foundation Co-Pay Relief',
    'fund_status': 'lock',
    'fund_name': 'Prostate Cancer - Copay'
  },
  {
    'fund_link': 'https://www.copays.org/funds',
    'fund_link_title': 'Patient Advocate Foundation Co-Pay Relief',
    'fund_status': 'lock',
    'fund_name': 'Prostate Cancer Health Equity Fund - Copay'
  },
  {
    'fund_link': 'https://tafcares.org/program-listing/',
    'fund_link_title': 'The Assistance Fund',
    'fund_status': 'lock',
    'fund_name': 'Prostate Cancer - Copay'
  }
]
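
Since the result is a plain list of dictionaries, it can be post-processed without Selenium. The following is only a minimal sketch: the `summarize` helper is a hypothetical name, and the sample rows are three entries copied from the output above and trimmed to the two fields used here.

```python
from collections import Counter

# Sample rows copied from the output above (trimmed to the two fields used here)
rows = [
    {"fund_link_title": "PAN Foundation", "fund_status": "lock"},
    {"fund_link_title": "HealthWell Foundation", "fund_status": "lock"},
    {"fund_link_title": "The Assistance Fund", "fund_status": "lock"},
]


def summarize(rows):
    """Count rows per status and list the foundations whose fund is open."""
    counts = Counter(row["fund_status"] for row in rows)
    open_funds = [row["fund_link_title"] for row in rows if row["fund_status"] == "open"]
    return dict(counts), open_funds


counts, open_funds = summarize(rows)
print(counts)      # {'lock': 3}
print(open_funds)  # []
```

With the scraper above, the same function could be called on `data` instead of the sample `rows`.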

The solution above should be self-explanatory, since each step is commented.

Notes:

  1. The full list of disease categories is loaded into the DOM after login, so there is no need to scroll.

  2. Similarly, to get the details of another disease category from the fund list, change the name in this line:

wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'li[data-srch="prostate cancer"]'))).click()

For example, for Batten Disease:

wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'li[data-srch="batten disease"]'))).click()
  3. The fund status is mostly either open or lock, and it can be read from the icon's class name: fa fa-lock if the status is lock and fa fa-lock-open if the status is open.
  4. As others mentioned, the URL does not seem to open from the APAC region; I had to use a VPN to open and test it.
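
The notes about changing the disease name and reading the status from the class name can be wrapped in two small helpers. This is a sketch, not part of the tested solution: `fund_selector` and `status_from_icon_class` are hypothetical names, and the assumption that data-srch always holds the lowercased disease name is based only on the two examples above.

```python
from typing import Optional


def fund_selector(disease_name: str) -> str:
    """Build the CSS selector for a disease entry in the fund list.

    Assumption: data-srch holds the lowercased disease name, as in the two
    examples above ("prostate cancer", "batten disease").
    """
    value = disease_name.strip().lower()
    # Escape backslashes and double quotes so the attribute value stays valid CSS
    value = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'li[data-srch="{value}"]'


def status_from_icon_class(class_attr: Optional[str]) -> str:
    """Map the icon's class attribute to "open", "lock", or "unknown"."""
    tokens = (class_attr or "").split()
    if "fa-lock-open" in tokens:
        return "open"
    if "fa-lock" in tokens:
        return "lock"
    # The status is only "mostly" open or lock, so anything else is reported as unknown
    return "unknown"


print(fund_selector("Batten Disease"))            # li[data-srch="batten disease"]
print(status_from_icon_class("fa fa-lock"))       # lock
print(status_from_icon_class("fa fa-lock-open"))  # open
```

The string returned by `fund_selector` can be passed to By.CSS_SELECTOR in place of the hard-coded selector. Matching on whole class tokens instead of the substring "open" avoids a false positive if some unrelated class name happens to contain that word.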

2 Comments

Maybe a working code block, but please stop encouraging requests like "anybody can create username and pwd to sign in". Instead, ask the OP to provide the text-based HTML within the question itself so the solution is helpful to future visitors. In its current form, the question wouldn't be of any help to future visitors.
Perfect! Your solution now works! Thank you for your help and all the detailed explanations. Learned a lot.
