I am trying to scrape a few things from the website (URL in the code). I am able to scrape the brand name and the SDR, but anything below the SDR I cannot seem to scrape. I am only testing this on the first result; once I manage to figure it out I will make it dynamic. Hopefully one just needs Selenium and the Chrome driver in their project, and they can then copy/paste this code.

The below code gives the following error:

Exception in thread "main" org.openqa.selenium.TimeoutException: Expected condition failed: waiting for visibility of element located by By.xpath: /html/body/app-root/ecl-app/div[2]/app-search-page/app-search-container/div/div/section/div/app-elec-display-search-result/app-search-result/eui-block-content/div/app-search-result-item[1]/article/div[2]/div/app-elec-display-search-result-parameters/app-search-parameter-item[4]/div[2]/div/div[2]/div/div[1]/span (tried for 10 second(s) with 500 milliseconds interval)

Code:

public void scrape() throws InterruptedException {
    System.out.println("Starting Scrape!");
    String url = "https://eprel.ec.europa.eu/screen/product/electronicdisplays";

    WebDriver driver = new ChromeDriver();
    WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    driver.get(url);
    driver.manage().window().maximize();
    wait.until(ExpectedConditions.presenceOfElementLocated(By.className("eui-block-content__wrapper")));
    //The results have been loaded now

    //Click on the accept-cookies banner:
    new WebDriverWait(driver, Duration.ofSeconds(3))
            .until(ExpectedConditions.elementToBeClickable(By.linkText("Accept all cookies"))).click();

    String moreButton = "/html/body/app-root/ecl-app/div[2]/app-search-page/app-search-container/div/div/section/div/app-elec-display-search-result/app-search-result/eui-block-content/div/app-search-result-item[1]/article/div[3]/div/a";
    String xPathBrandName =     "/html/body/app-root/ecl-app/div[2]/app-search-page/app-search-container/div/div/section/div/app-elec-display-search-result/app-search-result/eui-block-content/div/app-search-result-item[1]/article/div[1]/div/div/div[1]/span[1]";
    String xPathSDR =           "/html/body/app-root/ecl-app/div[2]/app-search-page/app-search-container/div/div/section/div/app-elec-display-search-result/app-search-result/eui-block-content/div/app-search-result-item[1]/article/div[2]/div/app-elec-display-search-result-parameters/app-search-parameter-item[3]/div[1]/div/div[2]/div/div[1]/span";
    String energyRatingString = "/html/body/app-root/ecl-app/div[2]/app-search-page/app-search-container/div/div/section/div/app-elec-display-search-result/app-search-result/eui-block-content/div/app-search-result-item[1]/article/div[2]/div/app-elec-display-search-result-parameters/app-search-parameter-item[4]/div[2]/div/div[2]/div/div[1]/span";

    //Clicking on more button to load more results to be visible
    driver.findElement(By.xpath(moreButton)).click();

    WebElement SDR = driver.findElement(By.xpath(xPathSDR));

    //Using this logic to scroll to each of the result so it's visible on the web-page
    JavascriptExecutor js = (JavascriptExecutor) driver;
    js.executeScript("arguments[0].scrollIntoView();", SDR);

    WebElement brandName = driver.findElement(By.xpath(xPathBrandName));
    WebElement energyRating = wait.until(ExpectedConditions.visibilityOfElementLocated(By.xpath(energyRatingString)));

    System.out.println("Brand name: " + brandName.getText());
    System.out.println("SDR name: " + SDR.getText());
    System.out.println("energyRating: " + energyRating.getText());
}

But replacing

WebElement energyRating = wait.until(ExpectedConditions.visibilityOfElementLocated(By.xpath(energyRatingString)));

with

WebElement energyRating = driver.findElement(By.xpath(energyRatingString));

gives the following output:

Starting Scrape!
Brand name: Samsung
SDR name: 63
energyRating: 

So I'm puzzled as to why energyRating comes back empty instead of throwing a NoSuchElementException.
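
A quick diagnostic I could run (a sketch in the same style, reusing driver and energyRatingString from the code above) would check whether the element is in the DOM but simply not displayed:

WebElement check = driver.findElement(By.xpath(energyRatingString)); // no exception, so it is present in the DOM
System.out.println("Displayed: " + check.isDisplayed());
System.out.println("Text: '" + check.getText() + "'"); // empty text from a present element hints that it is hidden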

1 Answer

The problem you were running into is that there are two copies of each of these fields, one visible and one hidden. Your XPath pointed to one of the hidden elements, and because it never becomes visible, the wait timed out. That also explains the empty output: findElement() still located the hidden element (so no NoSuchElementException), but getText() only returns visible text, so it came back empty.
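
If you want to keep locating elements straight from the driver, one workaround (a sketch; the selector reuses the label and class names from my code below, and the stream filter is my own suggestion) is to collect both copies after expanding the card and keep the one that is actually displayed:

List<WebElement> copies = driver.findElements(By.cssSelector(
        "app-search-parameter-item[label='field.electronic-display.energyClassHDR'] span.ecl-u-type-bold"));
// Only the rendered copy has visible text, so filter on isDisplayed()
WebElement visibleCopy = copies.stream()
        .filter(WebElement::isDisplayed)
        .findFirst()
        .orElseThrow();
System.out.println("energyRating: " + visibleCopy.getText());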

I wrote my own code to accomplish the task as you described.

String url = "https://eprel.ec.europa.eu/screen/product/electronicdisplays";

WebDriver driver = new ChromeDriver();
driver.manage().window().maximize();
driver.get(url);

// Each product card is an <app-search-result-item>; grab them all and iterate
List<WebElement> results = driver.findElements(By.cssSelector("app-search-result-item"));
String brandName = "";
String sdr = "";
String energyRating = "";
for (WebElement result : results) {
    // The leading "." scopes the XPath to this card; a bare "//" would search the whole page
    result.findElement(By.xpath(".//a[text()=' More ']")).click();
    brandName = result.findElement(By.cssSelector("span.ecl-u-type-2xl")).getText();
    sdr = result.findElement(By.cssSelector("app-search-parameter-item[label='field.electronic-display.powerOnModeSDRV2'] div.ecl-u-d-l-block span.ecl-u-type-bold")).getText();
    energyRating = result.findElement(By.cssSelector("app-search-parameter-item[label='field.electronic-display.energyClassHDR'] div.ecl-u-d-l-block span.ecl-u-type-bold")).getText();
    // Collapse the card again so the next iteration starts from the same state
    result.findElement(By.xpath(".//a[text()=' Less ']")).click();

    System.out.println("Brand name: " + brandName);
    System.out.println("SDR name: " + sdr);
    System.out.println("energyRating: " + energyRating);
}

and it outputs

Brand name: Samsung
SDR name: 63
energyRating: G
Brand name: Samsung
SDR name: 63
energyRating: G
...

Some feedback...

  1. Absolute XPaths (ones that start at /html), excessively long locators (many element levels deep), and indices (/div[2], etc.) are risky because the smallest change to the page will break them. I'm sure that's the easiest place to start when you're new, but if you plan to continue writing scripts, learning to craft your own locators will be invaluable (see the sketch after this list).
  2. When you are going to scrape a page like this with repeating sections, find the uppermost element that contains each section, e.g. <app-search-result-item>. Grab those in a list you can iterate over; that will make tasks like this easier. In each loop iteration, start your search from that anchor element so that you only find data that applies to that product. That's why my code calls result.findElement() so often: result is the product card I'm currently looping through.
  3. Note that I never scrolled the page. In general you don't need to; Selenium will take care of that for you.
  4. While WebDriverWait is a good practice, it isn't always required.
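
For example, the brand name from the question could be located relative to the first product card instead of by absolute path (a sketch; the class names come from my code above, and the wait just shows the usual explicit-wait pattern from point 4):

WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
// Wait for the first product card, then search within it
WebElement firstResult = wait.until(ExpectedConditions.visibilityOfElementLocated(
        By.cssSelector("app-search-result-item")));
// Short, relative locator that survives layout changes around the card
String brand = firstResult.findElement(By.cssSelector("span.ecl-u-type-2xl")).getText();
System.out.println("Brand name: " + brand);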
