I duplicated the if count+1 ... statement at the same level as the if product_title not in ... statement, and received the following error:

HTTPConnectionPool(host='127.0.0.1', port=58992): Max retries exceeded with url: /session/e8beed9b-4faa-4e91-a659-56761cb604d7/element (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000022D31378A58>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

It's obvious what the error means, but I'm not sure why adding that one extra if statement would trigger it.
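Roughly, the structure I'm describing looks like the sketch below. It's a simplified reconstruction rather than my exact script: the variable names, the URL, and the body of the count+1 check are placeholders.

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get("https://www.amazon.com/gp/goldbox")  # placeholder URL for the deals page

    names = []
    count = 0
    items = driver.find_elements(By.XPATH, '//div[contains(@id, "100_dealView_")]')

    for item in items:
        product_title = item.text
        if product_title not in names:    # original membership check
            names.append(product_title)
            count += 1
        if count + 1 == len(items):       # duplicated check, now at the same indentation level
            break                         # placeholder condition/body; the real one is elided above

    driver.quit()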

Printing the length of items reveals some strange behaviour too. Instead of always returning 32, which would correspond to the number of items on each page, it prints 32 for the first page, 64 for the second, 96 for the third, and so on. I fixed this by using //div[contains(@id, "100_dealView_")]/div[contains(@class, "dealContainer")] instead of //div[contains(@id, "100_dealView_")] as the XPath for the items variable. I'm hoping this is also the reason it runs into issues on page 9; I'm running tests right now. Update: It is now scraping page 10 and beyond, so the issue is resolved.
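For anyone comparing the two locators, here is a minimal before/after sketch, assuming a plain Chrome driver and leaving out the navigation and pagination logic:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    # ... navigate/crawl to the page being scraped ...

    # Broad locator: its count kept growing by 32 with every page crawled (32, 64, 96, ...)
    old_items = driver.find_elements(By.XPATH, '//div[contains(@id, "100_dealView_")]')

    # Narrowed locator: only the dealContainer children, which stays at 32 per page
    items = driver.find_elements(
        By.XPATH,
        '//div[contains(@id, "100_dealView_")]/div[contains(@class, "dealContainer")]',
    )

    print(len(old_items), len(items))
    driver.quit()

The narrowed XPath seems to avoid whatever extra wrapper divs the broad one was matching, so len(items) stays at the expected 32 per page.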
