I am trying to get the image links of products from a website. This works for some products but not for others. In the code below, url1 works but url2 raises "json.decoder.JSONDecodeError". I think the problem is that my regular expression does not capture the complete JSON string, and I am not good at regular expressions. How can I extract the JSON string reliably?

Code

import re,json,requests
url1 =  "https://www.trendyol.com/samsung/akilli-smart-air-sihirli-led-tv-televizyon-kumandasi-yerine-tuslu-kumanda-1078-p-43447565?boutiqueId=61&merchantId=384846"
url2 = "https://www.trendyol.com/samsung/k-ve-m-serisi-uyumlu-led-lcd-tv-akilli-kumandasi-bn59-01259b-p-45735139?boutiqueId=61&merchantId=115135"
r = requests.get(url2)
data = json.loads(re.search(r'PRODUCT_DETAIL_APP_INITIAL_STATE__=(.*?);', r.text).group(1))
images = ['https://www.trendyol.com' + img for img in data['product']['images']]
print(images)

2 Answers

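Your non-greedy pattern (.*?); stops at the first semicolon after the marker. On url2 that semicolon presumably falls inside the JSON itself (for example, inside a string value), so json.loads receives a truncated string and raises JSONDecodeError. The following regex works for both of your URLs because it only terminates at "}});", that is, at the end of the nested dictionaries and before the start of the next block.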

import re,json,requests

url1 =  "https://www.trendyol.com/samsung/akilli-smart-air-sihirli-led-tv-televizyon-kumandasi-yerine-tuslu-kumanda-1078-p-43447565?boutiqueId=61&merchantId=384846"
url2 = "https://www.trendyol.com/samsung/k-ve-m-serisi-uyumlu-led-lcd-tv-akilli-kumandasi-bn59-01259b-p-45735139?boutiqueId=61&merchantId=115135"

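for url in [url1, url2]:
    r = requests.get(url)
    # Capture everything up to the closing "}}" that directly precedes ";",
    # rather than stopping at the first ";" found in the page.
    data = json.loads(re.search(r'PRODUCT_DETAIL_APP_INITIAL_STATE__=(.*?\}\});', r.text).group(1))
    # Image paths in the JSON are relative, so prepend the site root.
    images = ['https://www.trendyol.com' + img for img in data['product']['images']]
    print(images)
    print()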
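
The regex above still relies on the object ending in "}});". As a minimal sketch of an alternative, assuming the page embeds the state as PRODUCT_DETAIL_APP_INITIAL_STATE__= followed directly by a JSON object, you can locate the marker and let json.JSONDecoder.raw_decode parse exactly one JSON value from that position. The extract_state helper and the sample HTML below are hypothetical and only illustrate the idea; they are not taken from the real page.

```python
import json

MARKER = "PRODUCT_DETAIL_APP_INITIAL_STATE__="


def extract_state(html, marker=MARKER):
    """Return the JSON value that immediately follows `marker` in `html`.

    Assumes there is no whitespace between the marker and the JSON value.
    """
    start = html.index(marker) + len(marker)  # ValueError if the marker is absent
    # raw_decode parses one complete JSON value starting at `start` and
    # ignores whatever follows it, so a ';' inside a string is harmless.
    data, _end = json.JSONDecoder().raw_decode(html, start)
    return data


# Hypothetical page fragment. Note the ';' inside the "name" string,
# which is where a non-greedy (.*?); pattern would cut the match short.
sample_html = (
    '<script>window.__PRODUCT_DETAIL_APP_INITIAL_STATE__='
    '{"product":{"name":"TV remote; K and M series",'
    '"images":["/a/1.jpg","/a/2.jpg"]}};window.OTHER={};</script>'
)

data = extract_state(sample_html)
images = ["https://www.trendyol.com" + img for img in data["product"]["images"]]
print(images)
```

With a real page you would pass r.text to extract_state instead of sample_html. The trade-off is that this approach depends on the marker text rather than on how the object happens to be terminated.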

You can try this:

import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0',
}

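url = 'https://www.trendyol.com/samsung/k-ve-m-serisi-uyumlu-led-lcd-tv-akilli-kumandasi-bn59-01259b-p-45735139?boutiqueId=61&merchantId=115135'
r = requests.get(url, headers=headers)  # send the browser-like User-Agent defined above
soup = BeautifulSoup(r.text, 'html.parser')  # name the parser explicitly

# Print the source URL of every <img> tag that has one.
for img in soup.find_all('img'):
    src = img.get('src')
    if src:
        print(src)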

1 Comment

Thank you. I tried that approach before, but it only returns low-resolution images. I need the high-resolution images of the product.
