Try this to access the element's attributes:
imgss.append(image.attrs['src'])
import re
from urllib.request import urlopen

import pandas as pd
from bs4 import BeautifulSoup

listt = [...]  # list of web pages (placeholder)
imgss = []
for i in range(len(listt)):
    print(i)
    html = urlopen(listt[i])
    bs = BeautifulSoup(html, 'html.parser')
    # Keep only <img> tags whose src matches the pattern
    images = bs.find_all('img', {'src': re.compile('PATTERN.jpg')})
    for image in images:
        imgss.append(image['src'])
There are about 1100 Amazon page URLs in listt. The code works for the first few pages, but after that it raises an error: HTTP Error 503: Service Unavailable.
It throws an error? Can you add what kind of error it is?
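A 503 that appears after a number of successful requests often means the server is throttling or refusing the client. One common mitigation, sketched below, is to retry with an increasing delay and to send a browser-like User-Agent header. This is only a sketch under assumptions: the helper name fetch_with_retry, its parameters, and the User-Agent value are hypothetical and not from the original code, and there is no guarantee that Amazon will accept the requests even with these changes.

```python
import time
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_with_retry(url, retries=3, base_delay=1.0, opener=urlopen, sleep=time.sleep):
    """Open url, retrying with exponential backoff when the server answers 503.

    Hypothetical helper: opener and sleep are parameters only so the
    function can be exercised without real network access or waiting.
    """
    # The User-Agent value is an illustrative assumption.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    for attempt in range(retries + 1):
        try:
            return opener(request)
        except HTTPError as err:
            # Re-raise anything that is not a 503, or a 503 on the last attempt.
            if err.code != 503 or attempt == retries:
                raise
            # Wait base_delay, then 2x, then 4x, ... before trying again.
            sleep(base_delay * 2 ** attempt)
```

In the loop from the question, html = urlopen(listt[i]) could then become html = fetch_with_retry(listt[i]). Adding a short pause between successive pages may also help, since 1100 back-to-back requests are likely to be rate limited.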