Solution 1 :

Try this to access the element's attributes:

imgss.append(image.attrs['src'])
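For reference, here is a minimal sketch of how attribute access behaves on a BeautifulSoup tag. The HTML string is made up for illustration and is not from the question. As far as I know, image['src'] and image.attrs['src'] read the same underlying dictionary, so both raise KeyError when the attribute is missing, while .get() returns None instead.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration.
html = '<img src="a.jpg"><img alt="no source">'
soup = BeautifulSoup(html, 'html.parser')

imgss = []
for image in soup.find_all('img'):
    # image['src'] and image.attrs['src'] read the same dictionary;
    # .get() returns None instead of raising KeyError when src is absent.
    src = image.attrs.get('src')
    if src is not None:
        imgss.append(src)

print(imgss)
```

Using .get() only matters when some img tags may lack a src attribute; with the src filter in find_all, every match already has one.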

Problem :

import pandas as pd

listt = []  # placeholder: fill in the list of web page URLs

from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

imgss = []
for i in range(len(listt)):
    print(i)
    html = urlopen(listt[i])
    bs = BeautifulSoup(html, 'html.parser')
    images = bs.find_all('img', {'src':re.compile('PATTERN.jpg')})
    for image in images:
        imgss.append(image['src'])

There are about 1100 URLs of Amazon web pages in listt. The code works for the first few pages, but after that it throws an error: HTTP Error 503: Service Unavailable.
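A 503 that appears after a number of successful requests often means the server is throttling or rejecting the client. One common way to cope, sketched below, is to retry with an increasing delay and to send an explicit User-Agent header. This is only a sketch under assumptions: fetch_with_retry, its parameters, and the User-Agent value are hypothetical names chosen for illustration, and nothing in the thread confirms that this resolves the error for these pages.

```python
import time
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_with_retry(url, opener=urlopen, retries=3, base_delay=1.0,
                     sleep=time.sleep):
    """Fetch url, retrying on HTTP 503 with exponential backoff."""
    # The User-Agent value is an assumption; some servers reject urllib's default.
    request = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    for attempt in range(retries + 1):
        try:
            return opener(request)
        except HTTPError as err:
            # Re-raise anything other than 503, or a 503 on the final attempt.
            if err.code != 503 or attempt == retries:
                raise
            # Wait base_delay, then 2x, 4x, ... before trying again.
            sleep(base_delay * 2 ** attempt)
```

In the loop from the question, html = urlopen(listt[i]) could then become html = fetch_with_retry(listt[i]). The opener and sleep parameters exist only so the function can be exercised without network access.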

Comments

Comment posted by Poojan

It throws an error? Can you add what kind of error?
