Solution 1 :

You can solve this with Python's built-in str.replace(), as described here:

How to use string.replace() in python 3.x

string.replace(oldvalue, newvalue)

In your situation:

yourHtmlContainer = """<body><img alt="Images may be two-dimensional, such as a photograph or screen display, or three-dimensional, such as a statue or hologram. They may be captured by optical devices – such as cameras, mirrors, lenses, telescopes, microscopes, etc. and natural objects and phenomena, such as the human eye or water." height="333" src="https://tvfcommunity-dev-ed--c.documentforce.com/servlet/rtaImage?eid=ka02v000001BOfL&amp;feoid=00N2v00000Rjh9i&amp;refid=0EM2v000002ijZG" width="500"><br>Images may be two-<a href="https://en.wikipedia.org/wiki/Dimensional" target="_blank" title="Dimensional">dimensional</a>, such as a&nbsp;<a href="https://en.wikipedia.org/wiki/Photograph" target="_blank" title="Photograph">photograph</a>&nbsp;or screen display, or three-dimensional, such as a&nbsp;<a href="https://en.wikipedia.org/wiki/Statue" target="_blank" title="Statue">statue</a>&nbsp;or&nbsp;<a href="https://en.wikipedia.org/wiki/Hologram" target="_blank" title="Hologram">hologram</a>. </body>"""
print("Before replace")
print(yourHtmlContainer)


newHtml = yourHtmlContainer.replace("tvfcommunity-dev-ed--c.documentforce.com", "globalcommunity.networks.com")
print("After replace")
print(newHtml)

Output:

Before replace
<body><img alt="Images may be two-dimensional, such as a photograph or screen display, or three-dimensional, such as a statue or hologram. They may be captured by optical devices – such as cameras, mirrors, lenses, telescopes, microscopes, etc. and natural objects and phenomena, such as the human eye or water." height="333" src="https://tvfcommunity-dev-ed--c.documentforce.com/servlet/rtaImage?eid=ka02v000001BOfL&amp;feoid=00N2v00000Rjh9i&amp;refid=0EM2v000002ijZG" width="500"><br>Images may be two-<a href="https://en.wikipedia.org/wiki/Dimensional" target="_blank" title="Dimensional">dimensional</a>, such as a&nbsp;<a href="https://en.wikipedia.org/wiki/Photograph" target="_blank" title="Photograph">photograph</a>&nbsp;or screen display, or three-dimensional, such as a&nbsp;<a href="https://en.wikipedia.org/wiki/Statue" target="_blank" title="Statue">statue</a>&nbsp;or&nbsp;<a href="https://en.wikipedia.org/wiki/Hologram" target="_blank" title="Hologram">hologram</a>. </body>
After replace
<body><img alt="Images may be two-dimensional, such as a photograph or screen display, or three-dimensional, such as a statue or hologram. They may be captured by optical devices – such as cameras, mirrors, lenses, telescopes, microscopes, etc. and natural objects and phenomena, such as the human eye or water." height="333" src="https://globalcommunity.networks.com/servlet/rtaImage?eid=ka02v000001BOfL&amp;feoid=00N2v00000Rjh9i&amp;refid=0EM2v000002ijZG" width="500"><br>Images may be two-<a href="https://en.wikipedia.org/wiki/Dimensional" target="_blank" title="Dimensional">dimensional</a>, such as a&nbsp;<a href="https://en.wikipedia.org/wiki/Photograph" target="_blank" title="Photograph">photograph</a>&nbsp;or screen display, or three-dimensional, such as a&nbsp;<a href="https://en.wikipedia.org/wiki/Statue" target="_blank" title="Statue">statue</a>&nbsp;or&nbsp;<a href="https://en.wikipedia.org/wiki/Hologram" target="_blank" title="Hologram">hologram</a>. </body>

For more help:
https://www.w3schools.com/python/ref_string_replace.asp
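
Note that str.replace() rewrites every occurrence of the old domain in the string, including plain text and iframe src values. If only img src attributes should change and a standard-library-only approach is preferred, the replacement can be limited to the src attribute of img tags. The following is a minimal sketch, not a robust HTML parser: the names OLD_HOST, NEW_HOST, IMG_TAG, SRC_ATTR, _swap_domain and replace_img_src_domain are hypothetical, and it assumes that src values are quoted and that no attribute value inside an img tag contains a literal ">" or text that looks like a src attribute.

```python
import re

OLD_HOST = "tvfcommunity-dev-ed--c.documentforce.com"
NEW_HOST = "globalcommunity.networks.com"

# Matches one <img ...> tag; assumes no attribute value contains a literal ">".
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# Matches a quoted src attribute (but not e.g. data-src) inside a tag.
SRC_ATTR = re.compile(r"""(?<![\w-])(src\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE | re.DOTALL)


def _swap_domain(match):
    # Rebuild the attribute, replacing the domain only inside its value.
    prefix, quote, value = match.group(1), match.group(2), match.group(3)
    return prefix + quote + value.replace(OLD_HOST, NEW_HOST) + quote


def replace_img_src_domain(html):
    # Apply the src rewrite only to the text of each <img> tag.
    return IMG_TAG.sub(lambda tag: SRC_ATTR.sub(_swap_domain, tag.group(0)), html)


sample = (
    '<p>tvfcommunity-dev-ed--c.documentforce.com</p>'
    '<iframe src="https://tvfcommunity-dev-ed--c.documentforce.com/page"></iframe>'
    '<img src="https://tvfcommunity-dev-ed--c.documentforce.com/servlet/rtaImage?eid=1" width="500">'
)
print(replace_img_src_domain(sample))
```

For HTML that does not meet these assumptions, a real parser is the safer choice.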

Solution 2 :

Thank you all for your valuable input.

I solved this issue using BeautifulSoup:

from bs4 import BeautifulSoup

html_doc = '<body><img alt="Images may be two-dimensional, such as a photograph or screen display, or three-dimensional, such as a statue or hologram. They may be captured by optical devices – such as cameras, mirrors, lenses, telescopes, microscopes, etc. and natural objects and phenomena, such as the human eye or water." height="333" src="https://tvfcommunity-dev-ed--c.documentforce.com/servlet/rtaImage?eid=ka02v000001BOfL&amp;feoid=00N2v00000Rjh9i&amp;refid=0EM2v000002ijZG" width="500"/><br /> Images may be two-<a href="https://en.wikipedia.org/wiki/Dimensional" target="_blank" title="Dimensional">dimensional</a>, such as a&nbsp; <a href="https://en.wikipedia.org/wiki/Photograph" target="_blank" title="Photograph">photograph</a>&nbsp;or screen display, or three-dimensional, such as a&nbsp; <a href="https://en.wikipedia.org/wiki/Statue" target="_blank" title="Statue">statue</a>&nbsp;or&nbsp;<a href="https://en.wikipedia.org/wiki/Hologram" target="_blank" title="Hologram">hologram</a>. </body>'
modified_data = BeautifulSoup(html_doc, 'html.parser')

# Find every <img> tag that has a src attribute and change its domain
for tag in modified_data.find_all("img", src=True):
    tag['src'] = tag['src'].replace('https://tvfcommunity-dev-ed--c.documentforce.com/', 'https://globalcommunity.networks.com/')
print(modified_data)
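
The replace() call above matches the old domain as a plain substring of the src value. A slightly stricter variant is to parse the URL and compare its host exactly. The sketch below uses only the standard library; the helper name swap_host is hypothetical, and it assumes the URL carries no port or credentials (they would be dropped from a rewritten URL).

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "tvfcommunity-dev-ed--c.documentforce.com"
NEW_HOST = "globalcommunity.networks.com"


def swap_host(url, old_host, new_host):
    # Return url with its host replaced, but only if the host equals old_host.
    parts = urlsplit(url)
    if parts.hostname != old_host:
        return url  # different host or relative URL: leave untouched
    # Assumption: no port or user info in the URL, so netloc is just the host.
    return urlunsplit(parts._replace(netloc=new_host))


print(swap_host("https://tvfcommunity-dev-ed--c.documentforce.com/servlet/rtaImage?eid=ka02v000001BOfL", OLD_HOST, NEW_HOST))
print(swap_host("https://en.wikipedia.org/wiki/Statue", OLD_HOST, NEW_HOST))
```

Inside the loop, the assignment could then read tag['src'] = swap_host(tag['src'], OLD_HOST, NEW_HOST).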

Problem :

I have the following HTML string:

<body>
    <img
        alt="Images may be two-dimensional, such as a photograph or screen display, or three-dimensional, such as a statue or hologram. They may be captured by optical devices – such as cameras, mirrors, lenses, telescopes, microscopes, etc. and natural objects and phenomena, such as the human eye or water."
        height="333"
        src="https://tvfcommunity-dev-ed--c.documentforce.com/servlet/rtaImage?eid=ka02v000001BOfL&amp;feoid=00N2v00000Rjh9i&amp;refid=0EM2v000002ijZG"
        width="500"
    />
    <br />
    Images may be two-<a href="https://en.wikipedia.org/wiki/Dimensional" target="_blank" title="Dimensional">dimensional</a>, such as a&nbsp;
    <a href="https://en.wikipedia.org/wiki/Photograph" target="_blank" title="Photograph">photograph</a>&nbsp;or screen display, or three-dimensional, such as a&nbsp;
    <a href="https://en.wikipedia.org/wiki/Statue" target="_blank" title="Statue">statue</a>&nbsp;or&nbsp;<a href="https://en.wikipedia.org/wiki/Hologram" target="_blank" title="Hologram">hologram</a>.
</body>

I would like to change the domain in every img src from tvfcommunity-dev-ed--c.documentforce.com to globalcommunity.networks.com in Python 3.x.

Note: I am looking for a solution that replaces the domain only if it is present in an img src. It should not be replaced if it appears in a regular string or an iframe src.

Any help?

Comments

Comment posted by lxml

Parse the XML using e.g.

Comment posted by user3164444

Hi Mateus, thank you for your response. Your answer also replaces a regular string (or iframe src) that contains tvfcommunity-dev-ed--c.documentforce.com. I am looking for a solution that replaces it only if it is present in an img src.

Comment posted by python-looping-through-html

Could you iterate your HTML tags? If the answer is yes, you can do some checking to see if the changed tag is an img. I can change the code if that resolves your situation.

Comment posted by user3164444

Hi Mateus, maybe that works. If I can check first and then replace, instead of checking afterwards, that might help.
