Solution 1 :

If you want to use Selenium, you have to use a WebDriver. Think of it as the “connection” between your program and the browser, for example Google Chrome. If you can use Safari, you can use Selenium without installing a driver manually, because Safari’s driver (safaridriver) ships with macOS.

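If you want to use other tools, I can recommend Beautiful Soup. It’s basically an HTML parser for Python that lets you search through the HTML code of the web page. With BS you don’t have to install any drivers. Keep in mind that it only parses HTML: you still have to download the page yourself (for example with requests), and anything the page builds with JavaScript after loading won’t be in that HTML.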
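
As a rough illustration of the Beautiful Soup approach, here is a minimal sketch that turns a table into a list of rows. The sample markup, the table id `prices`, and the helper name `table_to_rows` are made up for this example; on a real site the HTML would come from an HTTP request (and a logged-in session, if the page needs one), which is not shown here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page.
SAMPLE_HTML = """
<table id="prices">
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.95</td></tr>
</table>
"""


def table_to_rows(html, table_id):
    """Return the rows of the table with the given id as lists of cell text."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return []  # no such table in this document
    rows = []
    for tr in table.find_all("tr"):
        # Header and data cells are treated the same way here.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


rows = table_to_rows(SAMPLE_HTML, "prices")
print(rows)
```

The result is a plain list of lists, so it can be displayed or compared with an earlier snapshot without any browser running.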

Another method I can think of is downloading the HTML text of the web page and searching through the file locally. But I wouldn’t recommend this method.
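
For completeness, this is roughly what the “search a saved file” idea could look like using only the standard library. The file name `page.html`, its contents, and the class and function names are assumptions for the sketch; the download step itself is left out.

```python
from html.parser import HTMLParser
from pathlib import Path
import tempfile


class TableTextParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def rows_from_file(path):
    """Parse a locally saved HTML file and return its table rows."""
    parser = TableTextParser()
    parser.feed(Path(path).read_text(encoding="utf-8"))
    parser.close()
    return parser.rows


# Hypothetical saved page, written to a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp:
    saved = Path(tmp) / "page.html"
    saved.write_text(
        "<table><tr><td>Apple</td><td>1.20</td></tr>"
        "<tr><td>Pear</td><td>0.95</td></tr></table>",
        encoding="utf-8",
    )
    local_rows = rows_from_file(saved)

print(local_rows)
```

The file is only a snapshot, so it has to be downloaded again whenever fresh data is needed, which is the main reason this method is hard to recommend for real-time data.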

For web pages, Selenium is really the way to go. I often use it for my own projects.

Problem :

This isn’t really a specific question, I’m sorry for that. I’m trying to create a script that would take real-time data from another site (from a table tag, to be exact), make it an array and display it somewhere. I’ve created a simple Python script:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import requests
import time

driver = webdriver.Chrome('C:/drivers/chromedriver.exe')

driver.set_page_load_timeout(10)  # seconds
driver.get("link to the site")
driver.find_element_by_id("username-real").send_keys("login")
driver.find_element_by_id("pass-real").send_keys("pwd")
driver.find_element_by_xpath('//input[@class="button-login"]').submit()
# here potentially a loop that would refresh every second:
for elem in driver.find_elements_by_xpath('//*[@class="[email protected]"]'):
    pass  # do something

As you can see, it’s pretty simple: open the Chrome WebDriver, log in to the website and do something with the table. I haven’t tried to properly get the data yet because I don’t like this method.

I was wondering if there’s another way to do it without running the WebDriver, such as some console-like application. I’m pretty lost about what I should look into in order to create a script like that. Another programming language? Some kind of framework or method?

Comments

Comment posted by Yash Makan

You can deploy your program on Heroku and it works as an API. Reference: devcenter.heroku.com/articles/getting-started-with-python

Comment posted by Dano

@YashMakan Thank you! After going through the documentation you provided for a few seconds, this looks great! The only thing I worry about is whether it’s able to log in to a page. There shouldn’t be any problems with getting data that’s publicly available, but what about personal user data that you need to log in to get?

Comment posted by Yash Makan

No need to worry. I have deployed complex programs on Heroku myself and they all worked perfectly. If the code runs on your PC, it should run on the Heroku servers too.

Comment posted by Dano

@YashMakan Great!!! Thank you 🙂

Comment posted by Dano

Thank you, I’ll look into those methods. As for downloading the web page, I agree it’s not a great idea: if I want it to dynamically check whether something changed in the table (whether something appeared or disappeared), it would need to download the HTML every second or so, which is probably not the most efficient way of doing that. First I’ll look into those other methods, and if nothing works well I’ll use Selenium. Maybe the Safari option you mentioned would be better for me! 🙂 Thank you again!

Comment posted by Lukas Scholz

Yes, you would have to download it every time 😀 It was just an idea. When you use Selenium or another web scraper, you also have to develop some kind of automation. You could simply use a loop with a waiting time (not efficient) or use cron to execute your script at the times you want.

Comment posted by Dano

I was hoping I could make it run the script when it detects a new element, though that requires a loop as well, right? XD I was also thinking about modifying the HTML/JS on the website so that it somehow runs the script when a new child is added. Though I’m still not sure whether that’s realistic.
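
The “loop with waiting time” idea from this thread, combined with detecting what appeared or disappeared, could be sketched roughly as follows. Everything here is hypothetical: `fetch_rows` stands for whatever function returns the current table rows (Selenium, Beautiful Soup, or anything else), and the snapshots are faked so the example runs without a website.

```python
import time


def diff_rows(old_rows, new_rows):
    """Return (added, removed) rows between two snapshots, keeping their order."""
    old_set = {tuple(row) for row in old_rows}
    new_set = {tuple(row) for row in new_rows}
    added = [row for row in new_rows if tuple(row) not in old_set]
    removed = [row for row in old_rows if tuple(row) not in new_set]
    return added, removed


def poll(fetch_rows, on_change, interval=1.0, max_polls=None, sleep=time.sleep):
    """Call fetch_rows repeatedly and report differences to on_change.

    max_polls limits the total number of fetches (None means run forever).
    sleep is injectable so the loop can be exercised without real waiting.
    """
    previous = fetch_rows()
    polls = 1
    while max_polls is None or polls < max_polls:
        sleep(interval)
        current = fetch_rows()
        polls += 1
        added, removed = diff_rows(previous, current)
        if added or removed:
            on_change(added, removed)
        previous = current


# Fake snapshots standing in for three successive reads of the table.
snapshots = iter([
    [["Apple", "1.20"]],
    [["Apple", "1.20"], ["Pear", "0.95"]],
    [["Pear", "0.95"]],
])
events = []
poll(
    lambda: next(snapshots),
    lambda added, removed: events.append((added, removed)),
    interval=0,
    max_polls=3,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(events)
```

The script still has to ask the page for data on every pass, so this does not remove the loop; it only makes sure the follow-up work runs when something actually changed.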
