I haven’t used BS4 or Python for a while, but if I remember correctly, something like this should get all td elements whose data-stat attribute is school_name. Note that the attribute name uses a hyphen (data-stat), not an underscore.
results = soup.find_all("td", {"data-stat": "school_name"})
Or, if you want every td that has a data-stat attribute and the value doesn’t matter, use:
results = soup.find_all("td", {"data-stat": True})
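To make the attribute filter concrete, here is a minimal sketch run against the two cells quoted in the question. The wrapping table markup and the variable names are assumptions added for illustration, and the built-in html.parser is used only to avoid extra dependencies.

```python
from bs4 import BeautifulSoup

# The two <td> cells from the question; the surrounding <table> is an assumed wrapper.
html = """
<table id="current-poll"><tbody><tr>
<td class="left " data-stat="school_name" csk="Baylor.015"><a href="/cbb/schools/baylor/2020.html">Baylor</a></td>
<td class="left " data-stat="conf_abbr" csk="Big 12 Conference.015.001"><a href="/cbb/conferences/big-12/2020.html" title="Big 12 Conference">Big 12</a></td>
</tr></tbody></table>
"""

soup = BeautifulSoup(html, "html.parser")

# Match only cells whose data-stat attribute equals "school_name".
schools = soup.find_all("td", {"data-stat": "school_name"})
school_names = [td.get_text(strip=True) for td in schools]

# Match every cell that has a data-stat attribute, whatever its value.
any_stat = soup.find_all("td", {"data-stat": True})

print(school_names)
print(len(any_stat))
```

The dictionary form is needed here because data-stat contains a hyphen and so cannot be written as a Python keyword argument.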
You have a couple of options:
- You can use soup.find_all and loop through your results.
- Use a CSS selector that matches only the first of the two cells.
- Inspect and copy the selector for that element.
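A rough sketch of the first two options, using the cells from the question. The wrapping table and the variable names are assumptions for illustration; a selector copied from the browser's inspector would be passed to select_one in the same way.

```python
from bs4 import BeautifulSoup

# The two <td> cells from the question; the surrounding <table> is an assumed wrapper.
html = """
<table id="current-poll"><tbody><tr>
<td class="left " data-stat="school_name" csk="Baylor.015"><a href="/cbb/schools/baylor/2020.html">Baylor</a></td>
<td class="left " data-stat="conf_abbr" csk="Big 12 Conference.015.001"><a href="/cbb/conferences/big-12/2020.html" title="Big 12 Conference">Big 12</a></td>
</tr></tbody></table>
"""

soup = BeautifulSoup(html, "html.parser")

# Option 1: find_all on the shared class, then keep only the cells you want.
looped = []
for td in soup.find_all("td", class_="left"):
    if td.get("data-stat") == "school_name":
        looped.append(td.get_text(strip=True))

# Option 2: select_one returns the first match, so calling it once per row
# yields the first ".left" cell of each row.
first_per_row = [
    tr.select_one("td.left").get_text(strip=True)
    for tr in soup.select("#current-poll tbody tr")
    if tr.select_one("td.left") is not None
]

print(looped)
print(first_per_row)
```

Option 2 relies on the school cell always being the first ".left" cell in its row, which holds for the sample above but is an assumption about the rest of the page.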
I am parsing a webpage using bs4. There is more than one kind of data I would like to select, and the elements share the same class name.
My parsing code:
rows_ranking = soup_ranking.select('#current-poll tbody tr .left')
The page I want to parse has two different “.left” cells in each table row. How can I choose which one I get? Here is an example of two of these cells (one I would like my program to parse, the other I would like to ignore):
1 – <td class="left " data-stat="school_name" csk="Baylor.015"><a href="/cbb/schools/baylor/2020.html">Baylor</a></td>
2 – <td class="left " data-stat="conf_abbr" csk="Big 12 Conference.015.001"><a href="/cbb/conferences/big-12/2020.html" title="Big 12 Conference">Big 12</a></td>
As you can see they have the same class identifier. Is there a way I can have bs4 look only for the first of the two?
I hope my question makes sense, thanks in advance!
The links do not correspond to anything.
It was supposed to be raw code but didn’t have backticks on it.
What is the issue, exactly? They clearly have other attributes to distinguish them, right? Have you read the BeautifulSoup docs?
Right, except there are 50 items returned by my call. I want to grab every OTHER item. I suppose some Python looping where I grab every other item would work fine. But I am more curious to see if there is another way to have bs4 select based solely on the ‘data-stat’ HTML thingy. I am not strong in HTML, and I don’t think that is a selector.
You mean the data-stat attribute, and a selector can match on any identifying feature of an element, including its attributes. Either answer here will get you what you’re looking for.
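For comparison, here is a small sketch of both approaches raised in this exchange: slicing every other result from the original selector, and adding a CSS attribute selector for data-stat. The wrapping table markup is an assumption; soup_ranking and rows_ranking follow the names in the question.

```python
from bs4 import BeautifulSoup

# The two <td> cells from the question; the surrounding <table> is an assumed wrapper.
html = """
<table id="current-poll"><tbody><tr>
<td class="left " data-stat="school_name" csk="Baylor.015"><a href="/cbb/schools/baylor/2020.html">Baylor</a></td>
<td class="left " data-stat="conf_abbr" csk="Big 12 Conference.015.001"><a href="/cbb/conferences/big-12/2020.html" title="Big 12 Conference">Big 12</a></td>
</tr></tbody></table>
"""

soup_ranking = BeautifulSoup(html, "html.parser")

# Original selector: returns both ".left" cells of every row.
rows_ranking = soup_ranking.select("#current-poll tbody tr .left")

# Approach A: keep every other item, starting with the first.
every_other = [td.get_text(strip=True) for td in rows_ranking[::2]]

# Approach B: let the selector filter on the data-stat attribute directly.
by_attr = [
    td.get_text(strip=True)
    for td in soup_ranking.select('#current-poll tbody tr td[data-stat="school_name"]')
]

print(every_other)
print(by_attr)
```

Slicing only works while the two cell types strictly alternate in that order, so the attribute selector is probably the more robust choice if the table layout ever changes.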