Solution 1 :

pd.read_html uses an HTML parser under the hood (lxml by default, falling back to BeautifulSoup with html5lib) to scrape <table> elements from the webpage. Using requests to grab the HTML for the webpage and parsing it manually, I found that the page you linked does indeed contain only three <table> elements. However, the data for several additional tables (including the “kicking” one you want) can be found in HTML comments.
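To illustrate why a parser skips such tables, here is a minimal sketch using only the standard library. The HTML snippet, the table ids in it, and the names `TableIdCollector` and `table_ids` are made up for illustration and are not taken from the linked page. It shows that a table inside a comment is delivered to the parser as plain comment text, and only shows up as a table once that text is parsed again as HTML.

```python
from html.parser import HTMLParser

# Hypothetical page: one live table plus one table hidden inside a comment.
SAMPLE_HTML = """
<html><body>
<table id="passing"><tr><td>1</td></tr></table>
<!--
<table id="kicking"><tr><td>2</td></tr></table>
-->
</body></html>
"""


class TableIdCollector(HTMLParser):
    """Record the id of every <table> start tag and keep raw comment text."""

    def __init__(self):
        super().__init__()
        self.table_ids = []
        self.comments = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_ids.append(dict(attrs).get("id"))

    def handle_comment(self, data):
        # Markup inside a comment is never reported as tags, only as text.
        self.comments.append(data)


def table_ids(html):
    """Return (ids of live tables, raw comment strings) for an HTML string."""
    parser = TableIdCollector()
    parser.feed(html)
    parser.close()
    return parser.table_ids, parser.comments


live_ids, comments = table_ids(SAMPLE_HTML)
# Parse each comment body again as HTML to reveal the tables inside it.
hidden_ids = [tid for comment in comments for tid in table_ids(comment)[0]]

print(live_ids)    # ids of tables a normal parse can see
print(hidden_ids)  # ids of tables that only exist inside comments
```

The same two-pass idea (collect the comments, then parse each one again) is what the BeautifulSoup code in this answer does on the real page.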


Parse the commented-out tables.

import requests
import bs4
import pandas as pd

url = ""
scraped_html = requests.get(url)
# Name the parser explicitly to avoid bs4's "no parser specified" warning
soup = bs4.BeautifulSoup(scraped_html.content, 'html.parser')

# Get all html comments, then filter out everything that isn't a table
# (`string=` is the current name of the older `text=` argument)
comments = soup.find_all(string=lambda text: isinstance(text, bs4.Comment))
commented_out_tables = [bs4.BeautifulSoup(cmt, 'html.parser').find_all('table') for cmt in comments]
# Some of the entries in `commented_out_tables` are empty lists. Remove them.
commented_out_tables = [tab[0] for tab in commented_out_tables if len(tab) == 1]


Checking len(commented_out_tables) gives 8.

Only one of these is the “kicking” table. We can find it by looking for a table with the id attribute set to kicking.

for table in commented_out_tables:
    if table.get('id') == 'kicking':
        kicking_table = table

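Turn this into a pd.DataFrame with pd.read_html by passing it the table's HTML as a string, for example pd.read_html(str(kicking_table)). Recent pandas versions expect the string to be wrapped in io.StringIO. Note that pd.read_html returns a list of DataFrames, even for a single table.

Yields the following: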

[  Unnamed: 0_level_0 Unnamed: 1_level_0 Unnamed: 2_level_0 Unnamed: 3_level_0 Games       ... Kickoffs Punting
                  No.             Player                Age                Pos     G   GS  ...    KOAvg     Pnt     Yds   Lng Blck   Y/P
 0                1.0          Matt Turk               32.0                  p    16  0.0  ...      NaN    92.0  3870.0  70.0  0.0  42.1
 1               10.0        Olindo Mare               27.0                  k    16  0.0  ...     60.3     NaN     NaN   NaN  NaN   NaN
 2                NaN         Team Total               27.3                NaN    16  NaN  ...     60.3    92.0  3870.0  70.0  0.0  42.1
 3                NaN          Opp Total                NaN                NaN    16  NaN  ...      NaN    87.0  3532.0   NaN  NaN  40.6

 [4 rows x 32 columns]]

Problem :

I want to scrape all the tables from this website. The website contains more than 10 tables. When I use pd.read_html(), it returns only 3 tables, but I expect my script to return all of them.
My script:

import pandas as pd
url = ""
df = pd.read_html(url)



Specifically, I want this table:

[screenshot of the desired table; image not included]

How can I get all the tables using pd.read_html()?


Comment posted by Karl Knechtel

When I view the link in my web browser, I see 3 tables: one titled

Comment posted by Humayun Ahmad Rajib

@KarlKnechtel Sir, in my case the above link shows more tables, like,

Comment posted by gallen

@HumayunAhmadRajib Many modern websites dynamically load content with JavaScript. It’s possible that when you request the HTML, it’s returning everything that has loaded so far. I am not sure exactly how

Comment posted by Humayun Ahmad Rajib

@gallen Sir, I am getting three tables, so why can’t I get the rest?

Comment posted by gallen

@HumayunAhmadRajib Because at the time the request made by

Comment posted by chitown88

Just an FYI since you mentioned it,

Comment posted by Emerson Harkin

Thanks for letting me know @chitown88! I’ve edited my answer to include that information as context.