In your second foreach loop:
foreach ($rows as $row)
{
// get each column by tag name
$cols = $row->getElementsByTagName('td');
$row = array();
$i=0;
foreach ($cols as $node) {
// without headers keep the plain text; otherwise serialize the cell's first child node as HTML
if($row_headers==NULL)
$row[] = $node->nodeValue;
else
$row[$row_headers[$i]] = $node->firstChild->ownerDocument->saveHTML($node->firstChild);
$i++;
}
$table[] = $row;
}
Then the output will be:
[1] => Array
(
[Column 1] => <b>Q</b>
[Column 2] => Desc.
)
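Note that the loop above serializes only `firstChild`, so a cell holding several child nodes (for example several <p> elements) would be cut short. A minimal sketch of one way around that, assuming a hypothetical helper named innerHTML (it is not part of PHP's DOM extension) that concatenates saveHTML() output for every child of the cell. The inline sample markup is made up for illustration:

```php
<?php
// Hypothetical helper (not part of the DOM extension): returns the markup
// of every child of $node, concatenated, i.e. the node's "inner HTML".
function innerHTML(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML($child) serializes only that child and its subtree
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return trim($html);
}

// Small inline sample (an assumption) so the sketch runs on its own.
$source = '<table>'
        . '<tr><td><b>Q</b> and <i>more</i></td><td>Desc.</td></tr>'
        . '</table>';

$dom = new DOMDocument();
$dom->loadHTML($source);

$table = array();
foreach ($dom->getElementsByTagName('tr') as $tr) {
    $cells = array();
    foreach ($tr->getElementsByTagName('td') as $td) {
        // keep all markup inside the cell, not just the first child
        $cells[] = innerHTML($td);
    }
    $table[] = $cells;
}

print_r($table);
```

If the table has no <th> cells, $row_headers stays NULL and the loop above takes the nodeValue branch, so a helper like this would have to be used in that branch as well.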
I am able to parse the HTML page properly, but it extracts only the text data, whereas I want to fetch the entire HTML code inside each <tr> and <td>. Below is my PHP code:
<?php
$dom = new DOMDocument();
//load the html
$html = $dom->loadHTMLFile("hydrocarbon.htm");
//discard white space
//$dom->preserveWhiteSpace = false;
//the table by its tag name
$tables = $dom->getElementsByTagName('table');
//get all rows from the table
$rows = $tables->item(0)->getElementsByTagName('tr');
// get each column by tag name
$cols = $rows->item(0)->getElementsByTagName('th');
$row_headers = NULL;
foreach ($cols as $node) {
//print $node->nodeValue."\n";
$row_headers[] = $node->nodeValue;
}
$table = array();
//get all rows from the table
$rows = $tables->item(0)->getElementsByTagName('tr');
foreach ($rows as $row)
{
// get each column by tag name
$cols = $row->getElementsByTagName('td');
$row = array();
$i=0;
foreach ($cols as $node) {
// use the header text as the array key when headers exist
//print $node->nodeValue."\n";
if($row_headers==NULL)
$row[] = $node->nodeValue;
else
$row[$row_headers[$i]] = $node->nodeValue;
$i++;
}
$table[] = $row;
}
//var_dump($table);
print("<pre>".print_r($table,true)."</pre>");
?>
This is my result:

and this is my HTML code:
<table>
<thead>
<tr><th>Column 1</th><th>Column 2</th><th>Column 3</th></tr>
</thead>
<tbody>
<tr> <td><b>Q</b></td><td>Desc.</td> </tr>
<tr> <td>Type</td><td>Multiple choice</td> </tr>
<tr><td>Option</td><td>image #####2</td><td>incorrect</td></tr>
<tr><td>Option</td><td>image #####2</td><td>incorrect</td></tr>
<tr><td>Option</td><td>image #####2</td><td>incorrect</td></tr>
<tr><td>Option</td><td>image #####2</td><td>incorrect</td></tr>
<tr><td>Solution</td><td>Some text / image</td></tr>
<tr><td>Marks</td><td>4</td><td>1</td></tr>
</tbody>
</table>
It is parsing Q and not <b>Q</b>. How can I achieve this?
Edit 1: Original table where your solution should work
<table class=MsoNormalTable border=1 cellspacing=0 cellpadding=0 width=610 style='width:457.25pt;margin-left:10.8pt;background:#CED7E7;border-collapse:
collapse;border:none'>
<tr style='height:30.35pt'>
<td width=112 valign=top style='width:84.0pt;border:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:30.35pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Question<span style='border:none'> </span></span>
</span>
</p>
</td>
<td width=498 colspan=2 valign=top style='width:373.25pt;border:solid black 1.0pt;
border-left:none;background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;
height:30.35pt'>
<p class=MsoNormal style='margin-top:0cm;margin-right:-48.45pt;margin-bottom:
0cm;margin-left:18.0pt;margin-bottom:.0001pt;line-height:115%;border:none'><b><span
lang=EN-US style='font-family:"Garamond","serif";border:none'><span
style='border:none'>Consider the following reaction,</span></span></b>
</p>
<p class=MsoNormal style='margin-top:0cm;margin-right:-48.45pt;margin-bottom:
0cm;margin-left:18.0pt;margin-bottom:.0001pt;line-height:115%'><b><span
lang=EN-US style='font-family:"Garamond","serif";border:none'><span
style='border:none'>H</span></span></b><b><sub><span lang=EN-US
style='font-family:"Garamond","serif";border:none'><span style='border:none'>3</span></span></sub></b><b><span
lang=EN-US style='font-family:"Garamond","serif";border:none'><span
style='border:none'>C – CH – CH – CH</span></span></b><b><sub><span
lang=EN-US style='font-family:"Garamond","serif";border:none'><span
style='border:none'>3</span></span></sub></b><b><span lang=EN-US
style='font-family:"Garamond","serif";border:none'><span style='border:none'>
+ </span></span></b><b><span lang=EN-US style='font-family:"Garamond","serif";
position:relative;top:2.0pt;border:none'><img width=26 height=29
src="hydrocarbon2_files/image001.png"></span></b><b><span lang=EN-US
style='font-family:"Garamond","serif";border:none'><span style='border:none'> →
‘X’ + HBr </span></span></b>
</p>
<p class=MsoNormal style='margin-top:0cm;margin-right:-48.45pt;margin-bottom:
0cm;margin-left:18.0pt;margin-bottom:.0001pt;line-height:115%'><b><span
lang=EN-US style='font-family:"Garamond","serif";border:none'><span
style='border:none'> | |</span></span></b>
</p>
<p class=MsoNormal style='margin-top:0cm;margin-right:-48.45pt;margin-bottom:
0cm;margin-left:18.0pt;margin-bottom:.0001pt;line-height:115%'><b><span
lang=EN-US style='font-family:"Garamond","serif";border:none'><span
style='border:none'> D CH</span></span></b><b><sub><span
lang=EN-US style='font-family:"Garamond","serif";border:none'><span
style='border:none'>3</span></span></sub></b>
</p>
<p class=MsoNoSpacing style='margin-top:0cm;margin-right:-48.45pt;margin-bottom:
0cm;margin-left:.3pt;margin-bottom:.0001pt;text-align:justify;text-indent:
-.3pt'><b><span lang=EN-GB style='font-size:16.0pt;font-family:"Chaparral Pro","serif"'> </span></b>
</p>
</td>
</tr>
<tr style='height:15.0pt'>
<td width=112 valign=top style='width:84.0pt;border:solid black 1.0pt;
border-top:none;background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;
height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Type</span></span>
</p>
</td>
<td width=498 colspan=2 valign=top style='width:373.25pt;border-top:none;
border-left:none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>multiple_choice</span></span>
</p>
</td>
</tr>
<tr style='height:15.0pt'>
<td width=112 valign=top style='width:84.0pt;border:solid black 1.0pt;
border-top:none;background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;
height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Option</span></span>
</p>
</td>
<td width=219 valign=top style='width:164.25pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span style='font-size:16.0pt;color:black;border:none'><img
width=205 height=93 src="hydrocarbon2_files/image002.jpg"></span>
</p>
</td>
<td width=279 valign=top style='width:209.0pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>I</span></span><span lang=EN-US style='font-size:16.0pt;
border:none'><span style='border:none'>n<span style='border:none'>correct</span></span>
</span>
</p>
</td>
</tr>
<tr style='height:15.0pt'>
<td width=112 valign=top style='width:84.0pt;border:solid black 1.0pt;
border-top:none;background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;
height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Option</span></span>
</p>
</td>
<td width=219 valign=top style='width:164.25pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span style='font-size:16.0pt;border:none'><img width=205
height=102 id="Picture 13" src="hydrocarbon2_files/image003.jpg"></span>
</p>
</td>
<td width=279 valign=top style='width:209.0pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>C</span></span><span lang=EN-US style='font-size:16.0pt;
border:none'><span style='border:none'>orrect</span></span>
</p>
</td>
</tr>
<tr style='height:15.0pt'>
<td width=112 valign=top style='width:84.0pt;border:solid black 1.0pt;
border-top:none;background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;
height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Option</span></span>
</p>
</td>
<td width=219 valign=top style='width:164.25pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span style='font-size:16.0pt;border:none'><img width=205
height=107 id="Picture 16" src="hydrocarbon2_files/image004.jpg"></span>
</p>
</td>
<td width=279 valign=top style='width:209.0pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Incorrect</span></span>
</p>
</td>
</tr>
<tr style='height:15.0pt'>
<td width=112 valign=top style='width:84.0pt;border:solid black 1.0pt;
border-top:none;background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;
height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Option</span></span>
</p>
</td>
<td width=219 valign=top style='width:164.25pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span style='font-size:16.0pt;border:none'><img width=205
height=112 id="Picture 19" src="hydrocarbon2_files/image005.jpg"></span>
</p>
</td>
<td width=279 valign=top style='width:209.0pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Incorrect</span></span>
</p>
</td>
</tr>
<tr style='height:15.0pt'>
<td width=112 valign=top style='width:84.0pt;border:solid black 1.0pt;
border-top:none;background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;
height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Solution</span></span>
</p>
</td>
<td width=498 colspan=2 valign=top style='width:373.25pt;border-top:none;
border-left:none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=MsoNormal style='margin-left:27.0pt;text-align:justify;text-indent:
-27.0pt;line-height:115%'><span style='font-family:"Garamond","serif";
border:none'><img width=398 height=92 id="Picture 10"
src="hydrocarbon2_files/image006.jpg"></span>
</p>
</td>
</tr>
<tr style='height:15.0pt'>
<td width=112 valign=top style='width:84.0pt;border:solid black 1.0pt;
border-top:none;background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;
height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>Marks</span></span>
</p>
</td>
<td width=219 valign=top style='width:164.25pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>4</span></span>
</p>
</td>
<td width=279 valign=top style='width:209.0pt;border-top:none;border-left:
none;border-bottom:solid black 1.0pt;border-right:solid black 1.0pt;
background:transparent;padding:4.0pt 4.0pt 4.0pt 4.0pt;height:15.0pt'>
<p class=BodyA><span lang=EN-US style='font-size:16.0pt;border:none'><span
style='border:none'>1</span></span>
</p>
</td>
</tr>
</table>
This does not meet my requirement. The answer below works with the sample HTML, but with my actual HTML (which I have copied at the end of the question) it does not work.
That was sample HTML, and your code works only for that particular sample. Can you please chat with me? Your solution works, but not with my original HTML.
I have updated the question and added the original table. I was expecting your answer to work with it as well, but it is not compatible. Kindly suggest a solution for it.