How can using DOM instead of regular expressions improve the efficiency of extracting data from HTML in PHP?
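When extracting data from HTML in PHP, the gain from using the DOM extension instead of regular expressions is mostly in reliability and maintainability rather than raw speed. DOMDocument parses the markup into a tree, so you can navigate it by element, attribute, and nesting level, and DOMXPath lets you select nodes with a single query. A regular expression treats the HTML as flat text, so it tends to break on nested tags, reordered or extra attributes, and inconsistent whitespace, and the patterns become hard to maintain as the HTML grows more complex.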
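// Load the HTML content into a DOMDocument
$html = file_get_contents('example.html');
if ($html === false) {
    exit('Could not read example.html');
}
// Collect parser warnings internally; real-world HTML is often not well-formed
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();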
// Use DOMXPath to query specific elements
$xpath = new DOMXPath($dom);
$elements = $xpath->query('//div[@class="content"]');
// Extract data from the queried elements
foreach ($elements as $element) {
    $data = $element->nodeValue;
    echo $data;
}
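
To make the difference concrete, here is a minimal sketch that runs a regular expression and a DOMXPath query against the same markup. The sample HTML, the regex pattern, and the variable names are made up for illustration; they are not from the original question. The content div has two class values and contains a nested div, which is the kind of structure where a pattern-based approach usually falls short.

```php
<?php
// Hypothetical sample markup: two class values and a nested <div>.
$html = '<div class="content main"><p>Hello</p><div><p>nested</p></div><p>world</p></div>';

// Regex attempt: the lazy match stops at the FIRST closing </div>,
// which belongs to the nested div, so the last paragraph is cut off.
preg_match_all('/<div class="content[^"]*">(.*?)<\/div>/s', $html, $matches);
$regexResults = $matches[1];

// DOM attempt: parse the markup into a tree first.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match "content" as one class token among several, not as the whole attribute value.
$divQuery = '//div[contains(concat(" ", normalize-space(@class), " "), " content ")]';

$domResults = [];
foreach ($xpath->query($divQuery) as $div) {
    // Collect every paragraph inside the matched div, at any nesting depth.
    foreach ($xpath->query('.//p', $div) as $p) {
        $domResults[] = trim($p->textContent);
    }
}

print_r($regexResults);
print_r($domResults);
```

In this sketch the regex capture ends at the nested closing tag and never reaches the last paragraph, while the XPath query returns all three paragraphs. The class-token expression is also worth reusing in the snippet above, since `@class="content"` only matches when `content` is the sole class on the element.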