What are the potential pitfalls of using regular expressions for parsing HTML documents in PHP?
Regular expressions are an error-prone way to parse HTML in PHP because HTML is not a regular language: elements nest to arbitrary depth, attributes can appear in any order with single, double, or no quotes, and real-world markup is often malformed. A pattern that works on one page tends to break on the next without any warning. It is generally better to use a dedicated HTML parser such as PHP's built-in DOMDocument or the third-party Simple HTML DOM library, which build a document tree that you can query reliably.
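// Using DOMDocument to parse HTML
$html = '<html><body><h1>Hello, World!</h1></body></html>';
$doc = new DOMDocument();
// Collect parser warnings internally; real-world HTML is often malformed
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
// Print the text content of every <h1> element
$headings = $doc->getElementsByTagName('h1');
foreach ($headings as $heading) {
    echo $heading->nodeValue, PHP_EOL;
}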
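
To make the pitfalls concrete, here is a minimal sketch that extracts link targets from the same markup twice, once with a naive regular expression and once with DOMDocument. The sample HTML and the pattern are made up for illustration; they are just one plausible way a regex-based scraper might be written.

```php
<?php
// Hypothetical sample markup: attribute order and quoting vary,
// and one link is commented out.
$html = <<<HTML
<html><body>
<a href="/one">One</a>
<a class="nav" href='/two'>Two</a>
<!-- <a href="/old">Old</a> -->
</body></html>
HTML;

// Naive regex: assumes href is the first attribute and is double-quoted.
preg_match_all('/<a href="([^"]*)">/', $html, $matches);
$regexLinks = $matches[1];

// DOMDocument: walks the parsed tree instead of matching raw text.
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$domLinks = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $domLinks[] = $a->getAttribute('href');
}

print_r($regexLinks); // contains /one and /old
print_r($domLinks);   // contains /one and /two
```

The regex misses the second link because of the extra attribute and the single quotes, and it wrongly picks up the link inside the comment. The parser returns only the two links that are actually part of the document.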