What are the potential pitfalls of using file_get_contents and regex to extract the body part of an HTML file in PHP?

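Extracting the body of an HTML file with file_get_contents and a regular expression is error-prone. HTML is not a regular language, so a pattern can break on nested tags, attributes on the <body> tag, mixed-case tag names, comments or scripts that contain the text "</body>", and malformed markup. file_get_contents also returns false on failure, so its result should be checked before parsing. A real HTML parser such as DOMDocument handles these cases more reliably.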
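To make one of these pitfalls concrete, here is a minimal sketch of a naive pattern going wrong. The sample markup and the pattern are illustrative assumptions, not taken from any particular codebase: the comment inside the body happens to contain the text "</body>", and the non-greedy match stops there.

```php
<?php
// Hypothetical sample: the comment inside the body contains the text "</body>".
$html = '<html><body class="main"><!-- closes at </body> --><p>Hello</p></body></html>';

// A naive, non-greedy pattern stops at the first "</body>" it finds,
// which here is the one inside the comment, not the real closing tag.
preg_match('/<body[^>]*>(.*?)<\/body>/is', $html, $matches);
$regexBody = $matches[1] ?? null;

// The captured "body" is cut off before the paragraph is ever reached.
var_dump($regexBody);
```

Making the pattern greedy avoids this particular case but fails in others, for example when "</body>" appears in text after the real closing tag. That is why patching the regex tends to trade one edge case for another.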

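$html = file_get_contents('example.html');
if ($html === false) {
    exit('Could not read example.html');
}
libxml_use_internal_errors(true); // collect parser warnings for malformed markup instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
$body = $dom->getElementsByTagName('body')->item(0);
// textContent is the body's text only; use $dom->saveHTML($body) to keep the markup
echo $body !== null ? $body->textContent : '';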
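
If the markup inside the body is needed rather than only its text, one possible approach is to serialize each child node of the body with saveHTML. The sketch below assumes a non-empty input string, and the helper name bodyInnerHtml is hypothetical, introduced only for illustration.

```php
<?php
// Hypothetical helper: returns the inner HTML of <body>, or null if none is found.
// Assumes $html is a non-empty string.
function bodyInnerHtml(string $html): ?string
{
    // Collect libxml warnings internally so malformed markup does not emit PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);

    // Discard collected warnings and restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return null;
    }

    // Serialize each child of <body> so the result excludes the <body> tag itself.
    $inner = '';
    foreach ($body->childNodes as $child) {
        $inner .= $dom->saveHTML($child);
    }
    return $inner;
}

echo bodyInnerHtml('<html><body><p>Hi <b>there</b></p></body></html>');
```

Note that loadHTML assumes ISO-8859-1 unless the document declares its encoding, so UTF-8 input without a charset declaration may come back with garbled characters.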