What are some best practices for using regular expressions in PHP to extract specific content from HTML?
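When using regular expressions in PHP to extract content from HTML, keep the pattern as specific as possible: use a lazy quantifier such as (.*?) so a match stops at the first closing tag, allow for attributes on the opening tag, and add the i and s modifiers so the pattern ignores case and lets . match across line breaks. Use preg_match() when you need only the first match and preg_match_all() when you need every match, and check their return values, since both return false on error. Regex works best on small, predictable fragments of markup, because it does not understand nesting. Finally, treat the extracted text as untrusted input: validate it and escape it (for example with htmlspecialchars()) before output to prevent vulnerabilities such as XSS.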
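$html = file_get_contents('example.html');
if ($html === false) {
    exit('Could not read example.html');
}
// Match the first <h1>, allowing attributes; "i" ignores case, "s" lets "." span newlines.
// Using "~" as the delimiter avoids having to escape the "/" in the closing tag.
$pattern = '~<h1\b[^>]*>(.*?)</h1>~is';
if (preg_match($pattern, $html, $matches) === 1) {
    // Reduce the match to plain text: drop nested tags, decode entities, trim whitespace
    $extracted_content = trim(html_entity_decode(strip_tags($matches[1]), ENT_QUOTES | ENT_HTML5, 'UTF-8'));
    // Escape at output time so the extracted text cannot inject markup
    echo htmlspecialchars($extracted_content, ENT_QUOTES, 'UTF-8');
} else {
    echo 'No matching content found';
}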
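
The answer mentions preg_match_all() but only demonstrates preg_match(). As a minimal sketch of collecting every match, the snippet below gathers the text of all headings of one level. The helper name extract_headings(), its $level parameter, and the $sample string are assumptions made up for illustration; they are not part of the original answer or of any library.

```php
<?php
// Hypothetical helper (illustration only): return the plain text of every
// heading of the given level found in an HTML string.
function extract_headings(string $html, int $level = 2): array
{
    // Validate the input used to build the pattern
    if ($level < 1 || $level > 6) {
        throw new InvalidArgumentException('Heading level must be between 1 and 6');
    }

    // "~" delimiter avoids escaping "/"; "i" = case-insensitive, "s" = "." matches newlines
    $pattern = '~<h' . $level . '\b[^>]*>(.*?)</h' . $level . '>~is';

    // preg_match_all() returns the number of matches (possibly 0) or false on error
    if (!preg_match_all($pattern, $html, $matches)) {
        return [];
    }

    $headings = [];
    foreach ($matches[1] as $raw) {
        // Strip nested tags, decode entities, and trim to get plain text
        $text = trim(html_entity_decode(strip_tags($raw), ENT_QUOTES | ENT_HTML5, 'UTF-8'));
        if ($text !== '') {
            $headings[] = $text;
        }
    }
    return $headings;
}

// Made-up sample markup for demonstration
$sample = "<h2 class=\"a\">First <em>title</em></h2>\n<p>text</p>\n<H2>Second &amp; last</H2>";

foreach (extract_headings($sample) as $heading) {
    // Escape at output time so the text cannot inject markup
    echo htmlspecialchars($heading, ENT_QUOTES, 'UTF-8'), PHP_EOL;
}
```

With the default flags, $matches[1] holds the contents of the first capture group for every match, which is why the loop iterates over it directly. This approach assumes headings are not nested inside one another. For arbitrary or malformed HTML, a real parser such as PHP's DOMDocument is generally the more robust choice.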