How can PHP be used to extract specific data from HTML source code, such as text within certain HTML tags?

To extract specific data from HTML source code in PHP, parse the markup with the DOMDocument class and then run XPath queries through DOMXPath to target specific elements or attributes. You can select elements by tag name, class, ID, or any other attribute, and read the text inside them from each node's textContent property.

// HTML source code
$html = '<div class="content"><h1>Title</h1><p>Paragraph content</p></div>';

// Create a new DOMDocument and parse the HTML
$dom = new DOMDocument();
$dom->loadHTML($html);

// Use XPath to query for specific elements
$xpath = new DOMXPath($dom);
$elements = $xpath->query("//div[@class='content']/h1");

// Extract text within the selected element
if ($elements->length > 0) {
    $text = $elements->item(0)->textContent;
    echo $text; // Output: Title
}
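
The same approach extends to several matching elements and to attribute values. The following is a minimal sketch under stated assumptions: the sample markup, the `extract_links` helper name, and the `ext` class are hypothetical and not part of the original answer. It loops over every node the query returns and reads both the text and the `href` attribute.

```php
<?php
// Hypothetical helper: returns the text and href of every node matched by $query.
function extract_links(string $html, string $query): array
{
    // Collect parser warnings instead of printing them; real-world HTML is often malformed.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    $links = [];
    foreach ($xpath->query($query) as $node) {
        $links[] = [
            'text' => trim($node->textContent),
            'href' => $node->getAttribute('href'),
        ];
    }

    return $links;
}

// Hypothetical sample markup for illustration only.
$html = '<ul id="links">'
      . '<li><a class="ext" href="https://example.com/a">First</a></li>'
      . '<li><a class="ext" href="https://example.com/b">Second</a></li>'
      . '</ul>';

// Select every <a class="ext"> inside the element with id="links".
$links = extract_links($html, "//ul[@id='links']//a[@class='ext']");

foreach ($links as $link) {
    echo $link['text'] . ' => ' . $link['href'] . PHP_EOL;
}
```

Note that `@class='ext'` matches only when the class attribute is exactly `ext`. For elements carrying several classes, a query such as `contains(concat(' ', normalize-space(@class), ' '), ' ext ')` is a common workaround.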

