What are the advantages of using PHP's built-in tokenizer over regular expressions for handling comments in code?

PHP's built-in tokenizer is more reliable than regular expressions for handling comments because it uses the same lexer as the PHP engine itself. A regular expression sees only characters, so it cannot tell whether a `//`, `#`, or `/*` sequence starts a real comment or merely appears inside a string literal, a heredoc, or inline HTML. The tokenizer has already made that decision: comments come back as dedicated tokens (T_COMMENT for `//`, `#`, and `/* */` comments, T_DOC_COMMENT for `/** */` docblocks), each carrying its text and line number. This makes it straightforward to extract, rewrite, or strip comments without touching the rest of the code.

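<?php
// Read the source file and split it into PHP tokens.
$code = file_get_contents('example.php');
$tokens = token_get_all($code);

foreach ($tokens as $token) {
    // Multi-character tokens are arrays of [id, text, line]; single characters are plain strings.
    // T_COMMENT covers //, # and /* */ comments; T_DOC_COMMENT covers /** */ docblocks.
    if (is_array($token) && in_array($token[0], [T_COMMENT, T_DOC_COMMENT], true)) {
        echo "Comment found: " . rtrim($token[1]) . "\n";
    }
}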
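
To make the difference concrete, here is a minimal sketch that runs both approaches over the same input. The sample source in `$source` and the regular expression are illustrative assumptions, not taken from any particular project; the pattern is a deliberately naive one of the kind people often reach for first.

```php
<?php
// Hypothetical sample source: the string literal contains "//", which is not a comment.
$source = <<<'CODE'
<?php
$url = "http://example.com"; // real comment
/* block comment */
echo $url;
CODE;

// Naive regex (illustrative only): matches "//..." or "/* ... */" anywhere, even inside strings.
preg_match_all('~//[^\n]*|/\*.*?\*/~s', $source, $matches);
$regexComments = $matches[0];

// Tokenizer: keep only the tokens the PHP lexer classified as comments.
$tokenComments = [];
foreach (token_get_all($source) as $token) {
    if (is_array($token) && in_array($token[0], [T_COMMENT, T_DOC_COMMENT], true)) {
        // trim() because older PHP versions include the trailing newline in // comments.
        $tokenComments[] = trim($token[1]);
    }
}

print_r($regexComments);
print_r($tokenComments);
```

The regex's first match starts at the `//` inside the URL string, so it swallows part of the string literal along with the real comment. The tokenizer returns only `// real comment` and `/* block comment */`. A more elaborate pattern could handle this particular case, but it would still need to account for heredocs, escaped quotes, and inline HTML, all of which the tokenizer already handles.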