What are some potential issues when using substr() in PHP to extract parts of a string, especially when dealing with Unicode characters?

substr() operates on bytes, not characters. In a UTF-8 string, every character outside the ASCII range occupies two to four bytes, so an offset or length that falls inside a multi-byte character cuts it apart and leaves an invalid byte sequence, which usually shows up as a replacement character or garbled text. To avoid this, use mb_substr(), which counts characters in the encoding you specify. It requires the mbstring extension.

// Using mb_substr() to extract parts of a string containing Unicode characters
$string = "Hello, 你好";
$substring = mb_substr($string, 0, 5, 'UTF-8');
echo $substring; // Output: Hello
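
The example above stops before the Chinese characters, so it does not show the failure itself. As a minimal sketch, assuming the mbstring extension is loaded and the source file is saved as UTF-8, the following compares both functions with a length that lands inside a multi-byte character. The variable names $byteSlice and $charSlice are illustrative only.

```php
<?php
// Sample string: 7 ASCII bytes ("Hello, ") followed by two 3-byte UTF-8 characters.
$string = "Hello, 你好";

// substr() counts bytes, so a length of 8 ends one byte into "你".
$byteSlice = substr($string, 0, 8);

// mb_substr() counts characters, so the same length keeps "你" whole.
$charSlice = mb_substr($string, 0, 8, 'UTF-8');

// mb_check_encoding() reports whether a string is valid in the given encoding.
var_dump(mb_check_encoding($byteSlice, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($charSlice, 'UTF-8')); // bool(true)

echo $charSlice, PHP_EOL; // Hello, 你

// Byte count and character count differ for the same string.
echo strlen($string), ' bytes, ', mb_strlen($string, 'UTF-8'), ' characters', PHP_EOL; // 13 bytes, 9 characters
```

The same byte-versus-character distinction applies to strlen() and mb_strlen(), so offsets computed with one family should not be passed to the other.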

Keywords

substr, Unicode characters, multibyte strings, mb_substr, encoding issues
