A serious pitfall when calling mb_substr() with the HTML-ENTITIES encoding is that the function performs several conversions before returning the value. The worst of these is that HTML special characters are not just counted as single characters but decoded, so entities such as &quot; and &lt; come back as literal characters.
<?php
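mb_internal_encoding("ISO-8859-1"); echo mb_internal_encoding(),"\n<br><br>\n";
$a='j&uuml;st &auml; &quot; simple &quot; &#26085;&#26412; &lt;b&gt;test&lt;/b&gt;';
// Internal encoding (ISO-8859-1): the entities are plain ASCII text and pass through untouched.
echo mb_substr($a,0),"\n<br><br>\n";
// page source: j&uuml;st &auml; &quot; simple &quot; &#26085;&#26412; &lt;b&gt;test&lt;/b&gt;
// HTML-ENTITIES: every entity is decoded and re-encoded, but ", < and > come back unescaped.
echo mb_substr($a,0,strlen($a),'HTML-ENTITIES');
// page source: j&uuml;st &auml; " simple " &#26085;&#26412; <b>test</b>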
?>
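One way to sidestep the decoding described above is to do each conversion explicitly: decode the entities yourself, cut the string in UTF-8, then escape the special characters again. The following is only a minimal sketch under that assumption. The helper name entity_safe_substr() is hypothetical and not part of mbstring, and the result contains real UTF-8 characters (ü rather than &uuml;), so it assumes the page is served as UTF-8. It also avoids the HTML-ENTITIES pseudo-encoding altogether, which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper (not part of mbstring): take a substring of
// entity-encoded text by character count while keeping HTML special
// characters escaped in the result.
function entity_safe_substr(string $html, int $start, ?int $length = null): string
{
    // 1. Decode every entity into a real UTF-8 character.
    $text = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // 2. Cut on character boundaries, with the encoding stated explicitly.
    $cut = mb_substr($text, $start, $length, 'UTF-8');
    // 3. Re-escape &, <, > and quotes so the result is safe to echo into HTML.
    return htmlspecialchars($cut, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

$a = 'j&uuml;st &auml; &quot; simple &quot; &#26085;&#26412; &lt;b&gt;test&lt;/b&gt;';
echo entity_safe_substr($a, 0, 10), "\n";
// page source: jüst ä &quot; s
```

Here the offset and length count decoded characters, so &quot; counts as one character, but it is still written out as an entity.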