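<?php
// With the /u modifier, preg_match() returns 1 for valid UTF-8 and false for invalid UTF-8.
function is_utf8($str) { return preg_match('//u', $str) === 1; }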
Almost any random byte sequence is technically valid ISO-8859-1 or Windows-1252, because these single-byte encodings assign a meaning to nearly every byte value. Therefore, if a string is not UTF-8, the function will almost always fall back to ISO-8859-1, even if the data is actually Windows-1252 or something else.
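To make the ambiguity concrete, here is a minimal sketch. It assumes the mbstring extension is loaded. The function names `looks_like_utf8()` and `to_utf8_sketch()` are hypothetical, and the choice of Windows-1252 as the default fallback is an assumption for illustration, not something detection can establish.

```php
<?php
// Hypothetical helper: true only if $bytes is well-formed UTF-8.
function looks_like_utf8(string $bytes): bool
{
    // Under /u, preg_match() returns false when the subject is not valid UTF-8.
    return preg_match('//u', $bytes) === 1;
}

// Hypothetical helper: keep valid UTF-8 as-is, otherwise decode the bytes
// using a fallback encoding the caller has to guess (assumption: Windows-1252).
function to_utf8_sketch(string $bytes, string $fallback = 'Windows-1252'): string
{
    if (looks_like_utf8($bytes)) {
        return $bytes;
    }
    return mb_convert_encoding($bytes, 'UTF-8', $fallback);
}

// The single byte 0x80 is not valid UTF-8, but both legacy encodings accept it.
$byte = "\x80";
$asCp1252 = to_utf8_sketch($byte, 'Windows-1252'); // decoded as the euro sign
$asLatin1 = to_utf8_sketch($byte, 'ISO-8859-1');   // decoded as a C1 control character

// Both conversions succeed and produce valid UTF-8, yet the text differs.
var_dump(looks_like_utf8($asCp1252), looks_like_utf8($asLatin1), $asCp1252 === $asLatin1);
```

Neither conversion fails, so the bytes alone cannot show which legacy encoding was meant. In practice the fallback has to come from outside knowledge, such as an HTTP header, a database setting, or the known origin of the data.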
There’s also a pure-PHP option that, combined with the mb_* functions, gives you a U::toUtf8() method that attempts detection and conversion.