Here is a hacked version. I have tested it with Chinese and Japanese tags and it works. Basically, it just excludes multibyte chars (\x80-\xff) from the data that gets stripped.
function sanitize_with_dashes( $text ) {
    $text = strip_tags($text);
    $text = remove_accents($text);
    $text = strtolower($text);
    $text = preg_replace('/&[^\x80-\xff]+?;/', '', $text); // kill entities
    // Strip everything except a-z, 0-9, space, _ and -, but keep high bytes (multibyte chars)
    $text = preg_replace('/[^a-z0-9\x80-\xff _-]/', '', $text);
    $text = preg_replace('/\s+/', '-', $text); // whitespace runs become one dash
    $text = preg_replace(array('|-+|', '|_+|'), array('-', '_'), $text); // Kill the repeats
    return $text;
}
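
For anyone who wants to try this outside WordPress, here is a rough standalone sketch of the same idea. The function name sanitize_with_dashes_demo is made up for the example, and remove_accents() is left out because it only exists inside WordPress, so this is an approximation of the function above, not a drop-in replacement. It also assumes UTF-8 input and the default C locale, where strtolower() leaves bytes \x80-\xff alone.

```php
<?php
// Hypothetical standalone demo; remove_accents() is omitted (WordPress-only).
function sanitize_with_dashes_demo( $text ) {
    $text = strip_tags($text);
    $text = strtolower($text);
    $text = preg_replace('/&[^\x80-\xff]+?;/', '', $text);       // kill entities
    $text = preg_replace('/[^a-z0-9\x80-\xff _-]/', '', $text);  // keep high bytes
    $text = preg_replace('/\s+/', '-', $text);                   // whitespace to dash
    $text = preg_replace(array('|-+|', '|_+|'), array('-', '_'), $text); // collapse repeats
    return $text;
}

echo sanitize_with_dashes_demo('日本語 Tag!'), "\n";   // 日本語-tag
echo sanitize_with_dashes_demo('中文  Tag__x'), "\n";  // 中文-tag_x
echo sanitize_with_dashes_demo('A &amp; B'), "\n";     // a-b
```

The patterns run without the /u modifier, so they work on raw bytes: every byte of a UTF-8 multibyte character falls in \x80-\xff and survives the strip step untouched.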