From the MySQL manual: For any Unicode character set, operations performed using the xxx_general_ci collation are faster than those for the xxx_unicode_ci collation. For example, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. They have a amusing “examples of the effect of collation” set on “sorting German umlauts,” […]
MySQL answer: utf8_unicode_ci vs. utf8_general_ci. Collation controls sorting behavior. Unicode rationalizes the character set, but doesn’t, on it’s own, rationalize sorting behavior for all the various languages it supports. utf8_general_ci (ci = case insensitive) is apparently a bit faster, but sloppier, and only appropriate for English language data sets.
This Gentoo Wiki page suggests dumping the table and using iconv to convert the characters, then insert the dump into a new table with the new charset. Alex King solved a different problem: his apps were talking UTF8, but his tables were Latin1. His solution was to dump the tables, change the charset info in […]