MySQL answer: utf8_unicode_ci vs. utf8_general_ci.
Collation controls how strings are compared and sorted. Unicode rationalizes the character set but doesn't, on its own, rationalize sorting behavior for all the various languages it supports. utf8_general_ci (ci = case insensitive) is apparently a bit faster but sloppier, and only appropriate for English-language data sets.
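To make "sloppier" a little more concrete, here is a rough Python sketch. It is not MySQL's implementation. It assumes the commonly documented difference that utf8_general_ci compares one character at a time (so ß is treated like s), while utf8_unicode_ci allows expansions (so ß is treated like ss). The function names `one_to_one_key` and `expanding_key` are made up for this illustration.

```python
import unicodedata


def one_to_one_key(text):
    """Approximate a collation that maps each character to exactly one character."""
    out = []
    for ch in text:
        if ch == "ß":
            # Assumption: a one-to-one collation can only map ß to a single s.
            out.append("s")
        else:
            # Drop accents by keeping the first code point of the decomposition.
            out.append(unicodedata.normalize("NFD", ch)[0].lower())
    return "".join(out)


def expanding_key(text):
    """Approximate a collation that allows expansions such as ß -> ss."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    # Remove combining marks so accented letters compare like their base letters.
    return "".join(c for c in decomposed if not unicodedata.combining(c))


words = ["Straße", "STRASSE"]
print([one_to_one_key(w) for w in words])
print([expanding_key(w) for w in words])
```

Under the one-to-one key the two spellings end up different, while the expanding key treats them as the same word, which is the kind of result a German speaker would probably expect.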
Posted May 11, 2009 by Casey
Categories: Dispatches. Tags: collation, mysql, utf8, utf8_general_ci, utf8_unicode_ci.
This Gentoo Wiki page suggests dumping the table, using iconv to convert the characters, and then inserting the dump into a new table with the new charset.
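The conversion step in the middle is roughly what the following Python sketch does. It is a stand-in for iconv, not the wiki's exact recipe, and it assumes the dump was written as Latin-1 and should become UTF-8. The function name `reencode_dump` and both encoding defaults are assumptions for illustration; the real source encoding should be checked before converting anything.

```python
def reencode_dump(data, source="latin-1", target="utf-8"):
    """Re-encode the raw bytes of a dump from one character encoding to another."""
    # Decoding fails loudly if the bytes are not valid in the source encoding,
    # which is safer than silently writing mangled characters.
    text = data.decode(source)
    return text.encode(target)


# A tiny fake dump line, encoded the way a Latin-1 table would store it.
latin1_bytes = "INSERT INTO t VALUES ('café');".encode("latin-1")
utf8_bytes = reencode_dump(latin1_bytes)
print(utf8_bytes.decode("utf-8"))
```

For a real dump you would read the file in binary mode, pass the bytes through the function, and write the result to a new file before loading it into the new table.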
Alex King solved a different problem: his apps were talking UTF8, but his tables were Latin1. His solution was to dump the tables, change the charset info in the [...]
Posted October 9, 2008 by Casey
Categories: Technology. Tags: character encoding, character set, character set conversion, latin1, mysql, utf8.