Ransford Okpoti's Blog

4 October, 2011

How To Solve MySQL Character Sets and Collations Mixing Problems

Filed under: MySQL — Tags: , — ranskills @ 11:20 am

Character sets and collations enable MySQL to perform comparisons on strings, so if have two or more columns with different character sets and/or collations are used in a way that performs some comparison of some sort on them (either using =, >, <, , a JOIN ON, etc), MySQL will be a little confused and display the error message:

[Err] 1267 – Illegal mix of collations (latin1_danish_ci,IMPLICIT) and (latin1_swedish_ci,IMPLICIT) for operation ‘=’

Unfortunately, there is no single command at the database level that could convert the character sets and collations of all tables within a database. In fact, the only command which appears to get the job done only changes the default values of the character set and/or collation of a database, and does not affect existing tables, but only has effect on new tables to be created.

ALTER DATABASE {database_name} CHARACTER SET {character_set} COLLATE {collation};

Likely for us, there is just a command/syntax for changing these values at the table level, syntax shown below.

ALTER TABLE {table_name} CONVERT TO CHARACTER SET {character_set} COLLATE {collation};

Example:

ALTER TABLE employees CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

This problem can be resolved by ensuring that all the tables and their corresponding columns have the same character set and collation. If multiple databases are employed, then they should all have the same character set and collation.
Using the above commands would be a very boring and painful process if you have, say, anything above 20 tables in your database. You will realize that is no fun at all, so we will develop a stored procedure that will automatically iterate through all the tables in the database and sets the character set and collations to our desired values.

The codes for the stored procedure is show below, I presume you can create this in MySQL (If not, jump to the screencast below to see it).

DROP PROCEDURE IF EXISTS procChangeCharSetandCollationForAllTables $$

CREATE PROCEDURE procChangeCharSetandCollationForAllTables()

	BEGIN

		DECLARE table_name VARCHAR(255);
		DECLARE end_of_tables INT DEFAULT 0;
		DECLARE num_tables INT DEFAULT 0;

		DECLARE cur CURSOR FOR 
			SELECT t.table_name 
			FROM information_schema.tables t 
			WHERE t.table_schema = DATABASE() AND t.table_type='BASE TABLE';

		DECLARE CONTINUE HANDLER FOR NOT FOUND SET end_of_tables = 1;

		OPEN cur;

		tables_loop: LOOP

			FETCH cur INTO table_name;
			
			IF end_of_tables = 1 THEN
				LEAVE tables_loop;
			END IF;

			SET num_tables = num_tables + 1;

			SET @s = CONCAT('ALTER TABLE ' , table_name , ' CONVERT TO CHARACTER SET latin1 COLLATE latin1_general_ci');

			PREPARE stmt FROM @s;
			EXECUTE stmt;
		END LOOP;

		CLOSE cur;
	END $$

Once you have the above stored procedure created in the database, just run it as shown below to have all the tables use the same character set and collation which in this case happens to be
CHARACTER SET = latin1
COLLATE = latin1_general_ci

CALL procChangeCharSetandCollationForAllTables();

If you prefer to use a different character set and collation, say utf8 and utf8_general_ci respectively, just change the codes in line 23 to

SET @s = CONCAT('ALTER TABLE ' , table_name , ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci ');

[SCREENCAST COMING SOON]

Create a free website or blog at WordPress.com.