Category talk:Databases

From openSUSE

Contents

Data Migration from MySQL 4.0 to MySQL 4.1

If you upgrate your older Suse Linux to (Open)Suse 10, generally, you upgrade a lot of packages, and among them the MySQL from ver 4.0 to ver 4.1. Since English is not my native language, I kindly request to forgive me for any spell/grammar errors.

Looking only at the version number of the popular database, someone expects to have no major problems after the upgrade. Generaly, this is true, but there is a lot of buzz about the new features and especialy the "collation". For users that do not use utf-8 or latin1 characters, migration is quite tricky.

Before upgrade

Before any upgrade, you must have a recent backup of your database schema and data. In my case, the stored data were encoded at ISO-8859-7 which is the Greek character set. Of course, you have to use your own character set.

mysqldump -u root -p -h localhost -A -c -e \
--create-options --hex-blob --add-drop-database \
--default-character-set=greek -r databasesdump.sql 

Now you can safely upgrade your Suse Linux.

After upgrade

When your system upgrade is finished, you have to check, using your favorite tool, if all your databases have the correct the character set. At my server, 3 of the 14 databases had problems:

  • Column collation was set to latin1_swedish_ci
  • Table collation was set to latin1
  • Data stored at the database with the wrong character set.

Thus, I desided to drop all databases (remember that I had a full backup) and recreate them.

This is a straight-forward procedure, as far as you know how to perform it. All you need is the mysql client, the backup of your database and a few minutes.

To make long story short, the secret for the correct migration lies at the documentation. If you are the owner of the database server you can use the "Server wide" option. Otherwise, you have to follow the "hard" way.

Server wide settings

In case you have the luxury to have your own db server, you can set the following at the /etc/my.cnf file:

[client]
...
default-character-set=greek

...
[mysqld]
default-character-set=greek
init-connect="SET NAMES greek"

The tricky part is the init-connect variable. That way each client automatically has the 'greek' character set and connection collation. Do not forget to restart mysql service after the change. Now, you can use the following shell command to upload your data: mysql -u root -p < databasesdump.sql

Client only settings

If you do not have the luxury of your own database server, then you cannot rely on the good faith of the system administration. You have to upload the correct data, with the proper encoding using the mysql client.

Commonly, we use something like the following in order to load a new dump to our database:

mysql -u username -p password < dumpfile.sql

Well, this is not the correct approach to our case. Either from phpmyadmin, or the mysql client (using show variables; query), check the variables of your database server. You have to note the character set that mysqld uses for the collation and the connection.

In my case, I wanted to have all character sets to Greek, thus I simply changed the default character set of the currect connection to 'greek'. To do so, I executed the query:

set names 'greek';

Now the show variables output was:

| character_set_client            | greek                                                   |
| character_set_connection        | greek                                                   |
| character_set_database          | greek                                                   |
| character_set_results           | greek                                                   |
| character_set_server            | greek                                                   |
| character_set_system            | utf8                                                    |
| character_sets_dir              | /usr/share/mysql/charsets/                              |
| collation_connection            | greek_general_ci                                        |
| collation_database              | greek_general_ci                                        |
| collation_server                | greek_general_ci

I was ready to use the dump of my databases with the correct character set:

source ./databasesdump.sql

Other clients

If you use mysqldump, and you had set corectly the character set of your connection, (in my case greek_general_ci), by now, you should see the correct characters/encoding. If you still cannot see the correct ones, try to change the collation to a more general one. In most cases a web browser will not represent correctly the data. This is because of incorect internal tranformation of the character from the db server. I suppose that you have already notice that I use the term "connection". MySQL server treats each connection differently, unless you had strictly named a default encoding/collation at "/etc/my.cnf" file.

But phpmyadmin is not the only client. All scripts, programs that connects to the db are potential clients. Thus, you have to ensure that each time you set the correct encoding for the connection.

Per connection settings

In order to inform mysql that the character set you want to use is not the default one, you have to use the "set names" 'query before' any other sql statement that sends or retrieves data from the db. For example, if you use the php file for the connection named db.php:

<?
dblink = mysql_connect("localhost", "dbuser", "dbpass");
?>

change it to:

<?
dblink = mysql_connect("localhost", "dbuser", "dbpass");
mysql_query("set names 'greek'");
?>

Of course, you can use the mysql_query("set names 'greek'"); whenever you want to change the collation.

Links

Mysql Documentation