Gooseberry >> Hayes migration : conversion problem



Bizzz
08-28-2007, 03:23 PM
I have a big problem converting my office wiki from Gooseberry to Hayes 1.8.1.d.
French accented characters are not converted correctly, and a lot of pages are now broken (bad display, strange redirections created, ...).
I think I know where the problem is, but I don't know the best (and fastest) way to correct it. Some explanations:
- in the export file produced by mysqldump, I see that my Gooseberry tables are declared with the CHARSET=latin1 option
- when I had entered a French word like 'Activités', it is dumped as 'Activités'

- after a successful migration and installation into Hayes, I can see that the tables are declared with the CHARSET=utf8 COLLATE utf8_general_ci option
- the accented French characters are now displayed incorrectly
For example, 'Activités' is now displayed as 'Activités' in titles
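
One common cause of this kind of symptom is UTF-8 bytes being read back as latin1. Whether that is what happens in this migration is an assumption, not something the dump confirms. A minimal Python sketch of the effect, with illustrative variable names:

```python
# Assumption: the title was written as UTF-8 bytes but is read back as latin1.
original = "Activités"

utf8_bytes = original.encode("utf-8")    # 'é' becomes the two bytes 0xC3 0xA9
misread = utf8_bytes.decode("latin-1")   # each byte is shown as its own character

print(misread)

# Reversing the two steps recovers the original text.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired)
```

If the broken titles show two odd characters in place of each accented letter, this mismatch is a plausible suspect.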

What can I do? Convert the characters in the exported SQL file before installing it into the Hayes wiki? Change the CHARSET declaration before creating the Hayes wiki? Something else?
Not all accented characters are affected. I don't know why, but accented characters in page text are stored as HTML entities (&eacute; or &#233; for example). Accented characters in titles seem to be the bigger problem.
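
A likely reason the page text survives is that HTML entities are plain ASCII, so they come out the same under latin1 and utf8. The short Python sketch below only illustrates that point; it assumes nothing about how Gooseberry or Hayes store pages internally.

```python
import html

# Both entity forms decode to the same accented word.
named = html.unescape("Activit&eacute;s")
numeric = html.unescape("Activit&#233;s")
print(named, numeric)

# The entity form contains only ASCII bytes, so its latin1 and UTF-8
# encodings are byte-for-byte identical.
entity_text = "Activit&#233;s"
same_bytes = entity_text.encode("latin-1") == entity_text.encode("utf-8")
print(same_bytes)

# Going the other way: replace non-ASCII characters with numeric entities.
as_entities = "Activités".encode("ascii", "xmlcharrefreplace").decode("ascii")
print(as_entities)
```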

Any suggestion?

royk
08-28-2007, 06:16 PM
First, let's verify that the connection is using the proper charsets. Run these two queries in MySQL after you connect to your database:

SHOW variables like '%collat%';
SHOW variables like '%charset%';

collation_database should be set to utf8_general_ci, while
character_set_database (and maybe character_set_system) should be set to utf8

Bizzz
08-29-2007, 08:14 AM
Here are the values for my original Gooseberry database:

mysql> SHOW variables like '%collat%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
mysql> SHOW variables like '%charset%';
Empty set (0.00 sec)
mysql> show variables like '%character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

And below are the same values for my new Hayes database:

mysql> SHOW variables like '%collat%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | utf8_general_ci   |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
mysql> SHOW variables like '%charset%';
Empty set (0.00 sec)
mysql> show variables like '%character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
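
In the Hayes values above, the client and connection charsets are latin1 while the database is utf8. If the imported file actually contains UTF-8 bytes, the server would treat each byte as a latin1 character and convert it to utf8 a second time. The Python sketch below only imitates that conversion under that assumption; it does not talk to MySQL, and store_via_connection is a made-up helper name.

```python
def store_via_connection(raw_bytes, connection_charset, column_charset="utf-8"):
    """Imitate a server converting client bytes from the declared
    connection charset to the column charset (hypothetical helper)."""
    return raw_bytes.decode(connection_charset).encode(column_charset)

title = "Activités"
client_bytes = title.encode("utf-8")  # assumption: the import file holds UTF-8

# Connection declared as latin1: every UTF-8 byte is re-encoded (double encoding).
wrong = store_via_connection(client_bytes, "latin-1")

# Connection declared as utf8: the bytes are stored unchanged.
right = store_via_connection(client_bytes, "utf-8")

print(wrong)
print(right)
```

If this is what is happening, declaring the connection as utf8 before the import (for example with SET NAMES utf8) would be the first thing to test.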

merktnichts
08-29-2007, 08:38 AM
This might be related: http://forums.opengarden.org/showthread.php?t=569

Bizzz
08-30-2007, 03:33 PM
I have run a lot of different installation tests, without success :(:(
I tried making the mysqldump with --default-character-set=latin1, but the resulting file does not fix the problem when imported into the new Hayes wiki.

HELP! Have any non-US users successfully migrated from Gooseberry to Hayes (with accented characters in titles and pages)?

I'm not ready to manually correct all my old wiki pages...

PeteE
08-30-2007, 03:52 PM
Bizzz - I haven't tried this myself but it might work. After you do a mysqldump, can you try running the following?


iconv -f latin1 -t utf8 wikidb.sql > wikidb-utf8.sql

Then load up that file into your DB.
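
For anyone without iconv at hand, the same conversion can be sketched in Python. This is an untested-on-real-dumps illustration: latin1_to_utf8 is a made-up helper, the table name in the demo is hypothetical, and the file names simply mirror the iconv command above.

```python
import os
import tempfile

def latin1_to_utf8(src_path, dst_path):
    """Re-encode a text file from latin1 to UTF-8, line by line."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="latin-1", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)

# Small demo on a throwaway file instead of a real dump.
statement = "INSERT INTO page VALUES ('Activités');\n"  # hypothetical table name
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "wikidb.sql")
    dst = os.path.join(tmp, "wikidb-utf8.sql")
    with open(src, "wb") as f:
        f.write(statement.encode("latin-1"))
    latin1_to_utf8(src, dst)
    with open(dst, "rb") as f:
        converted = f.read()

print(converted)
```

Two caveats: the conversion does not touch the CHARSET=latin1 declarations inside the dump, and if the dump already contains UTF-8 bytes, running it would double-encode them.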

Like I said, it's untested, but might work. Also, if you're willing to share your .sql dump I can try to recreate on my end. PM or email me if you're willing!

thanks,
pete

Bizzz
08-31-2007, 07:48 AM
Thank you for the suggestion, Pete, but I am not allowed to share my office wiki data externally.
Tonight I'll try your solution with iconv and keep you informed of the result.