LINUX.ORG.RU
решено ФорумAdmin

mysql, utf8 и кириллица


0

1

Мне стыдно, стыдно, стыдно, стыдно...

Привет.

Почему-то неправильно сортируется кириллица в mysql.

$ locale
LANG=ru_RU.utf8
LC_CTYPE="ru_RU.utf8"
LC_NUMERIC="ru_RU.utf8"
LC_TIME="ru_RU.utf8"
LC_COLLATE="ru_RU.utf8"
LC_MONETARY="ru_RU.utf8"
LC_MESSAGES="ru_RU.utf8"
LC_PAPER="ru_RU.utf8"
LC_NAME="ru_RU.utf8"
LC_ADDRESS="ru_RU.utf8"
LC_TELEPHONE="ru_RU.utf8"
LC_MEASUREMENT="ru_RU.utf8"
LC_IDENTIFICATION="ru_RU.utf8"
LC_ALL=

$ mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 59
Server version: 5.1.41-3ubuntu12.10 (Ubuntu)

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database `sample` character set utf8;
Query OK, 1 row affected (0.01 sec)

mysql> show create database `sample`;
+----------+-----------------------------------------------------------------+
| Database | Create Database                                                 |
+----------+-----------------------------------------------------------------+
| sample   | CREATE DATABASE `sample` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+----------+-----------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> use sample;
Database changed

mysql> create table `example` (`id` INT NOT NULL AUTO_INCREMENT, `name` VARCHAR(255), PRIMARY KEY(`id`)) DEFAULT CHARACTER SET=utf8 COLLATE=`utf8_unicode_ci`;
Query OK, 0 rows affected (0.09 sec)

mysql> insert into `example` values ('', 'А');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Б');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'В');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Г');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Д');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Е');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Ё');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Ж');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'З');
Query OK, 1 row affected, 1 warning (0.00 sec)


mysql> select * from example order by `name` asc;
+----+------+
| id | name |
+----+------+
|  1 | А   |
|  7 | Ё   |
|  8 | Ж   |
|  9 | З   |
|  2 | Б   |
|  3 | В   |
|  4 | Г   |
|  5 | Д   |
|  6 | Е   |
+----+------+
9 rows in set (0.00 sec)

mysql> select * from example order by `name` desc;
+----+------+
| id | name |
+----+------+
|  6 | Е   |
|  5 | Д   |
|  4 | Г   |
|  3 | В   |
|  2 | Б   |
|  9 | З   |
|  8 | Ж   |
|  1 | А   |
|  7 | Ё   |
+----+------+
9 rows in set (0.00 sec)

Почему так? Не то сопоставление? А какое будет верным для кириллицы?

mysql> show collation where `charset`='utf8';
+--------------------+---------+-----+---------+----------+---------+
| Collation          | Charset | Id  | Default | Compiled | Sortlen |
+--------------------+---------+-----+---------+----------+---------+
| utf8_general_ci    | utf8    |  33 | Yes     | Yes      |       1 |
| utf8_bin           | utf8    |  83 |         | Yes      |       1 |
| utf8_unicode_ci    | utf8    | 192 |         | Yes      |       8 |
| utf8_icelandic_ci  | utf8    | 193 |         | Yes      |       8 |
| utf8_latvian_ci    | utf8    | 194 |         | Yes      |       8 |
| utf8_romanian_ci   | utf8    | 195 |         | Yes      |       8 |
| utf8_slovenian_ci  | utf8    | 196 |         | Yes      |       8 |
| utf8_polish_ci     | utf8    | 197 |         | Yes      |       8 |
| utf8_estonian_ci   | utf8    | 198 |         | Yes      |       8 |
| utf8_spanish_ci    | utf8    | 199 |         | Yes      |       8 |
| utf8_swedish_ci    | utf8    | 200 |         | Yes      |       8 |
| utf8_turkish_ci    | utf8    | 201 |         | Yes      |       8 |
| utf8_czech_ci      | utf8    | 202 |         | Yes      |       8 |
| utf8_danish_ci     | utf8    | 203 |         | Yes      |       8 |
| utf8_lithuanian_ci | utf8    | 204 |         | Yes      |       8 |
| utf8_slovak_ci     | utf8    | 205 |         | Yes      |       8 |
| utf8_spanish2_ci   | utf8    | 206 |         | Yes      |       8 |
| utf8_roman_ci      | utf8    | 207 |         | Yes      |       8 |
| utf8_persian_ci    | utf8    | 208 |         | Yes      |       8 |
| utf8_esperanto_ci  | utf8    | 209 |         | Yes      |       8 |
| utf8_hungarian_ci  | utf8    | 210 |         | Yes      |       8 |
+--------------------+---------+-----+---------+----------+---------+
21 rows in set (0.00 sec)
Ответ на: комментарий от anonymous
mysql> alter table `example` collate utf8_general_ci;
Query OK, 9 rows affected (0.07 sec)
Records: 9  Duplicates: 0  Warnings: 0

mysql> select * from example order by `name` asc;
+----+------+
| id | name |
+----+------+
|  1 | А   |
|  7 | Ё   |
|  8 | Ж   |
|  9 | З   |
|  2 | Б   |
|  3 | В   |
|  4 | Г   |
|  5 | Д   |
|  6 | Е   |
+----+------+
9 rows in set (0.00 sec)

Frakhtan-teh ★★
() автор топика

чини свой мускул:

pinkbyte@oas1 ~ $ mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.56-log Gentoo Linux mysql-5.1.56

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database test;
Query OK, 1 row affected (0.13 sec)

mysql> use test;
Database changed
mysql> create table `example` (`id` INT NOT NULL AUTO_INCREMENT, `name` VARCHAR(255), PRIMARY KEY(`id`)) DEFAULT CHARACTER SET=utf8 COLLATE=`utf8_unicode_ci`;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into `example` values ('', 'А');
Query OK, 1 row affected, 1 warning (0.02 sec)

mysql> insert into `example` values ('', 'Б');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'В');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Г');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Д');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Е');
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> insert into `example` values ('', 'Ё');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'Ж');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into `example` values ('', 'З');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> select * from example order by `name` asc;
+----+------+
| id | name |
+----+------+
|  1 | А    |
|  2 | Б    |
|  3 | В    |
|  4 | Г    |
|  5 | Д    |
|  6 | Е    |
|  7 | Ё    |
|  8 | Ж    |
|  9 | З    |
+----+------+
9 rows in set (0.00 sec)

mysql>

Хотя Ё обрабатывает у меня тоже не совсем корректно :-)

mysql> select * from example order by `name` desc;
+----+------+
| id | name |
+----+------+
|  9 | З    |
|  8 | Ж    |
|  6 | Е    |
|  7 | Ё    |
|  5 | Д    |
|  4 | Г    |
|  3 | В    |
|  2 | Б    |
|  1 | А    |
+----+------+
9 rows in set (0.00 sec)

Дистрибутив - Gentoo, конфиги дефолтные, версия мускула - в заголовке

Pinkbyte ★★★★★
()
Ответ на: комментарий от Pinkbyte

За намек спасибо, нашел в чем дело.

mysql> show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.02 sec)

Привел к такому виду и стало нормально сортироваться:

mysql> show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

Frakhtan-teh ★★
() автор топика
Вы не можете добавлять комментарии в эту тему. Тема перемещена в архив.