Column Character Set and Collation
Every "character" column (that is, a column of type CHAR, VARCHAR, or TEXT) has a column character set and a column collation. Column definition syntax for CREATE TABLE and ALTER TABLE has optional clauses for specifying the column character set and collation:
col_name{CHAR | VARCHAR | TEXT} (col_length) [CHARACTER SETcharset_name] [COLLATEcollation_name]
These clauses can also be used for ENUM and SET columns:
col_name{ENUM | SET} (val_list) [CHARACTER SETcharset_name] [COLLATEcollation_name]
Examples:
CREATE TABLE t1 ( col1 VARCHAR(5) CHARACTER SET latin1 COLLATE latin1_german1_ci ); ALTER TABLE t1 MODIFY col1 VARCHAR(5) CHARACTER SET latin1 COLLATE latin1_swedish_ci;
MySQL chooses the column character set and collation in the following manner:
- If both
CHARACTER SETandXCOLLATEare specified, character setYXand collationYare used.CREATE TABLE t1 ( col1 CHAR(10) CHARACTER SET utf8 COLLATE utf8_unicode_ci ) CHARACTER SET latin1 COLLATE latin1_bin;
The character set and collation are specified for the column, so they are used. The column has character set
utf8and collationutf8_unicode_ci. - If
CHARACTER SETis specified withoutXCOLLATE, character setXand its default collation are used.CREATE TABLE t1 ( col1 CHAR(10) CHARACTER SET utf8 ) CHARACTER SET latin1 COLLATE latin1_bin;
The character set is specified for the column, but the collation is not. The column has character set
utf8and the default collation forutf8, which isutf8_general_ci. To see the default collation for each character set, use theSHOW COLLATIONstatement. - If
COLLATEis specified withoutYCHARACTER SET, the character set associated withYand collationYare used.CREATE TABLE t1 ( col1 CHAR(10) COLLATE utf8_polish_ci ) CHARACTER SET latin1 COLLATE latin1_bin;
The collation is specified for the column, but the character set is not. The column has collation
utf8_polish_ciand the character set is the one associated with the collation, which isutf8. - Otherwise, the table character set and collation are used.
CREATE TABLE t1 ( col1 CHAR(10) ) CHARACTER SET latin1 COLLATE latin1_bin;
Neither the character set nor collation are specified for the column, so the table defaults are used. The column has character set
latin1and collationlatin1_bin.
The CHARACTER SET and COLLATE clauses are standard SQL.
If you use ALTER TABLE to convert a column from one character set to another, MariaDB attempts to map the data values, but if the character sets are incompatible, there may be data loss.