Using Character Sets and Unicode
All strings sent from the JDBC driver to the server are converted automatically from native Java Unicode form to the client character encoding. This includes all queries sent using Statement.execute(), Statement.executeUpdate(), and Statement.executeQuery(), as well as all PreparedStatement and CallableStatement parameters, with the exclusion of parameters set using setBytes(), setBinaryStream(), setAsciiStream(), setUnicodeStream(), and setBlob().
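To make the conversion concrete, the following minimal sketch encodes the same Java string in two client encodings using only the standard library. It illustrates the idea of converting Java Unicode strings to a target encoding; it is not the driver's internal code, and the class and method names (EncodingDemo, encode) are hypothetical.

```java
import java.nio.charset.Charset;

public class EncodingDemo {
    // Encode a Java (Unicode) string into the bytes of the named encoding.
    // The name is a Java-style encoding name such as "Cp1252" or "UTF-8".
    static byte[] encode(String text, String javaEncodingName) {
        return text.getBytes(Charset.forName(javaEncodingName));
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", with one non-ASCII character

        // In Cp1252 the accented letter is a single byte (0xE9).
        System.out.println(encode(text, "Cp1252").length); // prints 4

        // In UTF-8 the same letter takes two bytes (0xC3 0xA9).
        System.out.println(encode(text, "UTF-8").length);  // prints 5
    }
}
```

The same string therefore produces different bytes depending on the client encoding, which is why the encoding must be agreed between driver and server.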
Prior to MariaDB Server 4.1, Connector/J supported a single character encoding per connection, which could either be detected automatically from the server configuration or be configured by the user through the useUnicode and characterEncoding properties.
Starting with MariaDB Server 4.1, Connector/J supports a single character encoding between client and server, and any number of character encodings for data returned by the server to the client in ResultSets.
The character encoding between client and server is automatically detected upon connection. The encoding used by the driver is specified on the server using the character_set system variable for server versions older than 4.1.0 and the character_set_server system variable for server versions 4.1.0 and newer. For more information, see "Server Character Set and Collation".
To override the automatically detected encoding on the client side, use the characterEncoding property in the URL used to connect to the server.
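As a sketch of what such a URL might look like, the helper below appends the useUnicode and characterEncoding properties to a Connector/J URL. The host, port, and database name are placeholders, and the class and method names (ConnectionUrl, buildUrl) are hypothetical.

```java
public class ConnectionUrl {
    // Build a JDBC URL that overrides the detected encoding.
    // javaEncodingName must be a Java-style name, for example "Cp1252".
    static String buildUrl(String host, int port, String database, String javaEncodingName) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=" + javaEncodingName;
    }

    public static void main(String[] args) {
        // Placeholder connection details, for illustration only.
        String url = buildUrl("localhost", 3306, "test", "Cp1252");
        System.out.println(url);
        // The URL would then be passed to java.sql.DriverManager.getConnection(url, user, password).
    }
}
```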
When specifying character encodings on the client side, use Java-style names. The following table lists Java-style names for MariaDB character sets:
MySQL to Java Encoding Name Translations.
MySQL Character Set Name | Java-Style Character Encoding Name |
---|---|
ascii | US-ASCII |
big5 | Big5 |
gbk | GBK |
sjis | SJIS (or Cp932 or MS932 for MariaDB Server < 4.1.11) |
cp932 | Cp932 or MS932 (MySQL Server > 4.1.11) |
gb2312 | EUC_CN |
ujis | EUC_JP |
euckr | EUC_KR |
latin1 | Cp1252 |
latin2 | ISO8859_2 |
greek | ISO8859_7 |
hebrew | ISO8859_8 |
cp866 | Cp866 |
tis620 | TIS620 |
cp1250 | Cp1250 |
cp1251 | Cp1251 |
cp1257 | Cp1257 |
macroman | MacRoman |
macce | MacCentralEurope |
utf8 | UTF-8 |
ucs2 | UnicodeBig |
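When the server character set name is known at run time, the table above can be expressed as a simple lookup. The following is a minimal sketch, not part of the driver API; the class and method names (CharsetNames, toJavaName) are hypothetical, and for sjis and cp932 it uses only the first name listed in the table.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class CharsetNames {
    // Server character set name -> Java-style encoding name, as listed in the table.
    private static final Map<String, String> SERVER_TO_JAVA = new LinkedHashMap<>();
    static {
        SERVER_TO_JAVA.put("ascii", "US-ASCII");
        SERVER_TO_JAVA.put("big5", "Big5");
        SERVER_TO_JAVA.put("gbk", "GBK");
        SERVER_TO_JAVA.put("sjis", "SJIS");
        SERVER_TO_JAVA.put("cp932", "Cp932");
        SERVER_TO_JAVA.put("gb2312", "EUC_CN");
        SERVER_TO_JAVA.put("ujis", "EUC_JP");
        SERVER_TO_JAVA.put("euckr", "EUC_KR");
        SERVER_TO_JAVA.put("latin1", "Cp1252");
        SERVER_TO_JAVA.put("latin2", "ISO8859_2");
        SERVER_TO_JAVA.put("greek", "ISO8859_7");
        SERVER_TO_JAVA.put("hebrew", "ISO8859_8");
        SERVER_TO_JAVA.put("cp866", "Cp866");
        SERVER_TO_JAVA.put("tis620", "TIS620");
        SERVER_TO_JAVA.put("cp1250", "Cp1250");
        SERVER_TO_JAVA.put("cp1251", "Cp1251");
        SERVER_TO_JAVA.put("cp1257", "Cp1257");
        SERVER_TO_JAVA.put("macroman", "MacRoman");
        SERVER_TO_JAVA.put("macce", "MacCentralEurope");
        SERVER_TO_JAVA.put("utf8", "UTF-8");
        SERVER_TO_JAVA.put("ucs2", "UnicodeBig");
    }

    // Return the Java-style name for a server character set, ignoring case.
    static String toJavaName(String serverCharset) {
        String javaName = SERVER_TO_JAVA.get(serverCharset.toLowerCase(Locale.ROOT));
        if (javaName == null) {
            throw new IllegalArgumentException("No Java-style name listed for: " + serverCharset);
        }
        return javaName;
    }

    public static void main(String[] args) {
        System.out.println(toJavaName("latin1")); // prints Cp1252
        System.out.println(toJavaName("utf8"));   // prints UTF-8
    }
}
```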
Do not issue the query SET NAMES with Connector/J. The driver will not detect that the character set has changed and will continue to use the character set detected during the initial connection setup.
To allow multiple character sets to be sent from the client, use the UTF-8 encoding, either by configuring utf8 as the default server character set, or by configuring the JDBC driver to use UTF-8 through the characterEncoding property.
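The driver-side option can also be supplied through a java.util.Properties object instead of the URL. The sketch below assumes the standard "user" and "password" connection property keys alongside the useUnicode and characterEncoding properties named in this section; the class and method names (Utf8Config, utf8Properties) and the credentials are hypothetical.

```java
import java.util.Properties;

public class Utf8Config {
    // Build connection properties that ask the driver to use UTF-8.
    static Properties utf8Properties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("useUnicode", "true");
        props.setProperty("characterEncoding", "UTF-8");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        Properties props = utf8Properties("appuser", "secret");
        System.out.println(props.getProperty("characterEncoding")); // prints UTF-8
        // The properties would then be passed to java.sql.DriverManager.getConnection(url, props).
    }
}
```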