Functions and Operators - MariaDB - Databases - Software - Computers


Functions and Operators
Prev Next

Functions and Operators

Table of Contents

Function and Operator Reference
Type Conversion in Expression Evaluation
Operators
Operator Precedence
Comparison Functions and Operators
Logical Operators
Assignment Operators
Control Flow Functions
String Functions
String Comparison Functions
Regular Expressions
Numeric Functions and Operators
Arithmetic Operators
Mathematical Functions
Date and Time Functions
What Calendar Is Used By MySQL?
Full-Text Search Functions
Natural Language Full-Text Searches
Boolean Full-Text Searches
Full-Text Searches with Query Expansion
Full-Text Stopwords
Full-Text Restrictions
Fine-Tuning MariaDB Full-Text Search
Adding a Collation for Full-Text Indexing
Cast Functions and Operators
XML Functions
Bit Functions
Encryption and Compression Functions
Information Functions
Miscellaneous Functions
Functions and Modifiers for Use with GROUP BY Clauses
GROUP BY (Aggregate) Functions
GROUP BY Modifiers
GROUP BY and HAVING with Hidden Columns
Spatial Extensions
Introduction to MariaDB Spatial Support
The OpenGIS Geometry Model
Supported Spatial Data Formats
Creating a Spatially Enabled MariaDB Database
Spatial Analysis Functions
Optimizing Spatial Analysis
MySQL Conformance and Compatibility
Precision Math
Types of Numeric Values
DECIMAL Data Type Changes
Expression Handling
Rounding Behavior
Precision Math Examples

Expressions can be used at several points in SQL statements, such as in the ORDER BY or HAVING clauses of SELECT statements, in the WHERE clause of a SELECT, DELETE, or UPDATE statement, or in SET statements. Expressions can be written using literal values, column values, NULL, built-in functions, stored functions, user-defined functions, and operators. This chapter describes the functions and operators that are permitted for writing expressions in MySQL. Instructions for writing stored functions and user-defined functions are given in , "Using Stored Routines (Procedures and Functions)", and , "Adding New Functions to MySQL". See , "Function Name Parsing and Resolution", for the rules describing how the server interprets references to different kinds of functions.

An expression that contains NULL always produces a NULL value unless otherwise indicated in the documentation for a particular function or operator.Note

By default, there must be no whitespace between a function name and the parenthesis following it. This helps the MariaDB parser distinguish between function calls and references to tables or columns that happen to have the same name as a function. However, spaces around function arguments are permitted.

You can tell the MariaDB server to accept spaces after function names by starting it with the --sql-mode=IGNORE_SPACE option. (See , "Server SQL Modes".) Individual client programs can request this behavior by using the CLIENT_IGNORE_SPACE option for mysql_real_connect(). In either case, all function names become reserved words.

For the sake of brevity, most examples in this chapter display the output from the mysql program in abbreviated form. Rather than showing examples in this format:

mysql> SELECT MOD(29,9);
+-----------+
| mod(29,9) |
+-----------+
| 2 |
+-----------+
1 rows in set (0.00 sec)

This format is used instead:

mysql> SELECT MOD(29,9);
 -> 2

Function and Operator Reference

Table 11.1. Functions/Operators

Name Description
ABS() Return the absolute value
ACOS() Return the arc cosine
ADDDATE() Add time values (intervals) to a date value
ADDTIME() Add time
AES_DECRYPT() Decrypt using AES
AES_ENCRYPT() Encrypt using AES
AND, && Logical AND
ASCII() Return numeric value of left-most character
ASIN() Return the arc sine
= Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement)
:= Assign a value
ATAN2(), ATAN() Return the arc tangent of the two arguments
ATAN() Return the arc tangent
AVG() Return the average value of the argument
BENCHMARK() Repeatedly execute an expression
BETWEEN ... AND ... Check whether a value is within a range of values
BIN() Return a string representation of the argument
BINARY Cast a string to a binary string
BIT_AND() Return bitwise and
BIT_COUNT() Return the number of bits that are set
BIT_LENGTH() Return length of argument in bits
BIT_OR() Return bitwise or
BIT_XOR() Return bitwise xor
& Bitwise AND
~ Invert bits
| Bitwise OR
^ Bitwise XOR
CASE Case operator
CAST() Cast a value as a certain type
CEIL() Return the smallest integer value not less than the argument
CEILING() Return the smallest integer value not less than the argument
CHAR_LENGTH() Return number of characters in argument
CHAR() Return the character for each integer passed
CHARACTER_LENGTH() A synonym for CHAR_LENGTH()
CHARSET() Return the character set of the argument
COALESCE() Return the first non-NULL argument
COERCIBILITY() Return the collation coercibility value of the string argument
COLLATION() Return the collation of the string argument
COMPRESS() Return result as a binary string
CONCAT_WS() Return concatenate with separator
CONCAT() Return concatenated string
CONNECTION_ID() Return the connection ID (thread ID) for the connection
CONV() Convert numbers between different number bases
CONVERT_TZ() Convert from one timezone to another
CONVERT() Cast a value as a certain type
COS() Return the cosine
COT() Return the cotangent
COUNT(DISTINCT) Return the count of a number of different values
COUNT() Return a count of the number of rows returned
CRC32() Compute a cyclic redundancy check value
CURDATE() Return the current date
CURRENT_DATE(), CURRENT_DATE Synonyms for CURDATE()
CURRENT_TIME(), CURRENT_TIME Synonyms for CURTIME()
CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP Synonyms for NOW()
CURRENT_USER(), CURRENT_USER The authenticated user name and host name
CURTIME() Return the current time
DATABASE() Return the default (current) database name
DATE_ADD() Add time values (intervals) to a date value
DATE_FORMAT() Format date as specified
DATE_SUB() Subtract a time value (interval) from a date
DATE() Extract the date part of a date or datetime expression
DATEDIFF() Subtract two dates
DAY() Synonym for DAYOFMONTH()
DAYNAME() Return the name of the weekday
DAYOFMONTH() Return the day of the month (0-31)
DAYOFWEEK() Return the weekday index of the argument
DAYOFYEAR() Return the day of the year (1-366)
DECODE() Decodes a string encrypted using ENCODE()
DEFAULT() Return the default value for a table column
DEGREES() Convert radians to degrees
DES_DECRYPT() Decrypt a string
DES_ENCRYPT() Encrypt a string
DIV Integer division
/ Division operator
ELT() Return string at index number
ENCODE() Encode a string
ENCRYPT() Encrypt a string
<=> NULL-safe equal to operator
= Equal operator
EXP() Raise to the power of
EXPORT_SET() Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string
EXTRACT() Extract part of a date
ExtractValue() Extracts a value from an XML string using XPath notation
FIELD() Return the index (position) of the first argument in the subsequent arguments
FIND_IN_SET() Return the index position of the first argument within the second argument
FLOOR() Return the largest integer value not greater than the argument
FORMAT() Return a number formatted to specified number of decimal places
FOUND_ROWS() For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause
FROM_BASE64() Decode to a base-64 string and return result
FROM_DAYS() Convert a day number to a date
FROM_UNIXTIME() Format UNIX timestamp as a date
GET_FORMAT() Return a date format string
GET_LOCK() Get a named lock
>= Greater than or equal operator
> Greater than operator
GREATEST() Return the largest argument
GROUP_CONCAT() Return a concatenated string
HEX() Return a hexadecimal representation of a decimal or string value
HOUR() Extract the hour
IF() If/else construct
IFNULL() Null if/else construct
IN() Check whether a value is within a set of values
INET_ATON() Return the numeric value of an IP address
INET_NTOA() Return the IP address from a numeric value
INET6_ATON() Return the numeric value of an IPv6 address
INET6_NTOA() Return the IPv6 address from a numeric value
INSERT() Insert a substring at the specified position up to the specified number of characters
INSTR() Return the index of the first occurrence of substring
INTERVAL() Return the index of the argument that is less than the first argument
IS_FREE_LOCK() Checks whether the named lock is free
IS_IPV4_COMPAT() Return true if argument is an IPv4-compatible address
IS_IPV4_MAPPED() Return true if argument is an IPv4-mapped address
IS_IPV4() Return true if argument is an IPv4 address
IS_IPV6() Return true if argument is an IPv6 address
IS NOT NULL NOT NULL value test
IS NOT Test a value against a boolean
IS NULL NULL value test
IS_USED_LOCK() Checks whether the named lock is in use. Return connection identifier if true.
IS Test a value against a boolean
ISNULL() Test whether the argument is NULL
LAST_DAY Return the last day of the month for the argument
LAST_INSERT_ID() Value of the AUTOINCREMENT column for the last INSERT
LCASE() Synonym for LOWER()
LEAST() Return the smallest argument
<< Left shift
LEFT() Return the leftmost number of characters as specified
LENGTH() Return the length of a string in bytes
<= Less than or equal operator
< Less than operator
LIKE Simple pattern matching
LN() Return the natural logarithm of the argument
LOAD_FILE() Load the named file
LOCALTIME(), LOCALTIME Synonym for NOW()
LOCALTIMESTAMP, LOCALTIMESTAMP() Synonym for NOW()
LOCATE() Return the position of the first occurrence of substring
LOG10() Return the base-10 logarithm of the argument
LOG2() Return the base-2 logarithm of the argument
LOG() Return the natural logarithm of the first argument
LOWER() Return the argument in lowercase
LPAD() Return the string argument, left-padded with the specified string
LTRIM() Remove leading spaces
MAKE_SET() Return a set of comma-separated strings that have the corresponding bit in bits set
MAKEDATE() Create a date from the year and day of year
MAKETIME MAKETIME()
MASTER_POS_WAIT() Block until the slave has read and applied all updates up to the specified position
MATCH Perform full-text search
MAX() Return the maximum value
MD5() Calculate MD5 checksum
MICROSECOND() Return the microseconds from argument
MID() Return a substring starting from the specified position
MIN() Return the minimum value
- Minus operator
MINUTE() Return the minute from the argument
MOD() Return the remainder
% or MOD Modulo operator
MONTH() Return the month from the date passed
MONTHNAME() Return the name of the month
NAME_CONST() Causes the column to have the given name
NOT BETWEEN ... AND ... Check whether a value is not within a range of values
!=, <> Not equal operator
NOT IN() Check whether a value is not within a set of values
NOT LIKE Negation of simple pattern matching
NOT REGEXP Negation of REGEXP
NOT, ! Negates value
NOW() Return the current date and time
NULLIF() Return NULL if expr1 = expr2
OCT() Return an octal representation of a decimal number
OCTET_LENGTH() A synonym for LENGTH()
OLD_PASSWORD() Return the value of the pre-4.1 implementation of PASSWORD
||, OR Logical OR
ORD() Return character code for leftmost character of the argument
PASSWORD() Calculate and return a password string
PERIOD_ADD() Add a period to a year-month
PERIOD_DIFF() Return the number of months between periods
PI() Return the value of pi
+ Addition operator
POSITION() A synonym for LOCATE()
POW() Return the argument raised to the specified power
POWER() Return the argument raised to the specified power
PROCEDURE ANALYSE() Analyze the results of a query
QUARTER() Return the quarter from a date argument
QUOTE() Escape the argument for use in an SQL statement
RADIANS() Return argument converted to radians
RAND() Return a random floating-point value
REGEXP Pattern matching using regular expressions
RELEASE_LOCK() Releases the named lock
REPEAT() Repeat a string the specified number of times
REPLACE() Replace occurrences of a specified string
REVERSE() Reverse the characters in a string
>> Right shift
RIGHT() Return the specified rightmost number of characters
RLIKE Synonym for REGEXP
ROUND() Round the argument
ROW_COUNT() The number of rows updated
RPAD() Append string the specified number of times
RTRIM() Remove trailing spaces
SCHEMA() A synonym for DATABASE()
SEC_TO_TIME() Converts seconds to 'HH:MM:SS' format
SECOND() Return the second (0-59)
SESSION_USER() Synonym for USER()
SHA1(), SHA() Calculate an SHA-1 160-bit checksum
SHA2() Calculate an SHA-2 checksum
SIGN() Return the sign of the argument
SIN() Return the sine of the argument
SLEEP() Sleep for a number of seconds
SOUNDEX() Return a soundex string
SOUNDS LIKE Compare sounds
SPACE() Return a string of the specified number of spaces
SQRT() Return the square root of the argument
STD() Return the population standard deviation
STDDEV_POP() Return the population standard deviation
STDDEV_SAMP() Return the sample standard deviation
STDDEV() Return the population standard deviation
STR_TO_DATE() Convert a string to a date
STRCMP() Compare two strings
SUBDATE() A synonym for DATE_SUB() when invoked with three arguments
SUBSTR() Return the substring as specified
SUBSTRING_INDEX() Return a substring from a string before the specified number of occurrences of the delimiter
SUBSTRING() Return the substring as specified
SUBTIME() Subtract times
SUM() Return the sum
SYSDATE() Return the time at which the function executes
SYSTEM_USER() Synonym for USER()
TAN() Return the tangent of the argument
TIME_FORMAT() Format as time
TIME_TO_SEC() Return the argument converted to seconds
TIME() Extract the time portion of the expression passed
TIMEDIFF() Subtract time
* Multiplication operator
TIMESTAMP() With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments
TIMESTAMPADD() Add an interval to a datetime expression
TIMESTAMPDIFF() Subtract an interval from a datetime expression
TO_BASE64() Return the argument converted to a base-64 string
TO_DAYS() Return the date argument converted to days
TO_SECONDS() Return the date or datetime argument converted to seconds since Year 0
TRIM() Remove leading and trailing spaces
TRUNCATE() Truncate to specified number of decimal places
UCASE() Synonym for UPPER()
- Change the sign of the argument
UNCOMPRESS() Uncompress a string compressed
UNCOMPRESSED_LENGTH() Return the length of a string before compression
UNHEX() Convert each pair of hexadecimal digits to a character
UNIX_TIMESTAMP() Return a UNIX timestamp
UpdateXML() Return replaced XML fragment
UPPER() Convert to uppercase
USER() The user name and host name provided by the client
UTC_DATE() Return the current UTC date
UTC_TIME() Return the current UTC time
UTC_TIMESTAMP() Return the current UTC date and time
UUID_SHORT() Return an integer-valued universal identifier
UUID() Return a Universal Unique Identifier (UUID)
VALUES() Defines the values to be used during an INSERT
VAR_POP() Return the population standard variance
VAR_SAMP() Return the sample variance
VARIANCE() Return the population standard variance
VERSION() Returns a string that indicates the MariaDB server version
WEEK() Return the week number
WEEKDAY() Return the weekday index
WEEKOFYEAR() Return the calendar week of the date (0-53)
WEIGHT_STRING() Return the weight string for a string
XOR Logical XOR
YEAR() Return the year
YEARWEEK() Return the year and week

Type Conversion in Expression Evaluation

When an operator is used with operands of different types, type conversion occurs to make the operands compatible. Some conversions occur implicitly. For example, MariaDB automatically converts numbers to strings as necessary, and vice versa.

mysql> SELECT 1+'1';
 -> 2
mysql> SELECT CONCAT(2,' test');
 -> '2 test'

It is also possible to convert a number to a string explicitly using the CAST() function. Conversion occurs implicitly with the CONCAT() function because it expects string arguments.

mysql> SELECT 38.8, CAST(38.8 AS CHAR);
 -> 38.8, '38.8'
mysql> SELECT 38.8, CONCAT(38.8);
 -> 38.8, '38.8'

See later in this section for information about the character set of implicit number-to-string conversions.

The following rules describe how conversion occurs for comparison operations:

For information about conversion of values from one temporal type to another, see , "Conversion Between Date and Time Types".

The following examples illustrate conversion of strings to numbers for comparison operations:

mysql> SELECT 1 > '6x';
 -> 0
mysql> SELECT 7 > '6x';
 -> 1
mysql> SELECT 0 > 'x6';
 -> 0
mysql> SELECT 0 = 'x6';
 -> 1

For comparisons of a string column with a number, MariaDB cannot use an index on the column to look up the value quickly. If str_col is an indexed string column, the index cannot be used when performing the lookup in the following statement:

SELECT * FROM tbl_name WHERE str_col=1;

The reason for this is that there are many different strings that may convert to the value 1, such as '1', ' 1', or '1a'.

Comparisons that use floating-point numbers (or values that are converted to floating-point numbers) are approximate because such numbers are inexact. This might lead to results that appear inconsistent:

mysql> SELECT '18015376320243458' = 18015376320243458;
 -> 1
mysql> SELECT '18015376320243459' = 18015376320243459;
 -> 0

Such results can occur because the values are converted to floating-point numbers, which have only 53 bits of precision and are subject to rounding:

mysql> SELECT '18015376320243459'+0.0;
 -> 1.8015376320243e+16

Furthermore, the conversion from string to floating-point and from integer to floating-point do not necessarily occur the same way. The integer may be converted to floating-point by the CPU, whereas the string is converted digit by digit in an operation that involves floating-point multiplications.

The results shown will vary on different systems, and can be affected by factors such as computer architecture or the compiler version or optimization level. One way to avoid such problems is to use CAST() so that a value will not be converted implicitly to a float-point number:

mysql> SELECT CAST('18015376320243459' AS UNSIGNED) = 18015376320243459;
 -> 1

For more information about floating-point comparisons, see "Problems with Floating-Point Values".

In MariaDB 5.6, the server includes dtoa, a conversion library that provides the basis for improved conversion between string or DECIMAL values and approximate-value (FLOAT/DOUBLE) numbers:

Because the conversions produced by this library differ in some cases from previous results, the potential exists for incompatibilities in applications that rely on previous results. For example, applications that depend on a specific exact result from previous conversions might need adjustment to accommodate additional precision.

The dtoa library provides conversions with the following properties. D represents a value with a DECIMAL or string representation, and F represents a floating-point number in native binary (IEEE) format.

These properties imply that F -> D -> F conversions are lossless unless F is -inf, +inf, or NaN. The latter values are not supported because the SQL standard defines them as invalid values for FLOAT or DOUBLE.

For D -> F -> D conversions, a sufficient condition for losslessness is that D uses 15 or fewer digits of precision, is not a denormal value, -inf, +inf, or NaN. In some cases, the conversion is lossless even if D has more than 15 digits of precision, but this is not always the case.

In MariaDB 5.6, implicit conversion of a numeric or temporal value to string produces a value that has a character set and collation determined by the character_set_connection and collation_connection system variables. (These variables commonly are set with SET NAMES. For information about connection character sets, see , "Connection Character Sets and Collations".)

This means that such a conversion results in a character (nonbinary) string (a CHAR, VARCHAR, or LONGTEXT value), except in the case that the connection character set is set to binary. In that case, the conversion result is a binary string (a BINARY, VARBINARY, or LONGBLOB value).

Operators

Operator Precedence
Comparison Functions and Operators
Logical Operators
Assignment Operators

Table 11.2. Operators

Name Description
AND, && Logical AND
= Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement)
:= Assign a value
BETWEEN ... AND ... Check whether a value is within a range of values
BINARY Cast a string to a binary string
& Bitwise AND
~ Invert bits
| Bitwise OR
^ Bitwise XOR
CASE Case operator
DIV Integer division
/ Division operator
<=> NULL-safe equal to operator
= Equal operator
>= Greater than or equal operator
> Greater than operator
IS NOT NULL NOT NULL value test
IS NOT Test a value against a boolean
IS NULL NULL value test
IS Test a value against a boolean
<< Left shift
<= Less than or equal operator
< Less than operator
LIKE Simple pattern matching
- Minus operator
% or MOD Modulo operator
NOT BETWEEN ... AND ... Check whether a value is not within a range of values
!=, <> Not equal operator
NOT LIKE Negation of simple pattern matching
NOT REGEXP Negation of REGEXP
NOT, ! Negates value
||, OR Logical OR
+ Addition operator
REGEXP Pattern matching using regular expressions
>> Right shift
RLIKE Synonym for REGEXP
SOUNDS LIKE Compare sounds
* Multiplication operator
- Change the sign of the argument
XOR Logical XOR

Operator Precedence

Operator precedences are shown in the following list, from highest precedence to the lowest. Operators that are shown together on a line have the same precedence.

INTERVAL BINARY, COLLATE
!
- (unary minus), ~ (unary bit inversion)
^
*, /, DIV, %, MOD
-, +
<<, >>
&
|
= (comparison), <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, IN BETWEEN, CASE, WHEN, THEN, ELSE NOT
&&, AND XOR
||, OR
= (assignment), :=

The precedence of = depends on whether it is used as a comparison operator (=) or as an assignment operator (=). When used as a comparison operator, it has the same precedence as <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, and IN. When used as an assignment operator, it has the same precedence as :=. , "SET Syntax", and , "User-Defined Variables", explain how MariaDB determines which interpretation of = should apply.

The meaning of some operators depends on the SQL mode:

See , "Server SQL Modes".

The precedence of operators determines the order of evaluation of terms in an expression. To override this order and group terms explicitly, use parentheses. For example:

mysql> SELECT 1+2*3;
 -> 7
mysql> SELECT (1+2)*3;
 -> 9

Comparison Functions and Operators

Table 11.3. Comparison Operators

Name Description
BETWEEN ... AND ... Check whether a value is within a range of values
COALESCE() Return the first non-NULL argument
<=> NULL-safe equal to operator
= Equal operator
>= Greater than or equal operator
> Greater than operator
GREATEST() Return the largest argument
IN() Check whether a value is within a set of values
INTERVAL() Return the index of the argument that is less than the first argument
IS NOT NULL NOT NULL value test
IS NOT Test a value against a boolean
IS NULL NULL value test
IS Test a value against a boolean
ISNULL() Test whether the argument is NULL
LEAST() Return the smallest argument
<= Less than or equal operator
< Less than operator
LIKE Simple pattern matching
NOT BETWEEN ... AND ... Check whether a value is not within a range of values
!=, <> Not equal operator
NOT IN() Check whether a value is not within a set of values
NOT LIKE Negation of simple pattern matching
STRCMP() Compare two strings

Comparison operations result in a value of 1 (TRUE), 0 (FALSE), or NULL. These operations work for both numbers and strings. Strings are automatically converted to numbers and numbers to strings as necessary.

The following relational comparison operators can be used to compare not only scalar operands, but row operands:

= > < >= <= <> !=

For examples of row comparisons, see , "Row Subqueries".

Some of the functions in this section return values other than 1 (TRUE), 0 (FALSE), or NULL. For example, LEAST() and GREATEST(). However, the value they return is based on comparison operations performed according to the rules described in , "Type Conversion in Expression Evaluation".

To convert a value to a specific type for comparison purposes, you can use the CAST() function. String values can be converted to a different character set using CONVERT(). See , "Cast Functions and Operators".

By default, string comparisons are not case sensitive and use the current character set. The default is latin1 (cp1252 West European), which also works well for English.

Logical Operators

Table 11.4. Logical Operators

Name Description
AND, && Logical AND
NOT, ! Negates value
||, OR Logical OR
XOR Logical XOR

In SQL, all logical operators evaluate to TRUE, FALSE, or NULL (UNKNOWN). In MySQL, these are implemented as 1 (TRUE), 0 (FALSE), and NULL. Most of this is common to different SQL database servers, although some servers may return any nonzero value for TRUE.

MySQL evaluates any nonzero, non-NULL value to TRUE. For example, the following statements all assess to TRUE:

mysql> SELECT 10 IS TRUE;
-> 1
mysql> SELECT -10 IS TRUE;
-> 1
mysql> SELECT 'string' IS NOT NULL;
-> 1

Assignment Operators

Table 11.5. Assignment Operators

Name Description
= Assign a value (as part of a SET statement, or as part of the SET clause in an UPDATE statement)
:= Assign a value

Control Flow Functions

Table 11.6. Flow Control Operators

Name Description
CASE Case operator
IF() If/else construct
IFNULL() Null if/else construct
NULLIF() Return NULL if expr1 = expr2

String Functions

String Comparison Functions
Regular Expressions

Table 11.7. String Operators

Name Description
ASCII() Return numeric value of left-most character
BIN() Return a string representation of the argument
BIT_LENGTH() Return length of argument in bits
CHAR_LENGTH() Return number of characters in argument
CHAR() Return the character for each integer passed
CHARACTER_LENGTH() A synonym for CHAR_LENGTH()
CONCAT_WS() Return concatenate with separator
CONCAT() Return concatenated string
ELT() Return string at index number
EXPORT_SET() Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string
FIELD() Return the index (position) of the first argument in the subsequent arguments
FIND_IN_SET() Return the index position of the first argument within the second argument
FORMAT() Return a number formatted to specified number of decimal places
FROM_BASE64() Decode to a base-64 string and return result
HEX() Return a hexadecimal representation of a decimal or string value
INSERT() Insert a substring at the specified position up to the specified number of characters
INSTR() Return the index of the first occurrence of substring
LCASE() Synonym for LOWER()
LEFT() Return the leftmost number of characters as specified
LENGTH() Return the length of a string in bytes
LIKE Simple pattern matching
LOAD_FILE() Load the named file
LOCATE() Return the position of the first occurrence of substring
LOWER() Return the argument in lowercase
LPAD() Return the string argument, left-padded with the specified string
LTRIM() Remove leading spaces
MAKE_SET() Return a set of comma-separated strings that have the corresponding bit in bits set
MATCH Perform full-text search
MID() Return a substring starting from the specified position
NOT LIKE Negation of simple pattern matching
NOT REGEXP Negation of REGEXP
OCTET_LENGTH() A synonym for LENGTH()
ORD() Return character code for leftmost character of the argument
POSITION() A synonym for LOCATE()
QUOTE() Escape the argument for use in an SQL statement
REGEXP Pattern matching using regular expressions
REPEAT() Repeat a string the specified number of times
REPLACE() Replace occurrences of a specified string
REVERSE() Reverse the characters in a string
RIGHT() Return the specified rightmost number of characters
RLIKE Synonym for REGEXP
RPAD() Append string the specified number of times
RTRIM() Remove trailing spaces
SOUNDEX() Return a soundex string
SOUNDS LIKE Compare sounds
SPACE() Return a string of the specified number of spaces
STRCMP() Compare two strings
SUBSTR() Return the substring as specified
SUBSTRING_INDEX() Return a substring from a string before the specified number of occurrences of the delimiter
SUBSTRING() Return the substring as specified
TO_BASE64() Return the argument converted to a base-64 string
TRIM() Remove leading and trailing spaces
UCASE() Synonym for UPPER()
UNHEX() Convert each pair of hexadecimal digits to a character
UPPER() Convert to uppercase
WEIGHT_STRING() Return the weight string for a string

String-valued functions return NULL if the length of the result would be greater than the value of the max_allowed_packet system variable. See , "Tuning Server Parameters".

For functions that operate on string positions, the first position is numbered 1.

For functions that take length arguments, noninteger arguments are rounded to the nearest integer.

String Comparison Functions

Table 11.8. String Comparison Operators

Name Description
LIKE Simple pattern matching
NOT LIKE Negation of simple pattern matching
STRCMP() Compare two strings

If a string function is given a binary string as an argument, the resulting string is also a binary string. A number converted to a string is treated as a binary string. This affects only comparisons.

Normally, if any expression in a string comparison is case sensitive, the comparison is performed in case-sensitive fashion.

Regular Expressions

Table 11.9. String Regular Expression Operators

Name Description
NOT REGEXP Negation of REGEXP
REGEXP Pattern matching using regular expressions
RLIKE Synonym for REGEXP

A regular expression is a powerful way of specifying a pattern for a complex search.

MySQL uses Henry Spencer's implementation of regular expressions, which is aimed at conformance with POSIX 1003.2. MariaDB uses the extended version to support pattern-matching operations performed with the REGEXP operator in SQL statements.

This section summarizes, with examples, the special characters and constructs that can be used in MariaDB for REGEXP operations. It does not contain all the details that can be found in Henry Spencer's regex(7) manual page. That manual page is included in MariaDB source distributions, in the regex.7 file under the regex directory. See also , "Pattern Matching".

A regular expression describes a set of strings. The simplest regular expression is one that has no special characters in it. For example, the regular expression hello matches hello and nothing else.

Nontrivial regular expressions use certain special constructs so that they can match more than one string. For example, the regular expression hello|word matches either the string hello or the string word.

As a more complex example, the regular expression B[an]*s matches any of the strings Bananas, Baaaaas, Bs, and any other string starting with a B, ending with an s, and containing any number of a or n characters in between.

A regular expression for the REGEXP operator may use any of the following special characters and constructs:

To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. The MariaDB parser interprets one of the backslashes, and the regular expression library interprets the other. For example, to match the string 1+2 that contains the special + character, only the last of the following regular expressions is the correct one:

mysql> SELECT '1+2' REGEXP '1+2'; -> 0
mysql> SELECT '1+2' REGEXP '1\+2'; -> 0
mysql> SELECT '1+2' REGEXP '1\\+2'; -> 1

Numeric Functions and Operators

Arithmetic Operators
Mathematical Functions

Table 11.10. Numeric Functions and Operators

Name Description
ABS() Return the absolute value
ACOS() Return the arc cosine
ASIN() Return the arc sine
ATAN2(), ATAN() Return the arc tangent of the two arguments
ATAN() Return the arc tangent
CEIL() Return the smallest integer value not less than the argument
CEILING() Return the smallest integer value not less than the argument
CONV() Convert numbers between different number bases
COS() Return the cosine
COT() Return the cotangent
CRC32() Compute a cyclic redundancy check value
DEGREES() Convert radians to degrees
DIV Integer division
/ Division operator
EXP() Raise to the power of
FLOOR() Return the largest integer value not greater than the argument
LN() Return the natural logarithm of the argument
LOG10() Return the base-10 logarithm of the argument
LOG2() Return the base-2 logarithm of the argument
LOG() Return the natural logarithm of the first argument
- Minus operator
MOD() Return the remainder
% or MOD Modulo operator
OCT() Return an octal representation of a decimal number
PI() Return the value of pi
+ Addition operator
POW() Return the argument raised to the specified power
POWER() Return the argument raised to the specified power
RADIANS() Return argument converted to radians
RAND() Return a random floating-point value
ROUND() Round the argument
SIGN() Return the sign of the argument
SIN() Return the sine of the argument
SQRT() Return the square root of the argument
TAN() Return the tangent of the argument
* Multiplication operator
TRUNCATE() Truncate to specified number of decimal places
- Change the sign of the argument

Arithmetic Operators

Table 11.11. Arithmetic Operators

Name Description
DIV Integer division
/ Division operator
- Minus operator
% or MOD Modulo operator
+ Addition operator
* Multiplication operator
- Change the sign of the argument

The usual arithmetic operators are available. The result is determined according to the following rules:

These rules are applied for each operation, such that nested calculations imply the precision of each component. Hence, (14620 / 9432456) / (24250 / 9432456), resolves first to (0.0014) / (0.0026), with the final result having 8 decimal places (0.60288653).

Because of these rules and the way they are applied, care should be taken to ensure that components and subcomponents of a calculation use the appropriate level of precision. See , "Cast Functions and Operators".

For information about handling of overflow in numeric expression evaluation, see , "Out-of-Range and Overflow Handling".

Arithmetic operators apply to numbers. For other types of values, alternative operations may be available. For example, to add date values, use DATE_ADD(); see , "Date and Time Functions".

Mathematical Functions

Table 11.12. Mathematical Functions

Name Description
ABS() Return the absolute value
ACOS() Return the arc cosine
ASIN() Return the arc sine
ATAN2(), ATAN() Return the arc tangent of the two arguments
ATAN() Return the arc tangent
CEIL() Return the smallest integer value not less than the argument
CEILING() Return the smallest integer value not less than the argument
CONV() Convert numbers between different number bases
COS() Return the cosine
COT() Return the cotangent
CRC32() Compute a cyclic redundancy check value
DEGREES() Convert radians to degrees
EXP() Raise to the power of
FLOOR() Return the largest integer value not greater than the argument
LN() Return the natural logarithm of the argument
LOG10() Return the base-10 logarithm of the argument
LOG2() Return the base-2 logarithm of the argument
LOG() Return the natural logarithm of the first argument
MOD() Return the remainder
OCT() Return an octal representation of a decimal number
PI() Return the value of pi
POW() Return the argument raised to the specified power
POWER() Return the argument raised to the specified power
RADIANS() Return argument converted to radians
RAND() Return a random floating-point value
ROUND() Round the argument
SIGN() Return the sign of the argument
SIN() Return the sine of the argument
SQRT() Return the square root of the argument
TAN() Return the tangent of the argument
TRUNCATE() Truncate to specified number of decimal places

All mathematical functions return NULL in the event of an error.

Date and Time Functions

This section describes the functions that can be used to manipulate temporal values. See , "Date and Time Types", for a description of the range of values each date and time type has and the valid formats in which values may be specified.

Table 11.13. Date/Time Functions

Name Description
ADDDATE() Add time values (intervals) to a date value
ADDTIME() Add time
CONVERT_TZ() Convert from one timezone to another
CURDATE() Return the current date
CURRENT_DATE(), CURRENT_DATE Synonyms for CURDATE()
CURRENT_TIME(), CURRENT_TIME Synonyms for CURTIME()
CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP Synonyms for NOW()
CURTIME() Return the current time
DATE_ADD() Add time values (intervals) to a date value
DATE_FORMAT() Format date as specified
DATE_SUB() Subtract a time value (interval) from a date
DATE() Extract the date part of a date or datetime expression
DATEDIFF() Subtract two dates
DAY() Synonym for DAYOFMONTH()
DAYNAME() Return the name of the weekday
DAYOFMONTH() Return the day of the month (0-31)
DAYOFWEEK() Return the weekday index of the argument
DAYOFYEAR() Return the day of the year (1-366)
EXTRACT() Extract part of a date
FROM_DAYS() Convert a day number to a date
FROM_UNIXTIME() Format UNIX timestamp as a date
GET_FORMAT() Return a date format string
HOUR() Extract the hour
LAST_DAY Return the last day of the month for the argument
LOCALTIME(), LOCALTIME Synonym for NOW()
LOCALTIMESTAMP, LOCALTIMESTAMP() Synonym for NOW()
MAKEDATE() Create a date from the year and day of year
MAKETIME MAKETIME()
MICROSECOND() Return the microseconds from argument
MINUTE() Return the minute from the argument
MONTH() Return the month from the date passed
MONTHNAME() Return the name of the month
NOW() Return the current date and time
PERIOD_ADD() Add a period to a year-month
PERIOD_DIFF() Return the number of months between periods
QUARTER() Return the quarter from a date argument
SEC_TO_TIME() Converts seconds to 'HH:MM:SS' format
SECOND() Return the second (0-59)
STR_TO_DATE() Convert a string to a date
SUBDATE() A synonym for DATE_SUB() when invoked with three arguments
SUBTIME() Subtract times
SYSDATE() Return the time at which the function executes
TIME_FORMAT() Format as time
TIME_TO_SEC() Return the argument converted to seconds
TIME() Extract the time portion of the expression passed
TIMEDIFF() Subtract time
TIMESTAMP() With a single argument, this function returns the date or datetime expression; with two arguments, the sum of the arguments
TIMESTAMPADD() Add an interval to a datetime expression
TIMESTAMPDIFF() Subtract an interval from a datetime expression
TO_DAYS() Return the date argument converted to days
TO_SECONDS() Return the date or datetime argument converted to seconds since Year 0
UNIX_TIMESTAMP() Return a UNIX timestamp
UTC_DATE() Return the current UTC date
UTC_TIME() Return the current UTC time
UTC_TIMESTAMP() Return the current UTC date and time
WEEK() Return the week number
WEEKDAY() Return the weekday index
WEEKOFYEAR() Return the calendar week of the date (0-53)
YEAR() Return the year
YEARWEEK() Return the year and week

Here is an example that uses date functions. The following query selects all rows with a date_col value from within the last 30 days:

mysql> SELECT something FROM tbl_name
 -> WHERE DATE_SUB(CURDATE(),INTERVAL 30 DAY) <= date_col;

The query also selects rows with dates that lie in the future.

Functions that expect date values usually accept datetime values and ignore the time part. Functions that expect time values usually accept datetime values and ignore the date part.

Functions that return the current date or time each are evaluated only once per query at the start of query execution. This means that multiple references to a function such as NOW() within a single query always produce the same result. (For our purposes, a single query also includes a call to a stored program (stored routine, trigger, or event) and all subprograms called by that program.) This principle also applies to CURDATE(), CURTIME(), UTC_DATE(), UTC-TIME(), UTC_TIMESTAMP(), and to any of their synonyms.

The CURRENT-TIMESTAMP(), CURRENT_TIME(), CURRENT_DATE(), and FROM_UNIXTIME() functions return values in the connection's current time zone, which is available as the value of the time-zone system variable. In addition, UNIX_TIMESTAMP() assumes that its argument is a datetime value in the current time zone. See , "MySQL Server Time Zone Support".

Some date functions can be used with "zero" dates or incomplete dates such as '2001-11-00', whereas others cannot. Functions that extract parts of dates typically work with incomplete dates and thus can return 0 when you might otherwise expect a nonzero value. For example:

mysql> SELECT DAYOFMONTH('2001-11-00'), MONTH('2005-00-00');
 -> 0, 0

Other functions expect complete dates and return NULL for incomplete dates. These include functions that perform date arithmetic or that map parts of dates to names. For example:

mysql> SELECT DATE_ADD('2006-05-00',INTERVAL 1 DAY);
 -> NULL mysql> SELECT DAYNAME('2006-05-00');
 -> NULL

As of MariaDB 5.6.4, several functions are more strict when passed a DATE() function value as their argument and reject incomplete dates with a day part of zero. These functions are affected: CONVERT-TZ(), DATE-ADD(), DATE_SUB(), DAYOFYEAR(), LAST_DAY(), TIMESTAMPDIFF(), TO-DAYS(), TO_SECONDS(), WEEK(), WEEKDAY(), WEEKOFYEAR(), YEARWEEK(). This restriction was relaxed for LAST_DAY() in 5.6.5 to permit a day part of zero.

MySQL 5.6.4 and up supports fractional seconds for TIME, DATETIME, and TIMESTAMP values, with up to microsecond precision. Functions that take temporal arguments accept values with fractional seconds. Return values from temporal functions include fractional seconds as appropriate.

What Calendar Is Used By MySQL?

MySQL uses what is known as a proleptic Gregorian calendar.

Every country that has switched from the Julian to the Gregorian calendar has had to discard at least ten days during the switch. To see how this works, consider the month of October 1582, when the first Julian-to-Gregorian switch occurred.

Monday Tuesday Wednesday Thursday Friday Saturday Sunday
1 2 3 4 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

There are no dates between October 4 and October 15. This discontinuity is called the cutover. Any dates before the cutover are Julian, and any dates following the cutover are Gregorian. Dates during a cutover are nonexistent.

A calendar applied to dates when it was not actually in use is called proleptic. Thus, if we assume there was never a cutover and Gregorian rules always rule, we have a proleptic Gregorian calendar. This is what is used by MySQL, as is required by standard SQL. For this reason, dates prior to the cutover stored as MariaDB DATE or DATETIME values must be adjusted to compensate for the difference. It is important to realize that the cutover did not occur at the same time in all countries, and that the later it happened, the more days were lost. For example, in Great Britain, it took place in 1752, when Wednesday September 2 was followed by Thursday September 14. Russia remained on the Julian calendar until 1918, losing 13 days in the process, and what is popularly referred to as its "October Revolution" occurred in November according to the Gregorian calendar.

Full-Text Search Functions

Natural Language Full-Text Searches
Boolean Full-Text Searches
Full-Text Searches with Query Expansion
Full-Text Stopwords
Full-Text Restrictions
Fine-Tuning MariaDB Full-Text Search
Adding a Collation for Full-Text Indexing

MATCH (col1,col2,...) AGAINST (expr [search_modifier])

search_modifier:
 {
 IN NATURAL LANGUAGE MODE
 | IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION
 | IN BOOLEAN MODE
 | WITH QUERY EXPANSION
 }

MySQL has support for full-text indexing and searching:

Full-text searching is performed using MATCH() ... AGAINST syntax. MATCH() takes a comma-separated list that names the columns to be searched. AGAINST takes a string to search for, and an optional modifier that indicates what type of search to perform. The search string must be a literal string, not a variable or a column name. There are three types of full-text searches:

For more technical details about processing for InnoDB FULLTEXT indexes, see , "FULLTEXT Indexes".

Constraints on full-text searching are listed in , "Full-Text Restrictions".

The myisam_ftdump utility dumps the contents of a MyISAM full-text index. This may be helpful for debugging full-text queries. See , "myisam_ftdump - Display Full-Text Index information".

Natural Language Full-Text Searches

By default or with the IN NATURAL LANGUAGE MODE modifier, the MATCH() function performs a natural language search for a string against a text collection. A collection is a set of one or more columns included in a FULLTEXT index. The search string is given as the argument to AGAINST(). For each row in the table, MATCH() returns a relevance value; that is, a similarity measure between the search string and the text in that row in the columns named in the MATCH() list.

mysql> CREATE TABLE articles (
  id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  title VARCHAR(200),
  body TEXT,
  FULLTEXT (title,body)
 ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO articles (title,body) VALUES
 ('MySQL Tutorial','DBMS stands for DataBase ...'),
 ('How To Use MariaDB Well','After you went through a ...'),
 ('Optimizing MySQL','In this tutorial we will show ...'),
 ('1001 MariaDB Tricks','1. Never run mysqld as root. 2. ...'),
 ('MySQL vs. YourSQL','In the following database comparison ...'),
 ('MySQL Security','When configured properly, MariaDB ...');
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM articles
 WHERE MATCH (title,body)
 AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+
| id | title | body |
+----+-------------------+------------------------------------------+
| 1 | MariaDB Tutorial | DBMS stands for DataBase ... |
| 5 | MariaDB vs. YourSQL | In the following database comparison ... |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)

By default, the search is performed in case-insensitive fashion. To perform a case-sensitive full-text search, use a binary collation for the indexed columns. For example, a column that uses the latin1 character set of can be assigned a collation of latin1_bin to make it case sensitive for full-text searches.

When MATCH() is used in a WHERE clause, as in the example shown earlier, the rows returned are automatically sorted with the highest relevance first. Relevance values are nonnegative floating-point numbers. Zero relevance means no similarity. Relevance is computed based on the number of words in the row, the number of unique words in that row, the total number of words in the collection, and the number of documents (rows) that contain a particular word.

To simply count matches, you could use a query like this:

mysql> SELECT COUNT(*) FROM articles
 WHERE MATCH (title,body)
 AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----------+
| COUNT(*) |
+----------+
| 2 |
+----------+
1 row in set (0.00 sec)

You might find it quicker to rewrite the query as follows:

mysql> SELECT
 COUNT(IF(MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE), 1, NULL))
 AS count
 FROM articles;
+-------+
| count |
+-------+
| 2 |
+-------+
1 row in set (0.03 sec)

The first query does some extra work (sorting the results by relevance) but also can use an index lookup based on the WHERE clause. The index lookup might make the first query faster if the search matches few rows. The second query performs a full table scan, which might be faster than the index lookup if the search term was present in most rows.

For natural-language full-text searches, the columns named in the MATCH() function must be the same columns included in some FULLTEXT index in your table. For the preceding query, note that the columns named in the MATCH() function (title and body) are the same as those named in the definition of the article table's FULLTEXT index. To search the title or body separately, you would create separate FULLTEXT indexes for each column.

You can also perform a boolean search or a search with query expansion. These search types are described in , "Boolean Full-Text Searches", and , "Full-Text Searches with Query Expansion".

A full-text search that uses an index can name columns only from a single table in the MATCH() clause because an index cannot span multiple tables. A boolean search can be done in the absence of an index (albeit more slowly), in which case it is possible to name columns from multiple tables.

The preceding example is a basic illustration that shows how to use the MATCH() function where rows are returned in order of decreasing relevance. The next example shows how to retrieve the relevance values explicitly. Returned rows are not ordered because the SELECT statement includes neither WHERE nor ORDER BY clauses:

mysql> SELECT id, MATCH (title,body)
 AGAINST ('Tutorial' IN NATURAL LANGUAGE MODE) AS score
 FROM articles;
+----+---------------------+
| id | score |
+----+---------------------+
| 1 | 0.22764469683170319 |
| 2 | 0 |
| 3 | 0.22764469683170319 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
+----+---------------------+
6 rows in set (0.00 sec)

The following example is more complex. The query returns the relevance values and it also sorts the rows in order of decreasing relevance. To achieve this result, specify MATCH() twice: once in the SELECT list and once in the WHERE clause. This causes no additional overhead, because the MariaDB optimizer notices that the two MATCH() calls are identical and invokes the full-text search code only once.

mysql> SELECT id, body, MATCH (title,body) AGAINST
 ('Security implications of running MariaDB as root'
 IN NATURAL LANGUAGE MODE) AS score
 FROM articles WHERE MATCH (title,body) AGAINST
 ('Security implications of running MariaDB as root'
 IN NATURAL LANGUAGE MODE);
+----+-----------------------+------------------------------------------+
| id | title | body |
+----+-----------------------+------------------------------------------+
| 5 | MariaDB vs. YourSQL | In the following database comparison ... |
| 1 | MariaDB Tutorial | DBMS stands for DataBase ... |
| 3 | Optimizing MariaDB | In this tutorial we will show ... |
| 6 | MariaDB Security | When configured properly, MariaDB ... |
| 2 | How To Use MariaDB Well | After you went through a ... |
| 4 | 1001 MariaDB Tricks | 1. Never run mysqld as root. 2. ... |
+----+-----------------------+------------------------------------------+
6 rows in set (0.00 sec)

The MariaDB FULLTEXT implementation regards any sequence of true word characters (letters, digits, and underscores) as a word. That sequence may also contain apostrophes ("'"), but not more than one in a row. This means that aaa'bbb is regarded as one word, but aaa''bbb is regarded as two words. Apostrophes at the beginning or the end of a word are stripped by the FULLTEXT parser; 'aaa'bbb' would be parsed as aaa'bbb.

The FULLTEXT parser determines where words start and end by looking for certain delimiter characters; for example, " " (space), "," (comma), and "." (period). If words are not separated by delimiters (as in, for example, Chinese), the FULLTEXT parser cannot determine where a word begins or ends. To be able to add words or other indexed terms in such languages to a FULLTEXT index, you must preprocess them so that they are separated by some arbitrary delimiter such as "'".

In MariaDB 5.6, it is possible to write a plugin that replaces the built-in full-text parser. For details, see , "The MariaDB Plugin API". For example parser plugin source code, see the plugin/fulltext directory of a MariaDB source distribution.

Some words are ignored in full-text searches:

The default stopword lists are shown in , "Full-Text Stopwords". The default minimum word length and stopword list can be changed as described in , "Fine-Tuning MariaDB Full-Text Search".

Every correct word in the collection and in the query is weighted according to its significance in the collection or query. Thus, a word that is present in many documents has a lower weight, because it has lower semantic value in this particular collection. Conversely, if the word is rare, it receives a higher weight. The weights of the words are combined to compute the relevance of the row. This technique works best with large collections.MyISAM Limitation

For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results for search indexes on MyISAM tables. For example, although the word "MySQL" is present in every row of the articles table shown earlier, a search for the word in a MyISAM search index produces no results:

mysql> SELECT * FROM articles
 WHERE MATCH (title,body)
 AGAINST ('MySQL' IN NATURAL LANGUAGE MODE);
Empty set (0.00 sec)

The search result is empty because the word "MySQL" is present in at least 50% of the rows, and so is effectively treated as a stopword. This filtering technique is more suitable for large data sets, where you might not want the result set to return every second row from a 1GB table, than for small data sets where it might cause poor results for popular terms.

The 50% threshold can surprise you when you first try full-text searching to see how it works, and makes InnoDB tables more suited to experimentation with full-text searches. If you create a MyISAM table and insert only one or two rows of text into it, every word in the text occurs in at least 50% of the rows. As a result, no search returns any results until the table contains more rows. Users who need to bypass the 50% limitation can build search indexes on InnoDB tables, or the boolean search mode explained in , "Boolean Full-Text Searches".

Boolean Full-Text Searches

MySQL can perform boolean full-text searches using the IN BOOLEAN MODE modifier. With this modifier, certain characters have special meaning at the beginning or end of words in the search string. In the following query, the + and - operators indicate that a word must be present or absent, respectively, for a match to occur. Thus, the query retrieves all the rows that contain the word "MySQL" but that do not contain the word "YourSQL":

mysql> SELECT * FROM articles WHERE MATCH (title,body)
 AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+-----------------------+-------------------------------------+
| id | title | body |
+----+-----------------------+-------------------------------------+
| 1 | MariaDB Tutorial | DBMS stands for DataBase ... |
| 2 | How To Use MariaDB Well | After you went through a ... |
| 3 | Optimizing MariaDB | In this tutorial we will show ... |
| 4 | 1001 MariaDB Tricks | 1. Never run mysqld as root. 2. ... |
| 6 | MariaDB Security | When configured properly, MariaDB ... |
+----+-----------------------+-------------------------------------+
Note

In implementing this feature, MariaDB uses what is sometimes referred to as implied Boolean logic, in which

Boolean full-text searches have these characteristics:

The boolean full-text search capability supports the following operators:

The following examples demonstrate some search strings that use boolean full-text operators:

Full-Text Searches with Query Expansion

Full-text search supports query expansion (and in particular, its variant "blind query expansion"). This is generally useful when a search phrase is too short, which often means that the user is relying on implied knowledge that the full-text search engine lacks. For example, a user searching for "database" may really mean that "MySQL", "Oracle", "DB2", and "RDBMS" all are phrases that should match "databases" and should be returned, too. This is implied knowledge.

Blind query expansion (also known as automatic relevance feedback) is enabled by adding WITH QUERY EXPANSION or IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION following the search phrase. It works by performing the search twice, where the search phrase for the second search is the original search phrase concatenated with the few most highly relevant documents from the first search. Thus, if one of these documents contains the word "databases" and the word "MySQL", the second search finds the documents that contain the word "MySQL" even if they do not contain the word "database". The following example shows this difference:

mysql> SELECT * FROM articles
 WHERE MATCH (title,body)
 AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+
| id | title | body |
+----+-------------------+------------------------------------------+
| 1 | MariaDB Tutorial | DBMS stands for DataBase ... |
| 5 | MariaDB vs. YourSQL | In the following database comparison ... |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)
mysql> SELECT * FROM articles
 WHERE MATCH (title,body)
 AGAINST ('database' WITH QUERY EXPANSION);
+----+-----------------------+------------------------------------------+
| id | title | body |
+----+-----------------------+------------------------------------------+
| 5 | MariaDB vs. YourSQL | In the following database comparison ... |
| 1 | MariaDB Tutorial | DBMS stands for DataBase ... |
| 3 | Optimizing MariaDB | In this tutorial we will show ... |
| 6 | MariaDB Security | When configured properly, MariaDB ... |
| 2 | How To Use MariaDB Well | After you went through a ... |
| 4 | 1001 MariaDB Tricks | 1. Never run mysqld as root. 2. ... |
+----+-----------------------+------------------------------------------+
6 rows in set (0.00 sec)

Another example could be searching for books by Georges Simenon about Maigret, when a user is not sure how to spell "Maigret". A search for "Megre and the reluctant witnesses" finds only "Maigret and the Reluctant Witnesses" without query expansion. A search with query expansion finds all books with the word "Maigret" on the second pass.Note

Because blind query expansion tends to increase noise significantly by returning nonrelevant documents, use it only when a search phrase is short.

Full-Text Stopwords

The stopword list is loaded and searched for full-text queries using the server character set and collation (the values of the character_set_server and collation_server system variables). False hits or misses may occur for stopword lookups if the stopword file or columns used for full-text indexing or searches have a character set or collation different from character_set_server or collation_server.

Case sensitivity of stopword lookups depends on the server collation. For example, lookups are case insensitive if the collation is latin1_swedish_ci, whereas lookups are case sensitive if the collation is latin1_general_cs or latin1_bin.

Stopwords for InnoDB Search Indexes

InnoDB has a relatively short list of default stopwords, because documents from technical, literary, and so on sources often use short words as keywords or in significant phrases. For example, you might search for "to be or not to be" and expect to get a sensible result, rather than having all those words ignored.

To see the list, query the table information_schema.innodb_ft_default_stopword. To define your own stopword list used for all InnoDB tables, define a table with the same structure as innodb_ft_default_stopword, fill it with the desired stopwords, and set the value of the innodb_ft_server_stopword_table option to a value of the form db_name/table_name before creating the search index. To create special stopword lists on a table-by-table basis, define other tables to hold these lists and specify the appropriate one in the innodb_ft_user_stopword_table option before creating the search index.

Stopwords for MyISAM Search Indexes

In MariaDB 5.6, the stopword file is loaded and searched using latin1 if character_set_server is ucs2, utf16, utf16le, or utf32. If any table was created with FULLTEXT indexes while the server character set was ucs2, utf16, utf16le, or utf32, it should be repaired using this statement:

REPAIR TABLE tbl_name QUICK;

The following table shows the default list of stopwords for MyISAM search indexes. In a MariaDB source distribution, you can find this list in the storage/myisam/ft_static.c file.

a's able about above according
accordingly across actually after afterwards
again against ain't all allow
allows almost alone along already
also although always am among
amongst an and another any
anybody anyhow anyone anything anyway
anyways anywhere apart appear appreciate
appropriate are aren't around as
aside ask asking associated at
available away awfully be became
because become becomes becoming been
before beforehand behind being believe
below beside besides best better
between beyond both brief but
by c'mon c's came can
can't cannot cant cause causes
certain certainly changes clearly co
com come comes concerning consequently
consider considering contain containing contains
corresponding could couldn't course currently
definitely described despite did didn't
different do does doesn't doing
don't done down downwards during
each edu eg eight either
else elsewhere enough entirely especially
et etc even ever every
everybody everyone everything everywhere ex
exactly example except far few
fifth first five followed following
follows for former formerly forth
four from further furthermore get
gets getting given gives go
goes going gone got gotten
greetings had hadn't happens hardly
has hasn't have haven't having
he he's hello help hence
her here here's hereafter hereby
herein hereupon hers herself hi
him himself his hither hopefully
how howbeit however i'd i'll
i'm i've ie if ignored
immediate in inasmuch inc indeed
indicate indicated indicates inner insofar
instead into inward is isn't
it it'd it'll it's its
itself just keep keeps kept
know known knows last lately
later latter latterly least less
lest let let's like liked
likely little look looking looks
ltd mainly many may maybe
me mean meanwhile merely might
more moreover most mostly much
must my myself name namely
nd near nearly necessary need
needs neither never nevertheless new
next nine no nobody non
none noone nor normally not
nothing novel now nowhere obviously
of off often oh ok
okay old on once one
ones only onto or other
others otherwise ought our ours
ourselves out outside over overall
own particular particularly per perhaps
placed please plus possible presumably
probably provides que quite qv
rather rd re really reasonably
regarding regardless regards relatively respectively
right said same saw say
saying says second secondly see
seeing seem seemed seeming seems
seen self selves sensible sent
serious seriously seven several shall
she should shouldn't since six
so some somebody somehow someone
something sometime sometimes somewhat somewhere
soon sorry specified specify specifying
still sub such sup sure
t's take taken tell tends
th than thank thanks thanx
that that's thats the their
theirs them themselves then thence
there there's thereafter thereby therefore
therein theres thereupon these they
they'd they'll they're they've think
third this thorough thoroughly those
though three through throughout thru
thus to together too took
toward towards tried tries truly
try trying twice two un
under unfortunately unless unlikely until
unto up upon us use
used useful uses using usually
value various very via viz
vs want wants was wasn't
way we we'd we'll we're
we've welcome well went were
weren't what what's whatever when
whence whenever where where's whereafter
whereas whereby wherein whereupon wherever
whether which while whither who
who's whoever whole whom whose
why will willing wish with
within without won't wonder would
wouldn't yes yet you you'd
you'll you're you've your yours
yourself yourselves zero

Full-Text Restrictions

Fine-Tuning MariaDB Full-Text Search

MySQL's full-text search capability has few user-tunable parameters. You can exert more control over full-text searching behavior if you have a MariaDB source distribution because some changes require source code modifications. See , "Installing MariaDB from Source".

Note that full-text search is carefully tuned for effectiveness. Modifying the default behavior in most cases can actually decrease effectiveness. Do not alter the MariaDB sources unless you know what you are doing.

Most full-text variables described in this section must be set at server startup time. A server restart is required to change them; they cannot be modified while the server is running.

Some variable changes require that you rebuild the FULLTEXT indexes in your tables. Instructions for doing so are given later in this section.

If you modify full-text variables that affect indexing (innodb_ft_min_token_size, innodb_ft_max_token_size, innodb_ft_server_stopword_table, innodb-ft-user-stopword-table, innodb_ft_enable_stopword, ft-min-word-len, ft_max_word_len, or ft_stopword_file), or if you change the stopword file itself, you must rebuild your FULLTEXT indexes after making the changes and restarting the server. To rebuild the indexes in this case, it is sufficient to do a QUICK repair operation:

mysql> REPAIR TABLE tbl_name QUICK;

Alternatively, use ALTER TABLE with the DROP INDEX and ADD INDEX options to drop and re-create each FULLTEXT index. In some cases, this may be faster than a repair operation.

Each table that contains any FULLTEXT index must be repaired as just shown. Otherwise, queries for the table may yield incorrect results, and modifications to the table will cause the server to see the table as corrupt and in need of repair.

Note that if you use myisamchk to perform an operation that modifies table indexes (such as repair or analyze), the FULLTEXT indexes are rebuilt using the default full-text parameter values for minimum word length, maximum word length, and stopword file unless you specify otherwise. This can result in queries failing.

The problem occurs because these parameters are known only by the server. They are not stored in MyISAM index files. To avoid the problem if you have modified the minimum or maximum word length or stopword file values used by the server, specify the same ft-min-word-len, ft_max_word_len, and ft-stopword-file values for myisamchk that you use for mysqld. For example, if you have set the minimum word length to 3, you can repair a table with myisamchk like this:

shell> myisamchk --recover --ft_min_word_len=3 tbl_name.MYI

To ensure that myisamchk and the server use the same values for full-text parameters, place each one in both the [mysqld] and [myisamchk] sections of an option file:

[mysqld]
ft_min_word_len=3
[myisamchk]
ft_min_word_len=3

An alternative to using myisamchk for index modification is to use the REPAIR TABLE, ANALYZE TABLE, OPTIMIZE TABLE, or ALTER TABLE statements. These statements are performed by the server, which knows the proper full-text parameter values to use.

Adding a Collation for Full-Text Indexing

This section describes how to add a new collation for full-text searches. The sample collation is like latin1_swedish_ci but treats the '-' character as a letter rather than as a punctuation character so that it can be indexed as a word character. General information about adding collations is given in , "Adding a Collation to a Character Set"; it is assumed that you have read it and are familiar with the files involved.

To add a collation for full-text indexing, use this procedure:

  1. Add a collation to the Index.xml file. The collation ID must be unused, so choose a value different from 62 if that ID is already taken on your system.

    <charset name='latin1'>
    ...
    <collation name='latin1_fulltext_ci' id='62'/>
    </charset>
    
  2. Declare the sort order for the collation in the latin1.xml file. In this case, the order can be copied from latin1_swedish_ci:

    <collation name='latin1_fulltext_ci'>
    <map>
    00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
    10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
    20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
    30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
    40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
    50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F
    60 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
    50 51 52 53 54 55 56 57 58 59 5A 7B 7C 7D 7E 7F
    80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F
    90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF
    41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49
    44 4E 4F 4F 4F 4F 5D D7 D8 55 55 55 59 59 DE DF
    41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49
    44 4E 4F 4F 4F 4F 5D F7 D8 55 55 55 59 59 DE FF
    </map>
    </collation>
    
  3. Modify the ctype array in latin1.xml. Change the value corresponding to 0x2D (which is the code for the '-' character) from 10 (punctuation) to 01 (small letter). In the following array, this is the element in the fourth row down, third value from the end.

    <ctype>
    <map>
    00
    20 20 20 20 20 20 20 20 20 28 28 28 28 28 20 20
    20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
    48 10 10 10 10 10 10 10 10 10 10 10 10 01 10 10
    84 84 84 84 84 84 84 84 84 84 10 10 10 10 10 10
    10 81 81 81 81 81 81 01 01 01 01 01 01 01 01 01
    01 01 01 01 01 01 01 01 01 01 01 10 10 10 10 10
    10 82 82 82 82 82 82 02 02 02 02 02 02 02 02 02
    02 02 02 02 02 02 02 02 02 02 02 10 10 10 10 20
    10 00 10 02 10 10 10 10 10 10 01 10 01 00 01 00
    00 10 10 10 10 10 10 10 10 10 02 10 02 00 02 01
    48 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
    10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
    01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
    01 01 01 01 01 01 01 10 01 01 01 01 01 01 01 02
    02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
    02 02 02 02 02 02 02 10 02 02 02 02 02 02 02 02
    </map>
    </ctype>
    
  4. Restart the server.
  5. To employ the new collation, include it in the definition of columns that are to use it:

    mysql> DROP TABLE IF EXISTS t1;
    Query OK, 0 rows affected (0.13 sec)
    mysql> CREATE TABLE t1 (
     a TEXT CHARACTER SET latin1 COLLATE latin1_fulltext_ci,
     FULLTEXT INDEX(a)
     ) ENGINE=InnoDB;
    Query OK, 0 rows affected (0.47 sec)
    
  6. Test the collation to verify that hyphen is considered as a word character:

    mysql> INSERT INTO t1 VALUEs ('----'),('....'),('abcd');
    Query OK, 3 rows affected (0.22 sec)
    Records: 3 Duplicates: 0 Warnings: 0
    mysql> SELECT * FROM t1 WHERE MATCH a AGAINST ('----' IN BOOLEAN MODE);
    +------+
    | a |
    +------+
    | ---- |
    +------+
    1 row in set (0.00 sec)
    

Cast Functions and Operators

Table 11.14. Cast Functions

Name Description
BINARY Cast a string to a binary string
CAST() Cast a value as a certain type
CONVERT() Cast a value as a certain type

Normally, you cannot compare a BLOB value or other binary string in case-insensitive fashion because binary strings have no character set, and thus no concept of lettercase. To perform a case-insensitive comparison, use the CONVERT() function to convert the value to a nonbinary string. Comparisons of the result use the string collation. For example, if the character set of the result has a case-insensitive collation, a LIKE operation is not case sensitive:

SELECT 'A' LIKE CONVERT(blob_col USING latin1) FROM tbl_name;

To use a different character set, substitute its name for latin1 in the preceding statement. To specify a particular collation for the converted string, use a COLLATE clause following the CONVERT() call, as described in , "CONVERT() and CAST()". For example, to use latin1_german1_ci:

SELECT 'A' LIKE CONVERT(blob_col USING latin1) COLLATE latin1_german1_ci
 FROM tbl_name;

CONVERT() can be used more generally for comparing strings that are represented in different character sets.

LOWER() (and UPPER()) are ineffective when applied to binary strings (BINARY, VARBINARY, BLOB). To perform lettercase conversion, convert the string to a nonbinary string:

mysql> SET @str = BINARY 'New York';
mysql> SELECT LOWER(@str), LOWER(CONVERT(@str USING latin1));
+-------------+-----------------------------------+
| LOWER(@str) | LOWER(CONVERT(@str USING latin1)) |
+-------------+-----------------------------------+
| New York | new york |
+-------------+-----------------------------------+

The cast functions are useful when you want to create a column with a specific type in a CREATE TABLE ... SELECT statement:

CREATE TABLE new_table SELECT CAST('2000-01-01' AS DATE);

The functions also can be useful for sorting ENUM columns in lexical order. Normally, sorting of ENUM columns occurs using the internal numeric values. Casting the values to CHAR results in a lexical sort:

SELECT enum_col FROM tbl_name ORDER BY CAST(enum_col AS CHAR);

CAST(str AS BINARY) is the same thing as BINARY str. CAST(expr AS CHAR) treats the expression as a string with the default character set.

CAST() also changes the result if you use it as part of a more complex expression such as CONCAT('Date: ',CAST(NOW() AS DATE)).

You should not use CAST() to extract data in different formats but instead use string functions like LEFT() or EXTRACT(). See , "Date and Time Functions".

To cast a string to a numeric value in numeric context, you normally do not have to do anything other than to use the string value as though it were a number:

mysql> SELECT 1+'1';
 -> 2

If you use a string in an arithmetic operation, it is converted to a floating-point number during expression evaluation.

If you use a number in string context, the number automatically is converted to a string:

mysql> SELECT CONCAT('hello you ',2);
 -> 'hello you 2'

For information about implicit conversion of numbers to strings, see , "Type Conversion in Expression Evaluation".

MySQL supports arithmetic with both signed and unsigned 64-bit values. If you are using numeric operators (such as + or -) and one of the operands is an unsigned integer, the result is unsigned by default (see , "Arithmetic Operators"). You can override this by using the SIGNED or UNSIGNED cast operator to cast a value to a signed or unsigned 64-bit integer, respectively.

mysql> SELECT CAST(1-2 AS UNSIGNED)
 -> 18446744073709551615
mysql> SELECT CAST(CAST(1-2 AS UNSIGNED) AS SIGNED);
 -> -1

If either operand is a floating-point value, the result is a floating-point value and is not affected by the preceding rule. (In this context, DECIMAL column values are regarded as floating-point values.)

mysql> SELECT CAST(1 AS UNSIGNED) - 2.0;
 -> -1.0

The SQL mode affects the result of conversion operations. Examples:

For more information, see , "Server SQL Modes".

XML Functions

Table 11.15. XML Functions

Name Description
ExtractValue() Extracts a value from an XML string using XPath notation
UpdateXML() Return replaced XML fragment

This section discusses XML and related functionality in MySQL.Note

It is possible to obtain XML-formatted output from MariaDB in the mysql and mysqldump clients by invoking them with the --xml option. See , "mysql - The MariaDB Command-Line Tool", and , "mysqldump - A Database Backup Program".

Two functions providing basic XPath 1.0 (XML Path Language, version 1.0) capabilities are available. Some basic information about XPath syntax and usage is provided later in this section; however, an in-depth discussion of these topics is beyond the scope of this Manual, and you should refer to the XML Path Language (XPath) 1.0 standard for definitive information. A useful resource for those new to XPath or who desire a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.Note

These functions remain under development. We continue to improve these and other aspects of XML and XPath functionality in MariaDB 5.6 and onwards. You may discuss these, ask questions about them, and obtain help from other users with them in the MySQL XML User Forum.

XPath expressions used with these functions support user variables and local stored program variables. User variables are weakly checked; variables local to stored programs are strongly checked (see also Bug #26518):

Expressions containing user variables or variables local to stored programs must otherwise (except for notation) conform to the rules for XPath expressions containing variables as given in the XPath 1.0 specification.Note

Currently, a user variable used to store an XPath expression is treated as an empty string. Because of this, it is not possible to store an XPath expression as a user variable. (Bug #32911)

Note

A discussion in depth of XPath syntax and usage are beyond the scope of this Manual. Please see the XML Path Language (XPath) 1.0 specification for definitive information. A useful resource for those new to XPath or who are wishing a refresher in the basics is the Zvon.org XPath Tutorial, which is available in several languages.

Descriptions and examples of some basic XPath expressions follow:

XPath Limitations. The XPath syntax supported by these functions is currently subject to the following limitations:

XPath expressions passed as arguments to ExtractValue() and UpdateXML() may contain the colon character (":") in element selectors, which enables their use with markup employing XML namespaces notation. For example:

mysql> SET @xml = '<a>111<b:c>222<d>333</d><e:f>444</e:f></b:c></a>';
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT ExtractValue(@xml, '//e:f');
+-----------------------------+
| ExtractValue(@xml, '//e:f') |
+-----------------------------+
| 444 |
+-----------------------------+
1 row in set (0.00 sec)
mysql> SELECT UpdateXML(@xml, '//b:c', '<g:h>555</g:h>');
+--------------------------------------------+
| UpdateXML(@xml, '//b:c', '<g:h>555</g:h>') |
+--------------------------------------------+
| <a>111<g:h>555</g:h></a> |
+--------------------------------------------+
1 row in set (0.00 sec)

This is similar in some respects to what is permitted by Apache Xalan and some other parsers, and is much simpler than requiring namespace declarations or the use of the namespace-uri() and local-name() functions.

Error handling. For both ExtractValue() and UpdateXML(), the XPath locator used must be valid and the XML to be searched must consist of elements which are properly nested and closed. If the locator is invalid, an error is generated:

mysql> SELECT ExtractValue('<a>c</a><b/>', '/&a');
ERROR 1105 (HY000): XPATH syntax error: '&a'

If xml_frag does not consist of elements which are properly nested and closed, NULL is returned and a warning is generated, as shown in this example:

mysql> SELECT ExtractValue('<a>c</a><b', '//a');
+-----------------------------------+
| ExtractValue('<a>c</a><b', '//a') |
+-----------------------------------+
| NULL |
+-----------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
+---------+------+-------------------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+-------------------------------------------------------------------------------------------+
| Warning | 1523 | Incorrect XML value: 'parse error at line 1 pos 11: END-OF-INPUT unexpected ('>' wanted)' |
+---------+------+-------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT ExtractValue('<a>c</a><b/>', '//a');
+-------------------------------------+
| ExtractValue('<a>c</a><b/>', '//a') |
+-------------------------------------+
| c |
+-------------------------------------+
1 row in set (0.00 sec)
Important

The replacement XML used as the third argument to UpdateXML() is not checked to determine whether it consists solely of elements which are properly nested and closed.

XPath Injection. code injection occurs when malicious code is introduced into the system to gain unauthorized access to privileges and data. It is based on exploiting assumptions made by developers about the type and content of data input from users. XPath is no exception in this regard.

A common scenario in which this can happen is the case of application which handles authorization by matching the combination of a login name and password with those found in an XML file, using an XPath expression like this one:

//user[login/text()='neapolitan' and password/text()='1c3cr34m']/attribute::id

This is the XPath equivalent of an SQL statement like this one:

SELECT id FROM users WHERE login='neapolitan' AND password='1c3cr34m';

A PHP application employing XPath might handle the login process like this:

<?php
 $file = 'users.xml';
 $login = $POST['login'];
 $password = $POST['password'];
 $xpath = '//user[login/text()=$login and password/text()=$password]/attribute::id';
 if( file_exists($file) )
 {
 $xml = simplexml_load_file($file);
 if($result = $xml->xpath($xpath))
 echo 'You are now logged in as user $result[0].';
 else
 echo 'Invalid login name or password.';
 }
 else
 exit('Failed to open $file.');
?>

No checks are performed on the input. This means that a malevolent user can "short-circuit" the test by entering ' or 1=1 for both the login name and password, resulting in $xpath being evaluated as shown here:

//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id

Since the expression inside the square brackets always evaluates as true, it is effectively the same as this one, which matches the id attribute of every user element in the XML document:

//user/attribute::id

One way in which this particular attack can be circumvented is simply by quoting the variable names to be interpolated in the definition of $xpath, forcing the values passed from a Web form to be converted to strings:

$xpath = '//user[login/text()='$login' and password/text()='$password']/attribute::id';

This is the same strategy that is often recommended for preventing SQL injection attacks. In general, the practices you should follow for preventing XPath injection attacks are the same as for preventing SQL injection:

Just as SQL injection attacks can be used to obtain information about database schemas, so can XPath injection be used to traverse XML files to uncover their structure, as discussed in Amit Klein's paper Blind XPath Injection (PDF file, 46KB).

It is also important to check the output being sent back to the client. Consider what can happen when we use the MariaDB ExtractValue() function:

mysql> SELECT ExtractValue(
 -> LOAD_FILE('users.xml'),
 -> '//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id'
 -> ) AS id;
+-------------------------------+
| id |
+-------------------------------+
| 00327 13579 02403 42354 28570 |
+-------------------------------+
1 row in set (0.01 sec)

Because ExtractValue() returns multiple matches as a single space-delimited string, this injection attack provides every valid ID contained within users.xml to the user as a single row of output. As an extra safeguard, you should also test output before returning it to the user. Here is a simple example:

mysql> SELECT @id = ExtractValue(
 -> LOAD_FILE('users.xml'),
 -> '//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id'
 -> );
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT IF(
 -> INSTR(@id, ' ') = 0,
 -> @id,
 -> 'Unable to retrieve user ID')
 -> AS singleID;
+----------------------------+
| singleID |
+----------------------------+
| Unable to retrieve user ID |
+----------------------------+
1 row in set (0.00 sec)

In general, the guidelines for returning data to users securely are the same as for accepting user input. These can be summed up as:

Bit Functions

Table 11.16. Bitwise Functions

Name Description
BIT_COUNT() Return the number of bits that are set
& Bitwise AND
~ Invert bits
| Bitwise OR
^ Bitwise XOR
<< Left shift
>> Right shift

MySQL uses BIGINT (64-bit) arithmetic for bit operations, so these operators have a maximum range of 64 bits.

Encryption and Compression Functions

Table 11.17. Encryption Functions

Name Description
AES_DECRYPT() Decrypt using AES
AES_ENCRYPT() Encrypt using AES
COMPRESS() Return result as a binary string
DECODE() Decodes a string encrypted using ENCODE()
DES_DECRYPT() Decrypt a string
DES_ENCRYPT() Encrypt a string
ENCODE() Encode a string
ENCRYPT() Encrypt a string
MD5() Calculate MD5 checksum
OLD_PASSWORD() Return the value of the pre-4.1 implementation of PASSWORD
PASSWORD() Calculate and return a password string
SHA1(), SHA() Calculate an SHA-1 160-bit checksum
SHA2() Calculate an SHA-2 checksum
UNCOMPRESS() Uncompress a string compressed
UNCOMPRESSED_LENGTH() Return the length of a string before compression

Many encryption and compression functions return strings for which the result might contain arbitrary byte values. If you want to store these results, use a column with a VARBINARY or BLOB binary string data type. This will avoid potential problems with trailing space removal or character set conversion that would change data values, such as may occur if you use a nonbinary string data type (CHAR, VARCHAR, TEXT).

Some encryption functions return strings of ASCII characters: MD5(), OLD_PASSWORD(), PASSWORD(), SHA(), SHA1(), SHA2(). In MariaDB 5.6, their return value is a nonbinary string that has a character set and collation determined by the character_set_connection and collation_connection system variables.

For versions in which functions such as MD5() or SHA1() return a string of hex digits as a binary string, the return value cannot be converted to uppercase or compared in case-insensitive fashion as is. You must convert the value to a nonbinary string. See the discussion of binary string conversion in , "Cast Functions and Operators".

If an application stores values from a function such as MD5() or SHA1() that returns a string of hex digits, more efficient storage and comparisons can be obtained by converting the hex representation to binary using UNHEX() and storing the result in a BINARY(N) column. Each pair of hex digits requires one byte in binary form, so the value of N depends on the length of the hex string. N is 16 for an MD5() value and 20 for a SHA1() value. For SHA2(), N ranges from 28 to 32 depending on the argument specifying the desired bit length of the result.

The size penalty for storing the hex string in a CHAR column is at least two times, up to eight times if the value is stored in a column that uses the utf8 character set (where each character uses 4 bytes). Storing the string also results in slower comparisons because of the larger values and the need to take character set collation rules into account.

Suppose that an application stores MD5() string values in a CHAR(32) column:

CREATE TABLE md5_tbl (md5_val CHAR(32), ...);
INSERT INTO md5_tbl (md5_val, ...) VALUES(MD5('abcdef'), ...);

To convert hex strings to more compact form, modify the application to use UNHEX() and BINARY(16) instead as follows:

CREATE TABLE md5_tbl (md5_val BINARY(16), ...);
INSERT INTO md5_tbl (md5_val, ...) VALUES(UNHEX(MD5('abcdef')), ...);

Applications should be prepared to handle the very rare case that a hashing function produces the same value for two different input values. One way to make collisions detectable is to make the hash column a primary key.Note

Exploits for the MD5 and SHA-1 algorithms have become known. You may wish to consider using one of the other encryption functions described in this section instead, such as SHA2().Caution

Passwords or other sensitive values supplied as arguments to encryption functions are sent in plaintext to the MariaDB server unless an SSL connection is used. Also, such values will appear in any MariaDB logs to which they are written. To avoid these types of exposure, applications can encrypt sensitive values on the client side before sending them to the server. The same considerations apply to encryption keys. To avoid exposing these, applications can use stored procedures to encrypt and decrypt values on the server side.

Information Functions

Table 11.18. Information Functions

Name Description
BENCHMARK() Repeatedly execute an expression
CHARSET() Return the character set of the argument
COERCIBILITY() Return the collation coercibility value of the string argument
COLLATION() Return the collation of the string argument
CONNECTION_ID() Return the connection ID (thread ID) for the connection
CURRENT_USER(), CURRENT_USER The authenticated user name and host name
DATABASE() Return the default (current) database name
FOUND_ROWS() For a SELECT with a LIMIT clause, the number of rows that would be returned were there no LIMIT clause
LAST_INSERT_ID() Value of the AUTOINCREMENT column for the last INSERT
ROW_COUNT() The number of rows updated
SCHEMA() A synonym for DATABASE()
SESSION_USER() Synonym for USER()
SYSTEM_USER() Synonym for USER()
USER() The user name and host name provided by the client
VERSION() Returns a string that indicates the MariaDB server version

Miscellaneous Functions

Table 11.19. Miscellaneous Functions

Name Description
DEFAULT() Return the default value for a table column
GET_LOCK() Get a named lock
INET_ATON() Return the numeric value of an IP address
INET_NTOA() Return the IP address from a numeric value
INET6_ATON() Return the numeric value of an IPv6 address
INET6_NTOA() Return the IPv6 address from a numeric value
IS_FREE_LOCK() Checks whether the named lock is free
IS_IPV4_COMPAT() Return true if argument is an IPv4-compatible address
IS_IPV4_MAPPED() Return true if argument is an IPv4-mapped address
IS_IPV4() Return true if argument is an IPv4 address
IS_IPV6() Return true if argument is an IPv6 address
IS_USED_LOCK() Checks whether the named lock is in use. Return connection identifier if true.
MASTER_POS_WAIT() Block until the slave has read and applied all updates up to the specified position
NAME_CONST() Causes the column to have the given name
RAND() Return a random floating-point value
RELEASE_LOCK() Releases the named lock
SLEEP() Sleep for a number of seconds
UUID_SHORT() Return an integer-valued universal identifier
UUID() Return a Universal Unique Identifier (UUID)
VALUES() Defines the values to be used during an INSERT

Functions and Modifiers for Use with GROUP BY Clauses

GROUP BY (Aggregate) Functions
GROUP BY Modifiers
GROUP BY and HAVING with Hidden Columns

GROUP BY (Aggregate) Functions

Table 11.20. Aggregate (GROUP BY) Functions

Name Description
AVG() Return the average value of the argument
BIT_AND() Return bitwise and
BIT_OR() Return bitwise or
BIT_XOR() Return bitwise xor
COUNT(DISTINCT) Return the count of a number of different values
COUNT() Return a count of the number of rows returned
GROUP_CONCAT() Return a concatenated string
MAX() Return the maximum value
MIN() Return the minimum value
STD() Return the population standard deviation
STDDEV_POP() Return the population standard deviation
STDDEV_SAMP() Return the sample standard deviation
STDDEV() Return the population standard deviation
SUM() Return the sum
VAR_POP() Return the population standard variance
VAR_SAMP() Return the sample variance
VARIANCE() Return the population standard variance

This section describes group (aggregate) functions that operate on sets of values. Unless otherwise stated, group functions ignore NULL values.

If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows. For more information, see , "GROUP BY and HAVING with Hidden Columns".

For numeric arguments, the variance and standard deviation functions return a DOUBLE value. The SUM() and AVG() functions return a DECIMAL value for exact-value arguments (integer or DECIMAL), and a DOUBLE value for approximate-value arguments (FLOAT or DOUBLE).

The SUM() and AVG() aggregate functions do not work with temporal values. (They convert the values to numbers, losing everything after the first nonnumeric character.) To work around this problem, convert to numeric units, perform the aggregate operation, and convert back to a temporal value. Examples:

SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(time_col))) FROM tbl_name;
SELECT FROM_DAYS(SUM(TO_DAYS(date_col))) FROM tbl_name;

Functions such as SUM() or AVG() that expect a numeric argument cast the argument to a number if necessary. For SET or ENUM values, the cast operation causes the underlying numeric value to be used.

GROUP BY Modifiers

The GROUP BY clause permits a WITH ROLLUP modifier that causes extra rows to be added to the summary output. These rows represent higher-level (or super-aggregate) summary operations. ROLLUP thus enables you to answer questions at multiple levels of analysis with a single query. It can be used, for example, to provide support for OLAP (Online Analytical Processing) operations.

Suppose that a table named sales has year, country, product, and profit columns for recording sales profitability:

CREATE TABLE sales
(
 year INT NOT NULL,
 country VARCHAR(20) NOT NULL,
 product VARCHAR(32) NOT NULL,
 profit INT
);

The table's contents can be summarized per year with a simple GROUP BY like this:

mysql> SELECT year, SUM(profit) FROM sales GROUP BY year;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 | 4525 |
| 2001 | 3010 |
+------+-------------+

This output shows the total profit for each year, but if you also want to determine the total profit summed over all years, you must add up the individual values yourself or run an additional query.

Or you can use ROLLUP, which provides both levels of analysis with a single query. Adding a WITH ROLLUP modifier to the GROUP BY clause causes the query to produce another row that shows the grand total over all year values:

mysql> SELECT year, SUM(profit) FROM sales GROUP BY year WITH ROLLUP;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 | 4525 |
| 2001 | 3010 |
| NULL | 7535 |
+------+-------------+

The grand total super-aggregate line is identified by the value NULL in the year column.

ROLLUP has a more complex effect when there are multiple GROUP BY columns. In this case, each time there is a "break" (change in value) in any but the last grouping column, the query produces an extra super-aggregate summary row.

For example, without ROLLUP, a summary on the sales table based on year, country, and product might look like this:

mysql> SELECT year, country, product, SUM(profit)
 -> FROM sales
 -> GROUP BY year, country, product;
+------+---------+------------+-------------+
| year | country | product | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2001 | Finland | Phone | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
+------+---------+------------+-------------+

The output indicates summary values only at the year/country/product level of analysis. When ROLLUP is added, the query produces several extra rows:

mysql> SELECT year, country, product, SUM(profit)
 -> FROM sales
 -> GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+-------------+
| year | country | product | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | India | NULL | 1350 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2000 | USA | NULL | 1575 |
| 2000 | NULL | NULL | 4525 |
| 2001 | Finland | Phone | 10 |
| 2001 | Finland | NULL | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
| 2001 | USA | NULL | 3000 |
| 2001 | NULL | NULL | 3010 |
| NULL | NULL | NULL | 7535 |
+------+---------+------------+-------------+

For this query, adding ROLLUP causes the output to include summary information at four levels of analysis, not just one. Here is how to interpret the ROLLUP output:

Other Considerations When using ROLLUP

The following items list some behaviors specific to the MariaDB implementation of ROLLUP:

When you use ROLLUP, you cannot also use an ORDER BY clause to sort the results. In other words, ROLLUP and ORDER BY are mutually exclusive. However, you still have some control over sort order. GROUP BY in MariaDB sorts results, and you can use explicit ASC and DESC keywords with columns named in the GROUP BY list to specify sort order for individual columns. (The higher-level summary rows added by ROLLUP still appear after the rows from which they are calculated, regardless of the sort order.)

LIMIT can be used to restrict the number of rows returned to the client. LIMIT is applied after ROLLUP, so the limit applies against the extra rows added by ROLLUP. For example:

mysql> SELECT year, country, product, SUM(profit)
 -> FROM sales
 -> GROUP BY year, country, product WITH ROLLUP
 -> LIMIT 5;
+------+---------+------------+-------------+
| year | country | product | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
+------+---------+------------+-------------+

Using LIMIT with ROLLUP may produce results that are more difficult to interpret, because you have less context for understanding the super-aggregate rows.

The NULL indicators in each super-aggregate row are produced when the row is sent to the client. The server looks at the columns named in the GROUP BY clause following the leftmost one that has changed value. For any column in the result set with a name that is a lexical match to any of those names, its value is set to NULL. (If you specify grouping columns by column number, the server identifies which columns to set to NULL by number.)

Because the NULL values in the super-aggregate rows are placed into the result set at such a late stage in query processing, you cannot test them as NULL values within the query itself. For example, you cannot add HAVING product IS NULL to the query to eliminate from the output all but the super-aggregate rows.

On the other hand, the NULL values do appear as NULL on the client side and can be tested as such using any MariaDB client programming interface.

GROUP BY and HAVING with Hidden Columns

In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, this query is illegal in standard SQL because the name column in the select list does not appear in the GROUP BY:

SELECT o.custid, c.name, MAX(o.payment)
 FROM orders AS o, customers AS c
 WHERE o.custid = c.custid
 GROUP BY o.custid;

For the query to be legal, the name column must be omitted from the select list or named in the GROUP BY clause.

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.

A similar MariaDB extension applies to the HAVING clause. Standard SQL does not permit the HAVING clause to name any column not found in the GROUP BY clause unless it is enclosed in an aggregate function. MariaDB permits the use of such columns to simplify calculations. This extension assumes that the nongrouped columns will have the same group-wise values. Otherwise, the result is indeterminate.

To disable the MariaDB GROUP BY extension, enable the ONLY_FULL_GROUP_BY SQL mode. This enables standard SQL behavior: Columns not named in the GROUP BY clause cannot be used in the select list or HAVING clause unless enclosed in an aggregate function.

For example, the following query returns name values that occur only once in table orders:

SELECT name, COUNT(name) FROM orders
 GROUP BY name
 HAVING COUNT(name) = 1;

However, the result of the following similar query that uses an alias for the aggregated column depends on the SQL mode:

SELECT name, COUNT(name) AS c FROM orders
 GROUP BY name
 HAVING c = 1;

In this case, a non-grouping field 'c' is used in HAVING clause error occurs if ONLY_FULL_GROUP_BY is enabled because the extension does not apply. The column c in the HAVING clause is not enclosed in an aggregate function (instead, it is an aggregate function).

The select list extension also applies to ORDER BY. That is, you can use nonaggregated columns in the ORDER BY clause that do not appear in the GROUP BY clause. (However, as mentioned previously, ORDER BY does not affect which values are chosen from nonaggregated columns; it only sorts them after they have been chosen.) This extension does not apply if the ONLY_FULL_GROUP_BY SQL mode is enabled.

In some cases, you can use MIN() and MAX() to obtain a specific column value even if it is not unique. If the sort column contains integers no larger than 6 digits, the following query gives the value of column from the row containing the smallest sort value:

SUBSTR(MIN(CONCAT(LPAD(sort,6,'0'),column)),7)

See , "The Rows Holding the Group-wise Maximum of a Certain Column".

If you are trying to follow standard SQL, you cannot use expressions in GROUP BY clauses. As a workaround, use an alias for the expression:

SELECT id, FLOOR(value/100) AS val
 FROM tbl_name
 GROUP BY id, val;

MySQL permits expressions in GROUP BY clauses, so the alias is unnecessary:

SELECT id, FLOOR(value/100)
 FROM tbl_name
 GROUP BY id, FLOOR(value/100);

Spatial Extensions

Introduction to MariaDB Spatial Support
The OpenGIS Geometry Model
Supported Spatial Data Formats
Creating a Spatially Enabled MariaDB Database
Spatial Analysis Functions
Optimizing Spatial Analysis
MySQL Conformance and Compatibility

MySQL supports spatial extensions to enable the generation, storage, and analysis of geographic features. These features are available for MyISAM, InnoDB, NDB, and ARCHIVE tables.

For spatial columns, MyISAM supports both SPATIAL and non-SPATIAL indexes. Other storage engines support non-SPATIAL indexes, as described in , "CREATE INDEX Syntax".

This chapter covers the following topics:

Additional Resources

Introduction to MariaDB Spatial Support

MySQL implements spatial extensions following the specification of the Open Geospatial Consortium (OGC). This is an international consortium of more than 250 companies, agencies, and universities participating in the development of publicly available conceptual solutions that can be useful with all kinds of applications that manage spatial data. The OGC maintains a Web site at http://www.opengis.org/.

In 1997, the Open Geospatial Consortium published the OpenGIS Simple Features Specifications For SQL, a document that proposes several conceptual ways for extending an SQL RDBMS to support spatial data. This specification is available from the OGC Web site at http://www.opengis.org/docs/99-049.pdf. It contains additional information relevant to this chapter.

MySQL implements a subset of the SQL with Geometry Types environment proposed by OGC. This term refers to an SQL environment that has been extended with a set of geometry types. A geometry-valued SQL column is implemented as a column that has a geometry type. The specification describe a set of SQL geometry types, as well as functions on those types to create and analyze geometry values.

A geographic feature is anything in the world that has a location. A feature can be:

Some documents use the term geospatial feature to refer to geographic features.

Geometry is another word that denotes a geographic feature. Originally the word geometry meant measurement of the earth. Another meaning comes from cartography, referring to the geometric features that cartographers use to map the world.

This chapter uses all of these terms synonymously: geographic feature, geospatial feature, feature, or geometry. Here, the term most commonly used is geometry, defined as a point or an aggregate of points representing anything in the world that has a location.

The OpenGIS Geometry Model

The Geometry Class Hierarchy
Class Geometry
Class Point
Class Curve
Class LineString
Class Surface
Class Polygon
Class GeometryCollection
Class MultiPoint
Class MultiCurve
Class MultiLineString
Class MultiSurface
Class MultiPolygon

The set of geometry types proposed by OGC's SQL with Geometry Types environment is based on the OpenGIS Geometry Model. In this model, each geometric object has the following general properties:

The Geometry Class Hierarchy

The geometry classes define a hierarchy as follows:

It is not possible to create objects in noninstantiable classes. It is possible to create objects in instantiable classes. All classes have properties, and instantiable classes may also have assertions (rules that define valid class instances).

Geometry is the base class. It is an abstract class. The instantiable subclasses of Geometry are restricted to zero-, one-, and two-dimensional geometric objects that exist in two-dimensional coordinate space. All instantiable geometry classes are defined so that valid instances of a geometry class are topologically closed (that is, all defined geometries include their boundary).

The base Geometry class has subclasses for Point, Curve, Surface, and GeometryCollection:

Geometry, Curve, Surface, MultiCurve, and MultiSurface are defined as noninstantiable classes. They define a common set of methods for their subclasses and are included for extensibility.

Point, LineString, Polygon, GeometryCollection, MultiPoint, MultiLineString, and MultiPolygon are instantiable classes.

Class Geometry

Geometry is the root class of the hierarchy. It is a noninstantiable class but has a number of properties that are common to all geometry values created from any of the Geometry subclasses. These properties are described in the following list. Particular subclasses have their own specific properties, described later.

Geometry Properties

A geometry value has the following properties:

Class Point

A Point is a geometry that represents a single location in coordinate space.

Point Examples

Point Properties

Class Curve

A Curve is a one-dimensional geometry, usually represented by a sequence of points. Particular subclasses of Curve define the type of interpolation between points. Curve is a noninstantiable class.

Curve Properties

Class LineString

A LineString is a Curve with linear interpolation between points.

LineString Examples

LineString Properties

Class Surface

A Surface is a two-dimensional geometry. It is a noninstantiable class. Its only instantiable subclass is Polygon.

Surface Properties

Class Polygon

A Polygon is a planar Surface representing a multisided geometry. It is defined by a single exterior boundary and zero or more interior boundaries, where each interior boundary defines a hole in the Polygon.

Polygon Examples

Polygon Assertions

The preceding assertions make a Polygon a simple geometry.

Class GeometryCollection

A GeometryCollection is a geometry that is a collection of one or more geometries of any class.

All the elements in a GeometryCollection must be in the same Spatial Reference System (that is, in the same coordinate system). There are no other constraints on the elements of a GeometryCollection, although the subclasses of GeometryCollection described in the following sections may restrict membership. Restrictions may be based on:

Class MultiPoint

A MultiPoint is a geometry collection composed of Point elements. The points are not connected or ordered in any way.

MultiPoint Examples

MultiPoint Properties

Class MultiCurve

A MultiCurve is a geometry collection composed of Curve elements. MultiCurve is a noninstantiable class.

MultiCurve Properties

Class MultiLineString

A MultiLineString is a MultiCurve geometry collection composed of LineString elements.

MultiLineString Examples

Class MultiSurface

A MultiSurface is a geometry collection composed of surface elements. MultiSurface is a noninstantiable class. Its only instantiable subclass is MultiPolygon.

MultiSurface Assertions

Class MultiPolygon

A MultiPolygon is a MultiSurface object composed of Polygon elements.

MultiPolygon Examples

MultiPolygon Assertions

MultiPolygon Properties

Supported Spatial Data Formats

Well-Known Text (WKT) Format
Well-Known Binary (WKB) Format

This section describes the standard spatial data formats that are used to represent geometry objects in queries. They are:

Internally, MariaDB stores geometry values in a format that is not identical to either WKT or WKB format.

Well-Known Text (WKT) Format

The Well-Known Text (WKT) representation of Geometry is designed to exchange geometry data in ASCII form. For a Backus-Naur grammar that specifies the formal production rules for writing WKT values, see the OpenGIS specification document referenced in , "Spatial Extensions".

Examples of WKT representations of geometry objects:

Well-Known Binary (WKB) Format

The Well-Known Binary (WKB) representation for geometric values is defined by the OpenGIS specification. It is also defined in the ISO SQL/MM Part 3: Spatial standard.

WKB is used to exchange geometry data as binary streams represented by BLOB values containing geometric WKB information.

WKB uses one-byte unsigned integers, four-byte unsigned integers, and eight-byte double-precision numbers (IEEE 754 format). A byte is eight bits.

For example, a WKB value that corresponds to POINT(1 1) consists of this sequence of 21 bytes (each represented here by two hex digits):

0101000000000000000000F03F000000000000F03F

The sequence may be broken down into these components:

Byte order : 01
WKB type : 01000000
X : 000000000000F03F Y : 000000000000F03F

Component representation is as follows:

WKB values for more complex geometry values are represented by more complex data structures, as detailed in the OpenGIS specification.

Creating a Spatially Enabled MariaDB Database

MySQL Spatial Data Types
Creating Spatial Values
Creating Spatial Columns
Populating Spatial Columns
Fetching Spatial Data

This section describes the data types you can use for representing spatial data in MySQL, and the functions available for creating and retrieving spatial values.

MySQL Spatial Data Types

MySQL has data types that correspond to OpenGIS classes. Some of these types hold single geometry values:

GEOMETRY can store geometry values of any type. The other single-value types (POINT, LINESTRING, and POLYGON) restrict their values to a particular geometry type.

The other data types hold collections of values:

GEOMETRYCOLLECTION can store a collection of objects of any type. The other collection types (MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, and GEOMETRYCOLLECTION) restrict collection members to those having a particular geometry type.

Creating Spatial Values

Creating Geometry Values Using WKT Functions
Creating Geometry Values Using WKB Functions
Creating Geometry Values Using MySQL-Specific Functions

This section describes how to create spatial values using Well-Known Text and Well-Known Binary functions that are defined in the OpenGIS standard, and using MySQL-specific functions.

Creating Geometry Values Using WKT Functions

MySQL provides a number of functions that take as arguments a Well-Known Text representation and, optionally, a spatial reference system identifier (SRID). They return the corresponding geometry.

GeomFromText() accepts a WKT of any geometry type as its first argument. An implementation also provides type-specific construction functions for construction of geometry values of each geometry type.

The OpenGIS specification also defines the following optional functions, which MariaDB does not implement. These functions construct Polygon or MultiPolygon values based on the WKT representation of a collection of rings or closed LineString values. These values may intersect.

Creating Geometry Values Using WKB Functions

MySQL provides a number of functions that take as arguments a BLOB containing a Well-Known Binary representation and, optionally, a spatial reference system identifier (SRID). They return the corresponding geometry.

These functions also accept geometry objects for compatibility with the return value of the functions in , "Creating Geometry Values Using MySQL-Specific Functions". Thus, those functions may be used to provide the first argument to the functions in this section.

The OpenGIS specification also describes optional functions for constructing Polygon or MultiPolygon values based on the WKB representation of a collection of rings or closed LineString values. These values may intersect. MariaDB does not implement these functions:

Creating Geometry Values Using MySQL-Specific Functions

MySQL provides a set of useful nonstandard functions for creating geometry values. The functions described in this section are MariaDB extensions to the OpenGIS specification.

These functions produce geometry objects from either WKB values or geometry objects as arguments. If any argument is not a proper WKB or geometry representation of the proper object type, the return value is NULL.

For example, you can insert the geometry return value from Point() directly into a Point column:

INSERT INTO t1 (pt_col) VALUES(Point(1,2));

Creating Spatial Columns

MySQL provides a standard way of creating spatial columns for geometry types, for example, with CREATE TABLE or ALTER TABLE. Currently, spatial columns are supported for MyISAM, InnoDB, NDB, and ARCHIVE tables. See also the annotations about spatial indexes under , "Creating Spatial Indexes".

Populating Spatial Columns

After you have created spatial columns, you can populate them with spatial data.

Values should be stored in internal geometry format, but you can convert them to that format from either Well-Known Text (WKT) or Well-Known Binary (WKB) format. The following examples demonstrate how to insert geometry values into a table by converting WKT values into internal geometry format:

The following examples insert more complex geometries into the table:

SET @g = 'LINESTRING(0 0,1 1,2 2)';
INSERT INTO geom VALUES (GeomFromText(@g));
SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))';
INSERT INTO geom VALUES (GeomFromText(@g));
SET @g =
'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))';
INSERT INTO geom VALUES (GeomFromText(@g));

The preceding examples all use GeomFromText() to create geometry values. You can also use type-specific functions:

SET @g = 'POINT(1 1)';
INSERT INTO geom VALUES (PointFromText(@g));
SET @g = 'LINESTRING(0 0,1 1,2 2)';
INSERT INTO geom VALUES (LineStringFromText(@g));
SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))';
INSERT INTO geom VALUES (PolygonFromText(@g));
SET @g =
'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))';
INSERT INTO geom VALUES (GeomCollFromText(@g));

Note that if a client application program wants to use WKB representations of geometry values, it is responsible for sending correctly formed WKB in queries to the server. However, there are several ways of satisfying this requirement. For example:

Fetching Spatial Data

Geometry values stored in a table can be fetched in internal format. You can also convert them into WKT or WKB format.

Spatial Analysis Functions

Geometry Format Conversion Functions
Geometry Functions
Functions That Create New Geometries from Existing Ones
Functions for Testing Spatial Relations Between Geometric Objects

After populating spatial columns with values, you are ready to query and analyze them. MariaDB provides a set of functions to perform various operations on spatial data. These functions can be grouped into four major categories according to the type of operation they perform:

Spatial analysis functions can be used in many contexts, such as:

Geometry Format Conversion Functions

MySQL supports the following functions for converting geometry values between internal format and either WKT or WKB format:

Geometry Functions

General Geometry Functions
Point Functions
LineString Functions
MultiLineString Functions
Polygon Functions
MultiPolygon Functions
GeometryCollection Functions

Each function that belongs to this group takes a geometry value as its argument and returns some quantitative or qualitative property of the geometry. Some functions restrict their argument type. Such functions return NULL if the argument is of an incorrect geometry type. For example, Area() returns NULL if the object type is neither Polygon nor MultiPolygon.

General Geometry Functions

The functions listed in this section do not restrict their argument and accept a geometry value of any type.

The OpenGIS specification also defines the following functions, which MariaDB does not implement:

Point Functions

A Point consists of X and Y coordinates, which may be obtained using the following functions:

LineString Functions

A LineString consists of Point values. You can extract particular points of a LineString, count the number of points that it contains, or obtain its length.

The OpenGIS specification also defines the following function, which MariaDB does not implement:

MultiLineString Functions

These functions return properties of MultiLineString values.

Polygon Functions

These functions return properties of Polygon values.

MultiPolygon Functions

These functions return properties of MultiPolygon values.

The OpenGIS specification also defines the following functions, which MariaDB does not implement:

GeometryCollection Functions

These functions return properties of GeometryCollection values.

Functions That Create New Geometries from Existing Ones

Geometry Functions That Produce New Geometries
Spatial Operators

The following sections describe functions that take geometry values as arguments and return new geometry values.

Geometry Functions That Produce New Geometries

, "Geometry Functions", discusses several functions that construct new geometries from existing ones. See that section for descriptions of these functions:

Spatial Operators

OpenGIS proposes a number of other functions that can produce geometries. They are designed to implement spatial operators.

These functions are not implemented in MySQL.

Functions for Testing Spatial Relations Between Geometric Objects

Relations on Geometry Minimal Bounding Rectangles (MBRs)
Functions That Test Spatial Relationships Between Geometries

The functions described in these sections take two geometries as input parameters and return a qualitative or quantitative relation between them.

Relations on Geometry Minimal Bounding Rectangles (MBRs)

MySQL provides several functions that test relations between minimal bounding rectangles of two geometries g1 and g2. The return values 1 and 0 indicate true and false, respectively.

Functions That Test Spatial Relationships Between Geometries

The OpenGIS specification defines the following functions. They test the relationship between two geometry values g1 and g2.

The return values 1 and 0 indicate true and false, respectively.Note

MySQL originally implemented these functions such that they used object bounding rectangles and returned the same result as the corresponding MBR-based functions. As of MariaDB 5.6.1, corresponding versions are available that use precise object shapes. These versions are named with an ST_ prefix. For example, Contains() uses object bounding rectangles, whereas ST_Contains() uses object shapes.

As of MariaDB 5.6.1, there are also ST_ aliases for existing spatial functions that were already exact. For example, ST_IsEmpty() is an alias for IsEmpty()

Functions That Use Object Shapes
Functions That Use Bounding Rectangles

Optimizing Spatial Analysis

Creating Spatial Indexes
Using a Spatial Index

For MyISAM tables, Search operations in nonspatial databases can be optimized using SPATIAL indexes. This is true for spatial databases as well. With the help of a great variety of multi-dimensional indexing methods that have previously been designed, it is possible to optimize spatial searches. The most typical of these are:

MySQL uses R-Trees with quadratic splitting for SPATIAL indexes on spatial columns. A SPATIAL index is built using the MBR of a geometry. For most geometries, the MBR is a minimum rectangle that surrounds the geometries. For a horizontal or a vertical linestring, the MBR is a rectangle degenerated into the linestring. For a point, the MBR is a rectangle degenerated into the point.

It is also possible to create normal indexes on spatial columns. In a non-SPATIAL index, you must declare a prefix for any spatial column except for POINT columns.

MyISAM supports both SPATIAL and non-SPATIAL indexes. Other storage engines support non-SPATIAL indexes, as described in , "CREATE INDEX Syntax".

Creating Spatial Indexes

For MyISAM tables, MariaDB can create spatial indexes using syntax similar to that for creating regular indexes, but extended with the SPATIAL keyword. Currently, columns in spatial indexes must be declared NOT NULL. The following examples demonstrate how to create spatial indexes:

For MyISAM tables, SPATIAL INDEX creates an R-tree index. For storage engines that support nonspatial indexing of spatial columns, the engine creates a B-tree index. A B-tree index on spatial values will be useful for exact-value lookups, but not for range scans.

For more information on indexing spatial columns, see , "CREATE INDEX Syntax".

To drop spatial indexes, use ALTER TABLE or DROP INDEX:

Example: Suppose that a table geom contains more than 32,000 geometries, which are stored in the column g of type GEOMETRY. The table also has an AUTO_INCREMENT column fid for storing object ID values.

mysql> DESCRIBE geom;
+-------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+----------------+
| fid | int(11) | | PRI | NULL | auto_increment |
| g | geometry | | | | |
+-------+----------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
mysql> SELECT COUNT(*) FROM geom;
+----------+
| count(*) |
+----------+
| 32376 |
+----------+
1 row in set (0.00 sec)

To add a spatial index on the column g, use this statement:

mysql> ALTER TABLE geom ADD SPATIAL INDEX(g);
Query OK, 32376 rows affected (4.05 sec)
Records: 32376 Duplicates: 0 Warnings: 0

Using a Spatial Index

The optimizer investigates whether available spatial indexes can be involved in the search for queries that use a function such as MBRContains() or MBRWithin() in the WHERE clause. The following query finds all objects that are in the given rectangle:

mysql> SET @poly =
 -> 'Polygon((30000 15000,
 31000 15000,
 31000 16000,
 30000 16000,
 30000 15000))';
mysql> SELECT fid,AsText(g) FROM geom WHERE
 -> MBRContains(GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+
| fid | AsText(g) |
+-----+---------------------------------------------------------------+
| 21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... |
| 22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... |
| 23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... |
| 24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... |
| 25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... |
| 26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... |
| 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... |
| 1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... |
| 2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... |
| 3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... |
| 4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... |
| 5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... |
| 6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... |
| 7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... |
| 10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... |
| 11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... |
| 13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... |
| 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... |
| 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... |
| 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... |
+-----+---------------------------------------------------------------+
20 rows in set (0.00 sec)

Use EXPLAIN to check the way this query is executed:

mysql> SET @poly =
 -> 'Polygon((30000 15000,
 31000 15000,
 31000 16000,
 30000 16000,
 30000 15000))';
mysql> EXPLAIN SELECT fid,AsText(g) FROM geom WHERE
 -> MBRContains(GeomFromText(@poly),g)\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: geom
 type: range possible_keys: g
 key: g
 key_len: 32
 ref: NULL
 rows: 50
 Extra: Using where
1 row in set (0.00 sec)

Check what would happen without a spatial index:

mysql> SET @poly =
 -> 'Polygon((30000 15000,
 31000 15000,
 31000 16000,
 30000 16000,
 30000 15000))';
mysql> EXPLAIN SELECT fid,AsText(g) FROM g IGNORE INDEX (g) WHERE
 -> MBRContains(GeomFromText(@poly),g)\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: geom
 type: ALL possible_keys: NULL
 key: NULL
 key_len: NULL
 ref: NULL
 rows: 32376
 Extra: Using where
1 row in set (0.00 sec)

Executing the SELECT statement without the spatial index yields the same result but causes the execution time to rise from 0.00 seconds to 0.46 seconds:

mysql> SET @poly =
 -> 'Polygon((30000 15000,
 31000 15000,
 31000 16000,
 30000 16000,
 30000 15000))';
mysql> SELECT fid,AsText(g) FROM geom IGNORE INDEX (g) WHERE
 -> MBRContains(GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+
| fid | AsText(g) |
+-----+---------------------------------------------------------------+
| 1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... |
| 2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... |
| 3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... |
| 4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... |
| 5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... |
| 6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... |
| 7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... |
| 10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... |
| 11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... |
| 13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... |
| 21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... |
| 22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... |
| 23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... |
| 24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... |
| 25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... |
| 26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... |
| 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... |
| 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... |
| 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... |
| 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... |
+-----+---------------------------------------------------------------+
20 rows in set (0.46 sec)

MySQL Conformance and Compatibility

MySQL does not yet implement the following GIS features:

Precision Math

Types of Numeric Values
DECIMAL Data Type Changes
Expression Handling
Rounding Behavior
Precision Math Examples

MySQL 5.6 provides support for precision math: numeric value handling that results in extremely accurate results and a high degree control over invalid values. Precision math is based on these two features:

These features have several implications for numeric operations and provide a high degree of compliance with standard SQL:

The following discussion covers several aspects of how precision math works, including possible incompatibilities with older applications. At the end, some examples are given that demonstrate how MariaDB 5.6 handles numeric operations precisely. For information about controlling the SQL mode, see , "Server SQL Modes".

Types of Numeric Values

The scope of precision math for exact-value operations includes the exact-value data types (DECIMAL and integer types) and exact-value numeric literals. Approximate-value data types and numeric literals are handled as floating-point numbers.

Exact-value numeric literals have an integer part or fractional part, or both. They may be signed. Examples: 1, .2, 3.4, -5, -6.78, +9.10.

Approximate-value numeric literals are represented in scientific notation with a mantissa and exponent. Either or both parts may be signed. Examples: 1.2E3, 1.2E-3, -1.2E3, -1.2E-3.

Two numbers that look similar may be treated differently. For example, 2.34 is an exact-value (fixed-point) number, whereas 2.34E0 is an approximate-value (floating-point) number.

The DECIMAL data type is a fixed-point type and calculations are exact. In MySQL, the DECIMAL type has several synonyms: NUMERIC, DEC, FIXED. The integer types also are exact-value types.

The FLOAT and DOUBLE data types are floating-point types and calculations are approximate. In MySQL, types that are synonymous with FLOAT or DOUBLE are DOUBLE PRECISION and REAL.

DECIMAL Data Type Changes

This section discusses the characteristics of the DECIMAL data type (and its synonyms) in MariaDB 5.6, with particular regard to the following topics:

Possible incompatibilities with applications that are written for older versions of MariaDB (prior to 5.0.3) are noted throughout this section.

The declaration syntax for a DECIMAL column is DECIMAL(M,D). The ranges of values for the arguments in MariaDB 5.6 are as follows:

The maximum value of 65 for M means that calculations on DECIMAL values are accurate up to 65 digits. This limit of 65 digits of precision also applies to exact-value numeric literals, so the maximum range of such literals differs from before. (In older versions of MySQL, decimal values could have up to 254 digits. However, calculations were done using floating-point and thus were approximate, not exact.)

Values for DECIMAL columns in MariaDB 5.6 are stored using a binary format that packs nine decimal digits into 4 bytes. The storage requirements for the integer and fractional parts of each value are determined separately. Each multiple of nine digits requires 4 bytes, and any remaining digits left over require some fraction of 4 bytes. The storage required for remaining digits is given by the following table.

Leftover Digits Number of Bytes
0 0
1-2 1
3-4 2
5-6 3
7-9 4

For example, a DECIMAL(18,9) column has nine digits on either side of the decimal point, so the integer part and the fractional part each require 4 bytes. A DECIMAL(20,6) column has fourteen integer digits and six fractional digits. The integer digits require four bytes for nine of the digits and 3 bytes for the remaining five digits. The six fractional digits require 3 bytes.

Unlike some older versions of MySQL, DECIMAL columns in MariaDB 5.6 do not store a leading + character or - character or leading 0 digits. If you insert +0003.1 into a DECIMAL(5,1) column, it is stored as 3.1. For negative numbers, a literal - character is not stored. Applications that rely on the older behavior must be modified to account for this change.

DECIMAL columns in MariaDB 5.6 do not permit values larger than the range implied by the column definition. For example, a DECIMAL(3,0) column supports a range of -999 to 999. A DECIMAL(M,D) column permits at most M - D digits to the left of the decimal point. This is not compatible with applications relying on older versions of MariaDB that permitted storing an extra digit in lieu of a + sign.

The SQL standard requires that the precision of NUMERIC(M,D) be exactly M digits. For DECIMAL(M,D), the standard requires a precision of at least M digits but permits more. In MySQL, DECIMAL(M,D) and NUMERIC(M,D) are the same, and both have a precision of exactly M digits.

For more detailed information about porting applications that rely on the old treatment of the DECIMAL data type, see the MySQL 5.0 Reference Manual.

Expression Handling

With precision math, exact-value numbers are used as given whenever possible. For example, numbers in comparisons are used exactly as given without a change in value. In strict SQL mode, for INSERT into a column with an exact data type (DECIMAL or integer), a number is inserted with its exact value if it is within the column range. When retrieved, the value should be the same as what was inserted. (Without strict mode, truncation for INSERT is permissible.)

Handling of a numeric expression depends on what kind of values the expression contains:

If a numeric expression contains any strings, they are converted to double-precision floating-point values and the expression is approximate.

Inserts into numeric columns are affected by the SQL mode, which is controlled by the sql_mode system variable. (See , "Server SQL Modes".) The following discussion mentions strict mode (selected by the STRICT_ALL_TABLES or STRICT_TRANS_TABLES mode values) and ERROR_FOR_DIVISION_BY_ZERO. To turn on all restrictions, you can simply use TRADITIONAL mode, which includes both strict mode values and ERROR_FOR_DIVISION_BY_ZERO:

mysql> SET sql_mode='TRADITIONAL';

If a number is inserted into an exact type column (DECIMAL or integer), it is inserted with its exact value if it is within the column range.

If the value has too many digits in the fractional part, rounding occurs and a warning is generated. Rounding is done as described in , "Rounding Behavior".

If the value has too many digits in the integer part, it is too large and is handled as follows:

Underflow is not detected, so underflow handling is undefined.

By default, division by zero produces a result of NULL and no warning. With the ERROR_FOR_DIVISION_BY_ZERO SQL mode enabled, MariaDB handles division by zero differently:

In other words, inserts and updates involving expressions that perform division by zero can be treated as errors, but this requires ERROR_FOR_DIVISION_BY_ZERO in addition to strict mode.

Suppose that we have this statement:

INSERT INTO t SET i = 1/0;

This is what happens for combinations of strict and ERROR_FOR_DIVISION_BY_ZERO modes.

sql_mode Value Result
'' (Default) No warning, no error; i is set to NULL.
strict No warning, no error; i is set to NULL.
ERROR_FOR_DIVISION_BY_ZERO Warning, no error; i is set to NULL.
strict,ERROR_FOR_DIVISION_BY_ZERO Error condition; no row is inserted.

For inserts of strings into numeric columns, conversion from string to number is handled as follows if the string has nonnumeric contents:

Rounding Behavior

This section discusses precision math rounding for the ROUND() function and for inserts into columns with exact-value types (DECIMAL and integer).

The ROUND() function rounds differently depending on whether its argument is exact or approximate:

The following example shows how rounding differs for exact and approximate values:

mysql> SELECT ROUND(2.5), ROUND(25E-1);
+------------+--------------+
| ROUND(2.5) | ROUND(25E-1) |
+------------+--------------+
| 3 | 2 |
+------------+--------------+

For inserts into a DECIMAL or integer column, the target is an exact data type, so rounding uses "round half up," regardless of whether the value to be inserted is exact or approximate:

mysql> CREATE TABLE t (d DECIMAL(10,0));
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO t VALUES(2.5),(2.5E0);
Query OK, 2 rows affected, 2 warnings (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 2
mysql> SELECT d FROM t;
+------+
| d |
+------+
| 3 |
| 3 |
+------+

Precision Math Examples

This section provides some examples that show precision math query results in MariaDB 5.6. These examples demonstrate the principles described in , "Expression Handling", and , "Rounding Behavior".

Example 1. Numbers are used with their exact value as given when possible:

mysql> SELECT (.1 + .2) = .3;
+----------------+
| (.1 + .2) = .3 |
+----------------+
| 1 |
+----------------+

For floating-point values, results are inexact:

mysql> SELECT (.1E0 + .2E0) = .3E0;
+----------------------+
| (.1E0 + .2E0) = .3E0 |
+----------------------+
| 0 |
+----------------------+

Another way to see the difference in exact and approximate value handling is to add a small number to a sum many times. Consider the following stored procedure, which adds .0001 to a variable 1,000 times.

CREATE PROCEDURE p ()
BEGIN
 DECLARE i INT DEFAULT 0;
 DECLARE d DECIMAL(10,4) DEFAULT 0;
 DECLARE f FLOAT DEFAULT 0;
 WHILE i < 10000 DO
 SET d = d + .0001;
 SET f = f + .0001E0;
 SET i = i + 1;
 END WHILE;
 SELECT d, f;
END;

The sum for both d and f logically should be 1, but that is true only for the decimal calculation. The floating-point calculation introduces small errors:

+--------+------------------+
| d | f |
+--------+------------------+
| 1.0000 | 0.99999999999991 |
+--------+------------------+

Example 2. Multiplication is performed with the scale required by standard SQL. That is, for two numbers X1 and X2 that have scale S1 and S2, the scale of the result is S1 + S2:

mysql> SELECT .01 * .01;
+-----------+
| .01 * .01 |
+-----------+
| 0.0001 |
+-----------+

Example 3. Rounding behavior for exact-value numbers is well-defined:

Rounding behavior (for example, with the ROUND() function) is independent of the implementation of the underlying C library, which means that results are consistent from platform to platform.

Example 4. In strict mode, inserting a value that is out of range for a column causes an error, rather than truncation to a legal value.

When MariaDB is not running in strict mode, truncation to a legal value occurs:

mysql> SET sql_mode='';
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO t SET i = 128;
Query OK, 1 row affected, 1 warning (0.00 sec)
mysql> SELECT i FROM t;
+------+
| i |
+------+
| 127 |
+------+
1 row in set (0.00 sec)

However, an error occurs if strict mode is in effect:

mysql> SET sql_mode='STRICT_ALL_TABLES';
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO t SET i = 128;
ERROR 1264 (22003): Out of range value adjusted for column 'i' at row 1
mysql> SELECT i FROM t;
Empty set (0.00 sec)

Example 5: In strict mode and with ERROR_FOR_DIVISION_BY_ZERO set, division by zero causes an error, not a result of NULL.

In nonstrict mode, division by zero has a result of NULL:

mysql> SET sql_mode='';
Query OK, 0 rows affected (0.01 sec)
mysql> CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO t SET i = 1 / 0;
Query OK, 1 row affected (0.00 sec)
mysql> SELECT i FROM t;
+------+
| i |
+------+
| NULL |
+------+
1 row in set (0.03 sec)

However, division by zero is an error if the proper SQL modes are in effect:

mysql> SET sql_mode='STRICT_ALL_TABLES,ERROR_FOR_DIVISION_BY_ZERO';
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE t (i TINYINT);
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO t SET i = 1 / 0;
ERROR 1365 (22012): Division by 0
mysql> SELECT i FROM t;
Empty set (0.01 sec)

Example 6. Exact-value literals are evaluated as exact values.

Prior to MariaDB 5.0.3, exact-value and approximate-value literals both are evaluated as double-precision floating-point values:

mysql> SELECT VERSION();
+------------+
| VERSION() |
+------------+
| 4.1.18-log |
+------------+
1 row in set (0.01 sec)
mysql> CREATE TABLE t SELECT 2.5 AS a, 25E-1 AS b;
Query OK, 1 row affected (0.07 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> DESCRIBE t;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| a | double(3,1) | | | 0.0 | |
| b | double | | | 0 | |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.04 sec)

As of MariaDB 5.0.3, the approximate-value literal is evaluated using floating point, but the exact-value literal is handled as DECIMAL:

mysql> SELECT VERSION();
+-----------------+
| VERSION() |
+-----------------+
| 5.1.6-alpha-log |
+-----------------+
1 row in set (0.11 sec)
mysql> CREATE TABLE t SELECT 2.5 AS a, 25E-1 AS b;
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> DESCRIBE t;
+-------+-----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------------------+------+-----+---------+-------+
| a | decimal(2,1) unsigned | NO | | 0.0 | |
| b | double | NO | | 0 | |
+-------+-----------------------+------+-----+---------+-------+
2 rows in set (0.01 sec)

Example 7. If the argument to an aggregate function is an exact numeric type, the result is also an exact numeric type, with a scale at least that of the argument.

Consider these statements:

mysql> CREATE TABLE t (i INT, d DECIMAL, f FLOAT);
mysql> INSERT INTO t VALUES(1,1,1);
mysql> CREATE TABLE y SELECT AVG(i), AVG(d), AVG(f) FROM t;

Before MariaDB 5.0.3, the result is a double no matter the argument type:

mysql> DESCRIBE y;
+--------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+-------+
| AVG(i) | double(17,4) | YES | | NULL | |
| AVG(d) | double(17,4) | YES | | NULL | |
| AVG(f) | double | YES | | NULL | |
+--------+--------------+------+-----+---------+-------+

As of MariaDB 5.0.3, the result is a double only for the floating-point argument. For exact type arguments, the result is also an exact type:

mysql> DESCRIBE y;
+--------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------------+------+-----+---------+-------+
| AVG(i) | decimal(14,4) | YES | | NULL | |
| AVG(d) | decimal(14,4) | YES | | NULL | |
| AVG(f) | double | YES | | NULL | |
+--------+---------------+------+-----+---------+-------+

The result is a double only for the floating-point argument. For exact type arguments, the result is also an exact type.Copyright 1997, 2012, Oracle and/or its affiliates. All rights reserved. Legal Notices


Prev Next
Data Types Home Chapter 12. SQL Statement Syntax