The string types are CHAR
,
VARCHAR
, BINARY
,
VARBINARY
, BLOB
,
TEXT
, ENUM
, and
SET
. This section describes how these types
work and how to use them in your queries. For string type storage
requirements, see Section 11.5, “Data Type Storage Requirements”.
The CHAR
and VARCHAR
types
are similar, but differ in the way they are stored and
retrieved. As of MySQL 5.0.3, they also differ in maximum length
and in whether trailing spaces are retained.
The CHAR
and VARCHAR
types
are declared with a length that indicates the maximum number of
characters you want to store. For example,
CHAR(30)
can hold up to 30 characters.
The length of a CHAR
column is fixed to the
length that you declare when you create the table. The length
can be any value from 0 to 255. When CHAR
values are stored, they are right-padded with spaces to the
specified length. When CHAR
values are
retrieved, trailing spaces are removed.
Values in VARCHAR
columns are variable-length
strings. The length can be specified as a value from 0 to 255
before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions.
(The maximum effective length of a VARCHAR
in
MySQL 5.0.3 and later is determined by the maximum row size and
the character set used. The maximum length overall is 65,532
bytes.)
In contrast to CHAR
,
VARCHAR
values are stored using only as many
characters as are needed, plus one byte to record the length
(two bytes for columns that are declared with a length longer
than 255).
VARCHAR
values are not padded when they are
stored. Handling of trailing spaces is version-dependent. As of
MySQL 5.0.3, trailing spaces are retained when values are stored
and retrieved, in conformance with standard SQL. Before MySQL
5.0.3, trailing spaces are removed from values when they are
stored into a VARCHAR
column; this means that
the spaces also are absent from retrieved values.
If you assign a value to a CHAR
or
VARCHAR
column that exceeds the column's
maximum length, the value is truncated to fit. If the truncated
characters are not spaces, a warning is generated. For
truncation of non-space characters, you can cause an error to
occur (rather than a warning) and suppress insertion of the
value by using strict SQL mode. See
Section 5.2.5, “The Server SQL Mode”.
Before MySQL 5.0.3, if you need a data type for which trailing
spaces are not removed, consider using a BLOB
or TEXT
type. Also, if you want to store
binary values such as results from an encryption or compression
function that might contain arbitrary byte values, use a
BLOB
column rather than a
CHAR
or VARCHAR
column, to
avoid potential problems with trailing space removal that would
change data values.
The following table illustrates the differences between
CHAR
and VARCHAR
by
showing the result of storing various string values into
CHAR(4)
and VARCHAR(4)
columns:
Value | CHAR(4) | Storage Required | VARCHAR(4) | Storage Required |
'' | ' ' | 4 bytes | '' | 1 byte |
'ab' | 'ab ' | 4 bytes | 'ab' | 3 bytes |
'abcd' | 'abcd' | 4 bytes | 'abcd' | 5 bytes |
'abcdefgh' | 'abcd' | 4 bytes | 'abcd' | 5 bytes |
Note that the values shown as stored in the last row of the table apply only when not using strict mode; if MySQL is running in strict mode, values that exceed the column length are not stored, and an error results.
If a given value is stored into the CHAR(4)
and VARCHAR(4)
columns, the values retrieved
from the columns are not always the same because trailing spaces
are removed from CHAR
columns upon retrieval.
The following example illustrates this difference:
mysql>CREATE TABLE vc (v VARCHAR(4), c CHAR(4));
Query OK, 0 rows affected (0.01 sec) mysql>INSERT INTO vc VALUES ('ab ', 'ab ');
Query OK, 1 row affected (0.00 sec) mysql>SELECT CONCAT('(', v, ')'), CONCAT('(', c, ')') FROM vc;
+---------------------+---------------------+ | CONCAT('(', v, ')') | CONCAT('(', c, ')') | +---------------------+---------------------+ | (ab ) | (ab) | +---------------------+---------------------+ 1 row in set (0.06 sec)
Values in CHAR
and VARCHAR
columns are sorted and compared according to the character set
collation assigned to the column.
Note that all MySQL collations are of type
PADSPACE
. This means that all
CHAR
and VARCHAR
values in
MySQL are compared without regard to any trailing spaces. For
example:
mysql>CREATE TABLE names (myname CHAR(10), yourname VARCHAR(10));
Query OK, 0 rows affected (0.09 sec) mysql>INSERT INTO names VALUES ('Monty ', 'Monty ');
Query OK, 1 row affected (0.00 sec) mysql>SELECT myname = 'Monty ', yourname = 'Monty ' FROM names;
+--------------------+----------------------+ | myname = 'Monty ' | yourname = 'Monty ' | +--------------------+----------------------+ | 1 | 1 | +--------------------+----------------------+ 1 row in set (0.00 sec)
Note that this is true for all MySQL versions, and it makes no
difference whether your version trims trailing spaces from
VARCHAR
values before storing them. Nor does
the server SQL mode make any difference in this regard.
For those cases where trailing pad characters are stripped or
comparisons ignore them, if a column has an index that requires
unique values, inserting into the column values that differ only
in number of trailing pad characters will result in a
duplicate-key error. For example, if a table contains
'a'
, an attempt to store
'a '
causes a duplicate-key error.
The BINARY
and VARBINARY
types are similar to CHAR
and
VARCHAR
, except that they contain binary
strings rather than non-binary strings. That is, they contain
byte strings rather than character strings. This means that they
have no character set, and sorting and comparison are based on
the numeric values of the bytes in the values.
The allowable maximum length is the same for
BINARY
and VARBINARY
as it
is for CHAR
and VARCHAR
,
except that the length for BINARY
and
VARBINARY
is a length in bytes rather than in
characters.
The BINARY
and VARBINARY
data types are distinct from the CHAR BINARY
and VARCHAR BINARY
data types. For the latter
types, the BINARY
attribute does not cause
the column to be treated as a binary string column. Instead, it
causes the binary collation for the column character set to be
used, and the column itself contains non-binary character
strings rather than binary byte strings. For example,
CHAR(5) BINARY
is treated as CHAR(5)
CHARACTER SET latin1 COLLATE latin1_bin
, assuming that
the default character set is latin1
. This
differs from BINARY(5)
, which stores 5-bytes
binary strings that have no character set or collation.
When BINARY
values are stored, they are
right-padded with the pad value to the specified length. The pad
value and how it is handled is version specific:
As of MySQL 5.0.15, the pad value is 0x00
(the zero byte). Values are right-padded with
0x00
on insert, and no trailing bytes are
removed on select. All bytes are significant in comparisons,
including ORDER BY
and
DISTINCT
operations.
0x00
bytes and spaces are different in
comparisons, with 0x00
< space.
Example: For a BINARY(3)
column,
'a '
becomes
'a \0'
when inserted.
'a\0'
becomes 'a\0\0'
when inserted. Both inserted values remain unchanged when
selected.
Before MySQL 5.0.15, the pad value is space. Values are
right-padded with space on insert, and trailing spaces are
removed on select. Trailing spaces are ignored in
comparisons, including ORDER BY
and
DISTINCT
operations.
0x00
bytes and spaces are different in
comparisons, with 0x00
< space.
Example: For a BINARY(3)
column,
'a '
becomes
'a '
when inserted and
'a'
when selected.
'a\0'
becomes
'a\0 '
when inserted and
'a\0'
when selected.
For VARBINARY
, there is no padding on insert
and no bytes are stripped on select. All bytes are significant
in comparisons, including ORDER BY
and
DISTINCT
operations. 0x00
bytes and spaces are different in comparisons, with
0x00
< space. (Exceptions: Before MySQL
5.0.3, trailing spaces are removed when values are stored.
Before MySQL 5.0.15, trailing 0x00 bytes are removed for
ORDER BY
operations.)
Note: The InnoDB
storage engine continues to
preserve trailing spaces in BINARY
and
VARBINARY
column values through MySQL 5.0.18.
Beginning with MySQL 5.0.19, InnoDB
uses
trailing space characters in making comparisons as do other
MySQL storage engines.
For those cases where trailing pad bytes are stripped or
comparisons ignore them, if a column has an index that requires
unique values, inserting into the column values that differ only
in number of trailing pad bytes will result in a duplicate-key
error. For example, if a table contains 'a'
,
an attempt to store 'a\0'
causes a
duplicate-key error.
You should consider the preceding padding and stripping
characteristics carefully if you plan to use the
BINARY
data type for storing binary data and
you require that the value retrieved be exactly the same as the
value stored. The following example illustrates how
0x00
-padding of BINARY
values affects column value comparisons:
mysql>CREATE TABLE t (c BINARY(3));
Query OK, 0 rows affected (0.01 sec) mysql>INSERT INTO t SET c = 'a';
Query OK, 1 row affected (0.01 sec) mysql>SELECT HEX(c), c = 'a', c = 'a\0\0' from t;
+--------+---------+-------------+ | HEX(c) | c = 'a' | c = 'a\0\0' | +--------+---------+-------------+ | 610000 | 0 | 1 | +--------+---------+-------------+ 1 row in set (0.09 sec)
If the value retrieved must be the same as the value specified
for storage with no padding, it might be preferable to use
VARBINARY
or one of the
BLOB
data types instead.
A BLOB
is a binary large object that can hold
a variable amount of data. The four BLOB
types are TINYBLOB
, BLOB
,
MEDIUMBLOB
, and LONGBLOB
.
These differ only in the maximum length of the values they can
hold. The four TEXT
types are
TINYTEXT
, TEXT
,
MEDIUMTEXT
, and LONGTEXT
.
These correspond to the four BLOB
types and
have the same maximum lengths and storage requirements. See
Section 11.5, “Data Type Storage Requirements”. No lettercase conversion
for TEXT
or BLOB
columns
takes place during storage or retrieval.
BLOB
columns are treated as binary strings
(byte strings). TEXT
columns are treated as
non-binary strings (character strings). BLOB
columns have no character set, and sorting and comparison are
based on the numeric values of the bytes in column values.
TEXT
columns have a character set, and values
are sorted and compared based on the collation of the character
set.
If a TEXT
column is indexed, index entry
comparisons are space-padded at the end. This means that, if the
index requires unique values, duplicate-key errors will occur
for values that differ only in the number of trailing spaces.
For example, if a table contains 'a'
, an
attempt to store 'a '
causes a
duplicate-key error. This is not true for
BLOB
columns.
When not running in strict mode, if you assign a value to a
BLOB
or TEXT
column that
exceeds the data type's maximum length, the value is truncated
to fit. If the truncated characters are not spaces, a warning is
generated. You can cause an error to occur and the value to be
rejected rather than to be truncated with a warning by using
strict SQL mode. See Section 5.2.5, “The Server SQL Mode”.
In most respects, you can regard a BLOB
column as a VARBINARY
column that can be as
large as you like. Similarly, you can regard a
TEXT
column as a VARCHAR
column. BLOB
and TEXT
differ from VARBINARY
and
VARCHAR
in the following ways:
There is no trailing-space removal for
BLOB
and TEXT
columns
when values are stored or retrieved. Before MySQL 5.0.3,
this differs from VARBINARY
and
VARCHAR
, for which trailing spaces are
removed when values are stored.
Note that TEXT
is on comparison space
extended to fit the compared object, exactly like
CHAR
and VARCHAR
.
For indexes on BLOB
and
TEXT
columns, you must specify an index
prefix length. For CHAR
and
VARCHAR
, a prefix length is optional. See
Section 7.4.3, “Column Indexes”.
LONG
and LONG VARCHAR
map
to the MEDIUMTEXT
data type. This is a
compatibility feature. If you use the BINARY
attribute with a TEXT
data type, the column
is assigned the binary collation of the column character set.
MySQL Connector/ODBC defines BLOB
values as
LONGVARBINARY
and TEXT
values as LONGVARCHAR
.
Because BLOB
and TEXT
values can be extremely long, you might encounter some
constraints in using them:
Only the first max_sort_length
bytes of
the column are used when sorting. The default value of
max_sort_length
is 1024. This value can
be changed using the
--max_sort_length=
option when starting the mysqld server.
See Section 5.2.2, “Server System Variables”.
N
You can make more bytes significant in sorting or grouping
by increasing the value of
max_sort_length
at runtime. Any client
can change the value of its session
max_sort_length
variable:
mysql>SET max_sort_length = 2000;
mysql>SELECT id, comment FROM t
->ORDER BY comment;
Another way to use GROUP BY
or
ORDER BY
on a BLOB
or
TEXT
column containing long values when
you want more than max_sort_length
bytes
to be significant is to convert the column value into a
fixed-length object. The standard way to do this is with the
SUBSTRING()
function. For example, the
following statement causes 2000 bytes of the
comment
column to be taken into account
for sorting:
mysql>SELECT id, SUBSTRING(comment,1,2000) FROM t
->ORDER BY SUBSTRING(comment,1,2000);
The maximum size of a BLOB
or
TEXT
object is determined by its type,
but the largest value you actually can transmit between the
client and server is determined by the amount of available
memory and the size of the communications buffers. You can
change the message buffer size by changing the value of the
max_allowed_packet
variable, but you must
do so for both the server and your client program. For
example, both mysql and
mysqldump allow you to change the
client-side max_allowed_packet
value. See
Section 7.5.2, “Tuning Server Parameters”,
Section 8.6, “mysql — The MySQL Command-Line Tool”, and Section 8.12, “mysqldump — A Database Backup Program”.
Each BLOB
or TEXT
value is
represented internally by a separately allocated object. This is
in contrast to all other data types, for which storage is
allocated once per column when the table is opened.
In some cases, it may be desirable to store binary data such as
media files in BLOB
or
TEXT
columns. You may find MySQL's string
handling functions useful for working with such data. See
Section 12.3, “String Functions”. For security and other
reasons, it is usually preferable to do so using application
code rather than allowing application users the
FILE
privilege. You can discuss specifics for
various languages and platforms in the MySQL Forums
(http://forums.mysql.com/).
An ENUM
is a string object with a value
chosen from a list of allowed values that are enumerated
explicitly in the column specification at table creation time.
An enumeration value must be a quoted string literal; it may not be an expression, even one that evaluates to a string value. This means that you also may not employ a user variable as an enumeration value.
For example, you can create a table with an
ENUM
column like this:
CREATE TABLE sizes ( name ENUM('small', 'medium', 'large') );
However, this version of the previous CREATE
TABLE
statement does not work:
CREATE TABLE sizes ( c1 ENUM('small', CONCAT('med','ium'), 'large') );
You also may not employ a user variable as an enumeration value. This pair of statements do not work:
SET @mysize = 'medium'; CREATE TABLE sizes ( name ENUM('small', @mysize, 'large') );
If you wish to use a number as an enumeration value, you must enclose it in quotes.
The value may also be the empty string (''
)
or NULL
under certain circumstances:
If you insert an invalid value into an
ENUM
(that is, a string not present in
the list of allowed values), the empty string is inserted
instead as a special error value. This string can be
distinguished from a “normal” empty string by
the fact that this string has the numerical value 0. More
about this later.
If strict SQL mode is enabled, attempts to insert invalid
ENUM
values result in an error.
If an ENUM
column is declared to allow
NULL
, the NULL
value
is a legal value for the column, and the default value is
NULL
. If an ENUM
column is declared NOT NULL
, its default
value is the first element of the list of allowed values.
Each enumeration value has an index:
Values from the list of allowable elements in the column specification are numbered beginning with 1.
The index value of the empty string error value is 0. This
means that you can use the following
SELECT
statement to find rows into which
invalid ENUM
values were assigned:
mysql> SELECT * FROM tbl_name
WHERE enum_col
=0;
The index of the NULL
value is
NULL
.
The term “index” here refers only to position within the list of enumeration values. It has nothing to do with table indexes.
For example, a column specified as ENUM('one', 'two',
'three')
can have any of the values shown here. The
index of each value is also shown:
Value | Index |
NULL | NULL |
'' | 0 |
'one' | 1 |
'two' | 2 |
'three' | 3 |
An enumeration can have a maximum of 65,535 elements.
Trailing spaces are automatically deleted from
ENUM
member values in the table definition
when a table is created.
When retrieved, values stored into an ENUM
column are displayed using the lettercase that was used in the
column definition. Note that ENUM
columns can
be assigned a character set and collation. For binary or
case-sensitive collations, lettercase is taken into account when
assigning values to the column.
If you retrieve an ENUM
value in a numeric
context, the column value's index is returned. For example, you
can retrieve numeric values from an ENUM
column like this:
mysql> SELECT enum_col
+0 FROM tbl_name
;
If you store a number into an ENUM
column,
the number is treated as an index, and the value stored is the
enumeration member with that index. (However, this does not work
with LOAD DATA
, which treats all input as
strings.) It is not advisable to define an
ENUM
column with enumeration values that look
like numbers, because this can easily become confusing. For
example, the following column has enumeration members with
string values of '0'
, '1'
,
and '2'
, but numeric index values of
1
, 2
, and
3
:
numbers ENUM('0','1','2')
ENUM
values are sorted according to the order
in which the enumeration members were listed in the column
specification. (In other words, ENUM
values
are sorted according to their index numbers.) For example,
'a'
sorts before 'b'
for
ENUM('a', 'b')
, but 'b'
sorts before 'a'
for ENUM('b',
'a')
. The empty string sorts before non-empty strings,
and NULL
values sort before all other
enumeration values. To prevent unexpected results, specify the
ENUM
list in alphabetical order. You can also
use GROUP BY CAST(col AS CHAR)
or
GROUP BY CONCAT(col)
to make sure that the
column is sorted lexically rather than by index number.
If you want to determine all possible values for an
ENUM
column, use SHOW COLUMNS FROM
and parse the
tbl_name
LIKE
enum_col
ENUM
definition in the
Type
column of the output.
A SET
is a string object that can have zero
or more values, each of which must be chosen from a list of
allowed values specified when the table is created.
SET
column values that consist of multiple
set members are specified with members separated by commas
(‘,
’). A consequence of this is
that SET
member values should not themselves
contain commas.
For example, a column specified as SET('one', 'two')
NOT NULL
can have any of these values:
'' 'one' 'two' 'one,two'
A SET
can have a maximum of 64 different
members.
Trailing spaces are automatically deleted from
SET
member values in the table definition
when a table is created.
When retrieved, values stored in a SET
column
are displayed using the lettercase that was used in the column
definition. Note that SET
columns can be
assigned a character set and collation. For binary or
case-sensitive collations, lettercase is taken into account when
assigning values to the column.
MySQL stores SET
values numerically, with the
low-order bit of the stored value corresponding to the first set
member. If you retrieve a SET
value in a
numeric context, the value retrieved has bits set corresponding
to the set members that make up the column value. For example,
you can retrieve numeric values from a SET
column like this:
mysql> SELECT set_col
+0 FROM tbl_name
;
If a number is stored into a SET
column, the
bits that are set in the binary representation of the number
determine the set members in the column value. For a column
specified as SET('a','b','c','d')
, the
members have the following decimal and binary values:
SET Member | Decimal Value | Binary Value |
'a' | 1 | 0001 |
'b' | 2 | 0010 |
'c' | 4 | 0100 |
'd' | 8 | 1000 |
If you assign a value of 9
to this column,
that is 1001
in binary, so the first and
fourth SET
value members
'a'
and 'd'
are selected
and the resulting value is 'a,d'
.
For a value containing more than one SET
element, it does not matter what order the elements are listed
in when you insert the value. It also does not matter how many
times a given element is listed in the value. When the value is
retrieved later, each element in the value appears once, with
elements listed according to the order in which they were
specified at table creation time. For example, suppose that a
column is specified as SET('a','b','c','d')
:
mysql> CREATE TABLE myset (col SET('a', 'b', 'c', 'd'));
If you insert the values 'a,d'
,
'd,a'
, 'a,d,d'
,
'a,d,a'
, and 'd,a,d'
:
mysql> INSERT INTO myset (col) VALUES
-> ('a,d'), ('d,a'), ('a,d,a'), ('a,d,d'), ('d,a,d');
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0
Then all of these values appear as 'a,d'
when
retrieved:
mysql> SELECT col FROM myset;
+------+
| col |
+------+
| a,d |
| a,d |
| a,d |
| a,d |
| a,d |
+------+
5 rows in set (0.04 sec)
If you set a SET
column to an unsupported
value, the value is ignored and a warning is issued:
mysql>INSERT INTO myset (col) VALUES ('a,d,d,s');
Query OK, 1 row affected, 1 warning (0.03 sec) mysql>SHOW WARNINGS;
+---------+------+------------------------------------------+ | Level | Code | Message | +---------+------+------------------------------------------+ | Warning | 1265 | Data truncated for column 'col' at row 1 | +---------+------+------------------------------------------+ 1 row in set (0.04 sec) mysql>SELECT col FROM myset;
+------+ | col | +------+ | a,d | | a,d | | a,d | | a,d | | a,d | | a,d | +------+ 6 rows in set (0.01 sec)
If strict SQL mode is enabled, attempts to insert invalid
SET
values result in an error.
SET
values are sorted numerically.
NULL
values sort before
non-NULL
SET
values.
Normally, you search for SET
values using the
FIND_IN_SET()
function or the
LIKE
operator:
mysql>SELECT * FROM
mysql>tbl_name
WHERE FIND_IN_SET('value
',set_col
)>0;SELECT * FROM
tbl_name
WHEREset_col
LIKE '%value
%';
The first statement finds rows where
set_col
contains the
value
set member. The second is
similar, but not the same: It finds rows where
set_col
contains
value
anywhere, even as a substring
of another set member.
The following statements also are legal:
mysql>SELECT * FROM
mysql>tbl_name
WHEREset_col
& 1;SELECT * FROM
tbl_name
WHEREset_col
= 'val1
,val2
';
The first of these statements looks for values containing the
first set member. The second looks for an exact match. Be
careful with comparisons of the second type. Comparing set
values to
'
returns different results than comparing values to
val1
,val2
''
.
You should specify the values in the same order they are listed
in the column definition.
val2
,val1
'
If you want to determine all possible values for a
SET
column, use SHOW COLUMNS FROM
and parse the
tbl_name
LIKE
set_col
SET
definition in the Type
column of the output.