Disk seeks are a huge performance bottleneck. This problem becomes more apparent when the amount of data starts to grow so large that effective caching becomes impossible. For large databases where you access data more or less randomly, you can be sure that you need at least one disk seek to read and a couple of disk seeks to write things. To minimize this problem, use disks with low seek times.
Increase the number of available disk spindles (and thereby reduce the seek overhead) by either symlinking files to different disks or striping the disks:
Using symbolic links
This means that, for MyISAM
tables, you
symlink the index file and data files from their usual
location in the data directory to another disk (that may
also be striped). This makes both the seek and read times
better, assuming that the disk is not used for other
purposes as well. See Section 7.6.1, “Using Symbolic Links”.
Striping means that you have many disks and put the first
block on the first disk, the second block on the second
disk, and the N
-th block on the
(
)
disk, and so on. This means if your normal data size is
less than the stripe size (or perfectly aligned), you get
much better performance. Striping is very dependent on the
operating system and the stripe size, so benchmark your
application with different stripe sizes. See
Section 7.1.5, “Using Your Own Benchmarks”.
N
MOD
number_of_disks
The speed difference for striping is very dependent on the parameters. Depending on how you set the striping parameters and number of disks, you may get differences measured in orders of magnitude. You have to choose to optimize for random or sequential access.
For reliability, you may want to use RAID 0+1 (striping plus
mirroring), but in this case, you need 2 ×
N
drives to hold
N
drives of data. This is probably
the best option if you have the money for it. However, you may
also have to invest in some volume-management software to
handle it efficiently.
A good option is to vary the RAID level according to how
critical a type of data is. For example, store semi-important
data that can be regenerated on a RAID 0 disk, but store
really important data such as host information and logs on a
RAID 0+1 or RAID N
disk. RAID
N
can be a problem if you have many
writes, due to the time required to update the parity bits.
On Linux, you can get much more performance by using
hdparm
to configure your disk's interface.
(Up to 100% under load is not uncommon.) The following
hdparm
options should be quite good for
MySQL, and probably for many other applications:
hdparm -m 16 -d 1
Note that performance and reliability when using this command
depend on your hardware, so we strongly suggest that you test
your system thoroughly after using hdparm
.
Please consult the hdparm
manual page for
more information. If hdparm
is not used
wisely, filesystem corruption may result, so back up
everything before experimenting!
You can also set the parameters for the filesystem that the database uses:
If you do not need to know when files were last accessed
(which is not really useful on a database server), you can
mount your filesystems with the -o noatime
option. That skips updates to the last access time in inodes
on the filesystem, which avoids some disk seeks.
On many operating systems, you can set a filesystem to be
updated asynchronously by mounting it with the -o
async
option. If your computer is reasonably stable,
this should give you more performance without sacrificing too
much reliability. (This flag is on by default on Linux.)
You can move tables and databases from the database directory to other locations and replace them with symbolic links to the new locations. You might want to do this, for example, to move a database to a file system with more free space or increase the speed of your system by spreading your tables to different disk.
The recommended way to do this is simply to symlink databases to a different disk. Symlink tables only as a last resort.
On Unix, the way to symlink a database is first to create a directory on some disk where you have free space and then to create a symlink to it from the MySQL data directory.
shell>mkdir /dr1/databases/test
shell>ln -s /dr1/databases/test
/path/to/datadir
MySQL does not support linking one directory to multiple
databases. Replacing a database directory with a symbolic link
works as long as you do not make a symbolic link between
databases. Suppose that you have a database
db1
under the MySQL data directory, and
then make a symlink db2
that points to
db1
:
shell>cd
shell>/path/to/datadir
ln -s db1 db2
The result is that, or any table tbl_a
in
db1
, there also appears to be a table
tbl_a
in db2
. If one
client updates db1.tbl_a
and another client
updates db2.tbl_a
, problems are likely to
occur.
However, if you really need to do this, it is possible by
altering the source file
mysys/my_symlink.c
, in which you should
look for the following statement:
if (!(MyFlags & MY_RESOLVE_LINK) || (!lstat(filename,&stat_buff) && S_ISLNK(stat_buff.st_mode)))
Change the statement to this:
if (1)
You should not symlink tables on systems that do not have a
fully operational realpath()
call. (Linux
and Solaris support realpath()
). You can
check whether your system supports symbolic links by issuing a
SHOW VARIABLES LIKE 'have_symlink'
statement.
Symlinks are fully supported only for
MyISAM
tables. For files used by tables for
other storage engines, you may get strange problems if you try
to use symbolic links.
The handling of symbolic links for MyISAM
tables works as follows:
In the data directory, you always have the table format
(.frm
) file, the data
(.MYD
) file, and the index
(.MYI
) file. The data file and index
file can be moved elsewhere and replaced in the data
directory by symlinks. The format file cannot.
You can symlink the data file and the index file independently to different directories.
You can instruct a running MySQL server to perform the
symlinking by using the DATA DIRECTORY
and INDEX DIRECTORY
options to
CREATE TABLE
. See
Section 13.1.5, “CREATE TABLE
Syntax”. Alternatively, symlinking
can be accomplished manually from the command line using
ln -s
if mysqld is
not running.
myisamchk does not replace a symlink with the data file or index file. It works directly on the file to which the symlink points. Any temporary files are created in the directory where the data file or index file is located.
Note: When you drop a
table that is using symlinks, both the symlink
and the file to which the symlink points are
dropped. This is an extremely good reason why
you should not run
mysqld as the system
root
or allow system users to have
write access to MySQL database directories.
If you rename a table with ALTER TABLE ...
RENAME
and you do not move the table to another
database, the symlinks in the database directory are
renamed to the new names and the data file and index file
are renamed accordingly.
If you use ALTER TABLE ... RENAME
to
move a table to another database, the table is moved to
the other database directory. The old symlinks and the
files to which they pointed are deleted. In other words,
the new table is not symlinked.
If you are not using symlinks, you should use the
--skip-symbolic-links
option to
mysqld to ensure that no one can use
mysqld to drop or rename a file outside
of the data directory.
Table symlink operations that are not yet supported:
ALTER TABLE
ignores the DATA
DIRECTORY
and INDEX DIRECTORY
table options.
BACKUP TABLE
and RESTORE
TABLE
do not respect symbolic links.
The .frm
file must
never be a symbolic link (as
indicated previously, only the data and index files can be
symbolic links). Attempting to do this (for example, to
make synonyms) produces incorrect results. Suppose that
you have a database db1
under the MySQL
data directory, a table tbl1
in this
database, and in the db1
directory you
make a symlink tbl2
that points to
tbl1
:
shell>cd
shell>/path/to/datadir
/db1ln -s tbl1.frm tbl2.frm
shell>ln -s tbl1.MYD tbl2.MYD
shell>ln -s tbl1.MYI tbl2.MYI
Problems result if one thread reads
db1.tbl1
and another thread updates
db1.tbl2
:
The query cache is “fooled” (it has no
way of knowing that tbl1
has not
been updated, so it returns outdated results).
ALTER
statements on
tbl2
fail.
Symbolic links are enabled by default for all Windows servers.
This enables you to put a database directory on a different
disk by setting up a symbolic link to it. This is similar to
the way that database symbolic links work on Unix, although
the procedure for setting up the link is different. If you do
not need symbolic links, you can disable them using the
--skip-symbolic-links
option.
On Windows, create a symbolic link to a MySQL database by
creating a file in the data directory that contains the path
to the destination directory. The file should be named
,
where db_name
.symdb_name
is the database name.
Suppose that the MySQL data directory is
C:\mysql\data
and you want to have
database foo
located at
D:\data\foo
. Set up a symlink using this
procedure
Make sure that the D:\data\foo
directory exists by creating it if necessary. If you
already have a database directory named
foo
in the data directory, you should
move it to D:\data
. Otherwise, the
symbolic link will be ineffective. To avoid problems, make
sure that the server is not running when you move the
database directory.
Create a text file
C:\mysql\data\foo.sym
that contains
the pathname D:\data\foo\
.
After this, all tables created in the database
foo
are created in
D:\data\foo
. Note that the
symbolic link is not used if a directory with the same name as
the database exists in the MySQL data directory.