In this section, we discuss changes in the implementation of MySQL Cluster in MySQL 5.0 as compared to MySQL 4.1. We will also discuss our roadmap for further improvements to MySQL Cluster as currently planned for MySQL 5.1.
There are relatively few changes between the NDB Cluster storage engine implementations in MySQL 4.1 and in 5.0, so the upgrade path should be relatively quick and painless.
All significantly new features being developed for MySQL Cluster are going into the MySQL 5.1 and 5.2 trees. For information on changes in the Cluster implementations in MySQL versions 5.1 and later, see http://dev.mysql.com/doc/refman/5.1/en/ndbcluster.html.
MySQL Cluster in versions 5.0.3-beta and later contains a number of new features that are likely to be of interest:
Push-Down Conditions: Consider the following query:
SELECT * FROM t1 WHERE non_indexed_attribute = 1;
This query will use a full table scan and the condition will
be evaluated in the cluster's data nodes. Thus, it is not
necessary to send the records across the network for
evaluation. (That is, function transport is used, rather
than data transport.) Please note that this feature is
currently disabled by default (pending more thorough
testing), but it should work in most cases. This feature can
be enabled through the use of the SET
engine_condition_pushdown = On
statement.
Alternatively, you can run mysqld with
the this feature enabled by starting the MySQL server with
the --engine-condition-pushdown
option.
A major benefit of this change is that queries can be executed in parallel. This means that queries against non-indexed columns can run faster than previously by a factor of as much as 5 to 10 times, times the number of data nodes, because multiple CPUs can work on the query in parallel.
You can use EXPLAIN
to determine when
condition pushdown is being used. See
Section 7.2.1, “Optimizing Queries with EXPLAIN
”.
Decreased
IndexMemory
Usage: In MySQL
5.0, each record consumes approximately 25
bytes of index memory, and every unique index uses 25 bytes
per record of index memory (in addition to some data memory
because these are stored in a separate table). This is
because the primary key is not stored in the index memory
anymore.
Query Cache Enabled for MySQL Cluster: See Section 5.14, “The MySQL Query Cache”, for information on configuring and using the query cache.
New Optimizations: One optimization that merits particular attention is that a batched read interface is now used in some queries. For example, consider the following query:
SELECT * FROM t1 WHERE primary_key
IN (1,2,3,4,5,6,7,8,9,10);
This query will be executed 2 to 3 times more quickly than in previous MySQL Cluster versions due to the fact that all 10 key lookups are sent in a single batch rather than one at a time.
Limit On Number of Metadata
Objects: Beginning with MySQL 5.0.6, each Cluster
database may contain a maximum of 20320 metadata objects
— this includes database tables, system tables,
indexes and BLOB
values. (Previously,
this number was 1600.)
What is said here is a status report based on recent commits to the MySQL 5.1 source tree. It should be noted all 5.1 development is subject to change.
There are currently 4 major new features being developed for MySQL 5.1:
Integration of MySQL Cluster into MySQL replication: This will make it possible to update from any MySQL Server in the cluster and still have the MySQL Replication handled by one of the MySQL Servers in the cluster, with the state of the slave side remaining consistent with the cluster acting as the master.
Support for disk-based records: Records on disk will be supported. Indexed fields including the primary key hash index must still be stored in RAM but all other fields can be on disk.
Variable-sized records: A
column defined as VARCHAR(255)
currently
uses 260 bytes of storage independent of what is stored in
any particular record. In MySQL 5.1 Cluster tables, only the
portion of the column actually taken up by the record will
be stored. This will make possible a reduction in space
requirements for such columns by a factor of 5 in many
cases.
User-defined partitioning:
Users will be able to define partitions based on columns
that are part of the primary key. The MySQL Server will be
able to discover whether it is possible to prune away some
of the partitions from the WHERE
clause.
Partitioning based on KEY
,
HASH
, RANGE
, and
LIST
handlers will be possible, as well
as subpartitioning. This feature should also be available
for many other handlers, and not only NDB
Cluster
.
In addition, we are working to increase the 8KB size limit for
rows containing columns of types other than
BLOB
or TEXT
in Cluster
tables. This is due to the fact that rows are currently fixed in
size and the page size is 32,768 bytes (minus 128 bytes for the
row header). Currently, this means that if we allowed more than
8KB per record, any remaining space (up to approximately 14,000
bytes) would be left empty. In MySQL 5.1, we plan to fix this
limitation so that using more than 8KB in a given row does not
result in the remainder of the page being wasted.