15.11. MySQL Cluster Development Roadmap

In this section, we discuss changes in the implementation of MySQL Cluster in MySQL 5.0 as compared to MySQL 4.1. We will also discuss our roadmap for further improvements to MySQL Cluster as currently planned for MySQL 5.1.

There are relatively few changes between the NDB Cluster storage engine implementations in MySQL 4.1 and in 5.0, so the upgrade path should be relatively quick and painless.

All significantly new features being developed for MySQL Cluster are going into the MySQL 5.1 and 5.2 trees. For information on changes in the Cluster implementations in MySQL versions 5.1 and later, see http://dev.mysql.com/doc/refman/5.1/en/ndbcluster.html.

15.11.1. MySQL Cluster Changes in MySQL 5.0

MySQL Cluster in versions 5.0.3-beta and later contains a number of new features that are likely to be of interest:

Push-Down Conditions: Consider the following query:
```
SELECT * FROM t1 WHERE non_indexed_attribute = 1;
```
This query will use a full table scan and the condition will be evaluated in the cluster's data nodes. Thus, it is not necessary to send the records across the network for evaluation. (That is, function transport is used, rather than data transport.) Please note that this feature is currently disabled by default (pending more thorough testing), but it should work in most cases. This feature can be enabled through the use of the SET engine_condition_pushdown = On statement. Alternatively, you can run mysqld with the this feature enabled by starting the MySQL server with the --engine-condition-pushdown option.
A major benefit of this change is that queries can be executed in parallel. This means that queries against non-indexed columns can run faster than previously by a factor of as much as 5 to 10 times, times the number of data nodes, because multiple CPUs can work on the query in parallel.
You can use EXPLAIN to determine when condition pushdown is being used. See Section 7.2.1, “Optimizing Queries with EXPLAIN”.
Decreased IndexMemory Usage: In MySQL 5.0, each record consumes approximately 25 bytes of index memory, and every unique index uses 25 bytes per record of index memory (in addition to some data memory because these are stored in a separate table). This is because the primary key is not stored in the index memory anymore.
Query Cache Enabled for MySQL Cluster: See Section 5.14, “The MySQL Query Cache”, for information on configuring and using the query cache.
New Optimizations: One optimization that merits particular attention is that a batched read interface is now used in some queries. For example, consider the following query:
```
SELECT * FROM t1 WHERE primary_key IN (1,2,3,4,5,6,7,8,9,10);
```
This query will be executed 2 to 3 times more quickly than in previous MySQL Cluster versions due to the fact that all 10 key lookups are sent in a single batch rather than one at a time.
Limit On Number of Metadata Objects: Beginning with MySQL 5.0.6, each Cluster database may contain a maximum of 20320 metadata objects — this includes database tables, system tables, indexes and BLOB values. (Previously, this number was 1600.)

15.11.2. MySQL 5.1 Development Roadmap for MySQL Cluster

What is said here is a status report based on recent commits to the MySQL 5.1 source tree. It should be noted all 5.1 development is subject to change.

There are currently 4 major new features being developed for MySQL 5.1:

Integration of MySQL Cluster into MySQL replication: This will make it possible to update from any MySQL Server in the cluster and still have the MySQL Replication handled by one of the MySQL Servers in the cluster, with the state of the slave side remaining consistent with the cluster acting as the master.
Support for disk-based records: Records on disk will be supported. Indexed fields including the primary key hash index must still be stored in RAM but all other fields can be on disk.
Variable-sized records: A column defined as VARCHAR(255) currently uses 260 bytes of storage independent of what is stored in any particular record. In MySQL 5.1 Cluster tables, only the portion of the column actually taken up by the record will be stored. This will make possible a reduction in space requirements for such columns by a factor of 5 in many cases.
User-defined partitioning: Users will be able to define partitions based on columns that are part of the primary key. The MySQL Server will be able to discover whether it is possible to prune away some of the partitions from the WHERE clause. Partitioning based on KEY, HASH, RANGE, and LIST handlers will be possible, as well as subpartitioning. This feature should also be available for many other handlers, and not only NDB Cluster.

In addition, we are working to increase the 8KB size limit for rows containing columns of types other than BLOB or TEXT in Cluster tables. This is due to the fact that rows are currently fixed in size and the page size is 32,768 bytes (minus 128 bytes for the row header). Currently, this means that if we allowed more than 8KB per record, any remaining space (up to approximately 14,000 bytes) would be left empty. In MySQL 5.1, we plan to fix this limitation so that using more than 8KB in a given row does not result in the remainder of the page being wasted.