15.6. Process Management in MySQL Cluster

Understanding how to manage MySQL Cluster requires a knowledge of four essential processes. In the next few sections of this chapter, we cover the roles played by these processes in a cluster, how to use them, and what startup options are available for each of them.

15.6.1. MySQL Server Process Usage for MySQL Cluster

mysqld is the traditional MySQL server process. To be used with MySQL Cluster, mysqld needs to be built with support for the NDB Cluster storage engine, as it is in the precompiled -max binaries available from http://dev.mysql.com/downloads/. If you build MySQL from source, you must invoke configure with the --with-ndbcluster option to enable NDB Cluster storage engine support.

If the mysqld binary has been built with Cluster support, the NDB Cluster storage engine is still disabled by default. You can use either of two possible options to enable this engine:

  • Use --ndbcluster as a startup option on the command line when starting mysqld.

  • Insert a line containing ndbcluster in the [mysqld] section of your my.cnf file.

An easy way to verify that your server is running with the NDB Cluster storage engine enabled is to issue the SHOW ENGINES statement in the MySQL Monitor (mysql). You should see the value YES as the Support value in the row for NDBCLUSTER. If you see NO in this row or if there is no such row displayed in the output, you are not running an NDB-enabled version of MySQL. If you see DISABLED in this row, you need to enable it in either one of the two ways just described.
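The three possible outcomes described above can be summarized in a small helper. This is only a sketch: the function name ndbcluster_status and the (engine, support) row format are assumptions made for illustration, standing in for the Engine and Support columns of the SHOW ENGINES output.

```python
def ndbcluster_status(rows):
    """Classify NDB support from (engine, support) pairs.

    rows is a hypothetical list of tuples holding the Engine and
    Support columns as displayed by SHOW ENGINES.
    """
    for engine, support in rows:
        if engine.upper() == "NDBCLUSTER":
            support = support.upper()
            if support == "YES":
                return "enabled"
            if support == "DISABLED":
                # Built in, but --ndbcluster (or ndbcluster in my.cnf) is missing.
                return "disabled"
            return "unsupported"
    # No NDBCLUSTER row at all: the binary was built without the engine.
    return "unsupported"
```

A result of "disabled" means that the server must be restarted with --ndbcluster or with ndbcluster in the [mysqld] section of my.cnf; "unsupported" means that a different mysqld binary is required.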

To read cluster configuration data, the MySQL server requires at a minimum three pieces of information:

  • The MySQL server's own cluster node ID

  • The hostname or IP address for the management server (MGM node)

  • The number of the TCP/IP port on which it can connect to the management server

Node IDs can be allocated dynamically, so it is not strictly necessary to specify them explicitly.

The mysqld parameter ndb-connectstring is used to specify the connectstring either on the command line when starting mysqld or in my.cnf. The connectstring contains the hostname or IP address where the management server can be found, as well as the TCP/IP port it uses.

In the following example, ndb_mgmd.mysql.com is the host where the management server resides, and the management server listens for cluster messages on port 1186:

shell> mysqld --ndb-connectstring=ndb_mgmd.mysql.com:1186

See Section 15.4.4.2, “The MySQL Cluster connectstring”, for more information on connectstrings.

Given this information, the MySQL server will be a full participant in the cluster. (We sometimes refer to a mysqld process running in this manner as an SQL node.) It will be fully aware of all cluster data nodes as well as their status, and will establish connections to all data nodes. In this case, it is able to use any data node as a transaction coordinator and to read and update node data.

15.6.2. ndbd, the Storage Engine Node Process

ndbd is the process that is used to handle all the data in tables using the NDB Cluster storage engine. This is the process that empowers a data node to accomplish distributed transaction handling, node recovery, checkpointing to disk, online backup, and related tasks.

In a MySQL Cluster, a set of ndbd processes cooperate in handling data. These processes can execute on the same computer (host) or on different computers. The correspondence between data nodes and Cluster hosts is completely configurable.

ndbd generates a set of log files which are placed in the directory specified by DataDir in the config.ini configuration file. These log files are listed below. Note that node_id represents the node's unique identifier. For example, ndb_2_error.log is the error log generated by the data node whose node ID is 2.

  • ndb_node_id_error.log is a file containing records of all crashes which the referenced ndbd process has encountered. Each record in this file contains a brief error string and a reference to a trace file for this crash. A typical entry in this file might appear as shown here:

    Date/Time: Saturday 30 July 2004 - 00:20:01
    Type of error: error
    Message: Internal program error (failed ndbrequire)
    Fault ID: 2341
    Problem data: DbtupFixAlloc.cpp
    Object of reference: DBTUP (Line: 173)
    ProgramName: NDB Kernel
    ProcessID: 14909
    TraceFile: ndb_2_trace.log.2
    ***EOM***
    

    Note: It is very important to be aware that the last entry in the error log file is not necessarily the newest one (nor is it likely to be). Entries in the error log are not listed in chronological order; rather, they correspond to the order of the trace files as determined in the ndb_node_id_trace.log.next file (see below). Error log entries are thus overwritten in a cyclical and not sequential fashion.

  • ndb_node_id_trace.log.trace_id is a trace file describing exactly what happened just before the error occurred. This information is useful for analysis by the MySQL Cluster development team.

    It is possible to configure the number of these trace files that will be created before old files are overwritten. trace_id is a number which is incremented for each successive trace file.

  • ndb_node_id_trace.log.next is the file that keeps track of the next trace file number to be assigned.

  • ndb_node_id_out.log is a file containing any data output by the ndbd process. This file is created only if ndbd is started as a daemon.

  • ndb_node_id.pid is a file containing the process ID of the ndbd process when started as a daemon. It also functions as a lock file to avoid the starting of nodes with the same identifier.

  • ndb_node_id_signal.log is a file used only in debug versions of ndbd, where it is possible to trace all incoming, outgoing, and internal messages with their data in the ndbd process.
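Because entries in the error log are not stored in chronological order, the newest crash record has to be found by comparing the Date/Time fields rather than by reading the last entry. The following is a minimal sketch of that idea. It assumes that every entry follows the layout of the sample entry shown above (key: value lines terminated by ***EOM***, with English month names); the function names are our own and are not part of MySQL Cluster.

```python
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}


def parse_entries(text):
    """Split error log text into one dict per entry ending with ***EOM***."""
    entries, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line == "***EOM***":
            if current:
                entries.append(current)
            current = {}
        elif ":" in line:
            # Split at the first colon only; values may contain colons too.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return entries


def entry_time(entry):
    """Convert 'Saturday 30 July 2004 - 00:20:01' to a datetime."""
    date_part, _, time_part = entry["Date/Time"].partition(" - ")
    _weekday, day, month, year = date_part.split()  # weekday name is ignored
    hour, minute, second = (int(part) for part in time_part.split(":"))
    return datetime(int(year), MONTHS[month], int(day), hour, minute, second)


def newest_entry(text):
    """Return the entry with the latest Date/Time, wherever it is in the file."""
    return max(parse_entries(text), key=entry_time)
```

The TraceFile field of the returned entry then names the trace file that belongs to the most recent crash.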

It is recommended not to use a directory mounted through NFS as the DataDir because in some environments this can cause problems whereby the lock on the .pid file remains in effect even after the process has terminated.

To start ndbd, it may also be necessary to specify the hostname of the management server and the port on which it is listening. Optionally, one may also specify the node ID that the process is to use.

shell> ndbd --connect-string="nodeid=2;host=ndb_mgmd.mysql.com:1186"

See Section 15.4.4.2, “The MySQL Cluster connectstring”, for additional information about this issue. Section 15.6.5, “Command Options for MySQL Cluster Processes”, describes other options for ndbd.
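The two connectstring forms used in the examples in this chapter (a bare host:port pair, and the form with an explicit nodeid) can be assembled as sketched here. The helper name make_connectstring is hypothetical, and the default port of 1186 is an assumption taken from the examples in this chapter; the full connectstring syntax is described in Section 15.4.4.2.

```python
def make_connectstring(host, port=1186, node_id=None):
    """Build a connectstring in one of the two forms shown in this chapter.

    Without node_id:  'host:port'
    With node_id:     'nodeid=N;host=host:port'
    """
    target = f"{host}:{port}"
    if node_id is None:
        # The node ID is left for the management server to allocate.
        return target
    return f"nodeid={node_id};host={target}"
```

For instance, make_connectstring("ndb_mgmd.mysql.com", node_id=2) yields the value passed to --connect-string in the ndbd example above.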

When ndbd starts, it actually initiates two processes. The first of these is called the “angel process”; its only job is to discover when the execution process has been completed, and then to restart the ndbd process if it is configured to do so. Thus, if you attempt to kill ndbd via the Unix kill command, it is necessary to kill both processes, beginning with the angel process. The preferred method of terminating an ndbd process is to use the management client and stop the process from there.

The execution process uses one thread for reading, writing, and scanning data, as well as all other activities. This thread is implemented asynchronously so that it can easily handle thousands of concurrent activities. In addition, a watch-dog thread supervises the execution thread to make sure that it does not hang in an endless loop. A pool of threads handles file I/O, with each thread able to handle one open file. Threads can also be used for transporter connections by the transporters in the ndbd process. In a system performing a large number of operations, including updates, the ndbd process can consume up to 2 CPUs if permitted to do so. For a machine with many CPUs it is recommended to use several ndbd processes which belong to different node groups.

15.6.3. ndb_mgmd, the Management Server Process

The management server is the process that reads the cluster configuration file and distributes this information to all nodes in the cluster that request it. It also maintains a log of cluster activities. Management clients can connect to the management server and check the cluster's status.

It is not strictly necessary to specify a connectstring when starting the management server. However, if you are using more than one management server, a connectstring should be provided and each node in the cluster should specify its node ID explicitly.

See Section 15.4.4.2, “The MySQL Cluster connectstring”, for information about using connectstrings. Section 15.6.5, “Command Options for MySQL Cluster Processes”, describes other options for ndb_mgmd.

The following files are created or used by ndb_mgmd in its starting directory, and are placed in the DataDir as specified in the config.ini configuration file. In the list that follows, node_id is the unique node identifier.

  • config.ini is the configuration file for the cluster as a whole. This file is created by the user and read by the management server. Section 15.4, “MySQL Cluster Configuration”, discusses how to set up this file.

  • ndb_node_id_cluster.log is the cluster events log file. Examples of such events include checkpoint startup and completion, node startup events, node failures, and levels of memory usage. A complete listing of cluster events with descriptions may be found in Section 15.7, “Management of MySQL Cluster”.

    When the size of the cluster log reaches one million bytes, the file is renamed to ndb_node_id_cluster.log.seq_id, where seq_id is the sequence number of the cluster log file. (For example: If files with the sequence numbers 1, 2, and 3 already exist, the next log file is named using the number 4.)

  • ndb_node_id_out.log is the file used for stdout and stderr when running the management server as a daemon.

  • ndb_node_id.pid is the process ID file used when running the management server as a daemon.
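The renaming scheme described above for the cluster log can be illustrated as follows. This is a sketch of the naming rule only, not of how ndb_mgmd implements it; the function names are hypothetical, and we assume that sequence numbers start at 1 when no rotated file exists yet.

```python
import re

ROTATE_AT_BYTES = 1_000_000  # size at which the cluster log is renamed


def needs_rotation(size_in_bytes):
    """True once the cluster log has reached one million bytes."""
    return size_in_bytes >= ROTATE_AT_BYTES


def next_rotated_name(node_id, existing_names):
    """Return the name the full cluster log would be renamed to.

    existing_names is a list of file names found in the DataDir.
    """
    base = f"ndb_{node_id}_cluster.log"
    rotated = re.compile(re.escape(base) + r"\.(\d+)")
    seq_ids = []
    for name in existing_names:
        match = rotated.fullmatch(name)
        if match:
            seq_ids.append(int(match.group(1)))
    # Assumption: numbering starts at 1 if nothing has been rotated yet.
    return f"{base}.{max(seq_ids, default=0) + 1}"
```

With files numbered 1, 2, and 3 already present for node 1, the function returns ndb_1_cluster.log.4, which matches the example given in the text.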

15.6.4. ndb_mgm, the Management Client Process

The management client process is actually not needed to run the cluster. Its value lies in providing a set of commands for checking the cluster's status, starting backups, and performing other administrative functions. The management client accesses the management server using a C API. Advanced users can also employ this API for programming dedicated management processes to perform tasks similar to those performed by ndb_mgm.

To start the management client, supply the hostname and port number of the management server (both are optional):

shell> ndb_mgm [host_name [port_num]]

For example:

shell> ndb_mgm ndb_mgmd.mysql.com 1186

The default hostname and port number are localhost and 1186, respectively.

Additional information about using ndb_mgm can be found in Section 15.6.5.4, “Command Options for ndb_mgm”, and Section 15.7.2, “Commands in the Management Client”.

15.6.5. Command Options for MySQL Cluster Processes

All MySQL Cluster executables (except for mysqld) take the options described in this section. Users of earlier MySQL Cluster versions should note that some of these options have been changed from those in MySQL 4.1 Cluster to make them consistent with one another as well as with mysqld. You can use the --help option to view a list of supported options.

The options common to these programs are described first. The sections that follow describe options specific to individual NDB programs.

  • --help, --usage, -?

    Prints a short list with descriptions of the available command options.

  • --connect-string=connect_string, -c connect_string

    connect_string sets the connectstring to the management server as a command option.

    shell> ndbd --connect-string="nodeid=2;host=ndb_mgmd.mysql.com:1186"
    
  • --debug[=options]

    This option can only be used for versions compiled with debugging enabled. It is used to enable output from debug calls in the same manner as for the mysqld process.

  • --execute=command, -e command

    Can be used to send a command to a Cluster executable from the system shell. For example, either of the following:

    shell> ndb_mgm -e show
    

    or

    shell> ndb_mgm --execute="SHOW"
    

    is equivalent to

    NDB> SHOW;
    

    This is analogous to how the --execute or -e option works with the mysql command-line client. See Section 4.3.1, “Using Options on the Command Line”.

  • --version, -V

    Prints the version number of the ndbd process. The version number is the MySQL Cluster version number. The version number is relevant because not all versions can be used together, and at startup the MySQL Cluster processes verify that the versions of the binaries being used can co-exist in the same cluster. This is also important when performing an online (rolling) software upgrade of MySQL Cluster. (See Section 15.5.1, “Performing a Rolling Upgrade or Downgrade”.)
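The --execute option makes it possible to drive the management client from a script. The following sketch builds the argument list from the --connect-string and --execute options described above and runs the client with the standard subprocess module. The helper names are hypothetical, and the code assumes that ndb_mgm is found on the PATH and that a management server is reachable.

```python
import subprocess


def ndb_mgm_argv(command, connectstring=None):
    """Build the argument list for a one-shot ndb_mgm invocation."""
    argv = ["ndb_mgm"]
    if connectstring is not None:
        argv.append(f"--connect-string={connectstring}")
    argv.append(f"--execute={command}")
    return argv


def run_ndb_mgm(command, connectstring=None):
    """Run the client and return its standard output as text.

    Raises subprocess.CalledProcessError if ndb_mgm exits with an error.
    """
    result = subprocess.run(ndb_mgm_argv(command, connectstring),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with commands that contain spaces.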

15.6.5.1. MySQL Cluster-Related Command Options for mysqld

  • --ndb-connectstring=connect_string

    When using the NDB Cluster storage engine, this option specifies the management server that distributes cluster configuration data.

  • --ndbcluster

    The NDB Cluster storage engine is necessary for using MySQL Cluster. If a mysqld binary includes support for the NDB Cluster storage engine, the engine is disabled by default. Use the --ndbcluster option to enable it. Use --skip-ndbcluster to explicitly disable the engine.

15.6.5.2. Command Options for ndbd

For options common to NDB programs, see Section 15.6.5, “Command Options for MySQL Cluster Processes”.

  • --daemon, -d

    Instructs ndbd to execute as a daemon process. This is the default behavior. --nodaemon can be used to not start the process as a daemon.

  • --initial

    Instructs ndbd to perform an initial start. An initial start erases any files created for recovery purposes by earlier instances of ndbd. It also re-creates recovery log files. Note that on some operating systems this process can take a substantial amount of time.

    An --initial start is to be used only the very first time that the ndbd process is started because it removes all files from the Cluster filesystem and re-creates all REDO log files. The exceptions to this rule are:

    • When performing a software upgrade which has changed the contents of any files.

    • When restarting the node with a new version of ndbd.

    • As a measure of last resort when for some reason the node restart or system restart repeatedly fails. In this case, be aware that this node can no longer be used to restore data due to the destruction of the datafiles.

    This option does not affect any backup files that have already been created by the affected node.

  • --initial-start

    This option is used when performing a partial initial start of the cluster. Each node should be started with this option, as well as --nowait-nodes.

    For example, suppose you have a 4-node cluster whose data nodes have the IDs 2, 3, 4, and 5, and you wish to perform a partial initial start using only nodes 2, 4, and 5 — that is, omitting node 3:

    shell> ndbd --ndb-nodeid=2 --nowait-nodes=3 --initial-start
    shell> ndbd --ndb-nodeid=4 --nowait-nodes=3 --initial-start
    shell> ndbd --ndb-nodeid=5 --nowait-nodes=3 --initial-start
    

    This option was added in MySQL 5.0.21.

  • --nowait-nodes=node_id_1[, node_id_2[, ...]]

    This option takes a list of data nodes for which the cluster will not wait before starting.

    This can be used to start the cluster in a partitioned state. For example, to start a 4-node cluster (data nodes 2, 3, 4, and 5) with only half of the data nodes running, you can start each ndbd process with --nowait-nodes=3,5. In this case, the cluster starts as soon as nodes 2 and 4 connect, and does not wait StartPartitionedTimeout milliseconds for nodes 3 and 5 to connect as it would otherwise.

    If you wanted to start up the same cluster as in the previous example without one ndbd (say, for example, that the host machine for node 3 has suffered a hardware failure), then start nodes 2, 4, and 5 with --nowait-nodes=3. The cluster then starts as soon as nodes 2, 4, and 5 connect and does not wait for node 3 to start.

    This option was added in MySQL 5.0.21.

  • --nodaemon

    Instructs ndbd not to start as a daemon process. This is useful when ndbd is being debugged and you want output to be redirected to the screen.

  • --nostart

    Instructs ndbd not to start automatically. When this option is used, ndbd connects to the management server, obtains configuration data from it, and initializes communication objects. However, it does not actually start the execution engine until specifically requested to do so by the management server. This can be accomplished by issuing the proper command to the management client.
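The value to pass to --nowait-nodes is simply the set of data nodes that are not being started. A minimal sketch of that calculation is shown here; the function name is hypothetical, and the node IDs are those used in the examples above.

```python
def nowait_nodes_option(all_data_nodes, nodes_to_start):
    """Return the --nowait-nodes option for a partial start, or None.

    all_data_nodes: IDs of every data node in the cluster configuration.
    nodes_to_start: IDs of the data nodes that will actually be started.
    """
    skipped = sorted(set(all_data_nodes) - set(nodes_to_start))
    if not skipped:
        # Every data node is being started, so the option is not needed.
        return None
    return "--nowait-nodes=" + ",".join(str(node) for node in skipped)
```

The same option string is then given to each ndbd process that takes part in the start.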

15.6.5.3. Command Options for ndb_mgmd

For options common to NDB programs, see Section 15.6.5, “Command Options for MySQL Cluster Processes”.

  • --config-file=filename, -f filename

    Instructs the management server as to which file it should use for its configuration file. If this option is not specified, the filename defaults to config.ini.

    Note: This option also can be given as -c file_name, but this shortcut is obsolete and should not be used in new installations.

  • --daemon, -d

    Instructs ndb_mgmd to start as a daemon process. This is the default behavior.

  • --nodaemon

    Instructs ndb_mgmd not to start as a daemon process.

15.6.5.4. Command Options for ndb_mgm

For options common to NDB programs, see Section 15.6.5, “Command Options for MySQL Cluster Processes”.

  • --try-reconnect=number

    If the connection to the management server is broken, the node tries to reconnect to it every 5 seconds until it succeeds. By using this option, it is possible to limit the number of attempts to number before giving up and reporting an error instead.
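The reconnection behavior described for --try-reconnect can be pictured as a simple retry loop. The sketch below illustrates the idea only and does not reflect how ndb_mgm is implemented; the connect callable, the function name, and the use of ConnectionError are all assumptions made for the example.

```python
import time


def reconnect(connect, max_attempts=None, interval=5.0, sleep=time.sleep):
    """Call connect() until it succeeds, waiting interval seconds between tries.

    connect is a hypothetical callable that returns True on success.
    max_attempts=None means "retry until success"; otherwise an error is
    raised after that many failed attempts (at least one attempt is made).
    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        if connect():
            return attempts
        if max_attempts is not None and attempts >= max_attempts:
            raise ConnectionError(f"gave up after {attempts} attempts")
        sleep(interval)
```

Leaving max_attempts unset corresponds to the default behavior of retrying every 5 seconds until the connection succeeds; setting it corresponds to starting the client with --try-reconnect=number.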