15.7. Management of MySQL Cluster

Managing a MySQL Cluster involves a number of tasks, the first of which is to configure and start MySQL Cluster. This is covered in Section 15.4, “MySQL Cluster Configuration”, and Section 15.6, “Process Management in MySQL Cluster”.

The following sections cover the management of a running MySQL Cluster.

There are essentially two methods of actively managing a running MySQL Cluster. The first of these is through the use of commands entered into the management client whereby cluster status can be checked, log levels changed, backups started and stopped, and nodes stopped and started. The second method involves studying the contents of the cluster log ndb_node_id_cluster.log in the management server's DataDir directory. (Recall that node_id represents the unique identifier of the node whose activity is being logged.) The cluster log contains event reports generated by ndbd. It is also possible to send cluster log entries to a Unix system log.

15.7.1. MySQL Cluster Startup Phases

This section describes the steps involved when the cluster is started.

There are several different startup types and modes, as shown here:

Initial Start: The cluster starts with a clean filesystem on all data nodes. This occurs either when the cluster started for the very first time, or when it is restarted using the --initial option.
System Restart: The cluster starts and reads data stored in the data nodes. This occurs when the cluster has been shut down after having been in use, when it is desired for the cluster to resume operations from the point where it left off.
Node Restart: This is the online restart of a cluster node while the cluster itself is running.
Initial Node Restart: This is the same as a node restart, except that the node is reinitialized and started with a clean filesystem.

Prior to startup, each data node (ndbd process) must be initialized. Initialization consists of the following steps:

Obtain a Node ID.
Fetch configuration data.
Allocate ports to be used for inter-node communications.
Allocate memory according to settings obtained from the configuration file.

When a data node or SQL node first connects to the management node, it reserves a cluster node ID. To make sure that no other node allocates the same node ID, this ID is retained until the node has managed to connect to the cluster and at least one ndbd reports that this node is connected. This retention of the node ID is guarded by the connection between the node in question and ndb_mgmd.

Normally, in the event of a problem with the node, the node disconnects from the management server, the socket used for the connection is closed, and the reserved node ID is freed. However, if a node is disconnected abruptly — for example, due to a hardware failure in one of the cluster hosts, or because of network issues — the normal closing of the socket by the operating system may not take place. In this case, the node ID continues to be reserved and not released until a TCP timeout occurs 10 or so minutes later.

To take care of this problem, you can use PURGE STALE SESSIONS. Running this statement forces all reserved node IDs to be checked; any that are not being used by nodes actually connected to the cluster are then freed.

Beginning with MySQL 5.0.22, timeout handling of node ID assignments is implemented. This performs the ID usage checks automatically after approximately 20 seconds, so that PURGE STALE SESSIONS should no longer be necessary in a normal Cluster start.

After each data node has been initialized, the cluster startup process can proceed. The stages which the cluster goes through during this process are listed here:

Stage 0
Clear the cluster filesystem. This stage occurs only if the cluster was started with the --initial option.
Stage 1
This stage sets up Cluster connections, establishes inter-node communications are established, and starts Cluster heartbeats.
Stage 2
The arbitrator node is elected. If this is a system restart, the cluster determines the latest restorable global checkpoint.
Stage 3
This stage initializes a number of internal cluster variables.
Stage 4
For an initial start or initial node restart, the redo log files are created. The number of these files is equal to NoOfFragmentLogFiles.
For a system restart:
- Read schema or schemas.
- Read data from the local checkpoint and undo logs.
- Apply all redo information until the latest restorable global checkpoint has been reached.
For a node restart, find the tail of the redo log.
Stage 5
If this is an initial start, create the SYSTAB_0 and NDB$EVENTS internal system tables.
For a node restart or an initial node restart:
1. The node is included in transaction handling operations.
2. The node schema is compared with that of the master and synchronized with it.
3. Synchronize data received in the form of INSERT from the other data nodes in this node's node group.
4. In all cases, wait for complete local checkpoint as determined by the arbitrator.
Stage 6
Update internal variables.
Stage 7
Update internal variables.
Stage 8
In a system restart, rebuild all indexes.
Stage 9
Update internal variables.
Stage 10
At this point in a node restart or initial node restart, APIs may connect to the node and being to receive events.
Stage 11
At this point in a node restart or initial node restart, event delivery is handed over to the node joining the cluster. The newly-joined node takes over responsibility for delivering its primary data to subscribers.

After this process is completed for an initial start or system restart, transaction handling is enabled. For a node restart or initial node restart, completion of the startup process means that the node may now act as a transaction coordinator.

15.7.2. Commands in the Management Client

In addition to the central configuration file, a cluster may also be controlled through a command-line interface available through the management client ndb_mgm. This is the primary administrative interface to a running cluster.

Commands for the event logs are given in Section 15.7.3, “Event Reports Generated in MySQL Cluster”; commands for creating backups and restoring from backup are provided in Section 15.8, “On-line Backup of MySQL Cluster”.

The management client has the following basic commands. In the listing that follows, node_id denotes either a database node ID or the keyword ALL, which indicates that the command should be applied to all of the cluster's data nodes.

HELP
Displays information on all available commands.
SHOW
Displays information on the cluster's status.
Note: In a cluster where multiple management nodes are in use, this command displays information only for data nodes that are actually connected to the current management server.
node_id START
Starts the data node identified by node_id (or all data nodes).
Beginning with MySQL 5.0.19, this command can also be used to start individual management nodes. Note: ALL START continues to affect data nodes only.
node_id STOP
Stops the data node identified by node_id (or all data nodes).
Beginning with MySQL 5.0.19, this command can also be used to stop individual management nodes. Note: ALL STOP continues to affect data nodes only.
node_id RESTART [-N] [-I]
Restarts the data node identified by node_id (or all data nodes).
node_id STATUS
Displays status information for the data node identified by node_id (or for all data nodes).
ENTER SINGLE USER MODE node_id
Enters single-user mode, whereby only the MySQL server identified by the node ID node_id is allowed to access the database.
EXIT SINGLE USER MODE
Exits single-user mode, allowing all SQL nodes (that is, all running mysqld processes) to access the database.
QUIT
Terminates the management client.
SHUTDOWN
Shuts down all cluster nodes, except for SQL nodes, and exits.

15.7.3. Event Reports Generated in MySQL Cluster

In this section, we discuss the types of event logs provided by MySQL Cluster, and the types of events that are logged.

MySQL Cluster provides two types of event log. These are the cluster log, which includes events generated by all cluster nodes, and the node logs, which are local to each data node.

Output generated by cluster event logging can have multiple destinations including a file, the management server console window, or syslog. Output generated by node event logging is written to the data node's console window.

Both types of event logs can be set to log different subsets of events.

Note: The cluster log is the log recommended for most uses because it provides logging information for an entire cluster in a single file. Node logs are intended to be used only during application development, or for debugging application code.

Each reportable event can be distinguished according to three different criteria:

Category: This can be any one of the following values: STARTUP, SHUTDOWN, STATISTICS, CHECKPOINT, NODERESTART, CONNECTION, ERROR, or INFO.
Priority: This is represented by one of the numbers from 1 to 15 inclusive, where 1 indicates “most important” and 15 “least important.”
Severity Level: This can be any one of the following values: ALERT, CRITICAL, ERROR, WARNING, INFO, or DEBUG.

Both the cluster log and the node log can be filtered on these properties.

15.7.3.1. Logging Management Commands

The following management commands are related to the cluster log:

CLUSTERLOG ON
Turns the cluster log on.
CLUSTERLOG OFF
Turns the cluster log off.
CLUSTERLOG INFO
Provides information about cluster log settings.
node_id CLUSTERLOG category=threshold
Logs category events with priority less than or equal to threshold in the cluster log.
CLUSTERLOG FILTER severity_level
Toggles cluster logging of events of the specified severity_level.

The following table describes the default setting (for all data nodes) of the cluster log category threshold. If an event has a priority with a value lower than or equal to the priority threshold, it is reported in the cluster log.

Note that events are reported per data node, and that the threshold can be set to different values on different nodes.

Category	Default threshold (All data nodes)
`STARTUP`	7
`SHUTDOWN`	7
`STATISTICS`	7
`CHECKPOINT`	7
`NODERESTART`	7
`CONNECTION`	7
`ERROR`	15
`INFO`	7

Thresholds are used to filter events within each category. For example, a STARTUP event with a priority of 3 is not logged unless the threshold for STARTUP is changed to 3 or lower. Only events with priority 3 or lower are sent if the threshold is 3.

The following table shows the event severity levels. (Note: These correspond to Unix syslog levels, except for LOG_EMERG and LOG_NOTICE, which are not used or mapped.)

1	`ALERT`	A condition that should be corrected immediately, such as a corrupted system database
2	`CRITICAL`	Critical conditions, such as device errors or insufficient resources
3	`ERROR`	Conditions that should be corrected, such as configuration errors
4	`WARNING`	Conditions that are not errors, but that might require special handling
5	`INFO`	Informational messages
6	`DEBUG`	Debugging messages used for `NDB Cluster` development

Event severity levels can be turned on or off (using CLUSTERLOG FILTER — see above). If a severity level is turned on, then all events with a priority less than or equal to the category thresholds are logged. If the severity level is turned off then no events belonging to that severity level are logged.

15.7.3.2. Log Events

An event report reported in the event logs has the following format:

datetime [string] severity -- message

For example:

09:19:30 2005-07-24 [NDB] INFO -- Node 4 Start phase 4 completed

This section discusses all reportable events, ordered by category and severity level within each category.

In the event descriptions, GCP and LCP mean “Global Checkpoint” and “Local Checkpoint,” respectively.

CONNECTION Events

These events are associated with connections between Cluster nodes.

Event	Priority	Severity Level	Description
data nodes connected	8	`INFO`	Data nodes connected
data nodes disconnected	8	`INFO`	Data nodes disconnected
Communication closed	8	`INFO`	SQL node or data node connection closed
Communication opened	8	`INFO`	SQL node or data node connection opened

CHECKPOINT Events

The logging messages shown here are associated with checkpoints.

Event	Priority	Severity Level	Description
LCP stopped in calc keep GCI	0	`ALERT`	LCP stopped
Local checkpoint fragment completed	11	`INFO`	LCP on a fragment has been completed
Global checkpoint completed	10	`INFO`	GCP finished
Global checkpoint started	9	`INFO`	Start of GCP: REDO log is written to disk
Local checkpoint completed	8	`INFO`	LCP completed normally
Local checkpoint started	7	`INFO`	Start of LCP: data written to disk
Report undo log blocked	7	`INFO`	UNDO logging blocked; buffer near overflow

STARTUP Events

The following events are generated in response to the startup of a node or of the cluster and of its success or failure. They also provide information relating to the progress of the startup process, including information concerning logging activities.

Event	Priority	Severity Level	Description
Internal start signal received STTORRY	15	`INFO`	Blocks received after completion of restart
Undo records executed	15	`INFO`
New REDO log started	10	`INFO`	GCI keep `X`, newest restorable GCI `Y`
New log started	10	`INFO`	Log part `X`, start MB `Y`, stop MB `Z`
Node has been refused for inclusion in the cluster	8	`INFO`	Node cannot be included in cluster due to misconfiguration, inability to establish communication, or other problem
data node neighbors	8	`INFO`	Shows neighboring data nodes
data node start phase `X` completed	4	`INFO`	A data node start phase has been completed
Node has been successfully included into the cluster	3	`INFO`	Displays the node, managing node, and dynamic ID
data node start phases initiated	1	`INFO`	NDB Cluster nodes starting
data node all start phases completed	1	`INFO`	NDB Cluster nodes started
data node shutdown initiated	1	`INFO`	Shutdown of data node has commenced
data node shutdown aborted	1	`INFO`	Unable to shut down data node normally

NODERESTART Events

The following events are generated when restarting a node and relate to the success or failure of the node restart process.

Event	Priority	Severity Level	Description
Node failure phase completed	8	`ALERT`	Reports completion of node failure phases
Node has failed, node state was `X`	8	`ALERT`	Reports that a node has failed
Report arbitrator results	2	`ALERT`	There are eight different possible results for arbitration attempts: Arbitration check failed — less than 1/2 nodes left Arbitration check succeeded — node group majority Arbitration check failed — missing node group Network partitioning — arbitration required Arbitration succeeded — affirmative response from node `X` Arbitration failed - negative response from node `X` Network partitioning - no arbitrator available Network partitioning - no arbitrator configured
Completed copying a fragment	10	`INFO`
Completed copying of dictionary information	8	`INFO`
Completed copying distribution information	8	`INFO`
Starting to copy fragments	8	`INFO`
Completed copying all fragments	8	`INFO`
GCP takeover started	7	`INFO`
GCP takeover completed	7	`INFO`
LCP takeover started	7	`INFO`
LCP takeover completed (state = `X`)	7	`INFO`
Report whether an arbitrator is found or not	6	`INFO`	There are seven different possible outcomes when seeking an arbitrator: Management server restarts arbitration thread [state=`X`] Prepare arbitrator node `X` [ticket=`Y`] Receive arbitrator node `X` [ticket=`Y`] Started arbitrator node `X` [ticket=`Y`] Lost arbitrator node `X` - process failure [state=`Y`] Lost arbitrator node `X` - process exit [state=`Y`] Lost arbitrator node `X` <error msg> [state=`Y`]

STATISTICS Events

The following events are of a statistical nature. They provide information such as numbers of transactions and other operations, amount of data sent or received by individual nodes, and memory usage.

Event	Priority	Severity Level	Description
Report job scheduling statistics	9	`INFO`	Mean internal job scheduling statistics
Sent number of bytes	9	`INFO`	Mean number of bytes sent to node `X`
Received # of bytes	9	`INFO`	Mean number of bytes received from node `X`
Report transaction statistics	8	`INFO`	Numbers of: transactions, commits, reads, simple reads, writes, concurrent operations, attribute information, and aborts
Report operations	8	`INFO`	Number of operations
Report table create	7	`INFO`
Memory usage	5	`INFO`	Data and index memory usage (80%, 90%, and 100%)

ERROR Events

These events relate to Cluster errors and warnings. The presence of one or more of these generally indicates that a major malfunction or failure has occurred.

Event	Priority	Severity	Description
Dead due to missed heartbeat	8	`ALERT`	Node `X` declared “dead” due to missed heartbeat
Transporter errors	2	ERROR
Transporter warnings	8	`WARNING`
Missed heartbeats	8	`WARNING`	Node `X` missed heartbeat #`Y`
General warning events	2	`WARNING`

INFO Events

These events provide general information about the state of the cluster and activities associated with Cluster maintenance, such as logging and heartbeat transmission.

Event	Priority	Severity	Description
Sent heartbeat	12	`INFO`	Heartbeat sent to node `X`
Create log bytes	11	`INFO`	Log part, log file, MB
General information events	2	`INFO`

15.7.3.3. Using `CLUSTERLOG STATISTICS`

The NDB management client's CLUSTERLOG STATISTICS command can provide a number of useful statistics in its output. The following statistics are reported by the transaction coordinator:

Statistic	*Description (Number of...)*
`Trans. Count`	Transactions attempted with this node as coordinator
`Commit Count`	Transactions committed with this node as coordinator
`Read Count`	Primary key reads (all)
`Simple Read Count`	Primary key reads reading the latest committed value
`Write Count`	Primary key writes (includes all `INSERT`, `UPDATE`, and `DELETE` operations)
`AttrInfoCount`	Data words used to describe all reads and writes received
`Concurrent Operations`	All concurrent operations ongoing at the moment the report is taken
`Abort Count`	Transactions with this node as coordinator that were aborted
`Scans`	Scans (all)
`Range Scans`	Index scans

The ndbd process has a scheduler that runs in an infinite loop. During each loop scheduler performs the following tasks:

Read any incoming messages from sockets into a job buffer.
Check whether there are any timed messages to be executed; if so, put these into the job buffer as well.
Execute (in a loop) any messages in the job buffer.
Send any distributed messages that were generated by executing the messages in the job buffer.
Wait for any new incoming messages.

The number of loops executed in the third step is reported as the Mean Loop Counter. This statistic increases in size as the utilisation of the TCP/IP buffer improves. You can use this to monitor performance as you add new processes to the cluster.

The Mean send size and Mean receive size statistics allow you to gauge the efficiency of writes and reads (respectively) between nodes. These values are given in bytes. Higher values mean a lower cost per byte sent or received; the maximum is 64k.

To generate a report of all cluster log statistics, you can use the following command in the NDB management client:

ndb_mgm> ALL CLUSTERLOG STATISTICS=15

15.7.4. Single-User Mode

Single-user mode allows the database administrator to restrict access to the database system to a single MySQL server (SQL node). When entering single-user mode, all connections to all other MySQL servers are closed gracefully and all running transactions are aborted. No new transactions are allowed to be started.

Once the cluster has entered single-user mode, only the designated SQL node is granted access to the database.

You can use the ALL STATUS command to see when the cluster has entered single-user mode.

Example:

NDB> ENTER SINGLE USER MODE 5

After this command has executed and the cluster has entered single-user mode, the SQL node whose node ID is 5 becomes the cluster's only permitted user.

The node specified in the preceding command must be a MySQL Server node; An attempt to specify any other type of node will be rejected.

Note: When the preceding commmand is invoked, all transactions running on the designated node are aborted, the connection is closed, and the server must be restarted.

The command EXIT SINGLE USER MODE changes the state of the cluster's data nodes from single-user mode to normal mode. MySQL Servers waiting for a connection (that is, for the cluster to become ready and available), are again permitted to connect. The MySQL Server denoted as the single-user SQL node continues to run (if still connected) during and after the state change.

Example:

NDB> EXIT SINGLE USER MODE

There are two recommended ways to handle a node failure when running in single-user mode:

Method 1:
1. Finish all single-user mode transactions
2. Issue the EXIT SINGLE USER MODE command
3. Restart the cluster's data nodes
Method 2:
Restart database nodes prior to entering single-user mode.