Version: 2.2.6

v2 Replication Operations

Deprecation Warning

v2 Multi-Datacenter Replication is deprecated and will be removed in a future version. Please use v3 instead.

Riak Enterprise's Multi-Datacenter Replication system is largely controlled by the riak-repl command. The sections below detail the available subcommands.

add-listener

Adds a listener (primary) to the given node, IP address, and port.

riak-repl add-listener <nodename> <listen_ip> <port>

Below is an example usage:

riak-repl add-listener riak@10.0.1.156 10.0.1.156 9010
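
Because add-listener takes three positional arguments, a thin wrapper can catch obvious mistakes before the command runs. The following is a minimal sketch and not part of Riak: the add_listener function name and the RIAK_REPL override variable are hypothetical, and the port-range check is a generic TCP sanity check rather than a documented riak-repl rule.

```shell
# Hypothetical override so the wrapper can be pointed at a stub for dry runs.
RIAK_REPL="${RIAK_REPL:-riak-repl}"

# add_listener <nodename> <listen_ip> <port>
# Validates the argument count and port, then calls riak-repl add-listener.
add_listener() {
    if [ "$#" -ne 3 ]; then
        echo "usage: add_listener <nodename> <listen_ip> <port>" >&2
        return 2
    fi
    case "$3" in
        ''|*[!0-9]*|??????*)
            echo "port must be a number of at most five digits: $3" >&2
            return 2
            ;;
    esac
    if [ "$3" -lt 1 ] || [ "$3" -gt 65535 ]; then
        echo "port out of range: $3" >&2
        return 2
    fi
    # Intentionally unquoted so RIAK_REPL may hold a command plus arguments.
    $RIAK_REPL add-listener "$1" "$2" "$3"
}
```

Setting RIAK_REPL="echo riak-repl" turns the wrapper into a dry run that prints the command instead of executing it.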

add-nat-listener

Adds a NAT-aware listener (primary) to the given node, IP address, port, NAT IP, and NAT port. If a non-NAT listener already exists with the same internal IP and port, it is "upgraded" to a NAT listener.

riak-repl add-nat-listener <nodename> <internal_ip> <internal_port> <nat_ip> <nat_port>

Below is an example usage:

riak-repl add-nat-listener riak@10.0.1.156 10.0.1.156 9010 50.16.238.123 9010

del-listener

Removes and shuts down a listener (primary) on the given node, IP address, and port.

riak-repl del-listener <nodename> <listen_ip> <port>

Below is an example usage:

riak-repl del-listener riak@10.0.1.156 10.0.1.156 9010

add-site

Adds a site (secondary) to the local node, connecting to the specified listener.

riak-repl add-site <ipaddr> <portnum> <sitename>

Below is an example usage:

riak-repl add-site 10.0.1.156 9010 newyork
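
The listener and site commands are two halves of one connection: the IP address and port passed to add-site on the secondary should be the ones a listener was added on at the primary. As a rough illustration, the sketch below prints both commands for review without running either of them. The print_pairing function name is hypothetical.

```shell
# print_pairing <primary_node> <listen_ip> <port> <sitename>
# Prints (does not run) the two riak-repl commands that pair a primary
# listener with a secondary site. The function name is hypothetical.
print_pairing() {
    if [ "$#" -ne 4 ]; then
        echo "usage: print_pairing <primary_node> <listen_ip> <port> <sitename>" >&2
        return 2
    fi
    printf '# on the primary cluster\n'
    printf 'riak-repl add-listener %s %s %s\n' "$1" "$2" "$3"
    printf '# on the secondary cluster\n'
    printf 'riak-repl add-site %s %s %s\n' "$2" "$3" "$4"
}
```

For example, print_pairing riak@10.0.1.156 10.0.1.156 9010 newyork prints the add-listener and add-site examples shown on this page, each under a comment naming the cluster it belongs to.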

del-site

Removes a site (secondary) from the local node by name.

riak-repl del-site <sitename>

Below is an example usage:

riak-repl del-site newyork

status

Obtains status information about replication. Reports counts on how much data has been transmitted, transfer rates, message queue lengths of clients and servers, number of fullsync operations, and connection status. This command only displays useful information on the leader node.

riak-repl status
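
When scripting against the status output, it can help to pull out a single field. The sketch below assumes the output contains one "name: value" pair per line; that layout is an assumption made for illustration and is not specified on this page, so check it against real output first. The repl_stat function name is hypothetical.

```shell
# repl_stat <field>
# Reads status text on stdin and prints the value of the first line that
# begins with "<field>: ". Returns non-zero if no such line exists.
# ASSUMPTION: the status output uses a "name: value" layout per line.
repl_stat() {
    awk -v key="$1" '
        index($0, key ": ") == 1 {
            # Skip the field name plus the two-character ": " separator.
            print substr($0, length(key) + 3)
            found = 1
            exit
        }
        END { exit (found ? 0 : 1) }
    '
}
```

A typical invocation would be riak-repl status | repl_stat leader, run on the leader node, since other nodes report little useful information.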

start-fullsync

Manually initiates a fullsync operation with connected sites.

riak-repl start-fullsync

cancel-fullsync

Cancels any fullsync operations in progress. If a partition is in progress, synchronization will stop after that partition completes. During cancellation, riak-repl status will show cancelled in the status.

riak-repl cancel-fullsync

pause-fullsync

Pauses any fullsync operations in progress. If a partition is in progress, synchronization will pause after that partition completes. While paused, riak-repl status will show paused in the status information. Fullsync may be cancelled while paused.

riak-repl pause-fullsync

resume-fullsync

Resumes any fullsync operations that were paused. If a fullsync operation was running at the time of the pause, the next partition will be synchronized. If not, fullsync will wait until the next start-fullsync command or the next fullsync_interval.

riak-repl resume-fullsync
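
One way to use pause-fullsync and resume-fullsync together is to bracket a maintenance task with them. The following is a minimal sketch under stated assumptions: the with_fullsync_paused function name and the RIAK_REPL override variable are hypothetical, and the sketch does not wait for the in-progress partition to finish before running the task.

```shell
# Hypothetical override so the wrapper can be pointed at a stub for dry runs.
RIAK_REPL="${RIAK_REPL:-riak-repl}"

# with_fullsync_paused <command> [args...]
# Pauses fullsync, runs the command, then resumes fullsync whether or not
# the command succeeded. Returns the command's exit status.
with_fullsync_paused() {
    # Do not run the task if the pause itself failed.
    $RIAK_REPL pause-fullsync || return 1
    "$@"
    _rc=$?
    $RIAK_REPL resume-fullsync
    return "$_rc"
}
```

The wrapper does not resume fullsync if the shell itself is killed mid-task, so riak-repl status should still be checked for a lingering paused state afterwards.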

riak-repl Status Output

The following definitions describe the output of the riak-repl status command. Please note that many of these statistics will only appear on the current leader node, and that all counts will be reset to 0 upon restarting Riak Enterprise.

Client

| Field | Description |
| --- | --- |
| client_stats | See Client Statistics |
| client_bytes_recv | The total number of bytes the client has received since the server has been started |
| client_bytes_sent | The total number of bytes sent to all connected sites |
| client_connect_errors | The number of TCP/IP connection errors |
| client_connects | A count of the number of site connections made to this node |
| client_redirect | If a client connects to a non-leader node, it will be redirected to a leader node |
| client_rx_kbps | A snapshot of the client (site)-received kilobits/second taken once a minute. The past 8 snapshots are stored in this list. Newest snapshots appear on the left side of the list. |
| client_tx_kbps | A snapshot of the client (site)-sent kilobits/second taken once a minute. The past 8 snapshots are stored in this list. Newest snapshots appear on the left side of the list. |
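
Since the kbps fields hold up to 8 one-minute snapshots with the newest on the left, a quick summary is the leftmost value plus the mean over the window. The sketch below assumes the list is printed as a bracketed, comma-separated sequence of integers such as [12,8,0]; that textual form is an assumption for illustration, and the kbps_summary function name is hypothetical.

```shell
# kbps_summary "<list>"
# Takes a bracketed, comma-separated list of integers and prints the newest
# (leftmost) sample and the integer mean of all samples.
# ASSUMPTION: the list looks like "[12,8,0,0,0,0,0,0]".
kbps_summary() {
    # Strip brackets and spaces, then let awk split on commas.
    printf '%s\n' "$1" | sed 's/[][ ]//g' | awk -F, '
        NF == 0 { exit 1 }
        {
            sum = 0
            for (i = 1; i <= NF; i++) sum += $i
            printf "newest=%d mean=%d\n", $1, int(sum / NF)
        }
    '
}
```

The mean is truncated to an integer, which is usually precise enough for spotting a link that has gone idle or saturated.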

Server

| Field | Description |
| --- | --- |
| server_bytes_recv | The total number of bytes the server (listener) has received |
| server_bytes_sent | The total number of bytes the server (listener) has sent |
| server_connect_errors | The number of listener-to-site connection errors |
| server_connects | The number of times the listener connects to the client site |
| server_fullsyncs | The number of fullsync operations that have occurred since the server was started |
| server_rx_kbps | A snapshot of the server (listener)-received kilobits/second taken once a minute. The past 8 snapshots are stored in this list. Newest snapshots appear on the left side of the list. |
| server_tx_kbps | A snapshot of the server (listener)-sent kilobits/second taken once a minute. The past 8 snapshots are stored in this list. Newest snapshots appear on the left side of the list. |
| server_stats | See Server Statistics |

Elections and Objects

| Field | Description |
| --- | --- |
| elections_elected | If the replication leader node becomes unresponsive or unavailable, a new leader node in the cluster will be elected |
| elections_leader_changed | The number of times a Riak node has surrendered leadership |
| objects_dropped_no_clients | If the realtime replication work queue is full and there aren't any clients to receive objects, then objects will be dropped from the queue. These objects will be synchronized during a fullsync operation. |
| objects_dropped_no_leader | If a client (site) cannot connect to a leader, objects will be dropped during realtime replication |
| objects_forwarded | The number of Riak objects forwarded to the leader to participate in replication. Please note that this value will only be accurate on a non-leader node. |
| objects_sent | The number of objects sent via realtime replication |

Other

| Field | Description |
| --- | --- |
| listener_<nodeid> | Defines a replication listener that is running on node <nodeid> |
| [sitename]_ips | Defines a replication site |
| leader | Which node is the current leader of the cluster |
| local_leader_message_queue_len | The length of the object queue on the leader |
| local_leader_heap_size | The amount of memory the leader is using |

Client Statistics

| Field | Description |
| --- | --- |
| node | A unique ID for the Riak node on which the client (site) is running |
| site | The connected site name configured with riak-repl add-site |
| strategy | A replication strategy defines an implementation of the Riak Replication protocol. Valid values: keylist or syncv1. |
| fullsync_worker | The Erlang process ID of the fullsync worker |
| waiting_to_retry | The listeners currently waiting to retry replication after a failure |
| connected | A list of connected clients. Entries: connected (the IP address and port of a connected client (site)); cluster_name (the name of the connected client (site)); connecting (the PID, IP address, and port of a client currently establishing a connection) |
| state | State shows what the current replication strategy is processing. The following definitions appear in the status output if the keylist strategy is being used. They can be used by Basho support to identify replication issues: request_partition, wait_for_fullsync, send_keylist, wait_ack |

Bounded Queue

The bounded queue is responsible for holding objects that are waiting to participate in realtime replication. Please see the Riak Enterprise MDC Replication Configuration guide for more information.

| Field | Description |
| --- | --- |
| queue_pid | The Erlang process ID of the bounded queue |
| dropped_count | The number of objects that failed to be enqueued in the bounded queue due to the queue being full. These objects will be replicated during the next fullsync operation. |
| queue_length | The number of Riak objects currently in the bounded queue |
| queue_byte_size | The size of all objects currently in the queue |
| queue_max_size | The number of bytes the queue can hold before objects are dropped. These objects will be replicated during the next fullsync operation. |
| queue_percentage | The percentage of the queue that is full |
| queue_pending | The current count of "in-flight" objects we've sent that the client has not acknowledged |
| queue_max_pending | The maximum number of objects that can be "in flight" before we refuse to send any more. |
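
To watch for a queue that is about to start dropping objects, the size fields can be compared directly. The sketch below assumes queue_byte_size and queue_max_size are both expressed in bytes and that the fill level is their ratio; the source does not state how queue_percentage is computed, so treat this as an approximation. The function names and the default threshold of 80 percent are hypothetical.

```shell
# queue_fill_percent <queue_byte_size> <queue_max_size>
# Prints the integer percentage of the queue in use.
# ASSUMPTION: both values are byte counts and fill = byte_size / max_size.
queue_fill_percent() {
    for n in "$1" "$2"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "expected two non-negative integers" >&2
                return 2
                ;;
        esac
    done
    if [ "$2" -eq 0 ]; then
        echo "queue_max_size must be greater than zero" >&2
        return 2
    fi
    echo $(( $1 * 100 / $2 ))
}

# queue_is_near_full <queue_byte_size> <queue_max_size> [threshold_percent]
# Succeeds when the fill level is at or above the threshold (default 80,
# a hypothetical value chosen only for this example).
queue_is_near_full() {
    pct=$(queue_fill_percent "$1" "$2") || return 2
    [ "$pct" -ge "${3:-80}" ]
}
```

A monitoring script could call queue_is_near_full with values read from the status output and raise an alert before dropped_count starts to climb.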

Server Statistics

| Field | Description |
| --- | --- |
| node | A unique ID for the Riak node on which the server (listener) is running |
| site | The connected site name configured with riak-repl add-site |
| strategy | A replication strategy defines an implementation of the Riak Replication protocol. Valid values: keylist or syncv1. |
| fullsync_worker | The Erlang process ID of the fullsync worker |
| bounded_queue | See the Bounded Queue section above |
| state | State shows what the current replication strategy is processing. The following definitions appear in the status output if the keylist strategy is being used. They can be used by Basho support to identify replication issues: wait_for_partition, build_keylist, wait_keylist, diff_bloom, diff_keylist |
| message_queue_len | The number of Erlang messages that are waiting to be processed by the server |

Keylist Strategy

The following fields appear under both the keylist_server and keylist_client fields. Any differences between the two are described in the table.

| Field | Description |
| --- | --- |
| fullsync | On the client, the number of partitions that remain to be processed. On the server, the partition currently being processed by fullsync replication. |
| partition_start | The number of elapsed seconds since replication has started on a given partition |
| stage_start | The number of elapsed seconds since replication has started on a given stage |
| get_pool_size | The number of Riak get finite state workers available to process requests |