Erlang VM Tuning
Riak was written almost exclusively in Erlang and runs on an Erlang virtual machine (VM), which makes proper Erlang VM tuning an important part of optimizing Riak performance. The Erlang VM itself provides a wide variety of configurable parameters that you can use to tune its performance; Riak enables you to tune a subset of those parameters in each node's configuration files.
The table below lists some of the parameters that are available, showing both their names as used in Erlang and their names as Riak parameters.
| Erlang parameter | Riak parameter | 
|---|---|
| +A | erlang.async_threads | 
| +K | erlang.K | 
| +P | erlang.process_limit | 
| +Q | erlang.max_ports | 
| +S | erlang.schedulers.total,erlang.schedulers.online | 
| +W | erlang.W | 
| +a | erlang.async_threads.stack_size | 
| +e | erlang.max_ets_tables | 
| +scl | erlang.schedulers.compaction_of_load | 
| +sfwi | erlang.schedulers.force_wakeup_interval | 
| -smp | erlang.smp | 
| +sub | erlang.schedulers.utilization_balancing | 
| +zdbbl | erlang.distribution_buffer_size | 
| -kernel net_ticktime | erlang.distribution.net_ticktime | 
| -env FULLSWEEP_AFTER | erlang.fullsweep_after | 
| -env ERL_CRASH_DUMP | erlang.crash_dump | 
| -env ERL_MAX_ETS_TABLES | erlang.max_ets_tables | 
| -name | nodename | 
In versions of Riak prior to 2.0, Erlang VM-related parameters were specified
in a vm.args configuration file; in versions 2.0 and later, all
Erlang-VM-specific parameters are set in the riak.conf file. If you're
upgrading to 2.0 from an earlier version, you can still use your old vm.args
if you wish.  Please note, however, that if you set one or more parameters in
both vm.args and in riak.conf, the settings in vm.args will override
those in riak.conf.
SMP
Some operating systems provide Erlang VMs with Symmetric Multiprocessing
capabilities
(SMP) for
taking advantage of multi-processor hardware architectures. SMP support
can be turned on or off by setting the erlang.smp parameter to
enable or disable. It is enabled by default. The following would
disable SMP support:
erlang.smp = disable
Because Riak is supported on some operating systems that do not provide
SMP support. Make sure that your OS supports SMP before enabling it for
use by Riak's Erlang VM. If it does not, you should set erlang.smp to
disable prior to starting up your cluster.
Another safe option is to set erlang.smp to auto. This will instruct
the Erlang VM to start up with SMP support enabled if (a) SMP support is
available on the current OS and (b) more than one logical processor is
detected. If neither of these conditions is met, the Erlang VM will
start up with SMP disabled.
Schedulers
We recommend that all users set the +sfwi to 500 (milliseconds)
and the +scl flag to false if using the older, vm.args-based
configuration system. If you are using the new, riak.conf-based
configuration system, the corresponding parameters are
erlang.schedulers.force_wakeup_interval and
erlang.schedulers.compaction_of_load.
Please note that you will need to uncomment the appropriate lines in
your riak.conf for this configuration to take effect.
If SMP support has been enabled on your Erlang
VM, i.e. if erlang.smp is set to enable or auto on a machine
providing SMP support and more than one logical processor, you can
configure the number of logical processors, or scheduler
threads, that are created
when starting Riak, as well as the number of threads that are set
online.
The total number of threads can be set using the
erlang.schedulers.total parameter, whereas the number of threads set
online can be set using erlang.schedulers.online. These parameters map
directly onto Schedulers and SchedulersOnline, both of which are
used by erl.
While the maximum for both parameters is 1024, there is no universal
default for either. Instead, the Erlang VM will attempt to determine the
number of configured processors, as well as the number of available
processors, on its own. If the Erlang VM can make that determination,
schedulers.total will default to the total number of configured
processors while schedulers.online will default to the number of
processors available; if the Erlang VM can't make that determination,
both values will default to 1.
If either parameter is set to a negative integer, that value will be
subtracted from the default number of processors that are configured or
available, depending on the parameter. For example, if there are 100
configured processors and schedulers.total is set to -50, then the
calculated value for schedulers.total will be 50. Setting either
parameter to 0, on the other hand, will reset both values to their
defaults.
If SMP support is not enabled, i.e. if erlang.smp is set to disable
(or set to auto on a machine without SMP support or with only one
logical processor), then the values of schedulers.total and
schedulers.online will be ignored.
Scheduler Wakeup Interval
Scheduler wakeup is an optional process whereby Erlang VM schedulers are
periodically scanned to determine whether they have "fallen asleep,"
i.e. whether they have an empty run
queue. The interval at which
this process occurs can be set, in milliseconds, using the
erlang.schedulers.force_wakeup_interval parameter, which corresponds
to the Erlang VM's +sfwi flag. This parameter is set to 0 by
default, which disables scheduler wakeup.
Erlang distributions like R15Bx have a tendency to put schedulers to sleep too often. If you are using a more recent distribution, i.e. a if you are running Riak 2.0 or later, you most likely won't need to enable scheduler wakeup.
Scheduler Compaction and Balancing
The Erlang scheduler offers two methods of distributing load across schedulers: compaction of load and utilization balancing of load.
Compaction of load is used by default. When enabled, the Erlang VM will
attempt to fully load as many scheduler threads as possible, i.e. it
will attempt to ensure that scheduler threads do not run out of work. To
that end, the VM will take into account the frequency with which
schedulers run out of work when making decisions about which schedulers
should be assigned work. You can disable compaction of load by setting
the erlang.schedulers.compaction_of_load setting to false (in the
older configuration system, set +scl to true).
The other option, utilization balancing, is disabled by default in favor
of load balancing. When utilization balancing is enabled instead, the
Erlang VM will strive to balance scheduler utilization as equally as
possible between schedulers, without taking into account the frequency
at which schedulers run out of work. You can enable utilization
balancing by setting the erlang.schedulers.utilization_balancing
setting to true (or the +scl parameter to false in the older
configuration system).
At any given time, only compaction of load or utilization balancing
can be used. If you set both parameters to false, Riak will default to
using compaction of load; if both are set to true, Riak will enable
whichever setting is listed first in riak.conf (or vm.args if you're
using the older configuration system).
Port Settings
Riak uses epmd, the Erlang
Port Mapper Daemon, for most inter-node communication. In this system,
other nodes in the cluster use the Erlang identifiers specified by the nodename parameter (or -name in vm.args), for example riak@10.9.8.7. On each node, the daemon resolves these node
identifiers to a TCP port. You can specify a port or range of ports for
Riak nodes to listen on as well as the maximum number of concurrent
ports/sockets.
Port Range
By default, epmd binds to TCP port 4369 and listens on the wildcard interface. epmd uses an unpredictable port for inter-node communication by default, binding to port 0, which means that it uses the first available port. This can make it difficult to configure firewalls.
To make configuring firewalls easier, you can instruct the Erlang VM to
use either a limited range of TCP ports or a single TCP port. The
minimum and maximum can be set using the
erlang.distribution.port_range.minimum and
erlang.distribution.port.maximum parameters, respectively. The
following would set the range to ports between 3000 and 5000:
- riak.conf
- app.config
erlang.distribution.port_range.minimum = 3000
erlang.distribution.port_range.maximum = 5000
%% The older, app.config-based system uses different parameter names
%% for specifying the minimum and maximum port
{kernel, [
          % ...
          {inet_dist_listen_min, 3000},
          {inet_dist_listen_max, 5000}
          % ...
         ]}
You can set the Erlang VM to use a single port by setting the minimum to the desired port while setting no maximum. The following would set the port to 5000:
- riak.conf
- app.config
erlang.distribution.port_range.minimum = 5000
{kernel, [
          % ...
          {inet_dist_listen_min, 5000},
          % ...
         ]}
If the minimum port is unset, the Erlang VM will listen on a random high-numbered port.
Maximum Ports
You can set the maximum number of concurrent ports/sockets used by the
Erlang VM using the erlang.max_ports setting. Possible values range
from 1024 to 134217727. The default is 65536. In vm.args you can use
either +Q or -env ERL_MAX_PORTS.
Asynchronous Thread Pool
If thread support is available in your Erlang VM, you can set the number
of asynchronous threads in the Erlang VM's asynchronous thread pool
using erlang.async_threads (+A in vm.args).  The valid range is 0
to 1024. If thread support is available on your OS, the default is 64.
Below is an example setting the number of async threads to 600:
- riak.conf
- vm.args
erlang.async_threads = 600
+A 600
Stack Size
In addition to the number of asynchronous threads, you can determine the
memory allocated to each thread using the
erlang.async_threads.stack_size parameter, which corresponds to the
+a Erlang flag. You can determine that size in Riak using KB, MB, GB,
etc. The valid range is 16-8192 kilowords, which translates to 64-32768
KB on 32-bit architectures. While there is no default, we suggest a
stack size of 16 kilowords, which translates to 64 KB. We suggest such a
small size because the number of asynchronous threads, as determined by
erlang.async_threads might be quite large in your Erlang VM. The 64 KB
default is enough for drivers delivered with Erlang/OTP but might not be
large enough to accommodate drivers that use the driver_async()
functionality, documented
here. We recommend
setting higher values with caution, always keeping the number of
available threads in mind.
Kernel Polling
You can utilize kernel polling in your Erlang distribution if your OS
supports it. Kernel polling can improve performance if many file
descriptors are in use; the more file descriptors, the larger an effect
kernel polling may have on performance. Kernel polling is enabled by
default on Riak's Erlang VM, i.e. the default for erlang.K is on.
This corresponds to the
+K setting on the
Erlang VM. You can disable it by setting erlang.K to off.
Warning Messages
Erlang's
error_logger is an
event manager that registers error, warning, and info events from the
Erlang runtime. By default, events from the error_logger are mapped as
warnings, but you can also set messages to be mapped as errors or info
reports using the erlang.W parameter (or +W in vm.args). The
possible values are w (warnings), errors, or i (info reports).
Process Limit
The erlang.process_limit parameter can be used to set the maximum
number of simultaneously existing system processes (corresponding to
Erlang's +P parameter). The valid range is 1024 to 134217727. The
default is 256000.
Distribution Buffer
You can set the size of the Erlang VM's distribution buffer busy limit
(denoted by +zdbbl on the VM and in vm.args) using
erlang.distribution_buffer_size.  Modifying this setting can be useful
on nodes with many busy_dist_port events, i.e. instances when the
Erlang distribution is overloaded. The default is 32 MB (i.e. 32MB),
but this may be insufficient for some workloads. The maximum value is
2097151 KB.
A larger buffer limit will allow processes to buffer more outgoing
messages. When the limit is reached, sending processes will be suspended
until the the buffer size has shrunk below the limit specified by
erlang.distribution_buffer_size. Higher values will tend to produce
lower latency and higher throughput but at the expense of higher RAM
usage. You should evaluate your RAM resources prior to increasing this
setting.
Erlang Built-in Storage
Erlang uses a built-in database called
ets (Erlang Term Storage)
for some processes that require fast access from memory in constant
access time (rather than logarithmic access time).  The maximum number
of tables can be set using the erlang.max_ets_tables setting. The
default is 256000, which is higher than the default limit of 1400 on the
Erlang VM. The corresponding setting in vm.args is +e.
Higher values for erlang.max_ets_tables will tend to provide more
quick-access data storage but at the cost of higher RAM usage. Please
note that the default values for erlang.max_ets_tables and
erlang.distribution_size (explained in the section above) are the same.
Crash Dumps
By default, crash dumps from Riak's Erlang distribution are deposited in
./log/erl_crash.dump. You can change this location using
erlang.crash_dump. This is the equivalent of setting the
ERL_CRASH_DUMP
environment variable for the Erlang VM.
Net Kernel Tick Time
The net kernel is an Erlang
system process that provides various forms of network monitoring. In a
Riak cluster, one of the functions of the net kernel is to periodically
check node liveness. Tick time is the frequency with which those
checks happen. You can determine that frequency using the
erlang.distribution.net_ticktime. The tick will occur every N seconds,
where N is the value set. Thus, setting
erlang.distribution.net_ticktime to 60 will make the tick occur once
every minute. The corresponding flag in vm.args is -kernel net_ticktime.
Shutdown Time
You can determine how long the Erlang VM spends shutting down using the
erlang.shutdown_time parameter. The default is 10s (10 seconds).
Once this duration elapses, all existing processes are killed.
Decreasing shutdown time can be useful in situations in which you are
frequently starting and stopping a cluster, e.g. in test clusters. In
vm.args you can set the -shutdown_time flag in milliseconds.