Load Balancing
This is a legacy Apache Ignite documentation
The new documentation is hosted here: https://ignite.apache.org/docs/latest/
Overview
In Apache Ignite, load balancing is achieved via LoadBalancingSpi
, which controls the load on all the nodes making sure that every node in the cluster is equally loaded. In homogeneous environments with homogeneous tasks, load balancing is achieved by random or round-robin policies. However, in many other use cases, especially under uneven load, more complex adaptive load-balancing policies may be needed.
LoadBalancingSpi
uses an "early" load balancing technique where a job is scheduled for execution on a specific node before it is sent to the cluster.
Data Affinity
Note that load balancing is triggered whenever your jobs are not collocated with data or have no real preference on which node to execute. If Collocation Of Compute and Data is used, then data affinity takes priority over load balancing.
Round-Robin Load Balancing
RoundRobinLoadBalancingSpi
iterates through the cluster nodes in a round-robin fashion and picks the next sequential node. In Round-Robin load balancing, two modes of operation are supported: per-task and global. Global mode is used by default.
Per-Task Mode
When configured in per-task mode, the implementation picks a random node at the beginning of every task execution and then sequentially iterates through all the nodes in the topology starting from that node. For cases when the split size of a task is equal to the number of nodes, this mode guarantees that all nodes will participate in job execution.
Global Mode
When configured in global mode, a single sequential queue of nodes is maintained for all tasks and the next node in the queue is picked every time. In this mode (unlike in per-task mode), it is possible that even if the split size of a task is equal to the number of nodes, some jobs within the same task will be assigned to the same node whenever multiple tasks are executing concurrently.
<bean id="grid.custom.cfg" class="org.apache.ignite.IgniteConfiguration" singleton="true">
...
<property name="loadBalancingSpi">
<bean class="org.apache.ignite.spi.loadbalancing.roundrobin.RoundRobinLoadBalancingSpi">
<!-- Set to per-task round-robin mode (this is default behavior). -->
<property name="perTask" value="true"/>
</bean>
</property>
...
</bean>
RoundRobinLoadBalancingSpi spi = new RoundRobinLoadBalancingSpi();
// Configure SPI to use per-task mode (this is default behavior).
spi.setPerTask(true);
IgniteConfiguration cfg = new IgniteConfiguration();
// Override default load balancing SPI.
cfg.setLoadBalancingSpi(spi);
// Start Ignite node.
Ignition.start(cfg);
Random and Weighted Load Balancing
WeightedRandomLoadBalancingSpi
picks a random node for job execution. You can also optionally assign weights to nodes, so nodes with larger weights will end up getting proportionally more jobs routed to them. By default all nodes get equal weight of 10.
<bean id="grid.custom.cfg" class="org.apache.ignite.IgniteConfiguration" singleton="true">
...
<property name="loadBalancingSpi">
<bean class="org.apache.ignite.spi.loadbalancing.weightedrandom.WeightedRandomLoadBalancingSpi">
<property name="useWeights" value="true"/>
<property name="nodeWeight" value="10"/>
</bean>
</property>
...
</bean>
WeightedRandomLoadBalancingSpi spi= new WeightedRandomLoadBalancingSpi();
// Configure SPI to used weighted random load balancing.
spi.setUseWeights(true);
// Set weight for the local node.
spi.setNodeWeight(10);
IgniteConfiguration cfg = new IgniteConfiguration();
// Override default load balancing SPI.
cfg.setLoadBalancingSpi(spi);
// Start Ignite node.
Ignition.start(cfg);
Job Stealing
Quite often grids are deployed across many computers some of which may be more powerful or under-utilized than others. Enabling JobStealingCollisionSpi
helps avoid jobs being stuck at an over-utilized node, as they will be stolen by an under-utilized node.
JobStealingCollisionSpi
supports job stealing from over-utilized nodes to under-utilized nodes. This SPI is especially useful if you have some jobs that complete quickly, while others are sitting in the waiting queue on over-utilized nodes. In such a case, the waiting jobs will be stolen from the slower node and moved to the fast/under-utilized node.
JobStealingCollisionSpi
adopts a "late" load balancing technique, which allows reassigning a job from node A to node B after the job has been scheduled for execution on node A.
Here is an example of how to configure JobStealingCollisionSpi
:
<bean class="org.apache.ignite.IgniteConfiguration" singleton="true">
<!-- Enabling the required Failover SPI. -->
<property name="failoverSpi">
<bean class="org.apache.ignite.spi.failover.jobstealing.JobStealingFailoverSpi"/>
</property>
<!-- Enabling the JobStealingCollisionSpi for late load balancing. -->
<property name="collisionSpi">
<bean class="org.apache.ignite.spi.collision.jobstealing.JobStealingCollisionSpi">
<property name="activeJobsThreshold" value="50"/>
<property name="waitJobsThreshold" value="0"/>
<property name="messageExpireTime" value="1000"/>
<property name="maximumStealingAttempts" value="10"/>
<property name="stealingEnabled" value="true"/>
<property name="stealingAttributes">
<map>
<entry key="node.segment" value="foobar"/>
</map>
</property>
</bean>
</property>
...
</bean>
JobStealingCollisionSpi spi = new JobStealingCollisionSpi();
// Configure number of waiting jobs
// in the queue for job stealing.
spi.setWaitJobsThreshold(10);
// Configure message expire time (in milliseconds).
spi.setMessageExpireTime(1000);
// Configure stealing attempts number.
spi.setMaximumStealingAttempts(10);
// Configure number of active jobs that are allowed to execute
// in parallel. This number should usually be equal to the number
// of threads in the pool (default is 100).
spi.setActiveJobsThreshold(50);
// Enable stealing.
spi.setStealingEnabled(true);
// Set stealing attribute to steal from/to nodes that have it.
spi.setStealingAttributes(Collections.singletonMap("node.segment", "foobar"));
// Enable `JobStealingFailoverSpi`
JobStealingFailoverSpi failoverSpi = new JobStealingFailoverSpi();
IgniteConfiguration cfg = new IgniteConfiguration();
// Override default Collision SPI.
cfg.setCollisionSpi(spi);
cfg.setFailoverSpi(failoverSpi);
Configuration requirement.
Note that
org.apache.ignite.spi.failover.jobstealing.JobStealingFailoverSpi
andIgniteConfiguration.getMetricsUpdateFrequency()
should be enabled in order for this SPI to work properly. All otherJobStealingCollisionSpi
configuration parameters are optional.
Updated 2 months ago