This document describes the concurrency and message grouping implications of data changes sent from adaptors to YOUnite for processing.
About Concurrency and Message Groups
The following details the principles of concurrency and message groups.
Concurrency
The YOUnite Server processes incoming messages from the Message Bus concurrently, with a configurable level of
concurrency. Each concurrent process that handles incoming messages is called a Message Consumer.
Concurrency increases message throughput, but it can cause messages to be processed out of order, because each
Message Consumer processes messages at a slightly different rate. To preserve data consistency, message groups are
used to ensure that related messages are sent to the same Message Consumer.
Message Groups
Message groups are a way of grouping messages together to ensure that they are sent to the same Message Consumer and
are processed in the order in which they were added to the queue. In ActiveMQ, a message is assigned to a message
group by setting a value for the JMSXGroupID header.
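The effect of message groups can be illustrated with a small simulation. This is a sketch, not ActiveMQ's or YOUnite's implementation: the `dispatch` function, the sample queue, and the use of CRC32 to choose a consumer are all assumptions made for illustration. The only behavior taken from the text above is that messages carrying the same group ID go to the same consumer and keep their queue order.

```python
import zlib
from collections import defaultdict


def dispatch(messages, consumer_count):
    """Assign each (group_id, payload) message to one consumer.

    Hypothetical illustration: the consumer is chosen from a stable hash of
    the group ID, so all messages of a group land on the same consumer and
    keep their relative queue order.
    """
    per_consumer = defaultdict(list)
    for group_id, payload in messages:
        # zlib.crc32 is stable across runs (unlike Python's built-in hash()).
        consumer = zlib.crc32(group_id.encode("utf-8")) % consumer_count
        per_consumer[consumer].append((group_id, payload))
    return dict(per_consumer)


# Hypothetical queue contents: two records, interleaved.
queue = [
    ("record-A", "create"),
    ("record-B", "create"),
    ("record-A", "update"),
    ("record-B", "update"),
    ("record-A", "delete"),
]

assignments = dispatch(queue, consumer_count=5)
for consumer, handled in sorted(assignments.items()):
    print(consumer, handled)
```

Whichever consumer ends up owning `record-A`, it sees create, update, and delete in that order, even though other consumers may be working on unrelated records at the same time.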
By default, the Adaptor SDK groups messages by their unique identifier in YOUnite (henceforth referred to as the
DR UUID). This is effective for messages whose DR UUID is known. For new records, a DR UUID has not yet been
assigned, so by default the Adaptor SDK uses the hash code of the record's Local Identifier (i.e., its primary key).
Once the DR UUID is known to the adaptor, it switches to using that as the message group.
Grouping by DR UUID (or by the hash code of the Local Identifier) ensures that the same Message Consumer processes
all messages for a given record, in order.
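The default grouping rule can be sketched as a small function. This is an illustration under stated assumptions, not the Adaptor SDK's code: the function names are hypothetical, and the hash shown is the algorithm of Java's `String.hashCode()`, which is only a guess at what "hash code" means here.

```python
import uuid
from typing import Optional


def java_style_hash_code(text: str) -> int:
    """Assumed hash: the algorithm of Java's String.hashCode().

    Matches Java for characters in the Basic Multilingual Plane.
    """
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Interpret the 32-bit result as a signed integer, as Java does.
    return h - 0x100000000 if h >= 0x80000000 else h


def message_group_id(dr_uuid: Optional[uuid.UUID], local_identifier: str) -> str:
    """Return the message group for a record (illustrative only).

    Mirrors the rule described above: use the DR UUID when it is known,
    otherwise fall back to a hash code of the Local Identifier.
    """
    if dr_uuid is not None:
        return str(dr_uuid)
    return str(java_style_hash_code(local_identifier))


# A new record (no DR UUID yet) versus the same record once a DR UUID exists.
new_record = message_group_id(None, "abc")
known_record = message_group_id(
    uuid.UUID("12345678-1234-5678-1234-567812345678"), "abc"
)
print(new_record)    # 96354
print(known_record)  # 12345678-1234-5678-1234-567812345678
```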
High Availability
In a High Availability environment (multiple YOUnite Servers), the same concurrency principles apply. For example,
if the concurrency level is set to 5 and there are 3 YOUnite Servers, the total concurrency will be 15, meaning
there will be 15 Message Consumers. Messages are divided among those 15 consumers.
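The arithmetic in the example above, as a minimal sketch. The function names and the `server-N/consumer-M` labels are hypothetical and exist only to make the count concrete.

```python
def total_message_consumers(concurrency_per_server: int, server_count: int) -> int:
    """Total Message Consumers across all YOUnite Servers."""
    if concurrency_per_server < 1 or server_count < 1:
        raise ValueError("concurrency and server count must be positive")
    return concurrency_per_server * server_count


def consumer_labels(concurrency_per_server: int, server_count: int) -> list:
    """Hypothetical labels such as 'server-1/consumer-3', for illustration."""
    return [
        f"server-{s}/consumer-{c}"
        for s in range(1, server_count + 1)
        for c in range(1, concurrency_per_server + 1)
    ]


print(total_message_consumers(5, 3))  # 15
print(consumer_labels(5, 3)[:2])      # ['server-1/consumer-1', 'server-1/consumer-2']
```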
TL;DR
Message grouping is crucial to ensure that related messages are processed in order! The default message grouping is by
the UUID of the data record (DR UUID) or, until a DR UUID is assigned, the string representation of the record's
unique ID.
Configuration
Options
The following configuration options are available in the YOUnite Server. These options may be set either in
application.properties or via environment variables.
Option | Description | Default Value |
---|---|---|
message.bus.data.queue.concurrency | Concurrency of the listener for data changes from adaptors. Must be a fixed value. | 5 |
message.bus.ops.queue.concurrency | Concurrency of the listener for operational messages from adaptors, including responses to federated GET requests. May be a fixed value or a range. | 1-5 |
message.bus.link.queue.concurrency | Concurrency of the listener for messages from adaptors with linkage information. These messages are sent in response to data events so that YOUnite knows which records have been linked at an adaptor. May be a fixed value or a range. | 1-5 |
message.bus.adaptor.log.queue.concurrency | Concurrency of the listener for logging messages from adaptors. May be a fixed value or a range. | 1-5 |
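The two value shapes in the table (a fixed value such as 5, or a range such as 1-5) can be checked with a small parser. This is a sketch for validating a configuration before deployment, not the server's own parsing logic. The `parse_concurrency` helper is hypothetical, and reading a fixed value N as the pair (N, N) is an assumption.

```python
from typing import Tuple

DATA_QUEUE_OPTION = "message.bus.data.queue.concurrency"


def parse_concurrency(option: str, value: str) -> Tuple[int, int]:
    """Parse '5' or '1-5' into (minimum, maximum) consumers.

    Assumption: a fixed value N is read here as (N, N); how the server
    itself interprets the value is not specified in this document.
    """
    text = value.strip()
    if "-" in text:
        low_text, high_text = text.split("-", 1)
        low, high = int(low_text), int(high_text)
    else:
        low = high = int(text)
    if low < 1 or high < low:
        raise ValueError(f"{option}: invalid concurrency {value!r}")
    # The table states that the data queue concurrency must be fixed.
    if option == DATA_QUEUE_OPTION and low != high:
        raise ValueError(f"{option} must be a fixed value, got {value!r}")
    return low, high


# Default values as listed in the table above.
defaults = {
    "message.bus.data.queue.concurrency": "5",
    "message.bus.ops.queue.concurrency": "1-5",
    "message.bus.link.queue.concurrency": "1-5",
    "message.bus.adaptor.log.queue.concurrency": "1-5",
}
for option, value in defaults.items():
    print(option, parse_concurrency(option, value))
```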
Tuning
Some considerations when setting the concurrency level:
- Data events are CPU-intensive and short-lived; therefore, the optimal concurrency may be close to the number of available CPU cores.
- On the other hand, a low concurrency level groups more and more unrelated records together. This can starve one or more Message Consumers while they wait for messages that match their groups.
- By default, ActiveMQ distributes messages across up to 1,024 message groups. The hash code of the JMSXGroupID is used to determine which message group to use. Each consumer is matched to one or more of these message groups. 1,024 groups should be plenty for normal usage (at a concurrency of 10, that would be 102 servers …). ActiveMQ can be configured to use a different number of message groups if desired.
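The numbers in the last point can be worked through with a short sketch. It is illustrative only: the modulo mapping in `bucket_for` and the round-robin spread in `buckets_per_consumer` are assumptions and are not taken from ActiveMQ's source; only the figure of 1,024 groups and the 102-server example come from the text above.

```python
BUCKET_COUNT = 1024  # default number of message groups cited above


def bucket_for(group_id_hash: int, bucket_count: int = BUCKET_COUNT) -> int:
    """Map a group ID hash code to a message group (assumed: modulo)."""
    return abs(group_id_hash) % bucket_count


def max_servers(concurrency_per_server: int, bucket_count: int = BUCKET_COUNT) -> int:
    """Servers that fit before consumers outnumber message groups."""
    return bucket_count // concurrency_per_server


def buckets_per_consumer(consumer_count: int, bucket_count: int = BUCKET_COUNT) -> list:
    """Spread message groups over consumers (illustrative round-robin only)."""
    counts = [0] * consumer_count
    for bucket in range(bucket_count):
        counts[bucket % consumer_count] += 1
    return counts


print(max_servers(10))          # 102
print(buckets_per_consumer(5))  # [205, 205, 205, 205, 204]
print(bucket_for(96354))        # 98
```

With fewer consumers, each one owns more message groups, which is the trade-off described in the second point of the list.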