This guide outlines the procedures for managing and restoring services within the YOUnite stack in a clustered environment.

Important
The younite-api service is the most critical service in the stack: most other services either support it or depend on it.

Dependencies between YOUnite Stack Services

Service | Depends On

younite-db | -

younite-mb | -

younite-notification-service-db¹ | -

younite-elastic | younite-logstash ?

younite-kibana¹ | younite-elastic, younite-logstash, ID Provider

younite-ui¹ | ID Provider, younite-api, younite-notification-service², younite-data-virtualization-service²

younite-notification-service¹ | younite-notification-service-db, younite-api, younite-mb, ID Provider?

younite-data-virtualization-service¹ | younite-api, younite-db, younite-mb, younite-elastic?, ID Provider

younite-api | younite-db, younite-mb, younite-elasticsearch, younite-logstash?, ID Provider

¹ Optional

² If configured - not a critical dependency for YOUnite core functionality
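
The table above can be read as a dependency graph, from which a safe start order follows. The following is a minimal sketch using Python's standard graphlib module (Python 3.9 or later). The names DEPENDS_ON and start_order are illustrative and not part of YOUnite. As assumptions, entries marked "?" in the table are treated as real dependencies, and "younite-elasticsearch" is taken to mean the younite-elastic service.

```python
from graphlib import TopologicalSorter

# Service -> services it depends on, transcribed from the table above.
# Assumptions: "?" entries are treated as real dependencies, and
# "younite-elasticsearch" is taken to mean the younite-elastic service.
DEPENDS_ON = {
    "younite-db": [],
    "younite-mb": [],
    "younite-notification-service-db": [],
    "younite-elastic": ["younite-logstash"],
    "younite-kibana": ["younite-elastic", "younite-logstash", "ID Provider"],
    "younite-ui": [
        "ID Provider",
        "younite-api",
        "younite-notification-service",
        "younite-data-virtualization-service",
    ],
    "younite-notification-service": [
        "younite-notification-service-db",
        "younite-api",
        "younite-mb",
        "ID Provider",
    ],
    "younite-data-virtualization-service": [
        "younite-api",
        "younite-db",
        "younite-mb",
        "younite-elastic",
        "ID Provider",
    ],
    "younite-api": [
        "younite-db",
        "younite-mb",
        "younite-elastic",
        "younite-logstash",
        "ID Provider",
    ],
}


def start_order(depends_on):
    """Return the services ordered so that each one follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for name in start_order(DEPENDS_ON):
        print(name)
```

Reversing the resulting list gives a stop order in which dependents are stopped before the services they rely on.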

Handling and Recovering Services

younite-db

Use storage solutions such as AWS EBS, with its DB snapshots and automatic replication, both to mitigate failures in younite-db and to enable database restoration.

If younite-db fails, or the database has a "catastrophic failure" or "irrecoverable error", then younite-api should be shut down immediately and restarted once younite-db is restarted and operational.

If the database in younite-db has a "catastrophic failure" or "irrecoverable error", it will need to be restored from the latest backup.

Steps

  • Container or service failure:

    1. Stop younite-api and, if configured, the younite-data-virtualization-service

    2. Restart the younite-db service

    3. Wait for the younite-db service to become operational

    4. Restart younite-api

  • Catastrophic or Irrecoverable DB error:

    1. Stop younite-api

    2. Restart the younite-db service

    3. Wait for the younite-db service to become operational

    4. Restore younite-db to the last known good backup

      • Inspect the younite-api, younite-db (if possible) and Elastic logs for failed data events/exceptions and then report them as data issues to the staff responsible for data stewardship

    5. Restart younite-api
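
The "wait for the service to become operational" step in both procedures above can be automated with a polling loop. The following is a minimal sketch, not a YOUnite tool: the function names are illustrative, and the host name and port in the usage comment are assumptions that depend on the deployment. An open TCP port only shows that the database process is listening, so a real check may also need to run a query.

```python
import socket
import time


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_operational(check, timeout_s=300.0, interval_s=5.0,
                           sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout_s elapses.

    Returns True if the check passed and False if the deadline was reached.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example usage (host and port are hypothetical and deployment-specific):
#   if wait_until_operational(lambda: port_is_open("younite-db", 5432)):
#       ...restart younite-api...
```

Restarting younite-api only after this function returns True keeps the order of the steps above intact.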

younite-notification-service-db

Use storage solutions such as AWS EBS, with its DB snapshots and automatic replication, both to mitigate failures in younite-notification-service-db and to enable database restoration.

If younite-notification-service-db fails, or the database has a "catastrophic failure" or "irrecoverable error", then younite-notification-service should be shut down immediately and restarted once younite-notification-service-db is restarted and operational.

  • Container or service failure:

    1. Stop younite-notification-service

    2. Restart the younite-notification-service-db service

    3. Wait for the younite-notification-service-db service to become operational

    4. Restart younite-notification-service

  • Catastrophic or Irrecoverable DB error:

    1. Stop younite-notification-service

    2. Restart the younite-notification-service-db service

    3. Wait for the younite-notification-service-db service to become operational

    4. Restore younite-notification-service-db to the last known good backup

      • Inspect the younite-notification-service log, younite-notification-service-db log (if possible) and Elastic logs for exceptions and failures and report them as data issues to the staff responsible for data stewardship.

    5. Restart younite-notification-service
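
The database recovery procedures in this guide share one pattern: stop the dependent services, restart the database, wait, optionally restore, then restart the dependents. The sketch below generates that checklist as plain text so it can be reviewed or logged before anything is executed. The function recovery_plan and its parameters are illustrative names, not part of YOUnite.

```python
def recovery_plan(db_service, dependents, catastrophic=False):
    """Build the ordered recovery checklist for a database service.

    db_service   -- name of the failed database service
    dependents   -- services to stop first and restart last
    catastrophic -- True if the database must be restored from backup
    """
    steps = [f"Stop {name}" for name in dependents]
    steps.append(f"Restart the {db_service} service")
    steps.append(f"Wait for the {db_service} service to become operational")
    if catastrophic:
        steps.append(f"Restore {db_service} to the last known good backup")
        steps.append(
            "Inspect the service, database, and Elastic logs and report "
            "failures to the staff responsible for data stewardship"
        )
    steps.extend(f"Restart {name}" for name in dependents)
    return steps


if __name__ == "__main__":
    plan = recovery_plan(
        "younite-notification-service-db",
        ["younite-notification-service"],
        catastrophic=True,
    )
    for number, step in enumerate(plan, start=1):
        print(f"{number}. {step}")
```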

younite-mb

If younite-mb (the ActiveMQ message bus) fails or is in a fault condition, it will need to be restarted.

The following services depend on younite-mb:

  • younite-api

  • younite-data-virtualization-service

  • younite-notification-service

  • YOUnite Off-the-Shelf adaptors

  • Custom adaptors implemented by the customer or SI

The connections between these services depend on younite-mb.

Normal Handling of Fault or Failed younite-mb

Typically, services reconnect to younite-mb automatically once it has been restarted after being down or in a fault condition. Any service that does not reconnect to younite-mb will need to be restarted.

YOUnite Off-the-Shelf Adaptors Pending Data Events

The YOUnite Off-the-Shelf adaptors will cache the data events that occur while the younite-mb is down or in a fault condition and will send them to younite-api once younite-mb is restarted.
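
The caching behavior described above follows a common buffer-and-flush pattern, which a custom adaptor could also adopt. The sketch below illustrates that pattern only; it is not the implementation used by the YOUnite Off-the-Shelf adaptors. The class BufferedPublisher and the send callable are hypothetical, and send is assumed to raise ConnectionError while the message bus is unreachable.

```python
from collections import deque


class BufferedPublisher:
    """Queue events while the bus is down and deliver them in order later."""

    def __init__(self, send):
        # send(event) is assumed to raise ConnectionError when the bus is down.
        self._send = send
        self._pending = deque()

    def publish(self, event):
        """Queue the event, then try to deliver everything that is pending."""
        self._pending.append(event)
        self.flush()

    def flush(self):
        """Send pending events in order, stopping at the first failure.

        Returns the number of events delivered by this call.
        """
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # bus still down; keep the event for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending_count(self):
        return len(self._pending)
```

An event is removed from the queue only after send succeeds, so a failure in the middle of a flush does not drop or reorder events. This in-memory queue does not survive a restart of the adaptor itself.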

Detecting When Services Lose Connectivity With younite-mb

When a service loses connectivity with younite-mb, its logs will contain errors such as the following:

[AR-Data-6] ERROR o.a.activemq.artemis.core.client - AMQ214016: Failed to create netty connection

java.net.UnknownHostException: younite-mb

ERROR o.s.j.l.DefaultMessageListenerContainer - Could not refresh JMS Connection for destination 'AR-Data' - retrying using FixedBackOff{interval=5000, currentAttempts=0, maxAttempts=unlimited}. Cause: Failed to create session factory; nested exception is ActiveMQNotConnectedException[errorType=NOT_CONNECTED message=AMQ219007: Cannot connect to server(s). Tried with all available servers.]

AMQ219007: Cannot connect to server(s). Tried with all available servers.

Important
If younite-mb enters a failed or fault condition, the logs of each service that depends on it should be checked for these exceptions.

Detecting if a Service Depending on younite-mb Should be Restarted

If the log exceptions described cease, then the service automatically reconnected to younite-mb.

If the log exceptions do not cease, then the service should be restarted.
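
The check described above can be partly automated by scanning a service's log for the error markers shown earlier. The following is a minimal sketch: the function names are illustrative, and how the log lines are collected, and how long to wait after younite-mb recovers before deciding, are left to the operator because this guide does not specify them. The sample log shows a 5000 ms retry interval, so lines written in the first few seconds after recovery may still contain errors.

```python
import re

# Markers taken from the sample log output in this guide.
MB_ERROR_PATTERN = re.compile(
    r"AMQ214016"
    r"|AMQ219007"
    r"|UnknownHostException: younite-mb"
    r"|Could not refresh JMS Connection"
)


def mb_error_lines(lines):
    """Return the log lines that indicate lost connectivity with younite-mb."""
    return [line for line in lines if MB_ERROR_PATTERN.search(line)]


def should_restart(lines_since_mb_recovered):
    """Return True if errors persist in lines logged after younite-mb recovered.

    The caller is expected to pass only lines written after younite-mb came
    back, ideally after skipping a short grace period.
    """
    return bool(mb_error_lines(lines_since_mb_recovered))
```

If should_restart returns False for the lines logged after recovery, the service has reconnected on its own; otherwise it is a candidate for a restart.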

Elastic Services

If logging data events is critical to the organization, then the Elastic services should be set up as a high-availability cluster. See the Elasticsearch guide on high availability.

The following describes how to recover Elastic services in a non-mission-critical configuration.

younite-elastic & younite-logstash

Either of these services should be restarted if it is in a failed or fault condition. Services that send log requests to Elastic typically retry their requests even in non-HA configurations, but there is no guarantee that the retried requests will be logged, so some log entries may be lost.

younite-kibana

younite-kibana is the UI dashboard service for Elastic and can be restarted after a failure or fault condition. All UI users will need to log back in.

younite-ui

If the YOUnite UI fails, all UI users' sessions will be terminated. The service can be restarted to resume normal operation.

younite-notification-service

If the younite-notification-service fails it can simply be restarted.

A momentary pause in the service has the following effects:

  • Applications attempting to register for notifications will get a 401 or similar response and will have to retry their request

  • YOUnite UI sessions will stop receiving notifications, and users will need to log out and back in again

  • Existing notifications queued for delivery will be delayed during the restart
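
Client applications can absorb a short restart by retrying failed requests with a growing delay. The following is a minimal sketch of retry with exponential backoff. The names call_with_retry and TransientServiceError are hypothetical, and mapping a 401 or similar response to that exception is left to the application.

```python
import time


class TransientServiceError(Exception):
    """Raised by the caller's request function for a retryable failure."""


def call_with_retry(request, attempts=5, initial_delay_s=1.0, sleep=time.sleep):
    """Call request() until it succeeds or the attempts are used up.

    The delay doubles after each failed attempt. The last failure is
    re-raised so the caller can report it.
    """
    delay = initial_delay_s
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except TransientServiceError:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2
```

Capping the number of attempts keeps a client from retrying indefinitely if the service stays down.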

younite-data-virtualization-service

If the younite-data-virtualization-service fails it can simply be restarted. However, to mitigate failures the younite-data-virtualization-service can be started as a scalable service.

A momentary pause in the service has the following effect:

  • Applications (including the YOUnite UI) attempting to make data virtualization requests will get a 401 or similar response and will have to retry their request

ID Provider

Since YOUnite is typically configured to integrate with the customer's ID Provider (IDP), applications will be unable to start a session or refresh tokens if the IDP becomes unavailable. No action is needed, since normal token processing will be restored when the IDP becomes available again.

younite-api

younite-api is a horizontally scaled application and must run on a minimum of three servers (e.g., Kubernetes nodes, cloud services, Docker containers, or physical servers) in a clustered configuration.

In the rare event that all instances of younite-api fail or go into a fault condition, they will need to be restarted. Refer to the Dependencies between YOUnite Stack Services table to ensure that all the services younite-api depends on are operational before restarting the younite-api instances.

Services that depend on younite-api should automatically reconnect to it.
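
The pre-restart check for younite-api can be sketched as a small function that reports which dependencies are not yet operational. The names below are illustrative, and the is_operational callable is a hypothetical, deployment-specific health check supplied by the operator. The dependency list is transcribed from the younite-api row of the dependency table, assuming that "younite-elasticsearch" refers to the younite-elastic service and that the entry marked "?" is a real dependency.

```python
# Dependencies of younite-api, transcribed from the dependency table.
# Assumptions: "younite-elasticsearch" means younite-elastic, and the
# entry marked "?" (younite-logstash) is treated as a real dependency.
API_DEPENDENCIES = [
    "younite-db",
    "younite-mb",
    "younite-elastic",
    "younite-logstash",
    "ID Provider",
]


def blocking_dependencies(dependencies, is_operational):
    """Return the dependencies for which is_operational(name) is false."""
    return [name for name in dependencies if not is_operational(name)]


def ready_to_restart_api(is_operational):
    """Return True only if every younite-api dependency is operational."""
    return not blocking_dependencies(API_DEPENDENCIES, is_operational)
```

Returning the list of blocking services, rather than only a yes or no answer, tells the operator what to recover first.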