Background

The following terms are used on this page and are described in the Glossary.

  • Data Domain (Domain)

  • Data Record (DR)

  • Data Record Keys

  • Data Catalog

  • Data Event

  • Source System

  • Source Data Record or Source Entity

Data Catalog and Federated Data Records

Important
Ensure that the YOUnite API database is encrypted at rest and in transit to protect sensitive data, especially the data catalog, from unauthorized access. Also place the database behind a firewall to restrict external access.

The capacity to manage all the federated data records (DRs) in your YOUnite Data Fabric is a key YOUnite feature. To manage data domains and their data records, YOUnite maintains a data catalog that records where the source entities live.

The data catalog lives in the YOUnite API server’s database (YOUnite metadata) and contains links and data record keys for each federated DR.

For example, suppose there is a deployment with a "customer" data domain and there is a customer "Sally Jones" stored in three different data sources:

  1. Customer Service database

  2. Home Loans database

  3. ATM Users database

Using matching algorithms, YOUnite reconciles the differences between the three records and creates a single federated data record for the customer Sally Jones. The data catalog then holds one link, with its data record keys, for each source system, e.g.:

YOUnite Data Catalog

  • Data Domain Version: Customer:v1

  • DR UUID: 4aaef377-6057-48be-9d69-a28162b25b68

  • Adaptor Entry:

    • Source System: Customer Service database

    • Source System: Home Loans database

      • name: Jones

      • phone: 81 123456

      • cust-id: 890

    • Source System: ATM Users database

      • cust-id: 890
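The catalog entry above can be pictured as a small data structure. The following is a minimal sketch only: the class and field names (`CatalogEntry`, `SourceLink`, `record_keys`) are hypothetical and do not reflect the actual YOUnite schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SourceLink:
    """One link from a federated DR to a record in a source system (hypothetical shape)."""
    source_system: str
    record_keys: Dict[str, str] = field(default_factory=dict)


@dataclass
class CatalogEntry:
    """Hypothetical in-memory view of one federated data record."""
    domain_version: str
    dr_uuid: str
    links: List[SourceLink] = field(default_factory=list)

    def keys_for(self, source_system: str) -> Dict[str, str]:
        """Return the data record keys stored for the given source system."""
        for link in self.links:
            if link.source_system == source_system:
                return link.record_keys
        raise KeyError(source_system)


entry = CatalogEntry(
    domain_version="Customer:v1",
    dr_uuid="4aaef377-6057-48be-9d69-a28162b25b68",
    links=[
        # The example above lists no keys for this system, so none are set here.
        SourceLink("Customer Service database"),
        SourceLink("Home Loans database",
                   {"name": "Jones", "phone": "81 123456", "cust-id": "890"}),
        SourceLink("ATM Users database", {"cust-id": "890"}),
    ],
)
```

The point of the structure is that the catalog stores only links and keys; the customer data itself stays in the source systems.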

Assumptions

To facilitate YOUnite’s managing of your federated data, the following assumptions are made:

  1. Each application or service (source system) has been identified and registered to YOUnite through an adaptor. See Creating and Managing Adaptors for how to register and manage an adaptor.

  2. Data Domains have been identified and understood for this YOUnite Data Fabric implementation. See Data Domains.

  3. Adaptors have been created for each of the applicable source systems. Off-the-shelf adaptors are available (see YOUnite DB Adaptor) and custom Adaptors are developed using the Adaptor SDK.

  4. Key actors in the process of linking have been established. This includes the DGS and the individual Zone Data Stewards (see the section on "Zones" in Zones, Users, Groups, Roles and Permissions).

Creating the Data Catalog - Linking Data Records

The process of linking source entities to DRs (linking) is accomplished by either:

  • Standard data linking

  • Data discovery

Standard Linking

With this method, YOUnite waits for data events to occur and to be published by adaptors. As adaptors send data events to YOUnite for a given DR, YOUnite automatically links and synchronizes them according to data governance (i.e. inbound/outbound ACLs).

A PUT or POST data event will automatically link a DR to YOUnite if it is not already linked.
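The link-on-first-event behavior can be sketched as follows. This is an illustration under stated assumptions, not YOUnite's implementation: the `handle_data_event` function and the dictionary used as a catalog are hypothetical, and the sketch always creates a new DR for an unlinked record, whereas YOUnite would first try to match the record to an existing DR.

```python
import uuid
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the data catalog:
# (source system, record key) -> DR UUID
Catalog = Dict[Tuple[str, str], str]


def handle_data_event(catalog: Catalog, method: str,
                      source_system: str, record_key: str) -> Optional[str]:
    """Return the DR UUID for the event, linking the record first if needed."""
    link = (source_system, record_key)
    if link in catalog:
        # Already linked: nothing to add to the catalog.
        return catalog[link]
    if method in ("PUT", "POST"):
        # Not yet linked: a PUT or POST event creates the link.
        # Matching against existing DRs is omitted in this sketch.
        catalog[link] = str(uuid.uuid4())
        return catalog[link]
    # Other event types do not create a link in this sketch.
    return None
```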

Data Discovery

Because it is not known which data sources (DBs) hold the most recent copy of a data record, it may not be desirable to synchronize the existing data in a DB when introducing it into the data fabric. Typically, synchronization with a new system that already contains data starts only once all of its records have been added to the YOUnite data catalog; otherwise, old data in the new system may overwrite more recent data in other systems.

Putting an adaptor in "Discovery mode" allows the adaptor to scan a DB for records and then send them to YOUnite for linking without synchronizing the data to other DBs.

Once the adaptor has completed its initial scan, it can be taken out of "discovery mode" and returned to standard linking operation: detecting data events and synchronizing them according to data governance (i.e. inbound/outbound ACLs).

Difference Between Standard Linking and Data Discovery

If an adaptor is not put into discovery mode when first connected to the source system, any existing records in the source system will not get linked to DRs (added to the data catalog) until an existing record is updated.
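The difference between the two modes comes down to whether a cataloged event is also routed onward. A minimal sketch, assuming a hypothetical `process_event` function and a set of record keys standing in for the catalog; ACL checks are left out:

```python
from typing import List, Set


def process_event(catalog: Set[str], record_key: str,
                  discovery_mode: bool, other_systems: List[str]) -> List[str]:
    """Catalog the record and return the systems the event is routed to."""
    # In both modes the record is added to the catalog (linked).
    catalog.add(record_key)
    if discovery_mode:
        # Discovery: catalog only, no synchronization to other systems.
        return []
    # Standard linking: the event is also routed to the other systems
    # (subject to data governance, which this sketch does not model).
    return list(other_systems)
```

Called with `discovery_mode=True`, the function updates the catalog and returns an empty routing list; with `discovery_mode=False`, the same event is also passed on to every other system.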

Enabling and Disabling Adaptor Data Discovery

An adaptor is typically put in discovery mode when the adaptor is started for the first time and when there is no activity on the source system.

1) When creating the adaptor, set discoveryMode to true.

POST /zones/<zone-uuid>/adaptors
{
  "name": "Postgres DB",
  "discoveryMode": true,
    .
    .
    .
}

2) Alternatively, the YOUnite UI can be used to put an existing adaptor into Discovery mode. First pause the adaptor, then turn on Discovery Mode.

Pause Adaptor button
Discovery Mode from Adaptor Pulldown

3) Once in Discovery mode, an adaptor scans the source system to identify all records associated with the specified data domain and sends them to YOUnite to update the data catalog. While in Discovery mode, the data events do NOT get routed to other source systems (this is the desired effect of data discovery).

4) Start the adaptor

Play Adaptor button

5) Adaptor metrics indicate how many records are being scanned and cataloged.

Important
Adaptor metrics are not reported while an adaptor is in the PAUSED state.

6) On the adaptor’s page in the YOUnite UI, there is a Metrics tab. When the SENT value approaches or reaches 0, there are no more records to scan and the adaptor can be taken out of Discovery mode.

Metrics Tab
Note
Record the scan start time, since it may be needed to find the correct line in the list showing the discovery activity.

7) The adaptor can be taken out of Discovery mode using the YOUnite UI by opening the adaptor’s pull-down menu and selecting Cancel Discovery Mode.

Cancel Discovery Mode from Adaptor Pulldown

The same can be done using the YOUnite API:

PATCH /zones/<zone-uuid>/adaptors/<adaptor-uuid>
{
  "discoveryMode": false,
  "changeVersion": <changeVersion>
}
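For scripted use, the two API calls on this page (creating an adaptor with discoveryMode set to true, and later setting it to false) can be prepared with the Python standard library. This is a sketch under assumptions: `BASE_URL` is a placeholder, authentication is omitted, the POST body carries only the two fields shown on this page (the elided fields still need to be supplied), and the type of changeVersion is assumed to be whatever the API last returned for the adaptor. The functions only build the requests; sending them is left to the caller.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the address of your YOUnite API server.
BASE_URL = "https://younite.example.com/api"


def build_request(method: str, path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the assumed base URL."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )


def create_adaptor_in_discovery(zone_uuid: str, name: str) -> urllib.request.Request:
    """POST a new adaptor with discovery mode on (other required fields omitted)."""
    return build_request("POST", f"/zones/{zone_uuid}/adaptors",
                         {"name": name, "discoveryMode": True})


def cancel_discovery(zone_uuid: str, adaptor_uuid: str,
                     change_version) -> urllib.request.Request:
    """PATCH the adaptor to leave discovery mode, passing its current changeVersion."""
    return build_request("PATCH", f"/zones/{zone_uuid}/adaptors/{adaptor_uuid}",
                         {"discoveryMode": False, "changeVersion": change_version})
```

A request built this way could be sent with `urllib.request.urlopen(req)` once the credentials your deployment requires have been added to the headers.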