The Root Admin, the Data Governance Steward, and any other stakeholders can follow the steps in this guide to quickly build out a YOUnite Data Fabric.
A more comprehensive outline can be found in the YOUnite Adoption Plan.
Prerequisites
-
Understand the topics covered in An Introduction to YOUnite.
-
Access to a working YOUnite deployment. See Local Deployment Quick Start or, if deploying in AWS, see Deploying YOUnite as a Kubernetes Cluster in AWS.
The Operational Side of YOUnite: Zones
One of YOUnite’s key design solutions is the ability to group an organization’s data by the organization’s structure (e.g. divisions, departments, districts, schools, etc.) and create relationships between these groups. YOUnite calls these groupings zones. An example might be found in a college school system. For instance, YOUnite could mirror a college system’s structure where the top-level zone is the Chancellor’s Office, with college district zones in the middle and individual college zones underneath.
Zones contain users, groups, roles, permissions, adaptors, logs, ACLs, and other resources. Zones are associated with each other in a hierarchical structure with parent, sibling, and child zones. In the example of the college school district, the schools within that district might be considered child zones of the district (parent zone).
Zone Users
When a zone is created, two distinct users are defined:
-
Zone Admin
-
Zone Data Steward
The Zone Admin controls the operational aspects of the zone while the Zone Data Steward controls the data. See Managed Roles for a more comprehensive overview of these two roles.
More about the responsibilities of the Zone Admin and the Zone Data Steward can be found in Zones.
For this quick start, we can reuse the admin and dgs users for all of the zones.
For example, suppose we want to create the following zone structure:
Zone Parent | Zone Name | Description | Admin | Data Steward | Geo Coords (optional) |
---|---|---|---|---|---|
none | Root | The root zone | admin | dgs | n/a |
Root | College District | College District Zone | admin | dgs | 90, 45 |
College District | West College | West College Zone | admin | dgs | 50.85,4.35 |
College District | Central College | Central College Zone | admin | dgs | 37.56,122.32 |
College District | East College | East College Zone | admin | dgs | 35.67,139.65 |
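The parent and child relationships in the table above can be sketched as a small tree. This is only an illustration of the hierarchy, not part of YOUnite: the names `ZONE_ROWS`, `build_children`, and `ancestors` are hypothetical, and the rows are copied from the example table.

```python
from collections import defaultdict

# (parent, name) pairs copied from the example table; "none" marks the root.
ZONE_ROWS = [
    ("none", "Root"),
    ("Root", "College District"),
    ("College District", "West College"),
    ("College District", "Central College"),
    ("College District", "East College"),
]


def build_children(rows):
    """Map each parent zone name to the list of its child zone names."""
    children = defaultdict(list)
    for parent, name in rows:
        if parent != "none":
            children[parent].append(name)
    return dict(children)


def ancestors(rows, zone):
    """Return the chain of parent zones from `zone` up to the root."""
    parent_of = {name: parent for parent, name in rows}
    chain = []
    while parent_of.get(zone, "none") != "none":
        zone = parent_of[zone]
        chain.append(zone)
    return chain


children = build_children(ZONE_ROWS)
print(children["College District"])
print(ancestors(ZONE_ROWS, "East College"))
```

Writing the structure down this way before opening the UI makes it easy to check that every zone has exactly one parent and that the hierarchy has a single root.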
Use a spreadsheet or the following to create a list of your intended zone structure:
Zone Parent | Zone Name | Description | Admin | Data Steward | longitude (optional) | latitude (optional) |
---|---|---|---|---|---|---|
none | Root | The root zone | admin | dgs | none | none |
Root | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
To create the zones, the admin logs into the YOUnite UI and selects Zones from the landing page:
Select ADD ZONE. This will create a child of the current zone:
Because a zone admin creates the zones, the admin and dgs (Data Governance Steward) users of the current zone become the Zone Admin and Zone Data Steward for the new zone. New users can be added as zone admins and data stewards to the new zone, and the zone admin and data stewards of the parent zone can then be removed, giving the new zone’s admins and data stewards full control of the resources in the zone.
Once admin or dgs logs in, they can manage multiple zones by using the pulldown at the top of the screen:
Data Domains
Data domains represent the data models used to synchronize source entities between source systems.
The first questions stakeholders must answer are:
-
What data needs to be synchronized, i.e. what data domains need to be created?
-
What source systems do the source entities reside on?
-
What properties exist in the source systems that can be used to prevent duplicate records in the YOUnite data fabric? These properties are part of the data domain and are called DR Keys.
Once these are answered, data domain consensus between the stakeholders can begin.
For example:
Data Domain | Source Systems it Resides on | DR Keys |
---|---|---|
customer | CRM (Oracle), ERP (MySQL), MIS (Postgres), Accounting (DB2), Distribution (SQL Server), Customer Support (Mongo) | cust_id |
department | ERP (MySQL), MIS (Postgres), Accounting (DB2) | dept_id, geo_id |
employee | CRM (Oracle), ERP (MySQL), MIS (Postgres), Accounting (DB2) | emp_id |
employee_department | ERP (MySQL), MIS (Postgres), Accounting (DB2) | dept_id |
product | CRM (Oracle), ERP (MySQL), Customer Support (Mongo) | product_id |
change_order | CRM (Oracle), ERP (MySQL), Accounting (DB2), Distribution (SQL Server), Customer Support (Mongo) | product_id, order_id |
sale | CRM (Oracle), ERP (MySQL), Accounting (DB2), Distribution (SQL Server), Customer Support (Mongo) | cust_id, rep_id, order_id |
Just seven domains are listed above. Some deployments start with just a few data domains while others start with dozens or more.
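To make the role of DR Keys concrete, here is a minimal sketch of how a tuple of key values can flag records in different systems that describe the same entity. It is not YOUnite's matching logic; `DR_KEYS`, `dr_key`, `find_duplicates`, and the sample records are assumptions made for illustration, with the key names taken from the table above.

```python
# DR keys per data domain, taken from the example table.
DR_KEYS = {
    "customer": ("cust_id",),
    "department": ("dept_id", "geo_id"),
    "sale": ("cust_id", "rep_id", "order_id"),
}


def dr_key(domain, record):
    """Build the tuple of DR key values that identifies a record."""
    return tuple(record[prop] for prop in DR_KEYS[domain])


def find_duplicates(domain, records):
    """Group source systems whose records share the same DR key values."""
    seen = {}
    for system, record in records:
        seen.setdefault(dr_key(domain, record), []).append(system)
    return {key: systems for key, systems in seen.items() if len(systems) > 1}


# Hypothetical sample records, each tagged with the system it came from.
records = [
    ("CRM", {"cust_id": 17, "name": "Ada"}),
    ("ERP", {"cust_id": 17, "name": "Ada L."}),
    ("MIS", {"cust_id": 42, "name": "Grace"}),
]
duplicates = find_duplicates("customer", records)
print(duplicates)
```

The CRM and ERP records differ in their `name` value but share the same `cust_id`, so they are treated as one customer rather than two.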
Most data analysts begin their data domain definitions with Data Domain Mapping Spreadsheets, with one or more spreadsheets for each data domain (see below for a simple example).
Starting Data Domain Mapping Spreadsheets now will make the process of data mapping and defining adaptor configurations easier later on.
More on data domains can be found in the following guides:
At this point, data domains can be created in the YOUnite UI but it is best to create data domains after studying how data maps between source systems using adaptors.
Adaptors: The Key to Federated Data Discovery, Cataloging and Synchronization
With Federated Data Management, adaptors are the interface between an organization’s disjointed systems.
Adaptors perform CRUD (create, read, update, and delete) operations on the data while the YOUnite Server performs the task of distributing (routing) the data between the systems.
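The division of labor described above can be sketched in a few lines. This is a toy model, not the YOUnite Adaptor SDK or server: `InMemoryAdaptor` and `Router` are hypothetical classes, a dict stands in for each source system, and the router simply forwards an operation to every adaptor other than the one where it originated.

```python
class InMemoryAdaptor:
    """Hypothetical adaptor: CRUD against a dict standing in for a source system."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def create(self, key, record):
        self.store[key] = dict(record)

    def read(self, key):
        return self.store.get(key)

    def update(self, key, changes):
        self.store.setdefault(key, {}).update(changes)

    def delete(self, key):
        self.store.pop(key, None)


class Router:
    """Stands in for the server: forwards an operation to all other adaptors."""

    def __init__(self, adaptors):
        self.adaptors = list(adaptors)

    def route(self, origin, operation, key, payload=None):
        delivered = []
        for adaptor in self.adaptors:
            if adaptor is origin:
                continue  # the originating system already has the change
            if operation == "create":
                adaptor.create(key, payload)
            elif operation == "update":
                adaptor.update(key, payload)
            elif operation == "delete":
                adaptor.delete(key)
            else:
                raise ValueError(f"unknown operation: {operation}")
            delivered.append(adaptor.name)
        return delivered


crm, erp, mis = (InMemoryAdaptor(n) for n in ("CRM", "ERP", "MIS"))
router = Router([crm, erp, mis])
crm.create(17, {"city": "Oakland"})
delivered = router.route(crm, "create", 17, {"city": "Oakland"})
print(delivered, erp.read(17))
```

In a real deployment the routing decision also depends on zones and governance rules; the sketch leaves those out to keep the adaptor and router roles visible.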
Users can either:
-
Develop their own adaptors. See the Adaptor SDK Summary.
-
Use YOUnite’s off-the-shelf adaptors. Contact a YOUnite representative for a list of data systems that off-the-shelf adaptors connect to.
Note: YOUnite’s off-the-shelf adaptors connect to many different types of source systems. If no off-the-shelf adaptor exists for a system, one can be developed using the YOUnite Adaptor SDK.
At this point the data analysts begin the mapping process between source systems for a given data domain using their Data Domain Mapping Spreadsheets.
In this example, the organization’s CRM, ERP, and MIS systems need to be integrated with YOUnite through the adaptors.
Extending the example above, each adaptor belongs to a zone, as in the image below. The envelopes represent data records in the form of data events. A data event may originate at an adaptor in Zone-A and be routed to adaptors in other zones, or it may be sent in response to a data event originating at another adaptor or to an API consumer’s request to the YOUnite Server.
The Data Domain Mapping Spreadsheet for this domain would look similar to this:
-
Data Domain: Customer:v1 (Customer, version 1)
-
DrKeyProperties: customerId
-
Source Systems:
Property | CRM | ERP | MIS |
---|---|---|---|
city | city | primary.city | city |
zip | postal_code | primary.zip | zipcode |
firstName | first_name | name.first | fname |
lastName | last_name | name.last | lname |
birthDate | birth_date | n/a | bday |
cust_primary_email | email.work | n/a | |
salesRepId | rep_id | n/a | rep_id |
phone | cust_primary_phone | phone.mobile | phone |
state | province | primary.state | state |
address | bill_address | primary.address | primary-address |
customerId | customer_id | id | cust.id |
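A mapping like the one in the table can be applied mechanically. The sketch below translates a CRM row and an ERP document into the domain's property names. It is not an adaptor configuration format: `CRM_MAP`, `ERP_MAP`, `get_path`, and `to_domain` are hypothetical, only a subset of the table is used, and treating dotted ERP names such as `name.first` as nested fields is an assumption.

```python
# Domain property -> source field, copied from the CRM and ERP columns above.
CRM_MAP = {
    "firstName": "first_name",
    "lastName": "last_name",
    "zip": "postal_code",
    "customerId": "customer_id",
}
ERP_MAP = {
    "firstName": "name.first",
    "lastName": "name.last",
    "zip": "primary.zip",
    "customerId": "id",
}


def get_path(record, path):
    """Follow a dotted path through nested dicts; return None if absent."""
    value = record
    for part in path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def to_domain(record, mapping):
    """Rename a source record's fields to the data domain's property names."""
    result = {}
    for prop, path in mapping.items():
        value = get_path(record, path)
        if value is not None:
            result[prop] = value
    return result


# Hypothetical source records for the same customer.
crm_row = {"first_name": "Ada", "last_name": "Lovelace",
           "postal_code": "94401", "customer_id": 17}
erp_doc = {"id": 17, "name": {"first": "Ada", "last": "Lovelace"},
           "primary": {"zip": "94401"}}

print(to_domain(crm_row, CRM_MAP))
print(to_domain(erp_doc, ERP_MAP))
```

Both calls produce the same domain record, which is the point of the spreadsheet: once every system's fields are mapped to one set of property names, records become comparable across systems.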
Once this process is completed:
-
The data domain (e.g. Customer version 1) can be created. It is best practice to create the domain as part of the Root zone so that its definition can be controlled by the Data Governance Steward as opposed to specific Zone Data Stewards.
-
The adaptors can be installed and configured.
-
Source entities in the source systems can be linked to YOUnite. See The Data Catalog - Linking YOUnite Data Records to Entities in Source Systems.
This is just a quick overview of the process and more on Adaptors and mapping to data domains can be found in the following guides:
-
Configuring Adaptors
-
A more detailed description of adaptors can be found on the Adaptors page.
-
Instructions for creating and managing adaptors can be found on the Managing Adaptors page.
-
The Adaptor SDK Summary page covers the Java SDK, how to develop an adaptor, and how applications can leverage YOUnite.
Next Steps
The YOUnite Data Fabric is complete once zones, users and data domains are defined and adaptors are developed and deployed.
YOUnite Layers is an abstraction that makes it easy to see how the services in the data fabric work together:
Layer 1 - Data Synchronization
On this page we have covered creating the first layer by creating zones, data domains and adaptors. This provides real-time federated data synchronization.
At this point, federated data management is in place without the legacy of batch processing and / or point-to-point integrations.
Layer 2 - Data Governance
The next step is for the zone data stewards to define data governance between zones and adaptors.
Data governance defines what data a zone chooses to share with and receive from other zones and adaptors (outbound and inbound ACLs). For example, the HR Zone may choose to restrict changes it receives (inbound) from a system that is part of the Manufacturing zone. It may also restrict changes it receives from an entire zone representing a spin-off subsidiary and all its systems. Finally, the HR zone may apply governance ACLs to its own system, preventing personal information from being shared outside of the system (outbound).
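The HR examples above can be modeled as an ordered rule list, much like a firewall. The sketch below is an assumption about how such rules could be evaluated, not YOUnite's ACL model: `AclRule`, `evaluate`, the first-match-wins order, and the default of "allow" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    direction: str  # "inbound" or "outbound"
    zone: str       # the other zone involved, or "*" for any zone
    domain: str     # data domain name, or "*" for any domain
    action: str     # "allow" or "deny"


def evaluate(rules, direction, zone, domain, default="allow"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if (rule.direction == direction
                and rule.zone in ("*", zone)
                and rule.domain in ("*", domain)):
            return rule.action
    return default


# Hypothetical rules for the HR zone, following the examples in the text.
HR_RULES = [
    AclRule("inbound", "Manufacturing", "employee", "deny"),
    AclRule("outbound", "*", "employee", "deny"),
]

print(evaluate(HR_RULES, "inbound", "Manufacturing", "employee"))
print(evaluate(HR_RULES, "inbound", "Finance", "employee"))
```

With these rules the HR zone rejects employee changes arriving from Manufacturing, still accepts them from other zones, and never sends employee data out.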
See the Governance guide and the Governance page for more. To visit the YOUnite UI Governance page, select the icon from the landing page:
From here a zone data steward can create inbound and outbound ACLs that work similarly to firewall rules, but for data:
Layer 3 - Data Virtualization/Federated Data Access, Global Delete, Notifications and Federated Data Event Tracing
Now that federated data management is in place, there are some very powerful features for zone data stewards to tap into.
If source entities are mapped to YOUnite, then you can make requests similar to the following examples.
Data Virtualization (Federated GET)
Users and applications can make requests and gain data access through the YOUnite API and the YOUnite UI. YOUnite becomes a virtual operational data store when accessing data records through the API or when using the YOUnite UI; this data virtualization action is known as Federated GET.
Users can request that YOUnite retrieve data records from various source systems and assemble them into one federated data record. The assembled record reflects the user’s requirements, based on which systems they consider the best source of truth, and honors the data governance rules that apply to the request.
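The assembly step can be illustrated with a short sketch. It assumes one simple strategy, namely taking each property from the most preferred system that has a value for it and skipping properties that governance blocks. YOUnite's actual assembly rules may differ, and `assemble`, `preference`, and `blocked` are hypothetical names.

```python
def assemble(sources, preference, blocked=()):
    """Merge per-system records into one record, honoring a preference order.

    sources:    {system name: {property: value}}
    preference: system names, most trusted first
    blocked:    properties that governance rules withhold from this request
    """
    assembled, provenance = {}, {}
    for system in preference:
        for prop, value in sources.get(system, {}).items():
            if prop in blocked or prop in assembled or value is None:
                continue
            assembled[prop] = value
            provenance[prop] = system  # remember where each value came from
    return assembled, provenance


# Hypothetical records for one customer, already mapped to domain properties.
sources = {
    "CRM": {"firstName": "Ada", "phone": None},
    "ERP": {"firstName": "A.", "phone": "555-0100", "birthDate": "1815-12-10"},
}
record, provenance = assemble(sources, ["CRM", "ERP"], blocked=("birthDate",))
print(record)
print(provenance)
```

Here the CRM wins for `firstName`, the ERP fills in the `phone` value the CRM lacks, and `birthDate` is withheld. Tracking provenance alongside the record makes it possible to show which system supplied each value.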
To make a federated GET request through the YOUnite UI, navigate to the Data Access page:
Select the disk icon for one of the data records:
The assembled record is displayed:
More about the request can be found by selecting the Assembled Meta Data tab.
From the Assembled Meta Data tab, requests for source entities on specific source systems can be made and compared with other source entities in other source systems:
A zone data steward can select the preferred source systems for a given data domain by selecting Gold & Silver Adaptors:
More on accessing data records and gold & silver adaptors is in the following guides:
Global DELETE
Through a single interface, YOUnite gives data stewards the ability to forget data across source systems in the YOUnite Data Fabric. This is especially valuable for data compliance, for example when the organization must comply with a customer’s or user’s "right to be forgotten."
From the Data Access page, select the trash can icon:
The following dialog is displayed:
Note: Removing a specific data record (e.g. customer) from the entire data fabric should be done with great care. Data governance policies can be put in place to make sure data privacy laws are being followed and to reduce the complexity of removing data when a customer asks to be forgotten.
Single Platform for Data Change Notification
Notifications allow events to be generated and delivered to legacy applications to trigger business logic. If, for example, an employee is promoted, and the HR system updates the employee’s record, the adaptor attached to the HR system can detect the change and pass it on to YOUnite where it can then notify other systems that have registered interest in employee status changes.
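The "registered interest" idea in the employee example is a publish and subscribe pattern. The sketch below shows that pattern in isolation. It is not the YOUnite notification API: `NotificationHub`, `register`, and `publish` are hypothetical names, and delivery is a plain function call.

```python
from collections import defaultdict


class NotificationHub:
    """Hypothetical hub: delivers events to callbacks that registered interest."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def register(self, domain, event_type, callback):
        """Register interest in one event type for one data domain."""
        self._subscribers[(domain, event_type)].append(callback)

    def publish(self, domain, event_type, event):
        """Deliver the event and return how many subscribers received it."""
        callbacks = self._subscribers.get((domain, event_type), [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)


hub = NotificationHub()
payroll_inbox = []

# A payroll application registers interest in employee status changes.
hub.register("employee", "status_changed", payroll_inbox.append)

# The HR adaptor detects a promotion and passes the change on.
notified = hub.publish("employee", "status_changed",
                       {"emp_id": 7, "status": "promoted"})
ignored = hub.publish("customer", "updated", {"cust_id": 17})
print(notified, ignored, payroll_inbox)
```

Only systems that registered for a given domain and event type are notified, so legacy applications receive the triggers they need without polling the HR system.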
Trace Data Lineage
All data events that flow through the YOUnite Data Fabric are logged and can be traced providing data compliance officers the ability to see when a record was created, updated, routed to other systems, retrieved or deleted.
Note: Data events are logged but the data itself is not included in the log entry.
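As a rough illustration of logging an event without its data, the sketch below keeps only metadata fields and then reads a record's lineage back from the log. The field names, `to_log_entry`, and `lineage` are assumptions and do not reflect YOUnite's actual log schema.

```python
# Metadata fields kept in the log; the event's "data" field is deliberately omitted.
LOGGED_FIELDS = ("timestamp", "record_id", "action", "system")


def to_log_entry(event):
    """Copy only the metadata fields of a data event into a log entry."""
    return {field: event[field] for field in LOGGED_FIELDS}


def lineage(log, record_id):
    """Return the (action, system) history of one record, oldest first."""
    entries = [e for e in log if e["record_id"] == record_id]
    entries.sort(key=lambda e: e["timestamp"])
    return [(e["action"], e["system"]) for e in entries]


# Hypothetical data events; timestamps are ISO 8601 strings, which sort correctly.
events = [
    {"timestamp": "2024-01-01T10:00:00Z", "record_id": "c-17", "action": "created",
     "system": "CRM", "data": {"firstName": "Ada"}},
    {"timestamp": "2024-01-01T10:00:01Z", "record_id": "c-17", "action": "routed",
     "system": "ERP", "data": {"firstName": "Ada"}},
    {"timestamp": "2024-01-02T09:30:00Z", "record_id": "c-42", "action": "created",
     "system": "MIS", "data": {"firstName": "Grace"}},
]
log = [to_log_entry(event) for event in events]
print(lineage(log, "c-17"))
```

Because the log holds only identifiers, actions, systems, and times, a compliance officer can reconstruct what happened to a record without the log becoming another copy of the data.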
Select the lineage icon for a federated record to get a quick view of a data record’s lineage:
A more complete view can be found by viewing the logs through the Elastic Kibana interface. To view the logs, select the LOGS icon from the landing page:
The Kibana Discover dashboard is presented:
Kibana uses Lucene’s query syntax in the search bar.
See the Logging guide for more on searching the logs for specific events.