Data Assets

A quick summary of what data assets are and how they shape the way Y42 works.

What is a Data Asset in Y42?

In Y42, data assets are the building blocks that make up data pipelines, as well as the byproducts and outputs those pipelines generate. Our definition therefore goes beyond tables and dashboards, which is where some data platforms stop. Concrete examples include integrations, models, orchestrations, and widgets, as well as tables and dashboards.

Defining everything as an asset helps us achieve consistent metadata across the data lifecycle, fuelling unique governance and collaboration capabilities.

The Bridge Between Code and No Code

At Y42, we combine the code and no-code paradigms through the way we save the projects you build in the platform. Your Y42 solutions are fully specified in a set of version-controlled files stored in a git repository.

When you build your project using our web interface, your changes are immediately saved in the relevant files. If you prefer working on the files instead, the web interface will reflect the changes you commit. This synchronization allows technical and non-technical users to work together, using the interface that suits their needs better.

Every asset in the platform is defined by a minimum of two files:

  • metadata.json: Holds all the relevant metadata for a specific asset. This file generally has the same structure regardless of the asset type.
  • settings.json: Holds the specific configuration of the data asset. The structure of this file depends on the particular asset type.

Some assets require additional files to specify their functionality. For example, integrations have an accounts.json file that records the authenticated accounts.
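
As an illustration of the two-file rule, the following minimal sketch checks whether a folder contains both files. It only looks at file names; the helper name missing_asset_files is hypothetical and not part of Y42.

```python
from __future__ import annotations

from pathlib import Path

# The two files that, per the text above, every asset is defined by.
REQUIRED_FILES = ("metadata.json", "settings.json")


def missing_asset_files(asset_dir: Path) -> list[str]:
    """Return the required file names that are absent from asset_dir.

    Hypothetical helper: it checks file presence only and does not
    validate the JSON inside the files.
    """
    return [name for name in REQUIRED_FILES if not (asset_dir / name).is_file()]
```

Calling the helper on an asset folder in a local clone of your repository returns an empty list when both files are present.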

Asset Types

In the following sections, we describe each asset type and how its properties can be specified using the file system. For in-depth descriptions, see the documentation section of each product area.

Integrations and Models

Integrations are responsible for bringing data from its origin into your warehouse. Models, in turn, transform that data to make it more usable and to surface the insights that you need.

These two types are grouped together because, although they serve different purposes, they work in similar ways and rely on similar specification files.

The file structure of Integrations looks as follows:

  • integration_folder: Folder created for every integration you define in the app
    • metadata.json: Defines the relevant metadata fields for the data asset
    • settings.json: Holds the general configuration for the integration
    • table_folder_x: Y42 generates a folder for holding the files relevant to each individual table that the integration generates
      • settings.json: Contains the configuration for a specific table, defining fields such as columns and schema
      • metadata.json: Defines the metadata specific to the table
    • .y42: Hidden folder that contains the most private setup information about an integration
      • accounts.json: Holds information about authenticated accounts
      • secret.enc: ID of the encrypted secret, hosted in our backend services
      • schema.json: Specifies the tables, columns, metadata, and some additional properties of the data we want to import
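
The layout above can be explored with a short script. This is a sketch under two assumptions: that every non-hidden subfolder of an integration folder is a table folder, and that the hidden folder is named exactly .y42. The function name describe_integration is hypothetical.

```python
from __future__ import annotations

from pathlib import Path

# File names taken from the layout above; their contents are not modeled here.
HIDDEN_FILES = ("accounts.json", "secret.enc", "schema.json")


def describe_integration(integration_dir: Path) -> dict[str, list[str]]:
    """Summarize an integration folder without opening any file."""
    tables = sorted(
        child.name
        for child in integration_dir.iterdir()
        # Assumption: every non-hidden subfolder is a table folder.
        if child.is_dir() and not child.name.startswith(".")
    )
    hidden_dir = integration_dir / ".y42"
    hidden = [name for name in HIDDEN_FILES if (hidden_dir / name).is_file()]
    return {"tables": tables, "hidden_files": hidden}
```

The result lists the table folders found and which of the three hidden files exist, which can help when reviewing a commit that adds or removes tables.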

For the models, the structure goes as follows:

  • model_folder: Folder created for every model you define in the app
    • metadata.json: Defines the relevant metadata fields for the data asset
    • settings.json: Holds the general configuration for the model
    • ui-model.json (If it's a UI model): Defines the set of blocks and functions that make up a UI Model and their position in the canvas
    • output_table_folder: A folder is created for each output table in the case of UI models and for every query in the case of SQL models
      • metadata.json: Defines the relevant metadata fields for the data asset
      • settings.json: Holds the general configuration for the table
      • tests.json: Definition of the tests that will be applied on the table once it's been materialized
      • sql-model.sql (if it's a SQL model): The SQL query to be run against the warehouse
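
Since the two model flavors differ only in which file carries their logic, a script can tell them apart from the file names alone. The sketch below assumes the file names listed above and searches for sql-model.sql at any depth; model_kind is a hypothetical helper, not a Y42 function.

```python
from __future__ import annotations

from pathlib import Path


def model_kind(model_dir: Path) -> str:
    """Classify a model folder as 'ui', 'sql', or 'unknown' by its files."""
    # UI models keep their block layout in ui-model.json at the top level.
    if (model_dir / "ui-model.json").is_file():
        return "ui"
    # SQL models keep the query in sql-model.sql inside an output table folder.
    if any(model_dir.rglob("sql-model.sql")):
        return "sql"
    return "unknown"
```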

Tables, Tests & Contracts

Y42 provides you with additional tools to keep data quality high and maintain a consistent data architecture across your pipelines. Both models and integrations produce materialized tables in your data warehouse. One can attach data tests and data contracts to tables to guarantee that they meet a set of agreed-upon conditions every time they're populated.

If a table has tests or a contract attached, the job consists of three steps:

  1. Table materialization: Creation of the table in your data warehouse
  2. Test execution: Configured data tests are run against the materialized table
  3. Data contract execution: The conditions defined in the data contract are run against the table

A failure in any of the three individual steps will mark the overall job as invalid. The job can be canceled at any point while the materialization step is incomplete. It is not possible to interrupt it while the tests or the data contract are running.
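
The job rules above can be sketched as a small state model. This is an illustration only, not Y42's implementation: the status label "valid", the choice to stop at the first failing step, and the names Step, run_job, and can_cancel are all assumptions.

```python
from __future__ import annotations

from enum import Enum
from typing import Callable


class Step(Enum):
    # Declared in the order in which the steps run.
    MATERIALIZATION = "table materialization"
    TESTS = "test execution"
    CONTRACT = "data contract execution"


def run_job(steps: dict[Step, Callable[[], bool]]) -> tuple[str, Step | None]:
    """Run the configured steps in order and report the job status.

    Each callable returns True on success. A step missing from `steps`
    (for example, no contract attached) is skipped.
    """
    for step in Step:
        check = steps.get(step)
        if check is None:
            continue
        if not check():
            # Any failing step marks the whole job as invalid.
            return ("invalid", step)
    return ("valid", None)


def can_cancel(current: Step) -> bool:
    """A job can only be canceled while materialization is still running."""
    return current is Step.MATERIALIZATION
```

Here run_job returns the failing step alongside the status, so a caller can tell a failed materialization from a failed test or contract.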

Exports

Exports take care of the reverse EL functionality in our platform: they let you deliver processed data right where your data users expect it.

You can think of exports as the potential destinations where you want to move your data and triggers as the actual representation of your warehouse tables within those destinations.

The file structure for Exports goes as follows:

  • export_name_folder: Folder created for every export you define in the app
    • settings.json: Holds the general configuration for the export
    • metadata.json: Defines the relevant metadata fields for the data asset
    • trigger_folder: A folder will be created to hold the information about each trigger
      • metadata.json: Defines the relevant metadata fields for the data asset
      • settings.json: Holds the general configuration for the trigger
    • .y42: Hidden folder that contains the most private setup information about an export
      • accounts.json: Holds information about authenticated accounts
      • secret.enc: ID of the encrypted secret, hosted in our backend services
      • schema.json: Specifies the tables, columns, metadata, and some additional properties of the data we want to export

Visualizations

The visualization area is responsible for helping you produce insightful and beautiful representations of your data. In Y42 visualizations, one builds widgets that can then be bundled into dashboards, each of which can have multiple tabs. Additionally, one can create themes that allow styling entire dashboards and save filters for individual tabs. You can find the comprehensive descriptions in the visualization documentation.

As there are four asset types within visualizations, the visualization folder is broken into four subfolders: Widgets, Dashboards, Themes, and Filters. Each of these assets is defined by a metadata.json and a settings.json file.

The metadata.json file defines the same pieces of information as in other product areas. However, it's worth exploring the responsibilities of settings.json for the different visualization assets:

  • Widgets: The file configures the widget type, properties, and behavior. Y42 has complete ECharts support. For further information, read the extended guide.
  • Dashboards: The file defines the theme and the order of the tabs.
  • Themes: The file specifies the theme if it's a built-in one, or the color palette if it's a custom one. Note that there is a theme folder for every existing dashboard, even if all dashboards use the same built-in theme.
  • Filters: The file specifies the field(s) to be filtered, the data type(s), the filter type, and the logic of the filter.

Note that within the folder for each dashboard, there's a folder for each tab. Tabs are an additional way to distribute information in your visualizations. You can find the elements shown in the canvas of a tab within the canvas folder. Every element in the canvas is defined in an independent JSON file named after a unique hash.
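
To see which elements each tab holds, you can walk a dashboard folder as in the sketch below. It assumes that every subfolder of the dashboard folder is a tab and that the canvas folder is named exactly canvas; both are assumptions about naming, and canvas_elements is a hypothetical helper.

```python
from __future__ import annotations

from pathlib import Path


def canvas_elements(dashboard_dir: Path) -> dict[str, list[str]]:
    """Map each tab folder name to the hashes of its canvas elements."""
    result: dict[str, list[str]] = {}
    for tab_dir in sorted(p for p in dashboard_dir.iterdir() if p.is_dir()):
        canvas = tab_dir / "canvas"  # assumed folder name
        if canvas.is_dir():
            # Each element is a JSON file named after a unique hash.
            result[tab_dir.name] = sorted(f.stem for f in canvas.glob("*.json"))
    return result
```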

Orchestrations

Orchestrations set the heartbeat of your Y42 solutions. They make sure that the components of your pipeline are automatically run such that you get fresh data where you need it, when you need it.
An orchestration schedules the runs of its components, so you don't need to trigger them manually. As a result, orchestration jobs are the only ones that can trigger the execution of other jobs.

The file structure of orchestrations is as follows:

  • orchestration_name_folder: Folder created for every orchestration you define in the app
    • metadata.json: Defines the relevant metadata fields for the data asset
    • settings.json: Specifies the integrations, models, or exports triggered by the orchestration, their schedule, their execution order, and their execution mode (incremental or full)

Alerts

Alerts notify users based on the status of the last valid job of any data asset that meets specific criteria, such as matching tags or a specific naming convention.
Soon, we aim to offer connectors that stream these alerts to your preferred communication channels, such as Slack, Teams, or webhooks.

The alerts folder is divided into two subfolders: one for connections and the other one for status alerts. As of today, only the Status Alerts folder is populated, and its items look as follows:

  • alert_name_folder: Folder created for every alert you define in the app
    • metadata.json: Defines the relevant metadata fields for the data asset
    • settings.json: Specifies the status type that the alert flags, what kind of assets are matched, and the output channel for the alert
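
The matching logic described in this section can be pictured with the sketch below. Everything in it is hypothetical: the Asset and AlertRule classes, their field names, and the use of shell-style wildcards for the naming convention are assumptions made for illustration and do not reflect the actual settings.json schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Asset:
    name: str
    last_job_status: str
    tags: frozenset[str] = frozenset()


@dataclass(frozen=True)
class AlertRule:
    status: str                          # status type the alert flags
    tags: frozenset[str] = frozenset()   # tags an asset must carry
    name_pattern: str = "*"              # naming convention, as a wildcard

    def matches(self, asset: Asset) -> bool:
        """True if the asset meets the criteria and has the flagged status."""
        return (
            asset.last_job_status == self.status
            and self.tags <= asset.tags  # all required tags are present
            and fnmatchcase(asset.name, self.name_pattern)
        )
```

A rule such as AlertRule(status="invalid", tags=frozenset({"finance"}), name_pattern="stg_*") would then match only assets tagged finance whose names start with stg_ and whose job status is invalid.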