Create your first Incremental Model
Don't forget the parentheses in your
is_incremental()
macro! Video uses the old version without parentheses.
What is Incremental modeling?
Incremental modeling is the process of importing and running only the new rows of data based on the conditions criteria. The new rows are appended in your already imported/created dataset.
Y42 provides you the ability to append new data in your pipeline in a faster and cheaper way continuously.
Why use Incremental modeling?
Let's suppose you have millions of records in your Paypal integration. You use a SQL model to model those raw data inputs and create your desired report.
Every day new records are added to your Paypal records, so you build an Orchestration and schedule it to run every day, to generate those new records in the Paypal import and then to run the query of the SQL model.
Doing this process costs because you are writing the data in your warehouse and the process takes time also.
Here is when Incremental models comes in the rescue.
By putting a condition like for example created_at
> max(created_at)
you will run the query only for the rows that satisfy the matching criteria and then append the new rows to the old dataset of the model.
How to create an incremental model?
To create an incremental model you need first to know how to create a simple SQL model, the process is the same.
Example case:
We have initially this raw input data for which we want to calculate the sum of debt grouped by day, where debt is less than 1000 and before 31/12/2022.
I. Create the SQL query in SQL models.
SelectÂ
  `created_at`,
  round(sum(`debt`), 2) as sum_debt
from {{ ref('src/Models/SQL/incremental_input/table_1') }}
where `debt` < 1000 and `created_at`< '2022-12-31'
GROUPÂ BYÂ 1
ORDERÂ BYÂ 1
The result in the SQL model will be:
II. Decide which condition criteria you will use for the incremental models.
The next two days your integration is enriched with 3 new rows:
For this example we will use the created_at
column (as you might notice it could be as well the id
column as a valid condition criteria).
III. Put the condition at the end of the SQL model query.
In our SQL model we simply add the condition {% if is_incremental() %}
, using the supported Jinja templating at the Y42 code editor, at the end of the query, and add the condition inside the tags.
Important note: When you have more than one condition (like we do in the example) you need to add
and
clause in the end of the condition criteria inside the incremental tags, and add the next conditions below the incremental tags.
{% if is_incremental() %}
    where `created_at` > (SELECT max(created_at) from {{ this }}) and
{% endif %}
The whole SQl model will look like this:
SelectÂ
  `created_at`,
  round(sum(`debt`), 2) as sum_debt
from {{ ref('src/Models/SQL/incremental_raw/table_1') }}
{% if is_incremental() %}
where `created_at` > (SELECT max(created_at) from {{ this }}) and
{% endif %}
where `debt` < 1000 and `created_at`< '2022-12-31'
GROUP BY 1
ORDER BY 1
The next time the SQL model will be triggered it will be run only on the newly created rows.
The desired output will be like this:
The incremental jinja is a replacement that get’s added when is_incremental()
reverts to true. So the table has to be imported incrementally!
You need to trigger the SQL model incrementally for it to work! If you fully trigger it, the changes we made will not have any effect.
IV. Commit the model
For the changes to take effect as always we need to commit the changes we made.
The first import will always be a Full import!
V. Trigger the SQL model Incrementally.
Now that the changes are loaded you can try the Incremental modeling.
You can trigger the SQL modeling manually or with an orchestration.
- Triggering the SQL model manually
To trigger the SQL model manually you need to go to the Jobs section of the model and simply click the Incremental import button.
- Triggering the SQL model via an Orchestration
You can trigger an incremental model via an Orchestration. To set up an orchestration read this article.
The only thing to have in consideration while setting up the orchestration is that it has to be set as an incremental import!
Common questions and answers for Incremental modelling
In the incremental modelling, are we only doing append or are we also doing merge right now?
Answer: We Append only at the moment, but will add merge the next quarter!
When I switch on incremental modelling on a node, is it enough or do I need to also write is_incremental()
in my sql?
is_incremental()
in my sql?Answer: Need to write is incremental
in SQL model.
How do I do incremental ui modelling?
Answer: At the moment we do not do incremental modeling in UI model (only in SQL model), but we will implement it in the future.
What if in a SQL Node, I trigger full import manually, but write is_incremental()
in the query?
is_incremental()
in the query?Answer: It will happen a full import!
What if in a SQL Node, I switch on incremental import on UI, but don’t write is_incremental()
in the query?
is_incremental()
in the query?Answer: It will happen a full import! Need to write is_incremental()
in SQL model, for incremental model to be valid!
Updated 29 days ago