We have deprecated the TRANSPOSE node in the course of the launch of the new Y42 Modern DataOps platform. The reason for us to deprecate this function was that the TRANSPOSE function does neither natively exist in Google BigQuery nor in Snowflake. To offer this functionality we have custom-built the TRANSPOSE node. The better, more stable and more efficient alternative is the PIVOT node, which has the same functionalities.
The UI Model compiles each step (each node) to a so-called CTE (“WITH” statement) - which BigQuery may potentially optimize less than an equivalent statement only built with inline SELECT statements. So technically, you can achieve higher performance with SQL models, but the difference should not be substantial/even noticeable in the vast majority of cases. If SQL Model statements are not written well, there is no guarantee of good performance either. There are also ways to optimize your UI model by only selecting columns you need in the FIELDS and OUTPUT node as well as minimizing the usage of table manipulation nodes.
Start from the Output table where the duplicates occur and move backward, node by node, to identify in which node the duplicates are created. You can do this by checking the row numbers of the Input preview and Output preview of each node.
Usually, duplicates are created in JOINs when you have duplicated values in the JOIN key. To ensure that the granularity of your tables stays remained, make sure that for example in your LEFT JOIN the right table does not have any duplicates in the JOIN KEY column.
Updated 10 months ago