返回

利用宏在 bigquery 上的多个模式上不再在 dbt 中删除旧关系

发布时间:2022-07-25 03:31:20 262

来源和灵感:清理仓库中的旧款和弃用款

尝试将以下 dbt 宏转换为做两件事:

  1. 在大查询上运行
  2. 从我的多个目标模式中删除“旧”关系

我的模式查询的输出如下:

SELECT schema_name FROM `my-project.region-us.INFORMATION_SCHEMA.SCHEMATA` order by schema_name desc;

schema_name
dbt_dev
dbt_dev_stage
dbt_dev_mart
dbt_dev_analytics
dbt_prod
dbt_prod_stage
dbt_prod_mart
dbt_prod_analytics
etc...

和我的;“调整后”;宏类似于:

{% macro drop_old_relations(dryrun=False) %}

{% if execute %}
  {% set current_models=[] %}

  {% for node in graph.nodes.values()
     | selectattr("resource_type", "in", ["model", "seed", "snapshot"])%}
    {% do current_models.append(node.name) %}

  {% endfor %}
{% endif %}

{% set cleanup_query %}

      WITH MODELS_TO_DROP AS (
          SELECT
            CASE 
              WHEN TABLE_TYPE = 'BASE TABLE' THEN 'TABLE'
              WHEN TABLE_TYPE = 'VIEW' THEN 'VIEW'
            END AS RELATION_TYPE,
            CONCAT( TABLE_CATALOG,".",{{ target.schema }},".", TABLE_NAME) AS RELATION_NAME
          FROM 
            {{ target.database }}.{{ target.schema }}.INFORMATION_SCHEMA.TABLES
          WHERE TABLE_SCHEMA = {{ target.schema }}
            AND TABLE_NAME NOT IN
              ({%- for model in current_models -%}
                  '{{ model.upper() }}'
                  {%- if not loop.last -%}
                      ,
                  {% endif %}
              {%- endfor -%})) 
      SELECT 
        'DROP ' || RELATION_TYPE || ' ' || RELATION_NAME || ';' as DROP_COMMANDS
      FROM 
        MODELS_TO_DROP
  {% endset %}

{% do log(cleanup_query, info=True) %}
{% set drop_commands = run_query(cleanup_query).columns[0].values() %}

{% if drop_commands %}
  {% if dryrun | as_bool == False %}
    {% do log('Executing DROP commands...', True) %}
  {% else %}
    {% do log('Printing DROP commands...', True) %}
  {% endif %}
  {% for drop_command in drop_commands %}
    {% do log(drop_command, True) %}
    {% if dryrun | as_bool == False %}
      {% do run_query(drop_command) %}
    {% endif %}
  {% endfor %}
{% else %}
  {% do log('No relations to clean.', True) %}
{% endif %}

{%- endmacro -%}

我目前遇到了宏无法识别我的一些目标模式的问题:

dbt run-operation drop_old_relations --args "{dryrun: True}"

Encountered an error while running operation: Database Error
  Unrecognized name: dbt_dev at [14:32]

或者我很乐意采用模式参数之类的方法,然后作为运行钩子在模式上迭代,比如:

dbt run-operation drop_old_relations --args "{schema: dbt_dev_mart, dryrun: True}"

on-run-start:
 - "{% for schema in schemas%}drop_old_relations({{ schema }},False);{% endfor%}"
特别声明:以上内容(图片及文字)均为互联网收集或者用户上传发布,本站仅提供信息存储服务!如有侵权或有涉及法律问题请联系我们。
举报
评论区(1)
按点赞数排序
用户头像
相关帖子