adp.ingest.datatypes.escalate_datatypes
- adp.ingest.datatypes.escalate_datatypes(transform_func, table_name: str, snapshot: DataFrame, type_escalation_mode: TypeEscalationModeEnum, can_rewrite_history=False) -> Tuple[DataFrame, DataFrame]
Apply the escalation patterns as defined by data_strict and data_loose.
Performs the following steps for datatype escalation:
1. Get the columns that are present in both the new snapshot and the bronze/silver DataFrame. We call these the "common" columns.
2. Calculate the escalation target datatype for each common column, using data_strict or data_loose depending on the type_escalation_mode.
3. Cast the new snapshot and/or the Delta table to the escalation target. (The cast is not executed at this point, because Spark evaluates lazily.)
4. Rewrite history (if applicable).
5. Run the transform_func. This is an append for bronze and a merge for silver.
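The first three steps can be sketched without Spark by modelling each schema as a plain mapping from column name to type name. This is only an illustration: the widening order below and the helper names escalation_target and plan_escalation are assumptions made for the example, not the actual data_strict or data_loose rules, which are not shown on this page.

```python
from typing import Dict, Tuple

# Hypothetical widening order, narrowest to widest. This is an assumption for
# illustration; the real data_strict / data_loose rules may differ.
WIDENING_ORDER = ["int", "bigint", "double", "string"]


def escalation_target(left: str, right: str) -> str:
    """Return the wider of two type names under the assumed ordering."""
    return max(left, right, key=WIDENING_ORDER.index)


def plan_escalation(
    snapshot_schema: Dict[str, str],
    table_schema: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Plan the casts for the snapshot and for the existing table.

    Returns two mappings of column name to target type: one for the new
    snapshot and one for the bronze/silver table. Nothing is cast here;
    the result is only a plan, mirroring the lazy nature of step 3.
    """
    # Step 1: only columns present on both sides are considered.
    common = snapshot_schema.keys() & table_schema.keys()

    snapshot_casts: Dict[str, str] = {}
    table_casts: Dict[str, str] = {}
    for column in sorted(common):
        # Step 2: compute the escalation target for this column.
        target = escalation_target(snapshot_schema[column], table_schema[column])
        # Step 3: record a cast for whichever side is narrower than the target.
        if snapshot_schema[column] != target:
            snapshot_casts[column] = target
        if table_schema[column] != target:
            # Casting the existing table means rewriting history (step 4).
            table_casts[column] = target
    return snapshot_casts, table_casts
```

With a snapshot schema of {"id": "int", "amount": "double"} and a table schema of {"id": "bigint", "amount": "int"}, the plan casts id in the snapshot to bigint and amount in the table to double. Only the second cast touches existing data.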
- Parameters:
delta_path (str) – The path to the bronze/silver table in the datalake, for example: abfss://sdp@nubulosdpdlsENV01.dfs.core.windows.net/unit_tests/datatypes/all_types.delta
snapshot (DataFrame) – The new dataset to be appended/merged with the current bronze/silver table
type_escalation_mode (TypeEscalationModeEnum) – The escalation mode to use. It determines whether data_strict or data_loose is applied.
can_rewrite_history (bool, optional) – Whether rewriting bronze/silver is allowed. Defaults to False, as rewriting history interferes with the table_changes functionality.
- Raises:
CannotModifyDataTypeForHistory – Raised when can_rewrite_history is False and history would have to be rewritten to make the two DataFrames compatible.
- Returns:
The converted snapshot DataFrame and the converted bronze/silver DataFrame, in that order.
- Return type:
(DataFrame, DataFrame)
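The interaction between can_rewrite_history and the documented exception can be sketched as a small guard. This is a minimal illustration under stated assumptions: the exception class below is a local stand-in that only shares its name with the one listed under Raises, and check_history_rewrite and its table_casts argument (a mapping of column name to target type for the existing table) are hypothetical names, not part of adp.ingest.datatypes.

```python
from typing import Dict


class CannotModifyDataTypeForHistory(Exception):
    """Local stand-in for the exception named in the Raises section."""


def check_history_rewrite(
    table_casts: Dict[str, str],
    can_rewrite_history: bool = False,
) -> bool:
    """Return True if history must be rewritten and rewriting is allowed.

    Raises the stand-in exception when the existing table needs casts but
    can_rewrite_history is False. Returns False when no casts are needed.
    """
    if table_casts and not can_rewrite_history:
        columns = ", ".join(
            f"{name} -> {target}" for name, target in sorted(table_casts.items())
        )
        raise CannotModifyDataTypeForHistory(
            f"History rewrite required but not allowed for: {columns}"
        )
    return bool(table_casts)
```

When only the snapshot needs casting, table_casts is empty and the guard returns False regardless of the flag, so the default of False blocks only the case where existing data would change.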