adp.ingest.ingest.preprocess

adp.ingest.ingest.preprocess(ingest: Ingest, entity: Entity, df: DataFrame) → DataFrame | None

Preprocess the new snapshot of the data.

Applies the following steps to the dataframe:

  1. Flattens the columns in the dataframe.

  2. Filters the dataframe using the selected and excluded columns and the max_depth setting.

  3. Renames the columns to lowercase and removes special characters.
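
The three steps above can be sketched on a single nested record using plain Python dicts. This is an illustration only, not the implementation of adp.ingest.ingest.preprocess: the real function works on a DataFrame, and every helper name here (flatten, keep_column, clean_name, preprocess_record) is hypothetical. The sketch also assumes that nested fields are joined with a dot, that max_depth counts nesting levels starting at 1, and that special characters are replaced by underscores rather than dropped.

```python
import re


def flatten(record, parent="", sep="."):
    """Step 1: turn nested dicts into one level with dotted column names."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def keep_column(name, selected=None, excluded=(), max_depth=None, sep="."):
    """Step 2: decide whether a flattened column survives the filter.

    Assumption: depth is the number of dotted segments in the name.
    """
    depth = name.count(sep) + 1
    if max_depth is not None and depth > max_depth:
        return False
    if name in excluded:
        return False
    # With no explicit selection, every remaining column is kept.
    return selected is None or name in selected


def clean_name(name):
    """Step 3: lowercase and replace runs of special characters with "_"."""
    return re.sub(r"[^0-9a-z]+", "_", name.lower()).strip("_")


def preprocess_record(record, selected=None, excluded=(), max_depth=None):
    """Apply flatten, filter, and rename in that order."""
    flat = flatten(record)
    kept = {
        name: value
        for name, value in flat.items()
        if keep_column(name, selected, excluded, max_depth)
    }
    return {clean_name(name): value for name, value in kept.items()}


record = {
    "Id": 1,
    "Customer": {"Full Name": "Ada", "Address": {"Zip-Code": "1234"}},
    "Internal#Flag": True,
}
result = preprocess_record(record, excluded={"Internal#Flag"}, max_depth=2)
print(result)
```

In this sketch the filter runs on the flattened names before they are renamed, so selected and excluded entries would have to be given in their original spelling. Whether the real function matches names before or after renaming is not stated in the source.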

Parameters:
  • entity (Entity) – The entity for which to preprocess the data

  • df (DataFrame) – The data to preprocess for the entity

Returns:

The flattened, exploded and filtered dataframe used in later bronze and silver stages.

Return type:

DataFrame | None