adp.ingest.sources.staging.StagingEntitySource

class adp.ingest.sources.staging.StagingEntitySource(entity: EntityType, ingest: Ingest)

Defines a Source for data present in the staging directory in ADLS.

__init__(entity: EntityType, ingest: Ingest)

Methods

__init__(entity, ingest)

all_files_in_staging([abfs_path])

Retrieves all files in the staging directory for the given entity

files_matching_patterns(input_paths, pattern)

Retrieve all files corresponding to the pattern string

get_data(run_metadata)

Retrieves the data for the staging sources

remove_temp_files()

Removes the temporary files

unzip_files()

Unzips all files in the staging folder

upload_to_staging()

Upload data to the staging directory

Attributes

path

Retrieves the path for the staging folder

source

all_files_in_staging(abfs_path=False) -> List[str]

Retrieves all files in the staging directory for the given entity

Returns:

List of ADLS file paths for all the files in the staging directory

Return type:

List[str]

files_matching_patterns(input_paths: List[str], pattern: str, anti_pattern: str | None = None) -> List[str]

Retrieves all files in input_paths corresponding to the pattern string
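The page does not say how pattern and anti_pattern are interpreted, so the following is only a minimal sketch of one plausible reading: both are treated as regular expressions, a path is kept when it matches pattern, and it is dropped again when it also matches anti_pattern. The function name filter_paths_sketch and the sample paths are hypothetical and are not part of adp; the real method may use glob matching or different rules.

```python
import re
from typing import List, Optional


def filter_paths_sketch(
    input_paths: List[str],
    pattern: str,
    anti_pattern: Optional[str] = None,
) -> List[str]:
    """Keep paths that match `pattern` and do not match `anti_pattern`.

    Assumption: both arguments are regular expressions searched anywhere
    in the path. This is an illustration, not the adp implementation.
    """
    include = re.compile(pattern)
    exclude = re.compile(anti_pattern) if anti_pattern is not None else None
    return [
        p
        for p in input_paths
        if include.search(p) and not (exclude is not None and exclude.search(p))
    ]


# Hypothetical staging paths, used only to exercise the sketch.
paths = [
    "staging/customer/2024-01.csv",
    "staging/customer/2024-01.csv.tmp",
    "staging/customer/readme.txt",
]
matched = filter_paths_sketch(paths, r"\.csv", anti_pattern=r"\.tmp$")
matched_no_exclusion = filter_paths_sketch(paths, r"\.csv")
print(matched)
```

Under this reading, leaving anti_pattern as None disables the exclusion step, which is consistent with its default value in the signature above.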

get_data(run_metadata: IngestRunMetadata) -> DataFrame | None

Retrieves the data for the staging sources

property path: str

Retrieves the path for the staging folder

remove_temp_files()

Removes the temporary files

unzip_files()

Unzips all files in the staging folder
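As a rough illustration of what unzipping every archive in a folder involves, the sketch below uses the standard-library zipfile module on a local temporary directory that stands in for the staging folder. This is an assumption made for illustration: the actual method works against ADLS and may handle other archive types, nested folders, or cleanup differently. The helper name unzip_all_sketch is hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def unzip_all_sketch(folder: Path) -> List[Path]:
    """Extract every .zip archive found directly in `folder` into `folder`.

    Assumption: a local directory stands in for the ADLS staging folder.
    Returns the paths of the extracted members.
    """
    extracted: List[Path] = []
    for archive in sorted(folder.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(folder)
            extracted.extend(folder / name for name in zf.namelist())
    return extracted


with tempfile.TemporaryDirectory() as tmp:
    staging = Path(tmp)
    # Build a small archive so the sketch has something to extract.
    with zipfile.ZipFile(staging / "batch.zip", "w") as zf:
        zf.writestr("orders.csv", "id,amount\n1,10\n")
    extracted_names = [p.name for p in unzip_all_sketch(staging)]
    extracted_text = (staging / "orders.csv").read_text()
```

The archive itself is left in place here; whether the real method deletes archives after extraction is not stated on this page.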

upload_to_staging()

Upload data to the staging directory