adp.ingest.sources.staging.StagingEntitySource
- class adp.ingest.sources.staging.StagingEntitySource(entity: EntityType, ingest: Ingest)
Defines a Source for data present in the staging directory in ADLS.
Methods
__init__(entity, ingest)
all_files_in_staging([abfs_path]): Retrieves all files in the staging directory for the given entity
files_matching_patterns(input_paths, pattern): Retrieves all files corresponding to the pattern string
get_data(run_metadata): Retrieves the data for the staging sources
remove_temp_files(): Removes the temporary files
unzip_files(): Unzips all files in the staging folder
upload_to_staging(): Uploads data to the staging directory
Attributes
path: Retrieves the path for the staging folder
source
- all_files_in_staging(abfs_path=False) → List[str]
Retrieves all files in the staging directory for the given entity.
- Returns:
List of ADLS file paths for all the files in the staging directory
- Return type:
List[str]
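To illustrate the `abfs_path` flag, the sketch below uses a local directory as a stand-in for the staging directory. It assumes, which the description above does not confirm, that `abfs_path=True` returns fully qualified `abfs://` URIs while the default returns paths relative to the staging root. The function name `list_staging_files`, the container name, and the storage account name are hypothetical and not part of `adp`.

```python
import os
import tempfile
from typing import List


def list_staging_files(
    root: str,
    abfs_path: bool = False,
    container: str = "staging",        # hypothetical container name
    account: str = "exampleaccount",   # hypothetical storage account
) -> List[str]:
    """List every file under `root`, a local stand-in for the staging directory."""
    relative_paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Normalise to forward slashes, as used in ADLS paths.
            relative_paths.append(os.path.relpath(full, root).replace(os.sep, "/"))
    relative_paths.sort()
    if not abfs_path:
        return relative_paths
    # Assumption: abfs_path=True means "return fully qualified ABFS URIs".
    prefix = f"abfs://{container}@{account}.dfs.core.windows.net/"
    return [prefix + p for p in relative_paths]


# Build a small throwaway directory tree and list it both ways.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "2024"))
    for rel in ("a.csv", os.path.join("2024", "b.csv")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("id\n1\n")
    plain_paths = list_staging_files(root)
    abfs_paths = list_staging_files(root, abfs_path=True)
```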
- files_matching_patterns(input_paths: List[str], pattern: str, anti_pattern: str | None = None) → List[str]
Retrieves all files corresponding to the pattern string.
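The matching semantics are not spelled out here, so the following is only a minimal sketch of one plausible reading: `pattern` is treated as a regular expression that a path must match, and `anti_pattern`, when given, as a regular expression that excludes a path. The standalone function below mirrors the documented signature but is a hypothetical re-implementation, not the `adp` source; the real method may use glob-style patterns instead.

```python
import re
from typing import List, Optional


def files_matching_patterns(
    input_paths: List[str],
    pattern: str,
    anti_pattern: Optional[str] = None,
) -> List[str]:
    """Keep paths that match `pattern` and do not match `anti_pattern`."""
    include = re.compile(pattern)
    # Assumption: anti_pattern is an exclusion filter applied after the match.
    exclude = re.compile(anti_pattern) if anti_pattern is not None else None
    return [
        path
        for path in input_paths
        if include.search(path)
        and not (exclude is not None and exclude.search(path))
    ]


paths = [
    "staging/orders_2024.csv",
    "staging/orders_2024.csv.tmp",
    "staging/readme.txt",
]
# Without an anti-pattern, the temporary file also matches.
matched = files_matching_patterns(paths, r"orders_\d{4}\.csv")
# With an anti-pattern, paths ending in .tmp are dropped.
filtered = files_matching_patterns(paths, r"orders_\d{4}\.csv", anti_pattern=r"\.tmp$")
```

Using `re.search` rather than `re.fullmatch` lets the pattern match anywhere in the path, which is why the `.tmp` file is only excluded once the anti-pattern is supplied.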
- get_data(run_metadata: IngestRunMetadata) → DataFrame | None
Retrieves the data for the staging sources
- property path: str
Retrieves the path for the staging folder
- remove_temp_files()
Removes the temporary files
- unzip_files()
Unzips all files in the staging folder
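As a rough illustration of what unzipping a staging folder involves, the sketch below extracts every zip archive in a local temporary folder using the standard `zipfile` module. The local folder is a stand-in for ADLS, the helper name `unzip_all` is hypothetical, and whether the real method keeps or deletes the archives afterwards is not stated here; this sketch leaves them in place.

```python
import os
import tempfile
import zipfile
from typing import List


def unzip_all(folder: str) -> List[str]:
    """Extract every zip archive directly inside `folder`; return extracted names."""
    extracted = []
    # The listing is taken once, so freshly extracted files are not revisited.
    for name in sorted(os.listdir(folder)):
        archive = os.path.join(folder, name)
        if not zipfile.is_zipfile(archive):
            continue  # skip non-archives and subdirectories
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(folder)
            extracted.extend(zf.namelist())
    return extracted


# Create one archive in a throwaway folder, then extract it.
with tempfile.TemporaryDirectory() as folder:
    with zipfile.ZipFile(os.path.join(folder, "batch.zip"), "w") as zf:
        zf.writestr("orders.csv", "id\n1\n")
    extracted = unzip_all(folder)
    remaining = sorted(os.listdir(folder))
```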
- upload_to_staging()
Uploads data to the staging directory