Welcome to S3 Downloader’s documentation!

S3 Downloader

Main Module

class s3.dumper.Downloader(bucket_name: str, download_dir: str | None = None, region_name: str | None = None, profile_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, logger: ~logging.Logger | None = None, log_type: ~s3.logger.LogType = LogType.stdout, sort: ~s3.squire.Sort = Sort.no_sort, prefix: str | ~typing.List[str] | None = None, retry_config: ~botocore.config.Config = <botocore.config.Config object>, transfer_config: ~boto3.s3.transfer.TransferConfig = <boto3.s3.transfer.TransferConfig object>)

Instantiates a Downloader object to download an entire S3 bucket.

>>> Downloader

Initializes all the necessary arguments and creates a boto3 session with retry logic.

Parameters:
  • bucket_name – Name of the bucket.

  • download_dir – Name of the download directory. Defaults to bucket name.

  • region_name – Name of the AWS region.

  • profile_name – AWS profile name.

  • aws_access_key_id – AWS access key ID.

  • aws_secret_access_key – AWS secret access key.

  • logger – Bring your own logger.

  • log_type – Type of logging output. Defaults to stdout.

  • sort – Sorting options for the files to be downloaded. Defaults to no_sort.

  • prefix – Specific path, or list of paths, from which the objects have to be downloaded.

  • retry_config – Custom retry configuration for boto3 client. Defaults to RETRY_CONFIG.

  • transfer_config – Custom transfer configuration for boto3 client. Defaults to TRANSFER_CONFIG.

Warning

  • The default sort option is no_sort which uses the default lexicographical order by object key.

  • Bucket objects are fetched using bucket.objects.all() which is paginated under the hood.

  • Sorting will pull everything into memory. This may be expensive for very large buckets.
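The warning above can be illustrated with a hedged sketch of what the sort options imply: every object is materialized in memory first, then sorted by the chosen attribute. The `Obj` type and `sort_objects` helper below are hypothetical illustrations, not the library's actual code.

```python
from datetime import datetime
from operator import attrgetter
from typing import List, NamedTuple


class Obj(NamedTuple):
    """Hypothetical stand-in for an S3 object record."""
    key: str
    size: int
    last_modified: datetime


def sort_objects(objects: List[Obj], sort: str) -> List[Obj]:
    """Sort a fully materialized object list per the Sort enum's options.

    Sketch only: note the whole list must already be in memory,
    which is why sorting very large buckets can be expensive.
    """
    if sort == "no_sort":
        # Lexicographical order by object key, as returned by S3.
        return objects
    field = sort.removesuffix("_desc")
    return sorted(objects, key=attrgetter(field), reverse=sort.endswith("_desc"))
```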

RETRY_CONFIG: Config = <botocore.config.Config object>
TRANSFER_CONFIG: TransferConfig = <boto3.s3.transfer.TransferConfig object>
init() None

Instantiates the bucket object.

Raises:
  • ValueError – If no bucket name was passed.

  • BucketNotFound – If bucket name was not found.

exit() None

Logs if there were any failures.

get_objects() List[S3Object]

Gets all the objects in the target S3 bucket.

Returns:

List of objects in the bucket.

Return type:

List[S3Object]

downloader(s3_object: S3Object, callback: ProgressPercentage) None

Downloads the file to the same relative path as in the bucket.

Parameters:
  • s3_object – Takes the S3Object as an argument.

  • callback – Takes the ProgressPercentage callback to track download progress.

See also

  • Checks if the file already exists and is of the same size to avoid redundant downloads.
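The redundancy check described above can be sketched as a simple existence-and-size comparison before downloading. The helper name `should_download` is a hypothetical illustration of the behaviour, not the library's internal function.

```python
import os


def should_download(local_path: str, remote_size: int) -> bool:
    """Return True unless a local file of the same size already exists.

    Sketch of the skip logic: a size match is treated as "already
    downloaded" (no checksum comparison is performed).
    """
    return not (os.path.isfile(local_path) and os.path.getsize(local_path) == remote_size)
```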

get_downloads() List[S3Object]

Filters out the objects that are not files and cannot be downloaded.

Returns:

List of objects that can be downloaded.

Return type:

List[S3Object]

run() None

Initiates bucket download in a traditional loop.

run_in_parallel(threads: int = 5) None

Initiates bucket download using multi-threading.

Parameters:

threads – Number of threads to use for downloading.
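A minimal sketch of what multi-threaded downloading looks like with the standard library's `concurrent.futures`; the `download_one` callable is stubbed and the library's internals may differ. The returned counts mirror the `DownloadResults` idea.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, Tuple


def download_all(keys: Iterable[str],
                 download_one: Callable[[str], None],
                 threads: int = 5) -> Tuple[int, int]:
    """Run download_one for every key across a pool of worker threads.

    Returns a (success, failed) pair; a worker that raises counts as
    a failure, everything else as a success.
    """
    success = failed = 0
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = {pool.submit(download_one, key): key for key in keys}
        for future in as_completed(futures):
            try:
                future.result()
                success += 1
            except Exception:
                failed += 1
    return success, failed
```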

get_bucket_structure(raw: bool = False) str | Dict[str, int]

Gets all the objects in an S3 bucket and forms them into a hierarchical, folder-like representation.

Returns:

Returns a hierarchical, folder-like representation of the chosen bucket, or the raw mapping of objects if raw is True.

Return type:

Union[str, Dict[str, int]]

save_bucket_structure(filename: str = 'bucket_structure.json', convert_size: bool = False) None

Saves the bucket structure in a JSON file.

Parameters:
  • filename – Name of the file to save the bucket structure in.

  • convert_size – Whether to convert the size into human-readable format or not.

print_bucket_structure() None

Prints all the objects in an S3 bucket with a folder-like representation.

Exceptions

Module to store all the custom exceptions and formatters.

>>> S3Error
exception s3.exceptions.S3Error

Custom error for base exception to the s3-downloader module.

exception s3.exceptions.BucketNotFound

Custom error for bucket not found.

exception s3.exceptions.NoObjectFound

Custom error for no objects found.

exception s3.exceptions.InvalidPrefix(prefix: str, bucket_name: str)

Custom exception for invalid prefix value.

Initializes an instance of the InvalidPrefix object, inherited from S3Error.

Parameters:
  • prefix – Prefix to limit the objects.

  • bucket_name – Name of the S3 bucket.

format_error_message()

Returns the formatted error message as a string.

Progress

class s3.progress.ProgressPercentage(filename: str, size: int, bar: alive_bar)

Tracks the file transfer progress in S3 and updates the alive_bar.

>>> ProgressPercentage

Initializes the progress tracker.

Parameters:
  • filename – Name of the file being transferred.

  • size – Total size of the file in bytes.

  • bar – alive_bar instance to update progress.
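The callback protocol boto3 expects here is simply a callable that receives the byte count of each transferred chunk. Below is a hedged, thread-safe sketch of such a tracker with a plain counter standing in for the `alive_bar` instance; the class name and `percentage` property are illustrative assumptions.

```python
import threading


class ProgressTracker:
    """Minimal stand-in for a ProgressPercentage-style callback.

    boto3 invokes the callback once per transferred chunk with the
    chunk's size in bytes; we accumulate the total under a lock since
    transfers may use multiple threads.
    """

    def __init__(self, filename: str, size: int) -> None:
        self.filename = filename
        self.size = size
        self.seen = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_amount: int) -> None:
        with self._lock:
            self.seen += bytes_amount

    @property
    def percentage(self) -> float:
        """Fraction of the file transferred so far, as a percentage."""
        return (self.seen / self.size) * 100 if self.size else 100.0
```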

Squire

s3.squire.refine_prefix(prefix: str | List[str] | None = None) Generator[str]

Refines the prefix input to ensure it is a list of strings.

Parameters:

prefix – A string or a list of strings representing the prefix(es) to filter S3 objects.

Yields:

str – Yields strings representing the refined prefix(es).
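The normalization described above can be sketched as a small generator that accepts a string, a list of strings, or None; this is an assumed behaviour based on the signature, not the library's exact code.

```python
from typing import Generator, List, Optional, Union


def refine(prefix: Optional[Union[str, List[str]]] = None) -> Generator[str, None, None]:
    """Yield each prefix as a string, whatever shape the input takes.

    None yields nothing; a bare string yields itself; a list is
    yielded element by element.
    """
    if prefix is None:
        return
    if isinstance(prefix, str):
        yield prefix
    else:
        yield from prefix
```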

s3.squire.size_converter(byte_size: int | float) str

Converts the given byte size into a human-friendly format.

Parameters:

byte_size – Byte size to convert.

Returns:

Human-readable size string.

Return type:

str
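A conversion like this is typically a log-base-1024 lookup into a unit table. The sketch below shows one common way to implement it; the exact unit labels and precision the library uses are assumptions.

```python
import math


def human_size(byte_size: float) -> str:
    """Convert a byte count into a human-friendly string.

    Picks the largest unit where the value stays >= 1, using
    1024-based (binary) steps.
    """
    if byte_size <= 0:
        return "0 B"
    units = ("B", "KB", "MB", "GB", "TB", "PB")
    index = min(int(math.log(byte_size, 1024)), len(units) - 1)
    return f"{byte_size / 1024 ** index:.2f} {units[index]}"
```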

s3.squire.convert_to_folder_structure(sequence: Dict[str, int]) str

Convert objects in an S3 bucket into a folder-like representation including sizes.

Parameters:

sequence – A dictionary where keys are S3 object keys (paths) and values are their sizes in bytes.

Returns:

A string representing the folder structure of the S3 bucket, with each file and folder showing the size.

Return type:

str
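One way to picture this conversion, as a hedged sketch: nest the flat `"a/b/c"` keys into a dict-of-dicts on `"/"`, then render the nesting as an indented listing. The helper names and output layout below are illustrative, not the library's actual format.

```python
from typing import Any, Dict


def to_tree(sequence: Dict[str, int]) -> Dict[str, Any]:
    """Nest flat "a/b/c"-style keys into a dict of dicts, with sizes at the leaves."""
    tree: Dict[str, Any] = {}
    for key, size in sequence.items():
        node = tree
        *folders, leaf = key.split("/")
        for folder in folders:
            node = node.setdefault(folder, {})
        node[leaf] = size
    return tree


def render(tree: Dict[str, Any], indent: int = 0) -> str:
    """Render the nested dict as an indented, folder-like listing."""
    lines = []
    for name, value in tree.items():
        if isinstance(value, dict):
            lines.append(" " * indent + name + "/")
            lines.append(render(value, indent + 2))
        else:
            lines.append(" " * indent + f"{name} ({value} bytes)")
    return "\n".join(lines)
```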

s3.squire.format_bucket_structure(bucket_structure: Dict[str, int], convert_size: bool) Dict[str, Any]

Formats the bucket structure into a dictionary, optionally converting sizes to a human-readable format.

Parameters:
  • bucket_structure – A dictionary where keys are S3 object keys (paths) and values are their sizes in bytes.

  • convert_size – A boolean indicating whether to convert sizes to human-readable format.

Returns:

A dictionary representing the folder structure of the S3 bucket, with each file and folder showing the size.

Return type:

Dict[str, Any]

class s3.squire.S3Object(key: str, size: int)

Represents an S3 object with its key and size.

key: str
size: int
class s3.squire.DownloadResults

Object to store results of S3 download.

>>> DownloadResults
success: int = 0
failed: int = 0
skipped: int = 0
class s3.squire.Sort(value)

Enum to represent sorting options for S3 objects.

>>> Sort
size: str = 'size'
size_desc: str = 'size_desc'
key: str = 'key'
key_desc: str = 'key_desc'
last_modified: str = 'last_modified'
last_modified_desc: str = 'last_modified_desc'
no_sort: str = 'no_sort'

Logger

Loads a default logger with StreamHandler set to DEBUG mode.

>>> logging.Logger
class s3.logger.LogType(value)

Defines the type of logging output.

>>> LogType
file: str = 'file'
stdout: str = 'stdout'
s3.logger.default_handler(log_type: LogType) StreamHandler | FileHandler

Creates a handler and assigns a default format to it.

Parameters:

log_type – An instance of the LogType enum to specify the type of logging output.

Returns:

Returns an instance of either StreamHandler or FileHandler based on the specified log type.

Return type:

Union[logging.StreamHandler, logging.FileHandler]

s3.logger.default_format() Formatter

Creates a logging Formatter with a custom message and datetime format.

Returns:

Returns an instance of the Formatter object.

Return type:

logging.Formatter

s3.logger.default_logger(log_type: LogType) Logger

Creates a default logger with debug mode enabled.

Parameters:

log_type – An instance of the LogType enum to specify the type of logging output.

Returns:

Returns an instance of the Logger object.

Return type:

logging.Logger
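A logger factory along these lines can be assembled entirely from the standard `logging` module: pick a handler based on the log type, attach a formatter, and enable DEBUG. The sketch below captures that shape; the logger name, format string, and log filename are assumptions, not the library's actual defaults.

```python
import logging


def make_logger(log_type: str = "stdout", filename: str = "s3.log") -> logging.Logger:
    """Build a DEBUG-level logger with a stream or file handler.

    "file" attaches a FileHandler writing to ``filename``; anything
    else falls back to a StreamHandler.
    """
    logger = logging.getLogger(f"s3-sketch-{log_type}")
    logger.setLevel(logging.DEBUG)
    handler: logging.Handler
    if log_type == "file":
        handler = logging.FileHandler(filename)
    else:
        handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(
        fmt="%(asctime)s - %(levelname)s - %(message)s",
        datefmt="%b-%d-%Y %H:%M:%S",
    ))
    logger.addHandler(handler)
    return logger
```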
