Welcome to Git2S3’s documentation!¶
Git2S3 - Main¶
- class git2s3.main.Git2S3(env_file: str | os.PathLike = '.env', logger: Logger = None, max_per_page: int = 100)¶
Instantiates Git2S3 object to clone all repos/wiki/gists from GitHub and upload to S3.
>>> Git2S3
- Keyword Arguments:
env_file – Environment configuration.
logger – Bring your own logger object.
max_per_page – Maximum number of repos to fetch per page.
- profile_type() str ¶
Get the profile type.
- Returns:
Returns the profile type.
- Return type:
str
- cli(cmd: str, fail: bool = True, retry: bool = False) int ¶
Runs CLI commands.
- Parameters:
cmd – Command to run.
fail – Boolean flag to fail on errors.
- Returns:
Return code after running the command.
- Return type:
int
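A minimal sketch of how a helper with this signature might wrap a shell command using the standard library. The name run_cli, the use of subprocess, and the meaning of retry (one extra attempt) are assumptions for illustration, not the library's implementation.

```python
import subprocess


def run_cli(cmd: str, fail: bool = True, retry: bool = False) -> int:
    """Run a shell command and return its exit code (hypothetical helper)."""
    attempts = 2 if retry else 1  # assumption: retry means one extra attempt
    returncode = 0
    for _ in range(attempts):
        returncode = subprocess.run(cmd, shell=True).returncode
        if returncode == 0:
            break
    if returncode != 0 and fail:
        # Assumption: a failing command raises when fail=True
        raise RuntimeError(f"{cmd!r} exited with code {returncode}")
    return returncode
```

With fail=False the caller receives the non-zero return code and decides how to handle it.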
- get_all(source: SourceControl) Generator[Dict[str, str]] ¶
Iterate through a target owner/organization to get all available repositories/gists.
- Parameters:
source – Source type to clone.
- Yields:
Generator[Dict[str, str]] – Yields a dictionary of each repo’s information.
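Page-by-page iteration of this kind can be sketched as a generator. Here fetch_page is a hypothetical stand-in for the HTTP request to the source-control API, and stopping on a short page is an assumption about how the end of the listing is detected.

```python
from typing import Callable, Dict, Generator, List

# Hypothetical callable: (page number, page size) -> items on that page
PageFetcher = Callable[[int, int], List[Dict[str, str]]]


def iterate_pages(
    fetch_page: PageFetcher, max_per_page: int = 100
) -> Generator[Dict[str, str], None, None]:
    """Yield every item across pages until a short (or empty) page is seen."""
    page = 1
    while True:
        items = fetch_page(page, max_per_page)
        yield from items
        if len(items) < max_per_page:
            break  # a page that is not full is taken to be the last one
        page += 1
```

Because the function is a generator, each page is requested only when the consumer has exhausted the previous one.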
- set_pat(url: Union[str, Url]) Optional[Union[str, Url]] ¶
Creates an authenticated URL by updating the netloc, and sets that as the origin URL.
- Parameters:
url – Takes the repository/gist/wiki URL as input.
See also
- This step is not required for:
Public repositories/gists/wiki
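The netloc rewrite described above can be illustrated with urllib.parse. The helper name with_token and the token@host form are assumptions for illustration; setting the result as the origin URL is not shown.

```python
from urllib.parse import urlsplit, urlunsplit


def with_token(url: str, token: str) -> str:
    """Return the URL with the token placed in the netloc (hypothetical helper)."""
    parts = urlsplit(url)
    # Drop any userinfo already present so the token is not duplicated
    host = parts.netloc.rsplit("@", 1)[-1]
    return urlunsplit(parts._replace(netloc=f"{token}@{host}"))


print(with_token("https://github.com/owner/repo.git", "TOKEN"))
# prints: https://TOKEN@github.com/owner/repo.git
```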
- clone_wiki(datastore: DataStore) None ¶
Clone all the wikis from the repository.
- Parameters:
datastore – DataStore model to store repository/gist information.
- worker(repo: Dict[str, str]) None ¶
Clones repository/gist/wiki from GitHub.
- Parameters:
repo – Repository information as JSON payload.
- Raises:
Exception – If the thread fails to clone the repository.
- cloner(source: SourceControl) bool ¶
Clones all the repos/gists concurrently.
- Parameters:
source – Source type to clone.
See also
Clones all the repos/gists concurrently using ThreadPoolExecutor.
GitHub doesn’t have a rate limit for cloning, so multi-threading is safe.
This approach requires Git to be installed on the host machine.
References
https://github.com/orgs/community/discussions/44515
- Returns:
Returns a boolean flag to indicate if any of the threads failed.
- Return type:
bool
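A minimal sketch of the concurrent pattern described above, using ThreadPoolExecutor. The name run_concurrently, the injected worker callable, and the convention that True means every task succeeded are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def run_concurrently(
    worker: Callable[[Dict[str, str]], None],
    repos: Iterable[Dict[str, str]],
    max_workers: int = 4,
) -> bool:
    """Run worker once per repo in a thread pool; True if nothing raised."""
    all_ok = True
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its repo so failures can be reported by name
        futures = {executor.submit(worker, repo): repo for repo in repos}
        for future in as_completed(futures):
            try:
                future.result()  # re-raises any exception from the worker
            except Exception as error:
                all_ok = False
                print(f"failed: {futures[future].get('name')}: {error}")
    return all_ok
```

Collecting results with as_completed lets one failing clone be recorded without cancelling the remaining tasks.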
- start() None ¶
Start the cloning process and upload to S3 once cloning completes successfully.
S3¶
- class git2s3.s3.Uploader(env: EnvConfig, logger: Logger)¶
Concurrent uploader object to upload files to S3.
>>> Uploader
- Keyword Arguments:
env – Environment configuration.
logger – Logger object.
- upload_file(local_file_path: str | os.PathLike, s3_file_path: str | os.PathLike) None ¶
Uploads an object to S3.
- Parameters:
local_file_path – Local file path to upload from.
s3_file_path – S3 file path to upload to.
- trigger() int ¶
Trigger to upload all file objects concurrently to S3.
- Returns:
Returns the number of files that failed to upload.
- Return type:
int
Squire¶
- git2s3.squire.archer(destination: str) None ¶
Archives a given directory and deletes it while retaining the zipfile.
- Parameters:
destination – Directory path to be archived.
- Raises:
AssertionError – If zipfile is not present after archiving.
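The archive-then-delete behavior can be sketched with shutil. The name archive_and_remove and the returned path are assumptions for illustration; the documented function returns None.

```python
import os
import shutil


def archive_and_remove(destination: str) -> str:
    """Zip a directory next to itself, then delete the directory."""
    destination = os.path.normpath(destination)  # avoid a trailing separator
    # Creates "<destination>.zip" containing the directory's contents
    zip_path = shutil.make_archive(destination, "zip", root_dir=destination)
    assert os.path.isfile(zip_path), f"{zip_path} was not created"
    # Remove the source only after the archive is confirmed to exist
    shutil.rmtree(destination)
    return zip_path
```

Checking for the zipfile before removing the directory keeps the source data intact if archiving fails.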
- git2s3.squire.env_loader(filename: str | os.PathLike) EnvConfig ¶
Loads environment variables based on filetypes.
- Parameters:
filename – Filename from where env vars have to be loaded.
- Returns:
Returns a reference to the EnvConfig object.
- Return type:
EnvConfig
- git2s3.squire.source_detector(repo: Dict[str, str], env: EnvConfig) DataStore ¶
Detects the type of source to clone and returns the DataStore model.
- Parameters:
repo – Repository information as a dict.
env – Environment configuration.
- Returns:
DataStore model.
- Return type:
DataStore
- git2s3.squire.default_logger(env: EnvConfig) Logger ¶
Generates a default console logger.
- Parameters:
env – Environment configuration.
- Returns:
Logger object.
- Return type:
logging.Logger
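A console logger of this kind might be built as follows. The name build_console_logger, the debug flag, and the message format are assumptions for illustration; the documented function takes an EnvConfig instead.

```python
import logging


def build_console_logger(name: str = "git2s3-sketch", debug: bool = False) -> logging.Logger:
    """Return a logger that writes to the console (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger
```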
- git2s3.squire.check_file_presence(source_dir: str | os.PathLike) int ¶
Get a list of all subdirectories and check for file presence.
- Parameters:
source_dir – Root directory to check for file presence.
- Returns:
Returns the total number of zip files cloned.
- Return type:
int
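Counting zip files under a root directory can be sketched with os.walk. The name count_zip_files and the case-insensitive extension match are assumptions for illustration.

```python
import os


def count_zip_files(source_dir: str) -> int:
    """Walk source_dir recursively and count files ending in .zip."""
    total = 0
    for _root, _dirs, files in os.walk(source_dir):
        total += sum(1 for name in files if name.lower().endswith(".zip"))
    return total
```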
Configuration¶
- class git2s3.config.DataStore(BaseModel)¶
DataStore model to store repository/gist information.
>>> DataStore
- source: SourceControl¶
- clone_url: Url¶
- name: str¶
- description: Optional[str]¶
- private: bool¶
- class git2s3.config.EnvConfig(BaseSettings)¶
Configure all env vars and validate using pydantic.
>>> EnvConfig
- git_api_url: Url¶
- git_owner: str¶
- git_token: str¶
- git_ignore: List[str]¶
- incomplete_upload: bool¶
- source: Union[SourceControl, List[SourceControl]]¶
- log: LogOptions¶
- debug: bool¶
- local_store: bool¶
- aws_profile_name: str | None¶
- aws_access_key_id: str | None¶
- aws_secret_access_key: str | None¶
- aws_region_name: str | None¶
- aws_bucket_name: str¶
- aws_s3_prefix: str¶
- boto3_retry_attempts: int¶
- boto3_retry_mode: Boto3RetryMode¶
- classmethod from_env_file(filename: Path) EnvConfig ¶
Create an instance of EnvConfig from environment file.
- Parameters:
filename – Name of the env file.
See also
Loading environment variables from files is an additional feature.
Both the system’s and session’s env vars are processed by default.
- Returns:
Loads the EnvConfig model.
- Return type:
EnvConfig
- classmethod parse_source(value: Union[SourceControl, List[SourceControl]]) Path ¶
Validate and parse ‘source’ to remove ‘all’ from the source option.
- classmethod parse_git_api_url(value: Url) str ¶
Parse git_api_url, stripping the / at the end.
- classmethod parse_git_ignore(value: List[str]) List[str] ¶
Convert all git_ignore values to lowercase.
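The two normalizations described for git_api_url and git_ignore can be sketched as plain functions. These are illustrative stand-ins with assumed names, written without pydantic, and they take plain strings rather than the Url type.

```python
from typing import List


def strip_trailing_slash(value: str) -> str:
    """Remove any trailing '/' from an API URL."""
    return value.rstrip("/")


def lowercase_all(values: List[str]) -> List[str]:
    """Lowercase every entry so later comparisons are case-insensitive."""
    return [value.lower() for value in values]


print(strip_trailing_slash("https://api.github.com/"))
# prints: https://api.github.com
print(lowercase_all(["Docs", "LEGACY-repo"]))
# prints: ['docs', 'legacy-repo']
```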
- class git2s3.config.LogOptions(StrEnum)¶
Available log options for default logger.
>>> LogOptions
- stdout: str = 'stdout'¶
- file: str = 'file'¶
Exceptions¶
- exception git2s3.exc.DirectoryExists¶
Warning: Raised when clone directory already exists.
- exception git2s3.exc.UnsupportedSource¶
Warning: Raised when source is not supported.
- exception git2s3.exc.Git2S3Error¶
Exception: Base class for all exceptions.
- exception git2s3.exc.GitHubAPIError¶
Exception: Raised when fetching repositories from source control fails.
- exception git2s3.exc.InvalidOwner¶
Exception: Raised when owner is invalid.
- exception git2s3.exc.InvalidSource¶
Exception: Raised when source is invalid.
- exception git2s3.exc.ArchiveError¶
Exception: Raised when archiving repositories fails.
- exception git2s3.exc.UploadError¶
Exception: Raised when uploading file objects to S3 fails.