atmo.clusters
The code base to manage AWS EMR clusters.
atmo.clusters.forms
- class atmo.clusters.forms.EMRReleaseChoiceField(*args, **kwargs)
  A ModelChoiceField subclass that uses EMRRelease objects for the choices and automatically uses a “radioset” rendering – a horizontal button group for easier selection.
- class atmo.clusters.forms.NewClusterForm(*args, **kwargs)
  A form used for creating new clusters.
  Parameters:
  - identifier (RegexField) – A unique identifier for your cluster, visible in the AWS management console. (Lowercase, use hyphens instead of spaces.)
  - size (IntegerField) – Number of workers to use in the cluster, between 1 and 30. For testing or development, 1 is recommended.
  - lifetime (IntegerField) – Lifetime in hours after which the cluster is automatically terminated, between 2 and 24.
  - ssh_key (ModelChoiceField) – SSH key
  - emr_release (EMRReleaseChoiceField) – Different AWS EMR versions have different versions of software like Hadoop, Spark, etc. See what’s new in each.
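The constraints documented for NewClusterForm can be sketched in plain Python, outside Django. This is only an illustration: the function name `validate_new_cluster` and the identifier pattern are assumptions, since this reference does not show the actual RegexField pattern or the form's validation code.

```python
import re

# Assumed pattern: lowercase letters, digits and hyphens only. The real
# RegexField pattern is not shown in this reference.
IDENTIFIER_RE = re.compile(r"^[a-z0-9-]+$")


def validate_new_cluster(identifier, size, lifetime):
    """Return a dict mapping field names to error messages (empty if valid)."""
    errors = {}
    if not IDENTIFIER_RE.match(identifier):
        errors["identifier"] = "Use lowercase letters, digits and hyphens only."
    # Documented range: between 1 and 30 workers.
    if not 1 <= size <= 30:
        errors["size"] = "Size must be between 1 and 30."
    # Documented range: between 2 and 24 hours.
    if not 2 <= lifetime <= 24:
        errors["lifetime"] = "Lifetime must be between 2 and 24 hours."
    return errors
```

For example, `validate_new_cluster("my-test-cluster", 1, 2)` returns an empty dict, while an identifier containing spaces or uppercase letters is reported under the `identifier` key.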
atmo.clusters.models
- class atmo.clusters.models.Cluster(id, created_at, modified_at, created_by, emr_release, identifier, size, lifetime, lifetime_extension_count, ssh_key, expires_at, started_at, ready_at, finished_at, jobflow_id, most_recent_status, master_address, expiration_mail_sent)
  Parameters:
  - id (AutoField) – ID
  - created_at (DateTimeField) – Created at
  - modified_at (DateTimeField) – Modified at
  - created_by_id (ForeignKey to User) – User that created the instance.
  - emr_release_id (ForeignKey to EMRRelease) – Different AWS EMR versions have different versions of software like Hadoop, Spark, etc. See what’s new in each.
  - identifier (CharField) – Cluster name, used to non-uniquely identify individual clusters.
  - size (IntegerField) – Number of computers used in the cluster.
  - lifetime (PositiveSmallIntegerField) – Lifetime of the cluster after which it’s automatically terminated, in hours.
  - lifetime_extension_count (PositiveSmallIntegerField) – Number of lifetime extensions.
  - ssh_key_id (ForeignKey to SSHKey) – SSH key to use when launching the cluster.
  - expires_at (DateTimeField) – Date/time that the cluster will expire and automatically be deleted.
  - started_at (DateTimeField) – Date/time when the cluster was started on AWS EMR.
  - ready_at (DateTimeField) – Date/time when the cluster was ready to run steps on AWS EMR.
  - finished_at (DateTimeField) – Date/time when the cluster was terminated or failed on AWS EMR.
  - jobflow_id (CharField) – AWS cluster/jobflow ID for the cluster, used for cluster management.
  - most_recent_status (CharField) – Most recently retrieved AWS status for the cluster.
  - master_address (CharField) – Public address of the master node. This is only available once the cluster has bootstrapped.
  - expiration_mail_sent (BooleanField) – Whether the expiration mail was sent.
  - exception DoesNotExist
  - exception MultipleObjectsReturned
  - info – Returns the provisioning information for the cluster.
  - is_active – Returns whether the cluster is active or not.
  - is_expiring_soon – Returns whether the cluster is expiring in the next hour.
  - is_failed – Returns whether the cluster has failed or not.
  - is_ready – Returns whether the cluster is ready or not.
  - is_terminated – Returns whether the cluster is terminated or not.
  - is_terminating – Returns whether the cluster is terminating or not.
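The status properties of Cluster are presumably derived from most_recent_status and expires_at. A minimal sketch of that idea follows, without Django. The class `ClusterSketch` is hypothetical, and the way the EMR states are grouped into "active", "ready" and "failed" is an assumption, not the model's actual logic. For testability, `is_expiring_soon` is written here as a method taking the current time, whereas the model exposes it as a property.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# EMR cluster state names; the grouping below is an assumption.
ACTIVE = {"STARTING", "BOOTSTRAPPING", "RUNNING", "WAITING", "TERMINATING"}
READY = {"WAITING"}
FAILED = {"TERMINATED_WITH_ERRORS"}


@dataclass
class ClusterSketch:
    """Hypothetical stand-in for the Cluster model (no database)."""

    most_recent_status: str
    expires_at: datetime

    @property
    def is_active(self):
        return self.most_recent_status in ACTIVE

    @property
    def is_ready(self):
        return self.most_recent_status in READY

    @property
    def is_failed(self):
        return self.most_recent_status in FAILED

    @property
    def is_terminating(self):
        return self.most_recent_status == "TERMINATING"

    @property
    def is_terminated(self):
        return self.most_recent_status == "TERMINATED"

    def is_expiring_soon(self, now=None):
        # True when the expiry time is at most one hour after `now`.
        now = now or datetime.now(timezone.utc)
        return self.expires_at <= now + timedelta(hours=1)
```

Passing `now` explicitly keeps the check deterministic in tests; in the sketch it defaults to the current UTC time.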
- class atmo.clusters.models.EMRRelease(created_at, modified_at, version, changelog_url, help_text, is_active, is_experimental, is_deprecated)
  Parameters:
  - created_at (DateTimeField) – Created at
  - modified_at (DateTimeField) – Modified at
  - version (CharField) – Version
  - changelog_url (TextField) – The URL of the changelog with details about the release.
  - help_text (TextField) – Optional help text to show for users when creating a cluster.
  - is_active (BooleanField) – Whether this version should be shown to the user at all.
  - is_experimental (BooleanField) – Whether this version should be shown to users as experimental.
  - is_deprecated (BooleanField) – Whether this version should be shown to users as deprecated.
  - exception DoesNotExist
  - exception MultipleObjectsReturned
atmo.clusters.provisioners
- class atmo.clusters.provisioners.ClusterProvisioner
  The cluster-specific provisioner.
  - info(jobflow_id) – Returns the cluster info for the cluster with the given jobflow ID, with the fields start time, state and public IP address.
  - job_flow_params(*args, **kwargs) – Given the parameters, returns the extended parameters for EMR job flows for on-demand clusters.
  - list(created_after, created_before=None) – Returns a list of cluster infos in the given time frame, each with the fields:
    - Jobflow ID
    - state
    - start time
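The shape of the result of list() can be illustrated with an in-memory stand-in. This sketch does not call AWS: the function `list_clusters`, the input records and the dictionary keys (`jobflow_id`, `state`, `start_time`) are all hypothetical, and treating both bounds as inclusive is an assumption.

```python
from datetime import datetime


def list_clusters(summaries, created_after, created_before=None):
    """Filter cluster summaries by time frame and keep three fields.

    `summaries` is an iterable of dicts with the hypothetical keys
    "jobflow_id", "state" and "start_time".
    """
    result = []
    for summary in summaries:
        started = summary["start_time"]
        if started < created_after:
            continue
        # The upper bound is optional, mirroring created_before=None.
        if created_before is not None and started > created_before:
            continue
        result.append(
            {
                "jobflow_id": summary["jobflow_id"],
                "state": summary["state"],
                "start_time": started,
            }
        )
    return result
```

Leaving `created_before` as `None` returns every cluster started at or after `created_after`.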
atmo.clusters.queries
- class atmo.clusters.queries.ClusterQuerySet(model=None, query=None, using=None, hints=None)
  A Django queryset that filters by cluster status.
  Used by the Cluster model.
- class atmo.clusters.queries.EMRReleaseQuerySet(model=None, query=None, using=None, hints=None)
  A Django queryset for the EMRRelease model.
atmo.clusters.tasks
- Task: atmo.clusters.tasks.deactivate_clusters
  Deactivate clusters that have expired.
- Task: atmo.clusters.tasks.send_expiration_mails
  Send expiration emails an hour before the cluster expires.
- Task: atmo.clusters.tasks.update_clusters(self)
  Update the cluster metadata from AWS for the pending clusters.
  - To be used periodically.
  - Won’t update state if not needed.
  - Will queue updating the Cluster’s public IP address if needed.
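The three points above can be sketched as a plain function, without Celery or AWS. Everything here is an assumption made for illustration: the dict-based cluster records, the `fetch_state` and `queue_address_update` callables, and the reading of "if needed" as "the state changed" and "no master address is stored yet".

```python
def update_clusters(clusters, fetch_state, queue_address_update):
    """Refresh pending clusters; return the jobflow IDs whose state changed.

    clusters: list of dicts with the hypothetical keys "jobflow_id",
        "most_recent_status" and "master_address".
    fetch_state: callable returning the current state for a jobflow ID.
    queue_address_update: callable that schedules an address lookup.
    """
    changed = []
    for cluster in clusters:
        new_state = fetch_state(cluster["jobflow_id"])
        # Only write the state when it actually differs.
        if new_state != cluster["most_recent_status"]:
            cluster["most_recent_status"] = new_state
            changed.append(cluster["jobflow_id"])
        # Queue an address update only when no address is stored yet.
        if not cluster.get("master_address"):
            queue_address_update(cluster["jobflow_id"])
    return changed
```

Injecting the two callables keeps the sketch free of network access, so the periodic logic can be exercised with simple stand-ins such as a dict lookup and a list's `append`.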
- Task: atmo.clusters.tasks.update_master_address(self, cluster_id, force=False)
  Update the public IP address for the cluster with the given cluster ID.