atmo.clusters¶
The code base to manage AWS EMR clusters.
atmo.clusters.forms¶
-
class
atmo.clusters.forms.
EMRReleaseChoiceField
(*args, **kwargs)[source]¶ A
ModelChoiceField
subclass that uses EMRRelease
objects for the choices and automatically uses a “radioset” rendering – a horizontal button group for easier selection.
-
class
atmo.clusters.forms.
NewClusterForm
(*args, **kwargs)[source]¶ A form used for creating new clusters.
Parameters: - identifier (
RegexField
) – A unique identifier for your cluster, visible in the AWS management console. (Lowercase, use hyphens instead of spaces.) - size (
IntegerField
) – Number of workers to use in the cluster, between 1 and 30. For testing or development 1 is recommended. - lifetime (
IntegerField
) – Lifetime in hours after which the cluster is automatically terminated, between 2 and 24. - ssh_key (
ModelChoiceField
) – SSH key - emr_release (
EMRReleaseChoiceField
) – Different AWS EMR versions have different versions of software like Hadoop, Spark, etc. See what’s new in each.
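The constraints listed for the form fields above can be illustrated with a small standalone check. This is only a sketch in plain Python, not the form's actual implementation: the function name `validate_new_cluster` is hypothetical, and the regular expression is an assumption based on the "lowercase, use hyphens instead of spaces" hint, since the real `RegexField` pattern is not shown in this reference.

```python
import re

# Assumed pattern: lowercase letters, digits, and hyphens only.
# The real RegexField pattern is not documented here.
IDENTIFIER_RE = re.compile(r"[a-z0-9-]+")


def validate_new_cluster(identifier, size, lifetime):
    """Return a dict mapping field names to error messages (empty if valid)."""
    errors = {}
    if not IDENTIFIER_RE.fullmatch(identifier):
        errors["identifier"] = "Use lowercase letters, digits, and hyphens."
    # Documented range: between 1 and 30 workers.
    if not 1 <= size <= 30:
        errors["size"] = "Size must be between 1 and 30."
    # Documented range: between 2 and 24 hours.
    if not 2 <= lifetime <= 24:
        errors["lifetime"] = "Lifetime must be between 2 and 24 hours."
    return errors
```

For example, `validate_new_cluster("my-test-cluster", 1, 2)` returns an empty dict, while an identifier containing spaces or uppercase letters is reported under the `identifier` key.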
atmo.clusters.models¶
-
class
atmo.clusters.models.
Cluster
(id, created_at, modified_at, created_by, emr_release, identifier, size, lifetime, lifetime_extension_count, ssh_key, expires_at, started_at, ready_at, finished_at, jobflow_id, most_recent_status, master_address, expiration_mail_sent)[source]¶ Parameters: - id (
AutoField
) – Id - created_at (
DateTimeField
) – Created at - modified_at (
DateTimeField
) – Modified at - created_by_id (ForeignKey to
User
) – User that created the instance. - emr_release_id (ForeignKey to
EMRRelease
) – Different AWS EMR versions have different versions of software like Hadoop, Spark, etc. See what’s new in each. - identifier (
CharField
) – Cluster name, used to non-uniquely identify individual clusters. - size (
IntegerField
) – Number of computers used in the cluster. - lifetime (
PositiveSmallIntegerField
) – Lifetime of the cluster after which it’s automatically terminated, in hours. - lifetime_extension_count (
PositiveSmallIntegerField
) – Number of lifetime extensions. - ssh_key_id (ForeignKey to
SSHKey
) – SSH key to use when launching the cluster. - expires_at (
DateTimeField
) – Date/time that the cluster will expire and automatically be deleted. - started_at (
DateTimeField
) – Date/time when the cluster was started on AWS EMR. - ready_at (
DateTimeField
) – Date/time when the cluster was ready to run steps on AWS EMR. - finished_at (
DateTimeField
) – Date/time when the cluster was terminated or failed on AWS EMR. - jobflow_id (
CharField
) – AWS cluster/jobflow ID for the cluster, used for cluster management. - most_recent_status (
CharField
) – Most recently retrieved AWS status for the cluster. - master_address (
CharField
) – Public address of the master node. This is only available once the cluster has bootstrapped. - expiration_mail_sent (
BooleanField
) – Whether the expiration mail was sent.
-
exception
DoesNotExist
¶
-
exception
MultipleObjectsReturned
¶
-
info
¶ Returns the provisioning information for the cluster.
-
is_active
¶ Returns whether the cluster is active or not.
-
is_expiring_soon
¶ Returns whether the cluster is expiring in the next hour.
-
is_failed
¶ Returns whether the cluster has failed or not.
-
is_ready
¶ Returns whether the cluster is ready or not.
-
is_terminated
¶ Returns whether the cluster is terminated or not.
-
is_terminating
¶ Returns whether the cluster is terminating or not.
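The `is_expiring_soon` check described above can be sketched as a comparison against the `expires_at` timestamp. This is a minimal standalone illustration, not the model's code: the free function and its `now` parameter are hypothetical, and treating an already-passed expiry as "expiring soon" is an assumption.

```python
from datetime import datetime, timedelta, timezone


def is_expiring_soon(expires_at, now=None):
    """Return True if expires_at falls within the next hour.

    Assumption: a timestamp that has already passed also counts.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return expires_at <= now + timedelta(hours=1)
```

Passing `now` explicitly keeps the check deterministic in tests; in a Django model the current time would more likely come from `django.utils.timezone.now()`.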
-
class
atmo.clusters.models.
EMRRelease
(created_at, modified_at, version, changelog_url, help_text, is_active, is_experimental, is_deprecated)[source]¶ Parameters: - created_at (
DateTimeField
) – Created at - modified_at (
DateTimeField
) – Modified at - version (
CharField
) – Version - changelog_url (
TextField
) – The URL of the changelog with details about the release. - help_text (
TextField
) – Optional help text to show for users when creating a cluster. - is_active (
BooleanField
) – Whether this version should be shown to the user at all. - is_experimental (
BooleanField
) – Whether this version should be shown to users as experimental. - is_deprecated (
BooleanField
) – Whether this version should be shown to users as deprecated.
-
exception
DoesNotExist
¶
-
exception
MultipleObjectsReturned
¶
atmo.clusters.provisioners¶
-
class
atmo.clusters.provisioners.
ClusterProvisioner
[source]¶ The cluster specific provisioner.
-
info
(jobflow_id)[source]¶ Returns the cluster info for the cluster with the given Jobflow ID, with the fields start time, state, and public IP address.
-
job_flow_params
(*args, **kwargs)[source]¶ Given the parameters, returns the extended parameters for EMR job flows for on-demand clusters.
-
list
(created_after, created_before=None)[source]¶ Returns a list of cluster infos in the given time frame with the fields: - Jobflow ID - state - start time
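To make the time-frame semantics of `list` concrete, here is a hedged sketch that filters already-fetched cluster infos locally. It does not reflect how the provisioner talks to AWS: the function `filter_by_time_frame`, the dictionary keys `jobflow_id`, `state`, and `start_time`, and the inclusive boundaries are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def filter_by_time_frame(infos, created_after, created_before=None):
    """Keep the infos whose start time lies inside the given time frame.

    Assumption: both boundaries are inclusive, and a missing
    created_before means "no upper bound".
    """
    selected = []
    for info in infos:
        if info["start_time"] < created_after:
            continue
        if created_before is not None and info["start_time"] > created_before:
            continue
        selected.append(info)
    return selected


# Hypothetical sample data using the three documented fields.
sample_infos = [
    {"jobflow_id": "j-1", "state": "TERMINATED",
     "start_time": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"jobflow_id": "j-2", "state": "RUNNING",
     "start_time": datetime(2024, 1, 10, tzinfo=timezone.utc)},
]
```

Calling `filter_by_time_frame(sample_infos, datetime(2024, 1, 5, tzinfo=timezone.utc))` keeps only the second entry.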
atmo.clusters.queries¶
-
class
atmo.clusters.queries.
ClusterQuerySet
(model=None, query=None, using=None, hints=None)[source]¶ A Django queryset that filters by cluster status.
Used by the
Cluster
model.
-
class
atmo.clusters.queries.
EMRReleaseQuerySet
(model=None, query=None, using=None, hints=None)[source]¶ A Django queryset for the
EMRRelease
model.
atmo.clusters.tasks¶
-
Task:
atmo.clusters.tasks.
deactivate_clusters
¶ Deactivate clusters that have expired.
-
Task:
atmo.clusters.tasks.
send_expiration_mails
¶ Send expiration emails an hour before the cluster expires.
-
Task:
atmo.clusters.tasks.
update_clusters
¶ Update the cluster metadata from AWS for the pending clusters.
- To be used periodically.
- Won’t update state if not needed.
- Will queue updating the Cluster’s public IP address if needed.
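The three notes above can be read as a small decision procedure. The sketch below models it as a pure function over plain dictionaries so it runs without Django, Celery, or AWS. Everything in it is hypothetical: the function name `plan_cluster_updates`, the dictionary keys, and the interpretation that an address update is "needed" when `master_address` is empty.

```python
def plan_cluster_updates(pending_clusters, aws_infos):
    """Decide which pending clusters need a status or address update.

    pending_clusters: list of dicts with the keys id, jobflow_id,
        most_recent_status, and master_address.
    aws_infos: dict mapping jobflow_id to a dict with the key state.

    Returns (status_updates, address_queue), where status_updates maps
    cluster id to the new state and address_queue lists cluster ids
    whose public address still has to be looked up.
    """
    status_updates = {}
    address_queue = []
    for cluster in pending_clusters:
        info = aws_infos.get(cluster["jobflow_id"])
        if info is None:
            # AWS returned nothing for this cluster; leave it untouched.
            continue
        # Only record a state change when the state actually differs.
        if info["state"] != cluster["most_recent_status"]:
            status_updates[cluster["id"]] = info["state"]
        # Assumption: an empty master_address means a lookup is needed.
        if not cluster["master_address"]:
            address_queue.append(cluster["id"])
    return status_updates, address_queue
```

In the real task the queued entries would presumably be handed to `update_master_address`, but that wiring is not shown in this reference.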
-
Task:
atmo.clusters.tasks.
update_master_address
(cluster_id, force=False)¶ Update the public IP address for the cluster with the given cluster ID.