Mozilla ATMO

Welcome to the documentation of ATMO, the code that runs Mozilla’s Telemetry Analysis Service.

ATMO is a self-service portal to launch on-demand AWS EMR clusters with Apache Spark, Apache Zeppelin and Jupyter installed. Additionally, it allows scheduling Spark jobs to run regularly based on uploaded Jupyter (and soon Zeppelin) notebooks.

It provides a management UI for the public SSH keys used when launching on-demand clusters, login via Google auth and flexible administration interfaces for users and admins.

Behind the scenes it’s shipped as Docker images and uses Python 3.6 for the web UI (Django) and the task management (Celery).

Overview

This is a quick overview of how ATMO works, both from the perspective of code structure as well as an architecture overview.

Workflows

There are a few workflows in the ATMO code base that are of particular interest and determine how it works:

  • Adding SSH keys
  • Creating an on-demand cluster
  • Scheduling a Spark job

Adding SSH keys

Creating an on-demand cluster

Scheduling a Spark job

digraph runjob {
    job [label="Run job task"];
    getrun [shape=diamond, label="Get run"];
    hasrun [shape=diamond, label="Has Run?"];
    logandreturn [shape=box, label="Log and return"];
    isenabled [shape=diamond, label="Is enabled?"];
    sync [shape=box, label="Sync run"];
    isrunnable [shape=diamond, label="Is runnable?"];
    hastimedout [shape=diamond, label="Timed out?"];
    isdue [shape=diamond, label="Is due?"];
    retryin10mins [shape=box, label="Retry in 10 mins"];
    notifyowner [shape=box, label="Notify owner"];
    terminatejob [shape=box, label="Terminate last run"];
    unschedule_and_expire [shape=box, label="Unschedule and expire"];
    provisioncluster [shape=box, label="Provision cluster"];

    job -> getrun;
    getrun -> hasrun [label="FOUND"];
    getrun -> logandreturn [label="NOT FOUND"];
    hasrun -> isenabled [label="NO"];
    hasrun -> sync [label="YES"];
    sync -> isenabled;
    isenabled -> logandreturn [label="NO"];
    isenabled -> isrunnable [label="YES"];
    isrunnable -> hastimedout [label="NO"];
    isrunnable -> isdue [label="YES"];
    hastimedout -> retryin10mins [label="NO"];
    hastimedout -> notifyowner [label="YES"];
    notifyowner -> terminatejob [label="Job ABC timed out too early..."];
    isdue -> unschedule_and_expire [label="NO"];
    isdue -> provisioncluster [label="YES"];
}
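
The same flow, expressed as a simplified Python sketch using the SparkJobRunTask helpers documented in the reference section below (the exact control flow and names in the real task may differ slightly):

import logging

logger = logging.getLogger(__name__)


def run_job(task, spark_job_pk):
    """Simplified sketch of the decision flow above; not ATMO's exact code."""
    spark_job = task.get_spark_job(spark_job_pk)     # "Get run"
    if spark_job is None:                            # NOT FOUND
        logger.warning('Spark job %s not found', spark_job_pk)
        return

    if not spark_job.has_never_run:                  # "Has Run?"
        task.sync_run(spark_job)                     # "Sync run"

    if not spark_job.is_enabled:                     # "Is enabled?"
        logger.info('Spark job %s is disabled', spark_job_pk)
        return

    if not spark_job.is_runnable:                    # "Is runnable?"
        if spark_job.has_timed_out:                  # "Timed out?"
            task.terminate_and_notify(spark_job)     # notify owner, terminate last run
        else:
            task.retry(countdown=60 * 10)            # retry in 10 mins
        return

    if spark_job.is_due:                             # "Is due?"
        task.provision_run(spark_job)                # provision cluster
    else:
        task.unschedule_and_expire(spark_job)        # unschedule and expire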

Maintenance

EMR releases

Dependency upgrades

ATMO uses a number of dependencies for both the backend as well as the frontend web UI.

For the former we’re using pip requirements.txt files to manage dependencies, which should always be pinned to an exact version and have hashes attached for the individual release files for pip’s hash-checking mode.

For the frontend dependencies we’re using NPM’s default package.json and NPM >= 5’s package-lock.json.

See below for guides to update both.

Python dependencies

Python dependencies are installed using pip during the Docker image build process. When you build the Docker image using make build, it checks whether the appropriate requirements file has changed and rebuilds the container image if needed.

To add a new Python dependency please:

  • Log into the web container with make shell.
  • Change into the requirements folder with cd /app/requirements
  • Then, depending on the area the dependency you’re about to add or update belongs to, choose one of the following files to update:
    • build.txt - dependencies installed when the Docker image is built; essentially the default requirements file.
    • docs.txt - dependencies for building the Sphinx based docs.
    • tests.txt - dependencies for running the test suite.
  • Add/update the dependency in the file you chose, including a hash for pip’s hash-checking mode. You may want to use the tool hashin to do that, e.g. hashin -r /app/requirements/docs.txt Sphinx.
  • Leave the container again with exit.
  • Run make build on the host machine.

That will rebuild the images used by docker-compose.

NPM (“front-end”) dependencies

The front-end dependencies are installed when building the Docker images just like Python dependencies.

To add a new dependency to ATMO, please:

  • Log into the web container with make shell.
  • Install the new dependency with npm install --save-exact <name>
  • Delete the temporary node_modules folder: rm -rf /app/node_modules.
  • Leave the container again with exit.
  • Run make build on the host machine
  • Extend the NPM_FILE_PATTERNS setting in the settings.py file with the files that need to be copied by Django’s collectstatic management command (see the example below).

That will rebuild the images used by docker-compose.
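
For reference, NPM_FILE_PATTERNS maps NPM package names to the files that collectstatic should pick up. A hypothetical excerpt from settings.py could look like this (the package names and glob patterns are examples only, not ATMO’s actual configuration):

# Hypothetical excerpt from settings.py; extend with the files your new
# dependency ships.
NPM_FILE_PATTERNS = {
    'jquery': ['dist/*.js'],
    'bootstrap': ['dist/css/*.css', 'dist/js/*.js', 'dist/fonts/*'],
}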

Development

ATMO is maintained on GitHub in its own repository at https://github.com/mozilla/telemetry-analysis-service.

Please clone the Git repository using the git command line tool or any other way you’re comfortable with, e.g.:

git clone https://github.com/mozilla/telemetry-analysis-service

ATMO also uses Docker for local development and deployment. Please make sure to install Docker and Docker Compose on your computer to contribute code or documentation changes.

Configuration

To set the application up, please copy the .env-dist file to one named .env and then update the variables starting with AWS_ with the appropriate values.

Set the DJANGO_SECRET_KEY variable using the output of the following command after logging into the Docker container with make shell:

python -c "import secrets; print(secrets.token_urlsafe(50))"

To start the application, run:

make up

Run the tests

There’s a sample test in tests/test_users.py for your convenience, which you can run using the following command on your computer:

make test

This will spin up a Docker container to run the tests, so please complete the development setup first.

The default options for running the tests are defined in pytest.ini and provide a good set of defaults.

Alternatively, e.g. when you only want to run part of the tests, first open a console to the web container:

make shell

and then run pytest directly:

pytest

Some helpful command line arguments to pytest (these won’t work with make test):

--pdb:
Drop into pdb on test failure.
--create-db:
Create a new test database.
--showlocals:
Shows local variables in tracebacks on errors.
--exitfirst:
Exits on the first failure.
--lf, --last-failed:
Run only the last failed tests.

See pytest --help for more arguments.

Running subsets of tests and specific tests

There are a bunch of ways to specify a subset of tests to run:

  • all the tests that have “foobar” in their names:

    pytest -k foobar
    
  • all the tests that don’t have “foobar” in their names:

    pytest -k "not foobar"
    
  • tests in a certain directory:

    pytest tests/jobs/
    
  • specific test:

    pytest tests/jobs/test_views.py::test_new_spark_job
    

See http://pytest.org/latest/usage.html for more examples.

Troubleshooting

Docker-Compose gives an error message similar to “ERROR: client and server don’t have same version (client : 1.21, server: 1.18)”

Make sure to install the latest versions of both Docker and Docker-Compose. The current versions of these in the Debian repositories might not be mutually compatible.

Django gives an error message similar to OperationalError: SOME_TABLE does not exist

The database likely isn’t set up correctly. Run make migrate to update it.

Django gives some other form of OperationalError, and we don’t really care about the data that’s already in the database (e.g., while developing or testing)

Database errors are usually caused by an improper database configuration. For development purposes, recreating the database will often solve the issue.

Django gives an error message similar to 'NoneType' object has no attribute 'get_frozen_credentials'.

This usually means that no valid AWS credentials could be found. Double-check the variables starting with AWS_ in your .env file.

Deployment

Releasing ATMO happens by creating a CalVer-based Git tag with the following pattern:

YYYY.M.N

YYYY is the four-digit year, M is the month number (without a leading zero) and N is a zero-based counter which does NOT relate to the day of the release. Valid version numbers are:

  • 2017.10.0
  • 2018.1.0
  • 2018.12.12
  • 1970.1.1

Once the Git tag has been pushed to the main GitHub repository using git push origin --tags, Circle CI will automatically build a tagged Docker image after the tests have passed and push it to Docker Hub. From there the Mozilla CloudOPs team has configured a stage/prod deployment pipeline.

Stage deployments happen automatically when a new release is made. Prod deployments happen on demand by the CloudOPs team.

Reference

Here you’ll find the automated code documentation for the ATMO code:

atmo

These are the submodules of the atmo package that don’t quite fit into “topic” packages such as atmo.clusters, atmo.jobs and atmo.users.

atmo.celery

class atmo.celery.AtmoCelery(main=None, loader=None, backend=None, amqp=None, events=None, log=None, control=None, set_as_current=True, tasks=None, broker=None, include=None, changes=None, config_source=None, fixups=None, task_cls=None, autofinalize=True, namespace=None, strict_typing=True, **kwargs)[source]

A custom Celery class to implement exponential backoff retries.

backoff(n, cap=3600)[source]

Return a fully jittered backoff value for the given number.

send_task(*args, **kwargs)[source]

Send task by name.

Supports the same arguments as Task.apply_async().

Arguments:
  • name (str) – Name of task to call (e.g., “tasks.add”).
  • result_cls (AsyncResult) – Specify custom result class.

class atmo.celery.ExpoBackoffFullJitter(base, cap)[source]

Implement fully jittered exponential retries.


backoff(n)[source]

Return the exponential backoff value for the given number.

expo(n)[source]

Return the exponential value for the given number.
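
For background, the “full jitter” strategy implemented by this class picks a random delay between zero and the capped exponential value. A minimal sketch of the idea (not ATMO’s exact code):

import random


def expo(n, base=1):
    """Return the exponential value for the given retry number."""
    return base * 2 ** n


def backoff(n, base=1, cap=3600):
    """Return a fully jittered backoff value (in seconds) for retry n."""
    return random.uniform(0, min(cap, expo(n, base)))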

atmo.celery.celery = <AtmoCelery atmo>

The Celery app instance used by ATMO, which auto-detects Celery config values from Django settings prefixed with “CELERY_” and autodiscovers Celery tasks from tasks.py modules in Django apps.
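
As an illustration of that autodiscovery, a Django app could define a task in its tasks.py module roughly like this (a hypothetical task, not one of ATMO’s real tasks):

# myapp/tasks.py -- hypothetical example of a task picked up by autodiscovery
from atmo.celery import celery


@celery.task
def ping():
    """A trivial task; ATMO's real tasks live in the apps' own tasks.py modules."""
    return 'pong'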

atmo.context_processors

atmo.context_processors.alerts(request)[source]

Here be dragons, for who are bold enough to break systems and lose data

This adds an alert to requests in stage and development environments.

atmo.context_processors.settings(request)[source]

Adds the Django settings object to the template context.

atmo.context_processors.version(request)[source]

Adds version-related context variables to the context.

atmo.decorators

atmo.decorators.add_permission_required(model, **params)[source]

Checks add object permissions for the given model and parameters.

atmo.decorators.change_permission_required(model, **params)[source]

Checks change object permissions for the given model and parameters.

atmo.decorators.delete_permission_required(model, **params)[source]

Checks delete object permissions for the given model and parameters.

atmo.decorators.modified_date(view_func)[source]

A decorator that, when applied to a view using a TemplateResponse, looks for a context variable (by default “modified_date”) and sets a header (by default “X-ATMO-Modified-Date”) to its ISO formatted value.

This is useful to check for modification on the client side. The end result will be a header like this:

X-ATMO-Modified-Date: 2017-03-14T10:48:53+00:00
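
A hypothetical view using the decorator could look like this (the view and template names are illustrative):

from django.template.response import TemplateResponse
from django.utils import timezone

from atmo.decorators import modified_date


@modified_date
def status(request):
    context = {
        # picked up by the decorator and sent as the X-ATMO-Modified-Date header
        'modified_date': timezone.now(),
    }
    return TemplateResponse(request, 'status.html', context=context)
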
atmo.decorators.permission_required(perm, klass, **params)[source]

A decorator that will raise a 404 if an object with the given view parameters isn’t found or if the request user does not have the given permission for the object.

E.g. for checking if the request user is allowed to change a user with the given username:

@permission_required('auth.change_user', User)
def change_user(request, username):
    # can use get() directly since get_object_or_404 was already called
    # in the decorator and would have raised a Http404 if not found
    user = User.objects.get(username=username)
    return render(request, 'change_user.html', context={'user': user})
atmo.decorators.view_permission_required(model, **params)[source]

Checks view object permissions for the given model and parameters.

atmo.models

class atmo.models.CreatedByModel(*args, **kwargs)[source]

An abstract data model that has a relation to the Django user model as configured by the AUTH_USER_MODEL setting. The reverse related name is created_<name of class>s, e.g. user.created_clusters.all() where user is a User instance that has created various Cluster objects before.

Parameters:created_by_id (ForeignKey to User) – User that created the instance.
assign_permission(user, perm)[source]

Assign the given permission (e.g. ‘clusters.view_cluster’) to the given user.

save(*args, **kwargs)[source]

Saves the current instance. Override this in a subclass if you want to control the saving process.

The ‘force_insert’ and ‘force_update’ parameters can be used to insist that the “save” must be an SQL insert or update (or equivalent for non-SQL backends), respectively. Normally, they should not be set.
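
As a sketch, a (hypothetical) model inheriting from CreatedByModel gains the created_by relation and the reverse name described above:

from django.db import models

from atmo.models import CreatedByModel


class Report(CreatedByModel):
    """Hypothetical model; inherits a created_by foreign key to the user."""
    title = models.CharField(max_length=100)


# The reverse relation follows the created_<name of class>s pattern:
# user.created_reports.all()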

class atmo.models.EditedAtModel(*args, **kwargs)[source]

An abstract data model used by various other data models throughout ATMO that stores timestamps of creation and modification.

Parameters:
  • created_at (DateTimeField) – Created at
  • modified_at (DateTimeField) – Modified at
save(*args, **kwargs)[source]

Saves the current instance. Override this in a subclass if you want to control the saving process.

The ‘force_insert’ and ‘force_update’ parameters can be used to insist that the “save” must be an SQL insert or update (or equivalent for non-SQL backends), respectively. Normally, they should not be set.

class atmo.models.PermissionMigrator(apps, model, perm, user_field=None, group=None)[source]

A custom django-guardian permission migration to be used when new model classes are added and users or groups require object permissions retroactively.

assign()[source]

The primary method to assign a permission to the user or group.

remove()[source]

The primary method to remove a permission from the user or group.

class atmo.models.URLActionModel(*args, **kwargs)[source]

A model base class to be used with URL patterns that define actions for models, e.g. /foo/bar/1/edit, /foo/bar/1/delete etc.

url_actions = []

The list of actions to be used to reverse the URL patterns with

url_delimiter = '-'

The delimiter to be used for the URL pattern names.

url_field_name = 'id'

The field name to be used with the keyword argument in the URL pattern.

url_kwarg_name = 'id'

The keyword argument name to be used in the URL pattern.

url_prefix = None

The prefix to be used for the URL pattern names.

atmo.models.next_field_value(model_cls, field_name, field_value, start=2, separator='-', max_length=0, queryset=None)[source]

For the given model class, field name and field value, provide a “next” value, which is essentially the value with a counter appended to it.
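
For example, generating a unique cluster identifier might look like this (a sketch; the exact result depends on the rows that already exist):

from atmo.models import next_field_value
from atmo.clusters.models import Cluster

# If a cluster with the identifier "my-analysis" already exists, this is
# expected to return something like "my-analysis-2".
identifier = next_field_value(Cluster, 'identifier', 'my-analysis')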

atmo.names

atmo.names.adjectives = ['admiring', 'adoring', 'affectionate', 'agitated', 'amazing', 'angry', 'awesome', 'blissful', 'boring', 'brave', 'clever', 'cocky', 'compassionate', 'competent', 'condescending', 'confident', 'cranky', 'dazzling', 'determined', 'distracted', 'dreamy', 'eager', 'ecstatic', 'elastic', 'elated', 'elegant', 'eloquent', 'epic', 'fervent', 'festive', 'flamboyant', 'focused', 'friendly', 'frosty', 'gallant', 'gifted', 'goofy', 'gracious', 'happy', 'hardcore', 'heuristic', 'hopeful', 'hungry', 'infallible', 'inspiring', 'jolly', 'jovial', 'keen', 'kind', 'laughing', 'loving', 'lucid', 'mystifying', 'modest', 'musing', 'naughty', 'nervous', 'nifty', 'nostalgic', 'objective', 'optimistic', 'peaceful', 'pedantic', 'pensive', 'practical', 'priceless', 'quirky', 'quizzical', 'relaxed', 'reverent', 'romantic', 'sad', 'serene', 'sharp', 'silly', 'sleepy', 'stoic', 'stupefied', 'suspicious', 'tender', 'thirsty', 'trusting', 'unruffled', 'upbeat', 'vibrant', 'vigilant', 'vigorous', 'wizardly', 'wonderful', 'xenodochial', 'youthful', 'zealous', 'zen']

The adjectives to be used to generate random names.

atmo.names.random_scientist(separator=None)[source]

Generate a random scientist name using the given separator and a random 4-digit number, similar to Heroku’s random project names.
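
A quick usage sketch (the output is random and the exact format depends on the implementation):

from atmo.names import random_scientist

# Expected to return a scientist name suffixed with a random 4-digit
# number, e.g. something like "curie-1234".
name = random_scientist()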

atmo.names.scientists = ['albattani', 'allen', 'almeida', 'agnesi', 'archimedes', 'ardinghelli', 'aryabhata', 'austin', 'babbage', 'banach', 'bardeen', 'bartik', 'bassi', 'beaver', 'bell', 'benz', 'bhabha', 'bhaskara', 'blackwell', 'bohr', 'booth', 'borg', 'bose', 'boyd', 'brahmagupta', 'brattain', 'brown', 'carson', 'chandrasekhar', 'shannon', 'clarke', 'colden', 'cori', 'cray', 'curran', 'curie', 'darwin', 'davinci', 'dijkstra', 'dubinsky', 'easley', 'edison', 'einstein', 'elion', 'engelbart', 'euclid', 'euler', 'fermat', 'fermi', 'feynman', 'franklin', 'galileo', 'gates', 'goldberg', 'goldstine', 'goldwasser', 'golick', 'goodall', 'haibt', 'hamilton', 'hawking', 'heisenberg', 'hermann', 'heyrovsky', 'hodgkin', 'hoover', 'hopper', 'hugle', 'hypatia', 'jackson', 'jang', 'jennings', 'jepsen', 'johnson', 'joliot', 'jones', 'kalam', 'kare', 'keller', 'kepler', 'khorana', 'kilby', 'kirch', 'knuth', 'kowalevski', 'lalande', 'lamarr', 'lamport', 'leakey', 'leavitt', 'lewin', 'lichterman', 'liskov', 'lovelace', 'lumiere', 'mahavira', 'mayer', 'mccarthy', 'mcclintock', 'mclean', 'mcnulty', 'meitner', 'meninsky', 'mestorf', 'minsky', 'mirzakhani', 'morse', 'murdock', 'neumann', 'newton', 'nightingale', 'nobel', 'noether', 'northcutt', 'noyce', 'panini', 'pare', 'pasteur', 'payne', 'perlman', 'pike', 'poincare', 'poitras', 'ptolemy', 'raman', 'ramanujan', 'ride', 'montalcini', 'ritchie', 'roentgen', 'rosalind', 'saha', 'sammet', 'shaw', 'shirley', 'shockley', 'sinoussi', 'snyder', 'spence', 'stallman', 'stonebraker', 'swanson', 'swartz', 'swirles', 'tesla', 'thompson', 'torvalds', 'turing', 'varahamihira', 'visvesvaraya', 'volhard', 'wescoff', 'wiles', 'williams', 'wilson', 'wing', 'wozniak', 'wright', 'yalow', 'yonath']

The scientists to be used to generate random names.

atmo.provisioners

class atmo.provisioners.Provisioner[source]

A base provisioner to be used by specific cases of calling out to AWS EMR. This is currently storing some common code and simplifies testing.

Subclasses need to override these class attributes:

job_flow_params(user_username, user_email, identifier, emr_release, size)[source]

Given the parameters, returns the basic parameters for EMR job flows and handles, for example, the decision whether or not to use spot instances.

log_dir = None

The name of the log directory, e.g. ‘jobs’.

name_component = None

The name to be used in the identifier, e.g. ‘job’.

spark_emr_configuration()[source]

Fetch the Spark EMR configuration data to be passed as the Configurations parameter to EMR API endpoints.

We store this in S3 to be able to share it between various Telemetry services.

atmo.tasks

Task: atmo.tasks.cleanup_permissions

A Celery task that cleans up old django-guardian object permissions.

atmo.templatetags

atmo.templatetags.full_url(url)[source]

A Django template filter to prepend the given URL path with the full site URL.

atmo.templatetags.markdown(content)[source]

A Django template filter to render the given content as Markdown.

atmo.templatetags.url_update(url, **kwargs)[source]

A Django template tag to update the query parameters for the given URL.

atmo.views

class atmo.views.DashboardView(**kwargs)[source]

The dashboard view that allows filtering clusters and jobs shown.

active_cluster_filter = 'active'

Active filter for clusters

clusters_filters = ['active', 'terminated', 'failed', 'all']

Allowed filters for clusters

default_cluster_filter = 'active'

Default cluster filter

http_method_names = ['get', 'head']

No need to accept POST or DELETE requests

maintainer_group_name = 'Spark job maintainers'

Name of auth group that is checked to display Spark jobs

template_name = 'atmo/dashboard.html'

Template name

atmo.views.permission_denied(request, exception, template_name='403.html')[source]

Permission denied (403) handler.

Template:403.html

If the template does not exist, an Http403 response containing the text “403 Forbidden” (as per RFC 7231) will be returned.

atmo.views.server_error(request, template_name='500.html')[source]

500 error handler.

Template:500.html

atmo.clusters

The code base to manage AWS EMR clusters.

atmo.clusters.forms

class atmo.clusters.forms.EMRReleaseChoiceField(*args, **kwargs)[source]

A ModelChoiceField subclass that uses EMRRelease objects for the choices and automatically uses a “radioset” rendering – a horizontal button group for easier selection.

label_from_instance(obj)[source]

Append the status of the EMR release if it’s experimental or deprecated.

class atmo.clusters.forms.ExtendClusterForm(*args, **kwargs)[source]
class atmo.clusters.forms.NewClusterForm(*args, **kwargs)[source]

A form used for creating new clusters.

Parameters:
  • identifier (RegexField) – A unique identifier for your cluster, visible in the AWS management console. (Lowercase, use hyphens instead of spaces.)
  • size (IntegerField) – Number of workers to use in the cluster, between 1 and 30. For testing or development 1 is recommended.
  • lifetime (IntegerField) – Lifetime in hours after which the cluster is automatically terminated, between 2 and 24.
  • ssh_key (ModelChoiceField) – Ssh key
  • emr_release (EMRReleaseChoiceField) – Different AWS EMR versions have different versions of software like Hadoop, Spark, etc. See what’s new in each.

atmo.clusters.models

class atmo.clusters.models.Cluster(id, created_at, modified_at, created_by, emr_release, identifier, size, lifetime, lifetime_extension_count, ssh_key, expires_at, started_at, ready_at, finished_at, jobflow_id, most_recent_status, master_address, expiration_mail_sent)[source]
Parameters:
  • id (AutoField) – Id
  • created_at (DateTimeField) – Created at
  • modified_at (DateTimeField) – Modified at
  • created_by_id (ForeignKey to User) – User that created the instance.
  • emr_release_id (ForeignKey to EMRRelease) – Different AWS EMR versions have different versions of software like Hadoop, Spark, etc. See what’s new in each.
  • identifier (CharField) – Cluster name, used to non-uniquely identify individual clusters.
  • size (IntegerField) – Number of computers used in the cluster.
  • lifetime (PositiveSmallIntegerField) – Lifetime of the cluster after which it’s automatically terminated, in hours.
  • lifetime_extension_count (PositiveSmallIntegerField) – Number of lifetime extensions.
  • ssh_key_id (ForeignKey to SSHKey) – SSH key to use when launching the cluster.
  • expires_at (DateTimeField) – Date/time that the cluster will expire and automatically be deleted.
  • started_at (DateTimeField) – Date/time when the cluster was started on AWS EMR.
  • ready_at (DateTimeField) – Date/time when the cluster was ready to run steps on AWS EMR.
  • finished_at (DateTimeField) – Date/time when the cluster was terminated or failed on AWS EMR.
  • jobflow_id (CharField) – AWS cluster/jobflow ID for the cluster, used for cluster management.
  • most_recent_status (CharField) – Most recently retrieved AWS status for the cluster.
  • master_address (CharField) – Public address of the master node. This is only available once the cluster has bootstrapped.
  • expiration_mail_sent (BooleanField) – Whether the expiration mail was sent.
exception DoesNotExist
exception MultipleObjectsReturned
deactivate()[source]

Shut down the cluster and update its status accordingly.

extend(hours)[source]

Extend the cluster lifetime by the given number of hours.

info

Returns the provisioning information for the cluster.

is_active

Returns whether the cluster is active or not.

is_expiring_soon

Returns whether the cluster is expiring in the next hour.

is_failed

Returns whether the cluster has failed or not.

is_ready

Returns whether the cluster is ready or not.

is_terminated

Returns whether the cluster is terminated or not.

is_terminating

Returns whether the cluster is terminating or not.

save(*args, **kwargs)[source]

Insert the cluster into the database or update it if already present, spawning the cluster if it’s not already spawned.

sync(info=None)[source]

Should be called to update latest cluster status in self.most_recent_status.

class atmo.clusters.models.EMRRelease(created_at, modified_at, version, changelog_url, help_text, is_active, is_experimental, is_deprecated)[source]
Parameters:
  • created_at (DateTimeField) – Created at
  • modified_at (DateTimeField) – Modified at
  • version (CharField) – Version
  • changelog_url (TextField) – The URL of the changelog with details about the release.
  • help_text (TextField) – Optional help text to show for users when creating a cluster.
  • is_active (BooleanField) – Whether this version should be shown to the user at all.
  • is_experimental (BooleanField) – Whether this version should be shown to users as experimental.
  • is_deprecated (BooleanField) – Whether this version should be shown to users as deprecated.
exception DoesNotExist
exception MultipleObjectsReturned

atmo.clusters.provisioners

class atmo.clusters.provisioners.ClusterProvisioner[source]

The cluster specific provisioner.

format_info(cluster)[source]

Formats the data returned by the EMR API for internal ATMO use.

format_list(cluster)[source]

Formats the data returned by the EMR API for internal ATMO use.

info(jobflow_id)[source]

Returns the cluster info for the cluster with the given Jobflow ID, with the fields start time, state and public IP address.

job_flow_params(*args, **kwargs)[source]

Given the parameters, returns the extended parameters for EMR job flows for on-demand clusters.

list(created_after, created_before=None)[source]

Returns a list of cluster infos in the given time frame with the fields: Jobflow ID, state and start time.

start(user_username, user_email, identifier, emr_release, size, public_key)[source]

Given the parameters spawns a cluster with the desired properties and returns the jobflow ID.

stop(jobflow_id)[source]

Stops the cluster with the given JobFlow ID.

atmo.clusters.queries

class atmo.clusters.queries.ClusterQuerySet(model=None, query=None, using=None, hints=None)[source]

A Django queryset that filters by cluster status.

Used by the Cluster model.

active()[source]

The clusters that have an active status.

failed()[source]

The clusters that have a failed status.

terminated()[source]

The clusters that have a terminated status.

class atmo.clusters.queries.EMRReleaseQuerySet(model=None, query=None, using=None, hints=None)[source]

A Django queryset for the EMRRelease model.

active()[source]

The EMR releases that are active.

deprecated()[source]

The EMR releases that are deprecated.

experimental()[source]

The EMR releases that are considered experimental.

natural_sort_by_version()[source]

Sorts this queryset by the EMR version naturally (human-readable).

stable()[source]

The EMR releases that are considered stable.

atmo.clusters.tasks

Task: atmo.clusters.tasks.deactivate_clusters

Deactivate clusters that have expired.

Task: atmo.clusters.tasks.send_expiration_mails

Send expiration emails an hour before the cluster expires.

Task: atmo.clusters.tasks.update_clusters(self)

Update the cluster metadata from AWS for the pending clusters.

  • To be used periodically.
  • Won’t update state if not needed.
  • Will queue updating the Cluster’s public IP address if needed.
Task: atmo.clusters.tasks.update_master_address(self, cluster_id, force=False)

Update the public IP address for the cluster with the given cluster ID

atmo.clusters.views

atmo.clusters.views.detail_cluster(request, id)[source]

View to show details about an existing cluster.

atmo.clusters.views.extend_cluster(request, id)[source]

View to extend the lifetime of an existing cluster.

atmo.clusters.views.new_cluster(request)[source]

View to create a new cluster.

atmo.clusters.views.terminate_cluster(request, id)[source]

View to terminate an existing cluster.

atmo.jobs

The code base to manage scheduled Spark jobs via AWS EMR clusters.

atmo.jobs.forms

class atmo.jobs.forms.BaseSparkJobForm(*args, **kwargs)[source]

A base form used for creating new jobs.

Parameters:
  • identifier (RegexField) – A unique identifier for your Spark job, visible in the AWS management console. (Lowercase, use hyphens instead of spaces.)
  • description (CharField) – A brief description of your Spark job’s purpose. This is intended to provide extra context for the data engineering team.
  • result_visibility (ChoiceField) – Whether notebook results are uploaded to a public or private S3 bucket.
  • size (IntegerField) – Number of workers to use when running the Spark job (1 is recommended for testing or development).
  • interval_in_hours (ChoiceField) – Interval at which the Spark job should be run.
  • job_timeout (IntegerField) – Number of hours that a single run of the job can run for before timing out and being terminated.
  • start_date (DateTimeField) – Date and time of when the scheduled Spark job should start running.
  • end_date (DateTimeField) – Date and time of when the scheduled Spark job should stop running - leave this blank if the job should not be disabled.
clean_notebook()[source]

Validate that the uploaded notebook file ends with the .ipynb file extension.

field_order

Copy the defined model form fields and insert the notebook field at the second spot

save(commit=True)[source]

Store the notebook file on S3 and save the Spark job details to the database.

class atmo.jobs.forms.EditSparkJobForm(*args, **kwargs)[source]

A BaseSparkJobForm subclass used for editing jobs.

Parameters:
  • identifier (RegexField) – A unique identifier for your Spark job, visible in the AWS management console. (Lowercase, use hyphens instead of spaces.)
  • description (CharField) – A brief description of your Spark job’s purpose. This is intended to provide extra context for the data engineering team.
  • result_visibility (ChoiceField) – Whether notebook results are uploaded to a public or private S3 bucket.
  • size (IntegerField) – Number of workers to use when running the Spark job (1 is recommended for testing or development).
  • interval_in_hours (ChoiceField) – Interval at which the Spark job should be run.
  • job_timeout (IntegerField) – Number of hours that a single run of the job can run for before timing out and being terminated.
  • start_date (DateTimeField) – Date and time of when the scheduled Spark job should start running.
  • end_date (DateTimeField) – Date and time of when the scheduled Spark job should stop running - leave this blank if the job should not be disabled.
class atmo.jobs.forms.NewSparkJobForm(*args, **kwargs)[source]

A BaseSparkJobForm subclass used for creating new jobs.

Parameters:
  • identifier (RegexField) – A unique identifier for your Spark job, visible in the AWS management console. (Lowercase, use hyphens instead of spaces.)
  • description (CharField) – A brief description of your Spark job’s purpose. This is intended to provide extra context for the data engineering team.
  • result_visibility (ChoiceField) – Whether notebook results are uploaded to a public or private S3 bucket.
  • size (IntegerField) – Number of workers to use when running the Spark job (1 is recommended for testing or development).
  • interval_in_hours (ChoiceField) – Interval at which the Spark job should be run.
  • job_timeout (IntegerField) – Number of hours that a single run of the job can run for before timing out and being terminated.
  • start_date (DateTimeField) – Date and time of when the scheduled Spark job should start running.
  • end_date (DateTimeField) – Date and time of when the scheduled Spark job should stop running - leave this blank if the job should not be disabled.
  • emr_release (EMRReleaseChoiceField) – Different AWS EMR versions have different versions of software like Hadoop, Spark, etc. See what’s new in each.
class atmo.jobs.forms.SparkJobAvailableForm(data=None, files=None, auto_id='id_%s', prefix=None, initial=None, error_class=<class 'django.forms.utils.ErrorList'>, label_suffix=None, empty_permitted=False, field_order=None, use_required_attribute=None, renderer=None)[source]

A form used in the views that checks for the availability of identifiers.

atmo.jobs.models

class atmo.jobs.models.SparkJob(*args, **kwargs)[source]

A data model to store details about a scheduled Spark job, to be run on AWS EMR.

Parameters:
  • id (AutoField) – Id
  • created_at (DateTimeField) – Created at
  • modified_at (DateTimeField) – Modified at
  • created_by_id (ForeignKey to User) – User that created the instance.
  • emr_release_id (ForeignKey to EMRRelease) – Different AWS EMR versions have different versions of software like Hadoop, Spark, etc. See what’s new in each.
  • identifier (CharField) – Job name, used to uniqely identify individual jobs.
  • description (TextField) – Job description.
  • notebook_s3_key (CharField) – S3 key of the notebook after uploading it to the Spark code bucket.
  • result_visibility (CharField) – Whether notebook results are uploaded to a public or private bucket
  • size (IntegerField) – Number of computers to use to run the job.
  • interval_in_hours (IntegerField) – Interval at which the job should run, in hours.
  • job_timeout (IntegerField) – Number of hours before the job times out.
  • start_date (DateTimeField) – Date/time that the job should start being scheduled to run.
  • end_date (DateTimeField) – Date/time that the job should stop being scheduled to run, null if no end date.
  • expired_date (DateTimeField) – Date/time that the job was expired.
  • is_enabled (BooleanField) – Whether the job should run or not.
exception DoesNotExist
exception MultipleObjectsReturned
has_finished

Whether the job’s cluster is terminated or failed

has_never_run

Whether the job has never run before. Looks at both the cluster status and our own record of when we asked it to run.

has_timed_out

Whether the current job run has been running longer than the job’s timeout allows.

is_due

Whether the start date is in the past and the end date is in the future.

is_runnable

Either the job has never run before or was never finished.

This is checked right before the actual provisioning.

run()[source]

Actually run the scheduled Spark job.

save(*args, **kwargs)[source]

Saves the current instance. Override this in a subclass if you want to control the saving process.

The ‘force_insert’ and ‘force_update’ parameters can be used to insist that the “save” must be an SQL insert or update (or equivalent for non-SQL backends), respectively. Normally, they should not be set.

should_run

Whether the scheduled Spark job should run.

terminate()[source]

Stop the currently running scheduled Spark job.

class atmo.jobs.models.SparkJobRun(*args, **kwargs)[source]

A data model to store information about every individual run of a scheduled Spark job.

This denormalizes some values from its related data model SparkJob.

Parameters:
  • id (AutoField) – Id
  • created_at (DateTimeField) – Created at
  • modified_at (DateTimeField) – Modified at
  • spark_job_id (ForeignKey to SparkJob) – Spark job
  • jobflow_id (CharField) – Jobflow id
  • emr_release_version (CharField) – Emr release version
  • size (IntegerField) – Number of computers used to run the job.
  • status (CharField) – Status
  • scheduled_at (DateTimeField) – Date/time that the job was scheduled.
  • started_at (DateTimeField) – Date/time when the cluster was started on AWS EMR.
  • ready_at (DateTimeField) – Date/time when the cluster was ready to run steps on AWS EMR.
  • finished_at (DateTimeField) – Date/time that the job was terminated or failed.
exception DoesNotExist
exception MultipleObjectsReturned
sync(info=None)[source]

Updates latest status and life cycle datetimes.

class atmo.jobs.models.SparkJobRunAlert(*args, **kwargs)[source]

A data model to store job run alerts for later processing by an async job that sends out emails.

Parameters:
exception DoesNotExist
exception MultipleObjectsReturned

atmo.jobs.provisioners

class atmo.jobs.provisioners.SparkJobProvisioner[source]

The Spark job specific provisioner.

add(identifier, notebook_file)[source]

Upload the notebook file to S3

get(key)[source]

Get the S3 file with the given key from the code S3 bucket.

remove(key)[source]

Remove the S3 file with the given key from the code S3 bucket.

results(identifier, is_public)[source]

Return the results created by the job with the given identifier that were uploaded to S3.

Parameters:
  • identifier – Unique identifier of the Spark job.
  • is_public – Whether the Spark job is public or not.
Returns:

A mapping of result prefixes to lists of results.

Return type:

dict

run(user_username, user_email, identifier, emr_release, size, notebook_key, is_public, job_timeout)[source]

Run the Spark job with the given parameters

Parameters:
  • user_username – The username of the Spark job owner.
  • user_email – The email address of the Spark job owner.
  • identifier – The unique identifier of the Spark job.
  • emr_release – The EMR release version.
  • size – The size of the cluster.
  • notebook_key – The name of the notebook file on S3.
  • is_public – Whether the job result should be public or not.
  • job_timeout – The maximum runtime of the job.
Returns:

AWS EMR jobflow ID

Return type:

str

atmo.jobs.queries

class atmo.jobs.queries.SparkJobQuerySet(model=None, query=None, using=None, hints=None)[source]
active()[source]

The Spark jobs that have an active cluster status.

failed()[source]

The Spark jobs that have a failed cluster status.

lapsed()[source]

The Spark jobs that have passed their end dates but haven’t been expired yet.

terminated()[source]

The Spark jobs that have a terminated cluster status.

with_runs()[source]

The Spark jobs with runs.

class atmo.jobs.queries.SparkJobRunQuerySet(model=None, query=None, using=None, hints=None)[source]
active()[source]

The Spark jobs that have an active cluster status.

atmo.jobs.tasks

class atmo.jobs.tasks.SparkJobRunTask[source]

A Celery task base class to be used by the run_job() task to simplify testing.

check_enabled(spark_job)[source]

Checks if the job should be run at all

get_spark_job(pk)[source]

Load the Spark job with the given primary key.

max_retries = 9

The maximum number of retries, chosen so that retrying does not run too long when using the exponential backoff timeouts.

provision_run(spark_job, first_run=False)[source]

Actually run the given Spark job.

If this is the first run we’ll update the “last_run_at” value to the start date of the spark_job so Celery beat knows what’s going on.

sync_run(spark_job)[source]

Updates the cluster status of the latest Spark job run, if available.

terminate_and_notify(spark_job)[source]

When the Spark job has timed out because it has run longer than the maximum runtime we will terminate it (and its cluster) and notify the owner to optimize the Spark job code.

unschedule_and_expire(spark_job)[source]

Remove the Spark job from the periodic schedule and send an email to the owner that it was expired.

atmo.jobs.views

atmo.jobs.views.check_identifier_available(request)[source]

Given a Spark job identifier, checks if one already exists.

atmo.jobs.views.delete_spark_job(request, id)[source]

View to delete a scheduled Spark job and then redirects to the dashboard.

atmo.jobs.views.detail_spark_job(request, id)[source]

View to show the details for the scheduled Spark job with the given ID.

atmo.jobs.views.detail_zeppelin_job(request, id)[source]

View to show the details for the scheduled Zeppelin job with the given ID.

atmo.jobs.views.download_spark_job(request, id)[source]

Download the notebook file for the scheduled Spark job with the given ID.

atmo.jobs.views.edit_spark_job(request, id)[source]

View to edit a scheduled Spark job that runs on AWS EMR.

atmo.jobs.views.new_spark_job(request)[source]

View to schedule a new Spark job to run on AWS EMR.

atmo.jobs.views.run_spark_job(request, id)[source]

Run a scheduled Spark job right now, out of sync with its actual schedule.

This will actively ask for confirmation to run the Spark job.

atmo.keys

The code base to manage public SSH keys to be used with ATMO clusters.

atmo.keys.forms

class atmo.keys.forms.SSHKeyForm(*args, **kwargs)[source]

The form to be used when uploading new SSH keys.

Parameters:
  • title (CharField) – Name to give to this public key
  • key (CharField) – Should start with one of the following prefixes: ssh-rsa, ssh-dss, ecdsa-sha2-nistp256, ecdsa-sha2-nistp384, ecdsa-sha2-nistp521
  • key_file (FileField) – This can usually be found in ~/.ssh/ on your computer.
clean_key()[source]

Validates the submitted key data and rejects it if it:

  • is larger than 100kb
  • isn’t a valid SSH public key (e.g. if it’s a private key instead)
  • doesn’t start with one of the valid key data prefixes
  • already exists in the database

atmo.keys.models

class atmo.keys.models.SSHKey(*args, **kwargs)[source]

A Django data model to store public SSH keys for logged-in users to be used in the on-demand clusters.

Parameters:
  • id (AutoField) – Id
  • created_at (DateTimeField) – Created at
  • modified_at (DateTimeField) – Modified at
  • created_by_id (ForeignKey to User) – User that created the instance.
  • title (CharField) – Name to give to this public key
  • key (TextField) – Should start with one of the following prefixes: ssh-rsa, ssh-dss, ecdsa-sha2-nistp256, ecdsa-sha2-nistp384, ecdsa-sha2-nistp521
  • fingerprint (CharField) – Fingerprint
exception DoesNotExist
exception MultipleObjectsReturned
VALID_PREFIXES = ['ssh-rsa', 'ssh-dss', 'ecdsa-sha2-nistp256', 'ecdsa-sha2-nistp384', 'ecdsa-sha2-nistp521']

The list of valid SSH key data prefixes; key data is validated against them on save.

prefix

The prefix of the key data, one of the VALID_PREFIXES.

save(*args, **kwargs)[source]

Saves the current instance. Override this in a subclass if you want to control the saving process.

The ‘force_insert’ and ‘force_update’ parameters can be used to insist that the “save” must be an SQL insert or update (or equivalent for non-SQL backends), respectively. Normally, they should not be set.


atmo.keys.utils

atmo.keys.utils.calculate_fingerprint(data)[source]

Calculate the hexadecimal fingerprint for the given key data.

Parameters:data – str - The key data to calculate the fingerprint for.
Returns:The fingerprint.
Return type:str
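
A usage sketch (the key file path is just an example):

from atmo.keys.utils import calculate_fingerprint

with open('/home/jdoe/.ssh/id_rsa.pub') as f:
    fingerprint = calculate_fingerprint(f.read())
# fingerprint is a hexadecimal fingerprint string for the given key data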

atmo.keys.views

atmo.keys.views.delete_key(request, id)[source]

View to delete an SSH key with the given ID.

atmo.keys.views.detail_key(request, id, raw=False)[source]

View to show the details for the SSH key with the given ID.

If the optional raw parameter is set it’ll return the raw key data.

atmo.keys.views.list_keys(request)[source]

View to list all SSH keys for the logged-in user.

atmo.keys.views.new_key(request)[source]

View to upload a new SSH key for the logged-in user.

atmo.news

The code base to show the “News” section to users.

atmo.news.views

class atmo.news.views.News[source]

Encapsulate the rendering of the news document NEWS.md.

ast

Return (and cache for repeated querying) the Markdown AST of the NEWS.md file.

current(request)[source]

Return the latest seen version or nothing.

latest

Return the latest version found in the NEWS.md file.

render()[source]

Render the NEWS.md file as HTML.

update(request, response)[source]

Set the cookie for the given request with the latest seen version.

uptodate(request)[source]

Return whether the current version is newer than the last seen version.

atmo.news.views.check_news(request)[source]

View to check if the current user has seen the latest “News” section and return either ‘ok’ or ‘meh’ as a string.

atmo.news.views.list_news(request)[source]

View to list all news and optionally render only part of the template for AJAX requests.

atmo.settings

Django settings for atmo project.

For more information on this file, see https://docs.djangoproject.com/en/1.9/topics/settings/

For the full list of settings and their values, see https://docs.djangoproject.com/en/1.9/ref/settings/

class atmo.settings.AWS[source]

AWS settings

AWS_CONFIG = {'ACCOUNTING_APP_TAG': 'telemetry-analysis', 'ACCOUNTING_TYPE_TAG': 'worker', 'AWS_REGION': 'us-west-2', 'CODE_BUCKET': 'telemetry-analysis-code-2', 'EC2_KEY_NAME': '20161025-dataops-dev', 'EMAIL_SOURCE': 'telemetry-alerts@mozilla.com', 'INSTANCE_APP_TAG': 'telemetry-analysis-worker-instance', 'LOG_BUCKET': 'telemetry-analysis-logs-2', 'MASTER_INSTANCE_TYPE': 'c3.4xlarge', 'MAX_CLUSTER_LIFETIME': 24, 'MAX_CLUSTER_SIZE': 30, 'PRIVATE_DATA_BUCKET': 'telemetry-private-analysis-2', 'PUBLIC_DATA_BUCKET': 'telemetry-public-analysis-2', 'WORKER_INSTANCE_TYPE': 'c3.4xlarge'}

The AWS config values.

PUBLIC_DATA_URL = 'https://s3-us-west-2.amazonaws.com/telemetry-public-analysis-2/'

The URL of the S3 bucket with public job results.

PUBLIC_NB_URL = 'https://nbviewer.jupyter.org/url/s3-us-west-2.amazonaws.com/telemetry-public-analysis-2/'

The URL to show public Jupyter job results with.

class atmo.settings.Base[source]

Configuration that may change per-environment, some with defaults.

LOGGING()[source]

SITE_URL = 'http://localhost:8000'

The URL under which this instance is running

class atmo.settings.Build[source]

Configuration to be used in build (!) environment

CONSTANCE_CONFIG

DATABASES

LOGGING()
class atmo.settings.CSP[source]

CSP settings

class atmo.settings.Celery[source]

The Celery specific Django settings.

CELERY_BEAT_MAX_LOOP_INTERVAL = 5

redbeat likes fast loops

CELERY_BEAT_SCHEDULE = {'clean_orphan_obj_perms': {'schedule': <crontab: 30 3 * * * (m/h/d/dM/MY)>, 'task': 'atmo.tasks.cleanup_permissions'}, 'deactivate_clusters': {'schedule': <crontab: * * * * * (m/h/d/dM/MY)>, 'task': 'atmo.clusters.tasks.deactivate_clusters', 'options': {'soft_time_limit': 15, 'expires': 40}}, 'expire_jobs': {'schedule': <crontab: * * * * * (m/h/d/dM/MY)>, 'task': 'atmo.jobs.tasks.expire_jobs', 'options': {'soft_time_limit': 15, 'expires': 40}}, 'send_expiration_mails': {'schedule': <crontab: */5 * * * * (m/h/d/dM/MY)>, 'task': 'atmo.clusters.tasks.send_expiration_mails', 'options': {'expires': 240}}, 'send_run_alert_mails': {'schedule': <crontab: * * * * * (m/h/d/dM/MY)>, 'task': 'atmo.jobs.tasks.send_run_alert_mails', 'options': {'expires': 40}}, 'update_clusters': {'schedule': <crontab: */5 * * * * (m/h/d/dM/MY)>, 'task': 'atmo.clusters.tasks.update_clusters', 'options': {'soft_time_limit': 270, 'expires': 180}}, 'update_jobs_statuses': {'schedule': <crontab: */15 * * * * (m/h/d/dM/MY)>, 'task': 'atmo.jobs.tasks.update_jobs_statuses', 'options': {'soft_time_limit': 870, 'expires': 600}}}

The default/initial schedule to use.

CELERY_BEAT_SCHEDULER = 'redbeat.RedBeatScheduler'

The scheduler to use for periodic and scheduled tasks.

CELERY_BROKER_TRANSPORT_OPTIONS = {'fanout_patterns': True, 'fanout_prefix': True, 'visibility_timeout': 691200}

The Celery broker transport options

CELERY_REDBEAT_LOCK_TIMEOUT = 25

Unless refreshed the lock will expire after this time

CELERY_RESULT_BACKEND = 'django-db'

Use the django_celery_results database backend.

CELERY_RESULT_EXPIRES = datetime.timedelta(14)

Throw away task results after two weeks, for debugging purposes.

CELERY_TASK_SEND_SENT_EVENT = True

Send SENT events as well to know when the task has left the scheduler.

CELERY_TASK_SOFT_TIME_LIMIT = 300

Add a 5 minute soft timeout to all Celery tasks.

CELERY_TASK_TIME_LIMIT = 600

And a 10 minute hard timeout.

CELERY_TASK_TRACK_STARTED = True

Track if a task has been started, not only pending etc.

CELERY_WORKER_DISABLE_RATE_LIMITS = True

Completely disable the rate limiting feature since it’s costly

CELERY_WORKER_HIJACK_ROOT_LOGGER = False

Stop hijacking the root logger so Sentry works.

class atmo.settings.Constance[source]

Constance settings

CONSTANCE_ADDITIONAL_FIELDS = {'announcement_styles': [<class 'django.forms.fields.ChoiceField'>, {'widget': <django.forms.widgets.Select object at 0x7f6de3177b00>, 'choices': (('success', 'success (green)'), ('info', 'info (blue)'), ('warning', 'warning (yellow)'), ('danger', 'danger (red)'))}], 'announcement_title': [<class 'django.forms.fields.CharField'>, {'widget': <django.forms.widgets.TextInput object at 0x7f6de3177b38>}]}

Adds custom widget for announcements.

CONSTANCE_CONFIG = {'ANNOUNCEMENT_CONTENT': ('', 'The announcement content.'), 'ANNOUNCEMENT_CONTENT_MARKDOWN': (False, 'Whether the announcement content should be rendered as CommonMark (Markdown).'), 'ANNOUNCEMENT_ENABLED': (False, 'Whether to show the announcement on every page.'), 'ANNOUNCEMENT_TITLE': ('Announcement', 'The announcement title.', 'announcement_title'), 'ANNOUNCMENT_STYLE': ('info', 'The style of the announcement.', 'announcement_styles'), 'AWS_EFS_DNS': ('fs-616ca0c8.efs.us-west-2.amazonaws.com', 'The DNS name of the EFS mount for EMR clusters'), 'AWS_SPARK_EMR_BUCKET': ('telemetry-spark-emr-2-stage', 'The S3 bucket where the EMR bootstrap scripts are located'), 'AWS_SPARK_INSTANCE_PROFILE': ('telemetry-spark-cloudformation-stage-TelemetrySparkInstanceProfile-UCLC2TTGVX96', 'The AWS instance profile to use for the clusters'), 'AWS_SPOT_BID_CORE': (0.84, 'The spot instance bid price for the cluster workers'), 'AWS_USE_SPOT_INSTANCES': (True, 'Whether to use spot instances on AWS')}

The default config values.

CONSTANCE_CONFIG_FIELDSETS = {'AWS': ('AWS_USE_SPOT_INSTANCES', 'AWS_SPOT_BID_CORE', 'AWS_EFS_DNS', 'AWS_SPARK_EMR_BUCKET', 'AWS_SPARK_INSTANCE_PROFILE'), 'Announcements': ('ANNOUNCEMENT_ENABLED', 'ANNOUNCMENT_STYLE', 'ANNOUNCEMENT_TITLE', 'ANNOUNCEMENT_CONTENT', 'ANNOUNCEMENT_CONTENT_MARKDOWN')}

Some fieldsets for the config values.

CONSTANCE_REDIS_CONNECTION_CLASS = 'django_redis.get_redis_connection'

Using the django-redis connection function for the backend.

class atmo.settings.Core[source]

Configuration that will never change per-environment.

BASE_DIR = '/home/docs/checkouts/readthedocs.org/user_builds/atmo/envs/stable/lib/python3.5/site-packages/mozilla_atmo-2018.3.0-py3.5.egg'

Build paths inside the project like this: os.path.join(BASE_DIR, …)

INSTALLED_APPS = ['atmo.apps.AtmoAppConfig', 'atmo.clusters', 'atmo.jobs', 'atmo.apps.KeysAppConfig', 'atmo.users', 'atmo.stats', 'guardian', 'constance', 'constance.backends.database', 'dockerflow.django', 'django_celery_monitor', 'django_celery_results', 'flat_responsive', 'django.contrib.sites', 'django.contrib.admin', 'django.contrib.auth', 'django.contrib.contenttypes', 'django.contrib.sessions', 'django.contrib.messages', 'django.contrib.staticfiles', 'mozilla_django_oidc']

The installed apps.

SITE_ID = 1

Using the default first site found by django.contrib.sites

THIS_DIR = '/home/docs/checkouts/readthedocs.org/user_builds/atmo/envs/stable/lib/python3.5/site-packages/mozilla_atmo-2018.3.0-py3.5.egg/atmo'

The directory in which the settings file resides.

VERSION = None

The current ATMO version.

class atmo.settings.Dev[source]

Configuration to be used during development and base class for testing

LOGGING()

class atmo.settings.Docs[source]

Configuration to be used in the documentation environment

LOGGING()

class atmo.settings.Prod[source]

Configuration to be used in prod environment

CONSTANCE_CONFIG

DATABASES

LOGGING()
class atmo.settings.Stage[source]

Configuration to be used in stage environment

DATABASES

LOGGING()
class atmo.settings.Test[source]

Configuration to be used during testing

LOGGING()


atmo.users

The code base to handle user sign ups and logins.

atmo.users.utils

atmo.users.utils.generate_username_from_email(email)[source]

Use the local part of the email address as the username for mozilla.com users and the full email address for all other users.
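
Based on that description, usage looks like this:

from atmo.users.utils import generate_username_from_email

generate_username_from_email('jdoe@mozilla.com')   # 'jdoe'
generate_username_from_email('jane@example.com')   # 'jane@example.com'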

Changelog

Welcome to the running release notes of ATMO!

  • You can use this document to see the high-level changes made in each git-tagged release.
  • Backward-incompatible changes or other notable events that have an impact on users are noted individually.
  • The order of this changelog is descending (newest first).
  • Dependency updates are only mentioned when they require user attention.

2018.3.0

date:2018-03-07

Naturally sort EMR releases by version instead of alphabetically when launching EMR clusters or scheduling Spark jobs.

Stop overwriting the owner on save in the admin when the object already exists.

2017.11.0

date:2017-01-01

Fixed login and logout issues with new auth mechanism.

2017.10.2

date:2017-10-31

Switched from Google based authentication to Auth0 based authentication (via their OpenID Connect API).

Removed leftovers from old Heroku deploy method.

2017.10.1

date:2017-10-23

Fix an issue when recording the Spark job run time metric.

2017.10.0

date:2017-10-05

Add ability to upload Zeppelin notebooks.

Remove name generator for scheduled Spark jobs to reduce confusion.

Record Spark job metrics.

Fix recording metrics in database transactions.

2017.8.1

date:2017-08-17

Fix metric duplicates.

2017.8.0

date:2017-08-16

Add more cluster metrics.

2017.7.2

date:2017-07-25

Add metrics for EMR version and cluster extensions.

2017.7.1

date:2017-07-18

Make EMR profile configurable via environment variable.

2017.7.0

date:2017-07-12

Allow EMR bootstrap bucket to be configurable for improved environment specific setup.

Add description of the scheduled Spark job to the alert email body.

Add documentation under https://atmo.readthedocs.io/

2017.6.1

date:2017-06-20

Filter out inactive EMR releases from dropdown and minor UI tweaks.

2017.6.0

date:2017-06-06

Add Zeppelin examples to cluster detail.

2017.5.7

date:2017-05-30

Fix regression introduced when the backoff feature for task retries was improved in 2017.5.5.

2017.5.[5,6]

date:2017-05-24

Fix more race conditions in sending out emails.

Fix duplicate job runs due to job scheduling race conditions.

Store and show datetimes from EMR status updates for better monitoring.

Add job history details to job detail page.

Improved backoff patterns by inlining the Celery task retries.

2017.5.[3,4]

date:2017-05-18

Fix issue with Celery monitoring.

2017.5.2

date:2017-05-17

Fix race conditions in email sending.

Add ability to run job right now.

UI fixes to the cluster and Spark job detail pages.

Upgrade to Django 1.11 and Python 3.6.

Add a responsive admin theme.

Add ability to show a site-wide announcement on top of every page.

Update the status of all past Spark job runs not only the last one.

Better unique cluster identifiers based on scientist names.

2017.5.1

date:2017-05-11

Add status and visual indicators to scheduled Spark jobs listings.

Fix issue with running scheduled Celery tasks multiple times.

2017.5.0

date:2017-05-03

Use user part of email addresses as username (e.g. “jdoe” in “jdoe@mozilla.com”) instead of first name.

Add Celery monitoring to Django admin.

2017.4.3

date:2017-04-27

UX updates to job detail page.

Minor fixes for Celery schedule refactoring.

2017.4.2

date:2017-04-26

Updated Celery timeout.

Populate new Celery schedules for all scheduled Spark jobs.

2017.4.1

date:2017-04-25

Add a Celery task for running a Spark job.

This task is used by RedBeat to schedule the Spark jobs using Celery beat. We add/remove Spark jobs from the schedule on save/delete and can restore the schedule from the database again.

Send emails for Spark jobs when expired and when they have timed out and need to be modified.

Refactored and extended tests.

2017.4.0

date:2017-04-04

Moved EMR releases into their own data model for easier maintenance (including deprecation and experimental tags).

Add ability to define a lifetime on cluster start.

Change default lifetime to 8 hours (~a work day), maximum stays at 24 hours.

Add ability to extend the lifetime of clusters on demand. The cluster expiration email will notify cluster owners about that ability, too.

2017.3.[6,7]

date:2017-03-28/2017-03-29

Show all scheduled Spark jobs for admin users in the Spark job maintainers group.

Fix logging for Celery and RedBeat.

2017.3.5

date:2017-03-22

Switch to Celery as the task queue to improve stability and processing guarantees.

Wrap more tasks in Django database transactions to reduce risk of race conditions.

Only updates the cluster master address if the cluster isn’t ready.

Pins Node dependencies and uses Greenkeeper for dependency CI.

2017.3.4

date:2017-03-20

Fixes an inconsistency in how the run alert status message is stored with values from Amazon by extending the length of the column.

Check and run jobs only every 5 minutes instead of every minute to reduce API access numbers.

2017.3.3

date:2017-03-17

Regression fixes to the email alerting feature introduced in 2017.3.2 that prevented scheduled jobs from running successfully.

2017.3.2

date:2017-03-15

BACKWARD INCOMPATIBLE: Removes EMR release 4.5.0.

BACKWARD INCOMPATIBLE: Make clusters persist the home directory between runs.

Adds a changelog (this file) and a “What’s new?” section (in the footer).

Adds email alerting if a scheduled Spark job fails.

Replaced the automatic page refresher with in-page alerts when the page changes on the server.

Moved project board to Waffle: https://waffle.io/mozilla/telemetry-analysis-service

Run flake8 automatically as part of test suite.

2017.3.[0,1]

date:2017-03-07/2017-03-08

Selects the SSH key automatically if only one is present.

Uses ListCluster API endpoint for updating Spark job run states instead of DescribeCluster to counteract AWS API throttling.

2017.2.[9,10,11,12,13]

date:2017-02-23

Regression fixes for the Python 3 migration and Zeppelin integration.

2017.2.[6,7,8]

date:2017-02-20/2017-02-21

Adds the ability to store the history of scheduled Spark job runs for planned features such as alerting and cost calculations.

2017.2.[4,5]

date:2017-02-17

Adds experimental support for Apache Zeppelin as a second way to manage notebooks, next to Jupyter.

Improves client-side form validation dramatically and changes the file selector to a better suited system.

Adds exponential backoff retries for the worker system to counteract AWS API throttling for jobs that update cluster status or run scheduled Spark jobs.

Moves from Python 2 to 3.

2017.2.[1,2,3]

date:2017-02-07/2017-02-10

Uses AWS EC2 spot instances for scheduled Spark jobs with more than one node.

Moves issue management from Bugzilla to GitHub.

2017.1.[11,12]

date:2017-01-31

Self-dogfoods the newly implemented python-dockerflow.

Fix many UX issues in the various forms.

2017.1.[7,8,9,10]

date:2017-01-24

Adds ability to upload personal SSH keys to simplify starting clusters.

Adds a new required description field to Spark job to be able to debug jobs easily.

Adds EMR 5.2.1 to list of available EMR versions.

Uses new shared public SSH key that is used by the hadoop user on EMR.

2017.1.[0,1,2,3,4,5,6]

date:2017-01-20

First release of 2017 that comes with a lot of changes around deployment, UI and UX. o/

Adopts NPM as a way to maintain frontend dependencies.

Adds an object-level permission system to be able to share CRUD permissions per user or user group; e.g. admins can now see the clusters and Spark jobs of other users.

Makes the cluster and Spark job deletion confirmation happen in place instead of redirecting to a separate page that asks for confirmation.

Extends tests and adds test coverage reporting via Codecov.

Drops Travis-CI in favor of Circle CI.

Allows enabling/disabling AWS EC2 spot instances via the Django admin UI in the Constance section.

2016.11.5

date:2016-11-21

Fix job creation edge case.

More NewRelic fixes.

2016.11.[2,3,4]

date:2016-11-17

Fixes logging related to Dockerflow.

Turned off NewRelic’s “high_security” mode.

Increases the job timeouts so fewer jobs are killed.

Removes the need for Newrelic deploys to Heroku.

2016.11.1

date:2016-11-14

Implements Dockerflow health checks so it follows the best practices of Mozilla’s Dockerflow. Many thanks to @mythmon for the inspiration in the Normandy code.

2016.11.0

date:2016-11-11

The first release of ATMO V2 under the new release system that ports the majority of the V1 to a new codebase.

This is a major milestone after months of work of many contributors, finishing the work of Mozilla community members and staff.
