How to Schedule a Python Script with Django Context on AWS Elastic Beanstalk

Last updated: Apr 22, 2019

Recently, I worked on a Django project that was already deployed on Elastic Beanstalk. A new feature required a Python script that connects to the same database as the Django app and reuses a few functions already written for it. The script had to run as a cron job exactly once a day.

Elastic Beanstalk makes it easy to deploy a Django project: we can use the AWS CLI or upload a zip file through the console. Scheduling a script that needs Django context is a little more involved. In this post, I explain how to write such a script and how to run it as a cron job in an Elastic Beanstalk environment.

Let’s create a simple Python script first.

def main():
    print("Hello world")


if __name__ == "__main__":
    main()

Put this script in your Django project, for example in a ‘scripts’ folder in the project root.

Now, let’s use Django context in this script and name the file daily_report.py.

from django.conf import settings

# This example assumes a User model in an app named 'users'.
from users.models import User
from users.utils import get_registered_users


def send_new_registered_emails_to_admin(registered_users):
    recipients = settings.DAILY_STATS_RECEIPENT_EMAIL_LIST
    # TODO: add logic to send the list of registered users to recipients
    pass


def main():
    registered_users = get_registered_users()
    send_new_registered_emails_to_admin(registered_users)


if __name__ == "__main__":
    main()
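
The TODO in send_new_registered_emails_to_admin is left open above. As one possible way to fill it, here is a minimal sketch that builds the subject and body using only the standard library. Note that build_daily_report is a hypothetical helper, not part of the original project, and it assumes the registered users have already been reduced to a list of email address strings.

```python
from datetime import date


def build_daily_report(emails, report_date):
    """Return (subject, body) for the daily registration email.

    Hypothetical helper: `emails` is assumed to be an iterable of
    email address strings for users registered on `report_date`.
    """
    emails = sorted(set(emails))  # de-duplicate and keep a stable order
    subject = f"New registrations on {report_date.isoformat()}: {len(emails)}"
    if emails:
        body = "\n".join(f"- {email}" for email in emails)
    else:
        body = "No new users registered."
    return subject, body


if __name__ == "__main__":
    subject, body = build_daily_report(
        ["b@example.com", "a@example.com"], date(2019, 4, 22)
    )
    print(subject)
    print(body)
```

Inside the Django script, the resulting subject and body could then be handed to django.core.mail.send_mail together with a sender address and the recipient list from settings.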

Configuring Django Script for Cron Execution

If I run the above script now, it throws an error saying that Django settings are not configured. To use utility functions, models, and the Django ORM, we need to configure Django inside the script. That means adding the project's root directory to Python's module search path (sys.path), setting the DJANGO_SETTINGS_MODULE environment variable, and calling django.setup(). Add the following lines at the top of the script, before the imports of models and utilities:

import os
import sys

import django

# Add the project root (the parent of scripts/) to Python's module search path.
if __name__ == "__main__" and __package__ is None:
    sys.path.append(
        os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "project_name.settings")
django.setup()  # load settings and the app registry before importing models
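
To see what the two nested os.path.dirname calls resolve to, here is a small sketch. The helper name project_root_for is made up for illustration, and posixpath is used instead of os.path only so the example gives the same result on any operating system. The sample path assumes the Elastic Beanstalk layout described later in this post.

```python
import posixpath


def project_root_for(script_path):
    """Return the directory two levels above `script_path`.

    For <root>/scripts/daily_report.py this is <root>, the folder that
    must be on sys.path so that `project_name.settings` can be imported.
    """
    return posixpath.dirname(posixpath.dirname(script_path))


if __name__ == "__main__":
    # Example path only; the real script uses os.path.abspath(__file__).
    print(project_root_for("/opt/python/current/app/scripts/daily_report.py"))
```

The first dirname strips the file name and leaves the scripts folder; the second strips scripts and leaves the project root.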

Creating the .ebextensions Configuration File

Now we can run this script successfully. The next step is to add configuration to the project so that the script runs as a cron job in the Elastic Beanstalk environment. For that, we need to create a config file under the .ebextensions folder.

Setting Up the Cron Schedule

Configuration files are YAML formatted documents with a .config file extension that you place in a folder named .ebextensions and deploy in your application source bundle.

Let’s call it my_cron.config with the following content as per AWS docs.

files:
    "/usr/local/bin/check_leader_only_instance.sh":
        mode: "000755"
        owner: root
        group: root
        content: |
            #!/bin/bash
            INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id 2>/dev/null`
            REGION=`curl -s http://169.254.169.254/latest/dynamic/instance-identity/document 2>/dev/null | jq -r .region`

            # Find the Auto Scaling Group name from the Elastic Beanstalk environment
            ASG=`aws ec2 describe-tags --filters "Name=resource-id,Values=$INSTANCE_ID" \
                --region $REGION --output json | jq -r '.[][] | select(.Key=="aws:autoscaling:groupName") | .Value'`

            # Find the first instance in the Auto Scaling Group
            FIRST=`aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names $ASG \
                --region $REGION --output json | \
                jq -r '.AutoScalingGroups[].Instances[] | select(.LifecycleState=="InService") | .InstanceId' | sort | head -1`

            # If the instance ids are the same exit 0
            [ "$FIRST" = "$INSTANCE_ID" ]

    "/usr/local/bin/my_cron_script.sh":
        mode: "000755"
        owner: root
        group: root
        content: |
            #!/bin/bash
            /usr/local/bin/check_leader_only_instance.sh || exit
            # Now run commands that should run on only 1 instance.

    "/etc/cron.d/daily_cron":
        mode: "000644"
        owner: root
        group: root
        content: |
            0 0 * * * root /usr/local/bin/my_cron_script.sh

commands:
  rm_old_cron:
    command: "rm -fr /etc/cron.d/*.bak"
    ignoreErrors: true

With the above configuration, we add three files to the server running our EB environment.

  1. /etc/cron.d/daily_cron: The cron file that contains the schedule of the cron job and the command to execute. In our example, it calls another shell script named my_cron_script.sh.
  2. check_leader_only_instance.sh: A shell script that checks whether the server it runs on is the leader server. It is only needed when you have multiple servers behind a load balancer.
  3. my_cron_script.sh: This shell script first calls check_leader_only_instance.sh and exits if the server is not the leader. After this check, we can add the command that runs our Django script.
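
The leader check in check_leader_only_instance.sh boils down to "am I the lowest in-service instance ID in my Auto Scaling Group?". The following Python sketch restates that logic for readers who find the shell pipeline hard to follow. The function name is_leader and the sample instance IDs are made up for illustration; the dictionaries are assumed to be shaped like the Instances entries in the describe-auto-scaling-groups output that the shell script filters with jq.

```python
def is_leader(instance_id, instances):
    """Mirror the shell check: the leader is the lowest in-service instance ID.

    `instances` is assumed to be a list of dicts with "InstanceId" and
    "LifecycleState" keys, as in the Auto Scaling Group description.
    """
    in_service = sorted(
        inst["InstanceId"]
        for inst in instances
        if inst["LifecycleState"] == "InService"
    )
    # With no in-service instances, nobody is the leader.
    return bool(in_service) and in_service[0] == instance_id


if __name__ == "__main__":
    # Made-up instance IDs for illustration.
    group = [
        {"InstanceId": "i-0bbb", "LifecycleState": "InService"},
        {"InstanceId": "i-0aaa", "LifecycleState": "Terminating"},
        {"InstanceId": "i-0ccc", "LifecycleState": "InService"},
    ]
    print(is_leader("i-0bbb", group))
    print(is_leader("i-0ccc", group))
```

Because every instance sorts the same list the same way, exactly one in-service instance considers itself the leader at any given time, so the daily job runs once rather than once per server.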

Activating Virtual Environment and Running Django Script

Now, let’s see the command to execute the Django script. We need to activate a virtual environment or point to Python in the virtual environment folder.

/path/to/venv/bin/python /path/to/folder/daily_report.py

Okay, so you must be wondering how to activate a virtual environment and what is the path to Python on a server running and managed by EB. Here is the answer:

source /opt/python/run/venv/bin/activate
source /opt/python/current/env
# /opt/python/current/app is the root folder of the source code you uploaded to Elastic Beanstalk
cd /opt/python/current/app
python scripts/daily_report.py

Now, let’s add these commands to our config file so that it is ready to deploy. The final config file should look like this:

files:
    "/usr/local/bin/check_leader_only_instance.sh":
        mode: "000755"
        owner: root
        group: root
        content: |
            #!/bin/bash
            INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id 2>/dev/null`
            REGION=`curl -s http://169.254.169.254/latest/dynamic/instance-identity/document 2>/dev/null | jq -r .region`

            # Find the Auto Scaling Group name from the Elastic Beanstalk environment
            ASG=`aws ec2 describe-tags --filters "Name=resource-id,Values=$INSTANCE_ID" \
                --region $REGION --output json | jq -r '.[][] | select(.Key=="aws:autoscaling:groupName") | .Value'`

            # Find the first instance in the Auto Scaling Group
            FIRST=`aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names $ASG \
                --region $REGION --output json | \
                jq -r '.AutoScalingGroups[].Instances[] | select(.LifecycleState=="InService") | .InstanceId' | sort | head -1`

            # If the instance ids are the same exit 0
            [ "$FIRST" = "$INSTANCE_ID" ]

    "/usr/local/bin/my_cron_script.sh":
        mode: "000755"
        owner: root
        group: root
        content: |
            #!/bin/bash
            /usr/local/bin/check_leader_only_instance.sh || exit
            source /opt/python/run/venv/bin/activate
            source /opt/python/current/env
            cd /opt/python/current/app
            python scripts/daily_report.py

    "/etc/cron.d/daily_cron":
        mode: "000644"
        owner: root
        group: root
        content: |
            0 0 * * * root /usr/local/bin/my_cron_script.sh

commands:
  rm_old_cron:
    command: "rm -fr /etc/cron.d/*.bak"
    ignoreErrors: true
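
The entry in /etc/cron.d/daily_cron packs the whole schedule into one line: five time fields, then the user to run as, then the command. As a rough illustration, the sketch below splits that line into named parts. The helper parse_cron_d_line is hypothetical and only handles this simple layout, not every syntax cron accepts.

```python
def parse_cron_d_line(line):
    """Split an /etc/cron.d entry into its schedule, user, and command.

    Files in /etc/cron.d use five time fields, then a user name,
    then the command to run.
    """
    parts = line.strip().split(None, 6)
    if len(parts) < 7:
        raise ValueError("expected 5 time fields, a user, and a command")
    minute, hour, day_of_month, month, day_of_week, user, command = parts
    return {
        "minute": minute,
        "hour": hour,
        "day_of_month": day_of_month,
        "month": month,
        "day_of_week": day_of_week,
        "user": user,
        "command": command,
    }


if __name__ == "__main__":
    entry = parse_cron_d_line("0 0 * * * root /usr/local/bin/my_cron_script.sh")
    for name, value in entry.items():
        print(f"{name}: {value}")
```

Minute 0 and hour 0 with wildcards in the remaining time fields means the job fires once a day at midnight in the server's time zone, which is typically UTC on these instances unless it has been changed.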

That’s it. Thanks for reading.


Authors

Hiren Patel

Software Engineer
My skills include full-stack web development, and I can also work on deployment and monitoring of web applications on a server. My experience covers: Back-end: Python scripting; Django REST framework for building REST APIs; writing unit tests and automating build testing before deployment. Front-end: web application development with Angular 2/4/6; challenging UI development with HTML5 and CSS3; CSS, jQuery, and SVG animations. Other: AWS EC2, S3, RDS, ECS, CI/CD with AWS; Jenkins.
