Introduction
I notice a lot of people these days using Heroku to host applications they’ve written with Django or Rails. This may save them some time and effort, but they are spending way more money on this specialized hosting than they would pay for a generic Linux host. They are also giving up control of a large portion of their application. I consider the entire stack to be part of my application, and complete control over all the pieces is mandatory.
Because I encourage people to administer their own servers, I have written this tutorial to remove just one more of their excuses not to do so. This tutorial covers everything you need to do to configure a blank Ubuntu Linux 12.04 LTS 64-bit server to run a Django application properly. I will not be covering anything about writing the application itself. I will also only provide minimal discussion of some topics, like database configuration. If you can finish this tutorial, you can Google the necessary documentation for other things. The only thing I completely skip is configuration of search using Haystack and Solr since it is not needed by most applications.
If you want to use a different Linux distribution, most of this tutorial will still be correct. You will have to translate all of the apt-get statements and package names to your distribution of choice. Configuration file paths and layouts may also differ. I chose Ubuntu 12.04 because it is very popular, available on almost every host, and its long term support will keep this tutorial relevant for a longer period of time.
Formatting
In this tutorial all commands and lines of code that you should actually enter are displayed in preformatted text. Within the preformatted text there will be lines that begin with an octothorpe (#). Those lines contain my personal comments on the surrounding lines, and should not be typed. Lines beginning with the dollar symbol ($) are commands that should be typed on the shell. In cases where there are variables that you should replace with appropriate values, I will surround the variable with <> symbols and name it with all capital letters.
# This text is describing the command below it. Don't type it.
$ this is a shell command, type it
# Type the following command, but replace the <USER> with your username
$ sudo adduser <USER>
In cases where edits must be made to a file I will begin that section with a shell command to open the appropriate file in vim, which is my text editor of choice. You may use whichever text editor you wish. I will also describe the edits to be made in the comment text.
$ vim test.txt
# add this line to the end of the file
TEST = True
Tutorial
Beginning Steps
Begin by installing Ubuntu on a machine you own, or getting a new machine from a hosting provider. I recommend Linode or Amazon EC2. If you choose EC2 the Ubuntu AMI finder should be your starting point.
Once the machine is installed, you will need to log in, most likely using SSH. I cannot recommend strongly enough that you use SSH Agent Forwarding. I will not cover how to set it up in this tutorial, as it varies based on your client machine. If you haven’t done this already, do it now.
Once you are logged into the machine, you will probably notice it is telling you that a lot of packages need updating. You should update packages whenever they need updating, especially if there are security patches. Just be aware this may disrupt your site for a few minutes. If your site were important enough that a few minutes of downtime were a problem, you would not need this tutorial. Here is how you update all packages on an Ubuntu or Debian system.
# update the local package index
$ sudo apt-get update
# actually upgrade all packages that can be upgraded
$ sudo apt-get dist-upgrade
# remove any packages that are no longer needed
$ sudo apt-get autoremove
# reboot the machine, which is only necessary for some updates
$ sudo reboot
You may notice frequent use of sudo. Sudo is a program that allows us to execute other commands with root privileges. If you are bothered by having to type your password when using sudo, you can edit your sudoers file to make it easier in the future.
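For example, a rule like the following, added safely through visudo, would let members of the sudo group run sudo without a password prompt. This is just one possible configuration, and you should weigh the security tradeoff before using it.

# open the sudoers file with the safe editor wrapper
$ sudo visudo
# example rule: members of the sudo group may run any command without a password
%sudo ALL=(ALL:ALL) NOPASSWD: ALL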
distribute and pip
Many of the Python packages we need are available through Ubuntu and apt-get. However, we do not want to use them. The packages provided by Ubuntu are very likely to be old and out of date. Also, it is only possible to install them globally, whereas we would like to install them in an isolated fashion. If we ran two sites on this server that used different versions of Django, this method would not work.
Many of the Python packages are wrappers around C libraries that will need to be available for linking. Those C libraries must be installed using apt-get before we install the associated Python packages. There are two such packages that are needed for just about everything, and should be installed now. They are build-essential and python-dev. Build-essential is the Ubuntu package which provides all the standard C compilation tools. Python-dev provides the necessary files for compiling Python/C modules.
$ sudo apt-get install build-essential python-dev
Now, there are various tools to install Python packages, and there is much drama and debate in the community about them. I will ignore all that, and say to use Distribute and pip. I have used them for years now, and have had no issues at all.
# download distribute
$ curl -O http://python-distribute.org/distribute_setup.py
# install distribute
$ sudo python distribute_setup.py
# remove installation files
$ rm distribute*
# use distribute to install pip
$ sudo easy_install pip
Virtualenv(wrapper)
Now that we have used distribute to install pip, we will use pip to install every other Python package. This solves the problem of getting packages, but it does not solve the problem of isolating them. For that we must use virtualenv. Because virtualenv itself is a little clumsy, we will also use virtualenvwrapper to make things a little easier on ourselves.
# install virtualenv and virtualenvwrapper
$ sudo pip install virtualenv virtualenvwrapper
# edit the .bashrc file
$ vim .bashrc
# to enable virtualenvwrapper add this line to the end of the file
source /usr/local/bin/virtualenvwrapper.sh
# save and quit your editor
# exit and log back in to restart your shell
$ exit
You will notice after exiting and logging back into the shell that virtualenvwrapper will do a bunch of magic. That is a one-time setup that you will never see again. If you don’t see a bunch of stuff happen, then there is a problem you should fix.
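If nothing happens, a quick sanity check (assuming the install locations from above) is to confirm that the script you sourced actually exists and that the first run completed:

# confirm the script is where .bashrc expects it
$ ls /usr/local/bin/virtualenvwrapper.sh
# after a successful first run this directory will exist
$ ls ~/.virtualenvs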
Now, how do you use virtualenv? You create as many virtualenvs as you like, each with a unique name. Only one of these virtualenvs will be active at a given time. Any time you use pip the packages will only be installed into the active virtualenv. In order for a Python program to make use of those Python packages, it must be executed with that virtualenv activated. Let’s create a virtualenv for our project and activate it. You should just keep it active for the rest of this tutorial for ease. You can tell when a virtualenv is active because it will appear in your shell prompt.
# create a virtualenv, I usually give it the same name as my app
$ mkvirtualenv <VIRTUALENV_NAME>
# The virtualenv will be activated automatically.
# You can deactivate it like this
$ deactivate
# to activate a virtualenv, or change which one is active, do this
$ workon <VIRTUALENV_NAME>
One very helpful command you can use now is pip freeze. Pip freeze lists all installed Python packages and their versions. It will let you easily see what has been installed. Try these commands out. You will see just how many packages have already been installed system-wide by Ubuntu. You will also see that your virtualenv is isolated from the rest of the system, and doesn’t have much installed in it yet.
$ deactivate
$ pip freeze
$ workon <VIRTUALENV_NAME>
$ pip freeze
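As a side note, pip freeze is also a handy way to record your project’s dependencies so they can be reinstalled elsewhere. These two commands are standard pip usage:

# save the active virtualenv's packages to a requirements file
$ pip freeze > requirements.txt
# later, recreate the same set of packages in another virtualenv
$ pip install -r requirements.txt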
Django
Installing Django is very simple. Make sure the virtualenv is activated. This is the last time I will remind you to activate.
# install the newest version of Django
$ pip install django
# also install docutils, which is used by the django admin
$ pip install docutils
# If you want to test django, start a new project and run it
$ django-admin.py startproject <APP_NAME>
$ cd <APP_NAME>
# make manage.py executable
$ chmod +x manage.py
# start the dev server
$ ./manage.py runserver 0.0.0.0:8000
PIL / Pillow
PIL is the Python Imaging Library. It is needed if your application is going to do anything at all involving images. Even if you simply allow users to upload images, you will need it. There is a drop-in replacement available called Pillow. I’ve never had problems with either one, but some people have problems with PIL. If you don’t know which to pick, just choose Pillow. Whichever one you choose, you will need the same set of prerequisite C libraries. If you don’t install these, then PIL might fail when a user tries to upload a jpg, or when you try to perform certain image operations.
# install libraries
$ sudo apt-get install libjpeg8-dev libfreetype6-dev zlib1g-dev
# choose ONE of these
$ pip install pillow
$ pip install pil
Databases MySQL / PostgreSQL
It’s up to you to choose your database. I prefer MySQL because it is easier. The Django devs prefer PostgreSQL because it is more robust. You can learn to configure the databases using their respective documentation. It’s also up to you to edit the Django settings.py file to point at the appropriate database server. If you choose MySQL, you should configure it so that utf8 is always the default character set, utf8_general_ci is the default collation, and that InnoDB is the default storage engine.
# do this to go with Postgres
$ sudo apt-get install postgresql libpq-dev
$ pip install psycopg2
# do this to go with MySQL
$ sudo apt-get install mysql-server libmysqlclient-dev
$ pip install MySQL-Python
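If you go with MySQL, the defaults I mentioned can be set in the MySQL configuration. As a sketch, on Ubuntu 12.04 these options would go under the [mysqld] section of /etc/mysql/my.cnf:

$ sudo vim /etc/mysql/my.cnf
# add these under the [mysqld] section
character-set-server = utf8
collation-server = utf8_general_ci
default-storage-engine = InnoDB
# then restart MySQL to pick up the changes
$ sudo service mysql restart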
Whichever database you choose, you will have to create a database and a user for your application to use. Then you will have to make sure that the user has full permissions on that database.
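As a rough sketch of what that looks like for each database (the exact statements are covered in the respective documentation; <DB_NAME>, <DB_USER>, and <DB_PASSWORD> are placeholders):

# MySQL: create a database and user, and grant permissions
$ mysql -u root -p
mysql> CREATE DATABASE <DB_NAME> CHARACTER SET utf8;
mysql> CREATE USER '<DB_USER>'@'localhost' IDENTIFIED BY '<DB_PASSWORD>';
mysql> GRANT ALL PRIVILEGES ON <DB_NAME>.* TO '<DB_USER>'@'localhost';

# PostgreSQL: the equivalent using the packaged admin commands
$ sudo -u postgres createuser -P <DB_USER>
$ sudo -u postgres createdb -O <DB_USER> <DB_NAME>

Then point Django at the database in settings.py, with something like:

$ vim <YOUR_APP>/settings.py
DATABASES = {
    'default': {
        # or 'django.db.backends.postgresql_psycopg2' for Postgres
        'ENGINE': 'django.db.backends.mysql',
        'NAME': '<DB_NAME>',
        'USER': '<DB_USER>',
        'PASSWORD': '<DB_PASSWORD>',
        'HOST': 'localhost',
        'PORT': '',
    }
}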
South
South is a mandatory part of every Django project as far as I am concerned. It manages schema changes to your database about as well as can be expected, and makes life much easier in general.
$ pip install south
# add south to the INSTALLED_APPS in your settings.py
$ vim <YOUR_APP>/settings.py
INSTALLED_APPS = ...
    'south',
...
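Day-to-day use of South then looks something like this, where <DJANGO_APP> is a placeholder for one of the apps in your project:

# create the initial migration for one of your apps
$ ./manage.py schemamigration <DJANGO_APP> --initial
# create south's own tables and apply the migration
$ ./manage.py syncdb
$ ./manage.py migrate <DJANGO_APP>
# later, after changing your models, generate and apply a new migration
$ ./manage.py schemamigration <DJANGO_APP> --auto
$ ./manage.py migrate <DJANGO_APP>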
memcached
Just about every web site can benefit from having memcached. It provides a key/value store kept entirely in memory, which makes it very fast and easy to use. It can be used to cache information to reduce database queries and increase performance. There have been many different libraries and modules for memcached, but PyLibMC is presently the preferred choice.
# install memcached server and library
$ sudo apt-get install memcached libmemcached-dev
# install pylibmc
$ pip install pylibmc
# edit the CACHES setting in your project's settings.py
$ vim <YOUR_APP>/settings.py
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': '127.0.0.1:11211',
    }
}
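To check that the cache is actually working, you can poke at Django’s standard cache API from a shell. Something like:

$ ./manage.py shell
>>> from django.core.cache import cache
>>> cache.set('greeting', 'hello', 30)  # store for 30 seconds
>>> cache.get('greeting')
'hello'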
Asynchronous Task Execution
Almost every site can benefit from asynchronous task execution. Even if your site isn’t big enough to make it a necessity for scaling and performance, why not do it? It’s very easy to set it up, and requires only free software.
What is asynchronous task execution? Let’s say you have a simple contact form. If you call send_email from the contact form view, then the user who submitted the form will not receive an HTTP response until the email has been sent. Their browser will sit there spinning while the server works. It will also tie up one of your web server threads. To make matters worse, if there is an email error, it will return that error to the user.
What you really want to do is send the user to a thank you page immediately after they submit. Then you can send the email some time later without tying up a web server. If there is an error, you can keep retrying it without bothering the user. To do this we use a Python package called celery.
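To give a feel for how this looks in code, here is a hypothetical tasks.py. The exact decorator import varies between celery versions, and the recipient address is made up:

# <APP_NAME>/tasks.py (a hypothetical example)
from celery.task import task
from django.core.mail import send_mail

@task
def send_contact_email(subject, message, from_email):
    # this runs in a celery worker, not in the web request
    send_mail(subject, message, from_email, ['contact@example.com'])

In the contact form view you would then call send_contact_email.delay(subject, message, from_email), which queues the task and returns immediately, so the user gets their thank you page right away.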
Celery requires a few pieces to work. First, it needs celery workers. These are the programs that actually do the work, such as sending the emails. Next it needs a message queuing server. This is where the celery workers look to see if there is any work they should be doing. For this you should use RabbitMQ.
There is also celerycam, which will monitor celery by taking snapshots every few seconds. This will allow you to see the state of what celery is doing from the Django admin interface. Lastly, we need a place to store results if any celery tasks produce them. For that we will just use the existing MySQL or PostgreSQL database.
RabbitMQ
When you install RabbitMQ you also have to create a user and grant them permissions on a virtual host. You can pick whatever username, password, and vhost name you like, but I always use the name of my app for all three.
$ sudo apt-get install rabbitmq-server
$ sudo rabbitmqctl add_user <RABBIT_USER> <RABBIT_PASSWORD>
$ sudo rabbitmqctl add_vhost <RABBIT_VHOST>
$ sudo rabbitmqctl set_permissions -p <RABBIT_VHOST> <RABBIT_USER> ".*" ".*" ".*"
Celery
Installing celery itself is a one-liner, but it requires a lot of modifications to the Django settings file to get it working properly.
$ pip install django-celery
$ vim <YOUR_APP>/settings.py
# add djcelery to INSTALLED_APPS
INSTALLED_APPS = ...
    'djcelery',
...
# add these settings
BROKER_URL = "amqp://<RABBIT_USER>:<RABBIT_PASSWORD>@localhost:5672/<RABBIT_VHOST>"
CELERY_RESULT_BACKEND = "database"
# choose the setting that matches your database of choice
CELERY_RESULT_DBURI = "mysql://<DB_USER>:<DB_PASSWORD>@localhost/<DB_NAME>"
CELERY_RESULT_DBURI = "postgresql://<DB_USER>:<DB_PASSWORD>@localhost/<DB_NAME>"
# put these two lines at the very bottom of the settings file
import djcelery
djcelery.setup_loader()
To test celery you can start up the celery daemon using a management command. If your application is going to have recurring tasks, you should enable events and celerybeat. Otherwise, you can ignore those options.
# start celery with Beat and Events
$ ./manage.py celeryd -B -E
# start celery normally
$ ./manage.py celeryd
# press Ctrl+C to quit celery once you see it is working
gunicorn
When you want to serve a modern Python web application, you use a thing called the Web Server Gateway Interface (WSGI). That is how your application talks to the web server. There are a lot of choices when it comes to serving WSGI applications. Apache with mod_wsgi is the old trustworthy option. uWSGI is also very popular. I personally prefer Gunicorn because it works extremely well, and runs great out of the box without any configuration.
$ pip install gunicorn
# add gunicorn to INSTALLED_APPS
$ vim <YOUR_APP>/settings.py
INSTALLED_APPS = ...
    'gunicorn',
...
Like celery, you can test Gunicorn by running it from a management command. The Gunicorn management command has many command line parameters. You can read about the options in the Gunicorn documentation. Personally, the only options I use are to set Gunicorn to use four workers and to enable gevent. You will see more about this in the next section.
$ ./manage.py run_gunicorn -w 4 -k gevent
# press Ctrl+C to exit Gunicorn once you see it is working
Supervisor
You may have noticed by now that Ubuntu is very clever in the way it handles services. When you install MySQL with apt-get, it is immediately started with a default configuration. The same goes for RabbitMQ, memcached, SSH, and almost every other such server. If the machine reboots, these services will start themselves automatically. We can trust the distribution to take care of this for us.
But we have a problem now. We need to start Gunicorn and celery as services, but Ubuntu won’t help us with things we didn’t install with apt-get. One way to solve this problem is to write upstart or init.d scripts for these services. That will work, but it’s a pain in the ass. Those kinds of scripts are not simple to write, and are worse to maintain. Luckily supervisor exists.
Supervisor is installed with apt-get, so it will start automatically. We give supervisor very simple configuration files for any further services we would like it to manage, and it will start those up for us. It has the added benefit that it can restart any service it is managing in the case of failure. If Gunicorn crashes, it will come right back without us having to do anything. Because of this feature, some people choose to manage all their services under supervisor. I do not do this because it is extra work, and it is very rare that it will help you. If something like MySQL crashes, odds are it will not be successfully or safely restarted by supervisor.
Installing supervisor is as simple as apt-get, but configuring it is another matter. We need to create a configuration file for each service that supervisor manages. These configuration files must be located in /etc/supervisor/conf.d/ and must have names ending in the .conf extension. I have included here the full contents of my three supervisor configuration files, but have only documented the first one, as they are so similar.
$ sudo apt-get install supervisor
# contents of /etc/supervisor/conf.d/celeryd.conf

# the name of this service as far as supervisor is concerned
[program:celeryd]
# the command to start celery
command = /home/<USERNAME>/.virtualenvs/<VIRTUALENV_NAME>/bin/python /home/<USERNAME>/<APP_NAME>/manage.py celeryd -B -E
# the directory to be in while running this
directory = /home/<USERNAME>/<APP_NAME>
# the user to run this service as
user = <USERNAME>
# start this at boot, and restart it if it fails
autostart = true
autorestart = true
# take stdout and stderr of celery and write to these log files
stdout_logfile = /var/log/supervisor/celeryd.log
stderr_logfile = /var/log/supervisor/celeryd_err.log
# contents of /etc/supervisor/conf.d/celerycam.conf
[program:celerycam]
command = /home/<USERNAME>/.virtualenvs/<VIRTUALENV_NAME>/bin/python /home/<USERNAME>/<APP_NAME>/manage.py celerycam
directory = /home/<USERNAME>/<APP_NAME>
user = <USERNAME>
autostart = true
autorestart = true
stdout_logfile = /var/log/supervisor/celerycam.log
stderr_logfile = /var/log/supervisor/celerycam_err.log
# contents of /etc/supervisor/conf.d/gunicorn.conf
[program:gunicorn]
command = /home/<USERNAME>/.virtualenvs/<VIRTUALENV_NAME>/bin/python /home/<USERNAME>/<APP_NAME>/manage.py run_gunicorn -w 4 -k gevent
directory = /home/<USERNAME>/<APP_NAME>
user = <USERNAME>
autostart = true
autorestart = true
stdout_logfile = /var/log/supervisor/gunicorn.log
stderr_logfile = /var/log/supervisor/gunicorn_err.log
You will notice the command for each of the supervisor configurations is strange. Instead of just running manage.py directly, we are specifically calling the python interpreter located inside of our virtualenv. There is no easy way for supervisor to activate a virtualenv the way we do in our shell with the workon command. By specifically using the python interpreter from the virtualenv instead of the system version, celery and gunicorn will have the correct path settings, and will use the packages within the virtualenv. Also, note the celeryd and gunicorn command line parameters we discussed earlier to enable beat, events, gevent, etc.
To make supervisor recognize the new configuration files, we must restart it. It’s also pretty obvious from the example below how to manually control the services that supervisor manages.
# restart supervisor itself
$ sudo service supervisor restart
# restart/stop/start all services managed by supervisor
$ sudo supervisorctl restart all
$ sudo supervisorctl stop all
$ sudo supervisorctl start all
# restart just celeryd
$ sudo supervisorctl restart celeryd
# start just gunicorn
$ sudo supervisorctl start gunicorn
Static and Media
There is a great deal of confusion about static and media in Django. It is partially because the separation of the two was not very clear in earlier releases. This has since been fixed, and the static files system has been completely reworked. While the problem is solved in the current versions, the big changes may have added to the confusion. Allow me to help.
Static files are files that you would put into your repository that are not code. These are almost always css, js, and image files. Your favicon would also be a static file. These are files that you make, that must be served to web browsers, but that do not change unless you update your application.
Media files are files that are also served statically, but they were not made by you. They were uploaded by users. So on a site like YouTube, the YouTube logo in the top left would be a static file, but the actual videos themselves would be media. On Flickr, the css and the login buttons would be static, but the photos uploaded by users are media.
For media files we must simply take the files that users upload, put them in a folder, and then serve the files in that folder. For static files we have a slightly trickier issue, because they are not all in one place. The django admin keeps its static files in one place. You put your static files in another place. Maybe you pip install another third party module that includes some static files. To handle this, you run a management command that collects the static files into a single folder, from which they will be served.
# Create directories to hold static and media
$ sudo mkdir /var/www
$ sudo mkdir /var/www/static
$ sudo mkdir /var/www/media
# set permissions on these directories
$ sudo chown -R <USERNAME>:www-data /var/www
$ vim <YOUR_APP>/settings.py
# these settings are explained below
MEDIA_ROOT = '/var/www/media/'
MEDIA_URL = '/media/'
STATIC_ROOT = '/var/www/static/'
STATIC_URL = '/static/'
# run this whenever static files are changed
$ ./manage.py collectstatic
In the django settings file there are four relevant settings: MEDIA_ROOT, MEDIA_URL, STATIC_ROOT, and STATIC_URL. The ROOT settings are the local directories where the files are located. The URL settings are the URLs where these files will be served. Assume we have a file named css/main.css. Using the above example, the file will be located at /var/www/static/css/main.css after it is collected. It will be served from the URL http://yourdomain.com/static/css/main.css.
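In your templates you would then reference static files through the staticfiles machinery rather than hard-coding paths. With the Django of this era, that looks something like:

{% load staticfiles %}
<link rel="stylesheet" href="{% static 'css/main.css' %}">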
One more benefit of the static system is that you can use pip to add static files to your project. For example, you can pip install django-staticfiles-jquery and django-staticfiles-bootstrap to automatically put jquery and bootstrap into your static files when you run collectstatic.
Nginx
This brings us to the final piece of the puzzle. How exactly are these static files served? Gunicorn handles serving all the dynamically generated pages of our application, but it doesn’t know anything about these static files. The Django runserver will do some magic to handle these files to make development easier, but once you enter production and set DEBUG=False, you no longer have this luxury. That is why we put the Nginx web server out in front of Gunicorn. It will serve all the static files, and also ask Gunicorn to handle any requests it doesn’t know what to do with.
# install nginx
$ sudo apt-get install nginx
Much like Apache, Nginx uses two folders for configuration. There is /etc/nginx/sites-available/ and /etc/nginx/sites-enabled/. You put all your nginx configuration files into the sites-available directory. Then if you want a site to be active, you create a symlink to that file from the sites-enabled directory. You have to restart nginx after any configuration changes. Because Ubuntu starts nginx with a default configuration when you install it, you must first remove the symlink to that default configuration.
# remove the default symbolic link
$ sudo rm /etc/nginx/sites-enabled/default
# create a new blank config, and make a symlink to it
$ sudo touch /etc/nginx/sites-available/<YOUR_APP>
$ cd /etc/nginx/sites-enabled
$ sudo ln -s ../sites-available/<YOUR_APP>
# edit the nginx configuration file
$ vim /etc/nginx/sites-available/<YOUR_APP>
Here is what the nginx configuration file should look like:
# define an upstream server named gunicorn on localhost port 8000
upstream gunicorn {
    server localhost:8000;
}

# make an nginx server
server {
    # listen on port 80
    listen 80;
    # for requests to these domains
    server_name <YOUR_DOMAIN>.com www.<YOUR_DOMAIN>.com;
    # look in this directory for files to serve
    root /var/www/;
    # keep logs in these files
    access_log /var/log/nginx/<YOUR_APP>.access.log;
    error_log /var/log/nginx/<YOUR_APP>.error.log;

    # You need this to allow users to upload large files
    # See http://wiki.nginx.org/HttpCoreModule#client_max_body_size
    # I'm not sure where it goes, so I put it in twice. It works.
    client_max_body_size 0;

    # THIS IS THE IMPORTANT LINE
    # this tries to serve a static file at the requested url
    # if no static file is found, it passes the url to gunicorn
    try_files $uri @gunicorn;

    # define rules for gunicorn
    location @gunicorn {
        # repeated just in case
        client_max_body_size 0;
        # proxy to the gunicorn upstream defined above
        proxy_pass http://gunicorn;
        # makes sure the URLs don't actually say http://gunicorn
        proxy_redirect off;
        # If gunicorn takes > 5 minutes to respond, give up
        # Feel free to change the time on this
        proxy_read_timeout 5m;
        # make sure these HTTP headers are set properly
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
Restart nginx and your server is now running and serving your django application perfectly!
# restart nginx
$ sudo service nginx restart
Conclusion
I’m sure there are going to be errors in this no matter how much I proofread. I also fully admit that there is always more to learn. I welcome any suggested fixes or improvements to any part of this. Just leave a comment below.
I also hope this saves someone some money by letting them run their site on a cheap Linux host instead of a managed host. Most of all I hope it gets some people to view their application as not just the one piece they wrote, but as a whole made up of many separate working parts. And that those other parts are not to be outsourced, but to be embraced. When you control them, you can solve problems with them.
Excellent tutorial!
Nice read. I would add
‘supervisorctl reread’
‘supervisorctl update’
to this tutorial, while talking about supervisor. It would not let me start those till I used these commands.
Worked perfectly on Debian ‘sid’, thanks!
Oh, awesome! I didn’t know about those commands. I was just restarting supervisor entirely using sudo service supervisor restart.
Great tutorial. Thanks for the detailed write up.
When working in a virtualenv (as in your tutorial), PIL doesn’t work any more; that’s where Pillow kicks in and fixes your problems. Internally it’s still PIL, just with a fixed installer.
I’ve heard this, but before I knew about Pillow I always used PIL in virtualenv, and it always worked just fine.
Great Tutorial! Would have saved me the headache from reading outdated tutorials. Definitely going to re-image my server now for this ! :)
This tutorial will eventually be out of date as well.
Awesome article, this describes exactly what I’ve been through for the past few weeks and what I should have written down, if it wasn’t for my procrastination.
Now what would be sugar on top is a fabfile to do all of this automatically on a fresh ec2 instance.
Great job on this. I spent last night deploying on webfaction and then heroku trying to decide which route to go and wondering if I should just use linode or ec2. This pretty much covers my stack. One thing I’m not familiar with on a self managed server is setting up SSL and DNS.
It would be nice to create a wiki for all this so it can stay up to date with best practices.
Went awesomely until I got to gunicorn. It kept telling me that I needed gevent installed, which I guess means that it wasn’t installed automatically with gunicorn. I was installing locally, but I used Ubuntu 12.04 LTS.
Any ideas on how to get gevent installed correctly?
gevent is optional. If you don’t want to use it, you can remove the “-k gevent” part from the supervisor command for gunicorn. If you do want to use it, then you have to sudo apt-get install libevent-dev and then pip install gevent.
Is this all of the configuration if I want to use memcached? Or do I need to configure something else? Thank you.
You can do a more advanced configuration if you want, but what is in the tutorial is all you need as far as server configuration goes. Everything else you need you can learn from the Django cache documentation. https://docs.djangoproject.com/en/dev/topics/cache/
I’m getting a 502 Bad Gateway after restarting nginx. Any suggestions?
Amazing tutorial on setting up a fresh server. Everyone should read this if they are trying to setup a Django site on their own VPS.
Hi Apreche,
I must say it’s excellent! I’ve used this tutorial a couple of times to setup my Django environment both on Debian and even on Raspberry PI (excluding some applications I don’t need) and it was always very useful and, what’s important, every time with success. Thanks a lot for this!
That usually means that nginx is failing when it tries to hit gunicorn. Check the nginx and gunicorn logs in /var/log to see what the real error is.
Thank you very much for this tutorial. I don’t usually comment on blogs (my bad) but your work deserved it. Very good job. I’m learning django thanks to your setup :)
This is such an amazing tutorial.
I had a couple of things I had to iron out since I’m just getting started with all this stuff.
Anyway… for the run_gunicorn I kept getting:
[ERROR] Connection in use: (‘127.0.0.1’, 8000)
I couldn’t figure out what I was doing wrong. It turns out that django is also using port 8000 when in debug mode. There are several ways of changing this I guess. One way is to set DEBUG=False in the settings.py.
Or if this tutorial is you playing around (like it is for me), you might instead want to bind gunicorn to a different port (8000 is default).
In this tutorial this could look something like:
./manage.py run_gunicorn -w 4 -b 8080
It now works well for me while still in debug mode. However it’s definitely not advisable to keep DEBUG=True in a live production site.
Thanks a whole hell of a lot for putting this together. Really helpful for someone new to this stuff trying to set up a reasonably optimal and up-to-date server.
Thanks for the tutorial. What are your thoughts on Tornado instead of gunicorn? Everything else pretty much matches what I want to do!
Tornado is completely different from gunicorn. It’s specifically for use if you are building a webapp that is asynchronous with websockets and such. You shouldn’t use Tornado unless your app absolutely needs it. In general I lean towards using old trustworthy tech instead of hot new things. Yes, new things are a lot more fun to play with, until your site doesn’t work. The fact that I’m even using nginx/gunicorn is a stretch for me. I would feel much safer with apache/mod_wsgi. Nothing is a better test of software quality than the test of time. The LAMP stack is ancient. The fact that it is still around shows how good it is. I can’t yet say the same for other popular trends like MongoDB, Meteor, or nodejs, even though they are really interesting and fun. If they are still popular a few years from now, it will be easier to trust them in production.
Apreche, I can’t thank you enough for this tutorial! It has helped me a lot to date. Just yesterday, I configured an Ubuntu server following these steps. I’m not using a virtualenv on this server. Everything ran fine when I was configuring it. But then, after a restart, none of the services are working. I’m really puzzled. I raised a stackoverflow question about the same. Could you please throw some light on my problem?
Yes, technically speaking virtualenv is not absolutely necessary if you aren’t running any other things on the same server, but why not use it? It’s not like virtualenv costs extra money to use. Just use it. It will work. Your stuff isn’t working because you are deviating from the norm.
Let me tell you about this friend I have. This friend always complains about technology not working. When they complained about their phone not working, I found out they had jailbroken it. When their shell wasn’t working, I found it was because they were using a highly customized zsh. When a video game of theirs wasn’t working, it was because they were using weird beta video card drivers.
The lesson is that you should not deviate. Don’t do anything weird unless you are a super expert. You are only going to break things. Do things the normal way, and it will work. Use virtualenv.
Apreche, I’ve configured it again with the exact steps here using a virtualenv. I’m using environment variables for some settings. I see the same errors even now.
In celeryd_err.log, I see
ImportError: No module named debug_toolbar
In celerycam_err.log also, I see
ImportError: No module named debug_toolbar
In gunicorn_err.log,
Unknown command: ‘run_gunicorn’
Type ‘manage.py help’ for usage.
This looks like an environment variables issue. I have things like DJANGO_SETTINGS_MODULE, DB_HOST, etc in my virtualenv’s postactivate. Do you know of a way to make them available to supervisor?
I have been using Nginx + Apache/mod_wsgi for some time, and now, following this great tutorial of yours, I’ve configured my new server. Everything works fine.
One thing… I am wondering how to add a second site to the server. On my old server I had everything configured in virtual hosts. Here I only know how to set up the statics in the nginx virtual hosts. Thank you
Is there any better way than binding each site to a different port? For example: site one on :8000, site two on :8001, and so on. Regards
Nope. That’s the right way to do it. Run separate gunicorns on separate ports. Then create a separate nginx virtual host for each gunicorn.
Thank you Apreche, all done and running.
Great tutorial !
Excellent tutorial, clearly written using simple language. Thank you for writing this.
Thanks for this tutorial. Saved me so much time and gave me a lot of clarity in terms of organising the app server.
Just set a VPS up for the first time using your tutorial, and things are working flawlessly. Cannot thank you enough for taking the time to put this together.
In case anyone has the same problem: The only issue I ran into was getting a “502: Bad Gateway” message from nginx at the end of the tutorial. At least in my case, the issue was that I had just switched from doing everything debug-style for the first time, so not only had I forgotten to set DEBUG = False, but I also needed to add a string to the ALLOWED_HOSTS list (I just used ‘..org’).
Could this tutorial be automated with Puppet?
Probably
You totally saved my skin! after a whole day of reading outdated mod_wsgi tutorials and documents, this provided the functionality I wanted with very few modifications! Thanks for the attention to detail. Cheers!
I spent the better part of two days trying to get celeryd to talk with rabbitmq, and I finally realized through trial and error that certain characters are not supported and will cause a credentials failure when connecting (I had a $ character in my password that would never work when trying to connect). Just thought I’d mention it before anyone else experienced the same!
Thanks for all the hard work putting this together – it is much appreciated.
The most frustrating part is getting all the individual steps working by themselves, then getting to the end and having a “502 Bad Gateway” error from nginx. I took Ben’s advice and added the changes to the Django settings, but I still get the same error.
The nginx error log shows:
2013/11/03 08:18:44 [error] 6807#0: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 888.222.88.22, server: 999.888.77.66, request: “GET / HTTP/1.1”, upstream: “http://127.0.0.1:8001/”, host: “999.888.77.66”
I then tried to change the nginx config file to look like (actual numbers different of course) this:
upstream gunicorn {
server 999.888.77.66:8001;
}
so it reflects my server’s actual address. The nginx error log shows:
2013/11/03 08:22:54 [error] 6934#0: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 888.222.88.22, server: 999.888.77.66, request: “GET / HTTP/1.1”, upstream: “http://999.888.77.66:8001/”, host: “999.888.77.66”
So, it seems that nginx cannot connect to or access gunicorn? Any pointer, hint or suggestion to make this final piece “fit” would be much appreciated!
Ok, so judging by your fake IP addresses, you are running nginx and gunicorn on completely separate machines, be they physical or virtual, is that correct? It says right in the title of the article that this is a single server tutorial, which is why it isn’t working for you. It’s not a huge difference, it’s just something I did not discuss.
Network daemons like gunicorn or nginx are configured to be bound to a particular network interface. Most computers have two. One is the local loopback 127.0.0.1 and one is the actual network connection. Some servers have multiple network connections. A normal computer could sometimes have three if it is connected to a wired network and a wifi network simultaneously. Each of these interfaces has a different IP address. When a daemon is bound to a particular interface, you will only be able to access it on that interface, and not the others. By default gunicorn, and also the django dev server (runserver) are bound to localhost/127.0.0.1. That means that you will not find them unless you are connecting from the same machine as the one on which they are running, and you are connecting to them using the IP 127.0.0.1.
My guess is that your gunicorn is still bound to 127.0.0.1, so your nginx on another machine can not connect to it. You need to change the gunicorn configuration so it either binds to the outgoing network interface, or all network interfaces on its machine. Then it will be visible from the outside world. Also, you should be making heavy use of curl to test which HTTP services are available to and from various hosts, interfaces, and ports.
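For example, the bind option on the gunicorn management command controls this; something like the following would bind it to all interfaces (the port is just an example):

# bind gunicorn to all network interfaces on port 8000
$ ./manage.py run_gunicorn -w 4 -k gevent -b 0.0.0.0:8000
# then test it from another machine
$ curl -i http://<SERVER_IP>:8000/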
Thanks for the reply!
I am actually running nginx and gunicorn on exactly the same server. You say “You need to change the gunicorn configuration so it either binds to the outgoing network interface, or all network interfaces on its machine.” Where and how would I do that, and how to test that it does in fact connect correctly?
So you mention that if virtualenv doesn’t “do some magic” after you exit and log back into the shell, there is “a problem you should fix”. I followed the directions exactly as you gave them above, and this is a brand new EC2 Ubuntu 12.04 instance, so I’m wondering if you could shed some light on what sort of problems I might be looking for? Thanks for any help.
Hi Scott… I tried redoing this tutorial “from scratch” and end up at the same point. All of the work is done on a single server, which has an IP address in the form 999.888.77.66. After configuring, I end up with the “502 Bad Gateway”, and the error log from nginx reports:
2013/11/09 14:56:14 [error] 13892#0: *1 connect() failed (111: Connection refused) while connecting to upstream,
client: 105.229.83.82,
server: 999.888.77.66,
request: “GET / HTTP/1.1”,
upstream: “http://127.0.0.1:8000/”,
host: “999.888.77.66”
If I try and run curl -i 999.888.77.66 from my machine, I get:
HTTP/1.1 502 Bad Gateway
Server: nginx/1.1.19
Date: Sat, 09 Nov 2013 15:06:10 GMT
Content-Type: text/html
Content-Length: 173
Connection: keep-alive

<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx/1.1.19</center>
</body>
</html>
I also saw from https://library.linode.com/web-servers/nginx/configuration/basic that for access to an IP address I should use:
server_name “”;
To quote: ‘if you set server_name to the empty quote set (“”), nginx will process all requests that either do not have a hostname, or that have an unspecified hostname, such as requests for the IP address itself.’
So I tried that but had the same result… Not sure what else to try here?
Thanks again for any help.
Great tutorial…
Though, I am also getting the 502 Bad Gateway error having run the tutorial from scratch on a virgin EC2 instance using an Alestic Ubuntu image.
Did anyone ever really lock down what seems to be this prevalent issue for people following the tutorial?
# make a symlink inside enabled sites
$ cd /etc/nginx/sites-enabled
$ sudo ln -s ../sites-available/<YOUR_APP>

I think you should leave this step until after filling in the site config. Today I was deploying a site and for a long time it didn’t work because cat /etc/nginx/sites-enabled/<YOUR_APP> returned blank. It worked well after putting in the content and creating the symlink again. Just an observation; I don’t have the time to verify it thoroughly. If anyone is stuck there, it helps to run cat /etc/nginx/sites-enabled/<YOUR_APP> to see if it has all the config that you intended it to have.

This tutorial is awesome, and is exactly the stack that I was looking for when installing an internal Django project for local service on an Ubuntu box at work. I wanted to add a few notes about what changes I made to make it work for my situation:
The biggest change is using environment variables to store certain things such as database user/password, Django secret keys, etc. That strategy is suggested in Two Scoops of Django, which, due to its popularity, makes it a probable situation for others reading this tutorial. Here is the catch: when supervisord runs celery, gunicorn, etc., the environment variables that I set up in ~/.bashrc were not being imported, even though supervisord is ostensibly changing user to the one named in the user= setting.
The way I got around this issue was to put the environment variables in the /etc/supervisor/conf.d/ files with environment=KEY="value",KEY2="value". I then ran into another issue: all of the special characters I had in my SECRET_KEY were throwing off the string parsing in supervisor. I switched to an alphanumeric-only SECRET_KEY, and it ran without a hitch.
Thanks Apreche for the AWESOME tutorial.
Thanks a lot Apreche. I’m really new to the whole world of Python and Django development. I configured the machine on a clean install of Ubuntu 12.04 LTS. My major concern (since I’m new to this whole world) is that it is all command line based. If I’m to use an IDE (currently using Aptana Studio), how should I do it? I’m really confused as to how to get the configuration working with the IDE.
Kindly help.
This tutorial assumes you already know how to properly administer a UNIX/Linux web server, and just need the specifics of setting up the Python/Django relevant portions of the stack. If you do not already have this knowledge, do not set up your own server in a production environment. There are a zillion other things not covered by this tutorial that you don’t even know you need to know. You will cry when your server is compromised due to poor security or crashes due to poor configuration. If you do not plan to acquire these systems administration skills, you should use a managed hosting solution. I strongly recommend you uninstall your IDE forever and fill the gaps in your knowledge.
Thanks for the tip Apreche. Will keep that in mind. I know i can always search on it but any particular links or tips will be really helpful to get me started.