NGINX is a high-performance web server that, when set up correctly, handles high concurrency with relatively low resource usage. It also comes with a large number of first- and third-party modules that provide a rich set of functionalities.
In this article, we'll start by providing a typical configuration template, which will help you write your own. Then we'll dissect the document step-by-step, explaining each of its parts.
Throughout the document, we assume that a) you are running Linux or macOS and b) you have already installed NGINX on your machine or are running a Docker image with the software. Otherwise, please have a quick look at how to install NGINX, as this will not be covered here.
A closer look at NGINX
Before diving into the configuration steps, let's take a closer look at how NGINX works, as this will prove useful for understanding some configuration options. At its core, NGINX handles requests asynchronously, with a single process potentially serving multiple requests concurrently. It is therefore particularly efficient at serving static content (such as HTML files), and it hands off dynamic content requests to other servers, acting as a reverse proxy (for instance, with Python, NGINX works well with both uWSGI and Gunicorn).
Another point worth mentioning is that NGINX uses a master-worker architecture: the master process is mainly responsible for reading the configuration file and spawning child processes (among them, the worker processes). Once NGINX is running, the master process dispatches incoming jobs to the single-threaded, independent worker processes, which perform the operations required to fulfill each request (handling network connections, reading from and writing to disk, and so on).
To control how NGINX behaves, you modify the configuration file named nginx.conf, typically located in the /etc/nginx directory. This is exactly the file we will focus on.
Understanding the main configuration concepts
The configuration file is built upon two main concepts: contexts and directives. Contexts are sections/blocks where instructions can be set, such as the http {} and server {} contexts. Contexts can be nested and inherit options from their parents; the outermost context, referring to the configuration file itself, is called the 'main' context. Directives, on the other hand, are the specific configuration options that let you customize the server's behavior. They typically consist of a name and a value, for example server_name my_domain.com;. NGINX uses 3 main types of directives:
- Standard directive: can only be declared once per context (for example the root directive). When declared within a given context C, a standard directive is passed down to all of C's child contexts unless a child overrides it by declaring its own value for the directive.
- Array directive: can be declared multiple times within the same context, in which case each declaration appends its value to the previous ones (hence the name). One of the most widely used array directives is access_log. In terms of inheritance, array directives behave like standard directives: their value is passed down to all child contexts unless overridden.
- Action directive: invokes a specific action, such as the rewrite or return directives. These cannot be inherited as they stop the flow of execution.
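A minimal sketch of how standard and array directives inherit (the paths here are illustrative, not recommendations):

```nginx
http {
    root /var/www/default;               # standard directive, inherited below
    access_log /var/log/nginx/all.log;   # array directive, also inherited

    server {
        # inherits root /var/www/default and the access_log above
        listen 80;

        location /special {
            root /var/www/special;       # overrides the inherited root here only
            # redeclaring access_log in a child context replaces the
            # inherited value; declaring it twice in the SAME context appends
        }
    }
}
```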
NGINX configuration
Template file
With all of that in mind, we are now ready to walk through a typical configuration file and understand the main directives you can use. Here is the template we will look at (do not worry if it looks like gibberish, everything will hopefully become much clearer soon).
user www-data;
worker_processes auto;

events {
    worker_connections 4096;
}

http {
    server {
        access_log /var/log/nginx/access.log;
        error_log /var/log/nginx/error.log;

        root /usr/share/nginx/html;

        gzip on;
        gzip_vary on;
        gzip_comp_level 4;
        gzip_min_length 1024;
        gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml;

        listen 80;

        location /p1 {
            index page1.html;
        }

        location = /p2 {
            access_log /var/log/nginx/p2.access.log;
            index page2.html;
            try_files $uri $uri/ /index.html;
        }

        location / {
            index index.html index.htm;
            try_files $uri $uri/ /index.html;
        }
    }
}
Explanation
The user directive
A configuration file typically starts with a user directive. This directive lives in the main context and therefore affects every context in the config file. It instructs the master process to run its worker process(es) as the specified user. For instance, the following directive sets the user to www-data.
user www-data;
The worker_processes
directive
This directive is very important if you need a performant NGINX server. It is set from the main context and specifies the number of worker processes the master process will create. For instance, with worker_processes 4;, the master will spawn 4 worker processes. However, since each worker is single-threaded and already handles concurrency on its own, it is generally best to set this number equal to the number of CPU cores available on the machine (but feel free to benchmark different values). NGINX offers a straightforward way to do this:
worker_processes auto;
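If you prefer explicit control, you can set the count by hand; on Linux you can additionally pin each worker to a core with the worker_cpu_affinity directive. A hedged sketch for a hypothetical 4-core machine:

```nginx
# explicit alternative to 'auto' on a known 4-core machine
worker_processes 4;

# optionally pin each worker to its own core (Linux only);
# each bitmask selects one CPU for one worker
worker_cpu_affinity 0001 0010 0100 1000;
```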
The events context and worker_connections directive
We now get to the first context in our configuration file: events. It lets you set global directives related to connection handling. Typically, this context contains the worker_connections directive, which sets the maximum number of simultaneous connections per worker process. This value has to be set with care: too high a value can cause excessive context switching and waste resources, while too low a value limits the number of concurrent connections the web server can handle, hurting the website's performance.
Indeed, the total number of connections the web server can handle is simply the product of the worker_connections and worker_processes values. For example, with the following configuration, we get 2 * 4096 = 8192 simultaneous connections:
worker_processes 2;

events {
    worker_connections 4096;
}
To help you choose this value, on Linux or macOS you can run the command ulimit -n. It outputs the maximum number of open file descriptors per process, which acts as an upper bound for the number of worker connections you can use. For example, if the command outputs 256, you should not set the worker_connections directive to more than 256.
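If that limit turns out to be too low, NGINX can raise it for its own workers with the worker_rlimit_nofile directive (set from the main context). A minimal sketch, where 8192 is an illustrative value rather than a recommendation:

```nginx
# raise the per-worker open-file limit so worker_connections can be higher
worker_rlimit_nofile 8192;

events {
    worker_connections 4096;
}
```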
The http context
This is where all of our HTTP server directives live: it typically contains other contexts such as server or location, which we'll discuss in more detail later.
The server context
This context is nested within the http context and holds the configuration for a virtual server. A configuration file often contains several declarations of this context, each one defining a specific virtual server to handle incoming requests.
It very often contains the server_name and listen directives. The first defines the name(s) of the server and acts somewhat like a "fingerprint" of the server: when multiple server contexts are defined, NGINX parses the Host header of each client request and matches it against the server_name values. This way, it can route the request to the relevant server.
The listen directive, on the other hand, sets the port to listen on: in our example, port 80, the default port for HTTP.
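A minimal sketch of name-based virtual hosting, assuming example.com and blog.example.com are our (hypothetical) domains:

```nginx
http {
    server {
        listen 80;
        server_name example.com www.example.com;  # hypothetical domains
        root /var/www/main;
    }

    server {
        listen 80;
        server_name blog.example.com;             # a second virtual server
        root /var/www/blog;
    }
}
```

Requests whose Host header is blog.example.com are routed to the second server block; everything else matching the first names lands in the first.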
The access_log and error_log directives
These specify where to store our access and error logs. Indeed, NGINX automatically keeps logs about client requests (controlled by access_log) and about the issues it encounters (controlled by error_log).
They can be used multiple times within the configuration file, with child contexts overriding the inherited value. Therefore, in our example file, all requests to the /p2 URL will be logged to /var/log/nginx/p2.access.log only (and NOT to /var/log/nginx/access.log). To log the requests in both locations, we can leverage the fact that access_log is an array directive and simply change the context to:
location = /p2 {
    access_log /var/log/nginx/p2.access.log;
    access_log /var/log/nginx/access.log;
    index page2.html;
    try_files $uri $uri/ /index.html;
}
This appends the two values and logs the requests both to /var/log/nginx/p2.access.log and /var/log/nginx/access.log.
Finally, if we wish to disable logging, we can simply use the directive access_log off;.
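access_log can also reference a custom format defined with the log_format directive in the http context. A hedged sketch — the format name timed is our own choice, while the $-variables are built-in NGINX variables:

```nginx
http {
    # 'timed' is a hypothetical format name; it adds the request
    # processing time to the standard fields
    log_format timed '$remote_addr - [$time_local] "$request" '
                     '$status $body_bytes_sent $request_time';

    access_log /var/log/nginx/access.log timed;
}
```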
The root directive
This directive indicates the root path from which NGINX serves static files. For example, if we get a request for /images/nginx.png, NGINX will look for it under the path specified in the root directive and, in our case, serve the /usr/share/nginx/html/images/nginx.png file (if it can find it, of course).
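Note that root can also be set per location, and that the related alias directive replaces the matched prefix instead of appending the full URI to the path. A sketch with illustrative paths:

```nginx
location /static/ {
    # root APPENDS the URI: /static/app.css -> /var/www/assets/static/app.css
    root /var/www/assets;
}

location /media/ {
    # alias REPLACES the prefix: /media/cat.png -> /var/www/uploads/cat.png
    alias /var/www/uploads/;
}
```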
gzip
Using gzip turns on compression, a fairly simple way to boost performance in general. Indeed, with compression, the server sends smaller responses to the client, making pages load faster. The gzip on directive does just that: it enables gzip compression. However, not all clients and intermediaries support gzip, so we also add the gzip_vary on; directive, which makes NGINX send a "Vary: Accept-Encoding" response header so that caches keep separate compressed and uncompressed variants and serve each client the version it can handle.
Still, compression is by no means a cure-all. It comes with an important trade-off: compressing a file can require significant CPU resources on the server (and decompressing it requires resources on the client side), which can slow the overall handling of the request and hurt performance. To mitigate this, we have two main techniques:
- Control the compression level with the gzip_comp_level 4; directive. It takes a value between 1 and 9, 1 being the fastest, lightest compression and 9 the maximum. I would advise keeping it around 4, as this seems to yield the best trade-off.
- Only compress large files. Once again, NGINX offers a simple way to do so: adding gzip_min_length 1024; for example sets the minimum response size to 1024 bytes. Responses smaller than this threshold are sent uncompressed, while the remaining ones are gzipped before traveling over the network.
Finally, the gzip_types directive simply selects the MIME types of the responses we want NGINX to compress.
The location context
This is definitely the context you will encounter the most and, just like server contexts, we can (and often do) use multiple location contexts. You can think of this context as a way to intercept incoming requests based on their URIs and handle them accordingly.
For example, in the snippet below, we log the request and then return the string "this is a test :)" with a 200 HTTP status code using the return directive.
location = /p2 {
    access_log /var/log/nginx/p2.access.log;
    return 200 "this is a test :)";
}
There are actually several ways of matching a URI within a location context:
- Prefix match: looks for URIs starting with the specified value. For example, location /p1 would catch both /p1 and /p12, since both start with the /p1 prefix.
- Exact match: looks for an exact URI. To do so, we simply add a = sign right after the location keyword, as in location = /p2. This location context would then only catch /p2, not /p22.
- Regex match: matches URIs using regular expressions. To achieve this, we add a ~ character. For example, location ~ /hello[0-9] would match any of the URIs /hello0 through /hello9. Note that the ~ modifier is case-sensitive; for a case-insensitive regex match, use the ~* modifier instead.
Finally, when using multiple location contexts with different matching styles, a URI may be matched by several of them; NGINX then picks one according to its priority rules. Roughly: an exact (=) match wins outright; otherwise regex locations are checked in the order they appear and the first match is used; if no regex matches, the longest matching prefix location is chosen. To keep this article relatively short, we will not cover every subtlety of these rules here.
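An illustrative sketch of these priority rules (the paths are hypothetical):

```nginx
server {
    listen 80;

    location = /report {          # 1. exact match: wins, but only for /report
        return 200 "exact\n";
    }

    location ~ \.png$ {           # 2. regex: wins for e.g. /img/a.png
        return 200 "regex\n";
    }

    location /img/ {              # 3. prefix: used when no regex matches,
        return 200 "prefix\n";    #    e.g. for /img/readme.txt
    }
}
```

Here a request for /img/a.png is answered by the regex location even though the prefix location also matches.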
The index directive
This directive tells NGINX which file to serve as the index, i.e. the file returned when the request URI points to a directory (such as /) rather than to a specific file.
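For instance, with the sketch below (paths illustrative), a request for / is answered with /var/www/html/index.html if that file exists, and with home.html otherwise:

```nginx
server {
    listen 80;
    root /var/www/html;           # illustrative path

    # candidates tried in order when a directory is requested
    index index.html home.html;
}
```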
The try_files directive
try_files is a useful directive that instructs NGINX to test a sequence of URIs and serve the first one it can find. It can be placed in a server or location context and takes as parameters a list of one or more files and directories, followed by a final fallback URI.
For example, in our configuration file, we tell NGINX to first try to serve the URI exactly as it appears in the incoming request, if it can find it relative to the root directory ($uri is a built-in variable holding the client request URI). Otherwise, it does the same with the second value ($uri/, the URI treated as a directory). Finally, if neither exists, it falls back to serving the default page, /index.html. This means that if we request /p2 and page2.html does not exist, NGINX will fall back and serve us the index.html file instead of a 404 page.
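If you would rather return an error status than a fallback page, try_files also accepts a special =code final parameter. A minimal sketch:

```nginx
location / {
    # serve the file, then the directory, otherwise answer 404 directly
    try_files $uri $uri/ =404;
}
```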
Wrapping up
This brings us to the end of this post; hopefully, nginx.conf files are now clearer to you.
However, please keep in mind that this article by no means covers all the configuration options; it rather dissects some of the most widely used ones. For example, we did not discuss how to enable SSL/TLS to serve our content over HTTPS, nor did we talk about using NGINX as a reverse proxy to serve dynamic content processed by a backend. We will probably cover these aspects in a future article, but you should already be able to get going and start writing your own config files!
PS: please feel free to add any comments/remarks about the article, every opinion is highly welcome :)