Amir Ravdat, Solutions Engineer, F5
August 24, 2017
We all know that performance is critical to the success of a website. Nobody wants to use a slow website.
Since its first open source release in 2004, NGINX has been synonymous with high-performance websites. NGINX powers more of the world's websites than any other web server, and more than 350 million websites worldwide now run on NGINX. But how well does NGINX really perform? What hardware configurations offer the best performance at a reasonable price?
We previously published performance figures for NGINX and NGINX Plus as a reverse proxy in our NGINX Plus Sizing Guide for Bare Metal Servers, and detailed our testing methodology in NGINX Plus Sizing Guide: How We Tested.
Since these articles were published, we have received many requests for information on the performance of NGINX as a web server, so we went back to the lab. In this blog post, we present detailed performance numbers from our tests: requests per second (RPS) and connections per second (CPS) over live HTTP and HTTPS connections, and HTTP throughput. The results apply to both NGINX Open Source and NGINX Plus (the tests do not rely on features unique to NGINX Plus).
Our goal is for this information to help you determine the hardware specifications you need to handle current and future web application traffic, based on your budget and performance requirements.
The test setup we used is almost identical to the setup in NGINX Plus Sizing Guide: How We Tested, except that there is no reverse proxy between the client and the web server. All tests were performed with two separate machines connected by two 40 GbE links over a simple flat Layer 2 network.
To simulate different CPU counts in testing, we varied the number of NGINX worker processes. By default, NGINX starts as many worker processes as there are CPUs available on the machine it runs on. You can change the number of running NGINX worker processes by setting the worker_processes directive in /etc/nginx/nginx.conf and restarting the NGINX service.
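For example, a minimal nginx.conf fragment might look like this (the value 16 is illustrative, chosen to simulate a 16-CPU machine):

```nginx
# /etc/nginx/nginx.conf (fragment)
# The default value "auto" starts one worker per available CPU;
# setting an explicit number simulates a machine with that many CPUs.
worker_processes 16;
```

After changing the value, restart the NGINX service (or reload the configuration with `nginx -s reload`) for the new worker count to take effect.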
For tests where client traffic was secured with HTTPS, we used the following encryption parameters:
- 2048-bit RSA key
- Perfect Forward Secrecy (as indicated by ECDHE in the cipher name)
- OpenSSL 1.0.1f
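Pulled together, a hypothetical nginx.conf server block matching these parameters might look like the following (the certificate paths and exact cipher string are illustrative, not the test configuration itself):

```nginx
server {
    listen 443 ssl;

    # 2048-bit RSA certificate and private key (placeholder paths)
    ssl_certificate     /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;

    # ECDHE ciphers use an ephemeral key exchange, which provides
    # Perfect Forward Secrecy
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA384;
}
```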
We used the following hardware for both the client machine and the web server:
- CPU: 2× Intel(R) Xeon(R) CPU E5‑2699 v3 @ 2.30 GHz, 36 real cores (or 72 with Hyper-Threading)
- Network: 2× Intel XL710 40 GbE QSFP+ (rev 01)
- Memory: 16 GB
We used the following software for testing:
- Version 4.0.0 of wrk, running on the client machine, generated the traffic that NGINX served. We installed it according to these instructions.
- NGINX Open Source version 1.9.7 ran on the web server machine. We installed it from the official repository at nginx.org, following these instructions.
- Ubuntu Linux 14.04.1 ran on both the client and the web server.
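As an illustration, a wrk run for one test point might look like the following command (the thread and connection counts, duration, and URL are hypothetical examples, not the exact commands from the sizing guide):

```shell
# 36 threads, 500 open connections, 3-minute run,
# fetching a 1 KB file from the web server under test
wrk -t 36 -c 500 -d 180s http://192.0.2.10/1kb.bin
```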
Performance metrics and analysis
We obtained the following performance figures from the tests. For the commands used, see NGINX Plus Sizing Guide: How We Tested.
Requests per second
Requests Per Second (RPS) measures the ability to process HTTP requests. Each request is sent from the client machine to the NGINX web server. Tests were performed for both unencrypted HTTP traffic and encrypted HTTPS traffic.
Following standard performance testing practice, we used four file sizes:
- 0 KB simulates an "empty" HTTP request or response with no data attached.
- 1 KB is roughly the size of very small code files, icons, and image files.
- 10 KB is roughly equal to larger code files, larger icons, and small image files.
- 100 KB represents large code files and other large files.
Issuing small HTTP requests yields more requests per second but lower overall throughput. Issuing large HTTP requests yields fewer requests per second but higher throughput, because a single request initiates a large file transfer that takes a significant amount of time.
RPS for HTTP requests
The table and graph below show RPS for HTTP requests at different numbers of CPUs and different request sizes in kilobytes (KB).
| CPUs | 0 KB | 1 KB | 10 KB | 100 KB |
Larger HTTP requests (the 10 KB and 100 KB sizes in our test) are split across multiple packets and take longer to process. As a result, the graph lines have shallower slopes for the larger request sizes.
When balancing budget against performance, note that the slope of the lines changes as you pass 16 CPUs. Servers with 32 CPUs performed as well as or better than servers with 36 CPUs at the 1 KB and 10 KB request sizes: resource contention eventually outweighs the positive effect of adding more CPUs. This suggests that typical 4-to-8-core server configurations for HTTP traffic benefit greatly from adding CPUs up to a total of 16, benefit less moving to 32, and see little to no benefit moving to 36. As with any benchmark, your mileage may vary.
RPS for HTTPS requests
HTTPS RPS is lower than HTTP RPS for the same provisioned hardware, because the encryption and decryption required to protect data transmitted between machines is computationally expensive.
However, continued advances in Intel architecture, yielding servers with faster processors and better memory management, mean that the performance of CPU-based software encryption continues to improve relative to dedicated hardware encryption devices.
Although RPS for HTTPS is about a quarter lower than for HTTP at 16 CPUs, for the most common file sizes "throwing hardware at the problem" in the form of additional CPUs is more effective for HTTPS than for HTTP, all the way up to 36 CPUs.
| CPUs | 0 KB | 1 KB | 10 KB | 100 KB |
Connections per second
Connections Per Second (CPS) measures the ability of NGINX to establish new TCP connections with clients that have made requests. Clients send a series of HTTP or HTTPS requests, each over a new connection. NGINX parses the requests and sends a 0-byte response for each request. The connection ends after the request is satisfied.
Note: The HTTPS variant of this test is often referred to as SSL transactions per second (SSL TPS).
CPS for HTTP requests
The table and graph show CPS for HTTP requests at different numbers of CPUs.
The graph is similar in shape to f(x) = √x, where x is the number of CPUs in use. As with RPS, CPS growth levels off at around 16 CPUs, and there is a slight drop in performance (here in CPS) when the CPU count increases from 32 to 36.
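As a back-of-envelope illustration of that square-root-like scaling (this is the model shape only, not measured data, and the scale factor is arbitrary):

```python
import math

def modeled_cps(cpus, k=1000.0):
    """Toy model: CPS grows roughly like k * sqrt(cpus).

    k is an arbitrary scale factor for illustration,
    not a constant derived from the test results.
    """
    return k * math.sqrt(cpus)

# Doubling CPUs from 16 to 32 yields only ~41% more CPS under this model,
# consistent with the diminishing returns seen in the graph.
gain = modeled_cps(32) / modeled_cps(16)
print(round(gain, 2))  # → 1.41 (i.e. sqrt(2))
```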
CPS for HTTPS requests
The table and graph show CPS for HTTPS requests. Due to time constraints, we did not run the tests with 32 CPUs.
We see a higher rate of increase in CPS as we add more CPUs; the graph line levels off at around 24 CPUs. For SSL, throwing hardware at the problem works well.
Throughput
These tests measure the HTTP throughput (in Gbps) that NGINX can sustain over a period of 180 seconds.
| CPUs | 100 KB | 1 MB | 10 MB |
Throughput is proportional to the size of the HTTP requests issued by the client machine. NGINX achieves higher throughput with larger files because a single request results in more data being transferred. However, throughput peaks at around 8 CPUs; for bandwidth-heavy workloads, more CPUs are not necessarily an advantage.
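The proportionality is easy to check with back-of-envelope arithmetic (the request rate below is a made-up example, not a measured result from these tests):

```python
def throughput_gbps(rps, response_kb):
    """Approximate throughput from request rate and response size.

    Ignores protocol overhead (headers, TCP/IP framing), so this is
    a lower-bound sketch rather than an exact figure.
    """
    bits_per_response = response_kb * 1024 * 8
    return rps * bits_per_response / 1e9

# Hypothetical: 10,000 requests/s of 100 KB responses ≈ 8.2 Gbps
print(round(throughput_gbps(10_000, 100), 1))  # → 8.2
```

Doubling the response size at the same request rate doubles the throughput, which is why the larger file sizes dominate the throughput table.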
A few other notes on testing and results:
- Hyper-Threading was available on the CPUs we tested, meaning that additional NGINX worker processes could run to utilize the full capacity of the Hyper-Threading CPUs. We did not enable Hyper-Threading for the tests reported here, but we did notice an improvement in performance with Hyper-Threading in separate tests. In particular, Hyper-Threading improved SSLTPS by approximately 50%.
- The numbers shown here are for OpenSSL 1.0.1. We also tested with OpenSSL 1.0.2 and saw a 2× performance improvement. OpenSSL 1.0.1 is still widely used, but we recommend moving to OpenSSL 1.0.2 for better security and performance.
- We also tested elliptic curve cryptography (ECC), but the results presented here use RSA. For encryption, RSA is still more widely used than ECC, although ECC is often deployed on mobile devices where power efficiency matters. We saw a 2× to 3× performance increase with ECC compared to standard RSA certificates, and encourage you to consider implementing ECC.
The combination of migrating to OpenSSL 1.0.2 and migrating to ECC can lead to very significant performance improvements. Additionally, our results show that if you are currently running servers with 4 or 8 CPUs, moving up to 16 CPUs, or even 32 CPUs for SSL, can result in a truly dramatic improvement.
We analyzed performance test results for RPS and CPS over live HTTP and HTTPS connections, as well as HTTP throughput. Use the information in this post to determine the hardware specifications you need to handle current and future traffic to your website, within your budget and performance requirements.