On GitHub, @RomainMou reported an issue whereby HTTP header field values
where being incorrectly reported as non-ascii by the Python .isacii()
method.
For example, using the following test application
def application(environ, start_response):
t = environ['HTTP_ASCIITEST']
t = "'" + t + "'" + " (" + str(len(t)) + ")"
if t.isascii():
t = t + " [ascii]"
else:
t = t + " [non-ascii]"
resp = t + "\n\n"
start_response("200 OK", [("Content-Type", "text/plain")])
return (bytes(resp, 'latin1'))
You would see the following
$ curl -H "ASCIITEST: $" http://localhost:8080/
'$' (1) [non-ascii]
'$' has an ASCII code of 0x24 (36).
The initial idea was to adjust the second parameter to the
PyUnicode_New() call from 255 to 127. This unfortunately had the
opposite effect.
$ curl -H "ASCIITEST: $" http://localhost:8080/
'$' (1) [ascii]
Good. However...
$ curl -H "ASCIITEST: £" http://localhost:8080/
'£' (2) [ascii]
Not good. Let's take a closer look at this.
'£' is not in basic ASCII, but is in extended ASCII with a value of 0xA3
(163). Its UTF-8 encoding is 0xC2 0xA3, hence the length of 2 bytes
above.
$ strace -s 256 -e sendto,recvfrom curl -H "ASCIITEST: £" http://localhost:8080/
sendto(5, "GET / HTTP/1.1\r\nHost: localhost:8080\r\nUser-Agent: curl/8.0.1\r\nAccept: */*\r\nASCIITEST: \302\243\r\n\r\n", 92, MSG_NOSIGNAL, NULL, 0) = 92
recvfrom(5, "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nServer: Unit/1.30.0\r\nDate: Mon, 22 May 2023 12:44:11 GMT\r\nTransfer-Encoding: chunked\r\n\r\n12\r\n'\302\243' (2) [ascii]\n\n\r\n0\r\n\r\n", 102400, 0, NULL, NULL) = 160
'£' (2) [ascii]
So we can see curl sent it UTF-8 encoded '\302\243\' which is C octal
escaped UTF-8 for 0xC2 0xA3, and we got the same back. But it should not
be marked as ASCII.
When doing PyUnicode_New(size, 127) it sets the buffer as ASCII. So we
need to use another function and that function would appear to be
PyUnicode_DecodeCharmap()
Which creates an Unicode object with the correct ascii/non-ascii
properties based on the character encoding.
With this function we now get
$ curl -H "ASCIITEST: $" http://localhost:8080/
'$' (1) [ascii]
$ curl -H "ASCIITEST: £" http://localhost:8080/
'£' (2) [non-ascii]
and for good measure
$ curl -H "ASCIITEST: $ £" http://localhost:8080/
'$ £' (4) [non-ascii]
$ curl -H "ASCIITEST: $" -H "ASCIITEST: £" http://localhost:8080/
'$, £' (5) [non-ascii]
PyUnicode_DecodeCharmap() does require having the full string upfront so
we need to build up the potentially comma separated header field values
string before invoking this function.
I did not want to touch the Python 2.7 code (which may or may not even
be affected by this) so kept these changes completely isolated from
that, hence a slight duplication with the for () loop.
Python 2.7 was sunset on January 1st 2020[0], so this code will
hopefully just disappear soon anyway.
I also purposefully didn't touch other code that may well have similar
issues (such as the HTTP header field names) if we ever get issue
reports about them, we'll deal with them then.
[0]: <https://www.python.org/doc/sunset-python-2/>
Link: <https://docs.python.org/3/c-api/unicode.html>
Closes: <https://github.com/nginx/unit/issues/868>
Reported-by: RomainMou <https://github.com/RomainMou>
Tested-by: RomainMou <https://github.com/RomainMou>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
NGINX Unit
Universal Web App Server
NGINX Unit is a lightweight and versatile open-source server that has two primary capabilities:
- serves static media assets,
- runs application code in seven languages.
Unit compresses several layers of the modern application stack into a potent, coherent solution with a focus on performance, low latency, and scalability. It is intended as a universal building block for any web architecture regardless of its complexity, from enterprise-scale deployments to your pet's homepage.
Its native RESTful JSON API enables dynamic updates with zero interruptions and flexible configuration, while its out-of-the-box productivity reliably scales to production-grade workloads. We achieve that with a complex, asynchronous, multithreading architecture comprising multiple processes to ensure security and robustness while getting the most out of today's computing platforms.
Quick Installation
macOS
$ brew install nginx/unit/unit
For details and available language packages, see the docs.
Docker
$ docker pull unit
For a description of image tags, see the docs.
Amazon Linux, Fedora, RedHat
$ wget https://raw.githubusercontent.com/nginx/unit/master/tools/setup-unit && chmod +x setup-unit
# ./setup-unit repo-config && yum install unit
# ./setup-unit welcome
For details and available language packages, see the docs.
Debian, Ubuntu
$ wget https://raw.githubusercontent.com/nginx/unit/master/tools/setup-unit && chmod +x setup-unit
# ./setup-unit repo-config && apt install unit
# ./setup-unit welcome
For details and available language packages, see the docs.
Running a Hello World App
Unit runs apps in a variety of languages. Let's consider a basic example, choosing PHP for no particular reason.
Suppose you saved a PHP script as /www/helloworld/index.php:
<?php echo "Hello, PHP on Unit!"; ?>
To run it on Unit with the unit-php module installed, first set up an
application object. Let's store our first config snippet in a file called
config.json:
{
"helloworld": {
"type": "php",
"root": "/www/helloworld/"
}
}
Saving it as a file isn't necessary, but can come in handy with larger objects.
Now, PUT it into the /config/applications section of Unit's control API,
usually available by default via a Unix domain socket:
# curl -X PUT --data-binary @config.json --unix-socket \
/path/to/control.unit.sock http://localhost/config/applications
{
"success": "Reconfiguration done."
}
Next, reference the app from a listener object in the /config/listeners
section of the API. This time, we pass the config snippet straight from the
command line:
# curl -X PUT -d '{"127.0.0.1:8000": {"pass": "applications/helloworld"}}' \
--unix-socket /path/to/control.unit.sock http://localhost/config/listeners
{
"success": "Reconfiguration done."
}
Now Unit accepts requests at the specified IP and port, passing them to the application process. Your app works!
$ curl 127.0.0.1:8080
Hello, PHP on Unit!
Finally, query the entire /config section of the control API:
# curl --unix-socket /path/to/control.unit.sock http://localhost/config/
Unit's output should contain both snippets, neatly organized:
{
"listeners": {
"127.0.0.1:8080": {
"pass": "applications/helloworld"
}
},
"applications": {
"helloworld": {
"type": "php",
"root": "/www/helloworld/"
}
}
}
For full details of configuration management, see the docs.
OpenAPI Specification
Our OpenAPI specification aims to simplify configuring and integrating NGINX Unit deployments and provide an authoritative source of knowledge about the control API.
Although the specification is still in the early beta stage, it is a promising step forward for the NGINX Unit community. While working on it, we kindly ask you to experiment and provide feedback to help improve its functionality and usability.
Community
-
The go-to place to start asking questions and share your thoughts is our Slack channel.
-
Our GitHub issues page offers space for a more technical discussion at your own pace.
-
The project map on GitHub sheds some light on our current work and plans for the future.
-
Our official website may provide answers not easily found otherwise.
-
Get involved with the project by contributing! See the contributing guide for details.
-
To reach the team directly, subscribe to the mailing list.
-
For security issues, email us, mentioning NGINX Unit in the subject and following the CVSS v3.1 spec.