Commit Graph

1274 Commits

Author SHA1 Message Date
Andrew Clayton
d5efa5f11f Wasm: Fix multiple successive calls to the request_handler.
When trying to upload files to the luw-upload-reflector demo[0] above a
certain size that would mean Unit would need to make more than two calls
to the request_handler function in the Wasm module we would get the
following error from wasmtime and the upload would stall on the third
call to the request_handler

  WASMTIME ERROR: failed to call function [->wasm_request_handler]
  error while executing at wasm backtrace:
      0: 0x5ce2 - <unknown>!memcpy
      1:  0x7df - luw_req_buf_append
                      at /home/andrew/src/unit-wasm/src/c/libunit-wasm.c:308:14
      2:  0x3a1 - luw_request_handler
                      at /home/andrew/src/unit-wasm/examples/c/luw-upload-reflector.c:110:3

  Caused by:
      wasm trap: out of bounds memory access

This was due to ->content_off (the offset of where the actual body
content starts in the request structure/memory) being some overly large
value.

This was largely down to me being an idiot!

Before calling the loop that makes the calls to the request_handler we
would calculate the new offset, which is now just the size of the
request structure as we don't re-send all the HTTP meta data and headers
etc. However because this value is in the request structure which is in
the shared memory and we use this same memory for requests and
responses, when we make a response we overwrite this request structure
with the response structure, so our ->content_off is now some wacked out
value when we make the next call to the request_handler.

To fix this we just need to reset ->content_off each time round the
loop.

There's also no point in setting ->nfields to 0, it has the same issue
as above, but doesn't get re-used by the Wasm module anyway.

[0]: <https://github.com/nginx/unit-wasm/blob/main/examples/c/luw-upload-reflector.c>

Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-25 13:52:13 +01:00
Andrew Clayton
76086d6d7a Wasm: Allow to set the HTTP response status.
This commit enables WebAssembly modules to set the HTTP response status
to something other than the previously hard coded '200 OK'.

To do this they can make a call to nxt_wasm_set_resp_status() providing
the required status code.

If this function isn't called the status code defaults to '200 OK'. The
WebAssembly module can also return -1 from the request_handler function
as a short cut to signal a '500 Internal Server Error'.

Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-25 13:49:36 +01:00
Andrew Clayton
c9961610ed Fix build on musl libc with clang.
As reported by @andypost on GitHub, if you try to build Unit on a system
that uses musl libc (such as Alpine Linux) with clang then you get the
following

clang -c -pipe -fPIC -fvisibility=hidden -O -W -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -fstrict-aliasing -Wstrict-overflow=5 -Wmissing-prototypes -Werror -g   -I src -I build/include   \
                      \
                     \
-o build/src/nxt_socketpair.o \
-MMD -MF build/src/nxt_socketpair.dep -MT build/src/nxt_socketpair.o \
src/nxt_socketpair.c
In file included from src/nxt_socketpair.c:8:
src/nxt_socket_msg.h:138:17: error: comparison of integers of different signs: 'unsigned long' and 'long' [-Werror,-Wsign-compare]
         cmsg = CMSG_NXTHDR(&msg, cmsg))
                ^~~~~~~~~~~~~~~~~~~~~~~
/usr/include/sys/socket.h:358:44: note: expanded from macro 'CMSG_NXTHDR'
        __CMSG_LEN(cmsg) + sizeof(struct cmsghdr) >= __MHDR_END(mhdr) - (unsigned char *)(cmsg) \
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/nxt_socketpair.c:8:
src/nxt_socket_msg.h:177:17: error: comparison of integers of different signs: 'unsigned long' and 'long' [-Werror,-Wsign-compare]
         cmsg = CMSG_NXTHDR(&msg, cmsg))
                ^~~~~~~~~~~~~~~~~~~~~~~
/usr/include/sys/socket.h:358:44: note: expanded from macro 'CMSG_NXTHDR'
        __CMSG_LEN(cmsg) + sizeof(struct cmsghdr) >= __MHDR_END(mhdr) - (unsigned char *)(cmsg) \
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.
make: *** [build/Makefile:261: build/src/nxt_socketpair.o] Error 1

GCC works fine, it seems to have some smarts so that it doesn't give
warnings on system header files.

This seems to be a long standing issue with musl libc (bad casting in
the CMSG_NXTHDR macro) and the workaround employed by several projects
is to disable the -Wsign-compare clang warning for the code in question.

So, that's what we do. We wrap the CMSG_NXTHDR macro in a function, so
we can use the pre-processor in it to selectively disable the warning.

Link: <https://github.com/dotnet/runtime/issues/16438>
Link: <https://git.openembedded.org/meta-openembedded/tree/meta-oe/recipes-devtools/breakpad/breakpad/0001-Turn-off-sign-compare-for-musl-libc.patch>
Link: <https://github.com/dotnet/corefx/blob/57ff88bb75a0/src/Native/Unix/System.Native/pal_networking.c#L811-L829>
Link: <https://patchwork.yoctoproject.org/project/oe/patch/20220407191438.3696227-1-stefan@datenfreihafen.org/>
Closes: <https://github.com/nginx/unit/issues/936>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-11 16:59:27 +01:00
Alejandro Colomar
7dd5ad93a4 Log: fixed typo.
Scripted change:

$ grep -ril recevied src/ | xargs sed -i s/recevied/received/

Reported-by: <https://github.com/jeffdafoe>
Closes: <https://github.com/nginx/unit/issues/920>
Cc: <https://github.com/meezaan>
Cc: Timo Stark <t.stark@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-07 23:13:34 +01:00
Konstantin Pavlov
ebcc92069d Added unit pkg-config file. 2023-08-01 10:16:17 -07:00
Andrew Clayton
47ff51009f Wasm: Add support for directory access.
Due to the sandboxed nature of WebAssembly, by default WASM modules
don't have any access to the underlying filesystem.

There is however a capabilities based mechanism[0] for allowing such
access.

This adds a config option to the 'wasm' application type;
'access.filesystem' which takes an array of directory paths that are
then made available to the WASM module. This access works recursively,
i.e everything under a specific path is allowed access to.

Example config might look like

  "access" {
      "filesystem": [
          "/tmp",
          "/var/tmp"
      ]
  }

The actual mechanism used allows directories to be mapped differently in
the guest. But at the moment we don't support that and just map say /tmp
to /tmp. This can be revisited if it's something users clamour for.

Network sockets are another resource that may be controlled in this
manner, for example there is a wasi_config_preopen_socket() function,
however this requires the runtime to open the network socket then
effectively pass this through to the guest.

This is something that can be revisited in the future if users desire
it.

[0]:
<https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-capabilities.md>

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17 13:09:49 +01:00
Andrew Clayton
e99854afdf Wasm: Wire up Wasm language module support to the config system.
This exposes various WebAssembly language module specific options.

The application type is "wasm".

There is a "module" option that is required, this specifies the full
path to the WebAssembly module to be run. This module should be in
binary format, i.e a .wasm file.

There are also currently eight function handlers that can be specified.
Three of them are _required_

1) request_handler

The main driving function. This may be called multiple times for a
single HTTP request if the request is larger than the shared memory.

2) malloc_handler

Used to allocate a chunk of memory at language module startup. This
memory is allocated from the WASM modules address space and is what is
sued for communicating between the WASM module (the guest) and Unit (the
host).

3) free_handler

Used to free the memory from above at language module shutdown.

Then there are the following five _optional_ handlers

1) module_init_handler

If set, called at language module startup.

2) module_end_handler

If set, called at language module shutdown.

3) request_init_handler

If set, called at the start of request. Called only once per HTTP
request.

4) request_end_handler

If set, called once all of a request has been sent to the WASM module.

5) response_end_handler

If set, called at the end of a request, once the WASM module has sent
all its headers and data.

Example config

  "applications": {
      "luw-echo-request": {
          "type": "wasm",
          "module": "/path/to/unit-wasm/examples/c/luw-echo-request.wasm",
          "request_handler": "luw_request_handler",
          "malloc_handler": "luw_malloc_handler",
          "free_handler": "luw_free_handler",
          "module_init_handler": "luw_module_init_handler",
          "module_end_handler": "luw_module_end_handler",
      }
  }

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17 13:09:49 +01:00
Andrew Clayton
6a211e2b74 Wasm: Add the core of initial WebAssembly language module support.
This adds the core of runtime WebAssembly[0] support. Future commits
will enable this in the Unit core and expose the configuration.

This introduces a new src/wasm directory for storing this source.

We are initially using Wasmtime[0] as the WebAssembly runtime, however
this has been designed with the ability to use different runtimes in
mind.

src/wasm/nxt_wasm.[ch] is the main interface to Unit.

src/wasm/nxt_rt_wasmtime.c is the Wasmtime runtime support. This is
nicely insulated from any knowledge of internal Unit workings.

Wasmtime is what loads and runs the Wasm modules.

The Wasm modules can export functions Wasmtime can call and Wasmtime can
export functions that the module can call.

We make use of both. The terminology used is that function exports are
what the Wasm module exports and function imports are what the Wasm
runtime exports to the module.

We currently have four function imports (functions exported by the
runtime to be called by the Wasm module).

1) nxt_wasm_get_init_mem_size

This allows Wasm modules to get the size of the initially allocated
shared memory. This is the size allocated at Unit startup and what the
Wasm modules can assume they have access to (in reality this shared
memory will likely be larger).

The amount of memory allocated at startup is NXT_WASM_MEM_SIZE which as
of this commit is 32MiB.

We do actually allocate NXT_WASM_MEM_SIZE + NXT_WASM_PAGE_SIZE at
startup which is an extra 64KiB (the smallest allocation unit), this is
to allow room for the response structure and so module developers can
just assume they have the full 32MiB for their actual response.

2) nxt_wasm_send_headers

This allows WASM modules to send their headers.

3) nxt_wasm_send_response

This allows WASM modules to send their response.

4) nxt_wasm_response_end

This allows WASM modules to inform Unit they have finished sending their
response. This calls nxt_unit_request_done()

Then there are currently up to eight functions that a module can export.
Three of which are required. These function can be named anything. I'll
use the Unit configuration names to refer to them

1) request_handler

The main driving function. This may be called multiple times for a
single HTTP request if the request is larger than the shared memory.

2) malloc_handler

Used to allocate a chunk of memory at language module startup. This
memory is allocated from the WASM modules address space and is what is
sued for communicating between the WASM module (the guest) and Unit (the
host).

3) free_handler

Used to free the memory from above at language module shutdown.

Then there are the following optional handlers

1) module_init_handler

If set, called at language module startup.

2) module_end_handler

If set, called at language module shutdown.

3) request_init_handler

If set, called at the start of request. Called only once per HTTP
request.

4) request_end_handler

If set, called once all of a request has been sent to the WASM module.

5) response_end_handler

If set, called at the end of a request, once the WASM module has sent
all its headers and data.

32bits

We currently support 32bit WASM modules, I.e wasm32-wasi. Newer version
of clang, 13+[2], do seem to have support for wasm64 as a target (which
uses a LP64 model). However it's not entirely clear if the WASI SDK
fully supports[3] this and by extension WASI libc/wasi-sysroot.

64bit support is something than can be explored more thoroughly in the
future.

As such in structures that are used to communicate between the host and
guest we use 32bit ints. Even when a single byte might be enough. This
is to avoid issues with structure layout differences between a 64bit
host and 32bit guest (I.e WASM module) and the need for various bits of
structure padding depending on host architecture. Instead everything is
4-byte aligned.

[0]: <https://webassembly.org/>
[1]: <https://wasmtime.dev/>
[2]: <https://reviews.llvm.org/rG670944fb20b226fc22fa993ab521125f9adbd30a>
[3]: <https://github.com/WebAssembly/wasi-sdk/issues/185>

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17 13:09:14 +01:00
Andrew Clayton
0c44439736 Wasm: Add core configuration data structure.
This is required to actually _build_ the Wasm language module.

The nxt_wasm_app_conf_t structure consists of the modules name, e.g
wasm, then the three required function handlers followed by the five
optional function handlers.

See the next commit for details of these function handlers.

We also need to include the u.wasm union entry that provides access to
the above structure.

The bulk of the configuration infrastructure will be added in a
subsequent commit.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-16 16:28:38 +01:00
Andrew Clayton
52b334acd1 Wasm: Register a new WebAssembly language module type.
This is the first patch in adding WebAssembly language module support.

This just adds a new NXT_APP_WASM type, required by subsequent commits.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-10 16:58:48 +01:00
Andrew Clayton
46573e6993 Index initialise the nxt_app_msg_prefix array.
This makes it much more clear what's what.

This is in preparation for adding WebAssembly language module support.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-10 16:58:35 +01:00
Zhidao HONG
a28bef097c HTTP: controlling response headers support. 2023-08-09 14:37:16 +08:00
Zhidao HONG
9f04d6db63 HTTP: stored matched action in nxt_http_request_t.
No functional changes.
2023-08-09 14:36:16 +08:00
Zhidao HONG
d0fdf5971f NJS: workaround for the warning in nxt_js_call() on Freebsd12 gcc. 2023-07-12 09:31:22 +08:00
Zhidao HONG
458722df55 Var: supported HTTP response header variables.
This commit adds the variable $response_header_NAME.
2023-07-01 12:18:22 +08:00
Zhidao HONG
c61ccec7b4 Variables refactoring.
This commit is to reimplement the variables with an unknown field
such as $header_{name} to make the parsing more generic,
it's a preparation for supporting response header variables.
2023-06-19 16:29:22 +08:00
Zhidao HONG
18d3637e4b NJS: supported 0.8.0. 2023-07-11 09:30:50 +08:00
Alejandro Colomar
c185ae7512 Fixed indentation.
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-06-30 14:38:34 +02:00
Zhidao HONG
a378f6aa31 HTTP: fixed variable caching.
When a variable is accessed in the Unit configuration, the value is cached.
This was useful prior to the URI rewrite feature, but now that the URI (more
precisely, the request target) can be rewritten, the contents of the variable
$uri (which contains the path part of the request target, and is decoded)
should not be cached anymore, or at least the cached value should be invalidated
after a URI rewrite.

Example:

{
	"rewrite": "/prefix$uri",
	"share": "$uri"
}

For a request line like GET /foo?bar=baz HTTP/1.1\r\n, the expected file
served in the response would be /prefix/foo, but due to the caching issue,
Unit currently serves /foo.
2023-05-25 00:27:55 +08:00
synodriver
b84f6ecad4 Python: Fix error checks in nxt_py_asgi_request_handler().
Signed-off-by: synodriver <diguohuangjiajinweijun@gmail.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
[ Re-word commit subject - Andrew ]
Fixes: c4c2f90c5b ("Python: ASGI server introduced.")
Closes: <https://github.com/nginx/unit/issues/895>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-06-01 00:25:40 +01:00
synodriver
93ed66958e Python: Add ASGI lifespan state support.
Lifespan state is a special dict in asgi lifespan scope, which allow
applications to persist data from the lifespan cycle to request/response
handling. The scope["state"] namespace provides a place to store these
sorts of things. The server will ensure that a shallow copy of the
namespace is passed into each subsequent request/response call into the
application.

Some frameworks are already taking advantage of this feature, for
example, starlette, and without this feature they wouldn't work
properly.

Signed-off-by: synodriver <diguohuangjiajinweijun@gmail.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
[ Minor code tweaks to avoid lines > 80 chars, static a function and
  re-work the PyMemberDef structure initialisation for Python <3.7
  and -Wwrite-strings compatibility - Andrew ]
Tested-by: <https://github.com/synodriver>
Tested-by: <https://github.com/hawiliali>
Closes: <https://github.com/nginx/unit/issues/864>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-06-01 00:25:03 +01:00
Alejandro Colomar
47cdfb6f30 Tests: fixed incorrect pointer assignment.
If we don't update the pointer before copying the request body, then we
get the behavior shown below.  After this patch, "foo\n" is rightly
appended at the end of the response body.

Request:

"GET / HTTP/1.1\r\nHost: _\nContent-Length: 4\n\nfoo\n"

Response body:

"""
Hello world!
foo
est data:
  Method: GET
  Protocol: HTTP/1.1
  Remote addr: 127.0.0.1
  Local addr: 127.0.0.1
  Target: /
  Path: /
  Fields:
    Host: _
    Content-Length: 4
  Body:
"""

Fixes: 1bb22d1e92 ("Unit application library.")
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-05-25 11:27:35 +02:00
Alejandro Colomar
f32858dcb7 Added back deprecated options to unitd.
We renamed the options recently, with the intention of keeping the old
names as supported but deprecated for some time, before removal.  This
was done with the configure script options, but in the unitd binary, we
accidentally removed the old names, causing some unintended breakage.
Keep support for the old names, albeit with a deprecation message to
stderr, for some time, until we decide to remove them.

Fixes: 5a37171f73 ("Added default values for pathnames.")
Closes: <https://github.com/nginx/unit/issues/876>
Reported-by: El RIDO <elrido@gmx.net>
Acked-by: Liam Crilly <liam@nginx.com>
Acked-by: Artem Konev <a.konev@f5.com>
Acked-by: Timo Stark <t.stark@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-05-21 01:01:43 +02:00
Andrew Clayton
47683c4704 Python: Fix ASGI applications accessed over IPv6.
There are a couple of reports on GitHub about issues accessing Python
ASGI based applications over IPv6.

A request over IPv6 would result in an error like

2023/05/13 17:49:12 [alert] 47202#47202 [unit] #10: Python failed to create 'client' pair
2023/05/13 17:49:12 [alert] 47202#47202 [unit] Python failed to call 'loop.call_soon'
ValueError: invalid literal for int() with base 10: 'db8:1:1:1ee7:dead:beef:cafe'

The above error was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 765, in call_soon
    handle = self._call_soon(callback, args, context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 781, in _call_soon
    handle = events.Handle(callback, args, self, context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SystemError: <class 'asyncio.events.Handle'> returned a result with an exception set

This issue occurred in the nxt_py_asgi_create_ip_address() function
where it tries to create an IP address / port number pair.

It does this by looking for the first ':' in the address and taking
everything after it as the port number. Like in the above error message,
if we tried to access the server @ 2001:db8:1:1:1ee7:dead:beef:cafe,
then we'd end up with the port number as 'db8:1:1:1ee7:dead:beef:cafe'.

There are two issues with this

 1) The IP address and port number are already flowed through
    separately.
 2) Even if (1) wasn't true, it would still be broken for IPv6 as we'd
    expect to a get an address literal like
    [2001:db8:1:1:1ee7:dead:beef:cafe]:8080, however there was no code to
    handle the []'s.

The fix is to simply not try looking for a port number. We pass a port
number into this function to use in the case where we don't find a port
number, we never will...

A further cleanup would be to flow through the server port number when
creating the 'server pair' PyTuple, rather than just using the hard
coded 80.

Closes: <https://github.com/nginx/unit/issues/793>
Closes: <https://github.com/nginx/unit/issues/874>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-05-18 15:57:11 +01:00
Zhidao HONG
a3c3a29493 NJS: supported loadable modules. 2023-05-08 16:00:25 +08:00
Zhidao HONG
14d6d97bac HTTP: added basic URI rewrite.
This commit introduced the basic URI rewrite. It allows users to change request URI. Note the "rewrite" option ignores the contained query if any and the query from the request is preserverd.
An example:
"routes": [
    {
        "match": {
            "uri": "/v1/test"
        },
        "action": {
            "return": 200
        }
    },
    {
        "action": {
            "rewrite": "/v1$uri",
            "pass": "routes"
        }
    }
]

Reviewed-by: Alejandro Colomar <alx@nginx.com>
2023-04-20 23:20:41 +08:00
Andrew Clayton
1a485fed6a Allow to remove the version string in HTTP responses.
Normally Unit responds to HTTP requests by including a header like

  Server: Unit/1.30.0

however it can sometimes be beneficial to withhold the version
information and in this case just respond with

  Server: Unit

This patch adds a new "settings.http" boolean option called
server_version, which defaults to true, in which case the full version
information is sent. However this can be set to false, e.g

  "settings": {
      "http": {
          "server_version": false
      }
  },

in which case Unit responds without the version information as the
latter example above shows.

Link: <https://www.ietf.org/rfc/rfc9110.html#section-10.2.4>
Closes: <https://github.com/nginx/unit/issues/158>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-25 13:59:43 +01:00
Andrew Clayton
1fd6eb626b Decouple "Unit" from NXT_SERVER.
Split out the "Unit" name from the NXT_SERVER #define into its own
NXT_NAME #define, then make NXT_SERVER a combination of that and
NXT_VERSION.

This is required for a subsequent commit where we may want the server
name on its own.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-25 13:59:25 +01:00
Andrew Clayton
dcdc8e7466 Remove an erroneous semi-colon.
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-24 19:40:16 +01:00
Andrew Clayton
375556f9aa Don't conflate the error variable in nxt_kqueue_poll().
In nxt_kqueue_poll() error is declared as a nxt_bool_t aka unsigned int
(on x86-64 anyway).

It is used both as a boolean and as the return storage for a bitwise AND
operation.

This has potential to go awry.

If nxt_bool_t was changed to be a u8 then we would have the following
issue

gcc12 -c -pipe -fPIC -fvisibility=hidden -O -W -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wmissing-prototypes -Werror -g -O2 -I src -I build     -I/usr/local/include  -o build/src/nxt_kqueue_engine.o  -MMD -MF build/src/nxt_kqueue_engine.dep -MT build/src/nxt_kqueue_engine.o  src/nxt_kqueue_engine.c
src/nxt_kqueue_engine.c: In function 'nxt_kqueue_poll':
src/nxt_kqueue_engine.c:728:17: error: overflow in conversion from 'int' to 'nxt_bool_t' {aka 'unsigned char'} changes value from '(int)kev->flags & 16384' to '0' [-Werror=overflow]
  728 |         error = (kev->flags & EV_ERROR);
      |                 ^
cc1: all warnings being treated as errors

EV_ERROR has the value 16384, after the AND operation error holds 16384,
however this overflows and wraps around (64 times) exactly to 0.

With nxt_bool_t defined as a u32, we would have a similar issue if
EV_ERROR ever became UINT_MAX + 1 (or a multiple thereof)...

Rather than conflating the use of error, keep error as a boolean (it is
used further down the function) but do the AND operation inside the
if ().

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-24 19:40:16 +01:00
Andrew Clayton
b9177d36e7 Remove a bunch of dead code.
This removes a bunch of unused files that would have been touched by
subsequent commits that switch to using nxt_bool_t (AKA unit6_t) in
structures.

In auto/sources we have

  NXT_LIB_SRC0=" \
      src/nxt_buf_filter.c \
      src/nxt_job_file.c \
      src/nxt_stream_module.c \
      src/nxt_stream_source.c \
      src/nxt_upstream_source.c \
      src/nxt_http_source.c \
      src/nxt_fastcgi_source.c \
      src/nxt_fastcgi_record_parse.c \
  \
      src/nxt_mem_pool_cleanup.h \
      src/nxt_mem_pool_cleanup.c \
  "

None of these seem to actually be used anywhere (other than within
themselves). That variable is _not_ referenced anywhere else.

Also remove the unused related header files: src/nxt_buf_filter.h,
src/nxt_fastcgi_source.h, src/nxt_http_source.h, src/nxt_job_file.h,
src/nxt_stream_source.h and src/nxt_upstream_source.h

Also, these files do not seem to be used, no mention under auto/ or build/

  src/nxt_file_cache.c
  src/nxt_cache.c
  src/nxt_job_file_cache.c

src/nxt_cache.h is #included in src/nxt_main.h, but AFAICT is not
actually used.

With all the above removed

  $ ./configure --openssl --debug --tests && make -j && make -j tests &&
  make libnxt

all builds.

Buildbot passes.

NOTE: You may need to do a 'make clean' before the next build attempt.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-24 19:39:09 +01:00
Alejandro Colomar
fcff55acb6 HTTP: optimizing $request_line.
Don't reconstruct a new string for the $request_line from the parsed
method, target, and HTTP version, but rather keep a pointer to the
original memory where the request line was received.

This will be necessary for implementing URI rewrites, since we want to
log the original request line, and not one constructed from the
rewritten target.

This implementation changes behavior (only for invalid requests) in the
following way:

Previous behavior was to log as many tokens from the request line as
were parsed validly, thus:

Request              -> access log              ; error log

"GET / HTTP/1.1"     -> "GET / HTTP/1.1"     OK ; =
"GET   / HTTP/1.1"   -> "GET / HTTP/1.1"    [1] ; =
"GET / HTTP/2.1"     -> "GET / HTTP/2.1"     OK ; =
"GET / HTTP/1."      -> "GET / HTTP/1."     [2] ; "GET / HTTP/1. [null]"
"GET / food"         -> "GET / food"        [2] ; "GET / food [null]"
"GET / / HTTP/1.1"   -> "GET / / HTTP/1.1"  [2] ; =
"GET /  / HTTP/1.1"  -> "GET /  / HTTP/1.1" [2] ; =
"GET food HTTP/1.1"  -> "GET"                   ; "GET [null] [null]"
"OPTIONS * HTTP/1.1" -> "OPTIONS"           [3] ; "OPTIONS [null] [null]"
"FOOBAR baz HTTP/1.1"-> "FOOBAR"                ; "FOOBAR [null] [null]"
"FOOBAR / HTTP/1.1"  -> "FOOBAR / HTTP/1.1"     ; =
"get / HTTP/1.1"     -> "-"                     ; " [null] [null]"
""                   -> "-"                     ; " [null] [null]"

This behavior was rather inconsistent.  We have several options to go
forward with this patch:

-  NGINX behavior.

   Log the entire request line, up to '\r' | '\n', even if it was
   invalid.

   This is the most informative alternative.  However, RFC-complying
   requests will probably not send invalid requests.

   This information would be interesting to users where debugging
   requests constructed manually via netcat(1) or a similar tool, or
   maybe for debugging a client, are important.  It might be interesting
   to support this in the future if our users are interested; for now,
   since this approach requires looping over invalid requests twice,
   that's an overhead that we better avoid.

-  Previous Unit behavior

   This is relatively fast (almost as fast as the next alternative, the
   one we chose), but the implementation is ugly, in that we need to
   perform the same operation in many places around the code.

   If we want performance, probably the next alternative is better; if
   we want to be informative, then the first one is better (maybe in
   combination with the third one too).

-  Chosen behavior

   Only logging request lines when the request is valid.  For any
   invalid request, or even unsupported ones, the request line will be
   logged as "-".  Thus:

   Request              -> access log [4]

   "GET / HTTP/1.1"     -> "GET / HTTP/1.1"     OK
   "GET   / HTTP/1.1"   -> "GET   / HTTP/1.1"  [1]
   "GET / HTTP/2.1"     -> "-"                 [3]
   "GET / HTTP/1."      -> "-"
   "GET / food"         -> "-"
   "GET / / HTTP/1.1"   -> "GET / / HTTP/1.1"  [2]
   "GET /  / HTTP/1.1"  -> "GET /  / HTTP/1.1" [2]
   "GET food HTTP/1.1"  -> "-"
   "OPTIONS * HTTP/1.1" -> "-"
   "FOOBAR baz HTTP/1.1"-> "-"
   "FOOBAR / HTTP/1.1"  -> "FOOBAR / HTTP/1.1"
   "get / HTTP/1.1"     -> "-"
   ""                   -> "-"

   This is less informative than previous behavior, but considering how
   inconsistent it was, and that RFC-complying agents will probably not
   send us such requests, we're ready to lose that information in the
   log.  This is of course the fastest and simplest implementation we
   can get.

   We've chosen to implement this alternative in this patch.  Since we
   modified the behavior, this patch also changes the affected tests.

[1]:  Multiple successive spaces as a token delimiter is allowed by the
      RFC, but it is discouraged, and considered a security risk.  It is
      currently supported by Unit, but we will probably drop support for
      it in the future.

[2]:  Unit currently supports spaces in the request-target.  This is
      a violation of the relevant RFC (linked below), and will be fixed
      in the future, and consider those targets as invalid, returning
      a 400 (Bad Request), and thus the log lines with the previous
      inconsistent behavior would be changed.

[3]:  Not yet supported.

[4]:  In the error log, regarding the "log_routes" conditional logging
      of the request line, we only need to log the request line if it
      was valid.  It doesn't make sense to log "" or "-" in case that
      the request was invalid, since this is only useful for
      understanding decisions of the router.  In this case, the access
      log is more appropriate, which shows that the request was invalid,
      and a 400 was returned.  When the request line is valid, it is
      printed in the error log exactly as in the access log.

Link: <https://datatracker.ietf.org/doc/html/rfc9112#section-3>
Suggested-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Cc: Timo Stark <t.stark@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Cc: Andrew Clayton <a.clayton@nginx.com>
Cc: Artem Konev <a.konev@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-04-12 11:50:56 +02:00
Andrew Clayton
45c45eaeb4 Add per-application logging.
Currently when running in the foreground, unit application processes
will send stdout to the current TTY and stderr to the unit log file.

That behaviour won't change.

When running as a daemon, unit application processes will send stdout to
/dev/null and stderr to the unit log file.

This commit allows to alter the latter case of unit running as a daemon,
by allowing applications to redirect stdout and/or stderr to specific
log files. This is done via two new application options, 'stdout' &
'stderr', e.g

  "applications": {
      "myapp": {
          ...
          "stdout": "/path/to/log/unit/app/stdout.log",
          "stderr": "/path/to/log/unit/app/stderr.log"
      }
  }

These log files are created by the application processes themselves and
thus the log directories need to be writable by the user (and or group)
of the application processes.

E.g

  $ sudo mkdir -p /path/to/log/unit/app
  $ sudo chown APP_USER /path/to/log/unit/app

These need to be setup before starting unit with the above config.

Currently these log files do not participate in log-file rotation
(SIGUSR1), that may change in a future commit. In the meantime these
logs can be rotated using the traditional copy/truncate method.

NOTE:

You may or may not see stuff printed to stdout as stdout was
traditionally used by CGI applications to communicate with the
webserver.

Closes: <https://github.com/nginx/unit/issues/197>
Closes: <https://github.com/nginx/unit/issues/846>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-11 19:08:32 +01:00
Andrew Clayton
8b8952930c Add nxt_file_stdout().
This is analogous to the nxt_file_stderr() function and will be used in
a subsequent commit.

This function redirects stdout to a given file descriptor.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-11 19:08:12 +01:00
Andrew Clayton
edbc43558d PHP: Make the filter_input() function work.
On GitHub, @jamesRUS52 reported that the PHP filter_input()[0] function
would just return NULL.

To enable this function we need to run the variables through the
sapi_module.input_filter() function when we call
php_register_variable_safe().

In PHP versions prior to 7.0.0, input_filter() takes 'len' as an
unsigned int, while later versions take it as a size_t.

Now, with this commit and the following PHP

  <?php

  var_dump(filter_input(INPUT_SERVER, 'REMOTE_ADDR'));
  var_dump(filter_input(INPUT_SERVER, 'REQUEST_URI'));
  var_dump(filter_input(INPUT_GET, 'get', FILTER_SANITIZE_SPECIAL_CHARS));

  ?>

you get

  $ curl 'http://localhost:8080/854.php?get=foo<>'
  string(3) "::1"
  string(18) "/854.php?get=foo<>"
  string(13) "foo&#60;&#62;"

[0]: <https://www.php.net/manual/en/function.filter-input.php>

Tested-by: <https://github.com/jamesRUS52>
Closes: <https://github.com/nginx/unit/issues/854>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-11 19:08:12 +01:00
Andrew Clayton
4852057124 Remove a useless assignment in nxt_mem_zone_alloc_pages().
This was reported by the 'Clang Static Analyzer' as a 'dead nested
assignment'.

We assign prev_size then check if it's != 0 and if true we then set
prev_pages to page_size right shifted by two at the same time setting
prev_size to be right shifted by two (>>=), however page_size is never
used again so no need to set it here.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-03 14:53:04 +01:00
Andrew Clayton
8a9e078e54 Prevent a possible NULL de-reference in nxt_job_create().
We allocate 'job' we then have a check if it's not NULL and do stuff
with it, but then we accessed it outside this check.

Simply return if job is NULL.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-03 14:53:04 +01:00
Andrew Clayton
22993fe41a Remove a useless assignment in nxt_fs_mkdir_all().
This was reported by the 'Clang Static Analyzer' as a 'dead nested
assignment'.

We set end outside the loop but the first time we use it is to assign it
in the loop (not used anywhere else).

Further cleanup could be to reduce the scope of end by moving its
declaration inside the loop.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-04-03 14:53:04 +01:00
Alejandro Colomar
6e16d7ac5b Auto: mirroring installation structure in build tree.
This makes the build tree more organized, which is good for adding new
stuff.  Now, it's useful for example for adding manual pages in man3/,
but it may be useful in the future for example for extending the build
system to run linters (e.g., clang-tidy(1), Clang analyzer, ...) on the
C source code.

Previously, the build tree was quite flat, and looked like this (after
`./configure && make`):

    $ tree -I src build
    build
    ├── Makefile
    ├── autoconf.data
    ├── autoconf.err
    ├── echo
    ├── libnxt.a
    ├── nxt_auto_config.h
    ├── nxt_version.h
    ├── unitd
    └── unitd.8

    1 directory, 9 files

And after this patch, it looks like this:

    $ tree -I src build
    build
    ├── Makefile
    ├── autoconf.data
    ├── autoconf.err
    ├── bin
    │   └── echo
    ├── include
    │   ├── nxt_auto_config.h
    │   └── nxt_version.h
    ├── lib
    │   ├── libnxt.a
    │   └── unit
    │       └── modules
    ├── sbin
    │   └── unitd
    ├── share
    │   └── man
    │       └── man8
    │           └── unitd.8
    └── var
        ├── lib
        │   └── unit
        ├── log
        │   └── unit
        └── run
            └── unit

    17 directories, 9 files

It also solves one issue introduced in
5a37171f73 ("Added default values for pathnames.").  Before that
commit, it was possible to run unitd from the build system
(`./build/unitd`).  Now, since it expects files in a very specific
location, that has been broken.  By having a directory structure that
mirrors the installation, it's possible to trick it to believe it's
installed, and run it from there:

    $ ./configure --prefix=./build
    $ make
    $ ./build/sbin/unitd

Fixes: 5a37171f73 ("Added default values for pathnames.")
Reported-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Konstantin Pavlov <thresh@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Cc: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-03-29 00:41:08 +02:00
Alejandro Colomar
5ba79b9b52 Renamed --libstatedir to --statedir.
In BSD systems, it's usually </var/db> or some other dir under </var>
that is not </var/lib>, so $statedir is a more generic name.  See
hier(7).

Reported-by: Andrei Zeliankou <zelenkov@nginx.com>
Reported-by: Zhidao Hong <z.hong@f5.com>
Reviewed-by: Konstantin Pavlov <thresh@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Liam Crilly <liam@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-03-29 00:40:40 +02:00
Alejandro Colomar
0ebce31c92 HTTP: added route logging.
-  Configuration: added "/config/settings/http/log_route".

   Type: bool
   Default: false

   This adds configurability to the error log.  It allows enabling and
   disabling logs related to how the router performs selection of the
   routes.

-  HTTP: logging request line.

   Log level: [notice]

   The request line is essential to understand which logs correspond to
   which request when reading the logs.

-  HTTP: logging route that's been discarded.

   Log level: [info]

-  HTTP: logging route whose action is selected.

   Log level: [notice]

-  HTTP: logging when "fallback" action is taken.

   Log level: [notice]

Closes: <https://github.com/nginx/unit/issues/758>
Link: <https://github.com/nginx/unit/pull/824>
Link: <https://github.com/nginx/unit/pull/839>
Suggested-by: Timo Stark <t.stark@nginx.com>
Suggested-by: Mark L Wood-Patrick <mwoodpatrick@gmail.com>
Suggested-by: Liam Crilly <liam@nginx.com>
Tested-by: Liam Crilly <liam@nginx.com>
Acked-by: Artem Konev <a.konev@f5.com>
Cc: Andrew Clayton <a.clayton@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-03-21 13:02:38 +01:00
Alejandro Colomar
773c341d70 HTTP: rewrote while loop as for loop.
This considerably simplifies the function, and will also help log the
iteration in which we are, which corresponds to the route array element.

Link: <https://github.com/nginx/unit/pull/824>
Link: <https://github.com/nginx/unit/pull/839>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Tested-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-03-21 13:02:38 +01:00
Andrew Clayton
c18dd1f65b Default PR_SET_NO_NEW_PRIVS to off.
This prctl(2) option was enabled in commit 0277d8f1 ("Isolation: Fix the
enablement of PR_SET_NO_NEW_PRIVS.") and this was being set by default.

This prctl(2) when enabled renders (amongst other things) the set-UID
and set-GID bits on executables ineffective after an execve(2).

This causes an issue for applications that want to execute the
sendmail(8) binary, this includes the PHP mail() function, which is
usually set-GID.

After some internal discussion it was decided to disable this option by
default.

Closes: <https://github.com/nginx/unit/issues/852>
Fixes: 0277d8f1 ("Isolation: Fix the enablement of PR_SET_NO_NEW_PRIVS.")
Fixes: e2b53e16 ("Added "rootfs" feature.")
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:28:46 +00:00
Andrew Clayton
7d0ceb82c7 Improve an error message regarding Unix domain sockets.
When starting unit, if its Unix domain control socket was already active
you would get an error message like

  2023/03/15 18:07:55 [alert] 53875#8669650 connect(5, unix:/tmp/control.sock) succeed, address already in use

which is confusing in a couple of regards, firstly we have the classic
success/failure message and secondly 'address already in use' is an
actual errno value, EADDRINUSE and we didn't get an error from this
connect(2).

Re-word this error message for greater clarity.

Reported-by: Liam Crilly <liam.crilly@nginx.com>
Cc: Liam Crilly <liam.crilly@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:28:46 +00:00
Andrew Clayton
2e3e1c7e7b Socket: Remove Unix domain listen sockets upon reconfigure.
Currently when using Unix domain sockets for requests, if unit is
reconfigured then it will fail if it tries to bind(2) again to a Unix
domain socket with something like

  2023/02/25 19:15:50 [alert] 35274#35274 bind(\"unix:/tmp/unit.sock\") failed (98: Address already in use)

When closing such a socket we really need to unlink(2) it. However that
presents a problem in that when running as root, while the main process
runs as root and creates the socket, it's the router process, that runs
as an unprivileged user, e.g nobody, that closes the socket and would
thus remove it, but couldn't due to not having permission, even if the
socket is mode 0666, you need write permissions on the containing
directory to remove a file.

There are several options to solve this, all with varying degrees of
complexity and utility.

  1) Give the user who the router process runs as write permission to
     the directory containing the listen sockets. These can then be
     unlink(2)'d from the router process.

     Simple and would work, but perhaps not the most elegant.

  2) Using capabilities(7). The router process could temporarily attain
     the CAP_DAC_OVERRIDE capability, unlink(7) the socket, then
     relinquish the capability until required again.

     These are Linux specific (other systems may have similar mechanisms
     which would be extra work to support). There is also a, albeit
     small, window where the router process is running with elevated
     privileges.

  3) Have the main process do the unlink(2), it is after all the process
     that created the socket.

     This is what this commit implements.

We create a new port IPC message type of NXT_PORT_MSG_SOCKET_UNLINK,
that is used by the router process to notify the main process about a
Unix domain socket to unlink(2).

Upon doing a reconfigure the router process will call
nxt_router_listen_socket_release() which will close the socket, we
extend this function in the case of non-abstract Unix domain sockets, so
that it will send a message to the main process containing a copy of the
nxt_sockaddr_t structure that will contain the filename of the socket.

In the main process the handler that we have defined,
nxt_main_port_socket_unlink_handler(), for this message type will run
and allow us to look for the socket in question in the listen_sockets
array and remove it and unlink(2) the socket.

This then allows the reconfigure to work if it tries to bind(2) again to
a socket that previously existed.

Link: <https://github.com/nginx/unit/issues/669>
Link: <https://github.com/nginx/unit/pull/735>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:28:23 +00:00
Andrew Clayton
8f0192bb4c Remove some dormant code from nxt_process_quit().
In nxt_process_quit() there is a loop that iterates over the
task->thread->runtime->listen_sockets array and closes the connections.

This code has been there from the beginning

  $ git log --pretty=oneline -S'if (rt->listen_sockets != NULL)'
  e9e5ddd5a5 Refactor of process management.
  6f2c9acd18 Processes refactoring. The cycle has been renamed to the runtime.
  $ git log --pretty=oneline -S'if (cycle->listen_sockets != NULL) {'
  6f2c9acd18 Processes refactoring. The cycle has been renamed to the runtime.
  16cbf3c076 Initial version.

but never seems to have been used (AFAICT and certainly not recently,
confirmed by code inspection and running pytests with a bunch of
language modules enabled and the code in question was never executed) as
the listen_sockets array has never been populated... until now.

The previous commit now adds Unix domain sockets to this array so that
they can be unlink(2)'d upon exit and reconfiguration.

This has now caused this dormant code to become active as it now tries
to close these sockets (from at least the prototype processes), this
array is inherited via fork by other processes.

The file descriptor for these sockets is set to -1 when they are put
into this array. This then results in close(-1) calls which caused
multiple failures in the pytests such as

  >       assert not alerts, 'alert(s)'
  E       AssertionError: alert(s)
  E       assert not ['2023/03/09 23:26:14 [alert] 137673#137673 socket close(-1) failed (9: Bad file descriptor)']

I think the simplest thing is to just remove this code.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:03:56 +00:00
Andrew Clayton
ccaad38bc5 Socket: Remove Unix domain listen sockets at shutdown.
If we don't remove the Unix domain listen socket file then when Unit
restarts it get an error like

  2023/02/25 23:10:11 [alert] 36388#36388 bind(\"unix:/tmp/unit.sock\") failed (98: Address already in use)

This patch makes use of the listen_sockets array, that is already
allocated in the main process but never populated, to place the Unix
domain listen sockets into.

At shutdown we can then loop through this array and unlink(2) any Unix
domain sockets found therein.

Closes: <https://github.com/nginx/unit/issues/792>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:03:56 +00:00
Andrew Clayton
172ceba5b6 Router: More accurately allocate request buffer memory.
In nxt_router_prepare_msg() we create a buffer (nxt_unit_request_t *req)
that gets sent to an application process that contains details about a
client request.

This buffer was always a little larger than needed due to allocating space
for the remote address _and_ port and the local address _and_ port. We
also allocate space for the local port separately.

->{local,remote}->length includes the port number and ':' and also the
'[]' for IPv6. E.g [2001:db8::1]:8080

->{local,remote}->address_length represents the length of the unadorned
IP address. E.g 2001:db8::1

Update the buffer size so that we only allocate what is actually needed.

Suggested-by: Zhidao HONG <z.hong@f5.com>
Cc: Zhidao HONG <z.hong@f5.com>
Reviewed-by: Zhidao HONG <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-14 17:07:08 +00:00
Andrew Clayton
fa81d7a11a Perl: Fix a crash in the language module.
User @bes-internal reported a Perl module crasher on GitHub.

This was due to a Perl application sending back two responses, for each
response we would call down into XS_NGINX__Unit__Sandbox_cb(), the first
time pctx->req would point to a valid nxt_unit_request_info_t, the
second time pctx->req would be NULL.

Add an invalid responses check which covers this case.

Closes: <https://github.com/nginx/unit/issues/841>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-10 21:40:28 +00:00
Andrew Clayton
78e1122a3c Router: Fix allocation of request buffer sent to application.
This fixes an issue reported by @Peter2121 on GitHub.

In nxt_router_prepare_msg() we create a buffer (nxt_unit_request_t *req)
that gets sent to an application process that contains details about a
client request.

The req structure comprises various members with the final member being
an array (specified as a flexible array member, with its actual length
denoted by the req->fields_count member) of nxt_unit_field_t's. These
structures specify the length and offset for the various request headers
name/value pairs which are stored after some request metadata that is
stored immediately after this array of structs as individual nul
terminated strings.

After this we have the body content data (if any). So it looks a little
like

  (gdb) x /64bs 0x7f38c976e060
  0x7f38c976e060: "\353\346\244\t\006"		<-- First nxt_unit_field_t
  0x7f38c976e066: ""
  0x7f38c976e067: ""
  0x7f38c976e068: "T\001"
  0x7f38c976e06b: ""
  0x7f38c976e06c: "Z\001"
  0x7f38c976e06f: ""
  ...
  0x7f38c976e170: "\362#\244\v$"		<-- Last nxt_unit_field_t
  0x7f38c976e176: ""
  0x7f38c976e177: ""
  0x7f38c976e178: "\342\002"
  0x7f38c976e17b: ""
  0x7f38c976e17c: "\352\002"
  0x7f38c976e17f: ""
  0x7f38c976e180: "POST"			<-- Start of request metadata
  0x7f38c976e185: "HTTP/1.1"
  0x7f38c976e18e: "unix:"
  0x7f38c976e194: "unix:/dev/shm/842.sock"
  0x7f38c976e1ab: ""
  0x7f38c976e1ac: "fedora"
  0x7f38c976e1b3: "/842.php"
  0x7f38c976e1bc: "HTTP_HOST"			<-- Start of header fields
  0x7f38c976e1c6: "fedora"
  0x7f38c976e1cd: "HTTP_X_FORWARDED_PROTO"
  0x7f38c976e1e4: "https"
  ...
  0x7f38c976e45a: "HTTP_COOKIE"
  0x7f38c976e466: "PHPSESSID=8apkg25r9s9vju3pi085i21eh4"
  0x7f38c976e48b: "public_form=sended"		<-- Body content

Well that's how things are supposed to look! When using Unix domain
sockets what we actually got looked like

  ...
  0x7f6141f3445a: "HTTP_COOKIE"
  0x7f6141f34466: "PHPSESSID=uo5b2nu9buijkc89jotbgmd60vpublic_form=sended"

Here, the body content (from a POST for example) has been appended
straight onto the end of the last header field value. In this case
corrupting the PHP session cookie.  The body content would still be
found by the application as its offset into this buffer is correct.

This problem was actually caused by a0327445 ("PHP: allowed to specify
URLs without a trailing '/'.") which added an extra item into this
request buffer specifying the port number that unit is listening on that
handled this request.

Unfortunately when I wrote that patch I didn't increase the size of this
request buffer to accommodate it.

When using normal TCP sockets we actually end up allocating more space
than required for this buffer, we track the end of this buffer up to
where the body content would go and so we have a few spare bytes between
the nul byte of the last field header value and the start of the body
content.

When using Unix domain sockets, they have no associated port number and
thus the port number has a length of 0 bytes, but we still write a '\0'
in there using up a byte that we didn't account for, this causes us to
loose the nul byte of the last header fields value causing the body data
to be appended to the last header field value.

The fix is simple, account for the local port length, we also add 1 to
it, this covers the nul byte, even if there is no port as with Unix
domain sockets.

Closes: <https://github.com/nginx/unit/issues/842>
Fixes: a0327445 ("PHP: allowed to specify URLs without a trailing '/'.")
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-10 21:39:57 +00:00