Commit Graph

1301 Commits

Author SHA1 Message Date
Zhidao HONG
4c91bebb50 HTTP: enhanced access log with conditional filtering.
This feature allows users to specify conditions to control if access log
should be recorded. The "if" option supports a string and JavaScript code.
If its value is empty, 0, false, null, or undefined, the logs will not be
recorded. And the '!' as a prefix inverses the condition.

Example 1: Only log requests that sent a session cookie.

    {
        "access_log": {
            "if": "$cookie_session",
            "path": "..."
        }
    }

Example 2: Do not log health check requests.

    {
        "access_log": {
            "if": "`${uri == '/health' ? false : true}`",
            "path": "..."
        }
    }

Example 3: Only log requests when the time is before 22:00.

    {
        "access_log": {
            "if": "`${new Date().getHours() < 22}`",
            "path": "..."
        }
    }

or

    {
        "access_log": {
            "if": "!`${new Date().getHours() >= 22}`",
            "path": "..."
        }
    }

Closes: https://github.com/nginx/unit/issues/594
2024-01-29 13:48:53 +08:00
Zhidao HONG
37abe2e463 HTTP: refactored out nxt_http_request_access_log().
This is in preparation for adding conditional access logging.
No functional changes.
2024-01-29 12:10:37 +08:00
Andrei Zeliankou
6452ca111c Node.js: fixed "httpVersion" variable format
According to the Node.js documenation this variable
should only include numbering scheme.

Thanks to @dbit-xia.

Closes: https://github.com/nginx/unit/issues/1085
2024-01-26 15:17:00 +00:00
Andrew Clayton
02d1984c91 HTTP: Remove short read check in nxt_http_static_buf_completion()
On GH, @tonychuuy reported an issue when using Units 'share' action they
would get the following error in the unit log

  2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client

This would happen when trying to serve files over a certain size and the
requested file would not be sent.

This is due to a somewhat bogus check in
nxt_http_static_buf_completion()

I say bogus because it's not clear what the check is trying to
accomplish and the error message is not entirely accurate either.

The check in question goes like

    n = pread(file->fd, buf, size, offset);
    return n;
    ...
    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */

            /* >> Problem is here << */
        }

       	/* log general error and finish */
    }

If the number of bytes read is not what we asked for and is > -1 (i.e
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.

This is what happens

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0

We get a request from a client, open the requested file and stat(2) it to
get the file size.

We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.

However what was happening in this case was this (showing a dummy file
of 149922 bytes)

  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109

We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.

Normally for a regular file you will only get less bytes than asked for
if the read call is interrupted by a signal or you're near the end of
file.

There is however at least another situation where this may happen, if
the file in question is being served from a network filesystem.

It turns out that was indeed the case here, the files where being served
over the 9P filesystem protocol. Unit was running in a docker container
in an Ubuntu VM under Windows/WSL2 and the files where being passed
through to the VM from Windows over 9P.

Whatever the intention of this check, it is clearly causing issues in
real world scenarios.

If it was really desired to check if the had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so lets just remove it.

With it removed we now get for the above test file

  recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
  openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
  newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
  mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
  writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
  pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
  close(23)                   = 0
  writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632

So we can see we do two pread(2)s's and a writev(2), then another two
pread(2)s and another writev(2) and all the file data has been read and
sent to the client.

Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d1510 ("Basic support for serving static files.")
Closes: https://github.com/nginx/unit/issues/1064
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2024-01-20 03:39:57 +00:00
Andrei Zeliankou
a1e00b4e28 White space formatting fixes
Closes: <https://github.com/nginx/unit/pull/1062>
2024-01-16 15:37:07 +00:00
Zhidao HONG
49aee6760a HTTP: added TSTR validation flag to the rewrite option.
This is to improve error messages for rewrite configuration.
Take the configuration as an example:

  {
      "rewrite": "`${a + "
  }

Previously, when applying it the user would see this error message:

  failed to apply previous configuration

After this change, the user will see this improved error message:

  the previous configuration is invalid: "SyntaxError: Unexpected end of input in default:1" in the "rewrite" value.
2023-12-14 16:38:24 +08:00
Andrew Clayton
88854cf146 Ruby: Prevent a possible integer underflow
Coverity picked up a potential issue with the previous commit d9f5f1fb7
("Ruby: Handle response field arrays") in that a size_t could wrap
around to SIZE_MAX - 1.

This would happen if we were given an empty array of header values.

Fixes: d9f5f1fb7 ("Ruby: Handle response field arrays")
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-12-13 03:20:25 +00:00
Andrew Clayton
d9f5f1fb74 Ruby: Handle response field arrays
@xeron on GitHub reported an issue whereby with a Rails 7.1 application
they were getting the following error

  2023/10/22 20:57:28 [error] 56#56 [unit] #8: Ruby: Wrong header entry 'value' from application
  2023/10/22 20:57:28 [error] 56#56 [unit] #8: Ruby: Failed to run ruby script

After some back and forth debugging it turns out rack was trying to send
back a header comprised of an array of values. E.g

  app = Proc.new do |env|
      ["200", {
          "Content-Type" => "text/plain",
          "X-Array-Header" => ["Item-1", "Item-2"],
      }, ["Hello World\n"]]
  end

  run app

It seems this became a possibility in rack v3.0[0]

So along with a header value type of T_STRING we need to also allow
T_ARRAY.

If we get a T_ARRAY we need to build up the header field using the given
values.

E.g

  "X-Array-Header" => ["Item-1", "", "Item-3", "Item-4"],

becomes

  X-Array-Header: Item-1; ; Item-3; Item-4

[0]: <https://github.com/rack/rack/blob/main/UPGRADE-GUIDE.md?plain=1#L26>

Reported-by: Ivan Larionov <xeron.oskom@gmail.com>
Closes: <https://github.com/nginx/unit/issues/974>
Link: <https://github.com/nginx/unit/pull/998>
Tested-by: Timo Stark <t.stark@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-12-08 13:48:33 +00:00
Sergey A. Osokin
6b6e3bd897 Fixed the MD5Encoder deprecation warning. 2023-11-20 10:56:41 -05:00
Andrei Zeliankou
1443d623d4 Node.js: ServerResponse.flushHeaders() implemented.
This closes #1006 issue on GitHub.

Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
2023-11-17 17:27:31 +00:00
Andrew Clayton
919cae7ff9 PHP: Fix a possible file-pointer leak.
In nxt_php_execute() it is possible we could bail out before cleaning up
the FILE * representing the PHP script to execute.

At this point we only need to call fclose(3) on it.

We could have possibly moved the opening of this file to later in the
function, but it is probably good to bail out as early as possible if we
can't open it.

This was found by Coverity.

Fixes: bebc03c72 ("PHP: Implement better error handling.")
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-11-15 03:34:49 +00:00
Andrei Vasiliu
27c787f437 Fix comments for src/nxt_unit.h.
This fixes some typos and grammatical errors in the comments of
src/nxt_unit.h

Link: <https://github.com/nginx/unit/pull/889>
[ Adjust summary and write commit message as this just contains the
  fixes from the PR and not actual changes - Andrew ]
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-11-14 16:48:16 +00:00
David CARLIER
dfdf948f89 Define nxt_cpu_pause for ARM64.
The isb instruction fits for spin loops where it allows to save cpu
power.

Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-11-10 02:59:49 +00:00
Andrew Clayton
5cfad9cc0b Python: Fix header field values character encoding.
On GitHub, @RomainMou reported an issue whereby HTTP header field values
where being incorrectly reported as non-ascii by the Python .isacii()
method.

For example, using the following test application

  def application(environ, start_response):
      t = environ['HTTP_ASCIITEST']

      t = "'" + t + "'" +  " (" + str(len(t)) + ")"

      if t.isascii():
          t = t + " [ascii]"
      else:
          t = t + " [non-ascii]"

      resp = t + "\n\n"

      start_response("200 OK", [("Content-Type", "text/plain")])
      return (bytes(resp, 'latin1'))

You would see the following

  $ curl -H "ASCIITEST: $" http://localhost:8080/
  '$' (1) [non-ascii]

'$' has an ASCII code of 0x24 (36).

The initial idea was to adjust the second parameter to the
PyUnicode_New() call from 255 to 127. This unfortunately had the
opposite effect.

  $ curl -H "ASCIITEST: $" http://localhost:8080/
  '$' (1) [ascii]

Good. However...

  $ curl -H "ASCIITEST: £" http://localhost:8080/
  '£' (2) [ascii]

Not good. Let's take a closer look at this.

'£' is not in basic ASCII, but is in extended ASCII with a value of 0xA3
(163). Its UTF-8 encoding is 0xC2 0xA3, hence the length of 2 bytes
above.

  $ strace -s 256 -e sendto,recvfrom curl -H "ASCIITEST: £" http://localhost:8080/
  sendto(5, "GET / HTTP/1.1\r\nHost: localhost:8080\r\nUser-Agent: curl/8.0.1\r\nAccept: */*\r\nASCIITEST: \302\243\r\n\r\n", 92, MSG_NOSIGNAL, NULL, 0) = 92
  recvfrom(5, "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nServer: Unit/1.30.0\r\nDate: Mon, 22 May 2023 12:44:11 GMT\r\nTransfer-Encoding: chunked\r\n\r\n12\r\n'\302\243' (2) [ascii]\n\n\r\n0\r\n\r\n", 102400, 0, NULL, NULL) = 160
  '£' (2) [ascii]

So we can see curl sent it UTF-8 encoded '\302\243\' which is C octal
escaped UTF-8 for 0xC2 0xA3, and we got the same back. But it should not
be marked as ASCII.

When doing PyUnicode_New(size, 127) it sets the buffer as ASCII. So we
need to use another function and that function would appear to be

  PyUnicode_DecodeCharmap()

Which creates an Unicode object with the correct ascii/non-ascii
properties based on the character encoding.

With this function we now get

  $ curl -H "ASCIITEST: $" http://localhost:8080/
  '$' (1) [ascii]

  $ curl -H "ASCIITEST: £" http://localhost:8080/
  '£' (2) [non-ascii]

and for good measure

  $ curl -H "ASCIITEST: $ £" http://localhost:8080/
  '$ £' (4) [non-ascii]

  $ curl -H "ASCIITEST: $" -H "ASCIITEST: £" http://localhost:8080/
  '$, £' (5) [non-ascii]

PyUnicode_DecodeCharmap() does require having the full string upfront so
we need to build up the potentially comma separated header field values
string before invoking this function.

I did not want to touch the Python 2.7 code (which may or may not even
be affected by this) so kept these changes completely isolated from
that, hence a slight duplication with the for () loop.

Python 2.7 was sunset on January 1st 2020[0], so this code will
hopefully just disappear soon anyway.

I also purposefully didn't touch other code that may well have similar
issues (such as the HTTP header field names) if we ever get issue
reports about them, we'll deal with them then.

[0]: <https://www.python.org/doc/sunset-python-2/>

Link: <https://docs.python.org/3/c-api/unicode.html>
Closes: <https://github.com/nginx/unit/issues/868>
Reported-by: RomainMou <https://github.com/RomainMou>
Tested-by: RomainMou <https://github.com/RomainMou>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-11-09 17:53:09 +00:00
Andrew Clayton
dd0c53a77d Python: Do nxt_unit_sptr_get() earlier in nxt_python_field_value().
This is a preparatory patch for fixing an issue with the encoding of
http header field values.

This patch simply moves the nxt_unit_sptr_get() to the top of the
function where we will need it in the next commit.

Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-11-08 21:53:46 +00:00
Andrei Zeliankou
78c133d0ca Var: simplified length calculation for $status variable. 2023-11-08 17:38:07 +00:00
Andrei Zeliankou
a88e857b5b Var: $request_id variable.
This variable contains a string that is formed using random data and
can be used as a unique request identifier.

This closes #714 issue on GitHub.
2023-11-08 17:34:59 +00:00
Zhidao HONG
6ae7142840 Removed trailing 0 from debug message in nxt_credential_get(). 2023-11-02 17:07:34 +08:00
Konstantin Pavlov
f1ce2a5ac2 Node.js: provide reasonable default paths for macOS. 2023-09-26 16:14:21 -07:00
Andrew Clayton
01d185cb52 Wasm: Re-add a removed 'const' qualifier in nxt_rt_wasmtime.c.
This was inadvertently removed in 76086d6d ("Wasm: Allow to set the HTTP
response status.")

Fixes: 76086d6d ("Wasm: Allow to set the HTTP response status.")
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-10-10 20:44:07 +01:00
Zhidao HONG
9c8b9a46a4 Refactored nxt_vsprintf(). 2023-10-10 14:30:02 +08:00
Andrew Clayton
30142d2a3c HTTP: Fix URL with query string rewrite.
On Github, @rlandgrebe reported an issue when trying to rewrite URLs
that contained query strings.

With the PHP language module we were in fact segfaulting (SIGSEGV) in
libphp

  [93960.462952] unitd[20940]: segfault at 7f307cef6476 ip 00007f2f81a94577 sp 00007fff28a777d0 error 4 in libphp-8.2.so[7f2f818df000+2fd000] likely on CPU 0 (core 0, socket 0)

  #0  0x00007f2abd494577 in php_default_treat_data (arg=1, str=0x0,
      destArray=<optimized out>)
      at /usr/src/debug/php-8.2.10-1.fc38.x86_64/main/php_variables.c:488
  488                     if (c_var && *c_var) {
  (gdb) p c_var
  $1 = 0x7f2bb8880676 <error: Cannot access memory at address 0x7f2bb8880676>

This was when trying to get the query string which somehow is pointing
off into the woods.

This gdb debug session when doing rewrite basically shows the core of
the issue

  (gdb) x /64bs req->fields
  ...
  0x7f7eaaaa8090: "GET"
  0x7f7eaaaa8094: "HTTP/1.1"
  0x7f7eaaaa809d: "::1"
  0x7f7eaaaa80a1: "::1"
  0x7f7eaaaa80a5: "8080"
  0x7f7eaaaa80aa: "localhost"
  0x7f7eaaaa80b4: "/test?q=a"
  0x7f7eaaaa80be: "/test"
  ...

  (gdb) p target_pos
  $4 = (void *) 0x7f7eaaaa80b4

  (gdb) p query_pos
  $6 = (void *) 0x7f7eaaaa6af6

  (gdb) p r->args->start
  $8 = (u_char *) 0x7f7ea4002b02 "q=a HTTP/1.1\r\nHost: localhost:8080\r\nUser-Agent: curl/8.0.1\r\nAccept: */*\r\n\r\n"
  (gdb) p r->target.start
  $9 = (u_char *) 0x7f7ea40040c0 "/test?q=a"

That last address, 0x7f7ea40040c0, looks out of wack, it should be
smaller than r->args->start.

That results in a calculation in nxt_router_prepare_msg()

  if (r->args->start != NULL) {
        query_pos = nxt_pointer_to(target_pos,
                                   r->args->start - r->target.start);

        nxt_unit_sptr_set(&req->query, query_pos);

  } else {

that goes negative that then is stored in req->query.offset which is a
uint32_t and so wraps backwards from UINT_MAX to give us an offset of a
little under 4GiB, hence the above invalid memory access.

All this happens due to in nxt_http_rewrite() if we have a URL with a
query string, we create a new memory allocation to store the transformed
URL and query string.

We set r->target to point to this new allocation, but we also need to
point r->args->start to the start of the query string in this new
allocation.

Reported-by: René Landgrebe <https://github.com/rlandgrebe>
Tested-by: René Landgrebe <https://github.com/rlandgrebe>
Tested-by: Liam Crilly <liam.crilly@nginx.com>
Fixes: 14d6d97b ("HTTP: added basic URI rewrite.")
Closes: <https://github.com/nginx/unit/issues/964>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-10-05 13:38:15 +01:00
Liam Crilly
7fac908742 Added routes array to the default configuration.
The default configuration previously contained just a listeners and
applications object. Since routes is now a principle configuration object,
and a recommended way of configurating Unit, it is now included in the
default configuration.

This change benefits new users because it explicitly introduces the three
principle configuration objects which leads more intuitively to the
documentation. Experienced users may choose to ignore or delete routes.

routes is defined as an array instead of an object because this change
is designed to assist new users, where the simpler form of routes is
easier to understand.
2023-10-02 09:43:57 +01:00
Zhidao HONG
b6216f0bb7 Java: fixed the calculation related to the response buffer.
We need to take into account the size of the nxt_unit_response_t
structure itself when calculating where to start appending data to in
memory.

Closes: <https://github.com/nginx/unit/issues/923>
Reported-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Andrew Clayton <a.clayton@nginx.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-28 15:21:16 +01:00
Andrei Zeliankou
2d0e502d2a Node.js: ServerRequest.destroy() implemented.
This closes #871 issue on GitHub.
2023-09-26 12:49:39 +01:00
Andrei Zeliankou
e0c2675774 Node.js: response body chunk can now be a Uint8Array.
Starting from Node.js 15.0.0 the chunk parameter of the response.write()
can be a Uint8Array.

This closes #870 issue on GitHub.
2023-09-26 12:49:39 +01:00
Andrew Clayton
0f630c3f60 Wasm: Allow uploads larger than 4GiB.
Currently Wasm modules are limited to a 32bit address space (until at
least the memory64 work is completed). All the counters etc in the
request structure were u32's. Which matched with 32bit memory
limitation.

However there is really no need to not allow >4GiB uploads that can be
saved off to disk or some such.

To do this we just need to increase the ->content_len &
->total_content_sent members to u64's.

However because we need the request structure to have the exact same
layout on 32bit (for Wasm modules) as it does on 64bit we need to re-jig
the order of some of these members and add a four-byte padding member.

Thus the request structure now looks like on 64bit (as shown by
pahole(1))

  struct nxt_wasm_request_s {
          uint32_t                   method_off;           /*     0     4 */
          uint32_t                   method_len;           /*     4     4 */
          uint32_t                   version_off;          /*     8     4 */
          uint32_t                   version_len;          /*    12     4 */
          uint32_t                   path_off;             /*    16     4 */
          uint32_t                   path_len;             /*    20     4 */
          uint32_t                   query_off;            /*    24     4 */
          uint32_t                   query_len;            /*    28     4 */
          uint32_t                   remote_off;           /*    32     4 */
          uint32_t                   remote_len;           /*    36     4 */
          uint32_t                   local_addr_off;       /*    40     4 */
          uint32_t                   local_addr_len;       /*    44     4 */
          uint32_t                   local_port_off;       /*    48     4 */
          uint32_t                   local_port_len;       /*    52     4 */
          uint32_t                   server_name_off;      /*    56     4 */
          uint32_t                   server_name_len;      /*    60     4 */
          /* --- cacheline 1 boundary (64 bytes) --- */
          uint64_t                   content_len;          /*    64     8 */
          uint64_t                   total_content_sent;   /*    72     8 */
          uint32_t                   content_sent;         /*    80     4 */
          uint32_t                   content_off;          /*    84     4 */
          uint32_t                   request_size;         /*    88     4 */
          uint32_t                   nfields;              /*    92     4 */
          uint32_t                   tls;                  /*    96     4 */
          char                       __pad[4];             /*   100     4 */
          nxt_wasm_http_field_t      fields[];             /*   104     0 */

          /* size: 104, cachelines: 2, members: 25 */
          /* last cacheline: 40 bytes */
  };

and the same structure (taken from unit-wasm) compiled as 32bit

  struct luw_req {
          u32                        method_off;           /*     0     4 */
          u32                        method_len;           /*     4     4 */
          u32                        version_off;          /*     8     4 */
          u32                        version_len;          /*    12     4 */
          u32                        path_off;             /*    16     4 */
          u32                        path_len;             /*    20     4 */
          u32                        query_off;            /*    24     4 */
          u32                        query_len;            /*    28     4 */
          u32                        remote_off;           /*    32     4 */
          u32                        remote_len;           /*    36     4 */
          u32                        local_addr_off;       /*    40     4 */
          u32                        local_addr_len;       /*    44     4 */
          u32                        local_port_off;       /*    48     4 */
          u32                        local_port_len;       /*    52     4 */
          u32                        server_name_off;      /*    56     4 */
          u32                        server_name_len;      /*    60     4 */
          /* --- cacheline 1 boundary (64 bytes) --- */
          u64                        content_len;          /*    64     8 */
          u64                        total_content_sent;   /*    72     8 */
          u32                        content_sent;         /*    80     4 */
          u32                        content_off;          /*    84     4 */
          u32                        request_size;         /*    88     4 */
          u32                        nr_fields;            /*    92     4 */
          u32                        tls;                  /*    96     4 */
          char                       __pad[4];             /*   100     4 */
          struct luw_hdr_field       fields[];             /*   104     0 */

          /* size: 104, cachelines: 2, members: 25 */
          /* last cacheline: 40 bytes */
  };

We can see the structures have the same layout, same size and no
padding.

We need the __pad member as otherwise I saw gcc and clang on Alpine
Linux automatically add the 'packed' attribute to the structure which
made the two structures not match.

Link: <https://github.com/WebAssembly/memory64>
Link: <https://github.com/nginx/unit-wasm>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-25 13:52:13 +01:00
Andrew Clayton
d5efa5f11f Wasm: Fix multiple successive calls to the request_handler.
When trying to upload files to the luw-upload-reflector demo[0] above a
certain size that would mean Unit would need to make more than two calls
to the request_handler function in the Wasm module we would get the
following error from wasmtime and the upload would stall on the third
call to the request_handler

  WASMTIME ERROR: failed to call function [->wasm_request_handler]
  error while executing at wasm backtrace:
      0: 0x5ce2 - <unknown>!memcpy
      1:  0x7df - luw_req_buf_append
                      at /home/andrew/src/unit-wasm/src/c/libunit-wasm.c:308:14
      2:  0x3a1 - luw_request_handler
                      at /home/andrew/src/unit-wasm/examples/c/luw-upload-reflector.c:110:3

  Caused by:
      wasm trap: out of bounds memory access

This was due to ->content_off (the offset of where the actual body
content starts in the request structure/memory) being some overly large
value.

This was largely down to me being an idiot!

Before calling the loop that makes the calls to the request_handler we
would calculate the new offset, which is now just the size of the
request structure as we don't re-send all the HTTP meta data and headers
etc. However because this value is in the request structure which is in
the shared memory and we use this same memory for requests and
responses, when we make a response we overwrite this request structure
with the response structure, so our ->content_off is now some wacked out
value when we make the next call to the request_handler.

To fix this we just need to reset ->content_off each time round the
loop.

There's also no point in setting ->nfields to 0, it has the same issue
as above, but doesn't get re-used by the Wasm module anyway.

[0]: <https://github.com/nginx/unit-wasm/blob/main/examples/c/luw-upload-reflector.c>

Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-25 13:52:13 +01:00
Andrew Clayton
76086d6d7a Wasm: Allow to set the HTTP response status.
This commit enables WebAssembly modules to set the HTTP response status
to something other than the previously hard coded '200 OK'.

To do this they can make a call to nxt_wasm_set_resp_status() providing
the required status code.

If this function isn't called the status code defaults to '200 OK'. The
WebAssembly module can also return -1 from the request_handler function
as a short cut to signal a '500 Internal Server Error'.

Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-25 13:49:36 +01:00
Andrew Clayton
c9961610ed Fix build on musl libc with clang.
As reported by @andypost on GitHub, if you try to build Unit on a system
that uses musl libc (such as Alpine Linux) with clang then you get the
following

clang -c -pipe -fPIC -fvisibility=hidden -O -W -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -fstrict-aliasing -Wstrict-overflow=5 -Wmissing-prototypes -Werror -g   -I src -I build/include   \
                      \
                     \
-o build/src/nxt_socketpair.o \
-MMD -MF build/src/nxt_socketpair.dep -MT build/src/nxt_socketpair.o \
src/nxt_socketpair.c
In file included from src/nxt_socketpair.c:8:
src/nxt_socket_msg.h:138:17: error: comparison of integers of different signs: 'unsigned long' and 'long' [-Werror,-Wsign-compare]
         cmsg = CMSG_NXTHDR(&msg, cmsg))
                ^~~~~~~~~~~~~~~~~~~~~~~
/usr/include/sys/socket.h:358:44: note: expanded from macro 'CMSG_NXTHDR'
        __CMSG_LEN(cmsg) + sizeof(struct cmsghdr) >= __MHDR_END(mhdr) - (unsigned char *)(cmsg) \
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/nxt_socketpair.c:8:
src/nxt_socket_msg.h:177:17: error: comparison of integers of different signs: 'unsigned long' and 'long' [-Werror,-Wsign-compare]
         cmsg = CMSG_NXTHDR(&msg, cmsg))
                ^~~~~~~~~~~~~~~~~~~~~~~
/usr/include/sys/socket.h:358:44: note: expanded from macro 'CMSG_NXTHDR'
        __CMSG_LEN(cmsg) + sizeof(struct cmsghdr) >= __MHDR_END(mhdr) - (unsigned char *)(cmsg) \
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.
make: *** [build/Makefile:261: build/src/nxt_socketpair.o] Error 1

GCC works fine, it seems to have some smarts so that it doesn't give
warnings on system header files.

This seems to be a long standing issue with musl libc (bad casting in
the CMSG_NXTHDR macro) and the workaround employed by several projects
is to disable the -Wsign-compare clang warning for the code in question.

So, that's what we do. We wrap the CMSG_NXTHDR macro in a function, so
we can use the pre-processor in it to selectively disable the warning.

Link: <https://github.com/dotnet/runtime/issues/16438>
Link: <https://git.openembedded.org/meta-openembedded/tree/meta-oe/recipes-devtools/breakpad/breakpad/0001-Turn-off-sign-compare-for-musl-libc.patch>
Link: <https://github.com/dotnet/corefx/blob/57ff88bb75a0/src/Native/Unix/System.Native/pal_networking.c#L811-L829>
Link: <https://patchwork.yoctoproject.org/project/oe/patch/20220407191438.3696227-1-stefan@datenfreihafen.org/>
Closes: <https://github.com/nginx/unit/issues/936>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-11 16:59:27 +01:00
Alejandro Colomar
7dd5ad93a4 Log: fixed typo.
Scripted change:

$ grep -ril recevied src/ | xargs sed -i s/recevied/received/

Reported-by: <https://github.com/jeffdafoe>
Closes: <https://github.com/nginx/unit/issues/920>
Cc: <https://github.com/meezaan>
Cc: Timo Stark <t.stark@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-07 23:13:34 +01:00
Konstantin Pavlov
ebcc92069d Added unit pkg-config file. 2023-08-01 10:16:17 -07:00
Andrew Clayton
47ff51009f Wasm: Add support for directory access.
Due to the sandboxed nature of WebAssembly, by default WASM modules
don't have any access to the underlying filesystem.

There is however a capabilities based mechanism[0] for allowing such
access.

This adds a config option to the 'wasm' application type;
'access.filesystem' which takes an array of directory paths that are
then made available to the WASM module. This access works recursively,
i.e everything under a specific path is allowed access to.

Example config might look like

  "access" {
      "filesystem": [
          "/tmp",
          "/var/tmp"
      ]
  }

The actual mechanism used allows directories to be mapped differently in
the guest. But at the moment we don't support that and just map say /tmp
to /tmp. This can be revisited if it's something users clamour for.

Network sockets are another resource that may be controlled in this
manner, for example there is a wasi_config_preopen_socket() function,
however this requires the runtime to open the network socket then
effectively pass this through to the guest.

This is something that can be revisited in the future if users desire
it.

[0]:
<https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-capabilities.md>

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17 13:09:49 +01:00
Andrew Clayton
e99854afdf Wasm: Wire up Wasm language module support to the config system.
This exposes various WebAssembly language module specific options.

The application type is "wasm".

There is a "module" option that is required, this specifies the full
path to the WebAssembly module to be run. This module should be in
binary format, i.e a .wasm file.

There are also currently eight function handlers that can be specified.
Three of them are _required_

1) request_handler

The main driving function. This may be called multiple times for a
single HTTP request if the request is larger than the shared memory.

2) malloc_handler

Used to allocate a chunk of memory at language module startup. This
memory is allocated from the WASM modules address space and is what is
sued for communicating between the WASM module (the guest) and Unit (the
host).

3) free_handler

Used to free the memory from above at language module shutdown.

Then there are the following five _optional_ handlers

1) module_init_handler

If set, called at language module startup.

2) module_end_handler

If set, called at language module shutdown.

3) request_init_handler

If set, called at the start of request. Called only once per HTTP
request.

4) request_end_handler

If set, called once all of a request has been sent to the WASM module.

5) response_end_handler

If set, called at the end of a request, once the WASM module has sent
all its headers and data.

Example config

  "applications": {
      "luw-echo-request": {
          "type": "wasm",
          "module": "/path/to/unit-wasm/examples/c/luw-echo-request.wasm",
          "request_handler": "luw_request_handler",
          "malloc_handler": "luw_malloc_handler",
          "free_handler": "luw_free_handler",
          "module_init_handler": "luw_module_init_handler",
          "module_end_handler": "luw_module_end_handler",
      }
  }

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17 13:09:49 +01:00
Andrew Clayton
6a211e2b74 Wasm: Add the core of initial WebAssembly language module support.
This adds the core of runtime WebAssembly[0] support. Future commits
will enable this in the Unit core and expose the configuration.

This introduces a new src/wasm directory for storing this source.

We are initially using Wasmtime[0] as the WebAssembly runtime, however
this has been designed with the ability to use different runtimes in
mind.

src/wasm/nxt_wasm.[ch] is the main interface to Unit.

src/wasm/nxt_rt_wasmtime.c is the Wasmtime runtime support. This is
nicely insulated from any knowledge of internal Unit workings.

Wasmtime is what loads and runs the Wasm modules.

The Wasm modules can export functions Wasmtime can call and Wasmtime can
export functions that the module can call.

We make use of both. The terminology used is that function exports are
what the Wasm module exports and function imports are what the Wasm
runtime exports to the module.

We currently have four function imports (functions exported by the
runtime to be called by the Wasm module).

1) nxt_wasm_get_init_mem_size

This allows Wasm modules to get the size of the initially allocated
shared memory. This is the size allocated at Unit startup and what the
Wasm modules can assume they have access to (in reality this shared
memory will likely be larger).

The amount of memory allocated at startup is NXT_WASM_MEM_SIZE which as
of this commit is 32MiB.

We do actually allocate NXT_WASM_MEM_SIZE + NXT_WASM_PAGE_SIZE at
startup which is an extra 64KiB (the smallest allocation unit), this is
to allow room for the response structure and so module developers can
just assume they have the full 32MiB for their actual response.

2) nxt_wasm_send_headers

This allows WASM modules to send their headers.

3) nxt_wasm_send_response

This allows WASM modules to send their response.

4) nxt_wasm_response_end

This allows WASM modules to inform Unit they have finished sending their
response. This calls nxt_unit_request_done()

Then there are currently up to eight functions that a module can export.
Three of which are required. These function can be named anything. I'll
use the Unit configuration names to refer to them

1) request_handler

The main driving function. This may be called multiple times for a
single HTTP request if the request is larger than the shared memory.

2) malloc_handler

Used to allocate a chunk of memory at language module startup. This
memory is allocated from the WASM modules address space and is what is
sued for communicating between the WASM module (the guest) and Unit (the
host).

3) free_handler

Used to free the memory from above at language module shutdown.

Then there are the following optional handlers

1) module_init_handler

If set, called at language module startup.

2) module_end_handler

If set, called at language module shutdown.

3) request_init_handler

If set, called at the start of request. Called only once per HTTP
request.

4) request_end_handler

If set, called once all of a request has been sent to the WASM module.

5) response_end_handler

If set, called at the end of a request, once the WASM module has sent
all its headers and data.

32bits

We currently support 32bit WASM modules, I.e wasm32-wasi. Newer version
of clang, 13+[2], do seem to have support for wasm64 as a target (which
uses a LP64 model). However it's not entirely clear if the WASI SDK
fully supports[3] this and by extension WASI libc/wasi-sysroot.

64bit support is something than can be explored more thoroughly in the
future.

As such in structures that are used to communicate between the host and
guest we use 32bit ints. Even when a single byte might be enough. This
is to avoid issues with structure layout differences between a 64bit
host and 32bit guest (I.e WASM module) and the need for various bits of
structure padding depending on host architecture. Instead everything is
4-byte aligned.

[0]: <https://webassembly.org/>
[1]: <https://wasmtime.dev/>
[2]: <https://reviews.llvm.org/rG670944fb20b226fc22fa993ab521125f9adbd30a>
[3]: <https://github.com/WebAssembly/wasi-sdk/issues/185>

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17 13:09:14 +01:00
Andrew Clayton
0c44439736 Wasm: Add core configuration data structure.
This is required to actually _build_ the Wasm language module.

The nxt_wasm_app_conf_t structure consists of the modules name, e.g
wasm, then the three required function handlers followed by the five
optional function handlers.

See the next commit for details of these function handlers.

We also need to include the u.wasm union entry that provides access to
the above structure.

The bulk of the configuration infrastructure will be added in a
subsequent commit.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-16 16:28:38 +01:00
Andrew Clayton
52b334acd1 Wasm: Register a new WebAssembly language module type.
This is the first patch in adding WebAssembly language module support.

This just adds a new NXT_APP_WASM type, required by subsequent commits.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-10 16:58:48 +01:00
Andrew Clayton
46573e6993 Index initialise the nxt_app_msg_prefix array.
This makes it much more clear what's what.

This is in preparation for adding WebAssembly language module support.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-10 16:58:35 +01:00
Zhidao HONG
a28bef097c HTTP: controlling response headers support. 2023-08-09 14:37:16 +08:00
Zhidao HONG
9f04d6db63 HTTP: stored matched action in nxt_http_request_t.
No functional changes.
2023-08-09 14:36:16 +08:00
Zhidao HONG
d0fdf5971f NJS: workaround for the warning in nxt_js_call() on Freebsd12 gcc. 2023-07-12 09:31:22 +08:00
Zhidao HONG
458722df55 Var: supported HTTP response header variables.
This commit adds the variable $response_header_NAME.
2023-07-01 12:18:22 +08:00
Zhidao HONG
c61ccec7b4 Variables refactoring.
This commit is to reimplement the variables with an unknown field
such as $header_{name} to make the parsing more generic,
it's a preparation for supporting response header variables.
2023-06-19 16:29:22 +08:00
Zhidao HONG
18d3637e4b NJS: supported 0.8.0. 2023-07-11 09:30:50 +08:00
Alejandro Colomar
c185ae7512 Fixed indentation.
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-06-30 14:38:34 +02:00
Zhidao HONG
a378f6aa31 HTTP: fixed variable caching.
When a variable is accessed in the Unit configuration, the value is cached.
This was useful prior to the URI rewrite feature, but now that the URI (more
precisely, the request target) can be rewritten, the contents of the variable
$uri (which contains the path part of the request target, and is decoded)
should not be cached anymore, or at least the cached value should be invalidated
after a URI rewrite.

Example:

{
	"rewrite": "/prefix$uri",
	"share": "$uri"
}

For a request line like GET /foo?bar=baz HTTP/1.1\r\n, the expected file
served in the response would be /prefix/foo, but due to the caching issue,
Unit currently serves /foo.
2023-05-25 00:27:55 +08:00
synodriver
b84f6ecad4 Python: Fix error checks in nxt_py_asgi_request_handler().
Signed-off-by: synodriver <diguohuangjiajinweijun@gmail.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
[ Re-word commit subject - Andrew ]
Fixes: c4c2f90c5b ("Python: ASGI server introduced.")
Closes: <https://github.com/nginx/unit/issues/895>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-06-01 00:25:40 +01:00
synodriver
93ed66958e Python: Add ASGI lifespan state support.
Lifespan state is a special dict in asgi lifespan scope, which allow
applications to persist data from the lifespan cycle to request/response
handling. The scope["state"] namespace provides a place to store these
sorts of things. The server will ensure that a shallow copy of the
namespace is passed into each subsequent request/response call into the
application.

Some frameworks are already taking advantage of this feature, for
example, starlette, and without this feature they wouldn't work
properly.

Signed-off-by: synodriver <diguohuangjiajinweijun@gmail.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
[ Minor code tweaks to avoid lines > 80 chars, static a function and
  re-work the PyMemberDef structure initialisation for Python <3.7
  and -Wwrite-strings compatibility - Andrew ]
Tested-by: <https://github.com/synodriver>
Tested-by: <https://github.com/hawiliali>
Closes: <https://github.com/nginx/unit/issues/864>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-06-01 00:25:03 +01:00
Alejandro Colomar
47cdfb6f30 Tests: fixed incorrect pointer assignment.
If we don't update the pointer before copying the request body, then we
get the behavior shown below.  After this patch, "foo\n" is rightly
appended at the end of the response body.

Request:

"GET / HTTP/1.1\r\nHost: _\nContent-Length: 4\n\nfoo\n"

Response body:

"""
Hello world!
foo
est data:
  Method: GET
  Protocol: HTTP/1.1
  Remote addr: 127.0.0.1
  Local addr: 127.0.0.1
  Target: /
  Path: /
  Fields:
    Host: _
    Content-Length: 4
  Body:
"""

Fixes: 1bb22d1e92 ("Unit application library.")
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-05-25 11:27:35 +02:00
Alejandro Colomar
f32858dcb7 Added back deprecated options to unitd.
We renamed the options recently, with the intention of keeping the old
names as supported but deprecated for some time, before removal.  This
was done with the configure script options, but in the unitd binary, we
accidentally removed the old names, causing some unintended breakage.
Keep support for the old names, albeit with a deprecation message to
stderr, for some time, until we decide to remove them.

Fixes: 5a37171f73 ("Added default values for pathnames.")
Closes: <https://github.com/nginx/unit/issues/876>
Reported-by: El RIDO <elrido@gmx.net>
Acked-by: Liam Crilly <liam@nginx.com>
Acked-by: Artem Konev <a.konev@f5.com>
Acked-by: Timo Stark <t.stark@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-05-21 01:01:43 +02:00