Commit Graph

28 Commits

Author SHA1 Message Date
Zhidao HONG
14d6d97bac HTTP: added basic URI rewrite.
This commit introduced the basic URI rewrite. It allows users to change request URI. Note the "rewrite" option ignores the contained query if any and the query from the request is preserverd.
An example:
"routes": [
    {
        "match": {
            "uri": "/v1/test"
        },
        "action": {
            "return": 200
        }
    },
    {
        "action": {
            "rewrite": "/v1$uri",
            "pass": "routes"
        }
    }
]

Reviewed-by: Alejandro Colomar <alx@nginx.com>
2023-04-20 23:20:41 +08:00
Alejandro Colomar
fcff55acb6 HTTP: optimizing $request_line.
Don't reconstruct a new string for the $request_line from the parsed
method, target, and HTTP version, but rather keep a pointer to the
original memory where the request line was received.

This will be necessary for implementing URI rewrites, since we want to
log the original request line, and not one constructed from the
rewritten target.

This implementation changes behavior (only for invalid requests) in the
following way:

Previous behavior was to log as many tokens from the request line as
were parsed validly, thus:

Request              -> access log              ; error log

"GET / HTTP/1.1"     -> "GET / HTTP/1.1"     OK ; =
"GET   / HTTP/1.1"   -> "GET / HTTP/1.1"    [1] ; =
"GET / HTTP/2.1"     -> "GET / HTTP/2.1"     OK ; =
"GET / HTTP/1."      -> "GET / HTTP/1."     [2] ; "GET / HTTP/1. [null]"
"GET / food"         -> "GET / food"        [2] ; "GET / food [null]"
"GET / / HTTP/1.1"   -> "GET / / HTTP/1.1"  [2] ; =
"GET /  / HTTP/1.1"  -> "GET /  / HTTP/1.1" [2] ; =
"GET food HTTP/1.1"  -> "GET"                   ; "GET [null] [null]"
"OPTIONS * HTTP/1.1" -> "OPTIONS"           [3] ; "OPTIONS [null] [null]"
"FOOBAR baz HTTP/1.1"-> "FOOBAR"                ; "FOOBAR [null] [null]"
"FOOBAR / HTTP/1.1"  -> "FOOBAR / HTTP/1.1"     ; =
"get / HTTP/1.1"     -> "-"                     ; " [null] [null]"
""                   -> "-"                     ; " [null] [null]"

This behavior was rather inconsistent.  We have several options to go
forward with this patch:

-  NGINX behavior.

   Log the entire request line, up to '\r' | '\n', even if it was
   invalid.

   This is the most informative alternative.  However, RFC-complying
   requests will probably not send invalid requests.

   This information would be interesting to users where debugging
   requests constructed manually via netcat(1) or a similar tool, or
   maybe for debugging a client, are important.  It might be interesting
   to support this in the future if our users are interested; for now,
   since this approach requires looping over invalid requests twice,
   that's an overhead that we better avoid.

-  Previous Unit behavior

   This is relatively fast (almost as fast as the next alternative, the
   one we chose), but the implementation is ugly, in that we need to
   perform the same operation in many places around the code.

   If we want performance, probably the next alternative is better; if
   we want to be informative, then the first one is better (maybe in
   combination with the third one too).

-  Chosen behavior

   Only logging request lines when the request is valid.  For any
   invalid request, or even unsupported ones, the request line will be
   logged as "-".  Thus:

   Request              -> access log [4]

   "GET / HTTP/1.1"     -> "GET / HTTP/1.1"     OK
   "GET   / HTTP/1.1"   -> "GET   / HTTP/1.1"  [1]
   "GET / HTTP/2.1"     -> "-"                 [3]
   "GET / HTTP/1."      -> "-"
   "GET / food"         -> "-"
   "GET / / HTTP/1.1"   -> "GET / / HTTP/1.1"  [2]
   "GET /  / HTTP/1.1"  -> "GET /  / HTTP/1.1" [2]
   "GET food HTTP/1.1"  -> "-"
   "OPTIONS * HTTP/1.1" -> "-"
   "FOOBAR baz HTTP/1.1"-> "-"
   "FOOBAR / HTTP/1.1"  -> "FOOBAR / HTTP/1.1"
   "get / HTTP/1.1"     -> "-"
   ""                   -> "-"

   This is less informative than previous behavior, but considering how
   inconsistent it was, and that RFC-complying agents will probably not
   send us such requests, we're ready to lose that information in the
   log.  This is of course the fastest and simplest implementation we
   can get.

   We've chosen to implement this alternative in this patch.  Since we
   modified the behavior, this patch also changes the affected tests.

[1]:  Multiple successive spaces as a token delimiter is allowed by the
      RFC, but it is discouraged, and considered a security risk.  It is
      currently supported by Unit, but we will probably drop support for
      it in the future.

[2]:  Unit currently supports spaces in the request-target.  This is
      a violation of the relevant RFC (linked below), and will be fixed
      in the future, and consider those targets as invalid, returning
      a 400 (Bad Request), and thus the log lines with the previous
      inconsistent behavior would be changed.

[3]:  Not yet supported.

[4]:  In the error log, regarding the "log_routes" conditional logging
      of the request line, we only need to log the request line if it
      was valid.  It doesn't make sense to log "" or "-" in case that
      the request was invalid, since this is only useful for
      understanding decisions of the router.  In this case, the access
      log is more appropriate, which shows that the request was invalid,
      and a 400 was returned.  When the request line is valid, it is
      printed in the error log exactly as in the access log.

Link: <https://datatracker.ietf.org/doc/html/rfc9112#section-3>
Suggested-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Cc: Timo Stark <t.stark@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Cc: Andrew Clayton <a.clayton@nginx.com>
Cc: Artem Konev <a.konev@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-04-12 11:50:56 +02:00
Andrew Clayton
4418f99cd4 Constified numerous function parameters.
As was pointed out by the cppcheck[0] static code analysis utility we
can mark numerous function parameters as 'const'. This acts as a hint to
the compiler about our intentions and the compiler will tell us when we
deviate from them.

[0]: https://cppcheck.sourceforge.io/
2022-06-22 00:30:44 +02:00
Valentin Bartenev
fb80502513 HTTP parser: allowed more characters in header field names.
Previously, all requests that contained in header field names characters other
than alphanumeric, or "-", or "_" were rejected with a 400 "Bad Request" error
response.

Now, the parser allows the same set of characters as specified in RFC 7230,
including: "!", "#", "$", "%", "&", "'", "*", "+", ".", "^", "`", "|", and "~".
Header field names that contain only these characters are considered valid.

Also, there's a new option introduced: "discard_unsafe_fields".  It accepts
boolean value and it is set to "true" by default.

When this option is "true", all header field names that contain characters
in valid range, but other than alphanumeric or "-" are skipped during parsing.
When the option is "false", these header fields aren't skipped.

Requests with non-valid characters in header field names according to
RFC 7230 are rejected regardless of "discard_unsafe_fields" setting.

This closes #422 issue on GitHub.
2020-11-17 16:50:06 +03:00
Igor Sysoev
65799c7252 Upstream chunked transfer encoding support. 2020-06-23 14:16:45 +03:00
Max Romanov
6bda9b5eeb Using malloc/free for the http fields hash.
This is required due to lack of a graceful shutdown: there is a small gap
between the runtime's memory pool release and router process's exit. Thus, a
worker thread may start processing a request between these two operations,
which may result in an http fields hash access and subsequent crash.

To simplify issue reproduction, it makes sense to add a 2 sec sleep before
exit() in nxt_runtime_exit().
2020-04-16 17:09:23 +03:00
Igor Sysoev
ddde9c23cf Initial proxy support. 2019-11-14 16:39:54 +03:00
Valentin Bartenev
f7d3db314d HTTP parser: removed unused "exten" field.
This field was intended for MIME type lookup by file extension when serving
static files, but this use case is too narrow; only a fraction of requests
targets static content, and the URI presumably isn't rewritten.  Moreover,
current implementation uses the entire filename for MIME type lookup if the
file has no extension.

Instead of extracting filenames and extensions when parsing requests, it's
easier to obtain them right before serving static content; this behavior is
already implemented.  Thus, we can drop excessive logic from parser.
2019-09-30 19:11:17 +03:00
Valentin Bartenev
3b77e402a9 HTTP parser: removed unused "plus_in_target" flag. 2019-09-16 20:17:42 +03:00
Valentin Bartenev
56f4085b9d HTTP parser: removed unused "offset" field.
Thanks to 洪志道 (Hong Zhi Dao).
2019-09-16 20:17:42 +03:00
Valentin Bartenev
2fb7a1bfb9 HTTP parser: removed unused "exten_start" and "args_start" fields. 2019-09-16 20:17:42 +03:00
Valentin Bartenev
64be8717bd Configuration: added ability to access object members with slashes.
Now URI encoding can be used to escape "/" in the request path:

  GET /config/listeners/unix:%2Fpath%2Fto%2Fsocket/
2019-09-16 20:17:42 +03:00
Max Romanov
2a6af1d214 Added "extern" to nxt_http_fields_hash_proto to avoid link issues. 2019-09-09 17:39:24 +03:00
Max Romanov
29911538ea Improving response header fields processing.
Fields are filtered one by one before being added to fields list.
This avoids adding and then skipping connection-specific fields.
2019-08-16 00:56:38 +03:00
Igor Sysoev
0ba7cfce75 Added routing based on header fields. 2019-05-30 15:33:51 +03:00
Valentin Bartenev
0c38ff0e66 Checking for major HTTP version. 2018-01-15 20:50:20 +03:00
Valentin Bartenev
a073616fc3 Improved HTTP version representation. 2018-01-15 20:50:14 +03:00
Valentin Bartenev
3fb140d6d2 HTTP parser: improved error reporting. 2018-01-15 20:49:59 +03:00
Valentin Bartenev
45d08d5145 HTTP parser: introduced nxt_http_parse_fields(). 2017-12-27 15:45:23 +03:00
Valentin Bartenev
8830d73261 HTTP parser: reworked header fields handling. 2017-12-25 17:04:22 +03:00
Max Romanov
f3107f3896 Complex target parser copied from NGINX.
nxt_app_request_header_t fields renamed:
- 'path' renamed to 'target'.
- 'path_no_query' renamed to 'path' and contains parsed value.
2017-07-05 13:31:45 +03:00
Valentin Bartenev
accb489492 HTTP parser: reduced memory consumption of header fields list. 2017-06-20 22:32:13 +03:00
Igor Sysoev
f888a5310c Using new memory pool implementation. 2017-06-20 19:49:17 +03:00
Valentin Bartenev
db6642f374 HTTP parser: decoupled header fields processing. 2017-06-13 20:11:29 +03:00
Valentin Bartenev
4df646a258 HTTP parser. 2017-03-01 15:29:18 +03:00
Valentin Bartenev
fde4d18e3a Removed legacy HTTP parser. 2017-03-01 15:17:55 +03:00
Igor Sysoev
de532922d9 Introducing tasks. 2017-01-23 19:56:03 +03:00
Igor Sysoev
16cbf3c076 Initial version. 2017-01-17 20:00:00 +03:00