50 Commits

Author SHA1 Message Date
Alejandro Colomar
1b05161107 Removed the unsafe nxt_memcmp() wrapper for memcmp(3).
The casts are unnecessary, since memcmp(3)'s arguments are 'void *'.
It might have been necessary in the times of K&R, where 'void *' didn't
exist.  Nowadays, it's unnecessary, and _very_ unsafe, since casts can
hide all classes of bugs by silencing most compiler warnings.

The changes from nxt_memcmp() to memcmp(3) were scripted:

$ find src/ -type f \
  | grep '\.[ch]$' \
  | xargs sed -i 's/nxt_memcmp/memcmp/'

Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-11-04 00:30:27 +01:00
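
The rationale in the commit above can be sketched as follows. This is only an illustration: `cast_memcmp` is a hypothetical stand-in for a casting wrapper (the real nxt_memcmp() definition is not shown in this log), and `prefix_equal` is an invented helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a casting wrapper.  Because of the casts,
 * a pointer-sized integer passed by mistake would compile silently,
 * whereas plain memcmp(3) would make the compiler complain.
 */
#define cast_memcmp(s1, s2, n)                                                \
    memcmp((const char *) (s1), (const char *) (s2), (n))

/* Illustrative helper: memcmp(3) takes 'const void *', so no casts needed. */
static int
prefix_equal(const unsigned char *buf, const char *prefix, size_t n)
{
    assert(buf != NULL && prefix != NULL);

    return memcmp(buf, prefix, n) == 0;
}
```

Calling memcmp(3) directly keeps the compiler's type checking intact, which is the point the commit message makes.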
Andrew Clayton
4418f99cd4 Constified numerous function parameters.
As was pointed out by the cppcheck[0] static code analysis utility we
can mark numerous function parameters as 'const'. This acts as a hint to
the compiler about our intentions and the compiler will tell us when we
deviate from them.

[0]: https://cppcheck.sourceforge.io/
2022-06-22 00:30:44 +02:00
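
As a small sketch of what such a 'const' parameter buys, consider the invented helper below (`count_byte` is not taken from the Unit sources). Any attempt to write through `buf` inside the function would be diagnosed at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper: 'const' documents that the buffer is only read. */
static size_t
count_byte(const unsigned char *buf, size_t len, unsigned char c)
{
    size_t  i, n;

    assert(buf != NULL || len == 0);

    n = 0;

    for (i = 0; i < len; i++) {
        /* "buf[i] = 0;" here would be rejected by the compiler. */
        if (buf[i] == c) {
            n++;
        }
    }

    return n;
}
```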
Alejandro Colomar
952bcc50bf Fixed #define style.
We had a mix of styles for declaring function-like macros:

Style A:
 #define                    \
 foo()                      \
     do {                   \
         ...                \
     } while (0)

Style B:
 #define foo()              \
     do {                   \
         ...                \
     } while (0)

We had a similar number of occurrences of each style:

 $ grep -rnI '^\w*(.*\\' | wc -l
 244
 $ grep -rn 'define.*(.*)' | wc -l
 239

(Those regexes aren't perfect, but they are a very decent approximation.)

Real examples:

 $ find src -type f | xargs sed -n '/^nxt_double_is_zero/,/^$/p'
 nxt_double_is_zero(f)                                                         \
     (fabs(f) <= FLT_EPSILON)

 $ find src -type f | xargs sed -n '/define nxt_http_field_set/,/^$/p'
 #define nxt_http_field_set(_field, _name, _value)                             \
     do {                                                                      \
         (_field)->name_length = nxt_length(_name);                            \
         (_field)->value_length = nxt_length(_value);                          \
         (_field)->name = (u_char *) _name;                                    \
         (_field)->value = (u_char *) _value;                                  \
     } while (0)

I'd like to standardize on a single style for them, and IMO,
having the identifier in the same line as #define is a better
option for the following reasons:

- Programmers are used to `#define foo() ...` (readability).
- One less line of code.
- The program for finding them is really simple (see below).

 function grep_ngx_func()
 {
     if (($# != 1)); then
         >&2 echo "Usage: ${FUNCNAME[0]} <func>";
         return 1;
     fi;

     find src -type f \
     | grep '\.[ch]$' \
     | xargs grep -l "$1" \
     | sort \
     | xargs pcregrep -Mn "(?s)^\$[\w\s*]+?^$1\(.*?^}";

     find src -type f \
     | grep '\.[ch]$' \
     | xargs grep -l "$1" \
     | sort \
     | xargs pcregrep -Mn "(?s)define $1\(.*?^$" \
     | sed -E '1s/^[^:]+:[0-9]+:/&\n\n/';
 }

 $ grep_ngx_func
 Usage: grep_ngx_func <func>

 $ grep_ngx_func nxt_http_field_set
 src/nxt_http.h:98:

 #define nxt_http_field_set(_field, _name, _value)                             \
     do {                                                                      \
         (_field)->name_length = nxt_length(_name);                            \
         (_field)->value_length = nxt_length(_value);                          \
         (_field)->name = (u_char *) _name;                                    \
         (_field)->value = (u_char *) _value;                                  \
     } while (0)

 $ grep_ngx_func nxt_sprintf
 src/nxt_sprintf.c:56:

 u_char * nxt_cdecl
 nxt_sprintf(u_char *buf, u_char *end, const char *fmt, ...)
 {
     u_char   *p;
     va_list  args;

     va_start(args, fmt);
     p = nxt_vsprintf(buf, end, fmt, args);
     va_end(args);

     return p;
 }

................
Scripted change:
................

$ find src -type f \
  | grep '\.[ch]$' \
  | xargs sed -i '/define *\\$/{N;s/ *\\\n/ /;s/        //}'
2022-05-03 12:11:14 +02:00
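
A minimal sketch of "Style B" follows, with the identifier on the same line as #define. All `demo_` names are hypothetical and only loosely modeled on nxt_http_field_set(). The unbraced if/else is deliberate: it shows that the do { ... } while (0) body expands to a single statement.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char  *name;
    size_t      name_length;
} demo_field_t;

/* Style B: the macro name follows #define on the same line. */
#define demo_field_set(_field, _name)                                         \
    do {                                                                      \
        (_field)->name = (_name);                                             \
        (_field)->name_length = strlen(_name);                                \
    } while (0)

static size_t
demo_name_length(const char *name)
{
    demo_field_t  f;

    assert(name != NULL);

    /* Each macro call plus its ';' forms exactly one statement. */
    if (name[0] != '\0')
        demo_field_set(&f, name);
    else
        demo_field_set(&f, "");

    return f.name_length;
}
```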
Valentin Bartenev
fb80502513 HTTP parser: allowed more characters in header field names.
Previously, all requests with header field names containing characters other
than alphanumeric, "-", or "_" were rejected with a 400 "Bad Request" error
response.

Now, the parser allows the same set of characters as specified in RFC 7230,
including: "!", "#", "$", "%", "&", "'", "*", "+", ".", "^", "`", "|", and "~".
Header field names that contain only these characters are considered valid.

Also, there's a new option introduced: "discard_unsafe_fields".  It accepts
a boolean value and is set to "true" by default.

When this option is "true", all header fields whose names contain characters
in the valid range other than alphanumeric or "-" are skipped during parsing.
When the option is "false", these header fields aren't skipped.

Requests with non-valid characters in header field names according to
RFC 7230 are rejected regardless of "discard_unsafe_fields" setting.

This closes #422 issue on GitHub.
2020-11-17 16:50:06 +03:00
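
The three outcomes described in the commit above (rejected, skipped as unsafe, accepted) could be sketched as below. This is not the parser's actual code: every `demo_`/`DEMO_` name is hypothetical, and the character set is the RFC 7230 "tchar" rule.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    DEMO_FIELD_INVALID = 0,   /* not a valid RFC 7230 token: reject */
    DEMO_FIELD_UNSAFE,        /* valid, but skipped when discarding is on */
    DEMO_FIELD_OK
} demo_field_class_t;

static int
demo_is_alnum(unsigned char c)
{
    return (c >= '0' && c <= '9')
           || (c >= 'A' && c <= 'Z')
           || (c >= 'a' && c <= 'z');
}

/* RFC 7230 "tchar": characters allowed in a header field name. */
static int
demo_is_tchar(unsigned char c)
{
    return c != '\0'
           && (demo_is_alnum(c) || strchr("!#$%&'*+-.^_`|~", c) != NULL);
}

static demo_field_class_t
demo_classify_field_name(const char *name, size_t len)
{
    size_t              i;
    unsigned char       c;
    demo_field_class_t  res;

    assert(name != NULL);

    if (len == 0) {
        return DEMO_FIELD_INVALID;
    }

    res = DEMO_FIELD_OK;

    for (i = 0; i < len; i++) {
        c = (unsigned char) name[i];

        if (!demo_is_tchar(c)) {
            return DEMO_FIELD_INVALID;
        }

        /* Valid, but neither alphanumeric nor "-": treated as unsafe. */
        if (!demo_is_alnum(c) && c != '-') {
            res = DEMO_FIELD_UNSAFE;
        }
    }

    return res;
}
```

With this classification, "X_Custom" is valid but unsafe, so it would be skipped only while "discard_unsafe_fields" is "true".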
Max Romanov
6bda9b5eeb Using malloc/free for the http fields hash.
This is required due to lack of a graceful shutdown: there is a small gap
between the runtime's memory pool release and router process's exit. Thus, a
worker thread may start processing a request between these two operations,
which may result in an http fields hash access and subsequent crash.

To simplify issue reproduction, it makes sense to add a 2 sec sleep before
exit() in nxt_runtime_exit().
2020-04-16 17:09:23 +03:00
Igor Sysoev
ddde9c23cf Initial proxy support. 2019-11-14 16:39:54 +03:00
Valentin Bartenev
f7d3db314d HTTP parser: removed unused "exten" field.
This field was intended for MIME type lookup by file extension when serving
static files, but this use case is too narrow; only a fraction of requests
targets static content, and the URI presumably isn't rewritten.  Moreover,
the current implementation uses the entire filename for MIME type lookup if
the file has no extension.

Instead of extracting filenames and extensions when parsing requests, it's
easier to obtain them right before serving static content; this behavior is
already implemented.  Thus, we can drop the excessive logic from the parser.
2019-09-30 19:11:17 +03:00
Valentin Bartenev
2dbda125db HTTP parser: normalization of paths ending with "." or "..".
Earlier, the paths were normalized only if there was a "/" at the end, which
is wrong according to section 5.2.4 of RFC 3986 and hypothetically may allow
access to the directory above the document root.
2019-09-30 19:11:17 +03:00
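
For reference, a simplified version of the dot-segment removal from section 5.2.4 of RFC 3986 might look like the sketch below. It is not the parser's implementation: `demo_remove_dot_segments` is a hypothetical name, the input is assumed to be an absolute path without a query string, and `out` is assumed to hold at least strlen(in) + 1 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static size_t
demo_remove_dot_segments(const char *in, char *out)
{
    size_t      n, len;
    const char  *seg, *end;

    assert(in != NULL && out != NULL && in[0] == '/');

    n = 0;

    while (*in == '/') {
        seg = in + 1;
        end = strchr(seg, '/');
        len = (end != NULL) ? (size_t) (end - seg) : strlen(seg);

        if (len == 1 && seg[0] == '.') {
            /* "/." is dropped; a final "/." still leaves a trailing "/". */
            if (end == NULL) {
                out[n++] = '/';
            }

        } else if (len == 2 && seg[0] == '.' && seg[1] == '.') {
            /* "/.." removes the last output segment, never going above "/". */
            while (n > 0 && out[--n] != '/') {
                /* void */
            }

            if (end == NULL) {
                out[n++] = '/';
            }

        } else {
            out[n++] = '/';
            memcpy(out + n, seg, len);
            n += len;
        }

        in = seg + len;
    }

    out[n] = '\0';

    return n;
}
```

Note that "/a/b/.." and "/a/b/." are handled even though they have no trailing "/", which is the case the commit above addresses.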
Valentin Bartenev
6352c21a58 HTTP parser: fixed parsing of target after literal space character.
In theory, all space characters in request target must be encoded; however,
some clients may violate the specification.  For the sake of interoperability,
Unit supports unencoded space characters.

Previously, if there was a space character before the extension or arguments
parts, those parts weren't recognized.  Also, quoted symbols and complex
target weren't detected after a space character.
2019-09-17 18:40:21 +03:00
Valentin Bartenev
3b77e402a9 HTTP parser: removed unused "plus_in_target" flag. 2019-09-16 20:17:42 +03:00
Valentin Bartenev
2fb7a1bfb9 HTTP parser: removed unused "exten_start" and "args_start" fields. 2019-09-16 20:17:42 +03:00
Valentin Bartenev
64be8717bd Configuration: added ability to access object members with slashes.
Now URI encoding can be used to escape "/" in the request path:

  GET /config/listeners/unix:%2Fpath%2Fto%2Fsocket/
2019-09-16 20:17:42 +03:00
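
One way to picture the escaping described above: the path is split on literal "/" first, and each segment is percent-decoded only afterwards, so "%2F" ends up inside a member name. The sketch below is an assumption about the general approach, not the actual configuration code, and the `demo_` names are hypothetical. `out` is assumed to be large enough for the decoded segment.

```c
#include <assert.h>
#include <stddef.h>

static int
demo_hex(unsigned char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }

    c |= 0x20;  /* fold 'A'..'F' to lower case */

    if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    }

    return -1;
}

/*
 * Decodes the first segment of "path" (up to a literal '/') into "out".
 * Returns a pointer to the unparsed rest, or NULL on a malformed escape.
 */
static const char *
demo_next_member(const char *path, char *out)
{
    int  hi, lo;

    assert(path != NULL && out != NULL);

    while (*path != '\0' && *path != '/') {

        if (*path == '%') {
            hi = demo_hex((unsigned char) path[1]);
            lo = (hi >= 0) ? demo_hex((unsigned char) path[2]) : -1;

            if (lo < 0) {
                return NULL;
            }

            *out++ = (char) (hi * 16 + lo);
            path += 3;

        } else {
            *out++ = *path++;
        }
    }

    *out = '\0';

    return path;
}
```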
Max Romanov
29911538ea Improving response header fields processing.
Fields are filtered one by one before being added to fields list.
This avoids adding and then skipping connection-specific fields.
2019-08-16 00:56:38 +03:00
Igor Sysoev
0ba7cfce75 Added routing based on header fields. 2019-05-30 15:33:51 +03:00
Andrey Zelenkov
22de5fcddf Style. 2019-03-11 17:31:59 +03:00
Valentin Bartenev
11cecce114 HTTP parser: relaxed checking of fields values.
Allowing characters up to 0xFF doesn't conflict with RFC 7230.
Particularly, this makes it possible to pass unencoded UTF-8 data
through HTTP headers, which can be useful.
2018-07-03 15:18:16 +03:00
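
A rough sketch of such a relaxed check, following the RFC 7230 grammar (HTAB, SP, VCHAR, and obs-text 0x80-0xFF), is shown below. `demo_field_value_is_valid` is a hypothetical name, and whether the real parser treats DEL (0x7F) this way is an assumption, not something this log states.

```c
#include <assert.h>
#include <stddef.h>

static int
demo_field_value_is_valid(const unsigned char *v, size_t len)
{
    size_t  i;

    assert(v != NULL || len == 0);

    for (i = 0; i < len; i++) {

        if (v[i] == '\t') {
            continue;           /* tabs are allowed inside the value */
        }

        if (v[i] < 0x20 || v[i] == 0x7F) {
            return 0;           /* control characters are rejected */
        }

        /* 0x20..0x7E and 0x80..0xFF (e.g. UTF-8 bytes) pass through. */
    }

    return 1;
}
```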
Igor Sysoev
606eda045b Removed '\r' and '\n' artifact macros. 2018-06-25 16:56:45 +03:00
Valentin Bartenev
41317e37da HTTP parser: saving partial method.
This is useful for log purposes.
2018-04-10 16:51:22 +03:00
Valentin Bartenev
8d697e8004 HTTP parser: saving unsupported version.
This is useful for log purposes.
2018-04-10 16:51:22 +03:00
Valentin Bartenev
b1b9c78362 HTTP parser: correct "target" for partial or invalid request line. 2018-04-10 16:51:22 +03:00
Valentin Bartenev
d15b4ca906 Style. 2018-04-05 15:49:41 +03:00
Valentin Bartenev
0665896a55 Style: capitalized letters in hexadecimal literals. 2018-04-04 18:13:05 +03:00
Valentin Bartenev
701a54c177 HTTP parser: excluding leading and trailing tabs from field values.
As required by RFC 7230.
2018-03-15 21:08:29 +03:00
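
Excluding the optional whitespace around a field value can be sketched as follows. `demo_trim_ows` is a hypothetical helper, not the parser's code. It narrows a (pointer, length) pair in place instead of copying.

```c
#include <assert.h>
#include <stddef.h>

/* Skips leading and trailing spaces and tabs; returns the new length. */
static size_t
demo_trim_ows(const char **start, size_t len)
{
    const char  *p, *end;

    assert(start != NULL && *start != NULL);

    p = *start;
    end = p + len;

    while (p < end && (*p == ' ' || *p == '\t')) {
        p++;
    }

    while (end > p && (end[-1] == ' ' || end[-1] == '\t')) {
        end--;
    }

    *start = p;

    return (size_t) (end - p);
}
```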
Valentin Bartenev
0b628bfe48 HTTP parser: allowing tabs in field values as per RFC 7230. 2018-03-15 21:07:57 +03:00
Valentin Bartenev
3d2f85d9ca HTTP parser: restricting allowed characters in fields values.
According to RFC 7230 only printable 7-bit ASCII characters are allowed
in field values.
2018-03-15 21:07:56 +03:00
Valentin Bartenev
5a003df1fe HTTP parser: fixed parsing of field values ending with space.
This closes #82 issue on GitHub.
2018-03-15 20:52:39 +03:00
Valentin Bartenev
7fe8f72364 HTTP parser: simplified nxt_http_parse_field_value().
There's no need for the loop after 4ac474b68658.

Found by Coverity (CID 259713).
2018-01-25 10:31:22 +03:00
Valentin Bartenev
477e8177b7 HTTP parser: restricting control chars in header fields values.
This also fixes an infinite loop here (found with honggfuzz).
2018-01-24 15:02:56 +03:00
Valentin Bartenev
0c38ff0e66 Checking for major HTTP version. 2018-01-15 20:50:20 +03:00
Valentin Bartenev
a073616fc3 Improved HTTP version representation. 2018-01-15 20:50:14 +03:00
Valentin Bartenev
3fb140d6d2 HTTP parser: improved error reporting. 2018-01-15 20:49:59 +03:00
Valentin Bartenev
e8aada94de HTTP parser: allowing underscore in header field names. 2018-01-09 16:50:47 +03:00
Valentin Bartenev
45d08d5145 HTTP parser: introduced nxt_http_parse_fields(). 2017-12-27 15:45:23 +03:00
Valentin Bartenev
95a9cb94d5 HTTP parser: fixed memory overflow in the collisions test.
The level hash uses the NULL value as the indicator of a free entry in a bucket.
So, inserting a NULL value breaks the hash and can lead to a bucket overflow.

In case of the collision counter, the value wasn't initialized, since it's not
needed for the purpose of checking collisions.  As a result, it might contain
any garbage from the stack and in some rare cases the value was NULL.

Now the value is initialized.
2017-12-26 17:18:57 +03:00
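
The failure mode can be pictured with a toy bucket that, like the level hash described above, treats NULL as "free". This is a deliberately simplified sketch with hypothetical `demo_` names. It does not reproduce the real level hash layout or the exact overflow.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BUCKET_SLOTS  4

typedef struct {
    const void  *slot[DEMO_BUCKET_SLOTS];   /* NULL marks a free slot */
} demo_bucket_t;

/* Stores "value" in the first free slot; returns -1 if the bucket is full. */
static int
demo_bucket_insert(demo_bucket_t *b, const void *value)
{
    unsigned  i;

    assert(b != NULL);

    for (i = 0; i < DEMO_BUCKET_SLOTS; i++) {
        if (b->slot[i] == NULL) {
            b->slot[i] = value;
            return 0;
        }
    }

    return -1;
}

/* Counts the slots that look occupied, i.e. hold a non-NULL value. */
static unsigned
demo_bucket_occupied(const demo_bucket_t *b)
{
    unsigned  i, n;

    assert(b != NULL);

    n = 0;

    for (i = 0; i < DEMO_BUCKET_SLOTS; i++) {
        if (b->slot[i] != NULL) {
            n++;
        }
    }

    return n;
}
```

Inserting NULL reports success, yet the slot still looks free afterwards, so the caller's idea of how many entries exist no longer matches the bucket. Initializing the value to anything non-NULL avoids this.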
Valentin Bartenev
8830d73261 HTTP parser: reworked header fields handling. 2017-12-25 17:04:22 +03:00
Valentin Bartenev
67d72d46f7 HTTP parser: improved detection of corrupted request line. 2017-12-08 19:18:00 +03:00
Valentin Bartenev
20d720dfc5 HTTP parser: slightly improved readability of code.
As suggested by Igor Sysoev.
2017-12-08 19:18:00 +03:00
Max Romanov
f3107f3896 Complex target parser copied from NGINX.
nxt_app_request_header_t fields renamed:
- 'path' renamed to 'target'.
- 'path_no_query' renamed to 'path' and contains parsed value.
2017-07-05 13:31:45 +03:00
Valentin Bartenev
dfd3cc8c0e Applied nxt_pointer_to() and nxt_value_at() where possible. 2017-06-27 17:27:18 +03:00
Valentin Bartenev
accb489492 HTTP parser: reduced memory consumption of header fields list. 2017-06-20 22:32:13 +03:00
Igor Sysoev
f888a5310c Using new memory pool implementation. 2017-06-20 19:49:17 +03:00
Valentin Bartenev
db6642f374 HTTP parser: decoupled header fields processing. 2017-06-13 20:11:29 +03:00
Valentin Bartenev
f6e7c2b6a6 HTTP parser: fixed handling header fields with missing colon. 2017-06-09 21:49:51 +03:00
Valentin Bartenev
dee819daab HTTP parser: changed style of a comment.
As requested by Igor.
2017-05-31 14:35:33 +03:00
Valentin Bartenev
ed38d86abb Added missing "fall through" comments to make GCC 7 happy. 2017-05-10 19:19:14 +03:00
Valentin Bartenev
558d1f8687 HTTP parser: fixed minimum length optimization in headers hash. 2017-04-25 16:57:14 +03:00
Valentin Bartenev
5745e48264 More optimizations of HTTP parser.
SSE 4.2 code removed, since loop unrolling gives better results.
2017-03-08 00:38:52 +03:00
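
As a generic illustration of loop unrolling in a byte scanner (not the parser's actual code), the hypothetical `demo_find_space` below checks four bytes per iteration and falls back to a byte-by-byte tail.

```c
#include <assert.h>
#include <stddef.h>

/* Returns a pointer to the first ' ' in [p, end), or "end" if none. */
static const unsigned char *
demo_find_space(const unsigned char *p, const unsigned char *end)
{
    assert(p != NULL && end != NULL && p <= end);

    /* Unrolled part: four comparisons per loop iteration. */
    while (end - p >= 4) {
        if (p[0] == ' ') { return p; }
        if (p[1] == ' ') { return p + 1; }
        if (p[2] == ' ') { return p + 2; }
        if (p[3] == ' ') { return p + 3; }

        p += 4;
    }

    /* Tail: fewer than four bytes left. */
    while (p < end) {
        if (*p == ' ') {
            return p;
        }

        p++;
    }

    return end;
}
```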
Valentin Bartenev
4df646a258 HTTP parser. 2017-03-01 15:29:18 +03:00
Valentin Bartenev
fde4d18e3a Removed legacy HTTP parser. 2017-03-01 15:17:55 +03:00
Igor Sysoev
16cbf3c076 Initial version. 2017-01-17 20:00:00 +03:00