summary refs log tree commit
diff options
context:
space:
mode:
authorJeremy Evans <code@jeremyevans.net>2022-05-10 09:21:16 -0700
committerJeremy Evans <code@jeremyevans.net>2022-05-25 08:33:11 -0700
commitcf29914d96ac81fe938d658c826582dc010d8e0a (patch)
tree981e22c99d624554496dd7c57a358d69d727d6f0
parent00229670c88053daaa2ca7e43a74d32db5c47abf (diff)
downloadrack-cf29914d96ac81fe938d658c826582dc010d8e0a.tar.gz
Remove dead code in multipart parser
The conditional with the dead code is:

```ruby
raise EOFError, "bad content body" if @sbuf.rest_size >= @bufsize
```

The dead code would be executed when consume_boundary fails in
handle_fast_forward, if the remaining data in the string scanner
is larger than the buffer size.

This branch appears to be impossible to hit.  However, we want to
be very sure of that before removing it.

As some background, FAST_FORWARD is the initial parser state, and
is never reentered.  The processing of the initial state in the
state machine is basically:

```ruby
read_data
read_data until handle_fast_forward == :MIME_HEAD
```

handle_fast_forward calls consume_boundary, which does
scan_until(BOUNDARY_REGEX) on the read data. BOUNDARY_REGEX is
`/\A([^\n]*(?:\n|\Z))/`, which matches all strings.  It matches
up to the first newline if there is a newline, or the entire
string if there is not a newline.

After the scan, it strips the read data, and then checks if it
matches the boundary.  If it matches the boundary, then there
is a state transition and the conditional is not reached.

If it doesn't match the boundary, if there is data left in the
string scanner, it loops and rescans using BOUNDARY_REGEX, which
again, matches all strings.  The loop in consume_boundary basically
makes sure that either the boundary is matched, or there is
no data left in the string scanner.

If there is no data left in the string scanner, then the loop
exits.  In that case, the branch is reached, but `@sbuf.rest_size`
is always 0.

To confirm that `@sbuf.rest_size` is always 0 before the conditional,
I added the following code before the conditional:

```ruby
abort("non-0 after consume_boundary") unless @sbuf.rest_size == 0
```

All tests still passed with that code. Note that the only test
where the conditional was even reached was the insanely long
boundary test.

If you try to force hitting the dead code by setting the buffer
size to 0, you get an EmptyContentError raised in read_data before
starting the state machine.

Looking at the history, this code initially comes from
f95113402b7239f225282806673e1b6424522b18, where it was possible to
hit the code because the equivalent of BOUNDARY_REGEX didn't use
\Z (the regex used was was `/\A([^\n]*\n)/`).  The \Z wasn't added
until 360d374ab1a263fcf81a18d6e5ac74cccccbc7e9, and the dead branch
has probably not been possible to hit since.
-rw-r--r--lib/rack/multipart/parser.rb1
1 files changed, 0 insertions, 1 deletions
diff --git a/lib/rack/multipart/parser.rb b/lib/rack/multipart/parser.rb
index 8c5f0a10..dd807d1c 100644
--- a/lib/rack/multipart/parser.rb
+++ b/lib/rack/multipart/parser.rb
@@ -254,7 +254,6 @@ module Rack
         if consume_boundary
           @state = :MIME_HEAD
         else
-          raise EOFError, "bad content body" if @sbuf.rest_size >= @bufsize
           :want_read
         end
       end