Forms without an explicit enctype are submitted as
application/x-www-form-urlencoded. This is the default behaviour in
Rails. However, this enctype does not allow transmission of binary
data (files).
Would it not make sense to specify the enctype multipart/form-data by
default instead? That is, all the form_for helpers would add this
enctype to the form tag unless overridden by the developer.
This way, file uploads "just work" and normal key/value pairs continue
to work as well. There is no downside. I browsed through Lighthouse
and only found obscure NN4 bugs related to multipart/form-data, which
I believe we can safely ignore. Even IE4 supports this enctype.
I can fork and make this change, but would like to float this idea
beforehand.
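For illustration, here is a minimal sketch of the proposed behaviour in
plain Ruby. It is not the Rails implementation: form_open_tag and
DEFAULT_ENCTYPE are hypothetical names made up for this example. (In
Rails today you opt in per form with :html => { :multipart => true }.)

```ruby
# Hypothetical stand-in for a form helper; not the actual Rails code.
DEFAULT_ENCTYPE = "multipart/form-data"

def form_open_tag(action, html_options = {})
  # The proposed default applies only when the developer gave no enctype;
  # an explicit "enctype" in html_options overrides it.
  attrs = {
    "action"  => action,
    "method"  => "post",
    "enctype" => DEFAULT_ENCTYPE
  }.merge(html_options)

  # Attribute values are not HTML-escaped here, to keep the sketch short.
  rendered = attrs.map { |name, value| %(#{name}="#{value}") }.join(" ")
  "<form #{rendered}>"
end

# Default: multipart, so file fields would "just work".
puts form_open_tag("/photos")

# Explicit override by the developer.
puts form_open_tag("/search", "enctype" => "application/x-www-form-urlencoded")
```

The only design decision is the direction of the merge: developer
options win over the default.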
I like that idea. I don't see any downsides to it. Are there
performance differences when submitting a multipart form?
Yes, there are: multipart forms send more data, since every field has its own MIME header. From RFC 2388:
"The multipart/form-data encoding has a high overhead and performance impact if there are many fields with short values. However, in practice, for the forms in use, for example, in HTML, the average overhead is not significant."
If this overhead was deemed not significant in August 1998, then in my opinion it is even less significant now, and this is a change we could make without any problems.
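To get a feel for the size overhead, here is a rough sketch that encodes the same short key/value pairs both ways and compares body sizes. The field names, values and boundary string are made up for the example, and the multipart body is built by hand with only the headers a plain text field needs; real browsers pick their own boundaries, so treat the numbers as indicative only.

```ruby
require "uri"

# Build a multipart/form-data body by hand for plain text fields.
# Each field gets its own boundary line and Content-Disposition header.
def multipart_body(fields, boundary)
  parts = fields.map do |name, value|
    "--#{boundary}\r\n" \
    "Content-Disposition: form-data; name=\"#{name}\"\r\n" \
    "\r\n" \
    "#{value}\r\n"
  end
  parts.join + "--#{boundary}--\r\n"
end

# 1000 fields with short values: the worst case the RFC describes.
fields   = (1..1000).map { |i| ["field#{i}", "v#{i}"] }
boundary = "----example-boundary" # arbitrary; must not occur in any value

urlencoded = URI.encode_www_form(fields)
multipart  = multipart_body(fields, boundary)

extra_per_field = (multipart.bytesize - urlencoded.bytesize) / fields.size.to_f

puts "urlencoded: #{urlencoded.bytesize} bytes"
puts "multipart:  #{multipart.bytesize} bytes"
puts "extra bytes per field: #{extra_per_field}"
```

The per-field overhead is roughly constant (boundary plus header), so it matters most when values are short and becomes negligible once a form carries an actual file.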
I don't think multipart parsing is very expensive nowadays, but I have
some doubts about the efficiency of the Rails/Rack multipart parser.
It performs a lot of string operations and if it creates a tempfile
even for small files then it can totally kill parsing performance.
Here are some preliminary benchmarks, done with a fresh Rails 2.3.8
app, Ruby 1.8.7, and a single render :nothing => true.
It does seem multipart forms are a bit slower overall, although they
were faster under WEBrick for a high number of parameters. As far as I
could tell, the numbers scaled linearly.
In concrete terms, for Passenger, when posting 1000 simple parameters
the speed penalty is 0.764 ms per request (yes, milliseconds):
(71.217 s - 70.453 s) / 1000 posts.
Will repeat with a Rails 3 app too and post those results.
Benchmark script: http://gist.github.com/442546
WEBrick:
= 1 key/value pair, 1000 posts
                 user     system      total        real
default:     1.320000   0.150000   1.470000 ( 15.797317)
multipart:   1.400000   0.130000   1.530000 ( 16.273821)
= 1000 key/value pairs, 1000 posts
                 user     system      total        real
default:    26.170000   0.570000  26.740000 ( 74.681545)
multipart:  26.130000   0.540000  26.670000 ( 74.073439)
Passenger:
= 1 key/value pair, 1000 posts
                 user     system      total        real
default:     1.250000   0.160000   1.410000 ( 25.887364)
multipart:   1.300000   0.150000   1.450000 ( 26.857967)
= 1000 key/value pairs, 1000 posts
                 user     system      total        real
default:    26.060000   0.510000  26.570000 ( 70.453057)
multipart:  26.060000   0.480000  26.540000 ( 71.216598)
Ran the benchmarks on Rails 3, which would actually suggest faster
multipart parsing for a small number of parameters. Otherwise the
numbers are similar. Hongli, thoughts?
= 1 key/value pair, 1000 posts
                 user     system      total        real
default:     1.310000   0.240000   1.550000 ( 25.266783)
multipart:   1.360000   0.230000   1.590000 ( 23.380536)
= 1000 key/value pairs, 1000 posts
                 user     system      total        real
default:    26.120000   0.730000  26.850000 ( 73.697901)
multipart:  26.330000   0.720000  27.050000 ( 74.567602)
The web server has got nothing to do with it; it only forwards the
multipart data that the HTTP client sent. Don't benchmark the web
server, benchmark the multipart parser instead. Any difference that
you see in the numbers is likely due to web server-specific things
that have got nothing to do with multipart parsing.
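Something along these lines would take the web server and HTTP out of
the picture: build both request bodies once, then time only the
parsing, in-process. Note the big assumption: parse_multipart_naive is
a deliberately simple stand-in written for this sketch, NOT the
Rails/Rack multipart parser, and it only handles plain text fields. To
measure the real thing you would swap the actual parser into the
second report block.

```ruby
require "benchmark"
require "uri"

BOUNDARY = "----example-boundary" # arbitrary; assumed absent from all values

# Naive stand-in parser for text-only multipart bodies (NOT Rack's parser).
def parse_multipart_naive(body, boundary)
  params = {}
  body.split("--#{boundary}").each do |part|
    headers, value = part.split("\r\n\r\n", 2)
    name = headers && headers[/name="([^"]*)"/, 1]
    next unless name && value # skips the preamble and the closing "--"
    params[name] = value.chomp("\r\n")
  end
  params
end

# Same payload in both encodings, built once outside the timed section.
fields     = (1..1000).map { |i| ["field#{i}", "v#{i}"] }
urlencoded = URI.encode_www_form(fields)
multipart  = fields.map { |name, value|
  "--#{BOUNDARY}\r\n" \
  "Content-Disposition: form-data; name=\"#{name}\"\r\n\r\n" \
  "#{value}\r\n"
}.join + "--#{BOUNDARY}--\r\n"

runs = 100
Benchmark.bm(11) do |x|
  x.report("default:")   { runs.times { URI.decode_www_form(urlencoded) } }
  x.report("multipart:") { runs.times { parse_multipart_naive(multipart, BOUNDARY) } }
end
```

Since no sockets or server processes are involved, any difference in
the output comes from the two parsing code paths alone.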
What I'd really like to see is a generic multipart parser written in
C/C++ that can be used in Ruby. mod_porter does the parsing for you in
the web server, but unfortunately its parser depends on all kinds of
Apache stuff and can't be easily split off.