unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
* Re: Please move to github
@ 2014-08-02  7:51 Gary Grossman
  2014-08-02  7:54 ` Kapil Israni
  2014-08-02  8:50 ` Eric Wong
  0 siblings, 2 replies; 12+ messages in thread
From: Gary Grossman @ 2014-08-02  7:51 UTC (permalink / raw)
  To: e; +Cc: unicorn-public, michael

Hi Eric,

I work with Michael, and this discussion sure got off on the
wrong foot... we love unicorn and use it heavily, and just
want to contribute back to it.

To detail the encoding problem we were trying to fix, unicorn
uses rb_str_new in several places to create Ruby strings.
For Ruby 1.9 and later, these strings are assigned ASCII-8BIT
encoding.

While the Rack specification doesn't dictate what encoding
should be used for strings in the environment, many
developers would probably expect the default external encoding
setting in Encoding.default_external to be used.

Many Rails applications use UTF8 heavily. The use of ASCII-8BIT
in the env can lead to Encoding::CompatibilityErrors being
raised when a UTF8 string and ASCII-8BIT string are concatenated,
which happens frequently when properties like request.url are
referenced in erb templates. To get around these problems,
an app would have to force encoding on the strings in the env
manually. It seems a shame to do this in slower Ruby code when
it could be done up front by unicorn.
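
As a minimal sketch of the failure described above (the string values are
made up for illustration; only core String and Encoding methods are used):
the error appears only when both strings contain non-ASCII bytes, which is
why it tends to surface late and intermittently.

```ruby
# Returns the encoding of a + b, or :incompatible if Ruby refuses to
# concatenate the two strings.
def concat_encoding(a, b)
  (a + b).encoding
rescue Encoding::CompatibilityError
  :incompatible
end

template_text = "caf\u00e9 at "   # UTF-8, contains a non-ASCII character
ascii_path    = "/index".b        # ASCII-8BIT, but only 7-bit bytes
binary_path   = "/caf\xC3\xA9".b  # ASCII-8BIT with high bytes

# 7-bit binary data is compatible with UTF-8, so this works.
ASCII_CASE = concat_encoding(template_text, ascii_path)
# High bytes on both sides: Ruby raises Encoding::CompatibilityError.
BINARY_CASE = concat_encoding(template_text, binary_path)
# The manual workaround: retag the bytes as UTF-8 before concatenating.
FORCED_CASE = concat_encoding(
  template_text, binary_path.dup.force_encoding(Encoding::UTF_8)
)
```

Here ASCII_CASE and FORCED_CASE come out as UTF-8, while BINARY_CASE is
:incompatible.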

We'd like to propose that unicorn use rb_external_str_new to
make strings instead of rb_str_new.

Perhaps you have your reasons for continuing to use rb_str_new
but we figured we'd run this by you.

Here's a proposed patch.

Gary

From befb01530c8d930ba53cc58b979ddf42a4c12565 Mon Sep 17 00:00:00 2001
From: Gary Grossman <gary.grossman@gmail.com>
Date: Sat, 2 Aug 2014 00:19:30 -0700
Subject: [PATCH] If unicorn is used with Ruby 1.9 or later, use
 rb_external_str_new instead of rb_str_new to create strings. The resulting
 strings will use the default external encoding. Continue using rb_str_new for
 older versions of Ruby.

Using the default external encoding instead of ASCII-8BIT for
strings is more in line with developer expectations and will cause
fewer unexpected bugs such as Encoding::CompatibilityErrors which
result when, say, a UTF8 string and ASCII-8BIT string are
concatenated together.

Added a unit test to ensure that strings returned in the Rack
environment conform to the default external encoding.
---
 ext/unicorn_http/ext_help.h |  6 ++++++
 test/unit/test_request.rb   | 13 +++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/ext/unicorn_http/ext_help.h b/ext/unicorn_http/ext_help.h
index c87c272..6806f8e 100644
--- a/ext/unicorn_http/ext_help.h
+++ b/ext/unicorn_http/ext_help.h
@@ -79,4 +79,10 @@ static int str_cstr_case_eq(VALUE val, const char *ptr, long len)
 #define STR_CSTR_CASE_EQ(val, const_str) \
   str_cstr_case_eq(val, const_str, sizeof(const_str) - 1)
 
+#ifdef HAVE_RUBY_ENCODING_H
+/* Use default external encoding for strings for Ruby 1.9+,
+ * fall back to rb_str_new when unavailable */
+#define rb_str_new rb_external_str_new
+#endif
+
 #endif /* ext_help_h */
diff --git a/test/unit/test_request.rb b/test/unit/test_request.rb
index fbda1a2..0a105e0 100644
--- a/test/unit/test_request.rb
+++ b/test/unit/test_request.rb
@@ -179,4 +179,17 @@ class RequestTest < Test::Unit::TestCase
     env['rack.input'].rewind
     res = @lint.call(env)
   end
+
+  def test_encoding
+    if ''.respond_to?(:encoding)
+      client = MockRequest.new("GET http://e:3/x?y=z HTTP/1.1\r\n" \
+                               "Host: foo\r\n\r\n")
+      env = @request.read(client)
+      encoding = Encoding.default_external
+      assert_equal encoding, env['REQUEST_PATH'].encoding
+      assert_equal encoding, env['PATH_INFO'].encoding
+      assert_equal encoding, env['QUERY_STRING'].encoding
+    end
+  end
+
 end
-- 
1.9.1



* Re: Please move to github
  2014-08-02  7:51 Please move to github Gary Grossman
@ 2014-08-02  7:54 ` Kapil Israni
  2014-08-02  8:02   ` Eric Wong
  2014-08-02  8:50 ` Eric Wong
  1 sibling, 1 reply; 12+ messages in thread
From: Kapil Israni @ 2014-08-02  7:54 UTC (permalink / raw)
  Cc: unicorn-public

How do I unsubscribe from this email list?




-- 
Kapil



* Re: Please move to github
  2014-08-02  7:54 ` Kapil Israni
@ 2014-08-02  8:02   ` Eric Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2014-08-02  8:02 UTC (permalink / raw)
  To: Kapil Israni; +Cc: unicorn-public

Kapil Israni <kapil.israni@gmail.com> wrote:
> How do I unsubscribe from this email list?

Send an email to: unicorn-public+unsubscribe@bogomips.org
(it should've been mentioned in the welcome message, and is in
 every header).


* Re: Please move to github
  2014-08-02  7:51 Please move to github Gary Grossman
  2014-08-02  7:54 ` Kapil Israni
@ 2014-08-02  8:50 ` Eric Wong
  2014-08-02 19:07   ` Gary Grossman
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Wong @ 2014-08-02  8:50 UTC (permalink / raw)
  To: Gary Grossman; +Cc: unicorn-public, michael

Gary Grossman <gary.grossman@gmail.com> wrote:
> Hi Eric,
> 
> I work with Michael, and this discussion sure got off on the
> wrong foot... we love unicorn and use it heavily, and just
> want to contribute back to it.

No worries, cultural differences happen.  Thanks for following up.

> To detail the encoding problem we were trying to fix, unicorn
> uses rb_str_new in several places to create Ruby strings.
> For Ruby 1.9 and later, these strings are assigned ASCII-8BIT
> encoding.
> 
> While the Rack specification doesn't dictate what encoding
> should be used for strings in the environment, many
> developers would probably expect the default external encoding
> setting in Encoding.default_external to be used.

Right, the Rack spec does not dictate this.  Doing this out-of-spec has
the ability to break existing apps as well as compatibility with other
app servers.

What do other app servers do?

My main concern is having more different behavior between various Rack
servers, making it harder to switch between them.

Another concern is breaking apps which are already working around this
but work with non-UTF-8 encodings.

The rack-devel mailing list had a discussion on this in September 2010
and a decision was never reached. You can search the archives at:
http://groups.google.com/group/rack-devel

I've also saved the thread to a mbox at
http://80x24.org/rack-devel-encoding-2010.mbox.gz
since Google Groups archives are a bit painful to navigate.

Disclaimer: I am not an encoding expert, so for that reason I prefer
to let other Rack folks make the decision.

> Many Rails applications use UTF8 heavily. The use of ASCII-8BIT
> in the env can lead to Encoding::CompatibilityErrors being
> raised when a UTF8 string and ASCII-8BIT string are concatenated,
> which happens frequently when properties like request.url are
> referenced in erb templates. To get around these problems,
> an app would have to force encoding on the strings in the env
> manually. It seems a shame to do this in slower Ruby code when
> it could be done up front by unicorn.

Yes, this existing behavior sucks on UTF-8-heavy apps.  I would rather
not add more unicorn-only options which make switching between servers
harder.

Do you have performance measurements for doing this as pure-Ruby
middleware vs your patch?

My dislike of lock-in also applies to app servers.  Application-visible
differences like these should be avoided so people can switch between
servers, too.

So it would be best if there were a way to do this for all Rack
servers.

> We'd like to propose that unicorn use rb_external_str_new to
> make strings instead of rb_str_new.
> 
> Perhaps you have your reasons for continuing to use rb_str_new
> but we figured we'd run this by you.

If the Rack spec mandated encodings, I would do it in a heartbeat.

> Subject: [PATCH] If unicorn is used with Ruby 1.9 or later, use
>  rb_external_str_new instead of rb_str_new to create strings. The resulting
>  strings will use the default external encoding. Continue using rb_str_new for
>  older versions of Ruby.

A better, shorter, more direct subject would be:

Subject: use Encoding.default_external for header values

Commit message body is fine <snip>

> +#ifdef HAVE_RUBY_ENCODING_H
> +/* Use default external encoding for strings for Ruby 1.9+,
> + * fall back to rb_str_new when unavailable */
> +#define rb_str_new rb_external_str_new
> +#endif

This is too heavy-handed, as some strings (buffers) may
need to stay binary via rb_str_new.  If we were to do this, it would
be something like:

#ifdef HAVE_RUBY_ENCODING_H
#  define env_val_new(ptr,len) rb_external_str_new((ptr),(len))
#else
#  define env_val_new(ptr,len) rb_str_new((ptr),(len))
#endif

... and only make sure header values are set to the external encoding.

Last I checked the HTTP RFCs (it's been a while) header keys are
required to be US-ASCII-only (and our parser enforces that).

> +  def test_encoding
> +    if ''.respond_to?(:encoding)
> +      client = MockRequest.new("GET http://e:3/x?y=z HTTP/1.1\r\n" \
> +                               "Host: foo\r\n\r\n")
> +      env = @request.read(client)
> +      encoding = Encoding.default_external
> +      assert_equal encoding, env['REQUEST_PATH'].encoding
> +      assert_equal encoding, env['PATH_INFO'].encoding
> +      assert_equal encoding, env['QUERY_STRING'].encoding
> +    end

This would need to test and work with (and appropriately reject)
invalid requests with bad encodings, too.
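
A rough sketch of the kind of check meant here, written in Ruby rather than
in the C parser. The helper name env_value_valid? is hypothetical and not
part of unicorn; it relies only on String#force_encoding and
String#valid_encoding?:

```ruby
# Hypothetical helper: would these raw request bytes be valid if they
# were tagged with the given encoding?
def env_value_valid?(raw_bytes, encoding = Encoding.default_external)
  # dup so the caller's string keeps its original (binary) tag
  raw_bytes.dup.force_encoding(encoding).valid_encoding?
end

GOOD_QUERY = "y=\xC3\xA9".b  # "y=" followed by a valid UTF-8 sequence
BAD_QUERY  = "y=\xE9".b      # lone Latin-1 byte, not valid UTF-8
```

With UTF-8 as the target, GOOD_QUERY passes and BAD_QUERY fails; what the
server should do with a failing request (reject it, or leave the value
binary) is exactly the open question.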


* Re: Please move to github
  2014-08-02  8:50 ` Eric Wong
@ 2014-08-02 19:07   ` Gary Grossman
  2014-08-02 19:33     ` Michael Fischer
  2014-08-02 20:15     ` Please move to github Eric Wong
  0 siblings, 2 replies; 12+ messages in thread
From: Gary Grossman @ 2014-08-02 19:07 UTC (permalink / raw)
  To: Eric Wong; +Cc: unicorn-public, michael

Hi Eric,

Thanks for your reply and for reviewing the patch!

>Right, the Rack spec does not dictate this.  Doing this out-of-spec has
>the ability to break existing apps as well as compatibility with other
>app servers.

It's true, my patch is too naive since it's a pretty drastic change
in behavior not behind any kind of switch.

>What do other app servers do?

I did a little survey. ASCII-8BIT is kind of the de facto standard
even if it's not mandated by the Rack specification. Phusion
Passenger, Thin and WEBrick all send mostly ASCII-8BIT strings in
the env.

>My main concern is having more different behavior between various Rack
>servers, making it harder to switch between them.

Very valid; Rack wouldn't be much of a standard if there were a bunch
of variants in use.

>Another concern is breaking apps which are already working around this
>but work with non-UTF-8 encodings.

We'd pretty much need to introduce some kind of configuration
switch, at least for the short term and maybe for the long term.
The hope would be that it could become the default setting.
Apps that don't use UTF8 should be able to set their desired default
external encoding appropriately.

>The rack-devel mailing list had a discussion on this in September 2010
>and a decision was never reached. You can search the archives at:
>http://groups.google.com/group/rack-devel

I came across this thread but didn't realize that was the last word
so far when it came to Rack and encodings.

This might be one of those instances where it would be helpful for
implementation to lead specification. Unicorn is one of the leading
servers of its genre, if not the leader. If you supported a switch
that made the encoding regime more sane, I think other popular servers
like Thin and Passenger would swiftly follow and it might re-energize
the discussion about getting encodings into the Rack spec once and
for all, and give a base for experimentation and iteration for
getting the encodings in the spec right.

There's a lot of developer pain here. Many apps probably are serving
up encoding-related 500 errors without knowing it. There are
stories of developers adding "# encoding" everywhere, setting
the external/internal encoding, and then "things are fine until it
blows up somewhere else." I heard recently that a very large company
has stuck with Ruby 1.8.7, probably to avoid these encoding issues
among other things. It would be nice to improve the situation.

>Disclaimer: I am not an encoding expert, so for that reason I prefer
>to let other Rack folks make the decision.

I'm not an encoding expert either! Most people aren't... which is
why it'd be nice if they didn't have to know so much about it when
they write a Rack app!

>Do you have performance measurements for doing this as pure-Ruby
>middleware vs your patch?

I don't have measurements currently but I'll get some.
Our app is several years old and so there's a lot of stuff in
request.env by the time we get around to forcing everything to
UTF8 encoding. I wouldn't be surprised if the hit on
every single request is small but significant for us.
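
One rough way to get a first number, assuming a made-up env of 200
binary-tagged values (the size, key names and iteration count are all
assumptions, not measurements from our app). It times building the env with
and without the pure-Ruby retagging and reports the difference:

```ruby
# Hypothetical env: binary-tagged header values, as a server would hand
# them to the app. The size is an assumption.
def sample_env(size = 200)
  env = {}
  size.times { |i| env["HTTP_X_SAMPLE_#{i}"] = "value-#{i}".b }
  env
end

# Pure-Ruby retagging, as app-level code or middleware would do it.
def force_utf8!(env)
  env.each_value do |v|
    v.force_encoding(Encoding::UTF_8) if v.is_a?(String) && !v.frozen?
  end
  env
end

# Wall-clock seconds taken by the block, using a monotonic clock.
def seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

ITERATIONS = 2_000
baseline   = seconds { ITERATIONS.times { sample_env } }
with_retag = seconds { ITERATIONS.times { force_utf8!(sample_env) } }
per_request_us = (with_retag - baseline) / ITERATIONS * 1_000_000
puts format("approx. retagging cost per request: %.2f us", per_request_us)
```

This only measures the Ruby side; comparing it against a patched unicorn
would still need a real request benchmark.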

>So it would be best if there were a way to do this for all Rack
>servers.

Thanks again for reviewing the patch. I'll work on a new patch that
incorporates your comments and has a switch for enabling/disabling
the functionality, and I'll try to follow roughly what the spec
group in 2010 thought would make sense in terms of encodings for
the various strings in the env. And I'll see if I can ask the
Rack folks to chime in.

Gary



* Re: Please move to github
  2014-08-02 19:07   ` Gary Grossman
@ 2014-08-02 19:33     ` Michael Fischer
  2014-08-04  7:22       ` Hongli Lai
  2014-08-02 20:15     ` Please move to github Eric Wong
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Fischer @ 2014-08-02 19:33 UTC (permalink / raw)
  To: Gary Grossman; +Cc: Eric Wong, unicorn-public, Michael Grosser

On Sat, Aug 2, 2014 at 12:07 PM, Gary Grossman <gary.grossman@gmail.com>
wrote:

> This might be one of those instances where it would be helpful for
> implementation to lead specification. Unicorn is one of the leading
> servers of its genre, if not the leader. If you supported a switch
> that made the encoding regime more sane, I think other popular servers
> like Thin and Passenger would swiftly follow and it might re-energize
> the discussion about getting encodings into the Rack spec once and
> for all, and give a base for experimentation and iteration for
> getting the encodings in the spec right.
>

I agree with Gary here.  It's often too easy to decide to preserve the
status quo because things work well enough -- and then, eventually, time
catches up with you and it no longer does.

If Gary's proposal makes sense, and improves matters without doing
significant harm -- despite it not adhering to the letter of Rack
compliance as it is currently specified today -- it would represent a major
step forward if implemented in Unicorn.  (And as Gary suggested, the
specification and other implementors will probably catch up by necessity if
the behavior proves beneficial.)

The first step is to prove it's worth shaking the tree with some
benchmarks, though. :)

--Michael



* Re: Please move to github
  2014-08-02 19:07   ` Gary Grossman
  2014-08-02 19:33     ` Michael Fischer
@ 2014-08-02 20:15     ` Eric Wong
  1 sibling, 0 replies; 12+ messages in thread
From: Eric Wong @ 2014-08-02 20:15 UTC (permalink / raw)
  To: Gary Grossman; +Cc: unicorn-public, michael

Gary Grossman <gary.grossman@gmail.com> wrote:
> We'd pretty much need to introduce some kind of configuration
> switch, at least for the short term and maybe for the long term.
> The hope would be that it could become the default setting.
> Apps that don't use UTF8 should be able to set their desired default
> external encoding appropriately.

If possible, I would like to avoid an option and rely on
Encoding.default_external in a new major version.  Too many ways to set
the same thing is confusing and requires extra documentation overhead.

> >The rack-devel mailing list had a discussion on this in September 2010
> >and a decision was never reached. You can search the archives at:
> >http://groups.google.com/group/rack-devel
> 
> I came across this thread but didn't realize that was the last word
> so far when it came to Rack and encodings.
> 
> This might be one of those instances where it would be helpful for
> implementation to lead specification. Unicorn is one of the leading
> servers of its genre, if not the leader. If you supported a switch
> that made the encoding regime more sane, I think other popular servers
> like Thin and Passenger would swiftly follow and it might re-energize
> the discussion about getting encodings into the Rack spec once and
> for all, and give a base for experimentation and iteration for
> getting the encodings in the spec right.

I might start with WEBrick (or the Rack/WEBrick handler).  WEBrick is
distributed with Ruby and maintained by the core team.  It's not used in
production much, but it is the reference implementation which is usable
from all Ruby implementations.

naruse (from that rack-devel thread) is also active in Ruby core and
is very knowledgeable in these areas.

> Thanks again for reviewing the patch. I'll work on a new patch that
> incorporates your comments and has a switch for enabling/disabling
> the functionality, and I'll try to follow roughly what the spec
> group in 2010 thought would make sense in terms of encodings for
> the various strings in the env. And I'll see if I can ask the
> Rack folks to chime in.

Definitely get other Rack folks to chime in, even if it is a
unicorn-only change.  This has been a problem for years already,
so taking more time to get things right won't hurt.


* Re: Please move to github
  2014-08-02 19:33     ` Michael Fischer
@ 2014-08-04  7:22       ` Hongli Lai
  2014-08-04  8:48         ` Rack encodings (was: Please move to github) Eric Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Hongli Lai @ 2014-08-04  7:22 UTC (permalink / raw)
  To: Michael Fischer; +Cc: Gary Grossman, Eric Wong, unicorn-public, Michael Grosser

On Sat, Aug 2, 2014 at 9:33 PM, Michael Fischer <mfischer@zendesk.com> wrote:
> On Sat, Aug 2, 2014 at 12:07 PM, Gary Grossman <gary.grossman@gmail.com>
> wrote:
>> This might be one of those instances where it would be helpful for
>> implementation to lead specification. Unicorn is one of the leading
>> servers of its genre, if not the leader. If you supported a switch
>> that made the encoding regime more sane, I think other popular servers
>> like Thin and Passenger would swiftly follow and it might re-energize
>> the discussion about getting encodings into the Rack spec once and
>> for all, and give a base for experimentation and iteration for
>> getting the encodings in the spec right.
>>
>
> I agree with Gary here.  It's often too easy to decide to preserve the
> status quo because things work well enough -- and then, eventually, time
> catches up with you and it no longer does.

Hi guys. Phusion Passenger author here. I would very much support
standardization of encoding issues. Every now and then, a user submits
a bug report on Phusion Passenger, mentioning an encoding problem. The
user would say that the problem occurs on Phusion Passenger but not on
Unicorn/Thin/etc. The Rack spec doesn't say anything about encodings
so strictly speaking it's not "our fault", but it's still hard to tell
users that it's "their fault" or "their framework's fault" based on
this alone. It's also not a helpful answer: users often have no idea
what to do about the issue.

At this point, I don't really care what the standard is, as long as
it's a sane standard that everybody can follow.

In my opinion, following Encoding.default_external is not helpful.
Most users have absolutely no idea how to configure
Encoding.default_external, or even know that it exists. I've also
never, ever seen anybody who does *not* want default_external to be
UTF-8. If it's not set to UTF-8, then it's always by accident (e.g.
the user not knowing that it depends on LC_CTYPE, that LC_CTYPE is set
differently in the shell than from an init script, or even what
LC_CTYPE is).

-- 
Phusion | Web Application deployment, scaling, and monitoring solutions

Web: http://www.phusion.nl/
E-mail: info@phusion.nl
Chamber of commerce no: 08173483 (The Netherlands)


* Rack encodings (was: Please move to github)
  2014-08-04  7:22       ` Hongli Lai
@ 2014-08-04  8:48         ` Eric Wong
  2014-08-04  9:46           ` Hongli Lai
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Wong @ 2014-08-04  8:48 UTC (permalink / raw)
  To: Hongli Lai
  Cc: Michael Fischer, Gary Grossman, unicorn-public, Michael Grosser

(Long overdue Subject: change)

Hongli Lai <hongli@phusion.nl> wrote:
> Hi guys. Phusion Passenger author here. I would very much support
> standardization of encoding issues.

> At this point, I don't really care what the standard is, as long as
> it's a sane standard that everybody can follow.

Fair enough.  Would you/Phusion be comfortable taking the lead here?
This feels like another "hot potato" issue :>

> In my opinion, following Encoding.default_external is not helpful.
> Most users have absolutely no idea how to configure
> Encoding.default_external, or even know that it exists. I've also
> never, ever seen anybody who does *not* want default_external to be
> UTF-8. If it's not set to UTF-8, then it's always by accident (e.g.
> the user not knowing that it depends on LC_CTYPE, that LC_CTYPE is set
> differently in the shell than from an init script, or even what
> LC_CTYPE is).

Perhaps we need to educate users to set LC_CTYPE/LC_ALL/LANG so
Encoding.default_external works as intended?  Adding another
option to Rack is just as likely to get missed.

Maybe servers could emit a big warning saying:

    WARNING: Encoding.default_external is not UTF-8 ...

And add a --quiet-utf8-warning option for the few folks who really do
not want UTF-8.
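
A minimal sketch of such a startup check. The method name, the quiet
keyword and the exact message wording are hypothetical; no server
implements this as written:

```ruby
# Returns a warning string when the default external encoding is not
# UTF-8, or nil when there is nothing to report. `quiet` stands in for
# the proposed --quiet-utf8-warning option.
def utf8_warning(default_external = Encoding.default_external, quiet: false)
  return nil if quiet || default_external == Encoding::UTF_8

  "WARNING: Encoding.default_external is #{default_external.name}, " \
    "not UTF-8 (check LC_CTYPE/LC_ALL/LANG)"
end

message = utf8_warning
warn message if message
```

Under an init script with an unset locale, Encoding.default_external is
typically US-ASCII, which is the case this would catch.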


* Re: Rack encodings (was: Please move to github)
  2014-08-04  8:48         ` Rack encodings (was: Please move to github) Eric Wong
@ 2014-08-04  9:46           ` Hongli Lai
  0 siblings, 0 replies; 12+ messages in thread
From: Hongli Lai @ 2014-08-04  9:46 UTC (permalink / raw)
  To: Eric Wong; +Cc: Michael Fischer, Gary Grossman, unicorn-public, Michael Grosser

On Mon, Aug 4, 2014 at 10:48 AM, Eric Wong <e@80x24.org> wrote:
> Fair enough.  Would you/Phusion be comfortable taking the lead here?
> This feels like another "hot potato" issue :>

Unfortunately, we're too busy with a major project to be able to lead
this effort.

-- 
Phusion | Web Application deployment, scaling, and monitoring solutions

Web: http://www.phusion.nl/
E-mail: info@phusion.nl
Chamber of commerce no: 08173483 (The Netherlands)


* Re: Rack encodings (was: Please move to github)
@ 2014-08-05  5:56 Gary Grossman
  2014-08-05  6:28 ` Eric Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Gary Grossman @ 2014-08-05  5:56 UTC (permalink / raw)
  To: hongli; +Cc: unicorn-public, michael, e, mfischer, gary.grossman

It feels like we were getting some momentum on an important but
long-dormant issue here... maybe it's time to move this discussion
to rack-devel? Perhaps there's another Rack luminary who can lead
the charge, or at least see if there's some consensus after a few
more years of shared experience on what "sane" encodings might
look like.

A lightweight way to move the implementation forward might be a
simple Rack middleware gem which sets the new encodings on the 
environment, or adding the functionality to rack itself. Once
developers were comfortable with the new regime, the app servers
could follow suit and put those encodings in the env natively,
and the Rubyland implementation of the new encodings could be
dropped.
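
A sketch of what that middleware might look like. The class name, the list
of keys and the choice to leave invalid byte sequences binary are all
assumptions for discussion, not an agreed design:

```ruby
# Hypothetical Rack middleware: retags selected env strings with the
# target encoding before the app sees them.
class ForceEnvEncoding
  # Assumed key list; which env entries should be retagged is precisely
  # what a spec discussion would have to settle.
  KEYS = %w[REQUEST_PATH PATH_INFO QUERY_STRING].freeze

  def initialize(app, encoding = Encoding.default_external)
    @app = app
    @encoding = encoding
  end

  def call(env)
    KEYS.each do |key|
      value = env[key]
      next unless value.is_a?(String)

      retagged = value.dup.force_encoding(@encoding)
      # Leave the original binary string in place if the bytes are not
      # valid in the target encoding.
      env[key] = retagged if retagged.valid_encoding?
    end
    @app.call(env)
  end
end
```

In a config.ru this would be enabled with "use ForceEnvEncoding", and it
works the same way under any Rack server.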

Gary



* Re: Rack encodings (was: Please move to github)
  2014-08-05  5:56 Rack encodings (was: Please move to github) Gary Grossman
@ 2014-08-05  6:28 ` Eric Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2014-08-05  6:28 UTC (permalink / raw)
  To: Gary Grossman; +Cc: hongli, unicorn-public, michael, mfischer

Gary Grossman <gary.grossman@gmail.com> wrote:
> It feels like we were getting some momentum on an important but
> long-dormant issue here... maybe it's time to move this discussion
> to rack-devel?

Sure, rack-devel is a pretty dormant mailing list, but there was a
burst of activity a few weeks ago.

Unlike this list, subscription is required to post; and first posts
from newbies are moderated.  For folks who do not login to Google
(crazies like me :P) subscription is possible without any login
or password: rack-devel+subscribe@googlegroups.com

> Perhaps there's another Rack luminary who can lead
> the charge, or at least see if there's some consensus after a few
> more years of shared experience on what "sane" encodings might
> look like.

At least there are other server implementers who'll probably
chime in.

> A lightweight way to move the implementation forward might be a
> simple Rack middleware gem which sets the new encodings on the 
> environment, or adding the functionality to rack itself. Once
> developers were comfortable with the new regime, the app servers
> could follow suit and put those encodings in the env natively,
> and the Rubyland implementation of the new encodings could be
> dropped.

Sounds like a good plan.  Thanks for bringing more attention to this.


Code repositories for project(s) associated with this public inbox

	https://yhbt.net/unicorn.git/
