unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
* Re: Please move to github
@ 2014-08-01 22:32 Victor Kmita
  0 siblings, 0 replies; 22+ messages in thread
From: Victor Kmita @ 2014-08-01 22:32 UTC (permalink / raw)
  To: e; +Cc: unicorn-public, michael

"No.  Never.=94 =93It absolutely sickens me to encounter users who seem =
to be incapable of using git without a proprietary communications tool."
Dude. You sicken me.=


^ permalink raw reply	[flat|nested] 22+ messages in thread
* Re: Please move to github
@ 2014-08-02  7:51 Gary Grossman
  2014-08-02  7:54 ` Kapil Israni
  2014-08-02  8:50 ` Eric Wong
  0 siblings, 2 replies; 22+ messages in thread
From: Gary Grossman @ 2014-08-02  7:51 UTC (permalink / raw)
  To: e; +Cc: unicorn-public, michael

Hi Eric,

I work with Michael, and this discussion sure got off on the
wrong foot... we love unicorn and use it heavily, and just
want to contribute back to it.

To detail the encoding problem we were trying to fix, unicorn
uses rb_str_new in several places to create Ruby strings.
For Ruby 1.9 and later, these strings are assigned ASCII-8BIT
encoding.

While the Rack specification doesn't dictate what encoding
should be used for strings in the environment, many
developers would probably expect the default external encoding
setting in Encoding.default_external to be used.

Many Rails applications use UTF8 heavily. The use of ASCII-8BIT
in the env can lead to Encoding::CompatibilityErrors being
raised when a UTF8 string and ASCII-8BIT string are concatenated,
which happens frequently when properties like request.url are
referenced in erb templates. To get around these problems,
an app would have to force encoding on the strings in the env
manually. It seems a shame to do this in slower Ruby code when
it could be done up front by unicorn.

We'd like to propose that unicorn use rb_external_str_new to
make strings instead of rb_str_new.

Perhaps you have your reasons for continuing to use rb_str_new
but we figured we'd run this by you.

Here's a proposed patch.

Gary

From befb01530c8d930ba53cc58b979ddf42a4c12565 Mon Sep 17 00:00:00 2001
From: Gary Grossman <gary.grossman@gmail.com>
Date: Sat, 2 Aug 2014 00:19:30 -0700
Subject: [PATCH] If unicorn is used with Ruby 1.9 or later, use
 rb_external_str_new instead of rb_str_new to create strings. The resulting
 strings will use the default external encoding. Continue using rb_str_new for
 older versions of Ruby.

Using the default external encoding instead of ASCII-8BIT for
strings is more in line with developer expectations and will cause
less unexpected bugs such as Encoding::CompatibilityErrors which
result when, say, a UTF8 string and ASCII-8BIT string are
concatenated together.

Added a unit test to ensure that strings returned in the Rack
environment conform to the default external encoding.
---
 ext/unicorn_http/ext_help.h |  6 ++++++
 test/unit/test_request.rb   | 13 +++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/ext/unicorn_http/ext_help.h b/ext/unicorn_http/ext_help.h
index c87c272..6806f8e 100644
--- a/ext/unicorn_http/ext_help.h
+++ b/ext/unicorn_http/ext_help.h
@@ -79,4 +79,10 @@ static int str_cstr_case_eq(VALUE val, const char *ptr, long len)
 #define STR_CSTR_CASE_EQ(val, const_str) \
   str_cstr_case_eq(val, const_str, sizeof(const_str) - 1)
 
+#ifdef HAVE_RUBY_ENCODING_H
+/* Use default external encoding for strings for Ruby 1.9+,
+ * fall back to rb_str_new when unavailable */
+#define rb_str_new rb_external_str_new
+#endif
+
 #endif /* ext_help_h */
diff --git a/test/unit/test_request.rb b/test/unit/test_request.rb
index fbda1a2..0a105e0 100644
--- a/test/unit/test_request.rb
+++ b/test/unit/test_request.rb
@@ -179,4 +179,17 @@ class RequestTest < Test::Unit::TestCase
     env['rack.input'].rewind
     res = @lint.call(env)
   end
+
+  def test_encoding
+    if ''.respond_to?(:encoding)
+      client = MockRequest.new("GET http://e:3/x?y=z HTTP/1.1\r\n" \
+                               "Host: foo\r\n\r\n")
+      env = @request.read(client)
+      encoding = Encoding.default_external
+      assert_equal encoding, env['REQUEST_PATH'].encoding
+      assert_equal encoding, env['PATH_INFO'].encoding
+      assert_equal encoding, env['QUERY_STRING'].encoding
+    end
+  end
+
 end
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread
* Re: Please move to github
@ 2014-08-02  7:46 Gary Grossman
  0 siblings, 0 replies; 22+ messages in thread
From: Gary Grossman @ 2014-08-02  7:46 UTC (permalink / raw)
  To: e; +Cc: unicorn-public, michael

Hi Eric,

I work with Michael, and this discussion sure got off on the
wrong foot... we love unicorn and use it heavily, and just
want to contribute back to it.

To detail the encoding problem we were trying to fix, unicorn
uses rb_str_new in several places to create Ruby strings.
For Ruby 1.9 and later, these strings are assigned ASCII-8BIT
encoding.

While the Rack specification doesn't dictate what encoding
should be used for strings in the environment, many
developers would probably expect the default external encoding
setting in Encoding.default_external to be used.

Many Rails applications use UTF8 heavily. The use of ASCII-8BIT
in the env can lead to Encoding::CompatibilityErrors being
raised when a UTF8 string and ASCII-8BIT string are concatenated,
which happens frequently when properties like request.url are
referenced in erb templates. To get around these problems,
an app would have to force encoding on the strings in the env
manually. It seems a shame to do this in slower Ruby code when
it could be done up front by unicorn.

We'd like to propose that unicorn use rb_external_str_new to
make strings instead of rb_str_new.

Perhaps you have your reasons for continuing to use rb_str_new
but we figured we'd run this by you.

Here's a proposed patch.

From befb01530c8d930ba53cc58b979ddf42a4c12565 Mon Sep 17 00:00:00 2001
From: Gary Grossman <gary.grossman@gmail.com>
Date: Sat, 2 Aug 2014 00:19:30 -0700
Subject: [PATCH] If unicorn is used with Ruby 1.9 or later, use
 rb_external_str_new instead of rb_str_new to create strings. The resulting
 strings will use the default external encoding. Continue using rb_str_new
for
 older versions of Ruby.

Using the default external encoding instead of ASCII-8BIT for
strings is more in line with developer expectations and will cause
less unexpected bugs such as Encoding::CompatibilityErrors which
result when, say, a UTF8 string and ASCII-8BIT string are
concatenated together.

Added a unit test to ensure that strings returned in the Rack
environment conform to the default external encoding.
---
 ext/unicorn_http/ext_help.h |  6 ++++++
 test/unit/test_request.rb   | 13 +++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/ext/unicorn_http/ext_help.h b/ext/unicorn_http/ext_help.h
index c87c272..6806f8e 100644
--- a/ext/unicorn_http/ext_help.h
+++ b/ext/unicorn_http/ext_help.h
@@ -79,4 +79,10 @@ static int str_cstr_case_eq(VALUE val, const char *ptr,
long len)
 #define STR_CSTR_CASE_EQ(val, const_str) \
   str_cstr_case_eq(val, const_str, sizeof(const_str) - 1)

+#ifdef HAVE_RUBY_ENCODING_H
+/* Use default external encoding for strings for Ruby 1.9+,
+ * fall back to rb_str_new when unavailable */
+#define rb_str_new rb_external_str_new
+#endif
+
 #endif /* ext_help_h */
diff --git a/test/unit/test_request.rb b/test/unit/test_request.rb
index fbda1a2..0a105e0 100644
--- a/test/unit/test_request.rb
+++ b/test/unit/test_request.rb
@@ -179,4 +179,17 @@ class RequestTest < Test::Unit::TestCase
     env['rack.input'].rewind
     res = @lint.call(env)
   end
+
+  def test_encoding
+    if ''.respond_to?(:encoding)
+      client = MockRequest.new("GET http://e:3/x?y=z HTTP/1.1\r\n" \
+                               "Host: foo\r\n\r\n")
+      env = @request.read(client)
+      encoding = Encoding.default_external
+      assert_equal encoding, env['REQUEST_PATH'].encoding
+      assert_equal encoding, env['PATH_INFO'].encoding
+      assert_equal encoding, env['QUERY_STRING'].encoding
+    end
+  end
+
 end
-- 
1.9.1

Gary


^ permalink raw reply related	[flat|nested] 22+ messages in thread
* Please move to github
@ 2014-08-01 20:27 Michael Grosser
  2014-08-01 21:32 ` Eric Wong
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Grosser @ 2014-08-01 20:27 UTC (permalink / raw)
  To: unicorn-public

Current state makes it very hard to mange/search/fork/open-issues etc
especially for newcomers,
please move the project to github so we can have nice disussions
forks/prs etc goodness.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2014-08-04  7:23 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-01 22:32 Please move to github Victor Kmita
  -- strict thread matches above, loose matches on Subject: below --
2014-08-02  7:51 Gary Grossman
2014-08-02  7:54 ` Kapil Israni
2014-08-02  8:02   ` Eric Wong
2014-08-02  8:50 ` Eric Wong
2014-08-02 19:07   ` Gary Grossman
2014-08-02 19:33     ` Michael Fischer
2014-08-04  7:22       ` Hongli Lai
2014-08-02 20:15     ` Eric Wong
2014-08-02  7:46 Gary Grossman
2014-08-01 20:27 Michael Grosser
2014-08-01 21:32 ` Eric Wong
2014-08-01 22:27   ` Michael Grosser
2014-08-01 22:41     ` Eric Wong
     [not found]       ` <CAAms34NByMcXnbGQ3DvCZTsQuh6yBn8sMSNeTi7Ss4VmmYoOrQ@mail.gmail.com>
2014-08-01 23:07         ` Eric Wong
2014-08-01 22:48     ` Gabe da Silveira
2014-08-01 23:09       ` Xavier Noria
2014-08-01 23:12         ` Michael Grosser
2014-08-01 23:24           ` Eric Wong
2014-08-01 23:26           ` Daniel Evans
2014-08-01 23:38             ` Michael Grosser
2014-08-01 23:18         ` Aaron Suggs

Code repositories for project(s) associated with this public inbox

	https://yhbt.net/unicorn.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).