Eventually this will have more switches for testing
various bigfile options.
|
|
|
|
SIGPIPE is very handy for scripting, and I hit SIGINT
in the console pretty often, too; so don't spew to
STDERR when we catch these signals.
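
The gist of the change, as a minimal standalone sketch (not the actual mog source): catch the exceptions Ruby raises for these conditions and exit without printing a backtrace.

    #!/usr/bin/env ruby
    # Illustrative only: exit quietly on a closed pipe or Ctrl-C instead
    # of letting Ruby dump a backtrace to STDERR.
    begin
      STDIN.each_line { |line| STDOUT.write(line) }
    rescue Errno::EPIPE, Interrupt
      exit 0  # stay silent when the reader goes away or the user interrupts
    end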
|
|
This supports several UNIX-like subcommands:
cp FILE KEY - copy a file to a given key
cat KEY(s) - cat any number of keys to STDOUT
ls PREFIX - list keys matching PREFIX (not globbing)
rm KEY(s) - remove keys
mv FROMKEY TOKEY - rename a key
stat KEY(s) - show various information, including URLs and Size
tee KEY - read input from STDIN and write it to key
(due to the limitations of HTTP servers
and clients this is not streamed)
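
Roughly how those subcommands might map onto the mogilefs-client API; the domain, hosts, and keys below are placeholders, and exact method signatures can differ between client versions.

    require 'mogilefs'

    mg = MogileFS::MogileFS.new(:domain => 'example',
                                :hosts  => %w(127.0.0.1:7001))

    mg.store_file('some.key', nil, 'local_file')        # cp FILE KEY (nil => default class)
    print mg.get_file_data('some.key')                  # cat KEY
    (mg.list_keys('prefix') || []).each { |k| puts k }  # ls PREFIX
    mg.rename('some.key', 'other.key')                  # mv FROMKEY TOKEY
    mg.delete('other.key')                               # rm KEY
    mg.store_content('tee.key', nil, STDIN.read)         # tee KEY (slurped, not streamed)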
|
|
Yes, I'm quite miserly when it comes to memory usage. Since the
file is already on disk, just read it incrementally and stream
it out to avoid having to deal with potential memory exhaustion
issues on busy systems. There's also no benefit to slurping
256MB, and anything above 64K leads to diminishing returns on
most systems I've seen.
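
A minimal sketch of the incremental approach, assuming a 64K chunk size and an already-connected destination IO:

    CHUNK_SIZE = 64 * 1024  # assumed sweet spot from the message above

    # Stream a file from disk to dest_io in small chunks instead of
    # slurping the whole thing into memory first.
    def stream_file(path, dest_io)
      File.open(path, 'rb') do |src|
        while chunk = src.read(CHUNK_SIZE)
          dest_io.write(chunk)
        end
      end
    end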
|
|
Reading the rest of the body can take a long time for big files
(we expect that for big blocks) and the timeout is stupid in
that case.
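
One way to read that: only apply the timeout while waiting for the response to begin, then read the (possibly huge) remainder without a per-read deadline. The helper below is a hypothetical illustration, not the library's actual internals.

    # Hypothetical sketch: time out only the initial wait, not the body read.
    def read_big_body(sock, timeout)
      unless IO.select([sock], nil, nil, timeout)
        raise 'response never became readable'
      end
      body = ''
      while chunk = sock.read(64 * 1024)
        body << chunk  # no timeout here; large bodies legitimately take a while
      end
      body
    end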
|
|
This block will be passed the IO object for reading the file.
This is to prevent the client from having to slurp an entire
large file into memory all at once.
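
The general pattern, sketched with a hypothetical helper (open_read is not a mogilefs-client method): yield an IO to the caller so it can read in chunks rather than receive one giant String.

    # Illustrative pattern only: hand the block an IO instead of a slurped String.
    def open_read(path)
      File.open(path, 'rb') { |io| yield io }
    end

    open_read('big_file') do |io|
      while chunk = io.read(64 * 1024)
        $stdout.write(chunk)  # consume incrementally; never hold the whole file
      end
    end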
|
|
Enable it in the HTTPFile PUT code. We'll also use
it for get_file_data when handling large files next.
|
|
The underlying list_keys function can return nil,
so don't try to run .empty? on nil.
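
A minimal sketch of the guard, with `client` standing in for a MogileFS::MogileFS instance:

    # list_keys may return nil rather than an empty array, so check for
    # nil before calling Array methods on the result.
    def each_key(client, prefix)
      keys = client.list_keys(prefix)
      return if keys.nil?
      keys.each { |key| yield key }
    end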
|
|
IO.sysread is more GC-friendly than IO.read because it does not
have to allocate additional userspace buffers. Userspace read
buffering is redundant with modern kernels (especially Linux 2.6)
and is almost always a performance hit. Additionally, use an
underdocumented feature of both IO.sysread and IO.read that allows
them to reuse an existing buffer, further reducing garbage
collector overhead for large files.
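
The buffer-reuse idiom, sketched as a standalone copy loop (illustrative, not the library's code): pass the same String to sysread as the output buffer so every chunk refills it in place.

    # Copy src to dst reusing one buffer; sysread(len, buf) fills buf in
    # place instead of allocating a new String per chunk.
    def sysread_copy(src, dst, chunk_size = 64 * 1024)
      buf = ''
      loop do
        src.sysread(chunk_size, buf)
        dst.syswrite(buf)
      end
    rescue EOFError
      # sysread raises EOFError at end of input; the copy is complete
    end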
|
|
16M for a chunk is a huge amount of memory to slurp at once.
64K is much more reasonable, and chunk sizes above this lead
to diminishing returns in performance.
|
|
Ref: http://rubyforge.org/tracker/index.php?func=detail&aid=15987&group_id=1513&atid=5923
> Submitted By: Matthew Willson
> Summary: Errors on subsequent requests after client times out
> waiting for a response from tracker on a previous request (it
> leaves the socket open)
>
> Detailed description
>
> The summary says it all really.
>
> Once in a while, the tracker will time out responding to, say, a
> create_open or a create_close command, raising one of these:
>
> MogileFS::UnreadableSocketError: 146.101.142.132:7001 never became readable
> from /usr/lib/ruby/gems/1.8/gems/mogilefs-client-1.2.1/lib/mogilefs/backend.rb:158:in `readable?'
> from /usr/lib/ruby/gems/1.8/gems/mogilefs-client-1.2.1/lib/mogilefs/backend.rb:122:in `do_request'
> from /usr/lib/ruby/1.8/thread.rb:135:in `synchronize'
> from /usr/lib/ruby/gems/1.8/gems/mogilefs-client-1.2.1/lib/mogilefs/backend.rb:108:in `do_request'
> from /usr/lib/ruby/gems/1.8/gems/mogilefs-client-1.2.1/lib/mogilefs/backend.rb:16:in `create_open'
> from /usr/lib/ruby/gems/1.8/gems/mogilefs-client-1.2.1/lib/mogilefs/mogilefs.rb:108:in `new_file'
> from /usr/lib/ruby/gems/1.8/gems/mogilefs-client-1.2.1/lib/mogilefs/mogilefs.rb:163:in `store_content'
> from ./script/../config/../config/../app/models/mogile_backed.rb:59:in `store_in_mogile'
> from ./script/../config/../config/../app/models/image.rb:139:in `data'
> from ./script/../config/../config/../app/models/mogile_backed.rb:59:in `store_in_mogile'
> from (irb):15
> from (irb):15
>
> The problem is that, if your code catches this error and carries
> on using the same client object for a subsequent request, the
> 'OK' response which we timed out waiting for, will eventually
> arrive, and sit in the socket's read buffer. It will then be
> read and treated as the response to some unconnected subsequent
> command, resulting in a variety of seemingly intermittent and
> confusing errors.
>
> I've attached a patch for this against 1.2.1, which simply
> closes the socket whenever it times out waiting for a response.
> The next request will then open a new socket as required.
>
> Also included a quick fix/improvement to error reporting in one
> case, which helped me to track the problem down.
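
The shape of the fix, as a hedged sketch of a backend method (names and instance variables are illustrative, not the exact backend.rb code): when the wait times out, close and discard the socket so a late reply can never be mistaken for the answer to a later request.

    def wait_readable!(timeout)
      return if IO.select([@socket], nil, nil, timeout)
      @socket.close
      @socket = nil  # the next request opens a fresh connection
      raise MogileFS::UnreadableSocketError, 'never became readable'
    end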
|
|
Merging involved changing "Content-length" => "Content-Length"
(capitalizing the 'L') as per p4#3627 (aka SVN r433).
Ref: http://rubyforge.org/tracker/index.php?func=detail&aid=13764&group_id=1513&atid=5923
> Submitted By: Andy Lo-A-Foe (arloafoe)
> Category: mogilefs-client
> Summary:
> Store very large files (> 256M) without running out of memory in store_file
>
> Detailed description
>
> This is a patch to the MogileFS::store_file mechanism in order to
> support very large file stores using HTTPFile. We sometimes have to
> store files of up to 1GB in size. Using chunking is not really an option
> since it has proven to be very unreliable (mogtool) and there is no
> support for it in the current version of this client. This patch
> basically reads 16M chunks at a time and writes them to the tracker
> socket instead of trying to stuff the whole file into the StringIO and
> running out of memory. It's probably very rough and the get_file_data
> symmetry patch is not there yet. Feedback appreciated.
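
A simplified sketch of the chunked PUT idea (using the later 64K chunk size and the capitalized Content-Length header); the real HTTPFile code handles path selection, retries, and errors that are omitted here.

    require 'socket'

    # Stream a local file to an HTTP PUT without buffering it all in memory.
    def put_file(host, port, uri, path)
      sock = TCPSocket.new(host, port)
      sock.write("PUT #{uri} HTTP/1.0\r\n" \
                 "Content-Length: #{File.size(path)}\r\n\r\n")
      File.open(path, 'rb') do |f|
        while chunk = f.read(64 * 1024)
          sock.write(chunk)  # send each chunk as it is read
        end
      end
      sock.read  # status line and headers from the storage node
    ensure
      sock.close if sock
    end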
|
|
From p4 revision #3630
git-svn-id: http://seattlerb.rubyforge.org/svn/mogilefs-client/dev@434 d2e05cf2-00e0-46e5-a3de-bbee4d6b9404
|
|
Removed infinite loop in MogileFS::HTTPFile#store_file.
Made MogileFS#get_file_data timeout configurable.
Added MogileFS#size.
From p4 revision #3627
git-svn-id: http://seattlerb.rubyforge.org/svn/mogilefs-client/dev@433 d2e05cf2-00e0-46e5-a3de-bbee4d6b9404
|
|
Submitted by Matthew Willson.
From p4 revision #3337
git-svn-id: http://seattlerb.rubyforge.org/svn/mogilefs-client/dev@377 d2e05cf2-00e0-46e5-a3de-bbee4d6b9404
|
|
From p4 revision #3336
git-svn-id: http://seattlerb.rubyforge.org/svn/mogilefs-client/dev@376 d2e05cf2-00e0-46e5-a3de-bbee4d6b9404
|