Date | Commit message |
|
This reduces unnecessary boilerplate due to switching khash.
|
|
There are no apparent benefits at the moment, but theoretically
it can reduce the number of allocations we do and improve locality.
|
|
We do not need to be holding devstats_lock when releasing
a local buffer which will never be used by another thread.
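A minimal sketch of the idea, not the actual cmogstored code: the
lock is held only while the shared data is copied, and the copy is
freed without it because no other thread can see that buffer. All
example_* names and the sample stats string are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real cmogstored structures differ. */
static pthread_mutex_t example_devstats_lock = PTHREAD_MUTEX_INITIALIZER;
static const char example_devstats[] = "dev1 0.50\n";

/* Copy the shared stats into a buffer owned by the calling thread. */
static char *example_snapshot(size_t *len)
{
	char *copy;
	int rc = pthread_mutex_lock(&example_devstats_lock);

	assert(rc == 0);
	*len = sizeof(example_devstats) - 1;
	copy = malloc(*len + 1);
	if (copy)
		memcpy(copy, example_devstats, *len + 1);
	rc = pthread_mutex_unlock(&example_devstats_lock);
	assert(rc == 0);
	(void)rc; /* silence unused warnings when built with NDEBUG */
	return copy;
}

/* Returns the number of bytes that would have been broadcast. */
static size_t example_broadcast(void)
{
	size_t len = 0;
	char *copy = example_snapshot(&len);

	if (!copy)
		return 0;
	/* ... write copy to subscribers here ... */
	free(copy); /* local to this thread: no lock required */
	return len;
}
```

Keeping free() outside the critical section shortens the time other
threads may have to wait on the lock.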
|
|
ccan/list has branchless add/del operations and (IMHO) a
better API.
|
|
pthread_exit and abort never return, so quiet down some
warnings when using -Wunreachable-code on clang.
Unfortunately using -Wunreachable-code globally is too noisy due to
1) Ragel-generated code.
2) constant branch conditions for build-time options (trace/cork)
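To illustrate the kind of code this affects (a sketch assuming GCC or
clang, with hypothetical example_* names, not taken from cmogstored):
once a function is known never to return, any statement placed after
a call to it is unreachable and can simply be dropped.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper; the attribute syntax assumes GCC or clang. */
static void example_die(const char *msg) __attribute__((noreturn));

static void example_die(const char *msg)
{
	assert(msg != NULL);
	fprintf(stderr, "fatal: %s\n", msg);
	abort(); /* abort() never returns, so neither does example_die() */
}

static int example_parse_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	example_die("not a digit");
	/*
	 * No "return -1;" here: it could never execute, and clang's
	 * -Wunreachable-code would flag it.
	 */
}
```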
|
|
readdir on the same DIR pointer is undefined if DIR was inherited by
multiple children. Using the reentrant readdir_r would not have
helped, since the underlying file descriptor and kernel file handle
were still shared (and we need rewinddir, too).
This readdir usage bug existed in cmogstored since the earliest
releases, but was harmless until the cmogstored 1.3 series.
This misuse of readdir led to hitting a leftover call to free().
So this bug only manifested since
commit 1fab1e7a7f03f3bc0abb1b5181117f2d4605ce3b
(svc: implement top-level by_mog_devid hash)
Fortunately, these bugs only affect users of the undocumented
multi-process feature (not just multi-threaded).
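One way to avoid sharing a DIR across processes is to have each child
open its own stream after fork. The sketch below shows that pattern
only; it is not necessarily how cmogstored fixed the bug, and the
example_* names are hypothetical.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Count entries using a DIR stream private to the calling process. */
static long example_count_entries(const char *path)
{
	DIR *dir;
	long n = 0;

	assert(path != NULL);
	dir = opendir(path); /* opened here, never inherited */
	if (!dir)
		return -1;
	while (readdir(dir) != NULL)
		n++;
	closedir(dir);
	return n;
}

/* Fork a child that scans path with its own DIR; returns 0 on success. */
static int example_scan_in_child(const char *path)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) /* child: opendir happens after the fork */
		_exit(example_count_entries(path) >= 0 ? 0 : 1);
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Because opendir runs in the child, neither the file descriptor nor
the directory offset is shared with the parent or with siblings.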
|
|
It's unlikely we'll even come close to seeing 2-4 billion devices in a
MogileFS instance for a while. Meanwhile, it's also unlikely the
kernel will ever run that many threads, either. So make it easier
to pack and shrink data structures to save a few bytes and perhaps
get better memory alignment.
For reference, the POSIX semaphore API specifies initial values
with unsigned (int) values, too.
This leads to a minor size reduction (and we're not even packing):
$ ~/linux/scripts/bloat-o-meter cmogstored.before cmogstored
add/remove: 0/0 grow/shrink: 0/13 up/down: 0/-86 (-86)
function                         old     new   delta
mog_svc_dev_quit_prepare          13      12      -1
mog_mgmt_fn_aio_threads          147     146      -1
mog_dev_user_rescale_i            27      26      -1
mog_ioq_requeue_prepare           52      50      -2
mog_ioq_init                      80      78      -2
mog_thrpool_start                101      96      -5
mog_svc_dev_user_rescale         143     137      -6
mog_svc_start_each               264     256      -8
mog_svc_aio_threads_handler      257     249      -8
mog_ioq_ready                    263     255      -8
mog_ioq_next                     303     295      -8
mog_svc_thrpool_rescale          206     197      -9
mog_thrpool_set_size            1028    1001     -27
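A rough illustration of why narrower counters help structure layout.
The two structs are hypothetical and are not the cmogstored
definitions; on a typical LP64 platform the wide one occupies 24
bytes and the narrow one 16, but the exact sizes are
platform-dependent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool with size_t counters. */
struct example_pool_wide {
	size_t n_threads;
	size_t want_threads;
	void *queue;
};

/* Same fields with unsigned counters, which pack more tightly. */
struct example_pool_narrow {
	unsigned n_threads;
	unsigned want_threads;
	void *queue;
};

/* Bytes saved per structure by using the narrower counters. */
static size_t example_bytes_saved(void)
{
	assert(sizeof(struct example_pool_narrow) <=
	       sizeof(struct example_pool_wide));
	return sizeof(struct example_pool_wide) -
	       sizeof(struct example_pool_narrow);
}
```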
|
|
This should avoid concurrency bugs where a client may run in
multiple threads if we switch to multi-threaded graceful shutdown.
|
|
By reducing the capacity of each ioq, we force each running worker
thread to yield the current client and hit an exit point
(epoll_wait/kqueue) sooner.
|
|
Users reducing or increasing thread counts should increase
ioq capacity, otherwise there's no point in having more or
fewer threads if they are synced to the ioq capacity.
|
|
This replaces the fsck_queue internals with a generic
ioq implementation which is based on the MogileFS devid,
and not the operating system devid.
|
|
We're using per-svc-based thread pools, so different MogileFS
instances we serve no longer affect each other. This means
changing the aio_threads count only affects the svc of the
sidechannel port which triggered the change.
|
|
This improves maintainability in case MogileFS changes these
limits.
|
|
Both hash_initialize and hash_insert may return NULL to indicate
allocation errors. So implement a mog_oom_if_null helper function to
destroy the process instead of attempting to continue and dereferencing
NULL pointers.
This may affect configurations with limited memory and no
overcommit, but is unlikely to trigger given the small memory footprint
of cmogstored.
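A helper of this kind might look roughly like the sketch below. The
name example_oom_if_null, its signature and its message are
assumptions made for illustration; the real mog_oom_if_null may
differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: abort instead of continuing with NULL. */
static void *example_oom_if_null(void *ptr)
{
	if (ptr == NULL) {
		fputs("fatal: out of memory\n", stderr);
		abort();
	}
	return ptr;
}

/* Usage: the caller never has to check for NULL itself. */
static char *example_copy(const char *s)
{
	size_t n;
	char *copy;

	assert(s != NULL);
	n = strlen(s) + 1;
	copy = example_oom_if_null(malloc(n));
	memcpy(copy, s, n);
	return copy;
}
```

Failing loudly at the allocation site is easier to diagnose than a
NULL dereference somewhere later in the process.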
|
|
This will allow us to look up devices for per-(mog)device I/O queues.
|
|
If the mogstored sidechannel is inactive (in HTTP-only mode), we should
still count the number of devices correctly so that the number of
worker threads scales properly.
|
|
This simplifies code, reduces contention, and reduces the
chances of independent MogileFS instances (with one instance
of cmogstored) stepping over each other.
Most cmogstored deployments are single docroot (for a single
instance of MogileFS), however cmogstored supports multiple
docroots for some rare configurations and we support them here.
|
|
This will help ensure availability when new devices are added,
without additional user interaction to manually set aio_threads
via sidechannel.
|
|
There's no reason to be referencing FDs for these acceptors
since they're infrequently accessed by svc, so this should
make our internals more consistent. This also removes our
use of mog_fd_get (outside of test code).
|
|
For systems without memstream support, using temporary files to
emulate memstream opens us up to more common (than ENOMEM)
errors such as: EIO, ENOSPC, ENFILE and EMFILE.
Since we don't want our server to die completely on these
(sometimes temporary) error cases, we'll just stop publishing
iostat data to "watch" subscribers.
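As a sketch of the failure handling described above (hypothetical
example_* names and output format, not the cmogstored
implementation): every step of the temporary-file path can fail, and
each failure is reported to the caller as NULL instead of killing
the process.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format one stats line through a temporary file.
 * Returns a malloc()-ed string, or NULL on any error.
 */
static char *example_format_stats(const char *dev, double util, size_t *len)
{
	FILE *fp;
	char *buf = NULL;
	long end;

	assert(dev != NULL && len != NULL);
	fp = tmpfile();
	if (!fp)
		return NULL; /* e.g. EMFILE, ENFILE or ENOSPC */
	if (fprintf(fp, "%s\t%.2f\n", dev, util) < 0)
		goto out;
	if (fflush(fp) != 0)
		goto out; /* e.g. EIO or ENOSPC */
	end = ftell(fp);
	if (end < 0)
		goto out;
	rewind(fp);
	buf = malloc((size_t)end + 1);
	if (!buf)
		goto out;
	if (fread(buf, 1, (size_t)end, fp) != (size_t)end) {
		free(buf);
		buf = NULL;
		goto out;
	}
	buf[end] = '\0';
	*len = (size_t)end;
out:
	fclose(fp);
	return buf;
}
```

A caller that receives NULL can skip that round of publishing and
carry on serving other requests.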
|
|
This prevents us from losing iostat utilization each time the
mount list is rescanned.
Additionally, this allows us to read iostat utilization (and
write to sidechannel clients) concurrently while the mount list
is being refreshed.
|
|
This is better than open-coding a length everywhere.
|
|
gnulib did it for us in m4/gnulib-cache.m4, so we'll match.
|
|
The "watch" sidechannel command no longer gets confused in case
two "devXX" directories share the same filesystem (and thus
st_dev). This case should be rare in production setups, but
happens frequently in testing.
Perl mogstored does not have this bug.
|
|
Similarly, if folks continue to rely on the Perl mogstored
daemon for whatever reason, avoid potentially conflicting and
having unnecessary wakeups/activity for usage file changes.
|
|
This matches the behavior of Perl mogstored. Some
systems (like one of mine) may have many major devices
and fewer devices dedicated to MogileFS storage.
This really *should* be tunable, though...
|
|
This shuts down any iostats subscribers on graceful exit.
|
|
This forces us to invalidate the mog_fd structure before calling
close() on the file descriptor. Eventually, this lets us
shut down gracefully by scanning fdmap to invalidate old
connections.
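The ordering can be sketched as follows, with a hypothetical
struct example_fd standing in for mog_fd. The presumed reason for the
ordering is that once close() returns, the kernel may hand the same
descriptor number to another thread, so the slot must already look
unused by then.

```c
#include <assert.h>
#include <unistd.h>

enum { EXAMPLE_FD_UNUSED = 0, EXAMPLE_FD_CLIENT = 1 };

/* Hypothetical per-descriptor slot; the real mog_fd differs. */
struct example_fd {
	int fd;
	int type;
};

/* Invalidate the slot first, then release the descriptor. */
static int example_fd_close(struct example_fd *mfd)
{
	int fd = mfd->fd;

	assert(mfd->type != EXAMPLE_FD_UNUSED);
	mfd->type = EXAMPLE_FD_UNUSED; /* invalidate before close() */
	mfd->fd = -1;
	return close(fd);
}
```

A shutdown pass can then walk the table and treat every slot that is
still marked as in use as a live connection.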
|
|
We'll use a tmpfile() for now, but later we should
probably just rip out the *printf() dependencies
entirely since they use a lot of stack.
|
|
We only need the allocated memory for the short duration of the
(non-blocking) broadcast.
Keeping the devstats buffer around is wasteful, and also
increases the chance the allocated memory will be free()-ed in
another thread (something few, if any, MT-aware malloc()
implementations are optimized for).
|
|
There's no actual error here, but older GCCs (like the ones
found on CentOS 5.x) weren't as smart...
|
|
Nuked old history since it was missing copyright/GPLv3 notices.
|