path: root/thrpool.c
Date Commit message
2013-12-09 thrpool: sleep instead of yield when poking thread
This unfortunate loop burned too much CPU on FreeBSD and caused shutdown to take too long when using sched_yield. nanosleep for 10ms instead, hopefully allowing the system to accomplish some disk I/O and other tasks before we poke it again. Reported-by: Mikolaj Golub
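A minimal sketch of the idea, not the actual thrpool.c code: the 10ms nanosleep replaces a yield between pokes. The names `poke_delay` and `poke_until_done`, the `done` flag, and the choice of signal are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <time.h>

/* Sleep for about 10ms, resuming with the remaining time if interrupted. */
static void poke_delay(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

	while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
		; /* ts now holds the unslept remainder */
}

/*
 * Hypothetical poke loop: signal `thr` until it sets `*done`.
 * The caller must have installed a handler for `sig` and must not
 * have joined `thr` yet.  Returns the number of signals sent.
 */
static unsigned poke_until_done(pthread_t thr, atomic_int *done, int sig)
{
	unsigned pokes = 0;

	assert(done != NULL);
	while (!atomic_load(done)) {
		if (pthread_kill(thr, sig) != 0)
			break; /* thread is gone or the signal is invalid */
		pokes++;
		poke_delay(); /* sleep here instead of yielding */
	}
	return pokes;
}
```

Sleeping lets the kernel schedule other work (including the disk I/O the target thread may be waiting on) rather than immediately re-running the poking thread.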
2013-07-14 downgrade thread/device-count fields to unsigned int
It's unlikely we'll come close to seeing 2-4 billion devices in a MogileFS instance for a while. Meanwhile, it's also unlikely the kernel will ever run that many threads. So make it easier to pack and shrink data structures to save a few bytes and perhaps get better memory alignment. For reference, the POSIX semaphore API specifies initial values with unsigned (int) values, too. This leads to a minor size reduction (and we're not even packing):

    $ ~/linux/scripts/bloat-o-meter cmogstored.before cmogstored
    add/remove: 0/0 grow/shrink: 0/13 up/down: 0/-86 (-86)
    function                          old     new   delta
    mog_svc_dev_quit_prepare           13      12      -1
    mog_mgmt_fn_aio_threads           147     146      -1
    mog_dev_user_rescale_i             27      26      -1
    mog_ioq_requeue_prepare            52      50      -2
    mog_ioq_init                       80      78      -2
    mog_thrpool_start                 101      96      -5
    mog_svc_dev_user_rescale          143     137      -6
    mog_svc_start_each                264     256      -8
    mog_svc_aio_threads_handler       257     249      -8
    mog_ioq_ready                     263     255      -8
    mog_ioq_next                      303     295      -8
    mog_svc_thrpool_rescale           206     197      -9
    mog_thrpool_set_size             1028    1001     -27
2013-07-11 mgmt: checksumming is interruptible during thread shutdown
We want to yield dying threads as soon as possible during thread shutdown, so we check the quit flag and yield the running thread to trigger a MOG_NEXT_ACTIVE.
2013-06-25 introduce mog_yield wrapper around sched_yield/pthread_yield
While pthread_yield is non-standard, it is relatively common and preferable for systems where pthreads are _not_ 1:1 mapped to kernel threads. This also provides a stronger yield to weaken the priority of the calling thread wherever we previously used sched_yield.
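A wrapper of this kind could be sketched as follows. This is not the actual mog_yield: the names `yield_sketch` and `yield_times`, and the `HAVE_PTHREAD_YIELD` configure-time macro, are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Yield the CPU, preferring pthread_yield where the build detected it. */
static void yield_sketch(void)
{
#ifdef HAVE_PTHREAD_YIELD /* assumed to be set by the build system */
	pthread_yield();
#else
	int rc = sched_yield();

	assert(rc == 0);
	(void)rc; /* silence unused warnings when NDEBUG is set */
#endif
}

/* Yield n times in a row; returns the number of yields performed. */
static unsigned yield_times(unsigned n)
{
	unsigned i;

	for (i = 0; i < n; i++)
		yield_sketch();
	return i;
}
```

Callers that previously invoked sched_yield directly would call the wrapper instead, so the choice of primitive lives in one place.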
2013-06-25 call sched_yield repeatedly when terminating threads
This should allow the threads we're terminating to more quickly enter a safe state where they're allowed to exit. On SMP systems, we need to yield the signalling thread more times to increase the probability the interrupted thread can run (and exit).
2013-06-25 replace pthreads cancellation with explicit checks
Due to data/event loss, we cannot rely on normal syscalls (accept/epoll_wait) being cancellation points. The benefits of using a standardized API to terminate threads asynchronously are lost when toggling cancellation flags. This implementation allows us to be more explicit and obvious at the few points where our worker threads may exit and reduces the amount of code we have. By avoiding the calls to pthread_setcancelstate, we should halve the number of atomic operations required in the common case (where the thread is not marked for termination).
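The general shape of an explicit check can be sketched as below. This is an illustration under assumptions, not the cmogstored implementation: `struct worker`, `worker_main`, and `worker_stop` are hypothetical names, and the "work" is a counter increment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

struct worker {
	pthread_t thr;
	atomic_bool quit;         /* set by the thread requesting shutdown */
	unsigned long iterations; /* stand-in for real work done */
};

static void *worker_main(void *arg)
{
	struct worker *w = arg;

	assert(w != NULL);
	for (;;) {
		/* the single, explicit point where this thread may exit */
		if (atomic_load(&w->quit))
			return w;
		w->iterations++; /* one unit of work; no state is left half-done */
		sched_yield();
	}
}

/* Ask the worker to exit at its next check, then wait for it. */
static int worker_stop(struct worker *w)
{
	atomic_store(&w->quit, true);
	return pthread_join(w->thr, NULL);
}
```

Because the thread only exits at the check, no cancellation state needs to be toggled around syscalls, and the common case costs one atomic load per iteration in this sketch.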
2013-06-25 refactor handling of "server aio_threads = " command
We're using per-svc-based thread pools, so different MogileFS instances we serve no longer affect each other. This means changing the aio_threads count only affects the svc of the sidechannel port which triggered the change.
2013-06-25 switch to per-svc (per-docroot) queues
This simplifies code, reduces contention, and reduces the chances of independent MogileFS instances (with one instance of cmogstored) stepping over each other. Most cmogstored deployments are single docroot (for a single instance of MogileFS), however cmogstored supports multiple docroots for some rare configurations and we support them here.
2013-06-25 thrpool: add comment explaining minimum thread count
I forgot why this bound was necessary, so add a comment ensuring I do not forget again.
2013-06-25 update aio_threads count when new devices appear
This will help ensure availability when new devices are added, without additional user interaction to manually set aio_threads via sidechannel.
2013-02-16 handle pthread_create returning ENOMEM on old glibc
Older glibc will return ENOMEM on mprotect() failures. This bug was only fixed in 2011, so the long-term distros and old installations may not have the necessary backports. ref: http://www.sourceware.org/bugzilla/show_bug.cgi?id=386
2013-02-16 graceful handling of pthread_create EAGAIN failure
pthread_create may return EAGAIN as a temporary failure; do not abort a running process if this is the case. For the initial mountlist scan, we must retry indefinitely for cmogstored to be usable. However, with our thread pools, we can always run fewer threads (as long as there is at least one thread per pool).
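The thread-pool half of this policy might look like the sketch below. It is an illustration only: `start_workers` and `idle_worker` are hypothetical names, the 10ms retry delay is an assumption, and treating ENOMEM like EAGAIN follows the old-glibc behavior described in the commit above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Stand-in worker for illustration; a real pool thread would loop on a queue. */
static void *idle_worker(void *arg)
{
	return arg;
}

/*
 * Try to start `want` threads.  Temporary failures are tolerated once at
 * least one thread is running.  Returns the number of threads started.
 */
static unsigned start_workers(pthread_t *thr, unsigned want,
			      void *(*fn)(void *), void *arg)
{
	/* retry delay is an assumption of this sketch */
	const struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
	unsigned n = 0;

	assert(thr != NULL && want > 0);
	while (n < want) {
		int rc = pthread_create(&thr[n], NULL, fn, arg);

		if (rc == 0) {
			n++;
			continue;
		}
		if (rc != EAGAIN && rc != ENOMEM)
			break; /* not temporary: let the caller decide */
		if (n > 0)
			break; /* temporary failure: run with fewer threads */
		nanosleep(&delay, NULL); /* no threads yet: wait and retry */
	}
	return n;
}
```

A caller would compare the return value with the requested count and carry on with a smaller pool rather than aborting.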
2013-01-17 copyright comment updates for 2013
gnulib did it for us in m4/gnulib-cache.m4, we'll match.
2012-12-08 thrpool: signal threads concurrently at shutdown
This speeds up shutdown for kqueue users, as kevent() is not a cancellation point. While we're at it, remove the unnecessary check for mog_queue. before pthread_kill(). This check was a remnant of the old, NOTE_TRIGGER-based implementation.
2012-11-12 mgmt: support "server aio_threads = <digit>"
This allows tunable thread counts at runtime like regular mogstored (using Perlbal).
2012-05-02 kqueue: rely on EINTR instead of EVFILT_USER to shutdown
Using pthread_cancel() and pthread_kill() allows us to do shutdowns of individual threads in the future. EVFILT_USER will just spam the kernel and the thread-specific "dying" hack won't work if we only want to shut down a single thread. kevent() is not a cancellation point in FreeBSD and will not be in libkqueue, either. However libkqueue will set errno==EINTR if it is interrupted, allowing cancellation requests to go through.
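For pthread_kill to make a blocking call fail with EINTR, the signal needs a handler (a default-ignored signal interrupts nothing). A minimal sketch of installing such a handler follows; `wake_noop` and `install_wake_handler` are hypothetical names, and which signal cmogstored actually uses is not stated here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Does nothing: its only purpose is to interrupt a blocking syscall. */
static void wake_noop(int sig)
{
	(void)sig;
}

/* Install the no-op handler for `sig`; returns 0 on success, -1 on error. */
static int install_wake_handler(int sig)
{
	struct sigaction sa;

	assert(sig > 0);
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = wake_noop;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0; /* no SA_RESTART, so interrupted calls can return EINTR */
	return sigaction(sig, &sa, NULL);
}
```

After a blocking wait returns -1 with errno set to EINTR, the thread can check whether it has been asked to exit.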
2012-04-21 kqueue: schedule wakeup of sleepers during shutdown
By explicitly giving kevent() sleepers a chance to wake up and run, we can reduce the number of times we need to trigger wakeups via NOTE_TRIGGER.
2012-04-21 queue: rework kevent cancellation handling
The kevent() function as implemented by libkqueue does not support thread cancellation the same way a real kevent() (on FreeBSD) appears to. So pretend no implementation of kevent() is cancelable and handle cancellation ourselves using pthread_testcancel(). This allows us to support any platform where kevent() may work, since it's unclear if other *BSDs implement kevent() as a cancellation point.
2012-04-20 only set explicit stack size on GNU/libc and FreeBSD
We don't know enough about the libc of other platforms to make an intelligent choice about stack size, so just use the default to avoid potential problems.
2012-04-19 thrpool: use default stack size for libkqueue users
libkqueue appears to use a lot of stack, so just use the default stack size to avoid unexplained segfaults.
2012-02-11 do not rely on BUFSIZ=8192
BUFSIZ is only 1024 on FreeBSD, which is too small to be optimal for large I/O operations.
2012-02-10 threads based on the number of usable major devices
This should really be tunable, but we can do that later.
2012-01-18 thrpool: add BUFSIZ (8K on glibc) to thread stack
The *printf() family of functions may allocate BUFSIZ on the stack. We'll need those functions (including syslog(3)) in various places, so it's safer to have more stack (and it can give more meaningful assert() messages).
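Adding BUFSIZ of headroom to an explicit stack size can be sketched as follows. The function name `stack_attr_init` and the caller-supplied `base` size are assumptions; the actual base size used by thrpool.c is not given here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stdio.h>

/*
 * Initialize `attr` with a stack of `base` bytes plus BUFSIZ of headroom
 * for *printf()-style functions, never going below PTHREAD_STACK_MIN.
 * Returns 0 on success or an errno-style code on failure.
 */
static int stack_attr_init(pthread_attr_t *attr, size_t base)
{
	size_t want = base + BUFSIZ;
	int rc;

	assert(attr != NULL);
	if (want < (size_t)PTHREAD_STACK_MIN)
		want = (size_t)PTHREAD_STACK_MIN;

	rc = pthread_attr_init(attr);
	if (rc != 0)
		return rc;
	rc = pthread_attr_setstacksize(attr, want);
	if (rc != 0)
		pthread_attr_destroy(attr); /* do not leak a half-configured attr */
	return rc;
}
```

The resulting attribute would be passed to pthread_create and destroyed with pthread_attr_destroy once the threads are started.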
2012-01-11 initial commit
Nuked old history since it was missing copyright/GPLv3 notices.