cmogstored.git - alternative mogstored implementation for MogileFS

Date	Commit message
2014-05-29	dev: minor khash-related cleanups [khash]
	This reduces unnecessary boilerplate due to switching to khash.
2014-05-29	try khash out for mapping mog_devids
	There are no apparent benefits at the moment, but theoretically it can reduce the amount of allocations we do and improve locality.
2014-05-23	svc_dev: calling free does not need the lock [ccan-list]
	We do not need to be holding devstats_lock when releasing a local buffer which will never be used by another thread.
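The idea of releasing a buffer only after dropping the lock can be sketched as follows. This is a minimal illustration, not cmogstored's code: `stats_lock`, `shared_stats`, `stats_publish` and `stats_copy` are hypothetical names. The old buffer is detached while the mutex is held, and `free()` runs once no other thread can reach it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these names are not taken from cmogstored. */
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static char *shared_stats; /* guarded by stats_lock */

/* Publish a new snapshot; the old one is freed after the lock is dropped. */
void stats_publish(const char *text)
{
	size_t len = strlen(text) + 1;
	char *fresh = malloc(len);
	char *stale;

	if (!fresh)
		abort();
	memcpy(fresh, text, len);

	pthread_mutex_lock(&stats_lock);
	stale = shared_stats; /* detach the old buffer while locked */
	shared_stats = fresh;
	pthread_mutex_unlock(&stats_lock);

	/* stale is now private to this thread; free(NULL) is a no-op */
	free(stale);
}

/* Copy the current snapshot into dst; returns the copied length. */
size_t stats_copy(char *dst, size_t dstlen)
{
	size_t n = 0;

	pthread_mutex_lock(&stats_lock);
	if (shared_stats && dstlen > 0) {
		n = strlen(shared_stats);
		if (n >= dstlen)
			n = dstlen - 1;
		memcpy(dst, shared_stats, n);
		dst[n] = '\0';
	}
	pthread_mutex_unlock(&stats_lock);
	return n;
}
```

Keeping `free()` out of the critical section shortens the time other threads spend waiting on the mutex.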
2014-05-22	trade sys/queue.h LIST_* for ccan/list
	ccan/list has branchless add/del operations and (IMHO) a better API.
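To illustrate what "branchless" means here, the sketch below shows a circular doubly-linked list with a sentinel head. It is a simplified illustration and not the actual ccan/list source; the `ring_*` names are hypothetical. Because an empty list points at itself, every node always has valid neighbours, so insertion and removal need no NULL checks, whereas the sys/queue.h `LIST_INSERT_HEAD` and `LIST_REMOVE` macros test the next pointer against NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified illustration; not the actual ccan/list source. */
struct node {
	struct node *next, *prev;
};

/* An empty list is a head pointing at itself, so neighbours always exist. */
void ring_init(struct node *head)
{
	head->next = head->prev = head;
}

/* Insert n right after head: four stores, no conditionals. */
void ring_add(struct node *head, struct node *n)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n: two stores, no NULL checks needed. */
void ring_del(struct node *n)
{
	n->next->prev = n->prev;
	n->prev->next = n->next;
}

/* Returns nonzero if the list holds no nodes besides the head. */
int ring_empty(const struct node *head)
{
	return head->next == head;
}
```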
2014-04-08	minor cleanups for functions which do not return
	pthread_exit and abort never return, so quiet down some warnings when using -Wunreachable-code on clang. Unfortunately, using -Wunreachable-code globally is too noisy due to 1) Ragel-generated code and 2) constant branch conditions for build-time options (trace/cork).
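As a rough illustration of the kind of cleanup involved, the sketch below marks a helper as non-returning with the C11 `noreturn` specifier, so no dead `return` statement is needed after the call. The `die` and `checked_div` functions are hypothetical, and the actual commit may have used a different mechanism (for example, simply deleting unreachable statements).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

/* Hypothetical helper; noreturn tells the compiler control never comes back. */
noreturn void die(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	abort();
}

/* No unreachable "return -1;" is needed after die(). */
int checked_div(int num, int den)
{
	if (den != 0)
		return num / den;
	die("division by zero");
}
```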
2013-10-12	avoid use-after-free with multi-process setups
	readdir on the same DIR pointer is undefined if DIR was inherited by multiple children. Using the reentrant readdir_r would not have helped, since the underlying file descriptor and kernel file handle were still shared (and we need rewinddir, too). This readdir usage bug existed in cmogstored since the earliest releases, but was harmless until the cmogstored 1.3 series. This misuse of readdir led to hitting a leftover call to free(), so this bug only manifested since commit 1fab1e7a7f03f3bc0abb1b5181117f2d4605ce3b (svc: implement top-level by_mog_devid hash). Fortunately, these bugs only affect users of the undocumented multi-process feature (not just multi-threaded).
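One way to avoid sharing a DIR stream across processes is for each child to call `opendir` itself after `fork`. The sketch below assumes that approach; the commit message does not spell out the actual fix, and `count_entries` and `scan_in_child` are hypothetical names.

```c
#include <assert.h>
#include <dirent.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Count entries using a DIR stream private to the calling process. */
long count_entries(const char *path)
{
	DIR *dir = opendir(path);
	long n = 0;

	if (!dir)
		return -1;
	while (readdir(dir) != NULL)
		n++;
	closedir(dir);
	return n;
}

/* Fork a child that scans path with its own DIR; 0 means the child succeeded. */
int scan_in_child(const char *path)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) /* child: open a fresh DIR rather than reusing a parent's */
		_exit(count_entries(path) >= 0 ? 0 : 1);
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Since every process owns its stream and the underlying descriptor, the read offset is never shared between children.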
2013-07-14	downgrade thread/device-count fields to unsigned int
	It's unlikely we'll even come close to seeing 2-4 billion devices in a MogileFS instance for a while. Meanwhile, it's also unlikely the kernel will ever run that many threads, either. So make it easier to pack and shrink data structures to save a few bytes and perhaps get better memory alignment. For reference, the POSIX semaphore API specifies initial values with unsigned (int) values, too. This leads to a minor size reduction (and we're not even packing):

	$ ~/linux/scripts/bloat-o-meter cmogstored.before cmogstored
	add/remove: 0/0 grow/shrink: 0/13 up/down: 0/-86 (-86)
	function                         old     new   delta
	mog_svc_dev_quit_prepare          13      12      -1
	mog_mgmt_fn_aio_threads          147     146      -1
	mog_dev_user_rescale_i            27      26      -1
	mog_ioq_requeue_prepare           52      50      -2
	mog_ioq_init                      80      78      -2
	mog_thrpool_start                101      96      -5
	mog_svc_dev_user_rescale         143     137      -6
	mog_svc_start_each               264     256      -8
	mog_svc_aio_threads_handler      257     249      -8
	mog_ioq_ready                    263     255      -8
	mog_ioq_next                     303     295      -8
	mog_svc_thrpool_rescale          206     197      -9
	mog_thrpool_set_size            1028    1001     -27
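The packing benefit mentioned above can be illustrated with two hypothetical layouts (the struct and field names are invented for this sketch and do not come from cmogstored). On a typical LP64 system, two `size_t` counters take 16 bytes while two `unsigned` counters share 8, so the compact form is expected to be smaller; on ILP32 systems the two are the same size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts; field names are illustrative only. */
struct pool_wide {
	size_t n_threads;
	size_t n_devices;
	void *threads;
};

struct pool_compact {
	unsigned n_threads; /* two 32-bit counters can share one 64-bit slot */
	unsigned n_devices;
	void *threads;
};
```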
2013-07-14	ioq: reset internal queues during requeue/shutdown
	This should avoid concurrency bugs where a client may run in multiple threads if we switch to multi-threaded graceful shutdown.
2013-07-12	svc: increase responsiveness of graceful shutdown
	By reducing the capacity of each ioq, we force each running worker thread to yield the current client and hit an exit point (epoll_wait/kqueue) sooner.
2013-07-12	ioq: rescale to match user-set aio_threads values
	When users reduce or increase thread counts, the ioq capacity should be rescaled to match; otherwise there's no point in having more or fewer threads, since they are synced to the ioq capacity.
2013-07-10	introduce generic I/O queue functionality
	This replaces the fsck_queue internals with a generic ioq implementation which is based on the MogileFS devid, and not the operating system devid.
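A capacity-limited I/O queue of this general kind can be sketched as below, with one queue instance per MogileFS devid. This is an assumption-laden illustration: the `struct ioq` layout and the simplified `ioq_init`, `ioq_ready` and `ioq_next` signatures are hypothetical and are not cmogstored's real `mog_ioq_*` API. A client either takes a free slot or is parked; a finishing client hands its slot to the next waiter.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical types; cmogstored's real ioq differs in detail. */
struct client {
	struct client *next; /* link used while parked in the wait queue */
	int id;
};

struct ioq {
	pthread_mutex_t mtx;
	unsigned cur; /* free slots remaining */
	unsigned max; /* capacity */
	struct client *head, **tail;
};

void ioq_init(struct ioq *q, unsigned capacity)
{
	pthread_mutex_init(&q->mtx, NULL);
	q->cur = q->max = capacity;
	q->head = NULL;
	q->tail = &q->head;
}

/* Returns 1 if c may do I/O now; otherwise parks c and returns 0. */
int ioq_ready(struct ioq *q, struct client *c)
{
	int ok;

	pthread_mutex_lock(&q->mtx);
	ok = q->cur > 0;
	if (ok) {
		q->cur--;
	} else {
		c->next = NULL;
		*q->tail = c;
		q->tail = &c->next;
	}
	pthread_mutex_unlock(&q->mtx);
	return ok;
}

/* Called when a client finishes: returns the next waiter, which inherits
 * the slot, or NULL after giving the slot back to the queue. */
struct client *ioq_next(struct ioq *q)
{
	struct client *c;

	pthread_mutex_lock(&q->mtx);
	c = q->head;
	if (c) {
		q->head = c->next;
		if (!q->head)
			q->tail = &q->head;
	} else if (q->cur < q->max) {
		q->cur++;
	}
	pthread_mutex_unlock(&q->mtx);
	return c;
}
```

Keying the queue on the MogileFS devid rather than the operating system device means the limit follows the storage layout that MogileFS itself sees.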
2013-06-25	refactor handling of "server aio_threads = " command
	We're using per-svc-based thread pools, so different MogileFS instances we serve no longer affect each other. This means changing the aio_threads count only affects the svc of the sidechannel port which triggered the change.
2013-06-25	define MOG_DEVID_MAX and MOG_PATH_MAX variables
	This improves maintainability in case MogileFS changes these limits.
2013-06-25	consistently check OOM from hash_initialize/hash_insert
	Both hash_initialize and hash_insert may return NULL to indicate allocation errors. So implement a mog_oom_if_null helper function to destroy the process instead of attempting to continue and dereferencing NULL pointers. This may affect configurations with limited memory and lacking overcommit; but is unlikely to trigger given the small memory footprint of cmogstored.
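A helper of this kind might look like the sketch below. The name `oom_if_null` and its signature are assumptions; the real `mog_oom_if_null` may differ. The point is that a NULL result terminates the process immediately instead of being dereferenced later.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical signature; the real mog_oom_if_null may differ. */
void *oom_if_null(void *ptr)
{
	if (ptr == NULL) {
		fputs("out of memory\n", stderr);
		abort(); /* die now rather than dereference NULL later */
	}
	return ptr;
}
```

Returning the pointer lets a caller wrap an allocating call directly, for example `buf = oom_if_null(malloc(len));`.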
2013-06-25	svc: implement top-level by_mog_devid hash
	This will allow us to lookup devices for per-(mog)device I/O queues.
2013-06-25	fix devices/thread count if sidechannel is inactive
	If the mogstored sidechannel is inactive (in HTTP-only mode), we should still count the number of devices correctly so that the number of worker threads scales properly.
2013-06-25	switch to per-svc (per-docroot) queues
	This simplifies code, reduces contention, and reduces the chances of independent MogileFS instances (with one instance of cmogstored) stepping over each other. Most cmogstored deployments are single docroot (for a single instance of MogileFS), however cmogstored supports multiple docroots for some rare configurations and we support them here.
2013-06-25	update aio_threads count when new devices appear
	This will help ensure availability when new devices are added, without additional user interaction to manually set aio_threads via sidechannel.
2013-05-06	favor "struct mog_fd" for acceptors over int FDs
	There's no reason to be referencing FDs for these acceptors since they're infrequently accessed by svc, so this should make our internals more consistent. This also removes our use of mog_fd_get (outside of test code).
2013-01-31	better error handling when faking memstream
	For systems without memstream support, using temporary files to emulate memstream opens us up to more common (than ENOMEM) errors such as: EIO, ENOSPC, ENFILE and EMFILE. Since we don't want our server to die completely on these (sometimes temporary) error cases, we'll just stop publishing iostat data to "watch" subscribers.
2013-01-31	split iostat util% tracking from mountlist
	This prevents us from losing iostat utilization each time the mount list is rescanned. Additionally, this allows us to read iostat utilization (and write to sidechannel clients) concurrently while the mount list is being refreshed.
2013-01-31	consistent allocation size for iostat utilization
	This is better than open-coding a length everywhere.
2013-01-17	copyright comment updates for 2013
	gnulib did it for us in m4/gnulib-cache.m4, we'll match.
2012-09-06	fix I/O util for multiple devXX dirs on the same FS [v0.5.0]
	The "watch" sidechannel command no longer gets confused in case two "devXX" directories share the same filesystem (and thus st_dev). This case should be rare in production setups, but happens frequently in testing. Perl mogstored does not have this bug.
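The collision described above arises because two directories on one filesystem report the same `st_dev`. A minimal check for that condition, using a hypothetical `same_fs` helper, could look like this:

```c
#include <assert.h>
#include <sys/stat.h>

/* Returns 1 if both paths live on the same filesystem, 0 if not, -1 on error. */
int same_fs(const char *a, const char *b)
{
	struct stat sa, sb;

	if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
		return -1;
	return sa.st_dev == sb.st_dev;
}
```

Any table keyed only on `st_dev` would therefore hold a single entry for both devXX directories.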
2012-04-18	avoid usage file if mgmt sidechannel is inactive
	Similarly, if folks continue to rely on the Perl mogstored daemon for whatever reason, avoid potential conflicts and unnecessary wakeups/activity for usage file changes.
2012-03-12	change thread count based on number of dev* entries
	This matches the behavior of Perl mogstored. Some systems (like one of mine) may have many major devices and fewer devices dedicated to MogileFS storage. This really should be tunable, though...
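Counting dev* entries under a docroot can be sketched as follows. The helper names are hypothetical, and the sketch assumes device directories are named "dev" followed by decimal digits; how the count maps to a thread count is not stated in the commit message and is left out here.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Returns 1 if name looks like "dev" followed by one or more digits. */
int is_dev_name(const char *name)
{
	const char *p;

	if (strncmp(name, "dev", 3) != 0 || name[3] == '\0')
		return 0;
	for (p = name + 3; *p; p++)
		if (!isdigit((unsigned char)*p))
			return 0;
	return 1;
}

/* Count devN entries directly under docroot; -1 if it cannot be read. */
long count_devs(const char *docroot)
{
	DIR *dir = opendir(docroot);
	struct dirent *ent;
	long n = 0;

	if (!dir)
		return -1;
	while ((ent = readdir(dir)) != NULL)
		if (is_dev_name(ent->d_name))
			n++;
	closedir(dir);
	return n;
}
```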
2012-02-22	add new mog_svc_dev_shutdown() method
	This shuts down any iostats subscribers on graceful exit.
2012-02-20	redo mog_fd_put() and actually use it
	This forces us to invalidate the mog_fd structure before calling close() on the file descriptor. Eventually, this lets us gracefully shutdown by scanning fdmap to invalidate old connections.
2012-02-05	compat_memstream: for systems lacking open_memstream()
	We'll use a tmpfile() for now, but later we should probably just rip out the *printf() dependencies entirely since they use a lot of stack.
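The tmpfile()-based approach can be sketched roughly as below: formatted output goes to a temporary file, which is then read back into a heap buffer. `tmpstream_finish` is a hypothetical helper, not the function used in cmogstored's compat_memstream code. It reports failure by returning NULL so a caller can decide how to degrade.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: slurp everything written to fp into a malloc()-ed,
 * NUL-terminated buffer. Returns NULL on any I/O or allocation error. */
char *tmpstream_finish(FILE *fp, size_t *len)
{
	long end;
	char *buf;

	if (fflush(fp) != 0 || fseek(fp, 0, SEEK_END) != 0)
		return NULL;
	end = ftell(fp);
	if (end < 0 || fseek(fp, 0, SEEK_SET) != 0)
		return NULL;
	buf = malloc((size_t)end + 1);
	if (!buf)
		return NULL;
	if (fread(buf, 1, (size_t)end, fp) != (size_t)end) {
		free(buf);
		return NULL;
	}
	buf[end] = '\0';
	*len = (size_t)end;
	return buf;
}
```

A caller would obtain `fp` from `tmpfile()`, write with `fprintf`, call `tmpstream_finish`, and then `fclose(fp)` and `free()` the result.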
2012-01-13	remove devstats iovec from struct mog_svc
	We only need the allocated memory for the short duration of the (non-blocking) broadcast. Keeping the devstats buffer around is wasteful, and also increases the chance the allocated memory will be free()-ed in another thread (something few, if any, MT-aware malloc() implementations are optimized for).
2012-01-12	svc_dev: avoid strict aliasing warning on older GCC
	There's no actual error here, but older GCCs (like the ones found on CentOS 5.x) weren't as smart...
2012-01-11	initial commit
	Nuked old history since it was missing copyright/GPLv3 notices.