From: Beata Michalska <b.michalska@samsung.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org, greg@kroah.com, jack@suse.cz,
	tytso@mit.edu, adilger.kernel@dilger.ca, hughd@google.com,
	lczerner@redhat.com, hch@infradead.org,
	linux-ext4@vger.kernel.org, linux-mm@kvack.org,
	kyungmin.park@samsung.com, kmpark@infradead.org
Subject: Re: [RFC v3 1/4] fs: Add generic file system event notifications
Date: Fri, 19 Jun 2015 19:28:11 +0200
Message-ID: <5584512B.5020301@samsung.com>
In-Reply-To: <20150619000341.GM10224@dastard>

On 06/19/2015 02:03 AM, Dave Chinner wrote:
> On Thu, Jun 18, 2015 at 10:25:08AM +0200, Beata Michalska wrote:
>> On 06/18/2015 01:06 AM, Dave Chinner wrote:
>>> On Tue, Jun 16, 2015 at 03:09:30PM +0200, Beata Michalska wrote:
>>>> Introduce configurable generic interface for file
>>>> system-wide event notifications, to provide file
>>>> systems with a common way of reporting any potential
>>>> issues as they emerge.
>>>>
>>>> The notifications are to be issued through generic
>>>> netlink interface by newly introduced multicast group.
>>>>
>>>> Threshold notifications have been included, allowing
>>>> triggering an event whenever the amount of free space drops
>>>> below a certain level - or levels to be more precise as two
>>>> of them are being supported: the lower and the upper range.
>>>> The notifications work both ways: once the threshold level
>>>> has been reached, an event shall be generated whenever
>>>> the number of available blocks goes up again re-activating
>>>> the threshold.
>>>>
>>>> The interface has been exposed through a vfs. Once mounted,
>>>> it serves as an entry point for the set-up where one can
>>>> register for particular file system events.
>>>>
>>>> Signed-off-by: Beata Michalska <b.michalska@samsung.com>
>>>
>>> This has massive scalability problems:
> ....
>>> Have you noticed that the filesystems have percpu counters for
>>> tracking global space usage? There's good reason for that - taking a
>>> spinlock in such a hot accounting path causes severe contention.
> ....
>>> Then puts the entire netlink send path inside this spinlock, which
>>> includes memory allocation and all sorts of non-filesystem code
>>> paths. And it may be inside critical filesystem locks as well....
>>>
>>> Apart from the serialisation problem of the locking, adding
>>> memory allocation and the network send path to filesystem code
>>> that is effectively considered "innermost" filesystem code is going
>>> to have all sorts of problems for various filesystems. In the XFS
>>> case, we simply cannot execute this sort of function in the places
>>> where we update global space accounting.
>>>
>>> As it is, I think the basic concept of separate tracking of free
>>> space is fundamentally flawed. What I think needs to be done is that
>>> filesystems need access to the thresholds for events, and then the
>>> filesystems call fs_event_send_thresh() themselves from appropriate
>>> contexts (ie. without compromising locking, scalability, memory
>>> allocation recursion constraints, etc).
>>>
>>> e.g. instead of tracking every change in free space, a filesystem
>>> might execute this once every few seconds from a workqueue:
>>>
>>> 	event = fs_event_need_space_warning(sb, <fs_free_space>)
>>> 	if (event)
>>> 		fs_event_send_thresh(sb, event);
>>>
>>> User still gets warnings about space usage, but there's no runtime
>>> overhead or problems with lock/memory allocation contexts, etc.
>>
>> Having fs to keep a firm hand on thresholds limits would indeed be
>> far more sane approach though that would require each fs to
>> add support for that and handle most of it on their own. Avoiding
>> this was the main rationale behind this rfc.
>> If fs people agree to that, I'll be more than willing to drop this
>> in favour of the per-fs tracking solution. 
>> Personally, I hope they will.
> 
> I was hoping that you'd think a little more about my suggestion and
> work out how to do background threshold event detection generically.
> I kind of left it as "an exercise for the reader" because it seems
> obvious to me.
> 
> Hint: ->statfs allows you to get the total, free and used space
> from filesystems in a generic manner.
> 
> Cheers,
> 
> Dave.
> 

I haven't given up on that, so yes, I'm still working on a more suitable
generic solution.
Background detection is one of the options, though it needs some more thought.
Giving up the sync approach means less accuracy for the threshold notifications,
but I guess this could be fine-tuned to reach an acceptable level. Another open
question: how is this tuning supposed to be done (an additional config option,
maybe)?
The interface would have to keep it somehow sane - but what would 'sane' mean
in this case? Also, I'm not sure whether a single approach would serve well
here for all the potentially supported file systems, so this would have to be
properly adjusted (taking the threshold levels into consideration as well).
And still, it would require some form of synchronization with the tracked fs so
that this 'detection' is not performed unnecessarily (e.g. while the fs remains
frozen).
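
To make the background-detection idea a bit more concrete, here is a rough
userspace sketch of what a periodic check might look like. It is only an
illustration, not the RFC code: it uses statvfs(3) as a stand-in for the
in-kernel ->statfs call, and every name in it (thresh_state, thresh_check,
thresh_poll, the EV_* flags) is hypothetical. It also assumes that the
"upper" level is the early warning and the "lower" level the critical one,
both expressed in available blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical event flags; these names do not come from the RFC. */
#define EV_UPPER_REACHED 0x1u /* available blocks fell to/below upper level */
#define EV_LOWER_REACHED 0x2u /* available blocks fell to/below lower level */
#define EV_UPPER_REARMED 0x4u /* available blocks rose above upper level    */
#define EV_LOWER_REARMED 0x8u /* available blocks rose above lower level    */

/* Per-filesystem threshold state (assumed layout, for illustration). */
struct thresh_state {
	uint64_t lower;     /* critical level, in blocks      */
	uint64_t upper;     /* early-warning level, in blocks */
	bool lower_hit;     /* lower level currently tripped  */
	bool upper_hit;     /* upper level currently tripped  */
};

static void thresh_init(struct thresh_state *t, uint64_t lower, uint64_t upper)
{
	assert(lower <= upper);
	t->lower = lower;
	t->upper = upper;
	t->lower_hit = false;
	t->upper_hit = false;
}

/* Compare one sample against one level; report only state changes. */
static unsigned int check_level(uint64_t avail, uint64_t level, bool *hit,
				unsigned int reached, unsigned int rearmed)
{
	if (avail <= level && !*hit) {
		*hit = true;
		return reached;
	}
	if (avail > level && *hit) {
		*hit = false;
		return rearmed;
	}
	return 0;
}

/* Pure decision step: returns a bitmask of events for this sample. */
static unsigned int thresh_check(struct thresh_state *t, uint64_t avail)
{
	return check_level(avail, t->upper, &t->upper_hit,
			   EV_UPPER_REACHED, EV_UPPER_REARMED) |
	       check_level(avail, t->lower, &t->lower_hit,
			   EV_LOWER_REACHED, EV_LOWER_REARMED);
}

/*
 * One polling step, meant to be run every few seconds by some periodic
 * worker. Returns 0 on success and -1 if statvfs() fails.
 */
static int thresh_poll(struct thresh_state *t, const char *path,
		       unsigned int *events)
{
	struct statvfs st;

	if (statvfs(path, &st) != 0)
		return -1;
	*events = thresh_check(t, (uint64_t)st.f_bavail);
	return 0;
}
```

Since only the transitions are reported, the accuracy loss is bounded by
how much the free space can change within a single polling interval, which
is exactly the tuning knob in question.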

There is also an idea of using an interface resembling a stackable fs:
a transparent file system layered on top of the tracked one
(solely for tracking purposes). This would simplify handling the trace
object's lifetime - no more list of registered traces.
It would also give a way of tracking (to some extent) changes in the amount
of available space, which, combined with a tweaked background check, could give
a solution with less performance overhead than the original one.
I'll try this one and see how it goes.

Thank you for your feedback so far - I really appreciate it.


Best Regards
Beata 


