NVDIMM Device and Persistent Memory development
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Dan Williams <dan.j.williams@intel.com>,
	Arnd Bergmann <arnd@arndb.de>, Dave Chinner <david@fromorbit.com>
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-mm@kvack.org, linux-arch@vger.kernel.org,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Matthew Wilcox <willy@infradead.org>,
	Russell King <linux@armlinux.org.uk>,
	nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, dm-devel@lists.linux.dev
Subject: Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
Date: Fri, 2 Feb 2024 15:02:50 -0500	[thread overview]
Message-ID: <5e838147-524c-40e5-b106-e388bf4e549b@efficios.com> (raw)
In-Reply-To: <65bd457460fb1_719322942@dwillia2-mobl3.amr.corp.intel.com.notmuch>

On 2024-02-02 14:41, Dan Williams wrote:
> Mathieu Desnoyers wrote:
>> On 2024-02-02 12:37, Dan Williams wrote:
>>> Mathieu Desnoyers wrote:
>> [...]
>>>>
>>>
>>>> The alternative route I intend to take is to audit all callers
>>>> of alloc_dax() and make sure they all save the alloc_dax() return
>>>> value in a struct dax_device * local variable first for the sake
>>>> of checking for IS_ERR(). This will leave the xyz->dax_dev pointer
>>>> initialized to NULL in the error case and simplify the rest of
>>>> error checking.
>>>
>>> I could maybe get on board with that, but it needs a comment somewhere
>>> about the asymmetric subtlety.
>>
>> Is this "somewhere" at every alloc_dax() call site, or do you have
>> something else in mind ?
> 
> At least kill_dax() should mention the asymmetry I think.

Here is what I intend to add:

  * Note, because alloc_dax() returns an ERR_PTR() on error, callers
  * typically store its result into a local variable in order to check
  * the result. Therefore, care must be taken to populate the struct
  * device dax_dev field make sure the dax_dev is not leaked.

> 
>>> The real question is what to do about device-dax. I *think* it is not
>>> affected by cpu_dcache aliasing because it never accesses user mappings
>>> through a kernel alias. I doubt device-dax is in use on these platforms,
>>> but we might need another fixup for that if someone screams about the
>>> alloc_dax() behavior change making them lose device-dax access.
>>
>> By "device-dax", I understand you mean drivers/dax/Kconfig:DEV_DAX.
>>
>> Based on your analysis, is alloc_dax() still the right spot where
>> to place this runtime check ? Which call sites are responsible
>> for invoking alloc_dax() for device-dax ?
> 
> That is in devm_create_dev_dax().
> 
>> If we know which call sites do not intend to use the kernel linear
>> mapping, we could introduce a flag (or a new variant of the alloc_dax()
>> API) that would either enforce or skip the check.
> 
> Hmmm, it looks like there is already a natural flag for that. If
> alloc_dax() is passed a NULL operations pointer it means there are no
> kernel usages of the aliased mapping. That actually fits rather nicely.

Good, I was reaching the same conclusion when I received your reply.
I'll do that. It ends up being:

         /*
          * Unavailable on architectures with virtually aliased data caches,
          * except for device-dax (NULL operations pointer), which does
          * not use aliased mappings from the kernel.
          */
         if (ops && cpu_dcache_is_aliasing())
                 return ERR_PTR(-EOPNOTSUPP);

> 
> [..]
>>>>> @@ -804,6 +808,15 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
>>>>>     	if (!IS_ENABLED(CONFIG_FUSE_DAX))
>>>>>     		return 0;
>>>>>     
>>>>> +	dax_dev = alloc_dax(fs, &virtio_fs_dax_ops);
>>>>> +	if (IS_ERR(dax_dev)) {
>>>>> +		int rc = PTR_ERR(dax_dev);
>>>>> +
>>>>> +		if (rc == -EOPNOTSUPP)
>>>>> +			return 0;
>>>>> +		return rc;
>>>>> +	}
>>>>
>>>> What is gained by moving this allocation here ?
>>>
>>> The gain is to fail early in virtio_fs_setup_dax() since the fundamental
>>> dependency of alloc_dax() success is not met. For example why let the
>>> setup progress to devm_memremap_pages() when alloc_dax() is going to
>>> return ERR_PTR(-EOPNOTSUPP).
>>
>> What I don't know is whether there is a dependency requiring to do
>> devm_request_mem_region(), devm_kzalloc(), devm_memremap_pages()
>> before calling alloc_dax() ?
>>
>> Those 3 calls are used to populate:
>>
>>           fs->window_phys_addr = (phys_addr_t) cache_reg.addr;
>>           fs->window_len = (phys_addr_t) cache_reg.len;
>>
>> and then alloc_dax() takes "fs" as private data parameter. So it's
>> unclear to me whether we can swap the invocation order. I suspect
>> that it is not an issue because it is only used to populate
>> dax_dev->private, but I prefer to confirm this with you just to be
>> on the safe side.
> 
> Thanks for that. All of those need to be done before the fs goes live
> later in virtio_device_ready(), but before that point nothing should be
> calling into virtio_fs_dax_ops, so as far as I can see it is safe to
> change the order.

Sounds good, I'll do that.

I will soon be ready to send out a RFC v4, which is still only
compiled-tested. Do you happen to have some kind of test suite
you can use to automate some of the runtime testing ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


  reply	other threads:[~2024-02-02 20:02 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-31 16:25 [RFC PATCH v3 0/4] Introduce cpu_dcache_is_aliasing() to fix DAX regression Mathieu Desnoyers
2024-01-31 16:25 ` [RFC PATCH v3 1/4] dm: Treat alloc_dax failure as non-fatal Mathieu Desnoyers
2024-01-31 16:25 ` [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime Mathieu Desnoyers
2024-01-31 21:02   ` Dan Williams
2024-01-31 21:39     ` Mathieu Desnoyers
2024-01-31 22:18       ` Dan Williams
2024-02-01 15:44         ` Mathieu Desnoyers
2024-02-02 14:40           ` Mathieu Desnoyers
2024-02-02 16:32           ` Mathieu Desnoyers
2024-02-02 17:37           ` Dan Williams
2024-02-02 19:29             ` Mathieu Desnoyers
2024-02-02 19:41               ` Dan Williams
2024-02-02 20:02                 ` Mathieu Desnoyers [this message]
2024-02-02 20:14                   ` Dan Williams
2024-02-02 20:18                     ` Mathieu Desnoyers
2024-01-31 16:25 ` [RFC PATCH v3 3/4] Introduce cpu_dcache_is_aliasing() across all architectures Mathieu Desnoyers
2024-01-31 17:17   ` Christoph Hellwig
2024-01-31 17:59     ` Mathieu Desnoyers
2024-01-31 20:42       ` Dan Williams
2024-01-31 16:25 ` [RFC PATCH v3 4/4] dax: Fix incorrect list of data cache aliasing architectures Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5e838147-524c-40e5-b106-e388bf4e549b@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=david@fromorbit.com \
    --cc=dm-devel@lists.linux.dev \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@armlinux.org.uk \
    --cc=nvdimm@lists.linux.dev \
    --cc=torvalds@linux-foundation.org \
    --cc=vishal.l.verma@intel.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).