All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Caleb Connolly <caleb.connolly@linaro.org>
To: mwleeds@mailtundra.com
Cc: u-boot@lists.denx.de
Subject: Re: [PATCH 3/5] zfs: Fix unaligned read of uint64
Date: Thu, 25 Apr 2024 13:57:26 +0200	[thread overview]
Message-ID: <53b87b44-e87a-4baf-b104-64e62c47263e@linaro.org> (raw)
In-Reply-To: <yMP4wWXZMjDL0Xi9eeSdNuUBW7pE82Ie-UoUMbNBa7K1Vckv8A4da_KjBUlGkpeQfQsADCJ8z73FxL9AlyQjSJ41TJM1ohjNHzAStZpdhUk=@mailtundra.com>



On 25/04/2024 06:02, mwleeds@mailtundra.com wrote:
> Hi Caleb,
> 
> Thanks for this interesting feedback. I saw these patches were already merged
> since you sent this but I added a few thoughts below anyway.

Ah, a shame.. It would be good to find a scalable solution to this!
> 
> On Friday, April 12th, 2024 at 11:50 AM, Caleb Connolly <caleb.connolly@linaro.org> wrote:
> 
>> Hi Phaedrus,
>>
>> On 07/04/2024 03:47, mwleeds@mailtundra.com wrote:
>>
>>> Without this patch, when trying to boot zfs using U-Boot on a Jetson TX2
>>> NX (which is aarch64), I get a CPU reset error like so:
>>>
>>> "Synchronous Abort" handler, esr 0x96000021
>>> elr: 00000000800c9000 lr : 00000000800c8ffc (reloc)
>>> elr: 00000000fff77000 lr : 00000000fff76ffc
>>> x0 : 00000000ffb40f04 x1 : 0000000000000000
>>> x2 : 000000000000000a x3 : 0000000003100000
>>> x4 : 0000000003100000 x5 : 0000000000000034
>>> x6 : 00000000fff9cc6e x7 : 000000000000000f
>>> x8 : 00000000ff7f84a0 x9 : 0000000000000008
>>> x10: 00000000ffb40f04 x11: 0000000000000006
>>> x12: 000000000001869f x13: 0000000000000001
>>> x14: 00000000ff7f84bc x15: 0000000000000010
>>> x16: 0000000000002080 x17: 00000000001fffff
>>> x18: 00000000ff7fbdd8 x19: 00000000ffb405f8
>>> x20: 00000000ffb40dd0 x21: 00000000fffabe5e
>>> x22: 000000ea77940000 x23: 00000000ffb42090
>>> x24: 0000000000000000 x25: 0000000000000000
>>> x26: 0000000000000000 x27: 0000000000000000
>>> x28: 0000000000bab10c x29: 00000000ff7f85f0
>>>
>>> Code: d00001a0 9103a000 94006ac6 f9401ba0 (f9400000)
>>> Resetting CPU ...
>>>
>>> This happens when be64_to_cpu() is called on a value that exists at a
>>> memory address that's 4 byte aligned but not 8 byte aligned (e.g. an
>>> address ending in 04). The call stack where that happens is:
>>> check_pool_label() ->
>>> zfs_nvlist_lookup_uint64(vdevnvlist, ZPOOL_CONFIG_ASHIFT,...) ->
>>> be64_to_cpu()
>>
>>
>> This is very odd, aarch64 doesn't generally have these restrictions. I
>> got a bit nerdsniped when I saw this so I did some digging and figured this:
>>
>> 1. Your abort exception doesn't include the FAR_ELx register (which
>> should contain the address that was being accessed when the abort
>> occured). This means your board is running in EL3.
>> 2. It turns out there is an "A" flag in the SCTLR_ELx register, when set
>> this flag causes a fault when trying to load from an address that isn't
>> aligned to the size of the data element (see "A, bit" on
>> https://developer.arm.com/documentation/ddi0595/2021-06/AArch64-Registers/SCTLR-EL3--System-Control-Register--EL3-
>>
>> I'm not sure who's in the "wrong" here, maybe the driver should avoid
>> unaligned accesses? But then again, I don't think you should be running
>> a ZFS driver in EL3.
>>
>> I'm not familiar with the Jetson Nano, but maybe there's a documented
>> way to run U-Boot so that it isn't executing in EL3? Or if not you could
>> also try unsetting the A flag.
> 
> I may look into this if I get a chance. However if I write some assembly code
> to change the execution level or unset the A flag, I worry that the code would
> work fine on the hardware I have (Jetson TX2 NX) but behave differently on

Well, unsetting the A flag might arguably have some niche negative 
outcomes, but I really struggle to see how this is a driver bug to be 
honest...
> another platform. And I don't think I can easily set up testing environments
> with u-boot + zfs on different platforms to find out.

fwiw, if you do intend to investigate this further, I'd be happy to 
validate your assumptions on a Qualcomm board (where we are rather 
boring and running in EL1 or EL2).
> 
>>
>> If this really is something to fix in the driver, I don't think
>> hotpatching every unaligned access with a malloc() is the right solution.
>>
> 
> I'm certainly open to other ideas. The difficulty is the data structure we're
> parsing in this file is read from disk and it's only 4-byte aligned.

According to the ARM docs, the issue is that you're loading an 8-byte 
value from a 4-byte aligned address. If you split it into two 4-byte 
reads (for the upper and lower words) then it should work fine. This 
would avoid the malloc at any rate.

Thanks and regards,
> 
>>> Signed-off-by: Phaedrus Leeds mwleeds@mailtundra.com
>>> Tested-by: Phaedrus Leeds mwleeds@mailtundra.com
>>
>>
>> regarding your question about re-sending to remove these tags, I'd say
>> probably yes, and especially if you're going to send a new revision anyway.
>>
>> fwiw you seem to have gotten pretty much everything else about the patch
>> submission process spot on :)
>>
>> Kind regards,
>>
>>> ---
>>> fs/zfs/zfs.c | 18 ++++++++++++++++++
>>> 1 file changed, 18 insertions(+)
>>>
>>> diff --git a/fs/zfs/zfs.c b/fs/zfs/zfs.c
>>> index 61d58fce68..9a50deac18 100644
>>> --- a/fs/zfs/zfs.c
>>> +++ b/fs/zfs/zfs.c
>>> @@ -1552,35 +1552,53 @@ nvlist_find_value(char *nvlist, char *name, int valtype, char **val,
>>> if (nelm_out)
>>> *nelm_out = nelm;
>>> return 1;
>>> }
>>>
>>> nvlist += encode_size; /* goto the next nvpair */
>>> }
>>> return 0;
>>> }
>>>
>>> +int is_word_aligned_ptr(void *ptr) {
>>> + return ((uintptr_t)ptr & (sizeof(void *) - 1)) == 0;
>>> +}
>>> +
>>> int
>>> zfs_nvlist_lookup_uint64(char *nvlist, char *name, uint64_t *out)
>>> {
>>> char *nvpair;
>>> size_t size;
>>> int found;
>>>
>>> found = nvlist_find_value(nvlist, name, DATA_TYPE_UINT64, &nvpair, &size, 0);
>>> if (!found)
>>> return 0;
>>> if (size < sizeof(uint64_t)) {
>>> printf("invalid uint64\n");
>>> return ZFS_ERR_BAD_FS;
>>> }
>>>
>>> + /* On arm64, calling be64_to_cpu() on a value stored at a memory address
>>> + * that's not 8-byte aligned causes the CPU to reset. Avoid that by copying the
>>> + * value somewhere else if needed.
>>> + */
>>> + if (!is_word_aligned_ptr((void *)nvpair)) {
>>> + uint64_t *alignedptr = malloc(sizeof(uint64_t));
>>> + if (!alignedptr)
>>> + return 0;
>>> + memcpy(alignedptr, nvpair, sizeof(uint64_t));
>>> + *out = be64_to_cpu(*alignedptr);
>>> + free(alignedptr);
>>> + return 1;
>>> + }
>>> +
>>> out = be64_to_cpu((uint64_t *) nvpair);
>>> return 1;
>>> }
>>>
>>> char *
>>> zfs_nvlist_lookup_string(char *nvlist, char *name)
>>> {
>>> char *nvpair;
>>> char *ret;
>>> size_t slen;
>>
>>
>> --
>> // Caleb (they/them)

-- 
// Caleb (they/them)

  reply	other threads:[~2024-04-25 11:57 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-07  1:47 [PATCH 0/5] zfs: Fix zfs support on aarch64 mwleeds
2024-04-07  1:47 ` [PATCH 1/5] zfs: Fix malloc() success check mwleeds
2024-04-07 11:22   ` Igor Opaniuk
2024-04-12 15:54     ` mwleeds
2024-04-07  1:47 ` [PATCH 2/5] zfs: Add a comment to clarify nvlist memory layout mwleeds
2024-04-07  1:47 ` [PATCH 3/5] zfs: Fix unaligned read of uint64 mwleeds
2024-04-12 18:50   ` Caleb Connolly
2024-04-12 18:57     ` Caleb Connolly
2024-04-25  4:02     ` mwleeds
2024-04-25 11:57       ` Caleb Connolly [this message]
2024-05-08 14:54         ` Phaedrus Leeds
2024-05-08 20:04         ` mwleeds
2024-04-07  1:47 ` [PATCH 4/5] zfs: Fix return value of fs_devread() mwleeds
2024-04-07  1:47 ` [PATCH 5/5] zfs: Fix zfs_read() to actually work mwleeds
2024-04-17 19:15 ` [PATCH 0/5] zfs: Fix zfs support on aarch64 Tom Rini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53b87b44-e87a-4baf-b104-64e62c47263e@linaro.org \
    --to=caleb.connolly@linaro.org \
    --cc=mwleeds@mailtundra.com \
    --cc=u-boot@lists.denx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.