Linux-PCI Archive mirror
From: Josselin Mouette <josselin.mouette@exaion.com>
To: linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <helgaas@kernel.org>
Subject: [Regression] [PCI/VPD] Possible memory corruption caused by invalid VPD data (commit found)
Date: Thu, 07 Mar 2024 17:07:50 +0100
Message-ID: <aaea0b30c35bb73b947727e4b3ec354d6b5c399c.camel@exaion.com>

We’ve been observing a subtle kernel bug on a few servers after kernel
upgrades (starting from 5.15 and persisting in 6.8-rc1). The bug arises
only on machines with Mellanox Connect-X 3 cards, and the symptom is
RabbitMQ disconnections caused by packet loss on the system Ethernet
card (Intel I350). Replacing the I350 with an 82580 produced the exact
same symptoms.

A bisect led to this change:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5fe204eab174fd474227f23fd47faee4e7a6c000

Reverting the patch and adding more warnings (patch follows) allowed us
to identify that the VPD data in the Connect-X 3 firmware is missing
VPD_STIN_END, so the scan for the end tag only returns at a 32k offset.
I presume, though, that the VPD data is already incorrect far before
that 32k limit.
[   43.854869] mlx4_core 0000:16:00.0: missing VPD_STIN_END at offset 32769
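
To illustrate what a missing end tag means, here is a minimal sketch
of a VPD resource walk over a raw dump (for instance the contents of
the device's sysfs "vpd" file). It is not the kernel code; the function
name vpd_scan and its return convention are made up for this example,
and it only assumes the standard small/large resource tag layout.

```python
VPD_MAX_SIZE = 32768  # VPD addresses are 15 bits wide


def vpd_scan(data: bytes, limit: int = VPD_MAX_SIZE):
    """Walk the resource tags of a VPD image (hypothetical helper).

    Returns (offset, found_end). When found_end is False, offset is
    where the walk stopped and may lie past the end of the buffer.
    """
    off = 0
    n = min(len(data), limit)
    while off < n:
        tag = data[off]
        if tag & 0x80:
            # Large resource: 1 tag byte + 16-bit little-endian length.
            if off + 3 > n:
                break
            size = data[off + 1] | (data[off + 2] << 8)
            off += 3 + size
        else:
            # Small resource: item name in bits 6..3, length in bits 2..0.
            if (tag >> 3) & 0x0F == 0x0F:  # end tag (byte 0x78)
                return off + 1, True
            off += 1 + (tag & 0x07)
    return off, False


# Well-formed image: ID string "abc", empty VPD-R, end tag.
good = bytes([0x82, 0x03, 0x00]) + b"abc" + bytes([0x90, 0x00, 0x00, 0x78])
print(vpd_scan(good))       # (10, True)
# Same image without the end tag: the walk runs off the data.
print(vpd_scan(good[:-1]))  # (9, False)
```

With no end tag, nothing bounds the walk except the size limit, so
whatever bytes follow the last valid resource get interpreted as tags.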

Bjorn advised (thanks!) looking for which process is reading that VPD
data. In our case it is libvirtd, and enabling debugging in libvirtd
turned out to be a very interesting exercise, since it starts spewing
gabajillions of VPD errors, especially in the Intel 82580 data.

That igb data does not look corrupt when we revert the change mentioned
earlier, and we don’t see the packet loss either.

I’m not proficient in kernel or PCI internals, but a plausible
explanation is that incorrect handling of the returned data causes an
out-of-bounds memory write, which would mean a bug somewhere else, still
to be found.

If this hypothesis is correct, there are security implications, since a
specially crafted PCI firmware could elevate privileges to kernel
level. In any case, it does not look sensible to return data that is
known to be incorrect.

-- 
Josselin MOUETTE
Infrastructure & Security architect
EXAION


