From: Jani Nikula <jani.nikula@linux.intel.com>
To: Jonathan Corbet <corbet@lwn.net>,
	Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: linux-doc@vger.kernel.org, Ricardo Ribalda <ribalda@chromium.org>,
	Tiffany Lin <tiffany.lin@mediatek.com>,
	Andrew-CT Chen <andrew-ct.chen@mediatek.com>,
	Yunfei Dong <yunfei.dong@mediatek.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Matthias Brugger <matthias.bgg@gmail.com>,
	AngeloGioacchino Del Regno
	<angelogioacchino.delregno@collabora.com>,
	Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	Hans Verkuil <hverkuil@xs4all.nl>,
	Kieran Bingham <kieran.bingham@ideasonboard.com>,
	Bin Liu <bin.liu@mediatek.com>,
	Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>,
	Philipp Zabel <p.zabel@pengutronix.de>,
	Stanimir Varbanov <stanimir.k.varbanov@gmail.com>,
	Vikash Garodia <quic_vgarodia@quicinc.com>,
	Bryan O'Donoghue <bryan.odonoghue@linaro.org>,
	Bjorn Andersson <andersson@kernel.org>,
	Konrad Dybcio <konrad.dybcio@linaro.org>,
	Sylwester Nawrocki <s.nawrocki@samsung.com>,
	Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>,
	Alim Akhtar <alim.akhtar@samsung.com>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Andrzej Hajda <andrzej.hajda@intel.com>,
	Bingbu Cao <bingbu.cao@intel.com>,
	Tianshu Qiu <tian.shu.qiu@intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Neil Armstrong <neil.armstrong@linaro.org>,
	Kevin Hilman <khilman@baylibre.com>,
	Jerome Brunet <jbrunet@baylibre.com>,
	Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Subject: Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
Date: Tue, 06 Feb 2024 13:20:34 +0200
Message-ID: <87wmrhdekd.fsf@intel.com>
In-Reply-To: <874jemtq2f.fsf@meer.lwn.net>

On Mon, 05 Feb 2024, Jonathan Corbet <corbet@lwn.net> wrote:
> Sakari Ailus <sakari.ailus@linux.intel.com> writes:
>
>>> Sigh ... seeing more indecipherable regexes added to kernel-doc is like
>>> seeing another load of plastic bags dumped into the ocean...  it doesn't
>>> change the basic situation, but it's still sad.
>>> 
>>> Oh well, applied, thanks.
>>
>> Thanks. I have to say I feel the same...
>>
>> Regexes aren't great for parsing C, that's for sure. :-I But what are the
>> options? Write a proper parser for (a subset of) C?
>
> Every now and then I've pondered on this a bit.  There are parsers out
> there, of course; we could consider using something like tree-sitter.
> There's just two little problems:
>
> - That's a massive dependency to drag into the docs build that seems
>   unlikely to speed things up.
>
> - kernel-doc is really two parsers - one for C code, one for the
>   comment syntax.  Strangely, nobody has written a grammar for this
>   combination.
>
> A suitably motivated developer could probably create a C+kerneldoc
> grammar that would let us make a rock-solid, tree-sitter-based parser
> that would be mostly maintained by somebody else.  But that doesn't get
> us around the "adding a big dependency" problem.

After we'd made kernel-doc the perl script to produce rst, and
kernel-doc the Sphinx extension to consume it, I pondered the same
questions, and wondered what it should all look like if you could just
ignore all the kernel legacy.

I've told the story before, but what I ended up with was:

- Use Python bindings for libclang to parse the source code. Clang is
  obviously a big dependency, but nowadays more people have it already
  installed, and the Python part on top is negligible.

- Don't parse the contents of the comments, at all. Treat it as pure
  rst, and let Sphinx handle it.

That's pretty much how Hawkmoth [1] got started. I never even considered
it for the kernel, because it would've been:

> <back to work now...>

Although Mesa now uses it to produce stuff like [2].

A suitably motivated developer could probably get it to work with the
kernel... Nowadays you could use Sphinx mechanisms to extend it to
convert kernel-doc style comments to rst.
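
To give a rough idea of what such a conversion step might look like,
here is a minimal Python sketch. It is not Hawkmoth's actual extension
mechanism; the function name kerneldoc_to_rst() is made up for
illustration, and it only handles the simple "@name: description" lines
of a kernel-doc comment, turning them into rst ":param name:" fields:

```python
import re

def kerneldoc_to_rst(comment: str) -> str:
    """Hypothetical helper: strip comment markers, map @name: to :param name:."""
    body = []
    for line in comment.strip().splitlines():
        # Drop the opening "/**" and closing "*/" delimiters first ...
        line = re.sub(r"^\s*/\*\*\s*|\s*\*/\s*$", "", line)
        # ... then the leading " * " decoration on continuation lines.
        line = re.sub(r"^\s*\*\s?", "", line)
        body.append(line.rstrip())

    out = []
    for line in body:
        m = re.match(r"@(\w+):\s*(.*)", line)
        if m:
            # An rst field list has to be separated from a preceding paragraph.
            if out and out[-1] and not out[-1].startswith(":param "):
                out.append("")
            out.append(f":param {m.group(1)}: {m.group(2)}")
        else:
            out.append(line)
    return "\n".join(out).strip()


example = """
/**
 * frob() - Frobnicate a widget
 * @w: the widget
 * @flags: behaviour flags
 *
 * Returns zero on success.
 */
"""
print(kerneldoc_to_rst(example))
```

Everything else in the comment is passed through untouched, which is
the point: the rest is left for Sphinx to interpret as rst.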

There are a number of issues that might make it difficult, though:

- kernel-doc parses extra magic stuff like EXPORT_SYMBOL().

- all the special casing in kernel-doc dump_struct(), like

	$members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;

- it's a compiler, so you'll need to pass suitable compiler options,
  which might be difficult with all the per-directory kbuild magic

- might end up being slow, because it's a compiler (although there's
  some caching to avoid parsing the same file multiple times like
  kernel-doc currently does)
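
For anyone trying to decode the STRUCT_GROUP substitution above: judging
from the regex alone, it replaces "STRUCT_GROUP(<body>) <anything>;"
with just <body>, where the parentheses in <body> may nest. Python's
standard re module has no recursive subpatterns like (?1), so the sketch
below does the same job with an explicit depth counter. The function
name flatten_struct_group() is made up, and this is only my reading of
that one line, not a port of dump_struct():

```python
import re

def flatten_struct_group(members: str) -> str:
    """Replace each STRUCT_GROUP(<body>) <attrs>; with just <body>."""
    out = []
    pos = 0
    for m in re.finditer(r"\bSTRUCT_GROUP\(", members):
        if m.start() < pos:
            continue  # inside a group that was already consumed
        # Walk forward to the parenthesis that closes the opening one.
        depth = 1
        i = m.end()
        while i < len(members) and depth:
            if members[i] == "(":
                depth += 1
            elif members[i] == ")":
                depth -= 1
            i += 1
        # Like [^;]*; in the regex: swallow everything up to the next ';'.
        end = members.find(";", i)
        if depth or end < 0:
            continue  # unbalanced or unterminated: leave the text alone
        out.append(members[pos:m.start()])
        out.append(members[m.end():i - 1])  # text between the outer parens
        pos = end + 1
    out.append(members[pos:])
    return "".join(out)


src = "int a; STRUCT_GROUP(int b; int (*fn)(void);) __packed; int c;"
print(flatten_struct_group(src))
```

A real parser would see the macro expansion directly and need none of
this, but then you are back to needing the right compiler options.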

Anyway, I think it would be important to separate the parsing of C and
parsing of comments. It's kind of in the same bag in kernel-doc. But if
you want to cross-check, say, the parameters/members against the
documentation, you'll need the C AST while parsing the comments. And the
preprocessor tricks employed in the kernel are probably going to be a
nightmare.
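
As a sketch of the cross-check itself, assume the C parser has already
handed you the parameter names from the AST (here just a plain list).
The comment side then only needs the "@name:" lines. check_params() is a
hypothetical name, and the pattern ignores kernel-doc extras such as
"@...:" or nested "@a.b:" members:

```python
import re

def check_params(declared, comment):
    """Return (undocumented, excess) names for one function or struct."""
    # Names documented as " * @name: ..." lines in the comment block.
    documented = re.findall(r"^\s*\*\s*@(\w+):", comment, flags=re.MULTILINE)
    missing = [name for name in declared if name not in documented]
    excess = [name for name in documented if name not in declared]
    return missing, excess


comment = """
/**
 * frob() - Frobnicate a device
 * @dev: the device
 * @mode: operating mode
 */
"""
# Pretend the AST reported these parameters for frob().
missing, excess = check_params(["dev", "flags"], comment)
print("undocumented:", missing)
print("not in prototype:", excess)
```

The comparison is trivial; the hard part is getting a trustworthy
"declared" list out of preprocessor-heavy kernel code.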

What I'm saying is, while Hawkmoth is perhaps not the right solution,
using any generic C parser will face some of the same issues regardless.


BR,
Jani.

[1] https://github.com/jnikula/hawkmoth/
[2] https://docs.mesa3d.org/isl/index.html

-- 
Jani Nikula, Intel
