* [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
@ 2024-01-31 8:49 Sakari Ailus
2024-02-05 17:04 ` Jonathan Corbet
0 siblings, 1 reply; 13+ messages in thread
From: Sakari Ailus @ 2024-01-31 8:49 UTC (permalink / raw)
To: Jonathan Corbet, linux-doc
Cc: Ricardo Ribalda, Sakari Ailus, Tiffany Lin, Andrew-CT Chen,
Yunfei Dong, Mauro Carvalho Chehab, Matthias Brugger,
AngeloGioacchino Del Regno, Laurent Pinchart, Hans Verkuil,
Kieran Bingham, Bin Liu, Ezequiel Garcia, Philipp Zabel,
Stanimir Varbanov, Vikash Garodia, Bryan O'Donoghue,
Bjorn Andersson, Konrad Dybcio, Sylwester Nawrocki,
Krzysztof Kozlowski, Alim Akhtar, Marek Szyprowski, Andrzej Hajda,
Bingbu Cao, Tianshu Qiu, Greg Kroah-Hartman, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
In a rather unusual arrangement in include/media/v4l2-vp9.h struct
v4l2_vp9_frame_symbol_counts has fields that are arrays of pointers, not a
pointer to an array, which is what's usually done.
Add support for such arrays of pointers to kernel-doc.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Ricardo Ribalda <ribalda@chromium.org>
---
No change since the RFC, just added the acks.
scripts/kernel-doc | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index e8aefd258a29..23c91b11585a 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1509,6 +1509,15 @@ sub create_parameterlist($$$$) {
$type =~ s/([^\(]+\(\*?)\s*$param/$1/;
save_struct_actual($param);
push_parameter($param, $type, $arg, $file, $declaration_name);
+ } elsif ($arg =~ m/\(.+\)\s*\[/) {
+ # array-of-pointers
+ $arg =~ tr/#/,/;
+ $arg =~ m/[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)/;
+ $param = $1;
+ $type = $arg;
+ $type =~ s/([^\(]+\(\*?)\s*$param/$1/;
+ save_struct_actual($param);
+ push_parameter($param, $type, $arg, $file, $declaration_name);
} elsif ($arg) {
$arg =~ s/\s*:\s*/:/g;
$arg =~ s/\s*\[/\[/g;
--
2.39.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
2024-01-31 8:49 [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields Sakari Ailus
@ 2024-02-05 17:04 ` Jonathan Corbet
2024-02-05 21:30 ` Sakari Ailus
0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Corbet @ 2024-02-05 17:04 UTC (permalink / raw)
To: Sakari Ailus, linux-doc
Cc: Ricardo Ribalda, Sakari Ailus, Tiffany Lin, Andrew-CT Chen,
Yunfei Dong, Mauro Carvalho Chehab, Matthias Brugger,
AngeloGioacchino Del Regno, Laurent Pinchart, Hans Verkuil,
Kieran Bingham, Bin Liu, Ezequiel Garcia, Philipp Zabel,
Stanimir Varbanov, Vikash Garodia, Bryan O'Donoghue,
Bjorn Andersson, Konrad Dybcio, Sylwester Nawrocki,
Krzysztof Kozlowski, Alim Akhtar, Marek Szyprowski, Andrzej Hajda,
Bingbu Cao, Tianshu Qiu, Greg Kroah-Hartman, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
Sakari Ailus <sakari.ailus@linux.intel.com> writes:
> In a rather unusual arrangement in include/media/v4l2-vp9.h struct
> v4l2_vp9_frame_symbol_counts has fields that are arrays of pointers, not a
> pointer to an array, which is what's usually done.
>
> Add support for such arrays of pointers to kernel-doc.
>
> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> Acked-by: Randy Dunlap <rdunlap@infradead.org>
> Tested-by: Randy Dunlap <rdunlap@infradead.org>
> Tested-by: Ricardo Ribalda <ribalda@chromium.org>
> ---
> No change since the RFC, just added the acks.
>
> scripts/kernel-doc | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index e8aefd258a29..23c91b11585a 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -1509,6 +1509,15 @@ sub create_parameterlist($$$$) {
> $type =~ s/([^\(]+\(\*?)\s*$param/$1/;
> save_struct_actual($param);
> push_parameter($param, $type, $arg, $file, $declaration_name);
> + } elsif ($arg =~ m/\(.+\)\s*\[/) {
> + # array-of-pointers
> + $arg =~ tr/#/,/;
> + $arg =~ m/[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)/;
> + $param = $1;
> + $type = $arg;
> + $type =~ s/([^\(]+\(\*?)\s*$param/$1/;
> + save_struct_actual($param);
> + push_parameter($param, $type, $arg, $file, $declaration_name);
> } elsif ($arg) {
Sigh ... seeing more indecipherable regexes added to kernel-doc is like
seeing another load of plastic bags dumped into the ocean... it doesn't
change the basic situation, but it's still sad.
Oh well, applied, thanks.
jon
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
2024-02-05 17:04 ` Jonathan Corbet
@ 2024-02-05 21:30 ` Sakari Ailus
2024-02-06 0:05 ` Jonathan Corbet
0 siblings, 1 reply; 13+ messages in thread
From: Sakari Ailus @ 2024-02-05 21:30 UTC (permalink / raw)
To: Jonathan Corbet
Cc: linux-doc, Ricardo Ribalda, Tiffany Lin, Andrew-CT Chen,
Yunfei Dong, Mauro Carvalho Chehab, Matthias Brugger,
AngeloGioacchino Del Regno, Laurent Pinchart, Hans Verkuil,
Kieran Bingham, Bin Liu, Ezequiel Garcia, Philipp Zabel,
Stanimir Varbanov, Vikash Garodia, Bryan O'Donoghue,
Bjorn Andersson, Konrad Dybcio, Sylwester Nawrocki,
Krzysztof Kozlowski, Alim Akhtar, Marek Szyprowski, Andrzej Hajda,
Bingbu Cao, Tianshu Qiu, Greg Kroah-Hartman, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
Hi Jon,
On Mon, Feb 05, 2024 at 10:04:18AM -0700, Jonathan Corbet wrote:
> Sakari Ailus <sakari.ailus@linux.intel.com> writes:
>
> > In a rather unusual arrangement in include/media/v4l2-vp9.h struct
> > v4l2_vp9_frame_symbol_counts has fields that are arrays of pointers, not a
> > pointer to an array, which is what's usually done.
> >
> > Add support for such arrays of pointers to kernel-doc.
> >
> > Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
> > Acked-by: Randy Dunlap <rdunlap@infradead.org>
> > Tested-by: Randy Dunlap <rdunlap@infradead.org>
> > Tested-by: Ricardo Ribalda <ribalda@chromium.org>
> > ---
> > No change since the RFC, just added the acks.
> >
> > scripts/kernel-doc | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> > index e8aefd258a29..23c91b11585a 100755
> > --- a/scripts/kernel-doc
> > +++ b/scripts/kernel-doc
> > @@ -1509,6 +1509,15 @@ sub create_parameterlist($$$$) {
> > $type =~ s/([^\(]+\(\*?)\s*$param/$1/;
> > save_struct_actual($param);
> > push_parameter($param, $type, $arg, $file, $declaration_name);
> > + } elsif ($arg =~ m/\(.+\)\s*\[/) {
> > + # array-of-pointers
> > + $arg =~ tr/#/,/;
> > + $arg =~ m/[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)/;
> > + $param = $1;
> > + $type = $arg;
> > + $type =~ s/([^\(]+\(\*?)\s*$param/$1/;
> > + save_struct_actual($param);
> > + push_parameter($param, $type, $arg, $file, $declaration_name);
> > } elsif ($arg) {
>
> Sigh ... seeing more indecipherable regexes added to kernel-doc is like
> seeing another load of plastic bags dumped into the ocean... it doesn't
> change the basic situation, but it's still sad.
>
> Oh well, applied, thanks.
Thanks. I have to say I feel the same...
Regexes aren't great for parsing C, that's for sure. :-I But what are the
options? Write a proper parser for (a subset of) C?
--
Regards,
Sakari Ailus
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
2024-02-05 21:30 ` Sakari Ailus
@ 2024-02-06 0:05 ` Jonathan Corbet
2024-02-06 3:50 ` scripts/kernel-doc parsing issues Randy Dunlap
2024-02-06 11:20 ` [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields Jani Nikula
0 siblings, 2 replies; 13+ messages in thread
From: Jonathan Corbet @ 2024-02-06 0:05 UTC (permalink / raw)
To: Sakari Ailus
Cc: linux-doc, Ricardo Ribalda, Tiffany Lin, Andrew-CT Chen,
Yunfei Dong, Mauro Carvalho Chehab, Matthias Brugger,
AngeloGioacchino Del Regno, Laurent Pinchart, Hans Verkuil,
Kieran Bingham, Bin Liu, Ezequiel Garcia, Philipp Zabel,
Stanimir Varbanov, Vikash Garodia, Bryan O'Donoghue,
Bjorn Andersson, Konrad Dybcio, Sylwester Nawrocki,
Krzysztof Kozlowski, Alim Akhtar, Marek Szyprowski, Andrzej Hajda,
Bingbu Cao, Tianshu Qiu, Greg Kroah-Hartman, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
Sakari Ailus <sakari.ailus@linux.intel.com> writes:
>> Sigh ... seeing more indecipherable regexes added to kernel-doc is like
>> seeing another load of plastic bags dumped into the ocean... it doesn't
>> change the basic situation, but it's still sad.
>>
>> Oh well, applied, thanks.
>
> Thanks. I have to say I feel the same...
>
> Regexes aren't great for parsing C, that's for sure. :-I But what are the
> options? Write a proper parser for (a subset of) C?
Every now and then I've pondered on this a bit. There are parsers out
there, of course; we could consider using something like tree-sitter.
There's just two little problems:
- That's a massive dependency to drag into the docs build that seems
unlikely to speed things up.
- kernel-doc is really two parsers - one for C code, one for the
comment syntax. Strangely, nobody has written a grammar for this
combination.
A suitably motivated developer could probably create a C+kerneldoc
grammer that would let us make a rock-solid, tree-sitter-based parser
that would be mostly maintained by somebody else. But that doesn't get
us around the "adding a big dependency" problem.
<back to work now...>
jon
^ permalink raw reply [flat|nested] 13+ messages in thread
* scripts/kernel-doc parsing issues
2024-02-06 0:05 ` Jonathan Corbet
@ 2024-02-06 3:50 ` Randy Dunlap
2025-02-14 3:15 ` Randy Dunlap
2024-02-06 11:20 ` [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields Jani Nikula
1 sibling, 1 reply; 13+ messages in thread
From: Randy Dunlap @ 2024-02-06 3:50 UTC (permalink / raw)
To: Jonathan Corbet, Sakari Ailus; +Cc: linux-doc, Mauro Carvalho Chehab
[reduced Cc: list]
[was: Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields]
On 2/5/24 16:05, Jonathan Corbet wrote:
> Sakari Ailus <sakari.ailus@linux.intel.com> writes:
>
>>> Sigh ... seeing more indecipherable regexes added to kernel-doc is like
>>> seeing another load of plastic bags dumped into the ocean... it doesn't
>>> change the basic situation, but it's still sad.
>>>
>>> Oh well, applied, thanks.
>>
>> Thanks. I have to say I feel the same...
>>
>> Regexes aren't great for parsing C, that's for sure. :-I But what are the
>> options? Write a proper parser for (a subset of) C?
>
> Every now and then I've pondered on this a bit. There are parsers out
> there, of course; we could consider using something like tree-sitter.
> There's just two little problems:
>
> - That's a massive dependency to drag into the docs build that seems
> unlikely to speed things up.
>
> - kernel-doc is really two parsers - one for C code, one for the
> comment syntax. Strangely, nobody has written a grammar for this
> combination.
>
> A suitably motivated developer could probably create a C+kerneldoc
> grammer that would let us make a rock-solid, tree-sitter-based parser
> that would be mostly maintained by somebody else. But that doesn't get
> us around the "adding a big dependency" problem.
>
> <back to work now...>
As I said here on the RFC patch from Sakari:
https://lore.kernel.org/all/aa94772b-7010-4bba-b099-d3b8fe1b97aa@infradead.org/
"Yet another kernel-doc bug. I have a list of 5 or 6 or 8 bugs that are
similar to this one, but I didn't have this one."
The patch to report Excess struct or union members has unearthed several
kernel-doc "parsing" problems.
I have not tried to fix any of these in scripts/kernel-doc yet. I might get
around to it, but it's not a high priority for me.
Examples:
1) drivers/slimbus/stream.c:49: warning: Excess struct member 'segdist_codes' description in 'segdist_code'
struct declaration and definition together. Also possible that the leading "static const"
confuses scripts/kernel-doc.
2) include/linux/spi/spi.h:246: warning: Function parameter or struct member 'cs_index_mask:SPI_CS_CNT_MAX' not described in 'spi_device'
include/linux/spi/spi.h:246: warning: Excess struct member 'cs_index_mask' description in 'spi_device'
scripts/kernel-doc handles some bit fields in structs successfully, so something is
different about this one.
3) fs/ntfs/compress.c:24: warning: cannot understand function prototype: 'typedef enum '
fs/ntfs/* has been removed in linux-next (still in mainline for a little while), but this
shows that scripts/kernel-doc does not handle a 'typedef enum' successfully.
4) drivers/misc/vmw_balloon.c:260: warning: Excess struct member 'reserved' description in 'vmballoon_batch_entry'
This may be the same problem as #2, with using bit fields in a struct.
5) drivers/base/power/runtime.c:362: warning: Excess function parameter 'dev' description in '__rpm_callback'
Confused by either the first function parameter (a function pointer) or the trailing
__releases() and __acquires() attributes.
6) drivers/md/bcache/request.c:309: warning: expecting prototype for bch_data_insert(). Prototype was for CLOSURE_CALLBACK() instead
and
fs/bcachefs/io_write.c:1558: warning: expecting prototype for bch2_write(). Prototype was for CLOSURE_CALLBACK() instead
CLOSURE_CALLBACK() and function parameters are confusing scripts/kernel-doc.
7) drivers/iio/adc/at91-sama5d2_adc.c:471: warning: Excess struct member 'adc_channels' description in 'at91_adc_platform'
Fixed by Sakari's patch. :)
8) drivers/pci/controller/pcie-iproc-msi.c:110: warning: Excess struct member 'reg_offsets' description in 'iproc_msi'
Fixed by Sakari's patch. :)
9) drivers/usb/gadget/udc/pch_udc.c:361: warning: Excess struct member 'stall' description in 'pch_udc_dev'
pch_udc.c:361: warning: Excess struct member 'prot_stall' description in 'pch_udc_dev'
pch_udc.c:361: warning: Excess struct member 'registered' description in 'pch_udc_dev'
pch_udc.c:361: warning: Excess struct member 'suspended' description in 'pch_udc_dev'
pch_udc.c:361: warning: Excess struct member 'connected' description in 'pch_udc_dev'
pch_udc.c:361: warning: Excess struct member 'vbus_session' description in 'pch_udc_dev'
pch_udc.c:361: warning: Excess struct member 'set_cfg_not_acked' description in 'pch_udc_dev'
pch_udc.c:361: warning: Excess struct member 'waiting_zlp_ack' description in 'pch_udc_dev'
All of these except @registered (which is just an Excess description) are declared with one
'unsigned' followed by a list of bit fields, which isn't kernel coding style but it is valid C.
or it might just be 'unsigned' without having a following 'int' that is the problem. I don't
know -- haven't looked yet.
10) Matthew Wilcox pointed out to me that commit 0d55d48b19ff is causing problems with
generated output. A few instances of using TAB or multiple spaces have been patched
recently, but there are others that are not being addressed. I don't have a list of these.
I don't know anything about tree-sitter, so if I were going to add a parser, I would
probably (foolishly?) first try using sparse and a sparse extension.
cheers.
--
#Randy
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
2024-02-06 0:05 ` Jonathan Corbet
2024-02-06 3:50 ` scripts/kernel-doc parsing issues Randy Dunlap
@ 2024-02-06 11:20 ` Jani Nikula
2024-02-06 15:13 ` Jonathan Corbet
2025-02-14 7:45 ` Mauro Carvalho Chehab
1 sibling, 2 replies; 13+ messages in thread
From: Jani Nikula @ 2024-02-06 11:20 UTC (permalink / raw)
To: Jonathan Corbet, Sakari Ailus
Cc: linux-doc, Ricardo Ribalda, Tiffany Lin, Andrew-CT Chen,
Yunfei Dong, Mauro Carvalho Chehab, Matthias Brugger,
AngeloGioacchino Del Regno, Laurent Pinchart, Hans Verkuil,
Kieran Bingham, Bin Liu, Ezequiel Garcia, Philipp Zabel,
Stanimir Varbanov, Vikash Garodia, Bryan O'Donoghue,
Bjorn Andersson, Konrad Dybcio, Sylwester Nawrocki,
Krzysztof Kozlowski, Alim Akhtar, Marek Szyprowski, Andrzej Hajda,
Bingbu Cao, Tianshu Qiu, Greg Kroah-Hartman, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
On Mon, 05 Feb 2024, Jonathan Corbet <corbet@lwn.net> wrote:
> Sakari Ailus <sakari.ailus@linux.intel.com> writes:
>
>>> Sigh ... seeing more indecipherable regexes added to kernel-doc is like
>>> seeing another load of plastic bags dumped into the ocean... it doesn't
>>> change the basic situation, but it's still sad.
>>>
>>> Oh well, applied, thanks.
>>
>> Thanks. I have to say I feel the same...
>>
>> Regexes aren't great for parsing C, that's for sure. :-I But what are the
>> options? Write a proper parser for (a subset of) C?
>
> Every now and then I've pondered on this a bit. There are parsers out
> there, of course; we could consider using something like tree-sitter.
> There's just two little problems:
>
> - That's a massive dependency to drag into the docs build that seems
> unlikely to speed things up.
>
> - kernel-doc is really two parsers - one for C code, one for the
> comment syntax. Strangely, nobody has written a grammar for this
> combination.
>
> A suitably motivated developer could probably create a C+kerneldoc
> grammer that would let us make a rock-solid, tree-sitter-based parser
> that would be mostly maintained by somebody else. But that doesn't get
> us around the "adding a big dependency" problem.
After we'd made kernel-doc the perl script to produce rst, and
kernel-doc the Sphinx extension to consume it, I pondered the same
questions, and wondered what it should all look like if you could just
ignore all the kernel legacy.
I've told the story before, but what I ended up with was:
- Use Python bindings for libclang to parse the source code. Clang is
obviously a big dependency, but nowadays more people have it already
installed, and the Python part on top is neglible.
- Don't parse the contents of the comments, at all. Treat it as pure
rst, and let Sphinx handle it.
That's pretty much how Hawkmoth [1] got started. I never even considered
it for kernel, because it would've been:
> <back to work now...>
Although Mesa now uses it to produce stuff like [2].
A suitably motivated developer could probably get it to work with the
kernel... Nowadays you could use Sphinx mechanisms to extend it to
convert kernel-doc style comments to rst.
There are a number of issues that might make it difficult, though:
- kernel-doc parses extra magic stuff like EXPORT_SYMBOL().
- all the special casing in kernel-doc dump_struct(), like
$members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;
- it's a compiler, so you'll need to pass suitable compiler options,
which might be difficult with all the per-directory kbuild magic
- might end up being slow, because it's a compiler (although there's
some caching to avoid parsing the same file multiple times like
kernel-doc currently does)
Anyway, I think it would be important to separate the parsing of C and
parsing of comments. It's kind of in the same bag in kernel-doc. But if
you want to cross-check, say, the parameters/members against the
documentation, you'll need the C AST while parsing the comments. And the
preprocessor tricks employed in the kernel are probably going to be a
nightmare.
What I'm saying is, while Hawkmoth is perhaps not the right solution,
using any generic C parser will face some of the same issues regardless.
BR,
Jani.
[1] https://github.com/jnikula/hawkmoth/
[2] https://docs.mesa3d.org/isl/index.html
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
2024-02-06 11:20 ` [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields Jani Nikula
@ 2024-02-06 15:13 ` Jonathan Corbet
2025-02-14 7:45 ` Mauro Carvalho Chehab
1 sibling, 0 replies; 13+ messages in thread
From: Jonathan Corbet @ 2024-02-06 15:13 UTC (permalink / raw)
To: Jani Nikula, Sakari Ailus
Cc: linux-doc, Ricardo Ribalda, Tiffany Lin, Andrew-CT Chen,
Yunfei Dong, Mauro Carvalho Chehab, Matthias Brugger,
AngeloGioacchino Del Regno, Laurent Pinchart, Hans Verkuil,
Kieran Bingham, Bin Liu, Ezequiel Garcia, Philipp Zabel,
Stanimir Varbanov, Vikash Garodia, Bryan O'Donoghue,
Bjorn Andersson, Konrad Dybcio, Sylwester Nawrocki,
Krzysztof Kozlowski, Alim Akhtar, Marek Szyprowski, Andrzej Hajda,
Bingbu Cao, Tianshu Qiu, Greg Kroah-Hartman, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
Jani Nikula <jani.nikula@linux.intel.com> writes:
> What I'm saying is, while Hawkmoth is perhaps not the right solution,
> using any generic C parser will face some of the same issues regardless.
...which is why I mentioned tree-sitter, where it would be (relatively)
easy to augment the C grammar with a grammar for the kerneldoc comments
and end up with everything in a single AST.
It's still not going to happen, but one can dream...:)
jon
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: scripts/kernel-doc parsing issues
2024-02-06 3:50 ` scripts/kernel-doc parsing issues Randy Dunlap
@ 2025-02-14 3:15 ` Randy Dunlap
2025-02-14 7:57 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 13+ messages in thread
From: Randy Dunlap @ 2025-02-14 3:15 UTC (permalink / raw)
To: Jonathan Corbet, Sakari Ailus; +Cc: linux-doc, Mauro Carvalho Chehab
On 2/5/24 7:50 PM, Randy Dunlap wrote:
> [reduced Cc: list]
>
> [was: Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields]
>
[snip]
>
> As I said here on the RFC patch from Sakari:
> https://lore.kernel.org/all/aa94772b-7010-4bba-b099-d3b8fe1b97aa@infradead.org/
>
> "Yet another kernel-doc bug. I have a list of 5 or 6 or 8 bugs that are
> similar to this one, but I didn't have this one."
>
> The patch to report Excess struct or union members has unearthed several
> kernel-doc "parsing" problems.
>
> I have not tried to fix any of these in scripts/kernel-doc yet. I might get
> around to it, but it's not a high priority for me.
>
>
> Examples:
>
> 1) drivers/slimbus/stream.c:49: warning: Excess struct member 'segdist_codes' description in 'segdist_code'
>
> struct declaration and definition together. Also possible that the leading "static const"
> confuses scripts/kernel-doc.
>
> 2) include/linux/spi/spi.h:246: warning: Function parameter or struct member 'cs_index_mask:SPI_CS_CNT_MAX' not described in 'spi_device'
> include/linux/spi/spi.h:246: warning: Excess struct member 'cs_index_mask' description in 'spi_device'
>
> scripts/kernel-doc handles some bit fields in structs successfully, so something is
> different about this one.
>
> 3) fs/ntfs/compress.c:24: warning: cannot understand function prototype: 'typedef enum '
>
> fs/ntfs/* has been removed in linux-next (still in mainline for a little while), but this
> shows that scripts/kernel-doc does not handle a 'typedef enum' successfully.
>
> 4) drivers/misc/vmw_balloon.c:260: warning: Excess struct member 'reserved' description in 'vmballoon_batch_entry'
>
> This may be the same problem as #2, with using bit fields in a struct.
>
> 5) drivers/base/power/runtime.c:362: warning: Excess function parameter 'dev' description in '__rpm_callback'
>
> Confused by either the first function parameter (a function pointer) or the trailing
> __releases() and __acquires() attributes.
>
> 6) drivers/md/bcache/request.c:309: warning: expecting prototype for bch_data_insert(). Prototype was for CLOSURE_CALLBACK() instead
>
> and
> fs/bcachefs/io_write.c:1558: warning: expecting prototype for bch2_write(). Prototype was for CLOSURE_CALLBACK() instead
>
> CLOSURE_CALLBACK() and function parameters are confusing scripts/kernel-doc.
>
> 7) drivers/iio/adc/at91-sama5d2_adc.c:471: warning: Excess struct member 'adc_channels' description in 'at91_adc_platform'
>
> Fixed by Sakari's patch. :)
>
> 8) drivers/pci/controller/pcie-iproc-msi.c:110: warning: Excess struct member 'reg_offsets' description in 'iproc_msi'
>
> Fixed by Sakari's patch. :)
>
> 9) drivers/usb/gadget/udc/pch_udc.c:361: warning: Excess struct member 'stall' description in 'pch_udc_dev'
> pch_udc.c:361: warning: Excess struct member 'prot_stall' description in 'pch_udc_dev'
> pch_udc.c:361: warning: Excess struct member 'registered' description in 'pch_udc_dev'
> pch_udc.c:361: warning: Excess struct member 'suspended' description in 'pch_udc_dev'
> pch_udc.c:361: warning: Excess struct member 'connected' description in 'pch_udc_dev'
> pch_udc.c:361: warning: Excess struct member 'vbus_session' description in 'pch_udc_dev'
> pch_udc.c:361: warning: Excess struct member 'set_cfg_not_acked' description in 'pch_udc_dev'
> pch_udc.c:361: warning: Excess struct member 'waiting_zlp_ack' description in 'pch_udc_dev'
>
> All of these except @registered (which is just an Excess description) are declared with one
> 'unsigned' followed by a list of bit fields, which isn't kernel coding style but it is valid C.
> or it might just be 'unsigned' without having a following 'int' that is the problem. I don't
> know -- haven't looked yet.
>
> 10) Matthew Wilcox pointed out to me that commit 0d55d48b19ff is causing problems with
> generated output. A few instances of using TAB or multiple spaces have been patched
> recently, but there are others that are not being addressed. I don't have a list of these.
Here are a few more that I found recently.
11) security/landlock/ruleset.c:
security/landlock/ruleset.c:205: warning: Function parameter or struct member ''
not described in 'insert_rule'
security/landlock/ruleset.c:205: warning: Excess function parameter 'layers' des
cription in 'insert_rule'
security/landlock/ruleset.c:692: warning: Function parameter or struct member ''
not described in 'landlock_init_layer_masks'
security/landlock/ruleset.c:692: warning: Excess function parameter 'layer_masks
' description in 'landlock_init_layer_masks'
12) security/landlock/fs.c:
security/landlock/fs.c:762: warning: Function parameter or struct member '' not
described in 'is_access_to_paths_allowed'
security/landlock/fs.c:762: warning: Excess function parameter 'layer_masks_pare
nt1' description in 'is_access_to_paths_allowed'
security/landlock/fs.c:762: warning: Excess function parameter 'layer_masks_pare
nt2' description in 'is_access_to_paths_allowed'
security/landlock/fs.c:1002: warning: Function parameter or struct member '' not
described in 'collect_domain_accesses'
security/landlock/fs.c:1002: warning: Excess function parameter 'layer_masks_dom
' description in 'collect_domain_accesses'
13) security/ipe/hooks.c:
security/ipe/hooks.c:55: warning: Function parameter or struct member '__always_
unused' not described in 'ipe_mmap_file'
security/ipe/hooks.c:55: warning: Excess function parameter 'reqprot' descriptio
n in 'ipe_mmap_file'
security/ipe/hooks.c:83: warning: Function parameter or struct member '__always_
unused' not described in 'ipe_file_mprotect'
security/ipe/hooks.c:83: warning: Excess function parameter 'reqprot' descriptio
n in 'ipe_file_mprotect'
Probably just always ignore __always_unused.
cheers.
--
~Randy
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
2024-02-06 11:20 ` [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields Jani Nikula
2024-02-06 15:13 ` Jonathan Corbet
@ 2025-02-14 7:45 ` Mauro Carvalho Chehab
2025-02-14 15:56 ` Jonathan Corbet
1 sibling, 1 reply; 13+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-14 7:45 UTC (permalink / raw)
To: Jani Nikula, Jonathan Corbet
Cc: Sakari Ailus, linux-doc, Ricardo Ribalda, Tiffany Lin,
Andrew-CT Chen, Yunfei Dong, Mauro Carvalho Chehab,
Matthias Brugger, AngeloGioacchino Del Regno, Laurent Pinchart,
Hans Verkuil, Kieran Bingham, Bin Liu, Ezequiel Garcia,
Philipp Zabel, Stanimir Varbanov, Vikash Garodia,
Bryan O'Donoghue, Bjorn Andersson, Konrad Dybcio,
Sylwester Nawrocki, Krzysztof Kozlowski, Alim Akhtar,
Marek Szyprowski, Andrzej Hajda, Bingbu Cao, Tianshu Qiu,
Greg Kroah-Hartman, Neil Armstrong, Kevin Hilman, Jerome Brunet,
Martin Blumenstingl
Em Tue, 06 Feb 2024 13:20:34 +0200
Jani Nikula <jani.nikula@linux.intel.com> escreveu:
> On Mon, 05 Feb 2024, Jonathan Corbet <corbet@lwn.net> wrote:
> > Sakari Ailus <sakari.ailus@linux.intel.com> writes:
> >
> >>> Sigh ... seeing more indecipherable regexes added to kernel-doc is like
> >>> seeing another load of plastic bags dumped into the ocean... it doesn't
> >>> change the basic situation, but it's still sad.
> >>>
> >>> Oh well, applied, thanks.
> >>
> >> Thanks. I have to say I feel the same...
> >>
> >> Regexes aren't great for parsing C, that's for sure. :-I But what are the
> >> options? Write a proper parser for (a subset of) C?
> >
> > Every now and then I've pondered on this a bit. There are parsers out
> > there, of course; we could consider using something like tree-sitter.
> > There's just two little problems:
> >
> > - That's a massive dependency to drag into the docs build that seems
> > unlikely to speed things up.
> >
> > - kernel-doc is really two parsers - one for C code, one for the
> > comment syntax. Strangely, nobody has written a grammar for this
> > combination.
> >
> > A suitably motivated developer could probably create a C+kerneldoc
> > grammer that would let us make a rock-solid, tree-sitter-based parser
> > that would be mostly maintained by somebody else. But that doesn't get
> > us around the "adding a big dependency" problem.
>
> After we'd made kernel-doc the perl script to produce rst, and
> kernel-doc the Sphinx extension to consume it, I pondered the same
> questions, and wondered what it should all look like if you could just
> ignore all the kernel legacy.
>
> I've told the story before, but what I ended up with was:
>
> - Use Python bindings for libclang to parse the source code. Clang is
> obviously a big dependency, but nowadays more people have it already
> installed, and the Python part on top is neglible.
Ok, but this may increase the doc generation time, and would make clang
mandatory for doc generation. As you mentioned, most developers have
clang installed already, so it shouldn't be a big issue.
The real challenge is really how to do such change without
introducing regressions.
The way I see is that we should first convert it to Python with all
regex stuff inside, and then find ways to switch to either libclang
or to some other lexical analyzer library.
> - Don't parse the contents of the comments, at all. Treat it as pure
> rst, and let Sphinx handle it.
This will break existing stuff, as Kernel-doc language uses, for
instance, %const and @symbol, which aren't part of Sphinx. Also,
Sphinx doesn't support inlined comments. DRM, in particular, has
a lot of those, but other subsystems are also big fans of inlined
comments.
>
> That's pretty much how Hawkmoth [1] got started. I never even considered
> it for kernel, because it would've been:
>
> > <back to work now...>
>
> Although Mesa now uses it to produce stuff like [2].
>
> A suitably motivated developer could probably get it to work with the
> kernel... Nowadays you could use Sphinx mechanisms to extend it to
> convert kernel-doc style comments to rst.
IMO we still need a command-line interface as it makes a lot easier
to debug stuff. So, just using Hawkmoth directly may be problematic,
depending on how easy/hard would be to create a command line interface
for it, but yeah, we can take a look on how it works and consider it
at the future of kernel-doc.
> There are a number of issues that might make it difficult, though:
>
> - kernel-doc parses extra magic stuff like EXPORT_SYMBOL().
This is actually done in separate from the rest of the script.
Yet, the output filtering logic needs to cope with internal, exported,
nosymbol and "function" (actually symbol) logic.
> - all the special casing in kernel-doc dump_struct(), like
>
> $members =~ s/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/$2/gos;
This is the hardest part and what makes dump_struct, dump_function
and create_parameterlist functions complex.
Even with a lexical analyzer (based or not on libclang), I
suspect we'll still need a lexical ruleset to parse Kernel macros,
as those are actually extensions to C language.
Most of the kernel-doc maintenance are due to such macros.
> - it's a compiler, so you'll need to pass suitable compiler options,
> which might be difficult with all the per-directory kbuild magic
>
> - might end up being slow, because it's a compiler (although there's
> some caching to avoid parsing the same file multiple times like
> kernel-doc currently does)
>
> Anyway, I think it would be important to separate the parsing of C and
> parsing of comments.
Separate C and comments is the easiest part of the script, and it is
already there. This is done by a state machine: comments are sent to the
comments-parsing function, while C is sent to process_proto function.
The in-lined comments are harder to parse, as it requires an extra
logic to detect the end of a function/enum/struct, but also handled by
the state machine.
I haven't seen much maintenance on most of kernel-doc code. The big
headache is how to maintain process_proto work.
The actual implementation is trivial:
def process_proto(self, ln, line):
"""STATE_PROTO: reading a function/whatever prototype."""
if doc_inline_oneline.search(line):
self.entry.section = doc_inline_oneline.group(1)
self.entry.contents = doc_inline_oneline.group(2)
if self.entry.contents != "":
self.entry.contents += "\n"
self.dump_section(start_new=False)
elif doc_inline_start.search(line):
self.state = self.STATE_INLINE
self.inline_doc_state = self.STATE_INLINE_NAME
elif self.entry.decl_type == 'function':
self.process_proto_function(ln, line)
else:
self.process_proto_type(ln, line)
[1] see the Python's conversion of kernel-doc at:
https://lore.kernel.org/linux-doc/20250213170413.39caf2d7@foz.lan/T/#me4149c5a2ea5be7bb9659273c4e6cc48f837c36a
So, basically, it dispatches inlined comments to the comments parser.
The code/data are dispatched to two functions:
- process_proto_function (function and function-like macros);
- or process_proto_type (data) - which dispatches into
dump_function/dump_enum/... functions, and has an
ancillary function to parse struct/enum/function parameters.
I would say that at least 95% of the maintenance efforts are
there. The hardest part of it is basically due to C macros we use at
the Kernel.
If we switch to libclang to handle process_proto, we'll still need to
add lexical extensions to such macros.
> It's kind of in the same bag in kernel-doc. But if
> you want to cross-check, say, the parameters/members against the
> documentation, you'll need the C AST while parsing the comments. And the
> preprocessor tricks employed in the kernel are probably going to be a
> nightmare.
>
> What I'm saying is, while Hawkmoth is perhaps not the right solution,
> using any generic C parser will face some of the same issues regardless.
Agreed: the same issues we have with the regular expressions we have
at dump_struct and dump_function (which are the core of the issues)
will still require some lexical rules to libclang - or whatever lexical
parser we use.
The advantage of regular expressions is that they're there for years,
and lots of developers are familiar with that. On the other hand,
some of them are really complex to deal with, like:
- struct_group*
- syscalls;
- trace events
- ...
-
The way I decided to port kernel-doc to python was based on a different
criteria:
1. Keep it as close as possible to the existing script, as it makes
easier to debug it;
2. Use classes to encapsulate the logic;
3. I placed the core of the parser, including the state machine
handling, on a KernelDoc class. The entire parser is there;
4. The C part of the parser uses an ancillary Namespace variable:
self.entry = argparse.Namespace
which is filled by process_proto and their ancillary routines.
Once the Kernel-doc entry is ready, there is a store() function
that adds them to a list of Kerneldoc entries:
self.entries["entry"].append((name, args))
I opted to use Namespace, as IMHO it is cleaner to work like
if this is a C struct than to use a dict.
5. Extra features like output filtering for internal, external, symbol,
no symbol handling at parsing time;
6. format output is done by separate classes:
- class OutputFormat: with common code and some "virtual"
dispatchers;
- class RestFormat(OutputFormat): handles ReST output;
- class ManFormat(OutputFormat): handles man/troff output.
7. the OutputFormat class does the symbol output filtering.
8. right now format output clasees just use print(), but I'm planning
to convert the message output function into an iterator with
something like:
yield (file_name, line_number, output_data)
to allow the Sphinx extension to get each symbol in separate
(just like I did for get_abi.py).
-
If we decide to use a separate C lexical analyzer, my suggestion
would be to move the functions that handle struct.entry (basically
process_proto and its ancillary functions) to a separate class and
then re-implement it using libclang (or some other lexical analyzer).
Thanks,
Mauro
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: scripts/kernel-doc parsing issues
2025-02-14 3:15 ` Randy Dunlap
@ 2025-02-14 7:57 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 13+ messages in thread
From: Mauro Carvalho Chehab @ 2025-02-14 7:57 UTC (permalink / raw)
To: Randy Dunlap
Cc: Jonathan Corbet, Sakari Ailus, linux-doc, Mauro Carvalho Chehab
Em Thu, 13 Feb 2025 19:15:01 -0800
Randy Dunlap <rdunlap@infradead.org> escreveu:
> On 2/5/24 7:50 PM, Randy Dunlap wrote:
> > [reduced Cc: list]
> >
> > [was: Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields]
> >
>
>
> [snip]
>
> >
> > As I said here on the RFC patch from Sakari:
> > https://lore.kernel.org/all/aa94772b-7010-4bba-b099-d3b8fe1b97aa@infradead.org/
> >
> > "Yet another kernel-doc bug. I have a list of 5 or 6 or 8 bugs that are
> > similar to this one, but I didn't have this one."
> >
> > The patch to report Excess struct or union members has unearthed several
> > kernel-doc "parsing" problems.
> >
> > I have not tried to fix any of these in scripts/kernel-doc yet. I might get
> > around to it, but it's not a high priority for me.
> >
> >
> > Examples:
> >
> > 1) drivers/slimbus/stream.c:49: warning: Excess struct member 'segdist_codes' description in 'segdist_code'
> >
> > struct declaration and definition together. Also possible that the leading "static const"
> > confuses scripts/kernel-doc.
> >
> > 2) include/linux/spi/spi.h:246: warning: Function parameter or struct member 'cs_index_mask:SPI_CS_CNT_MAX' not described in 'spi_device'
> > include/linux/spi/spi.h:246: warning: Excess struct member 'cs_index_mask' description in 'spi_device'
> >
> > scripts/kernel-doc handles some bit fields in structs successfully, so something is
> > different about this one.
> >
> > 3) fs/ntfs/compress.c:24: warning: cannot understand function prototype: 'typedef enum '
> >
> > fs/ntfs/* has been removed in linux-next (still in mainline for a little while), but this
> > shows that scripts/kernel-doc does not handle a 'typedef enum' successfully.
> >
> > 4) drivers/misc/vmw_balloon.c:260: warning: Excess struct member 'reserved' description in 'vmballoon_batch_entry'
> >
> > This may be the same problem as #2, with using bit fields in a struct.
> >
> > 5) drivers/base/power/runtime.c:362: warning: Excess function parameter 'dev' description in '__rpm_callback'
> >
> > Confused by either the first function parameter (a function pointer) or the trailing
> > __releases() and __acquires() attributes.
> >
> > 6) drivers/md/bcache/request.c:309: warning: expecting prototype for bch_data_insert(). Prototype was for CLOSURE_CALLBACK() instead
> >
> > and
> > fs/bcachefs/io_write.c:1558: warning: expecting prototype for bch2_write(). Prototype was for CLOSURE_CALLBACK() instead
> >
> > CLOSURE_CALLBACK() and function parameters are confusing scripts/kernel-doc.
> >
> > 7) drivers/iio/adc/at91-sama5d2_adc.c:471: warning: Excess struct member 'adc_channels' description in 'at91_adc_platform'
> >
> > Fixed by Sakari's patch. :)
> >
> > 8) drivers/pci/controller/pcie-iproc-msi.c:110: warning: Excess struct member 'reg_offsets' description in 'iproc_msi'
> >
> > Fixed by Sakari's patch. :)
> >
> > 9) drivers/usb/gadget/udc/pch_udc.c:361: warning: Excess struct member 'stall' description in 'pch_udc_dev'
> > pch_udc.c:361: warning: Excess struct member 'prot_stall' description in 'pch_udc_dev'
> > pch_udc.c:361: warning: Excess struct member 'registered' description in 'pch_udc_dev'
> > pch_udc.c:361: warning: Excess struct member 'suspended' description in 'pch_udc_dev'
> > pch_udc.c:361: warning: Excess struct member 'connected' description in 'pch_udc_dev'
> > pch_udc.c:361: warning: Excess struct member 'vbus_session' description in 'pch_udc_dev'
> > pch_udc.c:361: warning: Excess struct member 'set_cfg_not_acked' description in 'pch_udc_dev'
> > pch_udc.c:361: warning: Excess struct member 'waiting_zlp_ack' description in 'pch_udc_dev'
> >
> > All of these except @registered (which is just an Excess description) are declared with one
> > 'unsigned' followed by a list of bit fields, which isn't kernel coding style but it is valid C.
> > or it might just be 'unsigned' without having a following 'int' that is the problem. I don't
> > know -- haven't looked yet.
> >
> > 10) Matthew Wilcox pointed out to me that commit 0d55d48b19ff is causing problems with
> > generated output. A few instances of using TAB or multiple spaces have been patched
> > recently, but there are others that are not being addressed. I don't have a list of these.
>
> Here are a few more that I found recently.
>
> 11) security/landlock/ruleset.c:
> security/landlock/ruleset.c:205: warning: Function parameter or struct member ''
> not described in 'insert_rule'
> security/landlock/ruleset.c:205: warning: Excess function parameter 'layers' des
> cription in 'insert_rule'
> security/landlock/ruleset.c:692: warning: Function parameter or struct member ''
> not described in 'landlock_init_layer_masks'
> security/landlock/ruleset.c:692: warning: Excess function parameter 'layer_masks
> ' description in 'landlock_init_layer_masks'
>
> 12) security/landlock/fs.c:
> security/landlock/fs.c:762: warning: Function parameter or struct member '' not
> described in 'is_access_to_paths_allowed'
> security/landlock/fs.c:762: warning: Excess function parameter 'layer_masks_pare
> nt1' description in 'is_access_to_paths_allowed'
> security/landlock/fs.c:762: warning: Excess function parameter 'layer_masks_pare
> nt2' description in 'is_access_to_paths_allowed'
> security/landlock/fs.c:1002: warning: Function parameter or struct member '' not
> described in 'collect_domain_accesses'
> security/landlock/fs.c:1002: warning: Excess function parameter 'layer_masks_dom
> ' description in 'collect_domain_accesses'
>
> 13) security/ipe/hooks.c:
> security/ipe/hooks.c:55: warning: Function parameter or struct member '__always_
> unused' not described in 'ipe_mmap_file'
> security/ipe/hooks.c:55: warning: Excess function parameter 'reqprot' descriptio
> n in 'ipe_mmap_file'
> security/ipe/hooks.c:83: warning: Function parameter or struct member '__always_
> unused' not described in 'ipe_file_mprotect'
> security/ipe/hooks.c:83: warning: Excess function parameter 'reqprot' descriptio
> n in 'ipe_file_mprotect'
>
> Probably just always ignore __always_unused.
My suggestion is to first switch to the Python version, and then address
the issues not fixed yet by Sakari's patch at the Python version.
I would very much prefer to use regex eXtended flag (re.X/re.VERBOSE) to
implement any required new expressions with proper comments, but, at least
on my tests, Python support for it didn't work:
https://lore.kernel.org/linux-doc/6958d7a5-2403-423d-a0a3-0c43931a7d30@infradead.org/T/#m558bbed6272fe1bd988c53dcca2de9af9a8882d3
Funny enough, it worked when I ran using Python command line interpreter.
Anyway, getting regex explanations can also be done via
https://regex101.com/. I usually test there more complex regular expressions
with existing data, as it helps debugging issues.
Perhaps I did some silly mistake there, so it could be worth investing
some time to check why re.compile() didn't work for __attribute__((foo)).
We may also implement exceptions using some other ways:
- special function handlers (there are already two, for
syscalls and trace events);
- some lexical analyzer ruleset;
- code.
But, at least for now, I would use regex when possible (if it works,
via re.X).
Thanks,
Mauro
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
2025-02-14 7:45 ` Mauro Carvalho Chehab
@ 2025-02-14 15:56 ` Jonathan Corbet
2025-02-14 16:29 ` Jani Nikula
0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Corbet @ 2025-02-14 15:56 UTC (permalink / raw)
To: Mauro Carvalho Chehab, Jani Nikula
Cc: Sakari Ailus, linux-doc, Ricardo Ribalda, Tiffany Lin,
Andrew-CT Chen, Yunfei Dong, Mauro Carvalho Chehab,
Matthias Brugger, AngeloGioacchino Del Regno, Laurent Pinchart,
Hans Verkuil, Kieran Bingham, Bin Liu, Ezequiel Garcia,
Philipp Zabel, Stanimir Varbanov, Vikash Garodia,
Bryan O'Donoghue, Bjorn Andersson, Konrad Dybcio,
Sylwester Nawrocki, Krzysztof Kozlowski, Alim Akhtar,
Marek Szyprowski, Andrzej Hajda, Bingbu Cao, Tianshu Qiu,
Greg Kroah-Hartman, Neil Armstrong, Kevin Hilman, Jerome Brunet,
Martin Blumenstingl
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> Em Tue, 06 Feb 2024 13:20:34 +0200
> Jani Nikula <jani.nikula@linux.intel.com> escreveu:
>> - Use Python bindings for libclang to parse the source code. Clang is
>> obviously a big dependency, but nowadays more people have it already
>> installed, and the Python part on top is neglible.
>
> Ok, but this may increase the doc generation time, and would make clang
> mandatory for doc generation. As you mentioned, most developers have
> clang installed already, so it shouldn't be a big issue.
I have pondered such things (tree-sitter too) in the past. The problem
is that these parsers parse C, which isn't what kernel-doc is looking
for. It needs to see the code without preprocessor intervention, and it
needs to see the comments.
There might also be more resistance to requiring clang that one might
expect.
>> - Don't parse the contents of the comments, at all. Treat it as pure
>> rst, and let Sphinx handle it.
That would fix the latter problem, at the cost of breaking tens of
thousands of kerneldoc comments in the code. I don't relish the idea of
all the churn needed to fix that up... That might have been a good
decision to make 20 years or so ago; it's hard to see it as an option
now.
jon
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
2025-02-14 15:56 ` Jonathan Corbet
@ 2025-02-14 16:29 ` Jani Nikula
2025-02-14 16:38 ` Jonathan Corbet
0 siblings, 1 reply; 13+ messages in thread
From: Jani Nikula @ 2025-02-14 16:29 UTC (permalink / raw)
To: Jonathan Corbet, Mauro Carvalho Chehab
Cc: Sakari Ailus, linux-doc, Ricardo Ribalda, Tiffany Lin,
Andrew-CT Chen, Yunfei Dong, Mauro Carvalho Chehab,
Matthias Brugger, AngeloGioacchino Del Regno, Laurent Pinchart,
Hans Verkuil, Kieran Bingham, Bin Liu, Ezequiel Garcia,
Philipp Zabel, Stanimir Varbanov, Vikash Garodia,
Bryan O'Donoghue, Bjorn Andersson, Konrad Dybcio,
Sylwester Nawrocki, Krzysztof Kozlowski, Alim Akhtar,
Marek Szyprowski, Andrzej Hajda, Bingbu Cao, Tianshu Qiu,
Greg Kroah-Hartman, Neil Armstrong, Kevin Hilman, Jerome Brunet,
Martin Blumenstingl
On Fri, 14 Feb 2025, Jonathan Corbet <corbet@lwn.net> wrote:
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
>
>> Em Tue, 06 Feb 2024 13:20:34 +0200
>> Jani Nikula <jani.nikula@linux.intel.com> escreveu:
>>> - Don't parse the contents of the comments, at all. Treat it as pure
>>> rst, and let Sphinx handle it.
>
> That would fix the latter problem, at the cost of breaking tens of
> thousands of kerneldoc comments in the code. I don't relish the idea of
> all the churn needed to fix that up... That might have been a good
> decision to make 20 years or so ago; it's hard to see it as an option
> now.
Just to be clear, I feel I must repeat the context of my email:
>>> After we'd made kernel-doc the perl script to produce rst, and
>>> kernel-doc the Sphinx extension to consume it, I pondered the same
>>> questions, and wondered what it should all look like if you could just
>>> ignore all the kernel legacy.
There's a huge (olympic pool size) if in there, and that's the main
reason I could create a fun and lean little project around the idea in
the first place!
BR,
Jani.
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields
2025-02-14 16:29 ` Jani Nikula
@ 2025-02-14 16:38 ` Jonathan Corbet
0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Corbet @ 2025-02-14 16:38 UTC (permalink / raw)
To: Jani Nikula, Mauro Carvalho Chehab
Cc: Sakari Ailus, linux-doc, Ricardo Ribalda, Tiffany Lin,
Andrew-CT Chen, Yunfei Dong, Mauro Carvalho Chehab,
Matthias Brugger, AngeloGioacchino Del Regno, Laurent Pinchart,
Hans Verkuil, Kieran Bingham, Bin Liu, Ezequiel Garcia,
Philipp Zabel, Stanimir Varbanov, Vikash Garodia,
Bryan O'Donoghue, Bjorn Andersson, Konrad Dybcio,
Sylwester Nawrocki, Krzysztof Kozlowski, Alim Akhtar,
Marek Szyprowski, Andrzej Hajda, Bingbu Cao, Tianshu Qiu,
Greg Kroah-Hartman, Neil Armstrong, Kevin Hilman, Jerome Brunet,
Martin Blumenstingl
Jani Nikula <jani.nikula@linux.intel.com> writes:
> On Fri, 14 Feb 2025, Jonathan Corbet <corbet@lwn.net> wrote:
>> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
>>
>>> Em Tue, 06 Feb 2024 13:20:34 +0200
>>> Jani Nikula <jani.nikula@linux.intel.com> escreveu:
>>>> - Don't parse the contents of the comments, at all. Treat it as pure
>>>> rst, and let Sphinx handle it.
>>
>> That would fix the latter problem, at the cost of breaking tens of
>> thousands of kerneldoc comments in the code. I don't relish the idea of
>> all the churn needed to fix that up... That might have been a good
>> decision to make 20 years or so ago; it's hard to see it as an option
>> now.
>
> Just to be clear, I feel I must repeat the context of my email:
>
>>>> After we'd made kernel-doc the perl script to produce rst, and
>>>> kernel-doc the Sphinx extension to consume it, I pondered the same
>>>> questions, and wondered what it should all look like if you could just
>>>> ignore all the kernel legacy.
>
> There's a huge (olympic pool size) if in there, and that's the main
> reason I could create a fun and lean little project around the idea in
> the first place!
Yup, sorry if I trimmed that out.
I just wanted to be clear as well... I've often thought that life would
be a lot easier if we could stop trying to maintain our own C parser
(and maybe Rust too someday?), but I don't really see a path toward
that.
Thanks,
jon
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-02-14 16:38 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-31 8:49 [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields Sakari Ailus
2024-02-05 17:04 ` Jonathan Corbet
2024-02-05 21:30 ` Sakari Ailus
2024-02-06 0:05 ` Jonathan Corbet
2024-02-06 3:50 ` scripts/kernel-doc parsing issues Randy Dunlap
2025-02-14 3:15 ` Randy Dunlap
2025-02-14 7:57 ` Mauro Carvalho Chehab
2024-02-06 11:20 ` [PATCH 1/1] kernel-doc: Support arrays of pointers struct fields Jani Nikula
2024-02-06 15:13 ` Jonathan Corbet
2025-02-14 7:45 ` Mauro Carvalho Chehab
2025-02-14 15:56 ` Jonathan Corbet
2025-02-14 16:29 ` Jani Nikula
2025-02-14 16:38 ` Jonathan Corbet
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.