* Dash in name of a manual page
@ 2024-06-19 20:06 Alejandro Colomar
2024-06-24 16:21 ` G. Branden Robinson
0 siblings, 1 reply; 4+ messages in thread
From: Alejandro Colomar @ 2024-06-19 20:06 UTC
To: linux-man, branden
Hi Branden,
Let's say I write the manual page for git-diff(1).
The file name is <man1/git-diff.1>.
In TH, should I use \- or just -?
    .TH git\-diff 1 2024-06-19 git
    .TH git-diff 1 2024-06-19 git
How about SH Name?
    .SH Name
    git\-diff \- Show changes between commits, commit and working tree, etc

    .SH Name
    git-diff \- Show changes between commits, commit and working tree, etc
I'm worried especially about the Name section, in case that \-
interferes with man-db.
Cheers,
Alex
--
<https://www.alejandro-colomar.es/>
* Re: Dash in name of a manual page
From: G. Branden Robinson @ 2024-06-24 16:21 UTC
To: Alejandro Colomar; +Cc: linux-man
Hi Alex,
Sorry for my delay in responding.
At 2024-06-19T22:06:56+0200, Alejandro Colomar wrote:
> Let's say I write the manual page for git-diff(1).
> The file name is <man1/git-diff.1>.
>
> In TH, should I use \- or just -?
>
> .TH git\-diff 1 2024-06-19 git
> .TH git-diff 1 2024-06-19 git
This is a style choice. The formatter itself doesn't care. I don't
have a strong prescription in this area, just some observations.
1. The difference matters only on output devices that distinguish
hyphens from minus signs.
2. A problem arises from the difference usually only when someone
attempts to copy-and-paste text from a man page to a shell prompt or
text editor, and gets the "wrong" kind of "dash".
3. In general, Unix systems are case-sensitive.
4. I doubt that there is a tradition of copy-and-pasting the man page
name from the rendered document header because there is _another_
tradition, still widely seen, of rendering the document title in
full caps. This tradition came from the Bell Labs CSRC in the
1970s, and obviously those folks knew point #3. (Also they
interacted with the system using teletypewriters, so they had no
"copy-and-paste" experiences.)
5. groff 1.23 makes the foregoing capitalization behavior user-
configurable via a register.[1] So you, the page author, don't know
whether the reader will see your document title in full caps or not.
If they do, copy-and-paste will be defeated anyway ("man GIT-DIFF").
Consequently, my personal judgment would be to not bother with `\-` in
the first argument to the `TH` macro. But I can't say that things are
likely to go wrong if you _do_ bother. Odds are it simply won't suffice
to make the document title copy-and-paste-able for some of your
audience, and a big part of that audience will never notice either way.
> How about SH Name?
I reason differently about this case.
> .SH Name
> git\-diff \- Show changes between commits, commit and working tree, etc
>
> .SH Name
> git-diff \- Show changes between commits, commit and working tree, etc
In the "Name" section of a man page, we start with a comma-separated
list of topics, each of which is supposed to identify a component of the
system. In sections 1, 6, and 8 (commands), we seldom see the "list"
aspect of this specification exercised (or rather, the list is a
singleton). But in sections 2 and 3, lists of function names (and
sometimes C objects [variables]) are common. Since these all name
things you might type that exist somewhere on the system, as programs
resolved by $PATH search or as symbols in object files, or as macros the
compiler will recognize, the argument for marking them up as "literals",
with boldface and `\-` to get hyphen-minus characters, is stronger.[2]
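To see whether a page already follows this convention, a small shell test can flag hyphens in a Name line that are not written as `\-`. This is only a sketch: `has_bare_hyphen` is a hypothetical helper, not part of groff or man-db, and it inspects one line of text rather than parsing the page.

```shell
# has_bare_hyphen LINE
# Succeeds if LINE contains a "-" that is not preceded by a backslash.
# Hypothetical helper, for illustration only.
has_bare_hyphen() {
    printf '%s\n' "$1" | grep -Eq '(^|[^\\])-'
}

for line in 'git\-diff \- show changes' 'git-diff \- show changes'; do
    if has_bare_hyphen "$line"; then
        echo "bare hyphen:  $line"
    else
        echo "all escaped:  $line"
    fi
done
```

The check is deliberately crude. It would also flag hyphens in the description, where an unescaped `-` is ordinary prose, so it is best applied only to the list of names before the separator.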
> I'm worried especially about the Name section, in case that \-
> interferes with man-db.
There is no need to worry when you can easily put the question to an
empirical test. Run lexgrog(1) on your document to see what it says.
    $ lexgrog ./man/roff.7.man
    ./man/roff.7.man: "roff - concepts and history of roff typesetting"
    $ lexgrog - <<EOF
    .TH git-diff 1 2024-06-24 "groff test suite"
    .SH Name
    git\-diff \- show changes between commits, commit and working tree, etc.
    EOF
    -: "git-diff - show changes between commits, commit and working tree, etc."
man-db seems happy to me.
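The same empirical test can be run over a whole set of pages. The following is a minimal sketch, assuming man-db's lexgrog(1) is installed and exits with a non-zero status when it cannot parse a file; `check_names` is a hypothetical wrapper, not an existing tool.

```shell
# check_names FILE...
# Run lexgrog on each page and report the ones whose Name section
# it cannot parse.  Hypothetical wrapper, for illustration only.
check_names() {
    if ! command -v lexgrog >/dev/null 2>&1; then
        echo "lexgrog not found; skipping check" >&2
        return 0
    fi
    status=0
    for page in "$@"; do
        # Assumption: a non-zero exit status means the parse failed.
        if ! lexgrog "$page"; then
            echo "cannot parse Name section: $page" >&2
            status=1
        fi
    done
    return "$status"
}

# Example call (the path is a placeholder):
# check_names man1/*.1
```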
Regards,
Branden
[0] Secret footnote: The practices I suggested above also translate well
to mdoc(7) practice, where the `Dt` macro defines the "document
title", and `Nm` calls designate "names" of topics the page
discusses. I don't advocate mdoc(7) over man(7), but I also do not
wish to create unnecessary impedance mismatches between them.
[1] ...which is defeated if the document shouts its title in full caps
in its *roff source. But here is the configuration control.
    groff_man(7):
        -rCT=1  Set the man page identifier (the first argument to .TH) in
                full capitals in headers and footers.  This transformation
                is off by default because it discards case distinction
                information.
People who aren't accustomed to viewing man page documents with
"groff" or "nroff", but do use man-db, would likely put the
foregoing command-line option into $MANROFFOPT. An approach that
works for any (groff 1.23) system regardless of man librarian is to
edit the "man.local" file to set the register (`.nr CT 1`). See
section "Files" of groff_man(7). (mandoc(1) doesn't respect this
and likely never will; its maintainer scorns configurability.)
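For a man-db user, the setting described in this footnote might look like the sketch below. It assumes groff 1.23 as the formatter, and the page path in the commented-out groff call is only a placeholder.

```shell
# Ask man-db to pass -rCT=1 to the formatter on every invocation.
# This overwrites any options already stored in MANROFFOPT.
MANROFFOPT=-rCT=1
export MANROFFOPT

# Running the formatter directly takes the same option, for example:
# groff -man -Tutf8 -rCT=1 ./man1/git-diff.1
```

Putting `.nr CT 1` into "man.local" gives the same result without depending on the man librarian, as noted above.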
[2] Since `-` isn't a valid character in C identifiers, that aspect of
the discussion doesn't hold for that language. But I am trying to
reason ecumenically, and command names in kebab-case are well known.
* Re: Dash in name of a manual page
From: Alejandro Colomar @ 2024-06-25 23:43 UTC
To: G. Branden Robinson; +Cc: linux-man
On Mon, Jun 24, 2024 at 11:21:59AM GMT, G. Branden Robinson wrote:
> Hi Alex,
Hi Branden,
> Sorry for my delay in responding.
Huh, I had missed your reply somehow. :(
>
> At 2024-06-19T22:06:56+0200, Alejandro Colomar wrote:
> > Let's say I write the manual page for git-diff(1).
> > The file name is <man1/git-diff.1>.
> >
> > In TH, should I use \- or just -?
> >
> > .TH git\-diff 1 2024-06-19 git
> > .TH git-diff 1 2024-06-19 git
>
> This is a style choice. The formatter itself doesn't care. I don't
> have a strong prescription in this area, just some observations.
>
> 1. The difference matters only on output devices that distinguish
> hyphens from minus signs.
This includes utf8, which is my main output device.
> 2. A problem arises from the difference usually only when someone
> attempts to copy-and-paste text from a man page to a shell prompt or
> text editor, and gets the "wrong" kind of "dash".
And in my head, which feels that there's something wrong, even if few
people will be affected by it.
> 3. In general, Unix systems are case-sensitive.
And dash/hyphen sensitive.
> 4. I doubt that there is a tradition of copy-and-pasting the man page
> name from the rendered document header because there is _another_
> tradition, still widely seen, of rendering the document title in
> full caps. This tradition came from the Bell Labs CSRC in the
> 1970s, and obviously those folks knew point #3. (Also they
> interacted with the system using teletypewriters, so they had no
> "copy-and-paste" experiences.)
Since man(1) matches page names case-insensitively, copying the name
from the header is still meaningful, even when it is rendered in full
caps:
    $ cat /usr/local/man/man1/foo-foo.1 ;
    .TH foo-foo 1 1 1
    .SH Name
    foo-foo
    \-
    foo foo
    $ man /usr/local/man/man1/foo-foo.1 \
      | head -n1 \
      | awk '{print $1}';
    foo‐foo(1)
    $ man /usr/local/man/man1/foo-foo.1 \
      | head -n1 \
      | awk '{print $1}' \
      | xargs man -w;
    No manual entry for foo‐foo(1)

but

    $ cat /usr/local/man/man1/bar-bar.1 ;
    .TH bar\-bar 1 1 1
    .SH Name
    bar\-bar
    \-
    bar bar
    $ man /usr/local/man/man1/bar-bar.1 \
      | head -n1 \
      | awk '{print $1}';
    bar-bar(1)
    $ man /usr/local/man/man1/bar-bar.1 \
      | head -n1 \
      | awk '{print $1}' \
      | xargs man -w;
    /usr/local/man/man1/bar-bar.1
Of course, I don't expect this to be used (or useful) often, or maybe
ever, but let's be correct. :)
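The two results differ by a single character, which is easy to miss on screen. The sketch below makes the difference visible at the byte level. It assumes the formatter emitted U+2010 HYPHEN for the unescaped `-`, as the foo-foo output above suggests, and it needs neither man(1) nor groff.

```shell
# U+2010 HYPHEN in UTF-8 is the three bytes E2 80 90 (octal 342 200 220).
rendered=$(printf 'foo\342\200\220foo')   # name as shown in the page header
typed='foo-foo'                           # name of the file on disk

if [ "$rendered" = "$typed" ]; then
    echo "same name"
else
    echo "different names"
fi

# Dump the bytes of both strings for comparison.
printf '%s' "$rendered" | od -An -tx1
printf '%s' "$typed"    | od -An -tx1
```

The first dump contains the bytes e2 80 90 where the second has the single byte 2d, which is why the lookup by pasted name fails.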
> 5. groff 1.23 makes the foregoing capitalization behavior user-
> configurable via a register.[1] So you, the page author, don't know
> whether the reader will see your document title in full caps or not.
> If they do, copy-and-paste will be defeated anyway ("man GIT-DIFF").
>
> Consequently, my personal judgment would be to not bother with `\-` in
> the first argument to the `TH` macro. But I can't say that things are
> likely to go wrong if you _do_ bother. Odds are it simply won't suffice
> to make the document title copy-and-paste-able for some of your
> audience, and a big part of that audience will never notice either way.
I guess I'll use the escape there.
> > How about SH Name?
>
> I reason differently about this case.
>
> > .SH Name
> > git\-diff \- Show changes between commits, commit and working tree, etc
> >
> > .SH Name
> > git-diff \- Show changes between commits, commit and working tree, etc
>
> In the "Name" section of a man page, we start with a comma-separated
> list of topics, each of which is supposed to identify a component of the
> system. In sections 1, 6, and 8 (commands), we seldom see the "list"
> aspect of this specification exercised (or rather, the list is a
> singleton). But in sections 2 and 3, lists of function names (and
> sometimes C objects [variables]) are common. Since these all name
> things you might type that exist somewhere on the system, as programs
> resolved by $PATH search or as symbols in object files, or as macros the
> compiler will recognize, the argument for marking them up as "literals",
> with boldface and `\-` to get hyphen-minus characters, is stronger.[2]
Okay.
> > I'm worried especially about the Name section, in case that \-
> > interferes with man-db.
>
> There is no need to worry when you can easily put the question to an
> empirical test. Run lexgrog(1) on your document to see what it says.
>
> $ lexgrog ./man/roff.7.man
> ./man/roff.7.man: "roff - concepts and history of roff typesetting"
> $ lexgrog - <<EOF
> .TH git-diff 1 2024-06-24 "groff test suite"
> .SH Name
> git\-diff \- show changes between commits, commit and working tree, etc.
> EOF
> -: "git-diff - show changes between commits, commit and working tree, etc."
>
> man-db seems happy to me.
So, what's the rule? The first white-space-delimited \- (that is, the
\- forms a separate token) is the separator, right?
>
> Regards,
> Branden
>
> [0] Secret footnote: The practices I suggested above also translate well
> to mdoc(7) practice, where the `Dt` macro defines the "document
> title", and `Nm` calls designate "names" of topics the page
> discusses. I don't advocate mdoc(7) over man(7), but I also do not
> wish to create unnecessary impedance mismatches between them.
>
> [1] ...which is defeated if the document shouts its title in full caps
> in its *roff source. But here is the configuration control.
>
> groff_man(7):
> -rCT=1 Set the man page identifier (the first argument to .TH) in
> full capitals in headers and footers. This transformation
> is off by default because it discards case distinction
> information.
>
> People who aren't accustomed to viewing man page documents with
> "groff" or "nroff", but do use man-db, would likely put the
> foregoing command-line option into $MANROFFOPT. An approach that
> works for any (groff 1.23) system regardless of man librarian is to
> edit the "man.local" file to set the register (`.nr CT 1`). See
> section "Files" of groff_man(7). (mandoc(1) doesn't respect this
> and likely never will; its maintainer scorns configurability.)
>
> [2] Since `-` isn't a valid character in C identifiers, that aspect of
> the discussion doesn't hold for that language. But I am trying to
> reason ecumenically, and command names in kebab-case are well known.
The pages I was considering writing are keyctl-...(1). I want to
separate that huge page into one per subcommand, as git(1) does.
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es/>
* Re: Dash in name of a manual page
From: G. Branden Robinson @ 2024-06-26 0:53 UTC
To: Alejandro Colomar; +Cc: linux-man
Hi Alex,
At 2024-06-26T01:43:22+0200, Alejandro Colomar wrote:
> On Mon, Jun 24, 2024 at 11:21:59AM GMT, G. Branden Robinson wrote:
> Huh, I had missed your reply somehow. :(
No worries, you're still pretty fast. ;-)
> > 1. The difference matters only on output devices that distinguish
> > hyphens from minus signs.
>
> This includes utf8, which is my main output device.
Yes, and not everyone uses a font that shows them the difference, so
they're bound to be frustrated anyway.
> And in my head, which feels that there's something wrong, even if few
> people will be affected by it.
Our irritations are to some extent our own prerogatives. ;-)
> > 3. In general, Unix systems are case-sensitive.
>
> And dash/hyphen sensitive.
Only where the character encoding distinguishes them! The term
"hyphen-minus" was unattested in English until the ISO 8859 committee
found itself dealing with the mess that ANSI X3.4 made.
(That said, I prefer their mess to IBM's.)
> Of course, I don't expect this to be used (or useful) often, or maybe
> ever, but let's be correct. :)
[...]
> I guess I'll use the escape there [in TH].
I won't stop you. ;-)
> > > How about SH Name?
> >
> > I reason differently about this case.
> >
> > > .SH Name
> > man-db seems happy to me.
>
> So, what's the rule? The first white-space-delimited \- (that is, the
> \- forms a separate token) is the separator, right?
Strictly, the rule is up to lexgrog(1) or whatever parses man page
documents to build indexes from them. This is not something over which
a *roff formatter has authority.
lexgrog(1):
    When using the traditional man macro set, a correct NAME section
    looks something like this:

        .SH NAME
        foo \- program to do something

    Some manual pagers require the ‘\-’ to be exactly as shown; mandb
    is more tolerant, but for compatibility with other systems it is
    nevertheless a good idea to retain the backslash.
In practice, man-db's lexgrog accepts a pretty motley stew of
separators. It maps several patterns to a (whitespace-bounded) "-"
token.
https://gitlab.com/man-db/man-db/-/blob/main/src/lexgrog.l?ref_type=heads#L96
https://gitlab.com/man-db/man-db/-/blob/main/src/lexgrog.l?ref_type=heads#L217
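As a rough illustration of the strict form of the rule, the sketch below splits a one-line Name entry at the first whitespace-delimited `\-` token. It is a simplification written for this discussion, not a port of lexgrog.l: `split_name_line` is a hypothetical helper, it handles only a single input line, and it recognizes only `\-`, whereas man-db accepts more separators.

```shell
# split_name_line LINE
# Print "names - description" for a one-line Name entry, splitting at the
# first token that is exactly \- .  Prints nothing if no such token exists.
# Hypothetical helper, for illustration only.
split_name_line() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "\\-") {
                names = ""; desc = ""
                for (j = 1; j < i; j++)
                    names = names (j > 1 ? " " : "") $j
                for (j = i + 1; j <= NF; j++)
                    desc = desc (j > i + 1 ? " " : "") $j
                gsub(/\\-/, "-", names)   # show escaped hyphens in names as "-"
                print names " - " desc
                exit
            }
        }
    }'
}

split_name_line 'git\-diff \- show changes between commits'
```

Because the `\-` inside `git\-diff` is not a token of its own, it stays part of the name, and only the stand-alone `\-` acts as the separator.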
> The pages I was considering writing are keyctl-...(1). I want to
> separate that huge page into one per subcommand, as git(1) does.
Understood. People are daunted by gigantic man pages. groff(7) is
pretty big, but is edged out by ffmpeg(1). bash(1) leaves both far in
the dust.
zshall(1) is truly staggering. But that may be cheating.
Regards,
Branden