Linux-man Archive mirror
* Dash in name of a manual page
From: Alejandro Colomar @ 2024-06-19 20:06 UTC
  To: linux-man, branden

Hi Branden,

Let's say I write the manual page for git-diff(1).
The file name is <man1/git-diff.1>.

In TH, should I use \- or just -?

	.TH git\-diff 1 2024-06-19 git
	.TH git-diff 1 2024-06-19 git

How about SH Name?

	.SH Name
	git\-diff \- Show changes between commits, commit and working tree, etc

	.SH Name
	git-diff \- Show changes between commits, commit and working tree, etc

I'm worried especially about the Name section, in case that \-
interferes with man-db.

Cheers,
Alex	

-- 
<https://www.alejandro-colomar.es/>


* Re: Dash in name of a manual page
From: G. Branden Robinson @ 2024-06-24 16:21 UTC
  To: Alejandro Colomar; +Cc: linux-man

Hi Alex,

Sorry for my delay in responding.

At 2024-06-19T22:06:56+0200, Alejandro Colomar wrote:
> Let's say I write the manual page for git-diff(1).
> The file name is <man1/git-diff.1>.
> 
> In TH, should I use \- or just -?
> 
> 	.TH git\-diff 1 2024-06-19 git
> 	.TH git-diff 1 2024-06-19 git

This is a style choice.  The formatter itself doesn't care.  I don't
have a strong prescription in this area, just some observations.

1.  The difference matters only on output devices that distinguish
    hyphens from minus signs.

2.  A problem arises from the difference usually only when someone
    attempts to copy-and-paste text from a man page to a shell prompt or
    text editor, and gets the "wrong" kind of "dash".

3.  In general, Unix systems are case-sensitive.

4.  I doubt that there is a tradition of copy-and-pasting the man page
    name from the rendered document header because there is _another_
    tradition, still widely seen, of rendering the document title in
    full caps.  This tradition came from the Bell Labs CSRC in the
    1970s, and obviously those folks knew point #3.  (Also they
    interacted with the system using teletypewriters, so they had no
    "copy-and-paste" experiences.)

5.  groff 1.23 makes the foregoing capitalization behavior user-
    configurable via a register.[1]  So you, the page author, don't know
    whether the reader will see your document title in full caps or not.
    If they do, copy-and-paste will be defeated anyway ("man GIT-DIFF").

Consequently, my personal judgment would be to not bother with `\-` in
the first argument to the `TH` macro.  But I can't say that things are
likely to go wrong if you _do_ bother.  Odds are it simply won't suffice
to make the document title copy-and-paste-able for some of your
audience, and a big part of that audience will never notice either way.

> How about SH Name?

I reason differently about this case.

> 	.SH Name
> 	git\-diff \- Show changes between commits, commit and working tree, etc
> 
> 	.SH Name
> 	git-diff \- Show changes between commits, commit and working tree, etc

In the "Name" section of a man page, we start with a comma-separated
list of topics, each of which is supposed to identify a component of the
system.  In sections 1, 6, and 8 (commands), we seldom see the "list"
aspect of this specification exercised (or rather, the list is a
singleton).  But in sections 2 and 3, lists of function names (and
sometimes C objects [variables]) are common.  Since these all name
things you might type that exist somewhere on the system, as programs
resolved by $PATH search or as symbols in object files, or as macros the
compiler will recognize, the argument for marking them up as "literals",
with boldface and `\-` to get hyphen-minus characters, is stronger.[2]
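
As an illustration only (this is not part of groff or man-db), a small
lint can flag Name-section topics written with a bare `-`.  The function
name `bare_hyphen_names` is hypothetical, and the sketch assumes the
topic list starts on the first line after `.SH Name`.

```shell
# Hypothetical lint: read man(7) source on stdin and print the first line
# of the Name section if its topic list contains a "-" not written as "\-".
bare_hyphen_names() {
    awk '
        # Track whether the most recent section heading was "Name".
        /^\.SH/ { in_name = ($2 ~ /^"?[Nn][Aa][Mm][Ee]"?$/); next }
        in_name {
            line = $0
            sub(/[ \t]+\\-[ \t].*$/, "", line)  # drop separator and summary
            gsub(/\\-/, "", line)                 # discard escaped hyphens
            if (line ~ /-/) print $0              # a bare hyphen remains
            in_name = 0                           # inspect the first line only
        }
    '
}

# Usage: prints the offending line, since "git-diff" lacks the escape.
printf '%s\n' '.SH Name' 'git-diff \- show changes between commits' |
    bare_hyphen_names
```

A page that writes `git\-diff \- ...` produces no output from this
check.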

> I'm worried especially about the Name section, in case that \-
> interferes with man-db.

There is no need to worry when you can easily put the question to an
empirical test.  Run lexgrog(1) on your document to see what it says.

$ lexgrog ./man/roff.7.man
./man/roff.7.man: "roff - concepts and history of roff typesetting"
$ lexgrog - <<EOF
.TH git-diff 1 2024-06-24 "groff test suite"
.SH Name
git\-diff \- show changes between commits, commit and working tree, etc.
EOF
-: "git-diff - show changes between commits, commit and working tree, etc."

man-db seems happy to me.

Regards,
Branden

[0] Secret footnote: The practices I suggested above also translate well
    to mdoc(7) practice, where the `Dt` macro defines the "document
    title", and `Nm` calls designate "names" of topics the page
    discusses.  I don't advocate mdoc(7) over man(7), but I also do not
    wish to create unnecessary impedance mismatches between them.

[1] ...which is defeated if the document shouts its title in full caps
    in its *roff source.  But here is the configuration control.

groff_man(7):
     -rCT=1   Set the man page identifier (the first argument to .TH) in
              full capitals in headers and footers.  This transformation
              is off by default because it discards case distinction
              information.

    People who aren't accustomed to viewing man page documents with
    "groff" or "nroff", but do use man-db, would likely put the
    foregoing command-line option into $MANROFFOPT.  An approach that
    works for any (groff 1.23) system regardless of man librarian is to
    edit the "man.local" file to set the register (`.nr CT 1`).  See
    section "Files" of groff_man(7).  (mandoc(1) doesn't respect this
    and likely never will; its maintainer scorns configurability.)
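
    For man-db users, the $MANROFFOPT route might look like the
    following in a shell startup file.  This is a sketch: which startup
    file to use is an assumption, and the setting has an effect only
    when man(1) formats the page with groff 1.23.

```shell
# Ask groff's man macros to set the page identifier in full capitals.
# Assumes man-db, which adds $MANROFFOPT to the formatter's command line.
MANROFFOPT=-rCT=1
export MANROFFOPT
```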

[2] Since `-` isn't a valid character in C identifiers, that aspect of
    the discussion doesn't hold for that language.  But I am trying to
    reason ecumenically, and command names in kebab-case are well known.


* Re: Dash in name of a manual page
From: Alejandro Colomar @ 2024-06-25 23:43 UTC
  To: G. Branden Robinson; +Cc: linux-man

On Mon, Jun 24, 2024 at 11:21:59AM GMT, G. Branden Robinson wrote:
> Hi Alex,

Hi Branden,

> Sorry for my delay in responding.

Huh, I had missed your reply somehow.  :(

> 
> At 2024-06-19T22:06:56+0200, Alejandro Colomar wrote:
> > Let's say I write the manual page for git-diff(1).
> > The file name is <man1/git-diff.1>.
> > 
> > In TH, should I use \- or just -?
> > 
> > 	.TH git\-diff 1 2024-06-19 git
> > 	.TH git-diff 1 2024-06-19 git
> 
> This is a style choice.  The formatter itself doesn't care.  I don't
> have a strong prescription in this area, just some observations.
> 
> 1.  The difference matters only on output devices that distinguish
>     hyphens from minus signs.

This includes utf8, which is my main output device.

> 2.  A problem arises from the difference usually only when someone
>     attempts to copy-and-paste text from a man page to a shell prompt or
>     text editor, and gets the "wrong" kind of "dash".

And in my head, which feels that there's something wrong, even if few
people will be affected by it.

> 3.  In general, Unix systems are case-sensitive.

And dash/hyphen sensitive.

> 4.  I doubt that there is a tradition of copy-and-pasting the man page
>     name from the rendered document header because there is _another_
>     tradition, still widely seen, of rendering the document title in
>     full caps.  This tradition came from the Bell Labs CSRC in the
>     1970s, and obviously those folks knew point #3.  (Also they
>     interacted with the system using teletypewriters, so they had no
>     "copy-and-paste" experiences.)

Since man(1) is case-insensitive, this is still meaningful:

	$ cat /usr/local/man/man1/foo-foo.1 ;
	.TH foo-foo 1 1 1
	.SH Name
	foo-foo
	\-
	foo foo

	$ man /usr/local/man/man1/foo-foo.1 \
		| head -n1 \
		| awk '{print $1}';
	foo‐foo(1)

	$ man /usr/local/man/man1/foo-foo.1 \
		| head -n1 \
		| awk '{print $1}' \
		| xargs man -w;
	No manual entry for foo‐foo(1)

but

	$ cat /usr/local/man/man1/bar-bar.1 ;
	.TH bar\-bar 1 1 1
	.SH Name
	bar\-bar
	\-
	bar bar

	$ man /usr/local/man/man1/bar-bar.1 \
		| head -n1 \
		| awk '{print $1}';
	bar-bar(1)

	$ man /usr/local/man/man1/bar-bar.1 \
		| head -n1 \
		| awk '{print $1}' \
		| xargs man -w;
	/usr/local/man/man1/bar-bar.1

Of course, I don't expect this to be used (or useful) often, or maybe
ever, but let's be correct.  :)
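
The two header strings above are hard to tell apart in many fonts, so a
byte dump helps.  In this sketch, `hexbytes` is a hypothetical helper,
and the first string is built by hand on the assumption that the
unescaped `-` was rendered as U+2010 HYPHEN.

```shell
# Hypothetical helper: print the bytes of its argument as hex digits.
hexbytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

hyphen=$(printf '\342\200\220')   # U+2010 HYPHEN encoded as UTF-8

hexbytes "foo${hyphen}foo"   # 666f6fe28090666f6f
hexbytes 'bar-bar'           # 6261722d626172 (2d is ASCII hyphen-minus)
```

Only the second string contains the byte 2d, which is what appears in
the file name that man -w has to match.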

> 5.  groff 1.23 makes the foregoing capitalization behavior user-
>     configurable via a register.[1]  So you, the page author, don't know
>     whether the reader will see your document title in full caps or not.
>     If they do, copy-and-paste will be defeated anyway ("man GIT-DIFF").
> 
> Consequently, my personal judgment would be to not bother with `\-` in
> the first argument to the `TH` macro.  But I can't say that things are
> likely to go wrong if you _do_ bother.  Odds are it simply won't suffice
> to make the document title copy-and-paste-able for some of your
> audience, and a big part of that audience will never notice either way.

I guess I'll use the escape there.

> > How about SH Name?
> 
> I reason differently about this case.
> 
> > 	.SH Name
> > 	git\-diff \- Show changes between commits, commit and working tree, etc
> > 
> > 	.SH Name
> > 	git-diff \- Show changes between commits, commit and working tree, etc
> 
> In the "Name" section of a man page, we start with a comma-separated
> list of topics, each of which is supposed to identify a component of the
> system.  In sections 1, 6, and 8 (commands), we seldom see the "list"
> aspect of this specification exercised (or rather, the list is a
> singleton).  But in sections 2 and 3, lists of function names (and
> sometimes C objects [variables]) are common.  Since these all name
> things you might type that exist somewhere on the system, as programs
> resolved by $PATH search or as symbols in object files, or as macros the
> compiler will recognize, the argument for marking them up as "literals",
> with boldface and `\-` to get hyphen-minus characters, is stronger.[2]

Okay.

> > I'm worried especially about the Name section, in case that \-
> > interferes with man-db.
> 
> There is no need to worry when you can easily put the question to an
> empirical test.  Run lexgrog(1) on your document to see what it says.
>
> $ lexgrog ./man/roff.7.man
> ./man/roff.7.man: "roff - concepts and history of roff typesetting"
> $ lexgrog - <<EOF
> .TH git-diff 1 2024-06-24 "groff test suite"
> .SH Name
> git\-diff \- show changes between commits, commit and working tree, etc.
> EOF
> -: "git-diff - show changes between commits, commit and working tree, etc."
> 
> man-db seems happy to me.

So, what's the rule?  The first white-space-delimited \- (that is, the
\- forms a separate token) is the separator, right?

> 
> Regards,
> Branden
> 
> [0] Secret footnote: The practices I suggested above also translate well
>     to mdoc(7) practice, where the `Dt` macro defines the "document
>     title", and `Nm` calls designate "names" of topics the page
>     discusses.  I don't advocate mdoc(7) over man(7), but I also do not
>     wish to create unnecessary impedance mismatches between them.
> 
> [1] ...which is defeated if the document shouts its title in full caps
>     in its *roff source.  But here is the configuration control.
> 
> groff_man(7):
>      -rCT=1   Set the man page identifier (the first argument to .TH) in
>               full capitals in headers and footers.  This transformation
>               is off by default because it discards case distinction
>               information.
> 
>     People who aren't accustomed to viewing man page documents with
>     "groff" or "nroff", but do use man-db, would likely put the
>     foregoing command-line option into $MANROFFOPT.  An approach that
>     works for any (groff 1.23) system regardless of man librarian is to
>     edit the "man.local" file to set the register (`.nr CT 1`).  See
>     section "Files" of groff_man(7).  (mandoc(1) doesn't respect this
>     and likely never will; its maintainer scorns configurability.)
> 
> [2] Since `-` isn't a valid character in C identifiers, that aspect of
>     the discussion doesn't hold for that language.  But I am trying to
>     reason ecumenically, and command names in kebab-case are well known.

The pages I was considering writing are keyctl-...(1).  I want to
separate that huge page into one per subcommand, as git(1) does.

Have a lovely night!
Alex


-- 
<https://www.alejandro-colomar.es/>


* Re: Dash in name of a manual page
From: G. Branden Robinson @ 2024-06-26  0:53 UTC
  To: Alejandro Colomar; +Cc: linux-man

Hi Alex,

At 2024-06-26T01:43:22+0200, Alejandro Colomar wrote:
> On Mon, Jun 24, 2024 at 11:21:59AM GMT, G. Branden Robinson wrote:

> Huh, I had missed your reply somehow.  :(

No worries, you're still pretty fast.  ;-)

> > 1.  The difference matters only on output devices that distinguish
> >     hyphens from minus signs.
> 
> This includes utf8, which is my main output device.

Yes, and not everyone uses a font that shows them the difference so
they're bound to be frustrated anyway.

> And in my head, which feels that there's something wrong, even if few
> people will be affected by it.

Our irritations are to some extent our own prerogatives.  ;-)

> > 3.  In general, Unix systems are case-sensitive.
> 
> And dash/hyphen sensitive.

Only where the character encoding distinguishes them!  The term
"hyphen-minus" was unattested in English until the ISO 8859 committee
found itself dealing with the mess that ANSI X3.4 made.

(That said, I prefer their mess to IBM's.)

> Of course, I don't expect this to be used (or useful) often, or maybe
> ever, but let's be correct.  :)
[...]
> I guess I'll use the escape there [in TH].

I won't stop you.  ;-)

> > > How about SH Name?
> > 
> > I reason differently about this case.
> > 
> > > 	.SH Name
> > man-db seems happy to me.
> 
> So, what's the rule?  The first white-space-delimited \- (that is, the
> \- forms a separate token) is the separator, right?

Strictly, the rule is up to lexgrog(1) or whatever parses man page
documents to build indexes from them.  This is not something over which
a *roff formatter has authority.

lexgrog(1):
     When using the traditional man macro set, a correct NAME section
     looks something like this:

            .SH NAME
            foo \- program to do something

     Some manual pagers require the ‘\-’ to be exactly as shown; mandb
     is more tolerant, but for compatibility with other systems it is
     nevertheless a good idea to retain the backslash.

In practice, man-db's lexgrog accepts a pretty motley stew of
separators.  It maps several patterns to a (whitespace-bounded) "-"
token.

https://gitlab.com/man-db/man-db/-/blob/main/src/lexgrog.l?ref_type=heads#L96
https://gitlab.com/man-db/man-db/-/blob/main/src/lexgrog.l?ref_type=heads#L217
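
As a rough approximation of the simple case only, the "first
whitespace-bounded \-" rule can be sketched in awk.  This is not
lexgrog's actual scanner, which accepts more separator forms than this
does, and `split_name` is a hypothetical name.

```shell
# Hypothetical sketch: split one Name-section line at the first "\-" that
# has whitespace on both sides, then unescape "\-" inside the topic names.
split_name() {
    awk '{
        if (match($0, /[ \t]+\\-[ \t]+/) == 0) {
            print "no separator"
            next
        }
        names = substr($0, 1, RSTART - 1)        # text before the separator
        summary = substr($0, RSTART + RLENGTH)   # text after the separator
        gsub(/\\-/, "-", names)
        printf "names=%s\nsummary=%s\n", names, summary
    }'
}

printf '%s\n' 'git\-diff \- show changes between commits' | split_name
# names=git-diff
# summary=show changes between commits
```

The `\-` inside `git\-diff` is not taken as the separator because it has
no whitespace before it.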

> The pages I was considering writing are keyctl-...(1).  I want to
> separate that huge page into one per subcommand, as git(1) does.

Understood.  People are daunted by gigantic man pages.  groff(7) is
pretty big, but is edged out by ffmpeg(1).  bash(1) leaves both far in
the dust.

zshall(1) is truly staggering.  But that may be cheating.

Regards,
Branden
