Man, Info and Help
Introduction
It is a good habit read man pages, help pages, and info pages instead of immediately jumping to a web search.
Tip
If you are new to *nix, don’t expect to understand very much of the manuals when reading and trying stuff in the very first few attempts. It will really depend a lot on your background and previous experience, skills. Just keep on keeping on and hang tough! |
help
Tip
Read the document about shell built-in commands. |
Shells have built-in commands.
Some of Bash’s built-ins include commands like help
, pwd
, type
, and cd
.
$ type cd
cd is a shell builtin
help
is a builtin command that provides usage information on other builtin commands.
Tip
To see the list of Bash’s builtin commands, simply type |
To display the help for a builtin, you can either do man bash
and search for that command or simply use help
builtin.
Start with this:
$ help help
$ help
Yes, run help
without arguments once to see a list of Bash’s builtin commands.
Then try some others:
$ help <some builtin command>
$ help help
$ help builtin
$ help .
$ help source
$ help command
$ help [
$ help [[
$ help echo
$ help printf
# etc...
info
First:
$ info info
$ man info
$ info --help
Note
Of course, on Arch Linux, we are fine, but on Ubuntu, we need to |
Generally, info pages are more user-friendly and tutorial-like than man pages. To read info pages, do
$ info <program or command>
$ info ed
$ info sed
$ info bash
Note
Not all programs and commands have info pages, and when an info page does not exist for a given command, |
You can also open info directly into a section of an info document (if you know the name of the section), something like:
$ info sed 'execution cycle'
Programs in the coreutils group have an invocation section:
$ info coreutils
$ info '(coreutils) echo invocation'
$ info coreutils 'echo invocation'
$ info '(coreutils) printf invocation'
$ info coreutils 'printf invocation'
$ info '(coreutils) kill invocation'
$ info coreutils 'kill invocation'
From GNU Emacs, you can read the info pages with:
C-h i m <command>
# For example:
C-h i m sed
Info has a lot of nomenclature, concepts and commands.
info info
explains about commands to find stuff inside info, navigate documents, etc.
It is a somewhat complex system. Yet, a powerful one.
info summary
$ info emacs --node Files
$ info '(emacs)Files'
$ info /usr/local/share/info/bash.info
$ info ~/docs/doc.info
$ info sed 'sed scripts' 'the "s" command'
$ info emacs 'user input'
Run info info 'moving the cursor'
.
Note
|
For a quick glance at all info commands and key bindings, open any
info page, and press C-h
.
man
$ man man
$ man --help
$ man foo
When someone tells you something like “it is docummented in
some-command (3)”, they mean it is in section 3 of the man pages. Then
you would do man 3 some-command
or man some-command.3
:
A real example of that could be with the printf(1)
command or
printf(3)
from the C Standard Library:
$ man 1 printf
$ man printf.1
$ man 3 printf
$ man printf.3
If we don’t know what a man page name would be, we can search the man
page names and their sort descriptions by using -k
, which accepts a
regular expression. By the way, man -k pattern
is the same as
apropos pattern
.
Tip
If |
$ man -k bc
... produces to many results...
So, let’s match commands that start with “bc”:
man -k ^bc on Ubuntu 19.04
$ man -k ^bc
bc (1) - An arbitrary precision calculator language
bccmd (1) - Utility for the CSR BCCMD interface
bcmp (3) - compare byte sequences
bcopy (3) - copy byte sequence
man -k ^bc on Arch Linux as of September, 2019
$ man -k ^bc
BC (3x) - direct curses interface to the terminfo capability dat...
bc (1) - An arbitrary precision calculator language
bc (1p) - arbitrary-precision arithmetic language
bcmp (3) - compare byte sequences
bcomps (1) - biconnected components filter for graphs
bcopy (3) - copy byte sequence
Note
On Ubuntu, bc (1p) wasn’t available, but it was on Arch Linux. |
Note
A “p” right after a section number of a man page means the standard POSIX program/behavior. “bc (1p)” refers to the POSIX specs and behavior, while |
Section numbers are more or less standard across Unix-like OSes, but the letters may vary.
Finding Info Node Names
$ info sed --output - | grep '^\*\s.\+::'
* Introduction:: Introduction
* Invoking sed:: Invocation
* sed scripts:: 'sed' scripts
* sed addresses:: Addresses: selecting lines
* sed regular expressions:: Regular expressions: selecting text
* advanced sed:: Advanced 'sed': cycles and buffers
* Examples:: Some sample scripts
* Limitations:: Limitations and (non-)limitations of GNU 'sed'
* Other Resources:: Other resources for learning about 'sed'
* Reporting Bugs:: Reporting bugs
* GNU Free Documentation License:: Copying and sharing this manual
* Concept Index:: A menu with all the topics in this manual.
* Command and Option Index:: A menu with all 'sed' commands and
$ info sed 'sed scripts' --output - | grep '^\*\s.\+::'
* sed script overview:: 'sed' script overview
* sed commands list:: 'sed' commands summary
* The "s" Command:: 'sed''s Swiss Army Knife
* Common Commands:: Often used commands
* Other Commands:: Less frequently used commands
* Programming Commands:: Commands for 'sed' gurus
* Extended Commands:: Commands specific of GNU 'sed'
* Multiple commands syntax:: Extension for easier scripting
Then we use the names on the left column of the output above to read info for that command on that specific section.
$ info sed 'sed scripts' 'the "s" command' --output - | vim -
$ info sed 'sed scripts' 'the "s" command'
Or commands that end with “print” (but not “printf”, for example):
$ man -k print$
FcFontSetPrint (3) - Print a set of patterns to stdout
FcPatternPrint (3) - Print a pattern for debugging
FcValuePrint (3) - Print a value to stdout
isprint (3) - character classification functions
iswprint (3) - test for printing wide character
print (1) - execute programs via entries in the mailcap file
Bear in mind that all of these do the same thing:
man -k some_command
man --apropos some_command
apropos some_command
To search on the entire text of the man pages, use:
man --global-apropos some_command
man -K some_command
Note that it is an uppercase “K” this time.
`cp' Man Page Example
man cp
produces this:
Exerpt of `man cp' on Arch Linux as of 2019
CP(1) User Commands CP(1)
NAME
cp - copy files and directories
SYNOPSIS
cp [OPTION]... [-T] SOURCE DEST
cp [OPTION]... SOURCE... DIRECTORY
cp [OPTION]... -t DIRECTORY SOURCE...
DESCRIPTION
Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.
Mandatory arguments to long options are mandatory for short options
too.
Let’s understand the man page syntax.
“cp” is the name of the command or program. No mistery.
Anything inside “[” and “]” means that thing is optional. In this case,
[OPTION]
means that command line options are optional, that is, you
can do something like cp -v foo.txt foo.txt.bpk
, where -v
is an
option, or simply cp foo.txt foo.txt.bpk
, and not use -v
or any
other option at all. You can think as options as flags the enable,
disable, or configure the way the program should behave.
The three dots, …
, like in [OPTION]…
or SOURCE…
, means that thing
may occur more than one time. If something is optional, it may occur
zero or more times. If that thing is required, then it has to occur one
or more times. So, in the case of:
cp [OPTION]... SOURCE... DIRECTORY
it means we must use cp
, followed by zero or more command line
options. Then, SOURCE…
is required, but it can occur more than once.
Finally, DIRECTORY
is required, and must occur only once.
Recap:
-
[THING]
optional and may occur at most once. -
[THING]…
optional and may occur zero or more times. -
THING
required and must occur exactly once. -
THING…
required and must occur one or more time.
Since cp
accepts multiple sources, we could copy more than one file at
a time to a given destination directory. As an example, let’s copy three
files to a backup directory.
$ cp main.c lib.h lib.c ~/bkpdir/
Suppose we want to use the options --verbose
and --interactive
(or
their short versions, -v
and -i
), we can do:
$ cp --verbose --interactive main.c lib.h lib.c ~/bpkdir/
And with the short option syntax, we can group options. All three commands below do the same thing:
$ cp --verbose --interactive foo.txt foo.txt.bpk
$ cp -v -i foo.txt foo.txt.bpk
$ cp -vi foo.txt foo.txt.bpk
Note the -vi
instead of -v -i
in the last one!
`csi' -help Example
One of the popular Scheme implementations is “Chicken”, and its command
line tools include csi
(Chicken Scheme Interpreter, for the command
line REPL) and csc
(Chicken Scheme Compiler).
Note
On some distros, the names are now |
Note
|
$ csi -help
usage: csi [FILENAME | OPTION ...]
Note that we have the square braces enclosing two things, and there is a “|” (the pipe character) between those two things. That character means 'OR', that is, either one thing, or the or the other. It doesn’t mean “invoke csi followed by a filename followed by an option.” Nope, that is incorrect. What that means is either one of these:
$ csi program.scm
# or
$ sci <some option>
# but this is INCORRECT:
$ sci program.scm <some option>
On the other hand, if you look at the csi
man page (or sci -help
),
you’ll see that some options require a file name, like the -s
(or
-script
) option.
The moral is that the man page shows something that can be easily misunderstood:
csi [FILENAME | OPTION ...]
Can lead one to think the syntax is:
$ sci program.scm -s
which is incorrect. The correct is either:
$ sci program.scm
or (because the option -s
takes a filename)
$ sci -s program.scm
That is, csi filename
or csi <option>
, just that some options
require a filename after the option itself.
Command Options
Most commands (or programs) accept both long versions and short versions
of options. For example, rsync
has -a
, short for --archive
, and
-r
, short for --recursive
, among many others.
Still, even for programs that support both short and long versions of
options, some options my be available only in long form (either because
there was no appropriate single letter left, or for some other,
sometimes odd, reason). For example, ls
has the long option
--group-directories-first
, and there is no short name for that option.
However, some programs allow the abbreviation of a long option as long
it does not clash with some other option. For instance ls
has only one
long option that starts with --g
(which is
--group-directories-first
), and it allows one to abbreviate it to
something like --group-directories
, or --group-d
, or even --group
or --g
.
To give another example, the program xclip
also allows unambiguous
abbreviations; one can either write xclip -selection clipboard
or
abbreviate to xclip -sel clip
. Many other commands allow this sort of
abbreviation.
Another thing to consider is the number of hyphens. For most commands,
short options use one hyphen, and long versions use two. You write
either -r
(one hyphen) or --recursive
(two hyphens). However, some
commands have long options (and sometimes only long options, and
behold, they take only one single hyphen. xclip
, chicken-csi
and
chicken-csi
are examples of programs in which the long version uses
only a single hyphen (and allow the unambiguous abbreviations).
Yet others, like tar
, do not require the hyphen for the short
versions. That is, you can either do tar -cf dir.tar dir/
or drop the
hyphen and do tar cf dir.tar dir/
.
java
and javac
, has long options, and some use one single hyphen,
like -classpath
, while others use two hyphens, like --class-path
.
POSIX and GNU
POSIX is a standard (specification) defined by the Open Group. There are four main sections in the spec:
-
Shell & Utilities (this is the one most useful for command line users and practictioners)
GNU programs and commands attempt to follow POSIX, but adds several additional features and “extensions” to standard POSIX. So, when you use a command line program, it is very likely that you are not using plain, standard POSIX, but extra features not defined in POSIX as well.
Bash itself can be started with environment variable POSIXLY_CORRECT
set (or with the --posix
option) so it will behave like a real, plain,
bare POSIX shell as much as possible.
In sed
, we can read its info page with info sed
. In the section “Sed
Scripts > The "s" Command”, we can read this:
Finally, as a GNU 'sed' extension, you can include a special sequence
made of a backslash and one of the letters 'L', 'l', 'U', 'u', or 'E'.
The meaning is as follows:
'\L'
Turn the replacement to lowercase until a '\U' or '\E' is found,
'\l'
Turn the next character to lowercase,
'\U'
Turn the replacement to uppercase until a '\L' or '\E' is found,
'\u'
Turn the next character to uppercase,
'\E'
Stop case conversion started by '\L' or '\U'.
Most (if not all) GNU command line programs docs explicitly state when something is not plain POSIX, but an additional GNU feature. We can assume that most man and info pages are explicit when an option or something else is not POSIX-compliant or POSIX-defined.
Documentation Relationships
Also worth noting is that some docs refer to some other docs. If a man, help or info page mentions some other docs, pay attention to it. It usually means it implements things mentioned in the other docs, and possibily extends and overrides things from the mentioned docs. Let’s discuss one such example.
If you read the help for the builtin printf
command, it says:
In addition to the standard format specifications described in printf(1),
printf interprets:
And then you do man 1 printf
, and see:
NOTE: your shell may have its own version of printf, which usually su‐
persedes the version described here. Please refer to your shell's doc‐
umentation for details about the options it supports.
So, Bash’s printf uses the format especifications defined in printf(1),
but nonetheless, printf(1) tells us that the Shell’s printf “usually
supersedes” this printf. Moreover, man 1 printf
talks about C
printf.
If we read POSIX printf specs, we see it mentions XBD File Format Notation, which says:
If the format is exhausted while arguments remain, the excess arguments shall
be ignored.
So, one would expect that printf '%s\n' foo bar
would print "foon" and
ignore "bar", still, take a look at what really happens:
$ printf '%s\n' foo bar
foo
bar
It is still printing “bar” even though the POSIX spec tells that it should be ignored. Except that XCU Command and Utilities extends and superseds XBD File Format Notation. Look:
The format operand shall be used as the format string described in XBD File
Format Notation with the following exceptions:
...
9. The format operand shall be reused as often as necessary to satisfy the
argument operands.
...
So, even though XBD tells that “excess arguments shall be ignored”, XCU printf overrides that and tells that it shall be reused to satisfy the operands.
End of Options echo Example
Unix shells and programs interpret --
to mean “end of options”.
Guideline 10 on
XBD
Utility Syntax Guidelines 10 says:
The first `--` argument that is not an option-argument should be accepted as a
delimiter indicating the end of options. Any following arguments should be
treated as operands, even if they begin with the '-' character.
Take a look:
$ printf -v
-bash: printf: -v: option requires an argument
printf: usage: printf [-v var] format [arguments]
But if we use --
, then printf simply prints “-v”:
$ printf -- -v
-v
Then we try it with echo:
$ echo -- -e
-- -e
Oops! echo printed -- -e
, not just -e
. It seems echo does not take
--
to mean “end of options”. If we run help echo
, it says nothing
about --
. Then we read
XCU
echo spec page, and come accross this:
The echo utility shall not recognize the "--" argument in the manner
specified by Guideline 10 of XBD Utility Syntax Guidelines; "--" shall be
recognized as a string operand.
So that is it. Since GNU Bash echo does not override the way --
should
work according to the specs, it is not even documented in help echo
.
And we should assume, at least when it comes to --
, that echo bash
builtin follows the specs!