Shell Argument Syntax :: Bash

There are a few, but extremely important concepts that we must keep in mind to make reasonable use of the command line and write shell scripts in general, and even to read the documentation.

Whitespace and Quoting

Whitespace is of the utmost importance in the shell. It is a metacharacter which is used to break the input into tokens.

In shell terminology, we say each thing separated by a field separator is a word. DO NOT think “word” literally, like in human spoken language words. Peruse the following documents to get a better idea:

For the shell, the first word (a.k.a token) is the command, or the name of the program to run, and the remaining words are parameters to be passed to the program.

$ printf '%d\n' {1..3}
1
2
3

Here, printf is the first token (the name of the program to be run), %d\n is the second token (quotes are removed — unless escaped — before the argument is passed to the program), and {1..3} is the third token, except the shell (Bash) performs brace expansion before passing the results as individual tokens to the printf program.

When the shell finds the newline, it then executes the command line.

A Sad ‘rm’ Incident

It is paramount that we prevent the shell from word splitting in certain (most) cases.

$ ls -1
message.txt
secret message.txt
secret.txt

Now we want to remove secret message.txt:

$ ls -1
message.txt
secret message.txt
secret.txt

$ rm -v secret message.txt
rm: cannot remove 'secret'
No such file or directory
removed 'message.txt'

$ ls -1
'secret message.txt'
secret.txt

Oh shoot! Because we did not prevent the shell from breaking secret message.txt into individual tokens, what was passed to rm was not a single parameter, but two: secret and message.txt.

rm was unable to remove a file named secret because no such file exists (we have secret.txt, but not secret) but was able to remove message.txt because that was a file that really existed (but alas, no longer 😭).

Unfortunately, we did not remove secret message.txt which is what we wanted, but accidentally removed secret.txt which we didn’t want to.

Yes, that is a sad story…​

We must prevent the shell from performing word splitting in cases like this (and many others). What we should have done is this:

$ rm -v 'secret message.txt'
removed 'secret message.txt'

$ ls -1
message.txt
secret.txt

Now we removed secret message.txt and no incidents took place.

End Of Options ‘--’

The end of options -- is used to indicate the end of options 🤣. It is documented under Utility Syntax Guidelines of the Open Group specification.

Guideline 10 says:

"The first -- argument that is not an option-argument should be accepted as a delimiter indicating the end of options. Any following arguments should be treated as operands, even if they begin with the '-' character."

It is useful when we want to tell a program something like “Look, from now on, these arguments are real file names, directories, data, whatever. They are not options (command line flags) to the program.”

The echo command treats -- as a normal string operand. See the echo spec.

Let’s see some use cases.

remove files starting with ‘-’

Sometimes, by accident or some other reason (like sabotage), we end up with files whose name start with one or more - (U+002D HYPHEN-MINUS character). If we try to remove or rename them without proper care, it doesn’t work.

shell
$ tree -CF .
.
├── --oops.txt
└── -w00t.txt

0 directories, 2 files

$ rm -v -w00t.txt
rm: invalid option -- 'w'
Try 'rm ./-w00t.txt' to remove the file '-w00t.txt'.
Try 'rm --help' for more information.

$ rm -v --oops.txt
rm: unrecognized option '--oops.txt'
Try 'rm ./--oops.txt' to remove the file '--oops.txt'.
Try 'rm --help' for more information.

How embarrassing!

— Master Yoda

But because we can use the end of options shell thing (--), we have a way out!

$ rm -vi -- --oops.txt -w00t.txt
rm: remove regular empty file '--oops.txt'? yes
removed '--oops.txt'
rm: remove regular empty file '-w00t.txt'? yes
removed '-w00t.txt'

Another option is to use ./<name of the file> to “force” the shell treat the word as a file and not as an option to the command.

$ tree -CF .
.
├── --oops.txt
└── -w00t.txt

0 directories, 2 files

$ rm -vi ./--oops.txt ./-w00t.txt
rm: remove regular empty file './--oops.txt'? y
removed './--oops.txt'
rm: remove regular empty file './-w00t.txt'? y
removed './-w00t.txt'

$ tree -CF .
.

0 directories, 0 files

printf and -- end of options

Example 1

We want to print "--> foo" verbatim:

$ printf -- --> foo
(no output 😲)

What the poop‽

It so happens that the first -- is treated as end of options, then the next -- is a parameter to printf. But > is treated as redirection. We ended up adding the text "--" to the foo file.

Example 2

$ printf -- '%s\n' a b c
a
b
c

Why isn’t printf printing '%s\n' literally/verbatim?

Because the format string IS NOT AN OPTION! End of options works with options.

People in bash IRC say that we would use end of options when the format string starts with --, which is not the case for this example. In this example, it doesn’t hurt but doesn’t change anything either.

Example 3

$ printf -- '-->' foo
-->

It prints "-->". What about "foo"?

It is ignored because no format string was provided. Without it, printf only handles the first argument.

Compare:

printf -- a b c
a

Note that "b" and "c" are simply ignored (again, no format string was provided). If it is provided, it is reused as necessary to consume all arguments.

printf -- '%s\n' a b c
a
b
c

echo does print all its arguments, and maybe the first impression is that printf would do the same by default, which is not the case.

$ echo a b c
a b c