Tuesday Tiny Techie Tip

Globbing

If you had to type out in full every filename you ever wanted to do something to, you'd be very unhappy. In the shell's continuing efforts to make you happy, the globbing feature was added. The name derives from the glob program, which handled this job in pre-Bourne versions of the UNIX shell (ref. The New Hacker's Dictionary, ed. by Eric Raymond).

Globbing is what lets you use a pattern to match one or more files:


% rm *.*

That command would delete every file with a "." in its name (well, except for those with a "." at the beginning of their names; those are almost always a special case). Here are the basic building blocks:

* ("splat", or "asterisk", or "nathan hale")
The splat is your basic steamroller of file matching. A splat all by itself matches any string of characters, and thus all files (except "dot"-files, and this is the last time I'll point that out). Combined with other literal characters, or other glob characters, it indicates parts of the filename which are allowed to vary wildly. Some examples:
[show all files]
% ls *
adb.1                   dd.1                    ld.so.1
addbib.1                ed.1                    ldd.1
adjacentscreens.1       edit.1                  od.1v
admin.1                 fdformat.1              pdp11.1
cd.1                    id.1v                   rdist.1
cdc.1                   ld.1                    sdiff.1v
[show all files that end in ".1v"]
% ls *.1v
id.1v           od.1v           sdiff.1v
[show all files that start with "a" and have a "b"]
% ls a*b*
adb.1           addbib.1
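
Here is a minimal sketch of the splat and the dot-file exception. It is written in Bourne-style sh rather than csh so it can run as a script, and the scratch directory from mktemp -d, the .hidden file, and the variable names are all made up for illustration.

```shell
# Work in a throwaway directory (hypothetical setup, not from the original tip).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch adb.1 addbib.1 .hidden

# A bare splat matches every name except those starting with a dot.
set -- *
plain_splat="$*"
echo "splat alone: $plain_splat"

# Spelling out the leading dot is what lets a pattern reach dot-files.
set -- .h*
leading_dot="$*"
echo "with explicit dot: $leading_dot"

# Literal characters pin parts of the name; the splats cover the rest.
set -- a*b*
a_then_b="$*"
echo "a*b*: $a_then_b"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Using "set --" just captures what the shell expanded, so the result can be inspected without depending on how ls formats its columns.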

? (question mark)
The question mark matches exactly one character (any character). Examples:
[show all files with at least 5 characters in their names]
% ls ?????*
adb.1                   edit.1                  od.1v
addbib.1                fdformat.1              pdp11.1
adjacentscreens.1       id.1v                   rdist.1
admin.1                 ld.so.1                 sdiff.1v
cdc.1                   ldd.1
[show all files with a 2-character extension]
% ls *.??
id.1v           od.1v           sdiff.1v
[show all files whose third letter is "d"]
% ls ??d*
addbib.1        ldd.1
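
To check the counting by hand, here is a small sketch in Bourne-style sh (an assumption, since the transcripts above are csh). The scratch directory, the handful of file names, and the variable names are chosen only for illustration.

```shell
# Throwaway directory with a few of the names used above (hypothetical setup).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch adb.1 cd.1 cdc.1 id.1v ldd.1

# Five question marks need five characters; the splat allows any extras.
# "cd.1" has only four characters, so it is left out.
set -- ?????*
at_least_five="$*"
echo "?????*: $at_least_five"

# A dot followed by exactly two characters at the end of the name.
set -- *.??
two_char_ext="$*"
echo "*.??: $two_char_ext"

# Any two characters, then a literal "d" in third position.
set -- ??d*
third_is_d="$*"
echo "??d*: $third_is_d"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Note that the "." counts as an ordinary character for the question mark, which is why a name like "adb.1" reaches five characters.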

[] (square brackets)
Square brackets give a list of characters to match. Only one character is matched by each occurrence of square brackets. You can kind of think of them as a question mark match with a little more control. You can also negate the match by putting a "^" (caret, shift-6) as the first character inside the brackets. (In the Bourne shell and friends, the negation character is "!" (bang).)
[show all files that start with "a" or "e"]
% ls [ae]*
adb.1                   adjacentscreens.1       ed.1
addbib.1                admin.1                 edit.1
[show all files that don't start with "a" or "e"]
% ls [^ae]*
cd.1            fdformat.1      ld.so.1         pdp11.1
cdc.1           id.1v           ldd.1           rdist.1
dd.1            ld.1            od.1v           sdiff.1v
[show all files that have a "c" or "h" extension]
% ls *.[ch]
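
Since the examples above use the csh caret, here is a sketch of the same two matches with the Bourne-style "!" negation. It assumes a Bourne-compatible sh, and the scratch directory, file list, and variable names are invented for the demonstration.

```shell
# Throwaway directory with a few of the names used above (hypothetical setup).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch adb.1 ed.1 cd.1 ld.1 od.1v

# One character from the list, then anything.
set -- [ae]*
starts_with_a_or_e="$*"
echo "[ae]*: $starts_with_a_or_e"

# "!" as the first character inverts the list in Bourne-style shells.
set -- [!ae]*
starts_with_other="$*"
echo "[!ae]*: $starts_with_other"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```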

All of the methods covered so far are evaluated by the shell against the list of available files, and generate a list of existing files which is passed on the command line to the given program. If no files match the given pattern, csh reports an error, and the command is never run:
% ls z*
ls: No match.

Note that the shell tries to trick you into thinking that your program had the problem, but it is actually the shell itself which failed to find anything to match "z*" against. This behavior can be changed through the use of the "nonomatch" option to csh:
% set nonomatch
% ls z*
z* not found

This time the message really does come from the application. With nonomatch set, if the pattern doesn't match any existing files it is passed to the application unchanged, so ls was looking for a file whose name was really "z*". This is the default behavior for the Bourne shell (sh(1)).
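
The two halves of that story can be seen from the Bourne side with a short sketch. It assumes a Bourne-compatible sh with default settings; the scratch directory and variable names are hypothetical.

```shell
# Throwaway directory with three matching files and one non-matching file.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch id.1v od.1v sdiff.1v adb.1

# The shell does the expansion, so the command receives one argument
# per matching file and never sees the pattern itself.
set -- *.1v
matched_count=$#
echo "arguments from *.1v: $matched_count"

# Nothing matches z*, so a Bourne-style shell hands the pattern over as-is.
set -- z*
unmatched_arg=$1
echo "argument from z*: $unmatched_arg"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

A program that receives the literal "z*" has no way to tell it from a real filename, which is why the complaint then comes from the program rather than the shell.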

The final matching construct comes from csh (shells such as bash and zsh have adopted it too, but the plain Bourne shell lacks it), and is not subject to the "No match" problems:

{} (curly brackets)
Curly brackets enclose a list of alternative strings separated by commas. Unlike with the other matching operators, the alternatives are expanded without any reference to the available files, and before the other wildcard operators are expanded. For example, if you used the pattern "{foo,bar}*", the pattern would first be expanded to "foo* bar*", and then the splats would be matched against files before the command would be called. Examples:
[show all files whose first character is "a" or "b" and second is "c" or "d"]
% ls {a,b}{c,d}*
adb.1                   adjacentscreens.1
addbib.1                admin.1
[show all files that start with "ed" or "ld"]
% ls {ed,ld}*
ed.1    edit.1  ld.1    ld.so.1 ldd.1
[show all files that have a "cc" or "hh" extension]
% ls *.{cc,hh}

Since the curly brackets don't pay any attention to the available files, you can also use them to generate strings (or count in base 3 ;-):


% echo {0,1,2}{0,1,2}{0,1,2} | fmt -36
000 001 002 010 011 012 020 021 022
100 101 102 110 111 112 120 121 122
200 201 202 210 211 212 220 221 222
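
For anyone without csh handy, the same string generation can be sketched with bash, which also implements curly brackets. Plain sh does not, so this sketch calls bash explicitly and assumes it is installed; the empty scratch directory and the variable names are made up for illustration.

```shell
# An empty throwaway directory, so no pattern can match a real file.
dir=$(mktemp -d)
cd "$dir" || exit 1

# No files exist here, yet the braces still generate every combination.
base3=$(bash -c 'echo {0,1,2}{0,1,2}')
echo "$base3"

# Braces expand first, leaving two splat patterns. Neither matches anything,
# and bash (unlike csh by default) passes unmatched patterns through unchanged.
expanded=$(bash -c 'echo {foo,bar}*')
echo "$expanded"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

The second command makes the order of operations visible: what echo prints is the intermediate "foo* bar*" form described above.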


Tuesday Tiny Techie Tip -- 25 February 1997
Written by Jeff Youngstrom


Tuesday Tiny Techie Tips are all © Copyright 1996-1997 by Jeff Youngstrom. Please ask permission before reproducing any of this material.