In the section on looking for files, we talk about various methods for finding a particular file on your system. Let’s assume for a moment that we were looking for a particular file, so we used the find command to look for a specific file name, but none of the commands we issued came up with a matching file. There was not a single match of any kind. This might mean that we removed the file. On the other hand, we might have named it yacht.txt or something similar. What can we do to find it?
We could jump through the same hoops again, trying various spellings and letter combinations as we did for yacht and boat. However, what if the customer had a canoe or a junk? Are we stuck with trying every possible word for boat? Yes, unless we know something about the file, even if that something is in the file.
The nice thing is that grep doesn’t have to be the end of a pipe. One of the arguments can be the name of a file. If you want, you can use several files, because grep will take the first argument as the pattern it should look for. If we were to enter
we would search the contents of all the files in the directory ./letters/taxes looking for the word Boat or boat.
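As a rough sketch of such a search, the following builds a small scratch directory and runs grep with a pattern followed by several file names. The file names and their contents are invented for illustration, and the exact command line is an assumption based on the description above.

```shell
# Hypothetical sample files in a scratch directory
workdir=$(mktemp -d)
mkdir -p "$workdir/letters/taxes"
printf 'A boat trailer was sold.\n'  > "$workdir/letters/taxes/brown.txt"
printf 'Nothing nautical here.\n'    > "$workdir/letters/taxes/jones.txt"
printf 'The customer owns a Boat.\n' > "$workdir/letters/taxes/smith.txt"
cd "$workdir" || exit 1

# The first argument is the pattern; every following argument is a file.
# The quotes keep the shell from treating [Bb] as a file name wildcard.
matches=$(grep '[Bb]oat' ./letters/taxes/*)
printf '%s\n' "$matches"

# Clean up the scratch directory
cd / && rm -rf "$workdir"
```

Because grep is given more than one file here, each matching line is printed with the name of the file in front of it.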
If the file we were looking for happened to be in the directory ./letters/taxes, then all we would need to do is run more on the file. If things are like the examples above, where we have dozens of directories to look through, this is impractical. So, we turn back to find.
One useful option to find is -exec, which runs a command on each file that find locates.
Let’s find all the files in the /etc directory containing /bin/sh. This would be run as
The curly braces ({}) are replaced by the name of each file found by the search, so the actual grep command would be something like
The “\;” marks the end of the command given to -exec. The backslash keeps the shell from interpreting the semicolon itself.
What the find command does is search for all the files that match the specified criteria and then run grep on each of them, searching for the pattern [Bb]oat. (In this case there were no criteria, so it found them all.)
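A minimal sketch of this kind of search, run against invented sample files, might look like the following. The -type f test is an addition made here so that grep is not handed directories; the file names and contents are hypothetical.

```shell
# Hypothetical sample files in a scratch directory
workdir=$(mktemp -d)
mkdir -p "$workdir/letters/taxes/1999"
printf 'Sold the boat in May.\n' > "$workdir/letters/taxes/1999/smith.txt"
printf 'No match in this one.\n' > "$workdir/letters/taxes/jones.txt"
cd "$workdir" || exit 1

# {} is replaced by each file name in turn; \; ends the command for -exec.
# grep is started once per file, each time with a single file argument.
plain=$(find ./letters/taxes -type f -exec grep '[Bb]oat' {} \;)
printf '%s\n' "$plain"

# Clean up the scratch directory
cd / && rm -rf "$workdir"
```

The output is only the matching line itself, with no file name in front of it.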
Do you know what this tells us? It says that there is a file somewhere under the directory ./letters/taxes that contains either “boat” or “Boat.” It doesn’t tell us what the file name is because of the way -exec is handled. Each file name is handed off one at a time, replacing the {}. It is as though we had entered individual lines for
If we had entered
grep would have output the name of the file in front of each matching line it found. However, because find runs a separate grep command for each file, and grep prints file names only when it is given more than one file, we don’t see the file names. We could use the -l option to grep, but that would only give us the file name. That might be okay if there were only one or two files. However, if a line in a file mentioned a “boat trip” or a “boat trailer,” these might not be what we were looking for. If we used the -l option to grep, we wouldn’t see the actual line. It’s a catch-22.
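To make the trade-off concrete, here is a small sketch that assumes the option in question is grep's -l, which lists only the names of files containing a match. The sample files are invented.

```shell
# Hypothetical sample files in a scratch directory
workdir=$(mktemp -d)
mkdir -p "$workdir/letters/taxes"
printf 'Sold the boat in May.\n'  > "$workdir/letters/taxes/smith.txt"
printf 'Rented a boat trailer.\n' > "$workdir/letters/taxes/brown.txt"
printf 'No match in this one.\n'  > "$workdir/letters/taxes/jones.txt"
cd "$workdir" || exit 1

# -l prints the name of each file that contains a match, and nothing else.
# sort is only here to make the order of the output predictable.
names=$(find ./letters/taxes -type f -exec grep -l '[Bb]oat' {} \; | sort)
printf '%s\n' "$names"

# Clean up the scratch directory
cd / && rm -rf "$workdir"
```

Both brown.txt and smith.txt are listed, but nothing in the output says which one mentions a boat trailer and which one mentions the boat itself.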
To get what we need, we must introduce a new command: xargs. By using it as one end of a pipe, you can repeat the same command on different files without actually having to input the command multiple times.
In this case, we would get what we wanted by typing
The first part is the same as we talked about earlier. The find command simply prints all the names it finds (all of them, in this case, because there were no search criteria) and passes them to xargs. Next, xargs gathers the names and builds grep commands from them, putting as many file names on each command line as will fit. Because grep is then given several files at once, it outputs the name of the file before each matching line, which is what we did not get with the -exec option to find.
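A sketch of the find and xargs combination, again on invented sample files, could look like this. The -type f test is an addition made here, and the example assumes file names without spaces or newlines, which plain xargs would split apart.

```shell
# Hypothetical sample files in a scratch directory
workdir=$(mktemp -d)
mkdir -p "$workdir/letters/taxes/1999"
printf 'Sold the boat in May.\n' > "$workdir/letters/taxes/1999/smith.txt"
printf 'No match in this one.\n' > "$workdir/letters/taxes/jones.txt"
cd "$workdir" || exit 1

# find prints the file names; xargs appends them to the grep command line.
# grep receives both files in one invocation, so it prefixes each match
# with the name of the file it came from.
found=$(find ./letters/taxes -type f -print | xargs grep '[Bb]oat')
printf '%s\n' "$found"

# Clean up the scratch directory
cd / && rm -rf "$workdir"
```

This time the matching line appears together with the path of the file that contains it.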
Obviously, this example does not find those instances where the file we were looking for contained words like “yacht” or “canoe” instead of “boat.” Unfortunately, the only way to catch all possibilities is to actually specify each one. So, that’s what we might do. Rather than listing every possible synonym for boat, let’s just take three: boat, yacht, and canoe.
To do this, we need to run the command three times. However, rather than typing in the command each time, we are going to take advantage of a useful aspect of the shell. In some instances, the shell knows when you want to continue with a command and gives you a secondary prompt. If you are running sh or ksh, then this is probably denoted as “>”. For example, if we typed
the shell knows that the pipe (|) cannot be at the end of the line. It then gives us a > or ? prompt where we can continue typing
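As a small illustration of a command continued after a pipe, the following is written the way it would sit in a script, where no secondary prompt is shown. The input text is invented.

```shell
# The first line ends with |, so the shell treats the command as unfinished
# and reads the next line as its continuation.
count=$(printf 'boat\nyacht\nboat trailer\n' |
  grep -c 'boat')
printf '%s\n' "$count"
```

Here grep -c counts the lines that contain "boat", so the two physical lines behave exactly like one long command line.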
The shell interprets these two lines as if we had typed them all on the same line. We can use this with a shell construct that lets us do loops. This is the for/in construct for sh and ksh, and the foreach construct in csh. It would look like this:
In this case, we are using the variable j, although we could have called it anything we wanted. When we put together quick little commands, we save ourselves a little typing by using single letter variables.
In the bash/sh/ksh example, we need to enclose the body of the loop inside the do-done pair. In the csh example, we need to include the end. In both cases, this little command we have written will loop through three times. Each time, the variable $j is replaced with one of the three words that we used. If we had thought up another dozen or so synonyms for boat, then we could have included them all. Remember also that the shell knows that the pipe (|) is not the end of the command, so this would work as well.
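The loop described above can be sketched for sh, ksh, and bash as follows. The sample files are invented, and the -i option (ignore case) and the -type f test are assumptions added here so that one pattern per word is enough. The command is split after the pipe to show that this form works too.

```shell
# Hypothetical sample files in a scratch directory
workdir=$(mktemp -d)
mkdir -p "$workdir/letters/taxes"
printf 'Bought a yacht.\n' > "$workdir/letters/taxes/a.txt"
printf 'Rented a canoe.\n' > "$workdir/letters/taxes/b.txt"
printf 'Sold the Boat.\n'  > "$workdir/letters/taxes/c.txt"
cd "$workdir" || exit 1

# $( ... ) only captures the output so that it can be printed at the end.
results=$(
  for j in boat yacht canoe
  do
    # $j takes the values boat, yacht, and canoe in turn
    find ./letters/taxes -type f -print |
      xargs grep -i "$j"
  done
)
printf '%s\n' "$results"

# The csh form of the same loop would be roughly:
#   foreach j (boat yacht canoe)
#     find ./letters/taxes -type f -print | xargs grep -i $j
#   end

# Clean up the scratch directory
cd / && rm -rf "$workdir"
```

The matches come out in the order of the word list: first the file mentioning a boat, then the yacht, then the canoe.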
Doing this from the command line has a drawback. If we want to use the same command again, we need to retype everything. However, using another trick, we can save the command. Remember that both ksh and csh have history mechanisms that allow you to repeat and edit commands that you recently entered. However, what happens tomorrow when you want to run the command again? Granted, ksh has the .sh_history file, but what about sh and csh?
Why not save commands that we use often in a file that we have all the time?
To do this, you would create a basic shell script, and we have a
When looking through files, I am often confronted with the situation where I am not just looking for a single string, but possibly multiple matches. Imagine a data file that contains a list of machines and their various characteristics, each on a separate line, which starts with that characteristic. For example:
The egrep command is an extension of the basic grep command. (The ‘e’ stands for extended.) In older versions of grep, you did not have the ability to use things like
One extension is the ability to have multiple search patterns that are checked simultaneously. That is, if any of the patterns are found, the line is displayed. So in the problem above we might have a command like this:
This would then list all of the respective lines in order, making association between name and the other values a piece of cake.
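As an illustration, suppose the data file looks like the invented one below, with labels such as Name:, IP:, and OS: at the start of each line. The file layout and labels are hypothetical. The sketch uses grep -E, which is the standard spelling of egrep on current systems.

```shell
# Hypothetical data file: one characteristic per line
datafile=$(mktemp)
printf '%s\n' \
  'Name: alpha' 'IP: 192.168.1.10' 'OS: Linux' \
  'Name: beta'  'IP: 192.168.1.11' 'OS: Solaris' > "$datafile"

# The | separates two alternative patterns; a line is printed if it
# matches either one. The quotes hide the | from the shell.
wanted=$(grep -E '^Name:|^IP:' "$datafile")
printf '%s\n' "$wanted"

rm -f "$datafile"
```

Each Name: line is followed directly by the IP: line that belongs to it, because grep prints matching lines in the order they occur in the file.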
Another variant of grep is fgrep, which interprets the search pattern as a list of fixed strings, separated by newlines, any of which is to be matched. On some systems, grep, egrep, and fgrep are all hard links to the same file.
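A short sketch of fixed-string matching follows, using grep -F, the standard spelling of fgrep on current systems. The input lines are invented. The pattern argument contains two strings separated by a newline, and characters such as . and * have no special meaning in them.

```shell
# Two fixed strings, one per line, passed as a single quoted argument
hits=$(printf '%s\n' 'a.b' 'axb' 'c*d' 'plain' | grep -F 'a.b
c*d')
printf '%s\n' "$hits"
```

The line axb is not printed: as a regular expression a.b would match it, but as a fixed string it does not.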
I am often confronted with files where I want to filter out the “noise”. That is, there is a lot of stuff in the files that I don’t want to see. A common example is looking through large shell scripts or configuration files when I am not sure exactly what I am looking for. I will know it when I see it, but simply using grep for that term is impossible, as I am not sure what it is. Therefore, it would be nice to ignore things like comments and empty lines.
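One possible way to do this, offered as a sketch, is to let grep print only the lines that do not match. The -v option used for that is an assumption here, as are the sample file and the two patterns: ^# for lines beginning with a comment character and ^$ for empty lines.

```shell
# Hypothetical configuration file with comments and an empty line
conf=$(mktemp)
printf '%s\n' '# set the search path' '' 'PATH=/usr/bin' '# say goodbye' 'echo done' > "$conf"

# -E allows the two alternatives; -v inverts the match, so only lines
# that are neither comments nor empty are printed.
kept=$(grep -E -v '^#|^$' "$conf")
printf '%s\n' "$kept"

rm -f "$conf"
```

Note that these patterns do not catch comments that are indented or lines that contain only spaces.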
Once again we could use egrep as there are two expressions we want to match. However, this time we also use the
There is an important thing to note here. In the section on interpreting the command, we learn that the shell sets up file redirection before it tries to execute the command. If we don’t include the less-than symbol in the single quotes, the shell will try to redirect the input from a file named “TABLE”. See the section on quotes for details on this.
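The effect of the quotes can be sketched with a pattern that begins with a less-than symbol. The sample file, which looks like a fragment of HTML, is invented for illustration.

```shell
# Hypothetical file containing a line that starts with <TABLE
page=$(mktemp)
printf '%s\n' '<HTML>' '<TABLE border=1>' '<TR><TD>x</TD></TR>' '</TABLE>' > "$page"

# Inside single quotes the < is just another character in the pattern.
tables=$(grep '<TABLE' "$page")
printf '%s\n' "$tables"

# Without the quotes, as in
#   grep <TABLE "$page"
# the shell would try to open a file named TABLE as grep's input before
# grep ever runs, and grep would take the file name as its pattern.

rm -f "$page"
```

Only the opening tag matches here, because the closing tag contains </TABLE rather than <TABLE.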
The -l option (long version: