Welcome to Linux Knowledge Base and Tutorial
"The place where you learn linux"
  
Linux Tutorial - Shells and Utilities - Looking Through Files


Looking Through Files

In the section on looking for files, we talk about various methods for finding a particular file on your system. Let's assume for a moment that we were looking for a particular file, so we used the find command to look for a specific file name, but none of the commands we issued came up with a matching file. There was not a single match of any kind. This might mean that we removed the file. On the other hand, we might have named it yacht.txt or something similar. What can we do to find it?

We could jump through the same hoops, trying various spellings and letter combinations as we did for yacht and boat. However, what if the customer had a canoe or a junk? Are we stuck with trying every possible word for boat? Yes, unless we know something about the file, even if that something is inside the file.

The nice thing is that grep doesn't have to be the end of a pipe. One of the arguments can be the name of a file. If you want, you can use several files, because grep takes the first argument as the pattern it should look for and treats every remaining argument as a file to search. If we were to run grep with the pattern [Bb]oat on all the files in the directory ./letters/taxes, we would search the contents of those files for the word Boat or boat.
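As a minimal sketch of this kind of search, the following builds a small scratch copy of ./letters/taxes and runs grep over it. The temporary directory, the file names and their contents are assumptions made only so the example can be run anywhere.

```shell
# Scratch data so the sketch runs anywhere; the file names are made up.
demo=$(mktemp -d)
mkdir -p "$demo/letters/taxes"
printf 'We sold the Boat in May.\n' > "$demo/letters/taxes/yacht.txt"
printf 'Mileage log for the car.\n' > "$demo/letters/taxes/car.txt"

# The first argument is the pattern; every later argument is a file to search.
# Quoting the pattern keeps the shell from treating [Bb]oat as a file glob.
matches=$(cd "$demo" && grep '[Bb]oat' ./letters/taxes/*)
printf '%s\n' "$matches"
```

Because more than one file is named on the command line, grep puts the file name in front of each matching line.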

If the file we were looking for happened to be in the directory ./letters/taxes, then all we would need to do is run more on the file. If things are like the examples above, where we have dozens of directories to look through, this is impractical. So, we turn back to find.

One useful option to find is -exec. When a file is found, you use -exec to execute a command. We can therefore use find to find the files, then use -exec to run grep on them. Still, you might be asking yourself what good this is to you. Because you probably don't have dozens of files on your system related to taxes, let's use an example from files that you most probably have.

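Let's find all the files in the /etc directory containing /bin/sh. To do this, we give find the starting directory /etc, followed by -exec, the grep command with {} where the file name belongs, and a closing \;.

The curly braces ({}) are replaced by the name of each file found by the search, so the grep command that actually runs names one specific file under /etc each time.

The "\;" is a flag saying that this is the end of the command. The backslash keeps the shell from interpreting the semicolon itself.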
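A sketch of find with -exec, under the assumption that a scratch directory named etc stands in for the real /etc so the result is predictable. The file contents are made up, and -type f is an addition that limits the search to regular files.

```shell
# Scratch tree standing in for /etc; file names and contents are made up.
demo=$(mktemp -d)
mkdir -p "$demo/etc"
printf 'root:x:0:0:root:/root:/bin/sh\n' > "$demo/etc/passwd"
printf '127.0.0.1 localhost\n' > "$demo/etc/hosts"

# find replaces {} with each file it finds and runs grep once per file.
# The escaped semicolon (\;) ends the command given to -exec.
# -type f (an addition) keeps grep from being run on directories.
found=$(cd "$demo" && find ./etc -type f -exec grep /bin/sh {} \;)
printf '%s\n' "$found"
```

On a real system the starting directory would simply be /etc, and some files may not be readable without root privileges.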

What the find command does is search for all the files that match the specified criteria, then run grep on each one. Applied to our earlier problem, we would start find in ./letters/taxes and have grep search for the pattern [Bb]oat. (In this case there are no criteria, so find hands every file to grep.)

If this prints a matching line, what does that tell us? It says that there is a file somewhere under the directory ./letters/taxes that contains either "boat" or "Boat." It doesn't tell us what the file name is, because of the way -exec is handled. Each file name is handed off one at a time, replacing the {}. It is as though we had entered a separate grep command for each file.

If we had instead given grep several file names in a single command, it would have output the name of the file in front of each matching line it found. However, because each file is handled separately when using find, we don't see the file names. We could use the -l option to grep, but that would give us only the file name. That might be okay if there were only one or two files. However, if a line in a file mentioned a "boat trip" or a "boat trailer," these might not be what we were looking for. If we used the -l option to grep, we wouldn't see the actual line. It's a catch-22.
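The difference between the one-file and the several-file case can be sketched as follows. The scratch files are assumptions. The last command uses a common workaround that is not part of the original text: naming /dev/null as a second file, so that grep always sees more than one file name.

```shell
# Scratch files; names and contents are made up.
demo=$(mktemp -d)
printf 'boat trailer receipt\n' > "$demo/a.txt"
printf 'office supplies\n' > "$demo/b.txt"

# One file named: grep prints only the matching line.
one=$(cd "$demo" && grep boat a.txt)

# Several files named: grep prefixes each match with the file name.
many=$(cd "$demo" && grep boat a.txt b.txt)

# With -exec, grep sees one file per run. Adding /dev/null as a second
# (always empty) file makes grep print the name anyway.
viafind=$(cd "$demo" && find . -type f -exec grep boat {} /dev/null \;)

printf '%s\n' "$one" "$many" "$viafind"
```

Where the installed grep supports the -H option, as a reader suggests in the comments at the end of this page, it forces the file name to be printed without the /dev/null trick.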

To get what we need, we must introduce a new command: xargs. By using it as one end of a pipe, you can repeat the same command on different files without actually having to input the command multiple times.

In this case, we would get what we wanted by typing

find ./letters/taxes -print | xargs grep -i boat

The first part is the same as we talked about earlier. The find command simply prints all the names it finds (all of them, in this case, because there were no search criteria) and passes them to xargs. Next, xargs collects the names and builds grep commands from them, putting as many file names on each command line as will fit. Because grep is then given several file names at once, it outputs the name of the file before each matching line, unlike the one-file-at-a-time commands run by the -exec option to find.
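A runnable sketch of the find and xargs pipeline, assuming a scratch copy of ./letters/taxes with made-up files. The -type f test and the final sort are additions: the first keeps directory names out of the list, the second only makes the order of the output predictable.

```shell
# Scratch tree; directory layout and contents are made up.
demo=$(mktemp -d)
mkdir -p "$demo/letters/taxes/2003"
printf 'Deduction for the Boat slip.\n' > "$demo/letters/taxes/2003/harbor.txt"
printf 'boat trailer registration\n' > "$demo/letters/taxes/trailer.txt"
printf 'Mileage log for the car.\n' > "$demo/letters/taxes/car.txt"

# find prints the file names; xargs appends them to "grep -i boat".
# grep receives several names at once, so it labels every match.
hits=$(cd "$demo" && find ./letters/taxes -type f -print | xargs grep -i boat | sort)
printf '%s\n' "$hits"
```

File names containing spaces or newlines confuse this simple form. GNU and BSD versions of the tools offer find -print0 together with xargs -0 for that case.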

Obviously, this example does not find those instances where the file we were looking for contained words like "yacht" or "canoe" instead of "boat." Unfortunately, the only way to catch all possibilities is to actually specify each one. So, that's what we might do. Rather than listing every possible synonym for boat, let's just take three: boat, yacht, and canoe.

To do this, we need to run the find | xargs command three times. However, rather than typing in the command each time, we are going to take advantage of a useful aspect of the shell. In some instances, the shell knows when you want to continue with a command and gives you a secondary prompt. If you are running sh or ksh, then this is probably denoted as ">."

For example, if we typed

find ./letters/taxes -print |

the shell knows that the pipe (|) cannot be at the end of the line. It then gives us a > or ? prompt where we can continue typing

> xargs grep -i boat

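The shell interprets these two lines as if we had typed them all on the same line. We can use this with a shell construct that lets us do loops. This is the for/in construct for sh and ksh, and the foreach construct in csh. The loop names a variable, lists the words to step through, and has our find | xargs pipeline as its body, with the variable in place of the search word.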
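A sketch of such a loop in sh syntax, run against made-up scratch files. The variable name j and the three words come from the text; -type f and the quotes around $j are additions. The csh form is shown only as a comment, since it cannot be run by sh.

```shell
# Scratch tree; file names and contents are made up.
demo=$(mktemp -d)
mkdir -p "$demo/letters/taxes"
printf 'The boat is in storage.\n' > "$demo/letters/taxes/dock.txt"
printf 'Yacht club dues\n' > "$demo/letters/taxes/club.txt"
printf 'Canoe rental\n' > "$demo/letters/taxes/trip.txt"

# sh/ksh/bash: the body of the loop sits between do and done.
result=$(
    cd "$demo" || exit 1
    for j in boat yacht canoe
    do
        find ./letters/taxes -type f -print | xargs grep -i "$j"
    done
)
printf '%s\n' "$result"

# Same loop with the line broken after the pipe; the shell keeps reading.
result2=$(
    cd "$demo" || exit 1
    for j in boat yacht canoe
    do
        find ./letters/taxes -type f -print |
            xargs grep -i "$j"
    done
)

# csh equivalent (not run here):
#   foreach j (boat yacht canoe)
#       find ./letters/taxes -print | xargs grep -i $j
#   end
```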

In this case, we are using the variable j, although we could have called it anything we wanted. When we put together quick little commands, we save ourselves a little typing by using single letter variables.

In the bash/sh/ksh example, we need to enclose the body of the loop inside the do-done pair. In the csh example, we need to include the end. In both cases, this little command we have written will loop through three times. Each time, the variable $j is replaced with one of the three words that we used. If we had thought up another dozen or so synonyms for boat, then we could have included them all. Remember also that the shell knows that the pipe (|) is not the end of the command, so the loop works just as well if we break the line after the pipe and type the xargs part on the next line.

Doing this from the command line has a drawback. If we want to use the same command again, we need to retype everything. However, using another trick, we can save the command. Remember that both ksh and csh have history mechanisms that allow you to repeat and edit commands that you recently entered. However, what happens tomorrow when you want to run the command again? Granted, ksh has the .sh_history file, but what about sh and csh?

Why not save commands that we use often in a file that we have all the time? To do this, you would create a basic shell script, and we have a whole section just on that topic.

When looking through files, I am often confronted with the situation where I am not looking for just a single piece of text, but possibly for several different ones. Imagine a data file that contains a list of machines and their various characteristics, each on a separate line that starts with the name of that characteristic. For example:

Name: lin-db-01
IP: 192.168.22.10
Make: HP
CPU: 700
RAM: 512
Location: Room 3

All I want is the computer name, the IP address and the location, but not the others. I could do three individual greps, each with a different pattern. However, it would be difficult to make the association between the separate entries. That is, the first time I would have a list of machine names, the second time a list of IP addresses and the third time a list of locations. I have written scripts before that handle this kind of situation, but in this case it would be easier to use a standard Linux command: egrep.

The egrep command is an extension of the basic grep command. (The 'e' stands for extended.) In older versions of grep, you did not have the ability to use things like [:alpha:] to represent alphabetic characters, so extended grep was born. For details on representing characters like this, check out the section on regular expressions.

One extension is the ability to have multiple search patterns that are checked simultaneously. That is, if any of the patterns is found, the line is displayed. So in the problem above we might give egrep the three labels Name:, IP: and Location: as alternatives in a single quoted pattern, separated by the | character.

This would then list all of the respective lines in order, making the association between the name and the other values a piece of cake.
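A sketch of the multi-pattern search on the sample record from above, written to a temporary file. It uses grep -E, which accepts the same extended patterns as egrep; recent GNU grep releases print a warning when the egrep name is used.

```shell
# Sample data from the text, one characteristic per line.
machines=$(mktemp)
cat > "$machines" <<'EOF'
Name: lin-db-01
IP: 192.168.22.10
Make: HP
CPU: 700
RAM: 512
Location: Room 3
EOF

# The | separates alternatives; a line is printed if any one of them matches.
# The quotes keep the shell from treating | as a pipe.
wanted=$(grep -E 'Name:|IP:|Location:' "$machines")
printf '%s\n' "$wanted"
```

With several records in the file, each machine's name, address and location come out together in file order.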

Another variant of grep is fgrep, which interprets the search pattern as a list of fixed strings, separated by newlines, any of which is to be matched. On some systems, grep, egrep and fgrep will all be a hard link to the same file.

I am often confronted with files where I want to filter out the "noise". That is, there is a lot of stuff in the files that I don't want to see. A common example is looking through large shell scripts or configuration files when I am not sure exactly what I am looking for. I will know it when I see it, but simply grepping for the term is impossible, as I am not sure what it is. Therefore, it would be nice to ignore things like comments and empty lines.

Once again we could use egrep, as there are two expressions we want to match. However, this time we also use the -v option, which simply flips or inverts the meaning of the match. Let's say there was a start-up script that contained a variable you were looking for. You might run egrep -v on the script with a quoted pattern made of two alternatives, ^$ and ^#, separated by the | character.

The first part of the expression says to match on the beginning of the line (^) followed immediately by the end of the line ($), which turns out to be all empty lines. The second part of the expression says to match on all lines that start with the pound sign (a comment). Because -v inverts the match, this ends up giving me all of the "interesting" lines in the file. The long form of the option is easier to remember: --invert-match.
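A sketch of this noise filter, assuming a small made-up start-up script written to a temporary file. The variable and command names inside it are hypothetical. As before, grep -E stands in for egrep.

```shell
# A made-up start-up script with comments and empty lines.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/sh
# Hypothetical start-up script

DAEMON_OPTS="-q"

# start the daemon
start_daemon $DAEMON_OPTS
EOF

# ^$ matches empty lines, ^# matches lines starting with a pound sign;
# -v prints every line that matches neither.
interesting=$(grep -E -v '^$|^#' "$script")
printf '%s\n' "$interesting"
```

Note that the #! line at the top of a script also starts with a pound sign, so it is filtered out along with the comments. Comment lines that begin with spaces or tabs are not caught by this pattern.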

You may also run into a case where all you are interested in is which files contain a particular expression. This is where the -l option comes in (long version: --files-with-matches). For example, when I made some style changes to my web site I wanted to find all of the files that contained a table. This means the file had to contain the <TABLE> tag. Since this tag could contain some options, I was interested in all of the files that contained "<TABLE". This could be done by running grep -l on the files with the pattern '<TABLE' enclosed in single quotes.

There is an important thing to note here. In the section on interpreting the command, we learn that the shell sets up file redirection before it tries to execute the command. If we don't include the less-than symbol in the single quotes, the shell will try to redirect the input from a file named "TABLE". See the section on quotes for details on this.
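A sketch of listing only the names of files that contain a table, assuming two made-up HTML files in a scratch directory.

```shell
# Scratch web pages; names and contents are made up.
site=$(mktemp -d)
printf '<HTML><TABLE BORDER=1><TR><TD>x</TD></TR></TABLE></HTML>\n' > "$site/prices.html"
printf '<HTML><P>No tables here.</P></HTML>\n' > "$site/about.html"

# The single quotes stop the shell from reading < as input redirection.
# -l prints each matching file's name once instead of the matching lines.
with_tables=$(cd "$site" && grep -l '<TABLE' *.html)
printf '%s\n' "$with_tables"
```

If the pages might use lowercase tags, adding -i makes the search case-insensitive.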

The -l option says to simply list the names of the files that contain a match. The -L option (long version: --files-without-match) gives the opposite list: the names of the files that contain no match at all. This is not quite the same as combining the -v and -l options, which lists every file that has at least one line that does not match. Note that in all of these cases, the lines themselves are not displayed, just the file names.
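The three option combinations can be compared on two made-up files. This sketch assumes a grep that supports -L, as the GNU and BSD versions do.

```shell
# Two scratch pages: one with a table, one without. Contents are made up.
site=$(mktemp -d)
printf '<HTML>\n<TABLE BORDER=1>\n</HTML>\n' > "$site/full.html"
printf '<HTML>\n<P>text</P>\n</HTML>\n' > "$site/none.html"

# -l: files with at least one matching line.
with=$(cd "$site" && grep -l '<TABLE' *.html)

# -L: files with no matching line at all.
without=$(cd "$site" && grep -L '<TABLE' *.html)

# -v -l: files with at least one NON-matching line, which here is both files.
inverted=$(cd "$site" && grep -v -l '<TABLE' *.html)

printf 'with: %s\nwithout: %s\ninverted: %s\n' "$with" "$without" "$inverted"
```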

Another common option is -q (long: --quiet or --silent). This does not display anything. So, what's the use in that? Well, often, you simply want to know if a particular value exists in a file. Regardless of the options you use, grep will return 0 if any matches were found and 1 if no matches were found (a value greater than 1 indicates an error, such as an unreadable file). So, if you check the $? variable after running grep -q and it is 0, you found a match. Check out the section on basic shell scripting for details on $? and other variables.
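A sketch of testing for a value with grep -q, assuming a made-up configuration file. The variable names status_found, status_missing and verdict are hypothetical.

```shell
# A made-up configuration file.
conf=$(mktemp)
printf 'PermitRootLogin no\nPort 22\n' > "$conf"

# A match: nothing is printed, and $? is 0.
grep -q 'Port 22' "$conf"
status_found=$?

# No match: capture the non-zero status without aborting strict scripts.
status_missing=0
grep -q 'Port 8080' "$conf" || status_missing=$?

echo "found: $status_found, missing: $status_missing"

# In practice the exit status is usually tested directly by if.
if grep -q '^PermitRootLogin no' "$conf"
then
    verdict="root login disabled"
else
    verdict="root login not disabled"
fi
echo "$verdict"
```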

Keep in mind that you do not need to use grep to read through files. Instead, it can be one end of a pipe. For example, I have a number of scripts that look through the process list to see if a particular process is running. If so, then I know all is well. However, if the process is not running, a message is sent to the administrators.
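A sketch of such a process check. The function name check_process and the process name are hypothetical, and the echo in the else branch stands in for whatever notification a real script would send. It assumes a ps that accepts -e and -o comm=, which prints only the command names.

```shell
# check_process NAME: report whether a process called NAME is in the list.
check_process() {
    # -o comm= prints one bare command name per line, so the grep process
    # and its arguments cannot match by accident; -x matches whole lines.
    if ps -e -o comm= | grep -q -x "$1"
    then
        echo "$1 is running"
    else
        # A real script might send mail to the administrators here.
        echo "$1 is NOT running"
    fi
}

check_process no_such_daemon
```

On Linux the command name reported by ps is truncated to 15 characters, so longer process names would need a different test.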


User Comments:


Posted by Biju on November 18, 2004 09:03am:

Hi, This is in context to the drawback of 'find' and 'grep' combined. Author mentions that if we try to display the filename (with -l option to grep), then the contents are not displayed. I guess we still can achieve the desired results using -H option with grep. That was a small suggestion. I must add that this is the best Linux tutorial site (and among the best tutorials on any topic) that I have come across. And its free! Great job. Biju George



Copyright 2002-2009 by James Mohr. Licensed under modified GNU Free Documentation License (Portions of this material originally published by Prentice Hall, Pearson Education, Inc). See here for details. All rights reserved.
  
All logos and trademarks in this site are property of their respective owner. The comments are property of their posters. Articles are the property of their respective owners. Unless otherwise stated in the body of the article, article content (C) 1994-2013 by James Mohr. All rights reserved. The stylized page/paper, as well as the terms "The Linux Tutorial", "The Linux Server Tutorial", "The Linux Knowledge Base and Tutorial" and "The place where you learn Linux" are service marks of James Mohr. All rights reserved.