_____ is used to find a particular word from a file.
You can use the grep tool to search the current folder recursively, like:

    grep -r "class foo" .

Note: -r recursively searches subdirectories.

You can also use globbing syntax to search within specific files, such as:

    grep "class foo" **/*.c

Note: With the globbing option (**), the shell scans all files recursively that match a specific extension or pattern. To enable this syntax, run: shopt -s globstar. You may also use **/*.* for all files (excluding hidden files and files without an extension) or any other pattern.

If you get an error that the argument list is too long, consider narrowing down your search, or use find syntax instead, such as:

    find . -name "*.php" -execdir grep -nH --color=auto foo {} ';'

Alternatively, use ripgrep.

ripgrep

If you're working on larger projects or big files, you should use ripgrep instead, like:

    rg "class foo" .

Check out the docs, installation steps or source code on the GitHub project page.

It is generally much quicker than tools such as GNU/BSD grep, ucg, ag, sift, ack, pt or similar, since it is built on top of Rust's regex engine, which uses finite automata, SIMD and aggressive literal optimizations to make searching very fast.

It supports ignore patterns specified in .gitignore files, so a single file path can be matched against multiple glob patterns simultaneously.

You can use common parameters such as:
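To make the difference between the recursive grep and the find-driven search concrete, here is a minimal sketch. The sandbox directory, file names and file contents are all made up for illustration. It uses `-exec ... {} +` rather than the `-execdir ... ';'` form quoted above, so that matching files are handed to grep in batches and the printed paths stay relative to the starting directory.

```shell
#!/usr/bin/env bash
# Hypothetical sandbox: the layout and file contents below are invented.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/src/lib"
printf 'class foo {};\n'          > "$workdir/src/a.c"
printf 'class bar;\nclass foo;\n' > "$workdir/src/lib/b.c"
printf 'notes on class foo\n'     > "$workdir/src/notes.txt"
cd "$workdir" || exit 1

# Recursive search: every file under the current directory is scanned,
# including notes.txt.
recursive_hits=$(grep -r "class foo" . | sort)

# Let find select only the *.c files, then pass them to grep in batches.
# -n adds line numbers, -H forces the file name prefix.
c_only_hits=$(find . -name "*.c" -exec grep -nH "class foo" {} + | sort)

printf 'recursive:\n%s\n' "$recursive_hits"
printf 'C files only:\n%s\n' "$c_only_hits"
```

The find form needs no shell option such as globstar, and because find passes the file names itself, it also avoids the "argument list too long" error that a very large glob expansion can trigger.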
While you should never replace (or alias) a system command with a different program, due to risk of mysterious breakage of scripts or other utilities, if you are running a text search manually or from your own scripts or programs you should consider the fastest suitable program when searching a large number of files a number of times. Ten minutes to half an hour time spent installing and familiarizing yourself with a better utility can be recovered after a few uses for the use-case you described. A webpage offering a "Feature comparison of ack, ag, git-grep, GNU grep and ripgrep" can assist you to decide which program offers the features you need.
For a combination of speed and power, indexed search engines such as Elasticsearch or Solr can be a long-term investment that pays off, but not if you want a quick and simple replacement for grep. On the other hand, both have an API which can be called from any program you write, adding powerful searches to your program. While it's possible to spawn an external program, execute a search, intercept its output and process it, calling an API is the way to go for power and performance.
While some answers suggest alternative ways to accomplish a search they don't explain why other than it's "free", "faster", "more sophisticated", "tons of features", etc. Don't try to sell it, just tell us "why your answer is right". I've attempted to teach how to choose what's best for the user, and why. This is why I offer yet another answer, when there are already so many. Otherwise I'd agree that there are already quite a few answers; I hope I've brought a lot new to the table.
In the same way that many of us now use ‘Google’ as a verb meaning ‘to find’, Unix programmers often use the word ‘grep’. ‘grep’ is a contraction of ‘global/regular expression/print’, a common sequence of operations in early Unix text editors. It is also the name of a very useful command-line program.

grep finds and prints lines in files that match a pattern. For our examples, we will use a file that contains three haiku taken from a 1998 competition in Salon magazine. For this set of examples, we’re going to be working in the writing subdirectory:

$ cd
$ cd Desktop/shell-lesson-data/exercise-data/writing
$ cat haiku.txt

The Tao that is seen
Is not the true Tao, until
You bring fresh toner.

With searching comes loss
and the presence of absence:
"My Thesis" not found.

Yesterday it worked
Today it is not working
Software is like that.
Let’s find lines that contain the word ‘not’:

$ grep not haiku.txt

Is not the true Tao, until
"My Thesis" not found.
Today it is not working

Here, not is the pattern we’re searching for. The grep command searches through the file, looking for matches to the pattern specified. To use it, type grep, then the pattern we’re searching for and finally the name of the file (or files) we’re searching in.

The output is the three lines in the file that contain the letters ‘not’.

By default, grep searches for a pattern in a case-sensitive way. In addition, the search pattern we have selected does not have to form a complete word, as we will see in the next example.

Let’s search for the pattern ‘The’:

$ grep The haiku.txt

The Tao that is seen
"My Thesis" not found.

This time, two lines that include the letters ‘The’ are output, one of which contains our search pattern within a larger word, ‘Thesis’.

To restrict matches to lines containing the word ‘The’ on its own, we can give grep the -w option. This will limit matches to word boundaries:

$ grep -w The haiku.txt

The Tao that is seen

Later in this lesson, we will also see how we can change the search behavior of grep with respect to its case sensitivity.

Note that a ‘word boundary’ includes the start and end of a line, not just letters surrounded by spaces. Sometimes we don’t want to search for a single word, but a phrase. This is also easy to do with grep by putting the phrase in quotes:

$ grep -w "is not" haiku.txt

Today it is not working

We’ve now seen that you don’t have to have quotes around single words, but it is useful to use quotes when searching for multiple words. Quotes also make it easier to distinguish between the search term or phrase and the file being searched. We will use quotes in the remaining examples.

Another useful option is -n, which numbers the lines that match:

$ grep -n "it" haiku.txt

5:With searching comes loss
9:Yesterday it worked
10:Today it is not working

Here, we can see that lines 5, 9, and 10 contain the letters ‘it’.

We can combine options (i.e. flags) as we do with other Unix commands.
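The substring, whole-word, phrase and line-number searches described above can be reproduced without the lesson data. The sketch below recreates the haiku file in a temporary location (the `haiku` variable and the temporary path are assumptions, not part of the lesson) and captures each result in a variable so the differences are easy to compare.

```shell
#!/usr/bin/env bash
# Recreate the haiku file in a temporary location (hypothetical path).
haiku=$(mktemp)
trap 'rm -f "$haiku"' EXIT
cat > "$haiku" <<'EOF'
The Tao that is seen
Is not the true Tao, until
You bring fresh toner.

With searching comes loss
and the presence of absence:
"My Thesis" not found.

Yesterday it worked
Today it is not working
Software is like that.
EOF

# Substring match: also hits 'Thesis'.
substring_hits=$(grep The "$haiku")

# Whole-word match: 'Thesis' no longer qualifies.
whole_word_hits=$(grep -w The "$haiku")

# Quoted phrase, case-sensitive: 'Is not' on line 2 does not match.
phrase_hits=$(grep -w "is not" "$haiku")

# -n prefixes each matching line with its line number.
numbered_hits=$(grep -n "it" "$haiku")

printf 'substring:\n%s\n\n' "$substring_hits"
printf 'whole word:\n%s\n\n' "$whole_word_hits"
printf 'phrase:\n%s\n\n' "$phrase_hits"
printf 'numbered:\n%s\n' "$numbered_hits"
```

Note that the match on line 5 comes from the letters ‘it’ inside ‘With’, which is exactly the substring behavior that -w switches off.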
For example, let’s find the lines that contain the word ‘the’. We can combine the option -w to find the lines that contain the word ‘the’ and -n to number the lines that match:

$ grep -n -w "the" haiku.txt

2:Is not the true Tao, until
6:and the presence of absence:

Now we want to use the option -i to make our search case-insensitive:

$ grep -n -w -i "the" haiku.txt

1:The Tao that is seen
2:Is not the true Tao, until
6:and the presence of absence:

Now, we want to use the option -v to invert our search, i.e., we want to output the lines that do not contain the word ‘the’.

$ grep -n -w -v "the" haiku.txt

1:The Tao that is seen
3:You bring fresh toner.
4:
5:With searching comes loss
7:"My Thesis" not found.
8:
9:Yesterday it worked
10:Today it is not working
11:Software is like that.

If we use the -r (recursive) option, grep can search for a pattern recursively through a set of files in subdirectories.

Let’s search recursively for Yesterday in the shell-lesson-data/exercise-data/writing directory:

$ grep -r Yesterday .

./LittleWomen.txt:"Yesterday, when Aunt was asleep and I was trying to be as still as a
./LittleWomen.txt:Yesterday at dinner, when an Austrian officer stared at us and then
./LittleWomen.txt:Yesterday was a quiet day spent in teaching, sewing, and writing in my
./haiku.txt:Yesterday it worked

grep has lots of other options. To find out what they are, we can type:

$ grep --help

Usage: grep [OPTION]... PATTERN [FILE]...
Search for PATTERN in each FILE or standard input.
PATTERN is, by default, a basic regular expression (BRE).
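As a small sketch of combining flags, the block below checks that bundling short options into one argument (`-nwi`) gives the same result as writing them separately, and counts the lines left over by the inverted search. The temporary file is an assumption, and `-c` (print a count instead of the lines) is a standard grep option that is not covered in the text above.

```shell
#!/usr/bin/env bash
# Recreate the haiku file in a temporary location (hypothetical path).
haiku=$(mktemp)
trap 'rm -f "$haiku"' EXIT
cat > "$haiku" <<'EOF'
The Tao that is seen
Is not the true Tao, until
You bring fresh toner.

With searching comes loss
and the presence of absence:
"My Thesis" not found.

Yesterday it worked
Today it is not working
Software is like that.
EOF

# Flags written one by one ...
separate_flags=$(grep -n -w -i "the" "$haiku")

# ... and the same flags bundled into a single argument.
bundled_flags=$(grep -nwi "the" "$haiku")

# -v inverts the match; -c prints how many lines were selected
# (blank lines count too, since they do not contain 'the').
inverted_count=$(grep -c -w -v "the" "$haiku")

printf 'separate:\n%s\n\n' "$separate_flags"
printf 'bundled:\n%s\n\n' "$bundled_flags"
printf 'lines without the word "the": %s\n' "$inverted_count"
```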
Example: grep -i 'hello world' menu.h main.c

Regexp selection and interpretation:
  -E, --extended-regexp     PATTERN is an extended regular expression (ERE)
  -F, --fixed-strings       PATTERN is a set of newline-separated fixed strings
  -G, --basic-regexp        PATTERN is a basic regular expression (BRE)
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -e, --regexp=PATTERN      use PATTERN for matching
  -f, --file=FILE           obtain PATTERN from FILE
  -i, --ignore-case         ignore case distinctions
  -w, --word-regexp         force PATTERN to match only whole words
  -x, --line-regexp         force PATTERN to match only whole lines
  -z, --null-data           a data line ends in 0 byte, not newline

Miscellaneous:
...        ...        ...
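A few of the options in this help text are easier to understand side by side. The following sketch tries -E, -F and -x on the haiku file; the temporary file and the chosen patterns are assumptions for illustration, and `-c` (count matching lines) is a standard option from a part of the help output that is not shown here.

```shell
#!/usr/bin/env bash
# Recreate the haiku file in a temporary location (hypothetical path).
haiku=$(mktemp)
trap 'rm -f "$haiku"' EXIT
cat > "$haiku" <<'EOF'
The Tao that is seen
Is not the true Tao, until
You bring fresh toner.

With searching comes loss
and the presence of absence:
"My Thesis" not found.

Yesterday it worked
Today it is not working
Software is like that.
EOF

# -E: extended regular expressions, so | and ( ) work unescaped.
day_lines=$(grep -E "^(Yesterday|Today)" "$haiku")

# Default (BRE): '.' matches any character, so 'T.o' matches 'Tao'.
regex_count=$(grep -c "T.o" "$haiku")

# -F: the pattern is a fixed string, so 'T.o' needs a literal dot.
fixed_count=$(grep -c -F "T.o" "$haiku")

# -x: the pattern must match the whole line, not just part of it.
whole_line=$(grep -x "Yesterday it worked" "$haiku")
partial_line=$(grep -x "Yesterday" "$haiku")

printf 'day lines:\n%s\n\n' "$day_lines"
printf 'regex count: %s, fixed-string count: %s\n' "$regex_count" "$fixed_count"
printf 'whole line: [%s], partial line: [%s]\n' "$whole_line" "$partial_line"
```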
While grep finds lines in files, the find command finds files themselves. Again, it has a lot of options; to show how the simplest ones work, we’ll use the shell-lesson-data/exercise-data directory tree shown below.

.
├── animal-counts/
│   └── animals.csv
├── creatures/
│   ├── basilisk.dat
│   ├── minotaur.dat
│   └── unicorn.dat
├── numbers.txt
├── proteins/
│   ├── cubane.pdb
│   ├── ethane.pdb
│   ├── methane.pdb
│   ├── octane.pdb
│   ├── pentane.pdb
│   └── propane.pdb
└── writing/
    ├── haiku.txt
    └── LittleWomen.txt

The exercise-data directory contains one file, numbers.txt, and four directories: animal-counts, creatures, proteins and writing, each containing various files.

For our first command, let’s run find . (remember to run this command from the shell-lesson-data/exercise-data folder).

$ find .

.
./writing
./writing/LittleWomen.txt
./writing/haiku.txt
./creatures
./creatures/basilisk.dat
./creatures/unicorn.dat
./creatures/minotaur.dat
./animal-counts
./animal-counts/animals.csv
./numbers.txt
./proteins
./proteins/ethane.pdb
./proteins/propane.pdb
./proteins/octane.pdb
./proteins/pentane.pdb
./proteins/methane.pdb
./proteins/cubane.pdb

As always, the . on its own means the current working directory, which is where we want our search to start. find’s output is the names of every file and directory under the current working directory. This can seem useless at first, but find has many options to filter the output, and in this lesson we will discover some of them.

The first option in our list is -type d, which means ‘things that are directories’. Sure enough, find’s output is the names of the five directories (including .):

$ find . -type d

.
./writing
./creatures
./animal-counts
./proteins

Notice that the objects find finds are not listed in any particular order.
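To try the -type filter without the lesson data, here is a sketch that builds a cut-down copy of the tree (only one file per directory, all empty, in a temporary location; both are assumptions) and counts what each form of find reports. The output is piped through sort because find does not guarantee any particular order.

```shell
#!/usr/bin/env bash
# Hypothetical, cut-down copy of the exercise-data tree.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
mkdir animal-counts creatures proteins writing
touch animal-counts/animals.csv creatures/unicorn.dat numbers.txt \
      proteins/cubane.pdb writing/haiku.txt writing/LittleWomen.txt

# No filter: every file and directory, including '.' itself.
everything=$(find . | sort)

# -type d keeps directories only; -type f keeps regular files only.
directories=$(find . -type d | sort)
files=$(find . -type f | sort)

printf 'all entries:\n%s\n\n' "$everything"
printf 'directories:\n%s\n\n' "$directories"
printf 'files:\n%s\n' "$files"
```

In this reduced tree the directory count is still five (including .), and the directories plus the six files account for every entry of the unfiltered listing.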
If we change -type d to -type f, we get a listing of all the files instead:

$ find . -type f

./writing/LittleWomen.txt
./writing/haiku.txt
./creatures/basilisk.dat
./creatures/unicorn.dat
./creatures/minotaur.dat
./animal-counts/animals.csv
./numbers.txt
./proteins/ethane.pdb
./proteins/propane.pdb
./proteins/octane.pdb
./proteins/pentane.pdb
./proteins/methane.pdb
./proteins/cubane.pdb

Now let’s try matching by name:

$ find . -name *.txt

./numbers.txt

We expected it to find all the text files, but it only prints out ./numbers.txt. The problem is that the shell expands wildcard characters like * before commands run. Since *.txt in the current directory expands to numbers.txt, the command we actually ran was:

$ find . -name numbers.txt

find did what we asked; we just asked for the wrong thing.

To get what we want, let’s do what we did with grep: put *.txt in quotes to prevent the shell from expanding the * wildcard. This way, find actually gets the pattern *.txt, not the expanded filename numbers.txt:

$ find . -name "*.txt"

./writing/LittleWomen.txt
./writing/haiku.txt
./numbers.txt
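The effect of quoting the pattern can be checked directly. This sketch builds a small stand-in for the lesson tree (empty files in a temporary directory, which is an assumption) and runs find with the pattern unquoted and quoted. The unquoted form is deliberate here, to show the shell expanding *.txt before find ever sees it.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the exercise-data tree (empty files).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
mkdir writing
touch numbers.txt writing/haiku.txt writing/LittleWomen.txt

# Unquoted on purpose: the shell expands *.txt against the current
# directory first, so find actually receives '-name numbers.txt'.
unquoted_hits=$(find . -name *.txt)

# Quoted: find receives the pattern itself and applies it at every level.
quoted_hits=$(find . -name "*.txt" | sort)

printf 'unquoted:\n%s\n\n' "$unquoted_hits"
printf 'quoted:\n%s\n' "$quoted_hits"
```

If the current directory held two or more .txt files, the unquoted form would expand to several words and find would typically stop with an error instead, which is another reason to quote the pattern.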
As we said earlier, the command line’s power lies in combining tools. We’ve seen how to do that with pipes; let’s look at another technique. As we just saw, find . -name "*.txt" gives us a list of all text files in or below the current directory. How can we combine that with wc -l to count the lines in all those files?

The simplest way is to put the find command inside $():

$ wc -l $(find . -name "*.txt")

 21022 ./writing/LittleWomen.txt
    11 ./writing/haiku.txt
     5 ./numbers.txt
 21038 total

When the shell executes this command, the first thing it does is run whatever is inside the $(). It then replaces the $() expression with that command’s output. Since the output of find is the three filenames ./writing/LittleWomen.txt, ./writing/haiku.txt, and ./numbers.txt, the shell constructs the command:

$ wc -l ./writing/LittleWomen.txt ./writing/haiku.txt ./numbers.txt

which is what we wanted. This expansion is exactly what the shell does when it expands wildcards like * and ?, but lets us use any command we want as our own ‘wildcard’.

It’s very common to use find and grep together. The first finds files that match a pattern; the second looks for lines inside those files that match another pattern. Here, for example, we can find txt files that contain the word “searching” by looking for the string ‘searching’ in all the .txt files in or below the current directory:

$ grep "searching" $(find . -name "*.txt")

./writing/LittleWomen.txt:sitting on the top step, affected to be searching for her book, but was
./writing/haiku.txt:With searching comes loss
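The find-plus-grep combination can also be written with find’s own -exec action instead of $(). The sketch below runs both forms on a tiny stand-in tree and compares the results. The file contents are one-line placeholders rather than the real lesson files, and the -exec variant is an alternative offered here, not something the lesson itself uses.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in tree with one-line placeholder files.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
mkdir writing
printf 'With searching comes loss\n' > writing/haiku.txt
printf 'affected to be searching for her book, but was\n' > writing/LittleWomen.txt
printf '1\n2\n3\n' > numbers.txt

# Command substitution: the shell runs find first, then splices its
# output into the grep command line as separate file name arguments.
substitution_hits=$(grep "searching" $(find . -name "*.txt") | sort)

# Same search driven by find: '{} +' appends the matching files to one
# grep invocation; -H forces the file name prefix on every line.
exec_hits=$(find . -name "*.txt" -exec grep -H "searching" {} + | sort)

printf 'via $():\n%s\n\n' "$substitution_hits"
printf 'via -exec:\n%s\n' "$exec_hits"
```

Both forms give the same lines here. The difference shows up with file names that contain spaces: the unquoted $() result is split on whitespace by the shell, whereas -exec passes each name to grep intact.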
The Unix shell is older than most of the people who use it. It has survived so long because it is one of the most productive programming environments ever created — maybe even the most productive. Its syntax may be cryptic, but people who have mastered it can experiment with different commands interactively, then use what they have learned to automate their work. Graphical user interfaces may be easier to use at first, but once learned, the productivity in the shell is unbeatable. And as Alfred North Whitehead wrote in 1911, ‘Civilization advances by extending the number of important operations which we can perform without thinking about them.’