Grep is a Unix utility that searches text for lines matching a pattern. The text can be piped to it or read from files named on the command line. An example should help clarify things.

Let's say that we wanted to search through a directory and find all the files that had the string "crontab" in their name. You might issue the 'ls' command in a shell to list the directory's contents:

$ ls
DumpSite.sh  crontab.txt  nagios-3.0.6  xmpppy  xymon-4.3.0-beta2

and look through everything manually, or you could pipe the output of 'ls' to grep:

$ ls |grep crontab
crontab.txt

Conversely, to list every entry except those that match, use the -v option:

$ ls |grep -v crontab
DumpSite.sh
nagios-3.0.6
xmpppy
xymon-4.3.0-beta2

The '|' character represents the pipe: it directs the output of the 'ls' command to grep as input. The first grep command prints a (perhaps empty) list of all the files that have "crontab" in their names; the second prints all the others.
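Because grep only sees the text arriving on its standard input, the same filtering can be tried without a real directory. The sketch below uses printf as a stand-in for 'ls', reusing the file names from the listing above; the variable names are only illustrative.

```shell
# Stand-in for the output of ls: one file name per line.
listing=$(printf '%s\n' DumpSite.sh crontab.txt nagios-3.0.6 xmpppy xymon-4.3.0-beta2)

# Keep only the lines that contain "crontab".
matching=$(printf '%s\n' "$listing" | grep crontab)

# Keep only the lines that do NOT contain "crontab".
others=$(printf '%s\n' "$listing" | grep -v crontab)

echo "$matching"
echo "$others"
```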

As a search term, grep can take a regular expression rather than a plain string. A simple example is looking for all .txt or .jpg files in a directory:

$ ls | grep '\.\(txt\|jpg\)$'

The regex is made up of \. which matches a literal dot, \(txt\|jpg\) which matches either txt or jpg, and $ which anchors the match to the end of the name. Without the $, the pattern would also match names that merely contain txt or jpg somewhere. The \| alternation in basic regular expressions is a GNU extension.
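To see what difference an end-of-line anchor makes when matching file endings, the following sketch feeds a few made-up file names (assumptions, not taken from a real directory) to grep. It uses -E so that the alternation does not depend on GNU extensions.

```shell
# Hypothetical file names, one per line.
names=$(printf '%s\n' notes.txt photo.jpg txt2pdf.sh archive.tar)

# Unanchored: matches any name that merely contains "txt" or "jpg".
loose=$(printf '%s\n' "$names" | grep -E 'txt|jpg')

# Literal dot plus $: matches only names that end in .txt or .jpg.
strict=$(printf '%s\n' "$names" | grep -E '\.(txt|jpg)$')

echo "$loose"
echo "$strict"
```

The unanchored pattern also reports txt2pdf.sh, which is probably not what was wanted.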

Options

Command-line options aka switches of grep:

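  • -e pattern: Use pattern as the search pattern. Useful for giving several patterns or a pattern that begins with a hyphen.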
  • -i: Ignore uppercase vs. lowercase.
  • -v: Invert match.
  • -c: Output count of matching lines only.
  • -l: Output only the names of files that contain a match.
  • -n: Precede each matching line with a line number.
  • -b: A historical curiosity: precede each matching line with a block number. In GNU grep, -b prints the byte offset instead.
  • -h: Output matching lines without preceding them by file names.
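  • -s: Suppress error messages about nonexistent or unreadable files. To get the exit status only, use -q.
  • -x: Match whole lines only.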
  • -f file: Take regexes from a file.
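A few of the switches above can be tried on a small sample file. The following is a minimal sketch; the file name sample.txt and its contents are made up for illustration.

```shell
# Create a small sample file in a temporary directory (contents are made up).
dir=$(mktemp -d)
printf '%s\n' 'Hello world' 'hello' 'goodbye' 'HELLO again' > "$dir/sample.txt"

count=$(grep -c -i 'hello' "$dir/sample.txt")       # -c with -i: number of matching lines, any case
numbered=$(grep -n 'hello' "$dir/sample.txt")       # -n: prefix each match with its line number
whole=$(grep -x 'hello' "$dir/sample.txt")          # -x: the whole line must match
inverted=$(grep -v -i 'hello' "$dir/sample.txt")    # -v: lines that do not match
files=$(cd "$dir" && grep -l 'goodbye' sample.txt)  # -l: names of files with a match

echo "$count"
echo "$numbered"
echo "$whole"
echo "$inverted"
echo "$files"

rm -r "$dir"
```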

Command-line options aka switches of GNU grep, beyond the bare-bones grep:

  • --help
  • -V, --version
  • --regexp=pattern, in addition to -e pattern
  • --invert-match, in addition to -v
  • --word-regexp, in addition to -w
  • --line-regexp, in addition to -x
  • -A num, --after-context=num
  • -B num, --before-context=num
  • -C num, -num, --context=num
  • and more ...
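The context options are easiest to understand on a short input. The sketch below assumes a grep that supports -A, -B and -C (GNU grep does); the log lines are invented for the example.

```shell
# Made-up log lines, one per line.
log=$(printf '%s\n' 'start' 'loading config' 'ERROR: disk full' 'retrying' 'done')

after=$(printf '%s\n' "$log" | grep -A 1 'ERROR')    # the match plus 1 line after it
before=$(printf '%s\n' "$log" | grep -B 1 'ERROR')   # the match plus 1 line before it
around=$(printf '%s\n' "$log" | grep -C 1 'ERROR')   # the match plus 1 line on each side

echo "$after"
echo "$before"
echo "$around"
```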

Regular expressions

Grep uses POSIX basic regular expressions by default (see also Regular Expressions/Posix Basic Regular Expressions). Their syntax differs from that of Perl regular expressions and from the extended regular expressions enabled by -E.

Regular expression features available in grep include *, ., ^, $, [ ], [^ ], \( \), \n (a back-reference to the n-th group, where n is a digit), \{i\}, \{i,j\}, \{i,\}.

Regular expression features available in GNU grep as a GNU extension include \?, \+, \b, \B, \<, \>, \w, \W, \s, \S.
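As an illustration of the word-boundary extensions, the sketch below compares a plain search with one wrapped in \< and \>, and with the -w switch. It assumes GNU grep (or another grep that supports these extensions); the word list is made up.

```shell
# Made-up input, one entry per line.
words=$(printf '%s\n' 'cat' 'concat' 'cat nap' 'category')

anywhere=$(printf '%s\n' "$words" | grep -c 'cat')      # counts every line containing "cat"
whole_word=$(printf '%s\n' "$words" | grep '\<cat\>')   # "cat" only as a separate word
same_with_w=$(printf '%s\n' "$words" | grep -w 'cat')   # -w gives the same result here

echo "$anywhere"
echo "$whole_word"
echo "$same_with_w"
```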

Regular expression features available in grep with -E switch include ?, +, |, ( ), {i}, {i,j}, {i,}.
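The difference between the basic and extended syntax shows up in the interval braces. A minimal sketch, using only echo and grep:

```shell
# Basic regex: interval braces need backslashes.
bre=$(echo 'aaa' | grep 'a\{3\}')

# Extended regex (-E): the same interval without backslashes.
ere=$(echo 'aaa' | grep -E 'a{3}')

# In basic regex an unescaped { is an ordinary character,
# so this pattern matches the literal text a{3}.
literal=$(echo 'a{3}' | grep 'a{3}')

echo "$bre"
echo "$ere"
echo "$literal"
```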

Predefined character classes supported by grep include [:alpha:], [:blank:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:].
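Note that a class name is used inside a bracket expression, which gives the doubled brackets seen in patterns such as [[:digit:]]. The following sketch, with made-up input lines, shows a few classes in use.

```shell
# Made-up input, one entry per line.
lines=$(printf '%s\n' 'abc' 'a1c' 'A C' '42')

# Lines containing at least one digit.
with_digit=$(printf '%s\n' "$lines" | grep '[[:digit:]]')

# Lines consisting of digits only (-x matches whole lines).
only_digits=$(printf '%s\n' "$lines" | grep -x '[[:digit:]][[:digit:]]*')

# An uppercase letter followed by whitespace.
upper_space=$(printf '%s\n' "$lines" | grep '[[:upper:]][[:space:]]')

echo "$with_digit"
echo "$only_digits"
echo "$upper_space"
```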

Regular expression features unavailable in grep without the -P switch include Perl's \d, \D, \A and \Z.

Examples

Examples of grep use:

  • echo file.txt | grep ".*\(txt\|doc\)"
    • Matches. "\(" and "\)" create a group, while "\|" separates items in the group. The group matches if at least one of its items matches.
  • echo a456 | grep "[a-zA-Z][0-9][0-9]*"
    • Matches. "[" and "]" delimit a bracket expression, which matches any one of the listed characters. "*" stands for zero or more occurrences of the previous item.
  • echo a456 | grep -i "[A-Z][0-9]\+"
    • Matches. "\+" stands for one or more occurrences of the previous. Unlike "*", "+" has to be preceded by "\". "-i" makes the search case-insensitive.
  • echo file.txt | grep -E ".*(txt|doc)"
    • Matches. "-E" stands for extended regular expressions. In extended regex, "(" and "|" do not need "\" to act as special characters; they need "\" to act as literals, that is, stand for themselves.
  • echo abbc | grep -E "abb?c"
    • In extended regular expressions enabled by -E switch, the question mark matches zero or one occurrences of the previous.
  • echo abbc | grep "abb\?c"
    • In GNU Grep, \? (question mark preceded by a backslash) matches zero or one occurrences of the previous.
  • echo a4c | grep -P "a\dc"
    • In some versions of GNU grep, matches. "-P" stands for Perl regular expressions; "\d" in the regex stands for a digit.
  • grep -P "\x22hello\x22" file.txt
    • In some versions of GNU grep, searches for "hello" enclosed in quotation marks. Makes use of "-P", which turns on Perl regex. In Perl regex, "\x22" stands for a quotation mark, being the character with the hexadecimal ASCII value 22.
  • grep -P "a\t+b" file.txt
    • In some versions of GNU grep, searches for "a", followed by one or more tab characters, followed by "b". "\t" stands for the tab character (tabulator) and is enabled by -P.
  • grep -Fxv -f file2.txt file1.txt
    • Outputs set difference: file1.txt - file2.txt. Uses -F to interpret search term literally aka non-regex, -x to match whole lines only, -v to invert match, and -f to take the search terms from a file.
  • grep -Fx -f file1.txt file2.txt
    • Outputs set intersection: those lines of file2.txt that are also in file1.txt.
  • perl -ne "print if /\x22hello\x22/" file.txt
    • Not really a grep example but a Perl one-liner that you can use if Perl is available and grep is not.
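The two set operations from the list above can be checked on small inputs. This is a sketch only: the file names match those used in the examples, but the contents are made up and the files are created in a temporary directory.

```shell
# Two small files with made-up contents.
dir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$dir/file1.txt"
printf '%s\n' banana cherry date  > "$dir/file2.txt"

# Set difference: lines of file1.txt that do not appear in file2.txt.
difference=$(grep -Fxv -f "$dir/file2.txt" "$dir/file1.txt")

# Set intersection: lines of file2.txt that also appear in file1.txt.
intersection=$(grep -Fx -f "$dir/file1.txt" "$dir/file2.txt")

echo "$difference"
echo "$intersection"

rm -r "$dir"
```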

Versions

Old versions of GNU grep can be obtained from the GNU FTP server.

Release announcements of GNU grep are published by its Savannah group.

A changelog of GNU grep is available from git.savannah.gnu.org.

A version of GNU grep for MS Windows is available from the GnuWin32 project, as well as from Cygwin.

Last modified on 26 December 2013, at 11:38