Sed

sed ("stream editor") is a Unix utility for parsing and transforming text, with ports available for a variety of operating systems. For many purposes it has been superseded by Perl (or the earlier AWK), but it remains useful for simple transformations in shell scripts.

Sed is line-oriented – it operates one line at a time – and allows regular expression matching and substitution.

Simple use

The most commonly used feature of sed is the s command (“substitution”, or “the s/// construction”), which replaces one pattern with another; it originates in the earlier ed and is also available in Perl.

Simply:

sed s/cat/dog/g in > out

will replace “cat” by “dog” in file in and output it to file out; the “g” means “global”: replace all matches, not just the first on a given line.
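
The effect of the “g” flag is easiest to see on a line that contains the pattern twice. The following is a minimal sketch; the sample text and the variable names are illustrative only.

```shell
# Without g, only the first match on each line is replaced.
first_only=$(echo "cat cat" | sed s/cat/dog/)
# With g, every match on the line is replaced.
all_matches=$(echo "cat cat" | sed s/cat/dog/g)
echo "$first_only"    # dog cat
echo "$all_matches"   # dog dog
```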

One will often wish to use single quotes (' ') to surround the pattern to avoid the shell misinterpreting it:

sed 's/cat/dog/g' in > out

Some implementations require the expression to be preceded by -e, and -e is in any case the usual way to supply several expressions:

sed -e 's/cat/dog/g' -e 's/meow/woof/g' in > out

Sed can also operate as a pipe, taking in standard input and sending to standard output.
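
As a small illustration of pipe usage (the input text and the variable name are made up for the example), sed reads standard input when no file is named:

```shell
# No input file is given, so sed filters whatever arrives on standard input.
# The two -e expressions are applied in order to each line.
piped=$(printf 'the cat says meow\n' | sed -e 's/cat/dog/g' -e 's/meow/woof/g')
echo "$piped"   # the dog says woof
```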

Complex patterns

For complex patterns, one will likely wish to use the -r (GNU sed) or -E (BSD sed, also accepted by recent versions of GNU sed) switch to enable “extended regular expressions”, as sed’s default basic regular expressions can be awkward to use, particularly because grouping parentheses must be escaped as “\(” and “\)”.

Especially useful is grouping, using (…) to indicate a group in the pattern to match, and using \1, \2, …, \9 to refer to that numbered group in the substitution pattern. For example,

sed -r 's/<(.*)>/<\1><\/\1>/g'

replaces <a> with <a></a>. This allows simple field parsing and processing, such as reordering of multiple groups of patterns.
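
Reordering with groups can be sketched as follows. The "Last, First" input and the variable names are assumptions chosen for illustration; -E is used so the same command should also run on BSD sed.

```shell
# Swap "Last, First" into "First Last" using two numbered groups.
swapped=$(echo "Doe, John" | sed -E 's/^([^,]*), (.*)$/\2 \1/')
echo "$swapped"   # John Doe

# Because .* is greedy, <(.*)> captures everything between the first "<"
# and the last ">" on the line, not just a single tag.
greedy=$(echo "<a><b>" | sed -E 's/<(.*)>/[\1]/')
echo "$greedy"    # [a><b]
```

The second command shows why a pattern like <(.*)> is only safe when each line holds a single tag.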

Programming

Beyond use of the s command, one can develop complex programs in sed.
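
As one small example of a sed program that goes beyond s, the following commonly cited idiom prints its input in reverse line order using the hold space. It is shown here as a sketch on made-up three-line input.

```shell
# 1!G  on every line except the first, append the hold space to the pattern space
# h    copy the pattern space into the hold space
# $p   on the last line, print the accumulated text (-n suppresses other output)
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')
echo "$reversed"
```

Note that this keeps the whole input in memory, so it is suited to small inputs only.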

Line-oriented

Sed is line-oriented – it operates one line at a time, stripping the trailing newline. To operate on multiple lines, one must use more complicated constructions, namely the N command (append the next line to the buffer, known as the pattern space), or H followed by g. See Sed FAQ, Section 5.10.
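
A minimal sketch of the N command, joining each pair of input lines into one (the four-line input is invented for the example):

```shell
# N appends the next input line to the pattern space, separated by a newline;
# the substitution then replaces that embedded newline with a space.
paired=$(printf 'a\nb\nc\nd\n' | sed 'N;s/\n/ /')
echo "$paired"
```

This prints "a b" and "c d" on separate lines. With an odd number of input lines, implementations may differ in whether the final unpaired line is printed, so the example uses an even count.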

Concatenating lines

For the simple task of concatenating all lines in a file, the easiest approach is to use the tr utility:

tr '\n' ' '

meaning “replace newlines by spaces”. Note that sed implementations other than GNU sed have limits on buffer size, and any method of concatenating an entire file into one line in sed has to hold the entire file in memory; tr instead processes the input as a stream from start to finish, and hence has no such memory problems.

Applied to an input file and an output file, this can be written as:

tr '\n' ' ' < in > out
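
For comparison, here is a sketch of the sed approach using N in a loop, next to the tr version. The input is a made-up three-line sample, and the sed command holds the whole input in memory as described above.

```shell
# :a        define a label named a
# N         append the next line to the pattern space
# $!ba      unless this is the last line, branch back to label a
# s/\n/ /g  replace every embedded newline with a space
joined=$(printf 'a\nb\nc\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g')
echo "$joined"   # a b c

# tr translates every newline, including the final one, so its output
# ends with a space instead of a newline; the "|" makes that visible.
tr_out=$(printf 'a\nb\nc\n' | tr '\n' ' '; echo "|")
echo "$tr_out"   # a b c |
```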

Related programs

grep and tr are useful complements – the first selects lines, the second applies single-character translation. For instance, you might use grep to select certain lines, then pipe through sed to parse said lines.
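
A sketch of the grep-then-sed pattern described above, using hypothetical log lines:

```shell
# grep keeps only the ERROR lines; sed then strips the prefix from them.
errors=$(printf 'INFO: started\nERROR: disk full\nERROR: timeout\n' \
  | grep '^ERROR' \
  | sed 's/^ERROR: //')
echo "$errors"

# The same result using sed alone: -n suppresses automatic printing and the
# p flag prints only the lines on which the substitution succeeded.
errors_sed_only=$(printf 'INFO: started\nERROR: disk full\nERROR: timeout\n' \
  | sed -n 's/^ERROR: //p')
echo "$errors_sed_only"
```

Both commands print "disk full" and "timeout" on separate lines.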

AWK and, especially, Perl are more powerful languages in the same vein, and can be used as alternatives if the desired task is awkward to do in sed.

Options

Command-line options for the POSIX standard sed:

-n: Only produce output via the p command.
-e script
-f script-file
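
The three POSIX options can be sketched as follows. The input lines, the variable names and the temporary script file are assumptions made for the example.

```shell
# Without -n every line is printed once automatically, so 2p prints line 2 twice.
default_out=$(printf 'one\ntwo\nthree\n' | sed -e '2p')
# With -n automatic printing is suppressed, so only line 2 appears.
quiet_out=$(printf 'one\ntwo\nthree\n' | sed -n -e '2p')

# -f reads the script from a file, one command per line.
script_file=$(mktemp)
printf 's/one/1/\ns/two/2/\n' > "$script_file"
from_file=$(printf 'one\ntwo\nthree\n' | sed -f "$script_file")
rm -f "$script_file"

echo "$quiet_out"   # two
```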

Command-line options of GNU sed, beyond the POSIX standard sed:

--version
--help
--quiet and --silent, in addition to -n
--expression=script
--file=script-file
-i[SUFFIX], --in-place[=SUFFIX]
-l N, --line-length=N
--posix
-b, --binary
--follow-symlinks
-r, --regexp-extended
-s, --separate
-u, --unbuffered

Command-line options of BSD sed (installed by default on macOS), beyond the POSIX standard sed:

-E: Use extended regular expressions
-a
-i extension
-l
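
The -i option is where GNU sed and BSD sed differ most visibly: GNU sed takes an optional suffix attached to the option, while BSD sed expects an extension argument. The sketch below attaches the suffix directly to -i, a form that, to the best of our knowledge, both implementations accept. The scratch file and variable names are made up for the example.

```shell
# Work on a scratch file so that nothing real is modified.
work_file=$(mktemp)
printf 'cat\n' > "$work_file"

# Edit the file in place; the original content is kept in a backup whose
# name is the file name with .bak appended.
sed -i.bak 's/cat/dog/' "$work_file"

edited=$(cat "$work_file")
backup=$(cat "$work_file.bak")
echo "$edited $backup"   # dog cat
rm -f "$work_file" "$work_file.bak"
```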

Links:

2 Invocation in sed manual, gnu.org
Unix sed(1) manual page at man.cat-v.org

Regular expressions

By default, sed uses POSIX basic regular expressions (see also Regular Expressions/Posix Basic Regular Expressions). This is the same dialect as plain grep, but it differs from the extended syntax of grep -E and from Perl regular expressions.

Regular expression features available in sed include *, ., ^, $, [ ], [^ ], \( \), \n, \{i\}, \{i,j\}, and \{i,\}.

Regular expression features available in GNU sed as GNU extensions include \+, \?, \|, \b, \B, \w, \W, \s, \S, \`, \', \<, \>. Further included are \a (bel character), \f (form feed), \n (newline), \r (carriage return), \t (horizontal tab), \v (vertical tab), \cx (Control-x), \dxxx (character by decimal ASCII value), \oxxx (character by octal ASCII value), \xhh (character by hexadecimal ASCII value).

Predefined character classes supported by sed include [:alpha:], [:blank:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:].

Regular expression features unavailable in sed include Perl metacharacters \d, \D, \A and \Z.
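
Where the Perl metacharacters are missing, the basic features listed above usually serve as substitutes. The following sketch uses invented inputs and variable names.

```shell
# [0-9] (or [[:digit:]]) stands in for Perl's \d.
digits=$(echo "room 42" | sed 's/[0-9][0-9]*/N/')
echo "$digits"     # room N

# \{4\} repeats the preceding item exactly four times.
year=$(echo "id 2015x" | sed 's/[[:digit:]]\{4\}/YYYY/')
echo "$year"       # id YYYYx

# ^ anchors to the start of the line, which covers the usual role of \A
# when input is processed one line at a time.
anchored=$(echo "aaa" | sed 's/^a/b/')
echo "$anchored"   # baa
```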

Links:

Regular Expressions in sed manual, gnu.org
3.9 GNU Extensions for Escapes in Regular Expressions, in sed manual, gnu.org

Oneliner examples

Oneliner examples of substitution:

sed "s/concieve/conceive/" myfile.txt

Replaces the first occurrence of "concieve" on each line.

sed "s/concieve/conceive/g" myfile.txt

Replaces all occurrences, because of "g" at the end.

sed "s/concieve/conceive/g;s/recieve/receive/g" myfile.txt

Does two replacements.

$ echo "abccbd" | sed "s/a\([bc]*\)d/\1/g"
bccb

Uses \( and \) to mark a group and \1 to refer to the group in the replacement part.
These are part of POSIX basic regular expressions, so this should work with any POSIX-conforming sed, not only GNU sed.

$ echo "abccbd" | sed -r "s/a([bc]*)d/\1/g"
bccb

In GNU sed, this does the same thing as the previous example; using -r to switch on extended regular expressions removes the need to place a backslash before "(" and ")" to indicate grouping.
The -r switch is available in GNU sed, and unavailable in the original Unix sed.

$ echo "a  b" | sed -r "s/a\s*b/ab/g"
ab

In GNU sed, uses "\s" to denote whitespace, and "*" to let the preceding item be repeated any number of times. GNU sed also accepts both in basic regular expressions, so -r is not strictly needed for this particular pattern.

sed "s/\x22/'/g" myfile.txt

In GNU sed, replaces each quotation mark with a single quote. \x22 refers to the character whose hexadecimal ASCII value is 22, which is the quotation mark.

$ echo Hallo | sed "s/hallo/hello/gi"
hello

Ignores character case, because of the "i" flag at the end; this flag is not part of POSIX sed. Does not preserve capitalization, outputting "hello" rather than "Hello".
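
If capitalization has to be preserved, one simple option is to handle each spelling explicitly instead of ignoring case. This is a sketch on made-up input, not the only way to do it.

```shell
# One expression per capitalization; no case-insensitive flag is needed.
preserved=$(printf 'Hallo\nhallo\n' | sed -e 's/Hallo/Hello/g' -e 's/hallo/hello/g')
echo "$preserved"
```

This prints "Hello" and "hello" on separate lines.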

$ echo a2 | sed "s/[[:alpha:]]/z/g"
z2

Uses [[:alpha:]], which stands for any letter. Notice that the character class is listed as "[:alpha:]" in manuals, with a single "["; it has to appear inside a bracket expression, hence the doubled brackets.

Concrete examples

To convert a date format to another:

$ echo "03/11/2015 23:54:03" | sed -r "s/([0-9]+)\/([0-9]+)\/([0-9]+)/\3-\2-\1/g"
2015-11-03 23:54:03
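
When the pattern itself contains slashes, the s command accepts another delimiter character, which avoids the backslash escapes. The sketch below repeats the date conversion with "#" as the delimiter and with -E in place of -r; the variable name is illustrative.

```shell
# Same conversion as above, with # as the delimiter so that / needs no escaping.
iso=$(echo "03/11/2015 23:54:03" | sed -E 's#([0-9]+)/([0-9]+)/([0-9]+)#\3-\2-\1#g')
echo "$iso"   # 2015-11-03 23:54:03
```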

Limitations

Sed does not support non-greedy matches such as the ".*?" expression. In Unix shell scripting, you can use a Perl one-liner to emulate sed with non-greedy matches:

$ echo "abcbbc" | perl -pe "s/a.*?c/ac/"
acbbc
$ echo "abcbbc" | perl -pe "s/a.*c/ac/" # without the non-greedy "?"
ac
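
Within sed itself, a common workaround when the match should stop at a single known character is a negated bracket expression. The sketch below reuses the input from the Perl example; the variable names are illustrative.

```shell
# [^c]* cannot run past the first "c", so the match ends where a.*?c would.
shortest=$(echo "abcbbc" | sed 's/a[^c]*c/ac/')
echo "$shortest"   # acbbc

# For comparison, the greedy .* consumes everything up to the last "c".
greedy=$(echo "abcbbc" | sed 's/a.*c/ac/')
echo "$greedy"     # ac
```

This only works when the terminator is a single character; multi-character terminators are usually easier to handle in Perl.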

External links

GNU sed user's manual as one page at gnu.org
sed(1) OS X Manual Page at developer.apple.com
Unix sed(1) manual page at man.cat-v.org
Sed FAQ at sed.sourceforge.net
Sed by example, Part 1 at ibm.com
Wikipedia:sed