Emacs

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Emacs

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

Emacs philosophy

Designers of UNIX tools often talk about the UNIX philosophy: "A tool should do one thing and one thing well". UNIX developers love their succinct, powerful, command-line tools that they can string together in myriad ways.

And then there's Emacs, a tool which, despite its laudable UNIX roots, has been mocked for its sprawling nature: the sarcastic vi user sitting next to you may well come up with hackneyed lines about "Eight Megabytes And Constantly Swapping", or about your use of an operating system with a mediocre editor implemented on top of it.

And yet, I would argue that Emacs does indeed represent the UNIX philosophy quite well.

The missing maxim: extensibility

I would argue that the early UNIX philosophers forgot to add something to their guide. Something that is so implicitly true for command line tools that it's easy to miss, but becomes a lot more visible when you use things that are interactive.

What they should have said is: "Do one thing well... and also be Turing complete"

What do I mean, you say? How can a tool both do only one thing and also do everything? Well, it's all well and good doing one thing well, but your actions don't exist in isolation: your tool has to interact with other tools to achieve your overall task.

At the command line this is easy. If you want your grep command to send mail, you just pipe its output into mail, like so:

grep root /etc/shadow | mail dave@haxxor.net

But once you start doing things interactively, things become difficult. Input and output cease to be simple streams of characters and become a complicated interleaving of events.

In this landscape the analogue of the pipe becomes the application scripting language. And to truly satisfy the UNIX philosophy you have to be highly extensible. And so we meet:

EMACS: the Turing-complete text editor

And this is how we are to understand emacs. A text editor that tries to be as simple as possible in the UNIX philosophy, but also highly extensible.

Now extensibility is hardly something that only emacs offers. Many tools have plugin frameworks and scripting. But emacs is slightly different:

Its scripting language, emacs lisp, is simple to use, with many forms of interaction with the editor implemented as language-level features
As many features as possible are implemented using this extensibility language
Numerous features are provided to help you interact with emacs lisp, and moreover normal day-to-day activities encourage you to interact with emacs lisp.

Emacs is an editor with extensible scripting at its core, not merely provided as an afterthought that only plugin writers understand.

Why use Emacs

The editor that will never die

IDEs that you use may have a limited life span due to the death of the company that makes the tool, or of the language it is designed for. Even upgrades to the IDE you use may not be to your liking. Emacs is far less likely to suffer this problem:

Since emacs is Free software, it is harder to kill
It has already been around for a long time
It has many committed users, many of whom are "locked in" by their customisation
Many of these users are programmers who are used to making Free software contributions
Emacs is extensible, so any hot new feature can be implemented for it as an extension.

How to Use Emacs

The learning debt model

Many emacs tutorials start with a long list of useful commands and keybindings for you to remember. Useful as this is, it can be slightly overwhelming, and distracting. Moreover, other tutorials probably already do a reasonable job.

An alternative way of learning things is to go into a form of "learning debt": learning the minimal amount required to easily find out the things you need to know when you want to do them, and then *playing* with the tools you want to learn. At the very least, this way of doing things can be very compelling, and a complete tutorial is often rendered far more interesting once you have suffered a little.

A mental model of emacs

We shall start off with a mental model of how emacs works, befitting its extensible nature. This model, together with some of emacs's documentation commands, can help you start using it without having to learn that much.

Most of the work you do in emacs is via a *buffer*, an interactively editable representation of a file.
Everything you do in emacs is a function call, even things like pressing individual keys.
There is a keymap that maps (or binds) keys, or sequences of keys you press, to function calls.
Some of these function calls might interactively ask for input. This is done by popping up a prompt at the bottom of the screen, in an area called the minibuffer.
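
The keymap part of this model can be poked at from Emacs Lisp itself. The following is a minimal sketch (the syntax is covered in the Introduction to Emacs Lisp later in this book). The name my-demo-map is hypothetical and invented for this illustration; lookup-key, define-key, kbd and make-sparse-keymap are standard Emacs functions.

```elisp
;; Ask the global keymap which function a key sequence calls.
(lookup-key global-map (kbd "C-x C-f"))   ; => find-file in a default setup

;; Build a small keymap of our own (hypothetical, for illustration only)
;; and bind a key sequence to an existing function.
(defvar my-demo-map (make-sparse-keymap)
  "Keymap used only for this illustration.")
(define-key my-demo-map (kbd "C-c d") #'forward-word)

(lookup-key my-demo-map (kbd "C-c d"))    ; => forward-word
```

Nothing here changes your real key bindings, since my-demo-map is never installed as an active keymap.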

Finding out how to do things in emacs

Emacs is designed to be self-documenting. Emacs's fairly consistent model, combined with functions for finding out about the internal state of emacs makes it quite easy to find out what emacs can do and how it does it.

Looking up functions

Everything you do in emacs is via a function call. These functions tend to have descriptive names, and good documentation. So by searching for functions by name you can find out how to do something.

You can search for functions with the `describe-function` command. By default this command is bound to *C-h f*. That is, you hold down the control key and press *h*, then release the control key and press *f*.

This pops up a prompt at the bottom of the screen asking you for a function name. At that time you can press *<TAB>* to get a list of functions starting with what you've typed so far.

Exercise

Example

Let's suppose you want to work out how to move the cursor left. If you press "C-h f l e f t <TAB> <TAB>" you will see that there are two functions that start with *left*: *left-char* and *left-word*. If you finish typing "left-char" and press "<ENTER>", Emacs will show you the documentation for the *left-char* command.

This will tell you that

The left-char command moves the cursor to the left.
Unsurprisingly, it is bound to the left key.

Look up some functions

Look up some function definitions, and call some functions. Use the "C-h f" key-binding to look up as many different functions as you can think of until you get bored. See if you can recognise patterns in function names, or in the keybindings they have. In particular you are probably interested in how to open files, open buffers, change buffers, close buffers and exit emacs.

Looking up keys

Looking up functions can be a good way of finding out how to do things, but it can be problematic if you can't seem to find the name of the function you are looking for. Often what you are after are useful functions together with their keybindings. One suspects that many readers have already found their own list from among the various tutorials, cheat sheets and manuals for emacs that exist on the internet.

Emacs itself can be made to give you such a list. One such useful list is all the functions that are currently bound to some key. You can get at this using the describe-bindings function, bound to "C-h b".

Another plausible approach to learning how to use a programming tool is to press keys at random until something interesting happens, and then look up what exactly the key press did. You can use the describe-key function bound to "C-h k" to do this.
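
The same lookups can also be made from Lisp rather than from the keyboard. As a hedged sketch, assuming a default configuration in which the help keys have not been rebound, key-binding reports the function a key sequence currently calls, and where-is-internal goes the other way, from a function to the keys that call it:

```elisp
;; From key sequence to function.
(key-binding (kbd "C-h k"))   ; => describe-key
(key-binding (kbd "C-h f"))   ; => describe-function

;; From function to key sequences, converted to readable strings.
;; The exact list depends on your setup, so no result is shown here.
(mapcar #'key-description (where-is-internal 'describe-key))
```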

Alternative sources of documentation

Once you have started playing with emacs and built up a sufficient store of frustration, you might want to consult some reference or tutorial documentation. The function help-with-tutorial, bound to "C-h t", is one place to start.

Introduction to Emacs Lisp

Emacs Lisp is a programming language that belongs to the Lisp family of languages (which includes Scheme and Common Lisp). Lisp is the second-oldest programming language still in modern use (after Fortran), but although the Lisp community remains active it is now very small. As a result, most developers never have cause to learn Lisp, and a great many developers who use Emacs consider Emacs Lisp alien territory.

Although Emacs Lisp is not a mainstream programming language, it is a powerful language that is used to implement most of Emacs itself. This means that as an Emacs extension author you are using the same language, with access to all the same libraries, as was originally used to write most of the editor. This makes Emacs unusually powerful: although other editors such as Eclipse accept extensions that modify their behaviour, few allow that behaviour to be tweaked at run time in the same language the editor itself is written in.

Simple examples of Emacs Lisp

Emacs Lisp is such a fundamental part of the Emacs way of thinking that it doesn't make you go somewhere special to execute Lisp expressions; you can do it from any buffer. Right in the middle of writing a C function you can just drop some Emacs Lisp in there, execute it, and see a result right away.

Go into any Emacs buffer and type the following:

(+ 1 2)

Position the cursor after the close parenthesis and type C-x C-e (control and x, followed by control and e). The expression will be evaluated and the result will be displayed in the minibuffer. This expression simply adds the values 1 and 2, so the resultant value 3 should appear in the minibuffer.

Though the calculation is simple, the way it's expressed might catch some people out. Expressions in Emacs Lisp always take the same form: Open parenthesis, a function identifier, a list of arguments to that function, and finally a close parenthesis. All the expression above means is that we're calling the + function (addition) and giving it the arguments of 1 and 2.

This way of writing operations is known as Polish notation (also called Polish prefix notation or simply prefix notation). Seeing addition written this way seems unnatural to most programmers at first, since they are used to seeing mathematical expressions written with the operator in the middle, so-called infix notation. However, most programmers are also familiar with calling functions, where the function name appears before the arguments. Most programming languages make a distinction between operators, which use infix notation, and functions, which use prefix notation. In languages of the Lisp family there is no such distinction, and all calls use prefix notation. Though this takes some getting used to, it provides the benefit that all code follows the same structure, which makes it much easier for Lisp code to read and write Lisp code, thereby making the language much better at introspection.

In case this bothers you, it's worth reminding yourself just how arbitrary the dividing line between functions and operators can be. Particularly in object-oriented languages that allow operators to be re-defined or specialised for classes, they are really just syntactic sugar to function calls anyway.

There can be any number of arguments in a function call (assuming the function supports it, which most functions will do if it is meaningful to do so):

(+ 1 2 3 4)

The arguments to a function can themselves be function calls. For example, the following expression adds the first three square numbers:

(+ (* 1 1) (* 2 2) (* 3 3))

The first element after the open parenthesis must always be a function identifier.

The parentheses are an integral part of the expression; if you remove the parentheses from (+ 1 2) there is no single Lisp expression there (to be precise, there are three separate and unconnected Lisp expressions, and no function evaluation). Extra parentheses aren't harmless, as they are in some languages. Consider the expression:

((+ 1 2 3))

Emacs Lisp treats the first element of the outer list, (+ 1 2 3), as the function to call. It does not first reduce that element to 6; it simply finds that a list beginning with + is not a valid function, so an invalid-function error is thrown. You can regard the parentheses as being like the parentheses on a function call in C, rather than like simple arithmetic parentheses for modifying operator precedence.

In fact, one of the benefits of the Lisp operator notation is that there is never any ambiguity in how the operators will be evaluated, so there is never any need for additional parentheses to disambiguate. For example, using infix notation, an expression like 5 * 4 + 3 is possible, which requires the reader to know precedence rules in order to know how it will be evaluated ((5 * 4) + 3 or 5 * (4 + 3)). In Lisp languages, the equivalent expression will be written (+ (* 5 4) 3), so there is never any ambiguity.
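
Both readings of the infix expression can be written out explicitly and evaluated (for example with C-x C-e), which makes the difference concrete:

```elisp
(+ (* 5 4) 3)   ; multiply first, then add: 23
(* 5 (+ 4 3))   ; add first, then multiply: 35
```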

Interacting with Emacs Lisp

Although you can execute Emacs Lisp statements from any buffer in Emacs, it's most convenient to set aside a buffer for the purpose, to avoid messing up buffers that have important work in them. When you start Emacs it creates a special buffer for you, *scratch*, which isn't associated with a file (unless you decide to save its content later). This makes a good choice for writing and executing Emacs Lisp statements.

The *scratch* buffer has another advantage for Lisp execution, which is that, by default, it starts in Lisp interaction mode. In this mode you can execute any Emacs Lisp expression with C-j and the result will be inserted permanently into the buffer, rather than appearing temporarily in the minibuffer. You can put any buffer into Lisp interaction mode with M-x lisp-interaction-mode.

Defining functions

As you would expect, you can define your own functions in Emacs Lisp, and they will work the same way as built-in functions. You define a function with a defun expression:

(defun my-add (x y)
   (+ x y))

Try typing this into emacs and evaluating the expression. Emacs should reply with:

my-add

The return value of defining a function is just the function's name, my-add in this case. This illustrates an important point: every Lisp expression has a value. This is like the return value from a function in C, except that you don't have to explicitly return anything. Emacs will simply take the last expression in the function body and treat it as the return value. There are no void functions in Emacs Lisp, although the caller is free to ignore any uninteresting return values.
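
As a small sketch of the "last expression is the return value" rule, here is a function with two expressions in its body. The name my-describe-sum is hypothetical; message and format are standard Emacs functions.

```elisp
(defun my-describe-sum (x y)
  "Hypothetical example: return a string describing the sum of X and Y."
  (message "adding %s and %s" x y)        ; run for its side effect; its value is discarded
  (format "%d + %d = %d" x y (+ x y)))    ; last expression: its value is returned

(my-describe-sum 1 2)   ; => "1 + 2 = 3"
```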

Having defined a new function, you can now use the function in the same way as the built-in functions:

(my-add 1 2)

This gives the expected result 3. Note that you can't pass an arbitrary number of parameters to this addition function, as you can with the built-in function +. Don't worry, it's perfectly possible to create a function that can work in this way, and we'll learn how later.
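
As a brief preview, and only as a sketch of one possible approach, the &rest keyword in a parameter list collects all remaining arguments into a list, which apply can then hand on to +. The name my-add-all is hypothetical.

```elisp
(defun my-add-all (&rest numbers)
  "Hypothetical example: add up any number of NUMBERS."
  (apply #'+ numbers))   ; call + with the collected arguments

(my-add-all 1 2)         ; => 3
(my-add-all 1 2 3 4)     ; => 10
(my-add-all)             ; => 0, just like (+) with no arguments
```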

Code and Data

The simplest data structure in any Lisp is the list; indeed, lists give the Lisp programming language its name, which is a shortening of List Processing. The name is perhaps a misnomer, since Lisp can be used for far richer data structures than just lists, but the ubiquitous list provides a natural starting point for exploring Lisp programming.

Type the following into a Lisp evaluation buffer and execute it (C-j if you're in lisp interaction mode, C-x C-e otherwise):

(list 1 2 3)

Emacs will reply with:

(1 2 3)

Lists can contain any number of items (or zero items), and the items need not all have the same type. In fact, lists can contain other lists as entries, giving a nested data structure:

(list 1 2 "buckle my shoe" (list 3 4))

If you evaluate this, Emacs will reply with:

(1 2 "buckle my shoe" (3 4))
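
A few standard functions for inspecting lists help to show how such a structure is put together. This is a short sketch using the same example list:

```elisp
(length (list 1 2 "buckle my shoe" (list 3 4)))   ; => 4, the inner list counts as one item
(nth 2 (list 1 2 "buckle my shoe" (list 3 4)))    ; => "buckle my shoe" (counting starts at 0)
(car (list 1 2 3))                                ; => 1, the first item
(cdr (list 1 2 3))                                ; => (2 3), everything after the first item
(list)                                            ; => nil, the empty list
```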

When you enter an expression, Emacs evaluates it and returns the result to you. The expression (list 1 2 3) evaluates the list function with three arguments. The result of calling the list function is a list, which is returned to you: (1 2 3). So if (1 2 3) is correct syntax for a list, why must you call the list function, rather than typing the list in directly? Try executing the following Lisp expression in Emacs:

(1 2 3)

Emacs will respond with an error message [1]:

Lisp error: (invalid-function 1)

The problem here is that by entering the list and asking Emacs to evaluate it, you're asking for it to be treated as Lisp code, not as data. Lisp code consists of one or more lists of items, where the first item in each list is required to be a function identifier, and the remainder of the items in the list are arguments to the function (which can themselves be expressions to be evaluated). Since you've attempted to evaluate the expression (1 2 3), Emacs assumes that 1 must be a function identifier, and when it fails to find such a function it throws an error. The same problem doesn't happen with (list 1 2 3), since list is a function; this is code, not data.

There's an important point in the above paragraph that bears being repeated for emphasis: Lisp code is simply Lisp data, and all Lisp data can be treated as code. This is perhaps the key thing that differentiates Lisp from all other mainstream programming languages. It makes it relatively easy for Lisp code to generate further Lisp code at run time. This one simple design decision exponentially increases the richness of the language at a stroke, since higher-order functionality that would require special language support in other languages can simply be written using ordinary Lisp code.

If the only difference between Lisp code and data is that code is evaluated and data is not, then we must have some way of communicating to Lisp whether or not some particular data is intended to be evaluated. As we saw above, one way to do this is with a trivial function like list, which just returns its arguments. This is one way round the problem, but it isn't an elegant one. By using list, we haven't succeeded in preventing evaluation from taking place, we've just written an expression where the evaluation is trivial. More seriously, with nested lists you have to modify each level of nested list (such as (list 1 2 (list 3 4))) rather than just the outer one.

Since the default behaviour of Lisp is to evaluate lists, all we need in order to control evaluation is some way of preventing evaluation. This is done with the quote operator, which is written like any other Lisp function call but causes its argument not to be evaluated. quote takes only one argument, but the argument can be a list, which of course includes nested sub-lists. Try evaluating the following:

(quote (1 2))
(quote (1 (2 3) 4))
(quote 1 2)

The third form will throw an error since quote only allows one argument. The second form shows that the sub-expression (2 3) isn't evaluated, even though normal evaluation works by evaluating each of the arguments in turn.

The behaviour of the nested expression shows something important: quote isn't an ordinary Lisp function. If you tried to implement quote yourself you'd find it impossible, since the arguments to quote would be evaluated before your custom function was even called. Since the error (evaluating (1 2) in the above example) would be thrown before your function would be called, there's no way for your code to recover from it.

Instead of being a Lisp function, quote is one of a small number of special forms. A special form has the same syntax as a Lisp function, but has special behaviour provided by the Lisp interpreter that an ordinary function couldn't provide.
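
The difference between evaluating and quoting the same list can be seen side by side. The sketch below also uses special-form-p, a standard Emacs function that reports whether something is a special form:

```elisp
(+ 1 2)                          ; evaluated as code: 3
(quote (+ 1 2))                  ; left alone as data: (+ 1 2)

(special-form-p (quote quote))   ; non-nil: quote is a special form
(special-form-p (quote list))    ; nil: list is an ordinary function
```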

You might wonder why it's OK to evaluate certain expressions but not others. For example, all the following expressions can be evaluated, even though none of them contain functions:

12
"twelve"
nil
:thing

Even though none of these expressions are quoted they all evaluate to themselves. The reason for this is that Lisp regards certain values as self-evaluating, which means that they will return their own value if evaluated as a Lisp expression. This works for integers and strings, and in general where there is no possibility of ambiguity. It can't work with lists, since ordinary code is expressed with the syntax of a list, so if lists evaluated to themselves then no code could be evaluated.

Since quote is expected to be used often, there is convenient syntactic sugar for it, the apostrophe character ':

(quote (1 2))
'(1 2)

Both expressions evaluate to the same (1 2) list.
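
With quoting in hand, the earlier claim that Lisp code is simply Lisp data can be tried directly. In this minimal sketch a list is built or taken apart as data and then handed to the standard eval function, which evaluates it as code:

```elisp
(list '+ 1 2)                     ; => (+ 1 2), a list whose first item is the symbol +
(eval (list '+ 1 2))              ; => 3, the same list treated as code

(car '(* 6 7))                    ; => *, quoted code can be taken apart like any list
(eval '(* 6 7))                   ; => 42
(eval (cons '+ (cdr '(* 6 7))))   ; => 13, the code rebuilt as (+ 6 7) and evaluated
```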

Variables and scope

Unlike some programming languages based on the functional paradigm, Emacs Lisp has mutable variables that will be immediately familiar to most developers. Variables in Emacs Lisp are untyped, which means that a variable can hold any value you care to give it: numbers, strings, even functions can be assigned to variables. The same variable can hold an integer at one point and a function definition a few lines later (though most developers will recognise that this is not good practice).

The simplest way to work with variables in Emacs Lisp is using global variables. As the name implies, these are available to read and write anywhere in the program and retain their values permanently.

One thing to be careful of when using global variables in Emacs Lisp is that there is no concept of namespacing that can be used to separate references to global variables in your library from global variables in someone else's library, or in the Emacs core. It's therefore good practice to prefix your global variables with a string that is specific to your library.

Global variables are widely used within Emacs (and within third-party Emacs Lisp packages) to hold simple configuration settings for a module. Although unrestrained use of global variables makes for code that is hard to follow, when used judiciously it provides a way of controlling configuration with minimal overhead. Global variables that are used for configuration will often have documentation associated with them. You can access this documentation by moving the cursor over the variable in question and typing C-h v.
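
A documented, prefixed global variable might be declared as in the sketch below. It uses the standard defvar form, which has not been introduced above; the package name my-library and the variable itself are hypothetical.

```elisp
(defvar my-library-greeting "Hello"
  "Greeting used by the hypothetical my-library package.
The my-library- prefix keeps this name from clashing with other packages.")

my-library-greeting   ; => "Hello"

;; The documentation string that C-h v displays can also be fetched from Lisp.
(documentation-property 'my-library-greeting 'variable-documentation)
```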

You can assign a value to an existing variable, or create a new one, by using the setq special form:

(setq some-variable 12)

The setq form assigns the value of its second argument to the variable given in its first argument.
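
A slightly longer sketch shows that the second argument is evaluated before the assignment takes place, and that the same variable can later hold a value of a different type:

```elisp
(setq some-variable 12)
some-variable                               ; => 12

(setq some-variable (* some-variable 2))    ; the value expression is evaluated first
some-variable                               ; => 24

(setq some-variable "twenty-four")          ; the same variable now holds a string
some-variable                               ; => "twenty-four"
```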

Scoped variables
let and let*
Buffer-local variables
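
A minimal sketch of these three ideas, using only standard forms (let, let*, with-temp-buffer and setq-local); the variable name my-demo-x is hypothetical:

```elisp
(setq my-demo-x 1)            ; a global variable, for illustration only

(let ((my-demo-x 10)          ; these bindings exist only inside the let body
      (y 2))
  (* my-demo-x y))            ; => 20

my-demo-x                     ; => 1, the global value is untouched afterwards

(let* ((a 2)
       (b (* a 3)))           ; unlike let, let* allows b to refer to a
  (+ a b))                    ; => 8

(with-temp-buffer
  (setq-local fill-column 50) ; buffer-local: affects this temporary buffer only
  fill-column)                ; => 50
```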

Functions as first-class objects

Passing functions as arguments
Lambda functions
Comparison between lambda and defun
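
As a hedged sketch of these topics: a function defined with defun can be passed by name, while lambda creates an unnamed function on the spot. The helper my-square is hypothetical; mapcar and funcall are standard functions.

```elisp
(defun my-square (x)
  "Hypothetical helper: return X multiplied by itself."
  (* x x))

(mapcar #'my-square '(1 2 3))             ; => (1 4 9), passing a named function
(mapcar (lambda (x) (* x x)) '(1 2 3))    ; => (1 4 9), passing an anonymous one
(funcall (lambda (x y) (+ x y)) 1 2)      ; => 3, calling a function held as a value
```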

Functions that write functions

define-skeleton as an example

Notes

[1] In fact, Emacs will give you a full backtrace to show where the problem occurred; details have been left out here for clarity, but don't be surprised if your error message looks more complex than described here.

Extending Emacs

How do I do X in emacs

People use emacs for a range of activities. So whatever you want to do, there's a chance that someone has already done it. Your first port of call for extending emacs should probably be emacswiki, which lists emacs extensions that you can use for different activities.

It may also be worth checking MELPA and ELPA, the emacs package archives, directly, since useful packages can go undocumented on emacswiki. If you find nothing there, you might like to search GitHub.

Reading existing libraries

Check the load-path variable to find out where library files are loaded from.
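
For instance, the following sketch inspects load-path and asks where one library lives. The library name "simple" is just an example of a library that ships with Emacs; the results depend on your installation, so none are shown.

```elisp
(length load-path)           ; how many directories Emacs searches
(car load-path)              ; the first directory searched
(locate-library "simple")    ; full path of the file Emacs would load for this library
```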

Writing functions that can be executed from Emacs

Use of (interactive) to make the function callable
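
A minimal sketch of such a function follows. The name my-insert-greeting is hypothetical; the "s" code in the interactive specification tells Emacs to prompt for a string in the minibuffer and pass it as the argument.

```elisp
(defun my-insert-greeting (name)
  "Hypothetical command: insert a greeting for NAME at point."
  (interactive "sName: ")               ; prompt for a string when called with M-x
  (insert (format "Hello, %s!" name)))
```

After evaluating the definition, the command can be run with M-x my-insert-greeting, and it can still be called from Lisp as (my-insert-greeting "Dave").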

Hook functions
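
A hook is a variable holding a list of functions that Emacs calls at a particular moment, such as when a mode is switched on. As a hedged sketch, the following adds a hypothetical function my-enable-auto-fill to the standard text-mode-hook using the standard add-hook function:

```elisp
(defun my-enable-auto-fill ()
  "Hypothetical hook function: turn on automatic line wrapping."
  (auto-fill-mode 1))

;; Run my-enable-auto-fill whenever a buffer enters text-mode.
(add-hook 'text-mode-hook #'my-enable-auto-fill)
```

The hook can be undone with (remove-hook 'text-mode-hook #'my-enable-auto-fill).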

The Emacs Lisp Package Archive

The Emacs Lisp Package Archive, ELPA, makes it easy to download and install packages for emacs. The official site of ELPA is at http://tromey.com/elpa/.

Getting started with ELPA

Installing the package manager

In Emacs 22 and above, the necessary modules are already present to install the package manager from a small bootstrap script. Simply copy and paste the following code into the *scratch* buffer and execute it with C-j if you're in lisp interaction mode, C-x C-e otherwise.

;; Download the ELPA bootstrap script and evaluate it.
(let ((buffer (url-retrieve-synchronously
               "http://tromey.com/elpa/package-install.el")))
  (with-current-buffer buffer
    (goto-char (point-min))
    ;; Skip the HTTP headers, which end at the first blank line.
    (re-search-forward "^$" nil 'move)
    (eval-region (point) (point-max))
    (kill-buffer (current-buffer))))

The package manager should install itself.

If you are using Emacs 21, you will need to have the wget utility available. Execute the following code:

;; Emacs 21: fetch the ELPA bootstrap script with wget and evaluate it.
(let ((buffer (generate-new-buffer " *Download*")))
  (with-current-buffer buffer
    ;; Insert wget's output (the script text) into this buffer.
    (shell-command "wget -q -O- http://tromey.com/elpa/package-install.el"
                   (current-buffer))
    (eval-region (point-min) (point-max))
    (kill-buffer (current-buffer))))

Acknowledgements

In writing this book, I am drawing on the following sources (all under open licenses):

EmacsWiki
An introduction to programming in Emacs Lisp by Robert J. Chassell
ttn's Emacs Lisp tutorial
The GNU Emacs Lisp Reference Manual
