Haskell/Modules



Modules are the primary means of organizing Haskell code. We have already met them in passing, when using import statements to put library functions into scope. In this chapter, we will have a closer look at how modules work. Beyond allowing us to make better use of libraries, such knowledge will help us to shape our own programs and, in particular, to create standalone programs which can be executed independently of GHCi (incidentally, that is the topic of the very next chapter, Standalone programs).

Modules

Haskell modules[1] are a useful way to group related functionality into a single unit and to manage different functions that share the same name. The module declaration is the first thing that goes in your Haskell file.

Here is what a basic module definition looks like:

module YourModule where

Note that

  1. the name of the module begins with a capital letter;
  2. each file contains only one module.

The name of the file must be the name of the module followed by the .hs file extension. Each dot '.' in the module name corresponds to a directory separator in the file path.[2] So the module YourModule would be in the file YourModule.hs, while a module Foo.Bar would be in the file Foo/Bar.hs (or Foo\Bar.hs on Windows). Since the module name must begin with a capital letter, the file name must also start with a capital letter.
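The naming rule above can be sketched as a small function. This is only an illustration: modulePath is a hypothetical helper, not part of any library, and it assumes '/' as the directory separator.

```haskell
module Main where

-- modulePath is a hypothetical helper used only for illustration.
-- It replaces each dot in a module name with a directory separator
-- (assumed here to be '/') and appends the .hs extension.
modulePath :: String -> FilePath
modulePath name = map dotToSlash name ++ ".hs"
  where
    dotToSlash '.' = '/'
    dotToSlash c   = c

main :: IO ()
main = mapM_ (putStrLn . modulePath) ["YourModule", "Foo.Bar"]
```

Running the program prints YourModule.hs and Foo/Bar.hs, matching the two examples in the text.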

Importing

One thing your module can do is import functions from other modules. That is, in between the module declaration and the rest of your code, you may include some import declarations such as

-- import only the functions toLower and toUpper from Data.Char
import Data.Char (toLower, toUpper)
 
-- import everything exported from Data.List
import Data.List
 
-- import everything exported from MyModule
import MyModule

Imported datatypes are specified by their name, followed by a list of imported constructors in parentheses. For example:

-- import only the Tree data type, and its Node constructor from Data.Tree
import Data.Tree (Tree(Node))
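As a runnable sketch of these import forms, the following program uses only modules from the base library. The helpers capitalize and sortDescending are hypothetical names chosen for this example; Data.Ord's Down type stands in for Tree to show a type imported together with its constructor.

```haskell
module Main where

-- Import only the functions toLower and toUpper from Data.Char.
import Data.Char (toLower, toUpper)
-- Import only sortOn from Data.List.
import Data.List (sortOn)
-- Import the Down type together with its Down constructor.
import Data.Ord (Down(Down))

-- capitalize is a hypothetical helper: upper-case the first
-- character and lower-case the rest.
capitalize :: String -> String
capitalize []       = []
capitalize (c : cs) = toUpper c : map toLower cs

-- sortDescending is a hypothetical helper: wrapping each element
-- in Down reverses the ordering used by sortOn.
sortDescending :: [Int] -> [Int]
sortDescending = sortOn Down

main :: IO ()
main = do
  putStrLn (capitalize "hASKELL")
  print (sortDescending [3, 1, 2])
```

Because Data.Char was imported with an explicit list, other functions from that module, such as isDigit, are not in scope here.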

What should you do if you import several modules and some of them have overlapping definitions? Or if you import a module but want to replace one of its functions with your own? There are three ways to handle these cases: qualified imports, hiding definitions and renaming imports.

Qualified imports

Say MyModule and MyOtherModule both have a definition for remove_e, which removes all instances of e from a string. However, MyModule only removes lower-case e's, and MyOtherModule removes both upper and lower case. In this case the following code is ambiguous:

-- import everything exported from MyModule
import MyModule
 
-- import everything exported from MyOtherModule
import MyOtherModule
 
-- someFunction puts a c in front of the text, and removes all e's from the rest
someFunction :: String -> String
someFunction text = 'c' : remove_e text

It isn't clear which remove_e is meant! To avoid this, use the qualified keyword:

import qualified MyModule
import qualified MyOtherModule
 
someFunction text = 'c' : MyModule.remove_e text -- Will work, removes lower case e's
someOtherFunction text = 'c' : MyOtherModule.remove_e text -- Will work, removes all e's
someIllegalFunction text = 'c' : remove_e text -- Won't work, remove_e isn't defined.

See the difference? In the latter code snippet, the function remove_e isn't even defined. Instead, we call the functions from the imported modules by prefixing them with the module's name. Note that MyModule.remove_e also works if the qualified keyword isn't included. The difference lies in the fact that remove_e is ambiguously defined in the first case, and undefined in the second case. If we have a remove_e defined in the current module, then using remove_e without any prefix will call this function.
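Since MyModule and MyOtherModule are made-up names, here is a sketch of the same idea with two real modules from base that both export a function called reverse. The helper names reverseList and reverseViaNonEmpty are assumptions made for this example.

```haskell
module Main where

import qualified Data.List
import qualified Data.List.NonEmpty

-- Reverse an ordinary list with the Data.List version of reverse.
reverseList :: [Int] -> [Int]
reverseList xs = Data.List.reverse xs

-- Reverse with the Data.List.NonEmpty version of reverse:
-- convert to a non-empty list, reverse it, and convert back.
-- fromList fails on an empty list, so the input is assumed non-empty.
reverseViaNonEmpty :: [Int] -> [Int]
reverseViaNonEmpty xs =
  Data.List.NonEmpty.toList
    (Data.List.NonEmpty.reverse (Data.List.NonEmpty.fromList xs))

main :: IO ()
main = do
  print (reverseList [1, 2, 3])
  print (reverseViaNonEmpty [1, 2, 3])
```

Each call names the module it comes from, so there is never any doubt about which reverse is meant.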

Note

There is an ambiguity between a qualified name like MyModule.remove_e and the function composition operator (.). Writing reverse.MyModule.remove_e is bound to confuse your Haskell compiler. One solution is stylistic: always put spaces around the function composition operator, for example reverse . remove_e, Just . remove_e, or even Just . MyModule.remove_e.
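A minimal sketch of that spacing convention, using Data.Char from base in place of MyModule (shout is a hypothetical name):

```haskell
module Main where

import qualified Data.Char

-- shout is a hypothetical helper. The spaces around (.) keep the
-- composition operator visually distinct from the dot inside the
-- qualified name Data.Char.toUpper.
shout :: String -> String
shout = reverse . map Data.Char.toUpper

main :: IO ()
main = putStrLn (shout "abc")
```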


Hiding definitions

Now suppose we want to import both MyModule and MyOtherModule, but we know for sure we want to remove all e's, not just the lower-case ones. It would become really tedious to write MyOtherModule before every call to remove_e. Can't we just not import remove_e from MyModule? The answer is: yes, we can.

-- Note that I didn't use qualified this time.
import MyModule hiding (remove_e)
import MyOtherModule
 
someFunction text = 'c' : remove_e text

This works because of the word hiding on the import line. It is followed by a list of functions that should not be imported. Hiding more than one function works like this:

import MyModule hiding (remove_e, remove_f)

Note that type class instances cannot be hidden; they are always imported. Types and type synonyms, by contrast, can be named in a hiding list just like functions. If two imported modules define a type with the same name and you need both, use qualified names.
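Hiding is also the usual way to replace a library function with your own definition. The following sketch hides insert from Data.List and defines a different one; the replacement's behaviour (skipping elements that are already present) is an assumption chosen purely for illustration.

```haskell
module Main where

-- Import everything from Data.List except insert, so that the
-- definition below is the only insert in scope.
import Data.List hiding (insert)

-- A hypothetical replacement: add an element to the front of the
-- list only if it is not already present.
insert :: Eq a => a -> [a] -> [a]
insert x xs
  | x `elem` xs = xs
  | otherwise   = x : xs

main :: IO ()
main = print (sort (insert 4 (insert 2 [3, 1, 2])))
```

Without the hiding clause, the unqualified uses of insert in main would be ambiguous between this definition and the one from Data.List; sort is still available because only insert was hidden.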

Renaming imports

This is not really a technique for overriding definitions, but it is often used along with the qualified keyword. Imagine:

import qualified MyModuleWithAVeryLongModuleName
 
someFunction text = 'c' : MyModuleWithAVeryLongModuleName.remove_e text

Especially when using qualified, this gets irritating. We can improve things by using the as keyword:

import qualified MyModuleWithAVeryLongModuleName as Shorty
 
someFunction text = 'c' : Shorty.remove_e text

This allows us to use Shorty instead of MyModuleWithAVeryLongModuleName as prefix for the imported functions. As long as there are no ambiguous definitions, the following is also possible:

import MyModule as My
import MyCompletelyDifferentModule as My

In this case, both the functions in MyModule and the functions in MyCompletelyDifferentModule can be prefixed with My.
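For a version that can actually be compiled, here is a sketch that renames two modules from base. The aliases C and L and the helper normalize are names chosen for this example.

```haskell
module Main where

import qualified Data.Char as C
import qualified Data.List as L

-- normalize is a hypothetical helper: lower-case every character,
-- then sort the characters. Both library functions are reached
-- through the short aliases.
normalize :: String -> String
normalize = L.sort . map C.toLower

main :: IO ()
main = putStrLn (normalize "Haskell")
```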

Combining renaming with limited import

Sometimes it is convenient to use the import directive twice for the same module. A typical scenario is as follows:

import qualified Data.Set as Set
import Data.Set (Set, empty, insert)

This gives access to all of the Data.Set module via the alias "Set", and also lets you use a few selected names (the type Set and the functions empty and insert) without the "Set" prefix.
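The two import lines can be seen working together in the sketch below. It assumes the containers package, which provides Data.Set and ships with GHC; uniqueWords and distinctWordCount are hypothetical helpers.

```haskell
module Main where

import qualified Data.Set as Set
import Data.Set (Set, empty, insert)

-- uniqueWords is a hypothetical helper. The type Set and the
-- functions empty and insert are used without a prefix.
uniqueWords :: String -> Set String
uniqueWords = foldr insert empty . words

-- distinctWordCount is a hypothetical helper. Set.size was not
-- imported unqualified, so it is reached through the alias.
distinctWordCount :: String -> Int
distinctWordCount = Set.size . uniqueWords

main :: IO ()
main = do
  print (distinctWordCount "to be or not to be")
  print (Set.toList (uniqueWords "to be or not to be"))
```

Keeping the unqualified list short is a common choice for modules like Data.Set, because many of their other functions (filter, map, null and so on) share names with Prelude functions.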

Exporting

In the examples at the start of this article, the words "import everything exported from MyModule" were used.[3] This raises a question. How can we decide which functions are exported and which stay "internal"? Here's how:

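module MyModule (remove_e, add_two) where

-- Not in the export list, so only visible inside this module.
add_one :: Num a => a -> a
add_one blah = blah + 1

-- Exported: removes every lower-case 'e' from a string.
remove_e :: String -> String
remove_e text = filter (/= 'e') text

-- Exported: may use add_one internally.
add_two :: Num a => a -> a
add_two blah = add_one . add_one $ blah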

In this case, only remove_e and add_two are exported. While add_two is allowed to make use of add_one, functions in modules that import MyModule cannot use add_one, as it isn't exported.

Datatype export specifications are written much like their import counterparts. You name the type, and follow it with the list of constructors in parentheses:

module MyModule2 (Tree(Branch, Leaf)) where
 
data Tree a = Branch {left, right :: Tree a} 
            | Leaf a

In this case, the module declaration could be rewritten as "module MyModule2 (Tree(..)) where", which exports all constructors of Tree together with its field names left and right.
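A sketch of the Tree(..) form, extended with one exported function so that the export list mixes a type and a value. The depth function is a hypothetical addition and not part of the original example.

```haskell
module MyModule2 (Tree(..), depth) where

-- Tree(..) in the export list exports the type, both constructors,
-- and the field names left and right.
data Tree a = Branch { left, right :: Tree a }
            | Leaf a

-- depth is a hypothetical helper: the number of nodes on the
-- longest path from the root down to a leaf.
depth :: Tree a -> Int
depth (Leaf _)     = 1
depth (Branch l r) = 1 + max (depth l) (depth r)
```

A module that imports MyModule2 can then build values with Branch and Leaf, pattern match on them, and call depth.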

Maintaining an export list is good practice not only because it reduces namespace pollution, but also because it enables certain compile-time optimizations which are unavailable otherwise.


Notes

  1. See the Haskell report for more details on the module system.
  2. In Haskell98, the last standardised version of Haskell before Haskell 2010, the module system was fairly conservative, but recent common practice consists of employing a hierarchical module system, using periods to section off namespaces.
  3. A module may export functions that it imports. Mutually recursive modules are possible but need some special treatment.


Last modified on 29 September 2013, at 04:54