F Sharp Programming/Sequences

Previous: Lists Index Next: Sets and Maps
F# : Sequences

Sequences, commonly called sequence expressions, are similar to lists: both data structures are used to represent an ordered collection of values. However, unlike lists, elements in a sequence are computed as they are needed (or "lazily"), rather than computed all at once. This gives sequences some interesting properties, such as the capacity to represent infinite data structures.

Defining Sequences

edit

Sequences are defined using the syntax:

seq { expr }

Similar to lists, sequences can be constructed using ranges and comprehensions:

> seq { 1 .. 10 };; (* seq ranges *)
val it : seq<int> = seq [1; 2; 3; 4; ...]

> seq { 1 .. 2 .. 10 };; (* seq ranges *)
val it : seq<int> = seq [1; 3; 5; 7; ...]

> seq {10 .. -1 .. 0};; (* descending *)
val it : seq<int> = seq [10; 9; 8; 7; ...]

> seq { for a in 1 .. 10 do yield a, a*a, a*a*a };; (* seq comprehensions *)
val it : seq<int * int * int>
= seq [(1, 1, 1); (2, 4, 8); (3, 9, 27); (4, 16, 64); ...]

Sequences have an interesting property which sets them apart from lists: elements in the sequence are lazily evaluated, meaning that F# does not compute values in a sequence until the values are actually needed. This is in contrast to lists, where F# computes the value of all elements in a list on declaration. As a demonstration, compare the following:

> let intList =
    [ for a in 1 .. 10 do
        printfn "intList: %i" a
        yield a ]
        
let intSeq =
    seq { for a in 1 .. 10 do
            printfn "intSeq: %i" a
            yield a };;

val intList : int list
val intSeq : seq<int>

intList: 1
intList: 2
intList: 3
intList: 4
intList: 5
intList: 6
intList: 7
intList: 8
intList: 9
intList: 10

> Seq.item 3 intSeq;;
intSeq: 1
intSeq: 2
intSeq: 3
intSeq: 4
val it : int = 4

> Seq.item 7 intSeq;;
intSeq: 1
intSeq: 2
intSeq: 3
intSeq: 4
intSeq: 5
intSeq: 6
intSeq: 7
intSeq: 8
val it : int = 8

The list is created on declaration, but elements in the sequence are created as they are needed.

As a result, sequences are able to represent a data structure with an arbitrary number of elements:

> seq { 1I .. 1000000000000I };;
val it : seq<bigint> = seq [1I; 2I; 3I; 4I; ...]

The sequence above represents a list with one trillion elements in it. That does not mean the sequence actually contains one trillion elements, but it can potentially hold one trillion elements. By comparison, it would not be possible to create a list [ 1I .. 1000000000000I ] since the .NET runtime would attempt to create all one trillion elements up front, which would certainly consume all of the available memory on a system before the operation completed.

Additionally, sequences can represent an infinite number of elements:

> let allEvens = 
    let rec loop x = seq { yield x; yield! loop (x + 2) }
    loop 0;;

> for a in (Seq.take 5 allEvens) do
    printfn "%i" a;;
0
2
4
6
8
val it : unit = ()

Notice the definition of allEvens does not terminate. The function Seq.take returns the first n elements of elements of the sequence. If we attempted to loop through all of the elements, fsi would print indefinitely.

Note: sequences are implemented as state machines by the F# compiler. In reality, they manage state internally and hold only the last generated item in memory at a time. Memory usage is constant for creating and traversing sequences of any length.

Iterating Through Sequences Manually

edit

The .NET Base Class Library (BCL) contains two interfaces in the System.Collections.Generic namespace:

type IEnumerable<'a> =
  interface
   (* Returns an enumerator that iterates through a collection *)
    member GetEnumerator<'a> : unit -> IEnumerator<'a>
  end

type IEnumerator<'a> =
  interface
   (* Advances to the next element in the sequences. Returns true if 
      the enumerator was successfully advanced to the next element; false
      if the enumerator has passed the end of the collection. *)
    member MoveNext : unit -> bool

    (* Gets the current element in the collection. *)
    member Current : 'a

    (* Sets the enumerator to its initial position, which is 
       before the first element in the collection. *)
    member Reset : unit -> unit
  end

The seq type is defined as follows:

type seq<'a> = System.Collections.Generic.IEnumerable<'a>

As you can see, seq is not a unique F# type, but rather another name for the built-in System.Collections.Generic.IEnumerable interface. Since seq/IEnumerable is a native .NET type, it was designed to be used in a more imperative style, which can be demonstrated as follows:

open System
open System.Collections

let evens = seq { 0 .. 2 .. 10 } (* returns IEnumerable<int> *)

let main() =
    let evensEnumerator = evens.GetEnumerator() (* returns IEnumerator<int> *)
    while evensEnumerator.MoveNext() do
        printfn "evensEnumerator.Current: %i" evensEnumerator.Current
    
    Console.ReadKey(true) |> ignore
            
main()

This program outputs:

evensEnumerator.Current: 0
evensEnumerator.Current: 2
evensEnumerator.Current: 4
evensEnumerator.Current: 6
evensEnumerator.Current: 8
evensEnumerator.Current: 10

Behind the scenes, .NET converts every for loop over a collection into an explicit while loop. In other words, the following two pieces of code compile down to the same bytecode:

let x = [1 .. 10]
for num in x do
    printfn "%i" num
let x = [1 .. 10]
let enumerator = x.GetEnumerator()
while enumerator.MoveNext() do
    let num = enumerator.Current
    printfn "%i" num

All collections which can be used with the for keyword implement the IEnumerable<'a> interface, a concept which will be discussed later in this book.

The Seq Module

edit

Similar to the List modules, the Seq module contains a number of useful functions for operating on sequences:

val append : seq<'T> -> seq<'T> -> seq<'T>

Appends one sequence onto another sequence.
> let test = Seq.append (seq{1..3}) (seq{4..7});;
val it : seq<int> = seq [1; 2; 3; 4; ...]

val choose : ('T -> 'U option) -> seq<'T> -> seq<'U>

Filters and maps a sequence to another sequence.
> let thisworks = seq { for nm in [ Some("James"); None; Some("John") ] |> Seq.choose id -> nm.Length }
val it : seq<int> = seq [5; 4]

val distinct : seq<'T> -> seq<'T>

Returns a sequence that filters out duplicate entries.
> let dist = Seq.distinct (seq[1;2;2;6;3;2])
val it : seq<int> = seq [1; 2; 6; 3]

val exists : ('T -> bool) -> seq<'T> -> bool

Determines if an element exists in a sequence.
> let equalsTwo x = x=2
> let exist = Seq.exists equalsTwo (seq{3..9})
val equalsTwo : int -> bool
val it : bool = false

val filter : ('T -> bool) -> seq<'T> -> seq<'T>

Builds a new sequence consisting of elements filtered from the input sequence.
> Seq.filter (fun x-> x%2 = 0) (seq{0..9})
val it : seq<int> = seq [0; 2; 4; 6; ...]

val fold : ('State -> 'T -> 'State) -> 'State -> seq<'T> -> 'State

Repeatedly applies a function to each element in the sequence from left to right.
> let sumSeq sequence1 = Seq.fold (fun acc elem -> acc + elem) 0 sequence1
Seq.init 10 (fun index -> index * index)
|> sumSeq
|> printfn "The sum of the elements is %d."
> 
The sum of the elements is 285.

val sumSeq : seq<int> -> int
Note: sequences can only be read in a forward-only manner, so there is no corresponding foldBack function as found in the List and Array modules.

val initInfinite : (int -> 'T) -> seq<'T>

Generates a sequence consisting of an infinite number of elements.
> Seq.initInfinite (fun x -> x*x)
val it : seq<int> = seq [0; 1; 4; 9; ...]

val map : ('T -> 'U) -> seq<'T> -> seq<'U>

Maps a sequence of type 'a to type 'b.
> Seq.map (fun x->x*x+2) (seq[3;5;4;3])
val it : seq<int> = seq [11; 27; 18; 11]

val item : int -> seq<'T> -> 'T

Returns the nth value of a sequence.
> Seq.item 3 (seq {for n in 2..9 do yield n})
val it : int = 5

val take : int -> seq<'T> -> seq<'T>

Returns a new sequence consisting of the first n elements of the input sequence.
> Seq.take 3 (seq{1..6})
val it : seq<int> = seq [1; 2; 3]

val takeWhile : ('T -> bool) -> seq<'T> -> seq<'T>

Return a sequence that, when iterated, yields elements of the underlying sequence while the given predicate returns true, and returns no further elements.
> let sequenciaMenorqDez = Seq.takeWhile (fun elem -> elem < 10) (seq {for i in 0..20 do yield i+1})
val sequenciaMenorqDez : seq<int>
> sequenciaMenorqDez;;
val it : seq<int> = seq [1; 2; 3; 4; ...]

val unfold : ('State -> ('T * 'State) option) -> 'State seed -> seq<'T>

The opposite of fold: this function generates a sequence as long as the generator function returns Some.
> let fibs = (0I, 1I) |> Seq.unfold (fun (a, b) -> Some(a, (b, a + b) ) );;

val fibs : seq<bigint>

> Seq.iter (fun x -> printf "%O " x) (Seq.take 20 fibs);;
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181

The generator function in unfold expects a return type of ('T * 'State) option. The first value of the tuple is inserted as an element into the sequence, the second value of the tuple is passed as the accumulator. The fibs function is clever for its brevity, but it's hard to understand if you've never seen an unfold function. The following demonstrates unfold in a more straightforward way:

> let test = 1 |> Seq.unfold (fun x ->
    if x <= 5 then Some(sprintf "x: %i" x, x + 1)
    else None );;

val test : seq<string>

> Seq.iter (fun x -> printfn "%s" x) test;;
x: 1
x: 2
x: 3
x: 4
x: 5

Often, it's preferable to generate sequences using seq comprehensions rather than the unfold.

Previous: Lists Index Next: Sets and Maps