Ruby Programming/Reference/Objects/Enumerable

Enumerable

edit

Enumerator appears in Ruby as Enumerable::Enumerator in 1.8.x and (just) Enumerator in 1.9.x.

Forms of Enumerator

edit

There are several different ways in which an Enumerator can be used:

  • As a proxy for “each”
  • As a source of values from a block
  • As an external iterator

1. As a proxy for “each”

edit

This is the first way of using Enumerator, introduced in ruby 1.8. It solves the following problem: Enumerable methods like #map and #select call #each on your object, but what if you want to iterate using some other method such as #each_byte or #each_with_index?

An Enumerator is a simple proxy object which takes a call to #each and redirects it to a different method on the underlying object.

require 'enumerator'   # needed in ruby <= 1.8.6 only

src = "hello"
puts src.enum_for(:each_byte).map { |b| "%02x" % b }.join(" ")

The call to ‘enum_for’ (or equivalently ‘to_enum’) creates the Enumerator proxy. It is a shorthand for the following:

newsrc = Enumerable::Enumerator.new(src, :each_byte)
puts newsrc.map { |b| "%02x" % b }.join(" ")

In ruby 1.9, Enumerable::Enumerator has changed to just Enumerator

2. As a source of values from a block

edit

In ruby 1.9, Enumerator.new can instead take a block which is executed when #each is called, and directly yields the values.

block =  Enumerator.new {|g| g.yield 1; g.yield 2; g.yield 3}

block.each do |item|
  puts item
end

“g << 1” is an alternative syntax for “g.yield 1”

No fancy language features such as Fiber or Continuation are used, and this form of Enumerator is easily retro-fitted to ruby 1.8

It is quite similar to creating your own object which yields values:

block = Object.new
def block.each
  yield 1; yield 2; yield 3
end

block.each do |item|
  puts item
end

However it also lays the groundwork for “lazy” evaluation of enumerables, described later.

3. As an external iterator

edit

ruby 1.9 also allows you turn an Enumerator around so that it becomes a “pull” source of values, sometimes known as “external iteration”. Look carefully at the difference between this and the previous example:

block =  Enumerator.new {|g| g.yield 1; g.yield 2; g.yield 3}

while item = block.next
  puts item
end

The flow of control switches back and forth, and the first time you call #next a Fiber is created which holds the state between calls. Therefore it is less efficient that iterating directly using #each.

When you call #next and there are no more values, a StopIteration exception is thrown. This is silently caught by the while loop. StopIteration is a subclass of IndexError which is a subclass of StandardError.

The nearest equivalent feature in ruby 1.8 is Generator, which was implemented using Continuations.

require 'generator'
block = Generator.new {|g| g.yield 1; g.yield 2; g.yield 3}

while block.next?
  puts block.next
end

Lazy evaluation

edit

In an Enumerator with a block, the target being yielded to is passed as an explicit parameter. This makes it possible to set up a chain of method calls so that each value is passed left-to-right along the whole chain, rather than building up intermediate arrays of values at each step.

The basic pattern is an Enumerator with a block which processes input values and yields (zero or more) output values for each one.

  Enumerator.new do |y|
    source.each do |input|     # filter INPUT
      ...
      y.yield output           # filter OUTPUT
    end
  end

So let’s wrap this in a convenience method:

class Enumerator
  def defer(&blk)
    self.class.new do |y|
      each do |*input|
        blk.call(y, *input)
      end
    end
  end
end

This new method ‘defer’ can be used as a ‘lazy’ form of both select and map. Rather than building an array of values and returning that array at the end, it immediately yields each value. This means you start getting the answers sooner, and it will work with huge or even infinite lists. Example:

res = (1..1_000_000_000).to_enum.
  defer { |out,inp| out.yield inp if inp % 2 == 0 }.   # like select
  defer { |out,inp| out.yield inp+100 }.               # like map
  take(10)
p res

Although we start with a list of a billion items, at the end we only use the first 10 values generated, so we stop iterating once this has been done.

You can get the same capability in ruby 1.8 using the facets library. For convenience it also provides a Denumberable module with lazy versions of familiar Enumerable methods such as map, select and reject.

Methods which return Enumerators

edit

From 1.8.7 on, many Enumerable methods will return an Enumerator if not given a block.

>> a = ["foo","bar","baz"]
=> ["foo", "bar", "baz"]
>> b = a.each_with_index
=> #<Enumerable::Enumerator:0xb7d7cadc>
>> b.each { |args| p args }
["foo", 0]
["bar", 1]
["baz", 2]
=> ["foo", "bar", "baz"]
>> 

This means that usually you don’t need to call enum_for explicitly. The very first example on this page reduces to just:

src = "hello"
puts src.each_byte.map { |b| "%02x" % b }.join(" ")

This can lead to somewhat odd behaviour for non-map like methods - when you call #each on the object later, you have to provide it with the “right sort” of block.

=> ["foo", "bar", "baz"]
>> b = a.select
=> #<Enumerable::Enumerator:0xb7d6cfb0>
>> b.each { |arg| arg < "c" }
=> ["bar", "baz"]
>> 

More Enumerator readings

edit

each_with_index

edit

each_with_index calls its block with the item and its index.

array = ['Superman','Batman','The Hulk']

array.each_with_index do |item,index|
  puts "#{index} -> #{item}"
 end

# will print
# 0 -> Superman
# 1 -> Batman
# 2 -> The Hulk

find_all

edit

find_all returns only those items for which the called block is not false

range = 1 .. 10

# find the even numbers

array = range.find_all { |item| item % 2 == 0 }

# returns [2,4,6,8,10]
array = ['Superman','Batman','Catwoman','Wonder Woman']

array = array.find_all { |item| item =~ /woman/ }

# returns ['Catwoman','Wonder Woman']