Raku Programming/Printable version
The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Raku_Programming
Introduction
The Raku programming language is the sixth major revision of Perl.
It was designed to address the problems that Perl had accumulated over its long history. Those problems were mainly due to the requirement that each successive version of Perl remain backward-compatible. That is why Raku is the first version of Perl that is not backward-compatible.
Raku does not replace Perl. It is rather a sister language or, as some like to put it, the R&D branch of Perl. To some degree, Raku is to Perl what C++ is to C. Although C++ is a very successful programming language, it has not in any way replaced C.
Perl History
Perl 1 - 5
The Perl programming language was created in 1987 by Larry Wall, a linguist and a computer systems administrator for Unisys. Perl is a dynamic programming language which for much of its history was considered to be a "scripting language" or a command-line administration tool. However, as of version 5 of the language, Perl became a powerful and useful general-purpose programming language that is consistently popular among web developers, system administrators and hobbyist programmers.
The Perl programming language was developed as an open-source free-software project, and gradually expanded until the release of Version 5, the current state-of-the-art in Perl. Through all these developments, Perl remained backwards compatible with previous versions. The Perl 5 interpreter can read, understand and execute (for the most part) programs that were written in Perl 1, Perl 2, Perl 3 and Perl 4. Unfortunately, this made a mess of the internals of the Perl 5 interpreter and made many programming tasks harder than they need to be.
Another tripping point is in the Perl 5 language specification; it isn't a specification at all. The Perl interpreter itself is the standard: the behavior of the interpreter is the "standard" behavior of Perl. The only way to duplicate all the strange and idiosyncratic behavior of the Perl language is to use that standard software only.
Intermission: The Jon Orwant Mug-Throwing Incident in 2000
By 2000, it was evident that Perl needed an infusion of life:
"The [P5P / Perl Conference] meeting was originally a gathering of Chip Salzenberg, Jarkko Hietaniemi, Elaine Ashton, Tim Bunce, Sarathy, Nick Ing-Simmons, Larry Wall, Nat Torkington, brian d foy and Adam Turoff, brought together to draft a constitution of sorts since the community seemed to be fragmenting. Jon showed up to the meeting late and found us talking about the community and started throwing things to express his discontent with how perl itself was stagnating, possibly even dying, and that we should be talking about reviving Perl. The cup incident was planned theatre from what I was told later. So, it was already a fait accompli but the tantrum was its outing." [1]
Perl 6
The time was ripe for Perl 6: a rewrite of the Perl language from the ground up. Compatibility with Perl 5 is forfeited, in order to resolve fundamental problems with the language, and add necessary new features. It is therefore completely different from Perl 5 but at the same time unmistakably in the same 'language family'. Unlike Perl 5, which developed organically over time and was completely idiosyncratic, Perl 6 started out from a set of specifications, and is instantiated in multiple separate and equal implementations.
Perl 6 started with a long period of community involvement and an RFC process. Community members were asked to contribute ideas and suggestions for the new language. Larry went through the suggestions, saved the good ones, removed the bad ones, and tried to bring everything together in a unified way. Perl 5 had been criticized for being "hacky" and inconsistent, so Perl 6 should avoid that from the very beginning. Once all the suggestions were tabulated and discussed, Larry released a series of design documents known as the Apocalypses. Each Apocalypse was numbered to roughly correspond with a chapter in the book "Programming Perl" and was meant to be a revelation of the concepts and trade-offs that were being considered in the design of Perl 6. From these documents (which were short on specifics), Damian Conway produced a corresponding series of explanatory documents called Exegeses. While an Apocalypse revealed some of the design, an Exegesis explained what that meant to the everyday programmer in terms of the code that would be written. Later, as the design matured, design specifications called the Synopses were created to synthesize and document the design of Perl 6. The Synopses currently stand as the official design documents for the Perl 6 language.
In October 2019, the Perl 6 community voted to rename the language to Raku.
Raku Philosophy
Perl has always been a flexible and capable programming language. One of the most important mantras of the Perl team is There's More than One Way To Do It (TIMTOWTDI, pronounced "Tim Toady"). Raku is a very flexible language that combines a number of different programming paradigms to support various programmers and varying programming tasks. Because of this TIMTOWTDI philosophy, Raku is a very big programming language with lots of different features and capabilities.
Another way to say this is that Perl gives you plenty of rope, but you have to be careful not to trip yourself with it. There are lots of big ideas floating around in Raku, but not all of them are going to be useful for all programming tasks. Also, there are plenty of things that are possible in Raku that might not be considered "best practices" by the programming community at large. It's important to learn not just how to write certain things in Raku, but when to write things in certain ways. Without that knowledge, programs could easily descend into meaningless unreadable gibberish.
Throughout this book we'll try to show you some best practices and try to talk about where each feature is and is not useful.
Implementations
- Pugs
- The first more-or-less functioning implementation of Raku. It was written in Haskell by Audrey Tang. It is now mostly of historical interest.
- Niecza
- An implementation of Raku using the .NET framework.
- Rakudo
- The leading, high-level implementation of Raku. It is self-hosting, which means that it is written mostly in Raku and a sub-language of Raku: nqp. It targets several process virtual machines: Parrot, JVM, MoarVM, and probably a few others in the near future (JavaScript, Lua, ...)
As of April 2014, Rakudo on MoarVM is the most promising implementation. It is entirely free and open-source software and uses a VM specifically designed for Raku.
Pugs and Parrot
After a long period of language design, it was time to start creating an implementation of the new language. To avoid the problems of Perl, it was decided by the initial organizers to create better separation between the back-end execution engine and the front-end language parser. After a number of discussions, the Parrot Virtual Machine project was started to create a virtual machine for dynamic languages like Raku. Parrot quickly grew to become independent of Raku, opting instead to become a virtual machine for all dynamic languages. Since Raku was so large and ambitious, any virtual machine that could support it would also be very capable of supporting many other dynamic languages as well.
Audrey Tang, a Perl hacker, put together a reference implementation of Raku using the Haskell programming language. This implementation was called Pugs, and served as a testbench for many of the ideas that the language designers were developing. Feedback from the Pugs team helped to shape the language design, and changes to the language design caused modifications in Pugs. It was a useful and helpful relationship, especially since no other implementations were at such a high state of development at that time.
STD.pm, STD_blue and ELF
The "official" grammar for Raku was going to be written in Raku itself. This is because Raku was being designed to have one of the most advanced grammar engines of any existing language at the time. There simply was no better choice for a grammar implementation for such an advanced language than that language itself. STD.pm was created as the standard Raku grammar, and is still deferred to when conflicts arise in the various implementations.
STD_red is an implementation of the Raku grammar using the Ruby programming language. STD_blue is a more up-to-date compiler for STD.pm that's written in Perl.
ELF is a boot-strapped implementation of Raku that uses STD_blue to compile Raku code into Perl code for execution.
Rakudo
When Audrey Tang left the Pugs project, however, development on it dropped to a minimum. It was still useful for testing and reference, but Pugs was no longer the active development platform that it once was. However, Parrot had grown by leaps and bounds since that time and was finally ready to start supporting compilers for high-level languages. A Raku project, known as "Rakudo", was started and began to grow rapidly. Part of the Rakudo project was Patrick Michaud's creation of high-level parser tools called PCT ("Parrot Compiler Tools"). PCT is a parser generator tool similar to the low-level Flex and Bison tools. However, PCT used a subset of the Raku language to write parsers, instead of using C or C++. This meant that Rakudo was on the road to becoming self-hosting: The Rakudo compiler itself was partially written in Raku.
More information about Rakudo can be found on the web at http://www.rakudo.org
Variables and Data
Static and Dynamic Programming Languages
Raku is one of a class of programming languages called dynamic languages. Dynamic languages use variables whose data type can change at runtime and does not need to be declared in advance. The alternative to dynamic languages is static languages, such as C or Java, where variables must typically be declared as a specific data type before they are used.
In a language like C or one of its derivatives (C++, C# or Java for instance) variables need to be predeclared with a type before they can be used:
unsigned short int x;
x = 10;
The code above isn't entirely accurate since in C you can initialize a variable when you declare it:
unsigned short int x = 10;
However, the variable x cannot be used before it's declared. Once it's declared as the type unsigned short int, you can't use x to store other types of data like floating point numbers or data pointers, at least not without explicit coercion:
unsigned short int x;
x = 1.02; /* Wrong! The fraction is silently dropped and x holds 1 */
unsigned short int y;
x = &y; /* Wrong! */
In dynamic programming languages like Raku, a variable is declared without being tied to one kind of data. Variables in Raku are polymorphic: They can hold integers, strings, fractional numbers, or complex data structures like arrays or hashes without any coercion. Here are some examples:
my $x;
$x = 5; # Integer
$x = "hello"; # String
$x = 3.1415; # Fractional number (stored exactly, as a Rat)
The example above demonstrates a number of different ideas that we will discuss throughout the rest of this book. One important idea is comments. Comments are notes in the source code that are intended to be read by the programmers and are ignored by the Raku interpreter. In Raku, most comments are marked with a # symbol, and continue until the end of the line. Raku also has embedded comments and multi-line documentation, which we will talk about later.
We haven't been entirely honest about Raku data. Raku does allow data to be given an explicit type, if you want it to. The default is to use data that is polymorphic like we've seen above, but you can also declare that a scalar may only hold an integer, or a string, or a number, or a different data item altogether. We'll talk more about Raku's explicit type system later. For now, it's easier to think that data in Raku doesn't have explicit types (or, more specifically, that it doesn't need them).
The examples we see above also show an important keyword: my. The my keyword is used to declare a new variable for use. We'll talk more about my and its uses later.
Variables and Sigils
As we saw in the brief example above, Raku variables have symbols in front of them called sigils. Sigils serve a number of important uses, but one of the most important is to establish context. Raku has four types of sigils that are used to establish different types of data. The $ sign that we saw above is for scalars: single data values like numbers, strings, or object references. Other sigils to be used are @ which denotes arrays of data, % for hashes of data, and & for subroutines or executable code blocks.
- Scalars
- Scalars as we have already seen contain a single data item like a number or a string.
- Arrays
- Arrays are lists of data of the same type that are indexed by number.
- Hashes
- Hashes are collections of data, potentially of different types, indexed by a string.
- Code References
- Code references are pointers to executable code structures that can be passed around like data and called at different places in your code.
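As a rough illustration of the four sigils described above, the following sketch declares one variable of each kind and reads a value back from it. All variable names and values here are made up for the example.

```raku
# One variable per sigil; every name below is illustrative only.
my $answer = 42;                           # scalar: a single value
my @primes = 2, 3, 5, 7;                   # array: values indexed by number
my %stock  = "apples" => 3, "pears" => 5;  # hash: values indexed by string
my &double = sub ($n) { $n * 2 };          # code: a callable stored in a variable

say $answer;            # 42
say @primes[1];         # 3
say %stock{"pears"};    # 5
say double(21);         # 42
```

Note that a variable declared with the & sigil can be called by its bare name, just like an ordinary subroutine.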
Scalars
We've seen some uses of scalars above. Here we're going to show a slightly more comprehensive list.
class MyObject { } # A minimal class, declared here only so the last line has something to instantiate
my $x;
$x = 42; # Decimal Integer
$x = 0xF6; # Hexadecimal Integer
$x = 0b1010010001; # Binary Integer
$x = 3.1415; # Decimal Number (stored exactly, as a Rat)
$x = 2.34E-5; # Scientific Notation (a floating-point Num)
$x = "Hello "; # Double-Quoted String
$x = 'World!'; # Single-Quoted String
$x = q:to/EOS/; # Heredoc string
This is a heredoc string. It starts at the "q:to"
term and continues until we reach the terminator
specified between the slashes above. This is useful for
large multi-line string literals. We will talk about
heredocs in more detail later
EOS
$x = MyObject.new(); # Object Reference
Scalars are the most fundamental and basic type of data in Raku and are the ones that are probably going to be used most often in your programs. This is because they are so versatile.
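To see what a scalar is actually holding at any moment, one option is to ask the value for its type name with .^name. The short sketch below reuses a few of the literals from the list above; it is only meant to show that the same variable reports a different type after each assignment.

```raku
# .^name reports the type of the value currently held in the scalar.
my $x;
$x = 42;        say $x.^name;   # Int
$x = 0xF6;      say $x;         # 246
$x = 3.1415;    say $x.^name;   # Rat
$x = 2.34E-5;   say $x.^name;   # Num
$x = "Hello";   say $x.^name;   # Str
```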
Arrays
Arrays, as we mentioned above, are lists of data objects that are considered to be the same type. Since arrays are lists of scalars, it's possible for some elements in an array to be numbers and some to be strings and some to be other data entirely. However, this is generally not considered to be the best use of arrays.
Arrays are prefixed with the @ sigil, and can be indexed using integers in [ square brackets ]. Here are some examples of using arrays:
my @a;
@a = 1, 2, 3;
@a = "first", "second", "third";
@a = 1.2, 3.14, 2.717;
Once we have an array we can extract the scalar data items out of them using index notation:
my (@a, $x); # Parentheses are needed to declare several variables at once
@a = "first", "second", "third";
$x = @a[0]; # first
$x = @a[1]; # second
$x = @a[2]; # third
Arrays can also be multi-dimensional:
my @a;
@a[0; 0] = 1;
@a[0; 1] = 2;
@a[1; 0] = 3;
@a[1; 1] = 4;
# @a is now:
# |1, 2|
# |3, 4|
Arrays also don't have to just store scalars. They can store any other data items as well:
my (@a, @b, @c, @d, %e, %f, %g, &h, &i, &j);
@a = @b, @c, @d;
@a = %e, %f, %g;
@a = &h, &i, &j;
This can be the basis of some complex data structures, and we'll talk more about composing structures like this later.
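As a small sketch of such a composed structure, the example below stores two arrays inside a third and then reaches into the nested data with chained subscripts. The names @row1, @row2 and @grid are invented for this illustration.

```raku
my @row1 = 1, 2;
my @row2 = 3, 4;
my @grid = @row1, @row2;   # an array whose two elements are themselves arrays

say @grid.elems;      # 2
say @grid[1][0];      # 3
say @grid[0].elems;   # 2
```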
Hashes
Hashes are similar in many ways to arrays: They can contain a group of objects. However, unlike arrays, hashes use names for their items instead of numbers. Here are some examples:
my %a = "first" => 1, "second" => 2, "third" => 3;
my $x = %a{"first"}; # 1
my $y = %a{"second"}; # 2
my $z = %a{"third"}; # 3
The special => symbol is similar to a comma except it creates a pair. A pair is a combination of a string name and an associated data object. Hashes can sometimes be thought of as being arrays of pairs. Notice also that hashes use curly brackets to index their data instead of square brackets like arrays use.
Hashes can also use a special syntax called autoquoting to help make looking up hash values easier. Instead of using the curly brackets and quotes {" "}, you can use the angle brackets < > by themselves to do the same job:
my %a = "foo" => "first", "bar" => "second";
my $x = %a{"foo"}; # "first"
my $y = %a<bar>; # "second"
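Beyond looking values up, a hash can be extended and inspected. The following is a minimal sketch, using an invented %ages hash, of adding a key, counting entries, listing keys and testing whether a key exists with the :exists adverb.

```raku
my %ages = "alice" => 30, "bob" => 25;
%ages<carol> = 41;                      # add a new key
say %ages<carol>;                       # 41
say %ages.elems;                        # 3
say %ages.keys.sort.join(", ");         # alice, bob, carol
say (%ages<dave>:exists);               # False
```

Hash keys come back in no particular order, which is why the keys are sorted before being printed.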
Adverb Syntax
Pairs in a hash can be defined in another way without using the => operator. Pairs can also be defined using adverb syntax. Adverb syntax is used throughout Raku to provide named data values, so it's not just useful for hashes. Pairs of the form "name" => data can be written in adverb syntax as :name(data) instead.
my %foo = "first" => 1, "second" => 2, "third" => 3;
my %bar = :first(1), :second(2), :third(3); # Same!
We're going to see many uses for adverbs throughout Raku, so it's important to learn about them now.
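To make the equivalence concrete, the sketch below builds the same pair in both notations and compares them. The names $p and $q are placeholders chosen for this example.

```raku
my $p = (:answer(42));        # adverb syntax
my $q = ("answer" => 42);     # fat-arrow syntax

say $p.^name;     # Pair
say $p.key;       # answer
say $p.value;     # 42
say $p eqv $q;    # True
```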
$_ The Default Variable
Raku uses the special variable $_ as a default variable. $_ receives values when no other variables are provided, and it is the object on which a method is called when the call starts with a bare dot. $_ can be used explicitly by name or implicitly.
$_ = "Hello ";
.print; # Call method 'print' on $_
$_.print; # Same
print $_; # Same, but written as a sub;
given "world!" { # Different way of saying $_ = "world!"
.print; # Same as print $_;
}
Default variables can be useful in a number of places, such as loops, where they can be used to clean up the code and make actions more explicit. We'll talk more about default variables as we go.
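As a brief preview of the loop case, here is a sketch in which a for loop places each value in $_ in turn. The array name @shouted is made up for the example.

```raku
my @shouted;
for "alpha", "beta" {
    .say;                      # implicit: same as $_.say
    @shouted.push($_.uc);      # explicit use of $_
}
say @shouted.join(" ");        # ALPHA BETA
```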
Types and Context
Context
We've already talked about the various fundamental data types: scalars, arrays and hashes. Each variable's sigil puts it into a specific context. Different types of variables act differently when they are used in different contexts. There are two basic types of context, at least two that we are going to talk about right now: Scalar context and List context. Scalars are anything with the $ sigil, while Lists are things like Arrays and Hashes.
Diversion: say
We're going to take this time to talk about one of Raku's built-in functions: say. It prints a line of text to the console while the program is running. This is part of a larger system of input and output (I/O) that we will talk about in more detail later. The say function takes a string or a list of strings and prints them to the console.
say "hello world!";
say "This is ", "Raku speaking!";
Diversion: Ranges
A range is a list of values with some kind of numerical relationship between them. Ranges are created with the .. operator:
my @digits = 0..9; # (0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
Ranges can also use variables as the delimiters:
my $max = 15;
my $min = 12;
my @list = $min .. $max; # (12, 13, 14, 15);
A range is a separate type of object from an array entirely, even though a range will create an array-like list of values. Ranges implement a behavior called lazy evaluation: Instead of calculating a list of all values in the range first, the range is stored compactly as a starting and ending point only. Values can be calculated from a range later when a value is actually being read from it. This means that we can easily have infinite ranges without eating up all our computer's memory:
my @range = 1..Inf; # Infinite range, finite memory use
my $x = @range[1000000]; # Calculated on demand
Lazy evaluation isn't just a behavior of ranges, it's actually built into Raku directly and used in many places automatically. What this means is that large arrays won't necessarily take up a lot of memory, but instead the values can be calculated only when they are needed.
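The sketch below shows a few range operations alongside a lazily computed list. The @squares array is an invented example; its elements are only calculated when the first five are requested.

```raku
say (1..5).sum;                        # 15
say (1..^5).elems;                     # 4, because ..^ excludes the end point
my @squares = (1..Inf).map(* ** 2);    # nothing is computed yet
say @squares[^5];                      # (1 4 9 16 25)
```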
Context Specifiers
We can specify the context of a data item using various casting techniques. $( ) forces whatever is between the parentheses to be treated like a scalar, even if it isn't one normally. @( ) and %( ) do the same for arrays and hashes. Since ~ is always associated with strings and + is always associated with numbers, we can use these to cast values to strings and numbers respectively:
my Int $x = +"3"; # Becomes a number (the Int 3)
my Str $y = ~3; # The string "3"
my $z = $(2..4); # Three elements, but still a single Range object
my @w = @("key" => "value"); # Convert a pair to a one-element array
The examples above are not the only cases where things can be cast from one type to another. Casting allows us to force variables into a specific type. In some cases, complicated variable types or classes can demonstrate very different behavior depending on how it is cast. We'll talk about these kinds of cases in later chapters.
Here is a quick list of context specifiers:
+( )
- Convert to a number
~( )
- Convert to a string
$( )
- Convert to a scalar
@( )
- Convert to an array
%( )
- Convert to a hash
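A few of these specifiers in action, as a minimal sketch with invented variable names. Note that an array placed in numeric context yields its element count.

```raku
my @items = "a", "b", "c";
say +@items;            # 3
say ~@items;            # a b c
say +"42" + 1;          # 43
say (~42).^name;        # Str
my %h = %("x", 1, "y", 2);
say %h<y>;              # 2
```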
Types
We talked earlier about how Raku is a dynamic language and therefore the variables and data items in it don't have pre-defined types. We also mentioned, almost in a footnote, that Raku also has a type system that can be used optionally if you want it. Raku is a very flexible language and was designed to give the programmer a lot of latitude to program in different ways. One of the styles that Raku supports is structured, statically-typed programming.
If you specify a type for a variable, Raku will follow that and only allow data of that type to be used in that variable. This is very helpful in some cases because certain classes of operations will raise compile-time errors instead of runtime errors. Also, the compiler can be free to perform certain types of optimizations if it knows the type of a variable never changes. These behaviors are implementation-dependent, of course - and will only be helpful if you try to make use of them. The general rule is that the more information that you supply to the compiler, the more helpful analysis the compiler can make for you.
Lexical Variables
We also mentioned earlier that variables do not need to be explicitly declared before they are used. This was also a little bit of a stretch of the truth. By default, Raku requires every variable to be declared before its first use; the check can only be switched off with the no strict pragma. To declare a variable, you use the my keyword:
my $x = 5;
The my keyword defines a variable to be a local, lexical variable. The alternative is to define it with our, which makes it a global shared variable. Another name for a global variable in Raku is a "package variable", because it's available to the entire software package, or file.
our $x = 5;
Global variables are nice because you can use them all over the place without having to pass them around and keep track of them. However, unlike in kindergarten, sharing is not always the best idea in large programs.
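The difference between the two declarators can be sketched with a small module. The module name Counter and everything inside it are hypothetical, chosen only for this illustration: the our variable can be reached from outside through its package-qualified name, while the my variable stays private to the block.

```raku
module Counter {
    our $count = 0;            # package variable, visible as $Counter::count
    my  $step  = 1;            # lexical variable, invisible outside this block
    our sub bump() { $count += $step }
}

Counter::bump();
Counter::bump();
say $Counter::count;           # 2
```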
Built-In Types
Raku provides some built-in types that the Raku compiler knows about beforehand. You can always define your own types, but Raku makes a number of them available to you from the beginning. Here is a partial list of some of the basic types built in to Raku. This isn't a comprehensive list because some of the types that Raku has won't make any sense at this point.
- Bool
- A boolean value, true or false. Booleans are an enumerated type, which we will talk about in more depth a little bit later. Bool values can be True or False only.
- Int
- A basic integer value
- Array
- An array of values, indexed by an integer subscript
- Hash
- A hash of values indexed by a string
- Num
- A floating-point number
- Complex
- Like a floating-point number, but with an imaginary part as well as a real part.
- Pair
- We briefly mentioned pairs when talking about hashes. A pair is a combination of a data object and a string.
- Str
- A string data object
With these types, you can write statically-typed code just like you can in a normal statically-typed language:
my Int $x;
$x = 5; # Good, it's an integer
$x = "five"; # Bad, just the name of an integer
We can also use the type system to catch errors as we move data between variables:
my Int $foo = 5;
my Num $bar = 3.14e0; # The exponent makes the literal a Num; a plain 3.14 is a Rat
$foo = $bar; # ERROR! $bar is not an Int!
In this way, the compiler can tell you if you are trying to use a variable in a way you hadn't intended.
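When the offending value is only known at runtime, the type check fails as an exception that can be caught. The following is a minimal sketch, with invented variable names, that uses try to trap the failed assignment and shows that the typed variable keeps its previous value.

```raku
my Int $count = 5;
my $input = "five";                   # an untyped scalar holding a Str

try { $count = $input; }              # the assignment fails its type check
say $! ?? "rejected" !! "accepted";   # rejected
say $count;                           # 5, the old value is untouched
```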
Basic Operations
Operators
In the past few chapters we've been looking at Raku data, and the variables that hold that data. Now we'll explore what kinds of things we can do with that data once we have it. Raku has a series of normal arithmetic operators that can be applied to integer and floating-point number values, and even a few operators for other data types too. Once we learn about all the normal operators, we can start to look at meta operators which take those same concepts and apply them to different contexts.
An operator works on its operands, the quantities on which the operation is performed. To understand any operator, you have to know its arity (how many operands it takes). It's called a unary operator if it takes one operand, binary if two, and ternary if three.
Arithmetic Unary Operators
The simplest arithmetic operators are the unary sign operators + and -. These get applied to the front of a numerical value to affect the sign of that number:
my Int $x = 5;
my Int $y = -$x; # Now, $y is -5
There is also a prefix + operator which doesn't invert the sign.
Arithmetic Binary Operators
Raku has a number of arithmetic operators, like other programming languages.
my $x = 12;
my $y = 3;
my $z;
$z = $x + $y; # $z is 15
$z = $x - $y; # $z is 9
$z = $y - $x; # $z is -9
$z = $x * $y; # $z is 36
$z = $x / $y; # $z is 4
$z = $x % $y; # $z is 0 (remainder)
Arithmetic operators expect numerical arguments, so arguments will be converted to numbers automatically as if we used the contextualizer +( ).
my Str $x = "123";
my Str $y = "456";
my Int $z = $x + $y; # 579
my Int $w = +($x) + +($y); # Same, but more verbose
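A few further cases are worth trying out, shown here as a short sketch. The div and ** operators do not appear in the list above; they are Raku's integer-division and exponentiation operators.

```raku
say 7 / 2;        # 3.5
say 7 div 2;      # 3    (integer division)
say 7 % 2;        # 1
say 2 ** 10;      # 1024
say "12" + "3";   # 15   (both strings are converted to numbers first)
```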
Strings and ~
In Raku, the symbol ~ is always associated with strings. When used as a prefix, it turns whatever variable it was used on into a string if it wasn't before. When used as an operator between two strings, it joins the strings together end-to-end in a process called concatenation.
Stringification
We already talked about using ~( ) as the string context specifier. This is known as stringification. Stringifying converts a variable from other types into a string representation.
Concatenation
Two strings can be joined together to produce a single new string:
my Str $x = "hello " ~ "world!";
The ~ operator automatically stringifies any arguments that aren't strings. So we can write this:
my Int $foo = 5;
my Str $bar = "I have " ~ $foo ~ " chapters to write";
print $bar;
Which will print out the string: I have 5 chapters to write.
In most cases it's probably easier to use interpolation, which we will talk about next.
Interpolation
We showed briefly before that there are three basic types of strings: double-quoted strings, single-quoted strings, and heredocs. Single-quoted and double-quoted strings may look similar, but they behave differently from each other. The difference is interpolation.
Double-quoted strings are interpolated. Variable names that appear inside the string are converted to their string value and included in the string. Single-quoted strings do not have this behavior:
my Int $x = 5;
my Str $foo = "The value is $x"; # The value is 5
my Str $bar = 'The value is $x'; # The value is $x
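Interpolation is not limited to plain scalars. As a sketch with invented variable names: a block in curly brackets inside a double-quoted string is evaluated and its result inserted, and an array is interpolated when it is followed by empty square brackets.

```raku
my $name  = "Raku";
my @langs = "Perl", "Raku";
my %year  = "Raku" => 2019;

say "Hello, $name!";                    # Hello, Raku!
say "Next year: { %year<Raku> + 1 }";   # Next year: 2020
say "Languages: @langs[]";              # Languages: Perl Raku
say 'No interpolation: $name';          # No interpolation: $name
```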
Increment and Decrement
Increasing or decreasing a variable by one is common enough to warrant specific operators. The ++ and -- operators can be used as prefixes or suffixes to a scalar variable. The two positions behave in subtly different ways.
my Int $x = 5;
$x++; # 6 (increment is done after)
++$x; # 7 (increment is done before)
$x--; # 6 (as above)
--$x; # 5 (as above)
The prefix and suffix forms appear to do the same thing in the example code above. The code ++$x and $x++ both perform the same action, but the time at which the action happens is different. Let's demonstrate this with some examples:
my Int $x = 5;
my Int $y;
$y = $x++; # $y is 5, $x is 6
$y = ++$x; # $y is 7, $x is 7
$y = $x--; # $y is 7, $x is 6
$y = --$x; # $y is 5, $x is 5
The prefix version performs the increment or decrement before the variable is used in the statement. The postfix version performs the increment or decrement after the variable is used.
Control Structures
Flow Control
We've seen in earlier chapters how to create variables and use them to perform basic arithmetic and other operations. Now we're going to introduce the idea of flow control, using special constructs for branching and looping.
Perl shared many common flow control construct names with other languages such as C, C++, and Java. Raku has chosen to rename some of these common constructs. We'll post notes near the particularly confusing ones.
Blocks
Blocks are chunks of code inside { } curly brackets. These are set apart from the rest of the nearby code in a variety of ways. They also define a new scope for variables: Variables defined with my and used inside a block are not visible or usable outside the block. This enables code to be compartmentalized, to ensure that variables only intended for temporary use are only used temporarily.
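A minimal sketch of block scoping: the inner my declares a second, separate variable that exists only inside the block, so the outer variable is unaffected once the block ends.

```raku
my $x = "outer";
{
    my $x = "inner";    # a new variable that shadows the outer $x
    say $x;             # inner
}
say $x;                 # outer
```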
Branching
Branching can occur using one of two statements: an if and an unless. An if may optionally have an else clause as well. An if statement evaluates a given condition and, if the condition is true, the block following the if is executed. When the condition is false, the else block, if any, is executed instead. The unless statement does the opposite. It evaluates a condition and only executes its block when the condition is false. You cannot use an else clause when using unless.
Relational Operators
There are a variety of relational operators that can be used to determine a truth value. Here are some:
$x == $y; # $x and $y are equal
$x > $y; # $x is greater than $y
$x >= $y; # $x is greater than or equal to $y
$x < $y; # $x is less than $y
$x <= $y; # $x is less than or equal to $y
$x != $y; # $x is not equal to $y
All of these operators return a boolean value, and can be assigned to a variable:
$x = (5 > 3); # $x is True
$y = (5 == 3); # $y is False
The parentheses above are used only for clarity; they are not actually necessary.
if/unless
Let's start off with an example:
my Int $x = 5;
if ($x > 3) {
say '$x is greater than 3'; # This prints
}
else {
say '$x is not greater than 3'; # This doesn't
}
Notice in the example above that there is a space between the if and the ($x > 3). This is important and is not optional. The parsing rules for Raku are clear on this point: Any word followed immediately by a ( opening parenthesis is treated as a subroutine call. The space differentiates this statement from a subroutine call and lets the parser know that this is a conditional:
if($x > 5) { # Calls subroutine "if"
}
if ($x > 5) { # An if conditional
}
To avoid all confusion, the parentheses can be safely omitted:
if $x > 5 { # Always a condition
}
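When there are more than two cases, elsif clauses can be chained between the if and the else. The sketch below wraps such a chain in a subroutine; the name size-of and its thresholds are invented for the example.

```raku
# Classify a number using an if/elsif/else chain (thresholds are arbitrary).
sub size-of(Int $n) {
    if $n < 10      { return "small"  }
    elsif $n < 100  { return "medium" }
    else            { return "large"  }
}

say size-of(5);      # small
say size-of(42);     # medium
say size-of(1000);   # large
```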
unless
The unless statement has the opposite behavior of if:
my Int $x = 5;
unless $x > 3 {
say '$x is not greater than 3'; # This doesn't print
}
No else clause is allowed after unless.
Postfix Syntax
The if and unless keywords aren't just useful for marking blocks to be conditionally executed. They can also be applied in a natural way to the end of a statement to only affect that one statement:
$x = 5 if $y == 3;
$z++ unless $x + $y > 8;
Each of the two lines of code above only executes if its condition is satisfied. The first sets $x to 5 if $y is equal to 3. The second increments $z unless the sum $x + $y is greater than 8.
Smart Matching
Sometimes you want to check if two things match. The relational operator == checks if two numeric values are equal, but that's very limited. What if we wanted to check other equality relationships? What we want is an operator that just does what we mean, no matter what that might be. This magical operator is the smart match operator ~~.
Now, when you see the ~~ operator, you probably immediately think about strings. The smart match operator does a lot with strings, but isn't restricted to them.
Here are some examples of the smart match operator in action:
5 ~~ "5"; # true, compared as strings because the right-hand side is a Str
["a", "b"] ~~ (**, "a", **); # true, "a" contained in the array
"b" ~~ %("a" => 1, "b" => 2); # true, hash contains a "b" key
"c" ~~ /c/; # true, "c" matches the regex /c/
3 ~~ Int # true, 3 is an Int
As you can see, the smart match operator can be used in a variety of ways to test two things to see if they match in some way. Above we saw an example of a regular expression, which we will discuss in more detail in later chapters. This also isn't a comprehensive list of things that can be matched; we will see more throughout the book.
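As a rough sketch of how the right-hand side decides what "matching" means, the same value can be tested against a type, a range, and a regex. The variable $n is introduced here only for illustration:

```raku
my $n = 7;

say $n ~~ Int;                  # True: 7 is an Int
say $n ~~ 1..10;                # True: 7 lies within the range
say $n ~~ Str;                  # False: 7 is not a Str
say ("hello" ~~ /ell/).so;      # True: .so turns the resulting Match object into a Bool
```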
Given / When
Raku has a facility for matching a value against a number of different alternatives. This structure consists of the given and when blocks.
given $x {
when Bool { say '$x is the boolean quantity ' ~ $x; }
when Int { when 5 { say '$x is the number 5'; } }
when "abc" { say '$x is the string "abc"'; }
}
Each when is a smart match, and the alternatives are tried in the order they are written. The code above is equivalent to this:
if $x ~~ Bool {
say '$x is the boolean quantity ' ~ $x;
}
elsif $x ~~ Int {
if $x ~~ 5 {
say '$x is the number 5';
}
}
elsif $x ~~ "abc" {
say '$x is the string "abc"';
}
The given/when structure is more concise than the equivalent if/elsif chain, and internally it might be implemented in a more optimized way.
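Here is a minimal sketch of given/when wrapped in a subroutine, with a default branch for values that match nothing else. The name describe and the returned strings are made up for this example. The order of the branches matters: Bool is tested before Int because a Bool also counts as an Int.

```raku
sub describe($x) {
    given $x {
        when Bool  { return "a boolean" }           # must come before Int
        when "abc" { return 'the string "abc"' }
        when 5     { return "the number 5" }
        when Int   { return "some other integer" }
        default    { return "something else" }      # runs when nothing above matched
    }
}

say describe(True);    # a boolean
say describe(5);       # the number 5
say describe(42);      # some other integer
say describe("abc");   # the string "abc"
say describe(3.14);    # something else
```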
Loops
Loops are ways to repeat certain groups of statements more than once. Raku has a number of available types of loops that can be used, each of which has different purposes.
for loops
for blocks take an array or range argument, and iterate over every element. In the most basic case, for assigns each successive value to the default variable $_. Alternatively, a specific variable can be listed to receive the value. Here are several examples of for blocks:
# Prints the numbers "12345"
for 1..5 { # Assign each value to $_
.print; # print $_;
}
# Same thing, but using an array
my @nums = 1..5;
for @nums {
.print;
}
# Same, but uses an array that's not a range
my @nums = (1, 2, 3, 4, 5);
for @nums {
.print;
}
# Using a different variable than $_
for 1..5 -> $var {
print $var;
}
In all the examples above, the array argument to for can optionally be enclosed in parentheses too. The special "pointy" syntax -> will be explained in more detail later, although it's worth noting here that we can extend it to read multiple values from the array at each loop iteration:
my @nums = 0..5;
for @nums -> $even, $odd {
say "Even: $even Odd: $odd";
}
This prints the following lines:
Even: 0 Odd: 1
Even: 2 Odd: 3
Even: 4 Odd: 5
for can also be used as a statement postfix, like we saw with if and unless, although with some caveats:
print $_ for (1..5); # Prints "12345"
print for (1..5); # Parse Error! Print requires an argument
.print for 1..5; # Prints "12345"
loop
C programmers will recognize the behavior of the loop construct, which has the same format and behavior as the for loop in C. Raku has reused the name for for the array looping construct that we saw in the previous section, and uses the name loop to describe the incremental behavior of C's loops. Here is the loop structure:
loop (my $i = 0; $i <= 5; $i++) {
print $i; # "012345"
}
In general, loop takes these three components:
loop ( INITIALIZER ; CONDITION ; INCREMENTER )
The INITIALIZER in a loop is a statement that executes once, before the loop begins. Note that a variable declared there with my belongs to the scope that encloses the loop, not to the loop body, so it remains visible after the loop ends. The CONDITION is a boolean test that's checked before every iteration. If the test is false, the loop exits; if it is true, the loop repeats. The INCREMENTER is a statement that runs at the end of each iteration, before the condition is checked again. Any of these parts may be omitted. Here are five ways to write the same loop:
loop (my $i = 0; $i <= 5; $i++) {
print $i; # "012345"
}
my $i = 0; # Declared before the loop rather than in the INITIALIZER
loop ( ; $i <= 5; $i++) {
print $i; # "012345"
}
loop (my $i = 0; $i <= 5; ) {
print $i; # "012345"
$i++;
}
loop (my $i = 0; ; $i++) {
last unless ($i <= 5);
print $i; # "012345"
}
my $i = 0;
loop ( ; ; ) {
last unless ($i <= 5);
print $i; # "012345"
$i++;
}
If you want an infinite loop, you can also omit the parentheses instead of using (;;):
my $i = 0;
loop { # Possibly infinite loop
last unless ($i <= 5);
print $i; # "012345"
$i++;
}
repeat blocks
A repeat block executes its body at least once, because the condition is checked after the block. In the example below, even though $i is larger than two, the block still runs once.
my $i = 3;
repeat {
say $i;
} while $i < 2;
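To make the difference from an ordinary while loop concrete, the following sketch counts how often each body runs with the same condition. The counter names are invented for this example:

```raku
my $i = 3;
my $while-runs  = 0;
my $repeat-runs = 0;

# The condition is checked first, so the body never runs
while $i < 2 {
    $while-runs++;
}

# The body runs once before the condition is checked
repeat {
    $repeat-runs++;
} while $i < 2;

say "while: $while-runs, repeat: $repeat-runs";   # while: 0, repeat: 1
```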
Subroutines
Subroutines
When it comes to code reuse, the most basic building block is the subroutine. Subroutines are not the only building blocks in the toolkit, however: Raku also supports methods and submethods, which we'll discuss when we talk about classes and objects.
Subroutines are created with the sub keyword, followed by a name, an optional list of parameters, and then a block of code.
Blocks
Blocks are groups of code contained in { } curly brackets. Blocks serve a number of purposes, including setting code apart, grouping several statements together, and creating a scope for variables.
Defining Subroutines
Subroutines are defined using the sub keyword. Here is an example:
sub mySubroutine () {
}
The parentheses are used to define the list of formal parameters to the subroutine. Parameters are like regular my local variables, except they are initialized with values when the subroutine is called. Subroutines can pass a result back to their caller using the return keyword:
sub double ($x) {
my $y = $x * 2;
return $y;
}
Optional Parameters
Optional parameters have a ? after them. Also, optional parameters may be given a default value with =. Required parameters may have a ! after them, although this is the default for positional parameters. All required parameters must be listed before all optional ones.
sub foo (
$first, # First parameter, required
$second!, # Second parameter, required
$third?, # Third parameter, optional (undefined if not passed)
$fourth = 4 # Fourth parameter, optional (defaults to 4)
)
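A short sketch of how optional parameters behave at the call site. The subroutine greet and its parameter names are hypothetical, and the // operator is used to supply a fallback when the optional parameter was not passed:

```raku
sub greet($name, $greeting = "Hello", $punctuation?) {
    # $punctuation is undefined when the caller leaves it out
    my $mark = $punctuation // "!";
    return "$greeting, $name$mark";
}

say greet("Camelia");              # Hello, Camelia!
say greet("Camelia", "Hi");        # Hi, Camelia!
say greet("Camelia", "Hi", "?");   # Hi, Camelia?
```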
Named Parameters
Normal arguments are passed by their position: the first argument goes into the first positional parameter, the second goes into the second, and so on. However, there is also a way to pass arguments by name, and to do so in any order. Named arguments are basically pairs, where a string name is associated with a data value. Named data values can be passed using either pair or adverb syntax.
sub mySub(:name($value), :othername($othervalue))
Subroutine signatures also allow a shorthand that you can use if the variable has the same name as the pair's key:
sub mySub(:name($name), :othername($othername))
sub mySub(:$name, :$othername) # Same!
In a subroutine declaration, named parameters must come after all required and optional positional parameters. Named parameters are treated as optional by default unless they are followed by a !. Actually, you can put a ! after required positional parameters as well, but that's the default.
sub mySub(
:$name!, # Required
:$type, # Optional
:$method? # Still optional
)
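The following sketch shows named parameters being passed in both pair and adverb syntax, in any order. The subroutine make-url and its parameters are invented for illustration:

```raku
sub make-url(:$host!, :$port = 80, :$secure) {
    # :$secure is optional and undefined (false) unless passed
    my $scheme = $secure ?? "https" !! "http";
    return "{$scheme}://{$host}:{$port}";
}

say make-url(host => "example.org");                        # http://example.org:80
say make-url(:secure, :port(8443), :host("example.org"));   # https://example.org:8443
```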
Slurpy Parameters
Raku also allows so-called "slurpy" parameters using the *@ syntax.
sub mySub($scalar, @array, *@theRest) {
say "the first argument was: $scalar";
say "the second argument was: " ~ @array;
say "the rest were: " ~ @theRest;
}
The *@ tells Raku to flatten the remaining arguments into a list and store them in the array @theRest. Because @array is declared as an ordinary array parameter, a whole array can be passed in that position as a single argument, without requiring explicit references.
my $first = "scalar";
my @array = 1, 2, 3;
mySub($first, @array, "foo", "bar");
The above code will output three lines:
- the first argument was: scalar
- the second argument was: 1 2 3
- the rest were: foo bar
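As a further sketch, a slurpy parameter also flattens any arrays passed among the trailing arguments, and it is simply empty when no extra arguments are given. The subroutine add-up is a made-up example:

```raku
sub add-up($label, *@numbers) {
    # @numbers holds every argument after $label, flattened into one list
    return $label ~ ": " ~ @numbers.sum;
}

my @more = 4, 5;
say add-up("sum", 1, 2, 3);    # sum: 6
say add-up("sum", 1, @more);   # sum: 10
say add-up("empty");           # empty: 0
```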
return and want
Calling Subroutines
Once we have a subroutine defined, we can call it later to retrieve results or perform actions. We've already seen the built-in say function, which prints the strings passed to it to the console. We can use our double function from above to calculate various values:
my $x = double(2); # 4
my $y = double(3); # 6
my $z = double(3.5); # 7
We can use the & sigil to store a reference to the subroutine in a normal scalar variable:
my $sub = &double;
my $x = $sub(7); # 14
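The & sigil can also be used in a signature, so that one subroutine can receive another as an argument. This is a minimal sketch; apply-twice is a hypothetical helper, and double is redefined so the example stands on its own:

```raku
sub double($x) { return $x * 2 }

# &f receives a code reference and can be called like a normal subroutine
sub apply-twice(&f, $value) {
    return f(f($value));
}

my $quadruple = apply-twice(&double, 3);
say $quadruple;   # 12
```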
Multi Subroutines
In this example, you see that we are passing both integer values and fractional values to our double subroutine. However, we can use type specifiers to restrict what kinds of values the subroutine accepts:
sub double (Int $x) { # $x can only be an int!
return $x * 2;
}
my $foo = double(4); # 8
my $bar = double(1.5); # Error!
Raku allows you to write multiple subroutines with the same name, so long as they have different parameter signatures and are marked with the keyword multi. This is called multiple dispatch, and is an important aspect of Raku programming.
multi sub double(Int $x) {
my $y = $x * 2;
say "Doubling an Integer $x: $y";
return $x * 2;
}
multi sub double(Real $x) { # Real also accepts Rat literals such as 3.5, which Num would reject
my $y = $x * 2;
say "Doubling a Number $x: $y";
return $x * 2;
}
my $foo = double(5); # Doubling an Integer 5: 10
my $bar = double(3.5); # Doubling a Number 3.5: 7
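Candidates can also differ in the number of parameters rather than their types. A small sketch, with area as an invented name:

```raku
multi sub area($side)           { return $side * $side }     # square
multi sub area($width, $height) { return $width * $height }  # rectangle

say area(3);      # 9
say area(3, 4);   # 12
```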
Anonymous Subroutines
Instead of naming a subroutine like normal, we can define an anonymous subroutine and store a reference to it in a variable.
my $double = sub ($x) { return $x * 2; };
my $triple = sub ($x) { return $x * 3; };
my $foo = $double(5); # 10
my $bar = $triple(12); # 36
Notice that we could also store these code references in an array:
my @times;
@times[2] = sub ($x) { return $x * 2; };
@times[3] = sub ($x) { return $x * 3; };
my $foo = @times[2](7); # 14
my $bar = @times[3](5); # 15
Blocks and Closures
About Blocks
When we talked about subroutines we saw that a subroutine declaration consists of three parts: the subroutine name, the subroutine parameter list, and the code block of subroutine internals. Blocks are very fundamental in Raku, and we're now going to use them to do all sorts of cool things.
We've seen a few blocks used in various constructs already:
# if/else statements
if $x == 1 {
}
else {
}
# subroutines
sub thisIsMySub () {
}
# loops
for @ary {
}
loop (my $i = 0; $i <= 5; $i++) {
}
repeat {
} while $x == 1;
All these blocks serve the purpose of grouping lines of code together for a particular purpose. In an if block, the statements inside the block are all executed when the if condition is true. The entire block is not executed if the condition is false. In a loop, all the statements in the loop block are executed together in repetition.
Scope
In addition to keeping like code together, blocks also introduce the notion of scope. my variables defined inside a block are not visible outside it. Scope ensures that variables are only accessible where they are needed, and that they are not modified where they are not supposed to be. Blocks don't need to be associated with any particular construct, like an if or a loop. Blocks can exist all by themselves:
my $x = 5;
my $y = 5;
{
my $y = 3;
say $x; # 5
say $y; # 3
}
say $x; # 5
say $y; # 5
The example shows the idea of scope very nicely: the variable $y inside the block is not the same as the variable $y outside the block. Even though they have the same name, they have a different scope. Here's a slightly different example:
my $x = 5;
{
my $y = 7;
{
my $z = 9;
say $x; # 5
say $y; # 7
say $z; # 9
}
say $x; # 5
say $y; # 7
say $z; # ERROR: Undeclared variable!
}
say $x; # 5
say $y; # ERROR! Undeclared variable!
say $z; # ERROR! Undeclared variable!
The variable $x is visible from the point where it was defined, including inside all scopes nested within the scope where it was defined. $y, however, is only visible inside the block it was defined in, and the block inside that. $z is only visible in the innermost block.
Scope Variables
Scopes can be specified exactly in cases where there is ambiguity. We can use keywords like OUTER to refer to a variable from the scope directly enclosing the current scope:
my $x = 5;
{
my $x = 6;
say $x; # 6
say $OUTER::x; # 5
}
Subroutines have access to the scope from which they are called using the CALLER scope, provided that the variable in the calling scope was declared with the is dynamic trait:
my $x is dynamic = 5;
mySubroutine(7);
sub mySubroutine($x) {
say $x; # 7
say $CALLER::x; # 5
}
Coderefs
Blocks can be stored in a single scalar variable as a coderef. Once stored in a coderef variable, the block can be executed like a regular subroutine reference:
my $dostuff = {
print "Hello ";
say "world!";
}
$dostuff();
Closures
We see in the example above that a block can be stored in a variable. This action creates a closure. A closure is a stored block of code that saves its current state and current scope, which can be accessed later. Let's see a closure in action:
my $block;
{
my $x = 2;
$block = { say $x; };
}
$block(); # Prints "2", even though $x is not in scope anymore
The closure saves a reference to the $x variable when the closure is created, and that reference stays valid even if the variable is no longer in scope when the code block is executed.
If we change $x later on, the closure will see the changed value, so if you want to create multiple closures with different enclosed variables, you have to create a new variable each time:
my @times = ();
for 1..10 {
my $t = $_; # each subroutine gets a different $t
@times[$_] = sub ($a) { return $a * $t; };
}
say @times[3](4); # 12
say @times[5](20); # 100
say @times[7](3); # 21
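Because a closure keeps a reference to the variable rather than a copy of its value, it can also carry state between calls. The sketch below uses a hypothetical make-counter subroutine; each call to it creates a fresh $count that only the returned closure can see:

```raku
sub make-counter($start = 0) {
    my $count = $start;                  # a new variable for every call
    return sub { return ++$count; };     # the closure keeps $count alive
}

my $first  = make-counter();
my $second = make-counter(10);

say $first();    # 1
say $first();    # 2
say $second();   # 11
say $first();    # 3
```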
Captures
Pointy Blocks
We can use the sub keyword to create a subroutine or a subroutine reference. This isn't the only syntax to do this, and in fact it is a little more verbose than it needs to be for the common case of an unnamed ("anonymous") subroutine or subroutine reference. For these, we can use a construct called a pointy block. Pointy blocks, which are similar to what other languages call lambdas, are very useful. They can create a code reference like an anonymous subroutine, and more generally they create blocks of code with parameters.
We've seen pointy blocks briefly when we talked about loops, where we used them to give names to the loop variable instead of relying on the default variable $_. That is the reason to use pointy blocks in these situations: they let us specify variable names to use as parameters to an arbitrary block of code.
We'll show a few examples:
my @myArray = (1, 2, 3, 4, 5, 6);
# In a loop:
for @myArray -> $item {
say $item;
# Output is:
# 1
# 2
# 3
# 4
# 5
# 6
}
# In a loop, multiples
for @myArray -> $a, $b, $c {
say "$a, $b, $c";
# Output is:
# 1, 2, 3
# 4, 5, 6
}
# As a condition:
my $x = 5;
if ($x) -> $a { say $a; } # 5
# As a coderef
my $pair = -> $a, $b { say "First: $a, Second: $b"; }
$pair(1, 2); # First: 1, Second: 2
$pair("x", "y"); # First: x, Second: y
# As an inline coderef
-> $a, $b { say "First: $a, Second: $b"; }(1, 2);
# In a while loop
while ($x == 5) -> $a {
say "Boolean Value: $a"; # Boolean Value: True
$x++; # Change $x so that the loop ends
}
Placeholder Arguments
In a block, if we don't want to go through the hassle of writing out an argument list, we can use placeholder arguments. Placeholders use the special ^ twigil. Passed values are assigned to placeholder variables in alphabetical order:
for 1..3 {
say $^a; # 1
say $^c; # 3
say $^b; # 2
}
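Placeholder arguments are handy for short blocks passed to methods such as sort and map. In this sketch (the word list is arbitrary), $^a receives the first argument and $^b the second because of their alphabetical order:

```raku
my @words = <pear fig banana>;

# Two placeholders make a two-argument comparison block
my @by-length = @words.sort({ $^a.chars <=> $^b.chars });
say @by-length;   # [fig pear banana]

# Using the same placeholder twice still means a single parameter
my @squares = (1, 2, 3).map({ $^n * $^n });
say @squares;     # [1 4 9]
```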
Classes And Attributes
Classes and Objects
What we've seen so far are the building blocks for procedural programming: Lists of expressions, branches, loops, and subroutines that tell the computer what job to do and exactly how to do it. Raku supports procedural programming very well, but this isn't the only style of programming that Raku supports. Another common paradigm that Raku fits nicely is the object-oriented approach.
Objects are combinations of data and the operations that act on that data. The data of an object are called its attributes, and the operations of an object are called its methods. In this sense, the attributes define the state and the methods define the behavior of the objects.
Classes are templates for creating objects. When referring to an object of a specific class, it's customary to call the object an instance of the class.
Classes
Classes are defined using the class keyword, and are given a name:
class MyClass {
}
Inside that class declaration you can define attributes, methods, or submethods.
Attributes
Attributes are defined with the has keyword, and are specified with a special syntax. For example, let's consider the following class:
class Point3D {
has $!x-axis;
has $!y-axis;
has $!z-axis;
}
The class Point3D defines a point in 3D coordinates with three attributes named x-axis, y-axis and z-axis.
In Raku, all attributes are private, and one way to express this explicitly is by using the ! twigil. An attribute declared with the ! twigil can only be accessed directly within the class, by writing $!attribute-name. Another important consequence of declaring attributes this way is that objects cannot be populated by using the default new constructor.
If you declare an attribute with the . twigil instead, a read-only accessor method will be automatically generated. You can think of the . twigil as "attribute + accessor". This accessor, which is a method named after its attribute, can be called from outside the class and returns the value of its attribute. In order to allow changes to the attributes through the provided accessors, the trait is rw must be added to the attributes.
The previous class Point3D could be declared as follows:
class Point3D {
has $.x-axis;
has $.y-axis;
has $.z-axis is rw;
}
Given that the . twigil declares a ! attribute and an accessor method, attributes can always be used with the ! twigil inside the class even if they're declared using the . twigil.
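The three kinds of declaration can be compared side by side in a small sketch. The class Sensor and its attributes are invented for this example:

```raku
class Sensor {
    has $.name;              # read-only accessor
    has $.reading is rw;     # read-write accessor
    has $!secret = 42;       # private: no accessor, cannot be set through new

    method reveal { return $!secret }
}

my $s = Sensor.new(name => "probe", reading => 1.5);

say $s.name;                   # probe
$s.reading = 2.5;              # allowed because of "is rw"
say $s.reading;                # 2.5
try { $s.name = "other" };     # fails: the accessor is read-only
say $s.name;                   # probe
say $s.reveal;                 # 42
```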
Methods
Methods are defined just like normal subroutines except for a few key differences:
- Methods use the method keyword instead of sub.
- Methods have the special variable self, which refers to the object that the method is being called on. This is known as the invocant.
- Methods can access the private attributes of the object directly.
When defining the method, you can specify a different name for the invocant, instead of having to use self. To do this, you put it at the beginning of the method's signature and separate it from the rest of the signature with a colon:
method myMethod($invocant: $x, $y)
In this context, the colon is treated like a special type of comma, so you can write it with additional whitespace if that is easier:
method myMethod($invocant : $x, $y)
Here's an example:
class Point3D {
has $.x-axis;
has $.y-axis;
has $.z-axis;
method set-coord($new-x, $new-y, $new-z) {
$!x-axis = $new-x;
$!y-axis = $new-y;
$!z-axis = $new-z;
}
method print-point {
say "("~$!x-axis~","~$!y-axis~","~$!z-axis~")";
}
# method using the self invocant
method distance-to-center {
return sqrt(self.x-axis ** 2 + self.y-axis ** 2);
}
# method using a custom invocant named $rect
method polar-coordinates($rect:) {
my $r = $rect.distance-to-center;
my $theta = atan2($rect.y-axis, $rect.x-axis);
return "("~$r~","~$theta~","~$rect.z-axis~")";
}
}
Objects
Objects are data items whose type is a given class. Objects contain any attributes that the class defines, and also have access to any methods in the class. Objects are created by calling the new method on the class.
Using the class Point3D:
my $point01 = Point3D.new();
The class constructor, new(), can take named arguments used to initialize any of the class attributes:
# Either syntax would work for object initialization
my $point01 = Point3D.new(:x-axis(3), :y-axis(4), :z-axis(6));
my $point02 = Point3D.new(x-axis => 3, y-axis => 4, z-axis => 6);
Methods from that class are called using the dot notation. This is the object, a period, and the name of the method after that.
$point01.print-point();
say $point01.polar-coordinates();
When an object isn't provided for dot notation method calls, the default variable $_ is used instead:
$_ = Point3D.new(:x-axis(6), :y-axis(8), :z-axis(6));
.print-point();
say .polar-coordinates();
Inheritance
Basic class systems enable data and the code routines that operate on that data to be bundled together in a logical way. However, there are more advanced features of most class systems that enable inheritance too, which allows classes to build upon one another. Inheritance is the ability for classes to form logical hierarchies. Raku supports normal inheritance of classes and subclasses, but also supports special advanced features called mixins and roles. We are going to have to reserve some of the more difficult of these features for later chapters, but we will introduce some of the basics here.
Basic Types
We've talked about a few of Raku's basic types in earlier chapters. It may surprise you to know that all Raku data types are classes, and that all these values have built-in methods that can be used. Here we're going to talk about some of the methods that can be called on some of the various objects that we've seen so far.
.print and .say
We've already seen the print and say builtin functions. All built-in classes have methods of the same name that print a stringified form of the object.
.raku and EVAL
We're going to take a quick digression and talk about the EVAL function. EVAL lets us compile and execute a string of Raku code at runtime.
EVAL("say 'hello world!';");
All Raku objects have a method called .raku (available in older versions under the name .perl) that returns a string of Raku code representing that object.
my Int $x = 5;
$x.raku.say; # "5"
my @y = (1, 2, 3);
@y.raku.say; # "[1, 2, 3]"
my %z = :first(1), :second(2), :third(3);
%z.raku.say; # "{:first(1), :second(2), :third(3)}" (key order may vary)
Context and Coercion methods
There are a number of methods that can be called to explicitly change the given data item into a different form. This is like an explicit way to force the given data item to be taken in a different context. Here is a partial list:
.item - Returns the item in scalar context.
.iterator - Returns an iterator for the object. We'll talk about iterators in a later chapter.
.hash - Returns the object in hash context.
.list - Returns the object in array or "list" context.
.Bool - Returns the boolean value of the object.
.Array - Returns an array containing the object data.
.Hash - Returns a hash containing the object data.
.Iterator - Returns an iterator for the object. We'll talk about iterators in a later chapter.
.Scalar - Returns a scalar reference to the object.
.Str - Returns a string representation for the object.
Introspection Methods
.WHENCE - Returns a code reference for the object type's autovivification closure. We'll talk about autovivification and closures later.
.WHERE - Returns the memory location address of the data object.
.WHICH - Returns the object's identity value, which for most objects is just its memory location (its .WHERE).
.HOW - (HOW = Higher Order Workings) Returns the metaclass which handles this object.
.WHAT - Returns the type object of the current object.
Comments and POD
Comments
Now we've covered most of the basics of Raku programming. By no means have we covered the language in its entirety. However, we have seen the basic kinds of tools that we would need for ordinary programming tasks. There is much more to learn, many advanced tools and features that can be used to make common tasks easier, and hard tasks possible. We'll get on to some of those more advanced features in a bit, but in this chapter we want to wrap up the "Basics" section by talking a little about comments and documentation.
We mentioned previously that comments are notes in the source code that are intended to be read by programmers and are ignored by the Raku interpreter. The most common form of comment in Raku is the single-line comment, which starts with a single hash character # and extends until the end of the line:
# Calculate factorial of a number using recursion
sub factorial (Int $n) {
return 1 if $n == 0; # This is the base case
return $n * factorial($n - 1); # This is the recursive call
}
When the above is executed, all the text prefixed with a single hash character # will be ignored by the Raku interpreter.
Multi-Line Comments
While Perl doesn't provide multi-line comments, Raku does. A multi-line comment in Raku starts with a single hash character, followed by a backtick and then an opening bracketing character, and it ends with the matching closing bracketing character:
sub factorial(Int $n) {
#`( This function returns the factorial of a given parameter
which must be an integer. This is an example of a recursive
function where there is a base case to be reached through
recursive calls.
)
return 1 if $n == 0; # This is the base case
return $n * factorial($n - 1); # This is the recursive call
}
Furthermore, the content of a comment can also be embedded inline:
sub add(Int $a, Int $b) #`( two (integer) arguments must be passed! ) {
return $a + $b;
}
POD Documentation
Regular Expressions
Regular Expressions
Regular expressions are a tool for specifying and searching for patterns in strings, among other things. Regular expressions were a popular and powerful part of Perl, although they gradually grew and expanded in successive versions of that language in a way that was difficult to follow and implement.
Perl's regular expressions became increasingly difficult to use and understand as more operators and metacharacters were added to the engine. It was decided that Raku would break from this syntax and rewrite regular expressions from the ground up to be more flexible and more integrated into the language. In Raku, they are known simply as regexes, and have become significantly more powerful.
Raku supports regexes in two ways: It has a legacy mode that supports Perl-style regular expressions, and it has a normal mode that supports the new style of regexes.
Basic Quantifiers
Regexes describe patterns in string data that can be searched for and acted upon. One of the most basic patterns to search for is a repetition pattern. To describe repetition, there are a number of quantifiers that can be used:
Op | What It Means | Example | Explanation |
---|---|---|---|
* | "zero or more of" | B A* | Accepts a string with a 'B' followed by any number of 'A' characters, even zero of them: B, BAAAAA, etc. |
+ | "one or more of" | B A+ | Accepts a string with a 'B' followed by at least one 'A': BAAA or BA, but not B. |
? | "one or zero" | B A? | Matches a 'B', optionally followed by one 'A': B or BA. |
** | "this many" | B A**5 | Matches a 'B' followed by exactly 5 'A' characters: BAAAAA. |
** | "this many" (range) | B A ** 2..5 | Matches a 'B' followed by at least two 'A' and no more than 5 'A': BAA, BAAA, BAAAA, BAAAAA. |
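The quantifiers in the table can be checked with a few smart matches. This is only a sketch: the ^ and $ anchors are added here so that the whole string must match, and so converts each result into a plain boolean.

```raku
say so "B"      ~~ /^ B A* $/;          # True:  zero 'A's is acceptable
say so "B"      ~~ /^ B A+ $/;          # False: at least one 'A' is required
say so "BA"     ~~ /^ B A? $/;          # True:  one optional 'A'
say so "BAAAAA" ~~ /^ B A ** 5 $/;      # True:  exactly five 'A's
say so "BA"     ~~ /^ B A ** 2..5 $/;   # False: fewer than two 'A's
say so "BAAA"   ~~ /^ B A ** 2..5 $/;   # True:  between two and five 'A's
```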
Grammars
Grammars
Regular expressions by themselves are useful but limited. It can be difficult to reuse regexes, difficult to group them into logical bunches, and very difficult to inherit regexes from one bunch to another. This is where grammars come in. Grammars are to regexes what classes are to data and code routines. Grammars allow regexes to act like normal first-class components of the programming language and make use of the cool features of the class system. Grammars can be inherited and overloaded like classes. In fact, the Raku grammar itself can be modified to add new features to the language on the fly. We will see examples of that later.
Rules, Tokens and Protos
Grammars are broken into components called rules, tokens and protos. Tokens are like the regexes we've already seen. Rules are like subroutines because they can call other rules or tokens. Protos are like default multisubs: they define a rule prototype that can be overridden.
Tokens
Tokens are regexes that don't backtrack, meaning that once a portion of the expression has been matched, this portion will not be altered even if it prevents a larger portion of the expression from matching. While this sacrifices some of the flexibility of regexes, it allows more complex parsers to be created efficiently.
token number {
\d+ ['.' \d+]?
}
Rules
Rules are ways to combine tokens and other rules together. Rules are all given names, and can refer to other rules or tokens in the same grammar using < > angle brackets. Like tokens, they do not backtrack, but whitespace within them is significant instead of being ignored: whitespace in the pattern matches whitespace in the input.
rule URL {
<protocol>'://'<address>
}
This rule matches a URL string where a protocol name such as "ftp" or "https" is followed by the literal symbol "://" and then a string representing an address. This rule depends on two sub-rules, <protocol> and <address>. These could be defined as either tokens or rules, so long as they are in the same grammar:
grammar URL {
rule TOP {
<protocol>'://'<address>
}
token protocol {
'http'|'https'|'ftp'|'file'
}
rule address {
<subdomain>'.'<domain>'.'<tld>
}
...
}
Protos
Protos define a type of rules or tokens. For example, we could define a proto-token <protocol> and then define several tokens representing different protocols. Within one of these tokens, we can refer to its name as <sym>:
grammar URL {
rule TOP {
<protocol>'://'<address>
}
proto token protocol {*}
token protocol:sym<http> {
<sym>
}
token protocol:sym<https> {
<sym>
}
token protocol:sym<ftp> {
<sym>
}
token protocol:sym<ftps> {
<sym>
}
...
}
This would be equivalent to saying:
token protocol {
<http> | <https> | <ftp> | <ftps>
}
token http {
http
}
...
but is more extensible, allowing types of protocol to be specified later. For example, if we wanted to define a new type of URL which also supported the "spdy" protocol, we could use:
grammar URL::WithSPDY is URL {
token protocol:sym<spdy> {
<sym>
}
}
Matching Grammars
Once we have a grammar like the one defined above, we can match it with the .parse method:
my Str $mystring = "http://www.wikibooks.org";
if URL.parse($mystring) {
#if it matches a URL, do something
}
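Since the grammar above elides some of its rules, here is a complete, deliberately simplified sketch that can be run as-is. The grammar name SimpleURL and its very loose address token are assumptions made for this example, not part of the URL grammar above:

```raku
grammar SimpleURL {
    token TOP      { <protocol> '://' <address> }
    token protocol { 'https' | 'http' | 'ftp' | 'file' }
    token address  { <[\w.]>+ }    # letters, digits, underscores and dots only
}

my $match = SimpleURL.parse("http://www.wikibooks.org");

say ~$match<protocol>;                  # http
say ~$match<address>;                   # www.wikibooks.org
say so SimpleURL.parse("not a url");    # False: .parse returns Nil on failure
```

Each named sub-rule becomes a key in the resulting match object, which is how $match<protocol> and $match<address> are retrieved.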
Match Objects
A match object is a special data type that represents the parse state of a grammar. The current match object is stored in the special variable $/.
Parser Actions
A grammar can be turned into an interactive parser by combining it with a class of parser actions. As the grammar matches certain rules, corresponding action methods can be called with the current match object.
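A minimal sketch of a grammar paired with an actions class follows. The names Addition and AdditionActions are invented for this example. Each action method has the same name as a token, receives the match object as $/, and attaches a computed value to it with make; the value is read back with .made:

```raku
grammar Addition {
    token TOP    { <number> '+' <number> }
    token number { \d+ }
}

class AdditionActions {
    # <number> matched twice in TOP, so $<number> is a list of two matches
    method TOP($/)    { make $<number>[0].made + $<number>[1].made }
    method number($/) { make +$/ }    # convert the matched text to a number
}

my $result = Addition.parse("2+40", actions => AdditionActions.new);
say $result.made;   # 42
```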
Operator Overloading
There are only five types of operators: infix, prefix, postfix, circumfix and postcircumfix.
You can declare a new operator like this:
sub postfix:<!>(Int $n!) { [*] 1..$n }
say 5!; # prints 120
The above, as you can see, declares a postfix operator '!' for calculating the factorial of an integer.
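The other operator categories are declared in the same way. As a sketch, here is an infix operator; the name avg is made up for this example:

```raku
# An infix operator sits between its two operands
sub infix:<avg>($a, $b) { ($a + $b) / 2 }

say 4 avg 10;   # 7
```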
Language Extensions
Junctions
Junctions
Junctions were originally implemented as part of a fancy Perl module to simplify some common operations. Let's say we have a complex condition where we need to test variable $x against one of several discrete values:
if ($x == 2 || $x == 4 || $x == 5 || $x == 7
|| $x == 42 || $x == 3.14)
This is a huge mess. What we want to do is basically create a list of values and ask "if $x is one of these values". Junctions allow this behavior, but also do so much more. Here's the same statement written as a junction:
if ($x == (2|4|5|7|42|3.14))
Types of Junctions
There are 4 basic types of junctions: any (logical OR of the components), all (logical AND of all components), one (logical XOR of all components), and none (logical NOR of the components).
List Operators
List operators construct a junction as a list:
my $options = any(1, 2, 3, 4); # Any of these is good
my $requirements = all(5, 6, 7, 8); # All or nothing
my $forbidden = none(9, 10, 11); # None of these
my $onlyone = one(12, 13, 4); # One and only one
Infix Operators
Another way to specify a junction is to use infix operators like we have already seen:
my $options = 1 | 2 | 3 | 4; # Any of these is good
my $requirements = 5 & 6 & 7 & 8; # All or nothing
my $onlyone = 12 ^ 13 ^ 4; # One and only one
Notice that there isn't an infix operator to create none() junctions.
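To see how each junction type behaves, the sketch below compares a value against all four kinds and uses so to collapse each result to a plain Boolean. The variable and the values are arbitrary choices for illustration.

```raku
my $x = 4;

say so $x == any(2, 4, 5);     # True:  at least one value equals 4
say so $x == all(4, 4.0);      # True:  every value equals 4
say so $x == one(4, 4.0, 7);   # False: two values equal 4, not exactly one
say so $x == none(1, 2, 3);    # True:  no value equals 4
say so $x == 2|4|5;            # True:  infix form of any(2, 4, 5)
```

Without so, the comparison itself returns a junction of Booleans; an if statement performs the same collapse automatically.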
Operations on Junctions
Matching Junctions
Junctions, like any other data type in Raku, can be matched against using the smart match operator ~~. The operator will automatically perform the correct matching algorithm depending on which type of junction is being matched.
my $junction = any(1, 2, 3, 4);
if $x ~~ $junction {
# execute block if $x is 1, 2, 3, or 4
}
all() Junctions
if 1 ~~ all(1, "1", 1.0) # Success, all of them are equivalent
if 2 ~~ all(2.0, "2", "foo") # Failure, the last one doesn't match
An all() junction will only match if all the elements in it match the object $x. If any of the elements do not match, the entire match fails.
one() Junctions
A one() junction will only match if exactly one of its elements matches. Any more or any fewer, and the entire match fails.
if 1 ~~ one(1.0, 5.7, "garbanzo!") # Success, only one match
if 1 ~~ one(1.0, 5.7, Int) # Failure, two elements match
any() Junctions
An any() junction matches so long as at least one element matches. It could be one or any other number but zero. The only way for an any() junction to fail is if none of the elements match.
if "foo" ~~ any(Str, 5, 2.18) # Success, "foo" is a Str
if "foo" ~~ any(2, Numeric, "bar") # Failure, none of these match
none() Junctions
none() junctions only succeed in a match if none of the elements in the junction match. In this way, it's equivalent to the inverse of the any() junction. If any() succeeds, none() fails. If any() fails, none() succeeds.
if $x ~~ none(1, "foo", 2.18)
if $x !~~ any(1, "foo", 2.18) # Same thing!
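The equivalence can be checked directly. This sketch tries one value that is in the list and one that is not, and compares a none() match with a negated any() match; the test values are arbitrary.

```raku
for 1, 3 -> $x {
    my $via-none = ($x ~~  none(1, "foo", 2.18)).so;
    my $via-any  = ($x !~~ any(1, "foo", 2.18)).so;
    say "$x: none() gives $via-none, negated any() gives $via-any";
}
# 1: none() gives False, negated any() gives False
# 3: none() gives True, negated any() gives True
```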
Lazy Lists and Feeds
Laziness
In most traditional computing systems, data objects are allocated to a set size and their values filled in to the spaces in memory. In C for instance, if we declare an array int a[10], the array a will be a fixed size with enough space to store exactly 10 integers. If we want to store 100 integers, we need to allocate a space for 100. If we want to store a million, we need to allocate an array that size.
Let's consider the problem where we want to compute a multiplication table, a two-dimensional array where the value of a given cell is the product of its two indices. Here's a simple loop that could generate this table with factors up to N:
int products[N][N];
int i, j;
for(i = 0; i < N; i++) {
for(j = 0; j < N; j++) {
products[i][j] = i * j;
}
}
Creating this table can take a while, because it has to perform all N² operations. Of course, once we've initialized the table it's very fast to look a value up in it. Another thing to consider here is that we end up calculating more values than we are ever going to use, so that's wasted effort.
Now, let's look at a function to do the same thing:
int product(int i, int j) {
return i * j;
}
This function doesn't require any startup time to initialize its values; however, it does require additional time with every call to compute the result. It's faster to start up than the array, but takes more time for each access than the array does.
Combining these two ideas gives us the lazy list.
Lazy Lists
Lazy lists are like arrays with a few major differences:
- They aren't necessarily declared with a predefined size. They can be any size, even infinitely long.
- They don't calculate their values until required, and only calculate what's needed when it's needed.
- Once their values have been calculated, they can be stored for fast lookup.
The opposite of lazy lists are eager lists. Eager lists calculate and store all their values immediately, like the arrays in C. Eager lists cannot be infinitely long because they need to store their values in memory, and computers don't have infinite memory.
Raku has both types of lists, and they are handled internally without intervention by the programmer. Lists which can be lazy are treated lazily. Lists which cannot be lazy are eagerly computed and stored. Lazy lists give a benefit in terms of storage space and calculation overhead, so Raku tries to use them by default. Raku also provides a number of constructs that can be used to support laziness and improve performance of list calculations.
Ranges
We've already seen ranges. Ranges are lazy: a range stores only its endpoints and produces its values on demand. Assigning a finite range to an array normally copies every value eagerly, so use the lazy prefix when the array itself should stay lazy:
my @lazylist = lazy 1..20000; # Doesn't calculate all 20,000 values
Because of their laziness, ranges can even be infinite:
my @lazylist = 1..Inf; # Infinite values!
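As a sketch of what laziness buys us, the following builds an infinite list of square numbers from an infinite range and then asks for only a few of them. The variable name is arbitrary.

```raku
# An infinite list of squares; nothing is computed yet.
my @squares = (1..Inf).map({ $_ ** 2 });

say @squares.is-lazy;   # True
say @squares[^5];       # (1 4 9 16 25)
say @squares[9];        # 100
```

Only the elements that are actually requested are calculated, and once calculated they stay stored in the array for fast lookup.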
Iterators
Iterators are special data items that move through a complex data object one element at a time. Think about the cursor in a text editor program; the cursor reads one keypress, inserts the character at its current position, and then moves to the next position to await the next key press. In this way, a long array of characters can be inserted one at a time without you, the editor, having to move the cursor manually.
In the same way, iterators in Raku traverse through arrays and hashes automatically, keeping track of your current location in the array automatically so you don't have to. We've already seen a use of iterators in our earlier discussion on loops, although we didn't call them "iterators" by name. Here are two loops that perform an identical function:
my @x = 1, 2, 3, 4, 5;
loop (my int $i = 0; $i < @x.elems; $i++) {
@x[$i].say;
}
for @x { # Same, but much shorter!
$_.say;
}
The first loop iterates through the @x array manually using the $i variable to keep track of the current location, and using the $i < @x.elems test to make sure we haven't reached the end. In the second loop, the for keyword creates an iterator for us. The iterator automatically keeps track of our current position in the array, automatically detects when we've reached the end of the array, and automatically loads each subsequent value into the $_ default variable. It's worth mentioning that we can make this shorter still by using a few Raku idioms:
.say for @x;
What are Iterators?
An iterator is any object that implements the Iterator role. We'll talk about roles a little bit later, but it will suffice for now to say that a role is a standard interface that other classes can participate in. Because they can be any class, so long as it has a standard interface, iterators can do anything we define them to do. Iterators can traverse arrays and hashes easily, but specially-defined types could also iterate over trees, graphs, heaps, files, and all sorts of other data structures and concepts.
If a data item has an associated iterator type, it can be accessed through the .iterator method. This method is called internally most of the time by structures like the for loop, but you can get access to it if you really need to.
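To make this concrete, here is a sketch of driving an iterator by hand, roughly the work a for loop does for us. It relies on the pull-one method of the Iterator role and the IterationEnd sentinel that marks exhaustion.

```raku
my @x = 1, 2, 3;
my $iterator = @x.iterator;

loop {
    # Bind rather than assign so the sentinel can be compared by identity.
    my $value := $iterator.pull-one;
    last if $value =:= IterationEnd;
    say $value;
}
# prints 1, 2 and 3 on separate lines
```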
Feeds
Feeds give a nice graphical way to show where data is moving in complex assignment statements. Feeds have two ends, a "blunt" end and a "sharp" end. The blunt end connects to a data source which is a list of values. The sharp end connects to a receiver that can take at least one element at a time. Feeds can be used to send data from right-to-left or left-to-right, depending on the direction that the feed is pointing.
my @x <== 1..5;
say @x; # 1, 2, 3, 4, 5
@x ==> @y ==> print; # 1, 2, 3, 4, 5
say @y; # 1, 2, 3, 4, 5
Layered feeds move data from one stage to the next. However, feeds with two points append onto the last item in the feed chain:
my @x = 1..5;
@x ==> map {$_ * 2} ==> @y;
say @x; # 1, 2, 3, 4, 5
say @y; # 2, 4, 6, 8, 10
@x ==>>
@y ==> @z;
say @z # 1, 2, 3, 4, 5, 2, 4, 6, 8, 10
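A feed chain reads naturally as a pipeline. In this sketch, which uses arbitrary data, the range on the blunt end flows through grep and map and finally lands in an array declared at the sharp end:

```raku
(1..10)
    ==> grep({ $_ %% 2 })   # keep the even numbers
    ==> map({ $_ * 2 })     # double each of them
    ==> my @result;         # store whatever arrives at the sharp end

say @result.join(', ');     # 4, 8, 12, 16, 20
```

The same computation written as nested calls would have to be read from the inside out; the feed form lists the stages in the order the data passes through them.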
Gather and Take
We can write our own kinds of iterators using the gather and take keywords. These two keywords act a lot like the pointy blocks that we've seen previously. However, unlike pointy blocks, gather/take can return values. Like pointy blocks, gather/take can be combined with loops to form custom iterators.
gather is used to define a special block. The code of that block can perform an arbitrary calculation and return a value with take. Here's an example:
my $x = gather {
take 5;
}
say $x; # 5
This isn't so useful by itself. However, we can now combine it with loops to return a long list of values:
my @x = gather for 1..5 {
take $_ * 2;
}
say @x; # 2, 4, 6, 8, 10
The take operator performs two actions: It takes a capture of the value it's passed and returns that as one of the results of the gather block, and it returns the value that it's been passed for storage. We can easily combine this behavior with a state variable to use values recursively.
my @x = gather for 1..5 {
state $a = $_;
$a = take $_ + $a;
}
say @x; # 2, 4, 7, 11, 16
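Since a gather block only runs as far as needed to produce the next value, it can describe an infinite sequence. The sketch below generates Fibonacci numbers this way; the lazy prefix keeps the array assignment from trying to consume the whole sequence.

```raku
my @fib = lazy gather {
    my ($a, $b) = 0, 1;
    loop {
        take $a;                    # hand the current number to the consumer
        ($a, $b) = $b, $a + $b;     # advance to the next pair
    }
}

say @fib[^10];   # (0 1 1 2 3 5 8 13 21 34)
```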
Meta Operators
Meta Operators
Operators do things to data. Meta operators do things to operators.
List Operators
Reduction Operators
Reduction operators act on a list and return a scalar value. They do this by applying the operator between every pair of adjacent elements in the list:
my @nums = 1..5;
my $sum = [+] @nums; # 1 + 2 + 3 + 4 + 5
The [ ] square brackets turn any operator that normally acts on scalars into a reduction operator to perform that same operation on a list. Reductions can also be used with relational operators:
my $x = [<] @y; # true if all elements of @y are in ascending order
my $z = [>] @y; # true if all elements of @y are in descending order
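A few reductions side by side, as a sketch with arbitrary data, show that the same bracket notation works for arithmetic, named, and relational operators:

```raku
my @nums = 1..5;

say [+] @nums;     # 15
say [*] @nums;     # 120
say [max] @nums;   # 5
say [<] @nums;     # True
say [>] @nums;     # False
```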
Hyper Operators
Reduction operators apply an operator to all the elements of an array and reduce it to a single scalar value. A hyper operator distributes the operation over all the elements in the list and returns a list of all results. Hyper operators are constructed using the special "french quotes" symbols: « and ». If your keyboard doesn't support these, you can use the ASCII symbols >> and << instead.
my @a = 1..5;
my @b = 6..10;
my @c = @a »*« @b;
# @c = 1*6, 2*7, 3*8, 4*9, 5*10
You can also use unary operators with hypers:
my @a = (2, 4, 6);
my @b = -« @a; # (-2, -4, -6)
A unary hyper operator always returns an array that is exactly the same size as the list it is given. An infix hyper operator behaves differently depending on the sizes of its operands.
@a »+« @b; # @a and @b MUST be the same size
@a «+« @b; # @a can be smaller, will upgrade
@a »+» @b; # @b can be smaller, will upgrade
@a «+» @b; # Either can be smaller, Raku will Do What You Mean
Pointing the hyper symbols in different directions affects how Raku treats the elements. On the sharp side, it extends the array to be as long as the one on the dull side. If both sides are sharp, it will extend whichever is smaller.
Hypers can also be used with assignment operators:
@x »+=« @y # Same as @x = @x »+« @y
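The following sketch, using arbitrary data, runs a few of these forms so the results can be compared. The join calls are only there to give predictable output.

```raku
my @a = 1, 2, 3;
my @b = 10, 20, 30;

say (@a »+« @b).join(', ');     # 11, 22, 33
say (@a »*» 2).join(', ');      # 2, 4, 6  (the single 2 is extended to fit @a)
say (-« @a).join(', ');         # -1, -2, -3
say (@a >>+<< @b).join(', ');   # 11, 22, 33 (ASCII spelling)
```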
Cross Operators
The cross is a capital X symbol. As an operator, the cross returns a list of all possible lists made by combining the elements of its operands:
my @a = 1, 2;
my @b = 3, 4;
my @c = @a X @b; # (1,3), (1,4), (2,3), (2,4)
The cross can also be used as a meta operator, applying the operator it's modifying against every possible combination of elements from each operand:
my @a = 1, 2;
my @b = 3, 4;
my @c = @a X+ @b; # 1+3, 1+4, 2+3, 2+4
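As a further sketch, the cross is handy for looping over every combination of two lists. The list contents here are arbitrary.

```raku
my @ranks = <ace king>;
my @suits = <hearts spades>;

# Each combination arrives as a two-element list, unpacked by the signature.
for @ranks X @suits -> ($rank, $suit) {
    say "$rank of $suit";
}
# ace of hearts
# ace of spades
# king of hearts
# king of spades

say (@ranks X~ @suits).elems;        # 4
say ((1, 2) X* (3, 4)).join(', ');   # 3, 4, 6, 8
```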
Roles and Inheritance
Inheritance
Basic class systems enable data and the code routines that operate on that data to be bundled together in a logical way. However, most class systems also offer more advanced features such as inheritance, which allows classes to build upon one another. Inheritance is the ability for classes to form logical hierarchies. Raku supports normal inheritance of classes and subclasses, but also supports special advanced features called mixins and roles. We are going to have to reserve some of the more difficult of these features for later chapters, but we will introduce them here.
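As a first taste, this sketch shows a class inheriting from another with is and composing a role with does. All of the names (Animal, Dog, Describable) are hypothetical and exist only for this illustration.

```raku
# A role bundles behavior that any class can compose.
role Describable {
    method describe() { "{self.name} says {self.speak}" }
}

class Animal {
    has $.name;
    method speak() { '...' }
}

# Dog inherits from Animal and composes the Describable role.
class Dog is Animal does Describable {
    method speak() { 'Woof' }
}

my $dog = Dog.new(name => 'Rex');
say $dog.describe;         # Rex says Woof
say $dog ~~ Animal;        # True
say $dog ~~ Describable;   # True
```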
Class Inheritance
Roles and does
Mixins
Parametric Roles
Advanced Subroutines
Advanced Subroutines
We talked about subroutines and code references earlier, but there is a lot more material to cover with these issues than we had room for in that chapter. Now we're going to cover some of the more advanced features of subroutines and code references.
Code objects
Blocks as Parameters
Currying
Signature Objects
Exceptions and Handlers
Exceptions
In the most basic sense, exceptions represent errors that are caused by your program. However, instead of crashing your program, exceptions have the opportunity to be caught and handled gracefully. Exceptions are said to be raised or thrown, and special code blocks called handlers can catch them.
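A minimal sketch of throwing and catching: die raises an exception, and a CATCH block placed inside the enclosing block handles it. The subroutine names are hypothetical.

```raku
sub safe-divide($numerator, $denominator) {
    die "cannot divide by zero" if $denominator == 0;
    return $numerator / $denominator;
}

sub report($numerator, $denominator) {
    my $outcome;
    {
        $outcome = "result: " ~ safe-divide($numerator, $denominator);
        # Handles any exception thrown inside this block.
        CATCH {
            default { $outcome = "caught: " ~ .message }
        }
    }
    return $outcome;
}

say report(6, 3);   # result: 2
say report(1, 0);   # caught: cannot divide by zero
```

Inside the handler the exception object is available as $_, which is why .message can be called without naming it.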
Exception Objects
Handlers and CATCH blocks
Property Blocks
Property Blocks
We've seen in the previous chapter the special CATCH block that is used to handle exceptions thrown from the block that the CATCH lives in. In addition to CATCH, there are a number of other special property blocks that can be used to modify the behavior of the block they live in.
Property blocks are lexical in nature: They modify the behavior of the block they are defined in, and they do not affect outer scopes.
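As a rough sketch of the idea, the loop below attaches NEXT and LAST blocks to its body and records when each one runs. The @log array is a hypothetical helper used only to make the order visible.

```raku
my @log;

for 1..3 -> $n {
    @log.push("item $n");
    NEXT { @log.push("next") }   # runs when an iteration of the loop completes
    LAST { @log.push("last") }   # runs once, when the loop finishes
}

.say for @log;
```

Both blocks are written inside the loop body, yet neither runs at the point where it appears; the names determine when they fire.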
NEXT and LAST Blocks
PRE and POST Blocks
KEEP and UNDO Blocks
Order of Execution
Files
Before we begin
Filehandle
Any interaction with files in Raku happens through a filehandle. [note 1] A filehandle is an internal name for an external file. The open function makes the association between the internal name and the external name, while the close function breaks that association. Some IO handles are available for your use without the need to create them: $*OUT and $*IN are connected to STDOUT and STDIN, the standard output and standard input streams, respectively. You will need to open every other filehandle on your own.
Paths
Always remember that any relative path to a file within the program is resolved with respect to the current working directory.
File operations: Text
Open a file for reading
To open a file, we need to create a filehandle to it. This simply means that we create a (scalar) variable which will refer to the file from now on. The two argument syntax is the most common way to call the open function: open PATHNAME, MODE, where PATHNAME is the external name of the file you want opened and MODE is the type of access. If successful, this returns an IO handle object which we can put into a scalar container:
my $filename = "path/to/data.txt";
my $fh = open $filename, :r;
The :r opens the file in read-only mode. For brevity, you can omit the :r, since it is the default mode; and, of course, the PATHNAME string can be passed directly, instead of passing it via a $filename variable.
Once we have a filehandle, we can read and perform other actions on the file.
New Way
For simple cases, use slurp and spurt instead, which read or write a whole file in one call with no filehandle to manage:
"file.txt".IO.spurt: "file contents here";
"file.txt".IO.slurp.say; # «file contents here»
Read an opened file
The most general approach to reading a file is to establish a connection to it with the open function, consume the data, and finish by calling close on the file handle that open returned.
my $fileName;
my $fileHandle;
$fileName = "path/to/data.txt";
$fileHandle = open $fileName, :r;
# Read the file contents by the desired means.
$fileHandle.close;
To read the file data into the program immediately and completely, the slurp method can be used. Commonly, the resulting string is stored in a variable for further manipulation.
my $fileName;
my $fileHandle;
my $fileContents;
$fileName = "path/to/data.txt";
$fileHandle = open $fileName, :r;
# Read the complete file contents into a string.
$fileContents = $fileHandle.slurp;
$fileHandle.close;
If reading the whole file at once is undesirable, whether because the task is line-oriented or because of memory considerations, the file can be read line by line with the lines method.
my $fileName;
my $fileHandle;
$fileName = "path/to/data.txt";
$fileHandle = open $fileName, :r;
# Iterate the file line by line, each line stored in "$currentLine".
for $fileHandle.lines -> $currentLine {
# Utilize the "$currentLine" variable which holds a string.
}
$fileHandle.close;
Using a file handle lets you reuse the resource, but it also obliges you to manage it. If that advantage does not justify the cost, the functions mentioned above can work directly on a file name given as a string.
The complete file contents can be read by specifying the file name as a string and invoking IO.slurp upon it.
my $fileName;
my $fileContents;
$fileName = "path/to/data.txt";
$fileContents = $fileName.IO.slurp;
If this object-oriented approach does not befit one's style, the equivalent procedural variant is:
my $fileName;
my $fileContents;
$fileName = "path/to/data.txt";
$fileContents = slurp $fileName;
In the same way, line-by-line processing without a file handle looks like this:
my $fileName;
$fileName = "path/to/data.txt";
for $fileName.IO.lines -> $currentLine {
# Utilize the "$currentLine" variable, which holds a string.
}
Remember that the file name can be used directly without storing it in a variable, which shortens the code above even further. Reading the complete file contents can then be reduced to:
my $fileContents = "path/to/data.txt".IO.slurp;
or
my $fileContents = slurp "path/to/data.txt";
To access a file at a finer level of granularity, Raku provides the readchars method for retrieving a specified number of characters. It accepts the number of characters to consume and returns a string containing the data read.
my $fileName;
my $fileHandle;
my $charactersFromFile;
$fileName = "path/to/data.txt";
$fileHandle = open $fileName, :r;
# Read eight characters from the file into a string variable.
$charactersFromFile = $fileHandle.readchars(8);
# Perform some action with the "$charactersFromFile" variable.
$fileHandle.close;
Write to a file
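Writing follows the same open, use, close pattern as reading. The sketch below assumes the :w mode of open, which opens a file for writing, and uses a hypothetical scratch file in the system temporary directory so that it cleans up after itself.

```raku
# Hypothetical scratch file; any writable path would do.
my $path = $*TMPDIR.add("raku-book-example.txt");

my $fh = open $path, :w;      # open for writing, creating the file if needed
$fh.say("first line");        # say appends a newline, print does not
$fh.say("second line");
$fh.close;

my @lines = $path.lines;      # read the lines back to check the result
say @lines.join(', ');        # first line, second line

$path.unlink;                 # remove the scratch file again
```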
Close a file
Notes
- ↑ generalized to IO handles for interaction with other IO objects like streams, sockets, etc.
External resources
Migrating from Perl 5
Inline::Perl5
Inline::Perl5 is a module for executing Perl 5 code and accessing Perl 5 modules from Raku.
See https://github.com/niner/Inline-Perl5/