Perl Syntax Notes
Snippets of Perl Syntax to Insert Where Needed
Quote Constructs
Perl provides the following quotation constructs:
- q//: Single quote, no interpolation
- qq//: Double quote
- qw//: Quote words, no interpolation
- qx//: Command execution
- m//: Pattern match
- s///: Pattern substitution
- tr///: Transliterate, no interpolation
- qr//: Quote regular expression
The '/' in all these constructs can be replaced with any non-alphanumeric, non-whitespace character. If you use a single-quote as the delimiter, this will suppress interpolation. The backtick operator returns the output as one string in scalar context, and as a list of lines in list context. The numeric status of the command is saved in $?.
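As a minimal sketch of the delimiter and interpolation rules above (the variable names and the address are illustrative only):

```perl
use strict;
use warnings;

my $name = 'world';

# q and qq accept any non-alphanumeric, non-whitespace delimiter.
my $single = q{Hello, $name};        # no interpolation
my $double = qq{Hello, $name};       # $name is interpolated
my @words  = qw(alpha beta gamma);   # list of three words

# With a single-quote delimiter the pattern is not interpolated,
# so the '@' below is taken literally and needs no escaping.
my $address = 'user@example.com';
my $matched = $address =~ m'user@example' ? 1 : 0;

print "$single\n$double\n@words\n$matched\n";
```

Brace delimiters are convenient when the quoted text itself contains slashes or quote characters.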
The pattern match binding operator '=~' has a higher precedence than the assignment operator, allowing matches to be captured and assigned on one line:
use strict;
use warnings;
my @processes = qx/ps aux/;
my $user = "root";
for my $string (@processes) {
if (my ($pid, $command) = $string =~ m/^$user\s+(\d+).*\d+.\d+\s(.+)/) {
print "$command ($pid) running as $user\n";
}
}
Because the assignment occurs in list context, the variables are assigned the matches in order of occurrence.
Here Documents
A here-document quotes a multi-line block of text as a single string. The text starts on the line after the statement and runs until a line containing only the defined terminator. The terminator is specified by a string immediately following two less-than signs ('<<'). The block of text will interpolate according to whether the terminator is single-quoted, double-quoted or backtick-quoted. If unquoted, the block will behave as though the terminator was double-quoted.
use strict;
use warnings;
use Getopt::Long;
my ($commands, $help);
GetOptions(help => \$help, commands => \$commands);
&usage() unless $commands;
print <<`DONE`;
df -h
uptime
uname -a
DONE
sub usage() {
print <<"EOF";
Description
\tThe System Administration Time-Saver
Synopsis
\t$0 -c
Options
\t$0 -h, --help\tPrints this message
\t$0 -c, --commands\tRuns the commands
EOF
exit;
}
The resulting string can be fed to print to provide multiline documentation, or used anywhere that a string might be used, for example as the target of a pattern match in list context:
use strict;
use warnings;
my @fruit = <<ENDFRUIT =~ m/(\S.*\S)/g;
passion fruit
avocado pear
banana
kiwi fruit
paw paw
ENDFRUIT
print "Found ", scalar(@fruit), " fruits\n";
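The effect of quoting the terminator can be seen in a short sketch. The variable $user and its value are assumptions made for illustration:

```perl
use strict;
use warnings;

my $user = 'alice';   # hypothetical value

# Double-quoted (or unquoted) terminator: variables interpolate.
my $interpolated = <<"END";
Hello, $user
END

# Single-quoted terminator: the text is taken literally.
my $literal = <<'END';
Hello, $user
END

# Split the text to work with the individual lines.
my @lines = split /\n/, <<"END";
first line
second line
END

print $interpolated, $literal, scalar(@lines), " lines\n";
```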
Readline Operator
The readline operator '<>' only assigns to $_ if it is the only condition in a while loop. At the end of the file '<>' returns undef, so check for defined rather than truth if assigning the line read. In scalar context it returns the next line in the file; in list context it returns all remaining lines, one per list element. Use '$.' to check the number of lines read so far from the most recently read filehandle.
use strict;
use warnings;
use FindBin qw($Bin); # assigns path of script to $Bin
for my $file ( glob("$Bin/*.pl") ) {
next unless open(INPUT, $file);
print "$.] $_" while <INPUT>;
$. = 0; # reset line number to zero for next file
}
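The advice about testing for defined can be sketched as follows. An in-memory filehandle stands in for a real file here, and the sample data is made up for the example:

```perl
use strict;
use warnings;

# The last line is "0" with no trailing newline, which is false
# in boolean context but still a defined value.
my $data = "first\nsecond\n0";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @seen;
# Explicit defined test, so the final "0" line is still processed.
while (defined(my $line = <$fh>)) {
    chomp $line;
    push @seen, "$.: $line";   # $. is the current line number
}
close($fh);

print "$_\n" for @seen;
```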
Loop Control
A while loop executes its block while its condition returns TRUE. Both while and until loops can also have an optional continue block, which is evaluated after each iteration. last, next and redo allow you to alter this behaviour, and can be used in conjunction with labels:
use strict;
use warnings;
my $input = '/etc/httpd/conf/httpd.conf';
open(INPUT, $input) or die "Unable to read from $input:$!";
my ($line_count, $config_count) = (0,0);
LINE: while (<INPUT>) {
next LINE if /^#/;
next LINE if /^$/;
$config_count++;
} continue {
$line_count++;
}
print "Found $config_count/$line_count of configuration in $input\n";
The 'last' operator immediately exits the current loop block and the continue block is not executed. The 'next' operator skips the rest of the current iteration and starts the next one after executing the continue block. The 'redo' operator restarts the loop without evaluating the conditional again and without executing the continue block.
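A small sketch makes the interaction with the continue block visible by logging each step. The @log array and the $retried flag exist only for this illustration:

```perl
use strict;
use warnings;

my @log;
my $retried = 0;

for my $n (1 .. 5) {
    if ($n == 2 && !$retried) {
        $retried = 1;
        push @log, "redo $n";
        redo;                  # restart this iteration, skipping continue
    }
    if ($n == 3) {
        push @log, "next $n";
        next;                  # the continue block still runs
    }
    last if $n == 5;           # leave the loop, skipping continue
    push @log, "body $n";
} continue {
    push @log, "continue $n";
}

print join(", ", @log), "\n";
```

In the log, the redo and last entries are never followed by a continue entry for the same pass, while the next entry is.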
The 'last' operator and a labelled bare block can be used to emulate a case/switch construct:
use strict;
use warnings;
my $Monday = q/Can't trust that day/;
my $Friday = q/TGI Friday!/;
my $Saturday = q/Hello World/;
my $Sunday = q/sleep 60/;
my $Weekday = q/Put the kettle on/;
print ${ thought_for_the_day() }, "\n";
sub thought_for_the_day {
    my $day = (localtime(time))[6];    # 0 = Sunday ... 6 = Saturday
    my $thought = \$Weekday;           # default case
    SWITCH: {
        if ($day == 0) { $thought = \$Sunday;   last SWITCH; }
        if ($day == 6) { $thought = \$Saturday; last SWITCH; }
        if ($day == 5) { $thought = \$Friday;   last SWITCH; }
        if ($day == 1) { $thought = \$Monday;   last SWITCH; }
    }
    return $thought;
}
Arrays
The last index of an array is held in the special variable $#arrayname. The last element of an array can therefore be accessed as $arrayname[$#arrayname]. However it is also possible to write $arrayname[-1]. An array can be truncated or extended by specifying a value for $#arrayname.
An array slice is specified as @arrayname['list literal']. An array slice will behave as a list literal in scalar context: returning its last element rather than a count of elements.
use strict;
use warnings;
use Test::More qw/no_plan/;
my @arr = qw(one two three four five six seven eight nine);
ok(scalar(@arr) == 9, "Number of elements in array");
ok($#arr == 8, "Index of last element");
ok($arr[$#arr] eq 'nine', "Last element in array via \$\#");
ok($arr[-1] eq 'nine', "Last element in array via -1");
my @evens = @arr[1,3,5,7];
my $count = @evens;
ok($count == 4, 'Array returns number of elements in scalar context');
my $last = @arr[1,3,5,7];
ok($last eq $evens[-1], 'Slice acts as a list literal in scalar context, returning last element');
my ($first, undef, undef, $second) = @arr[1..4];
ok($first eq $arr[1], 'Elements assigned in left-right order in list context');
ok($second eq $arr[4], 'Unwanted elements can be assigned to undef');
$#arr = 3;
ok(scalar(@arr) == 4, "Truncate an array by assigning to \$\#");
Hashes
Hashes are prefixed with a '%' sign and initialised with a list:
%sounds = (cat => 'meow', dog => 'bark', pig => 'oink', goldfish => 'undef');
Values are accessed via keys:
print "The $animal says " . $sounds{$animal};
Use keys, values or each to iterate a hash:
foreach $key (keys %hash)
foreach $value (values %hash)
while (($nextkey, $nextval) = each %hash)
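The three iteration styles can be put together in a runnable sketch. It reuses a cut-down %sounds hash, and the @report, @noises and $pairs names are introduced only for this example:

```perl
use strict;
use warnings;

my %sounds = (cat => 'meow', dog => 'bark', pig => 'oink');

# keys: sorted here, because hash order is not predictable.
my @report;
foreach my $animal (sort keys %sounds) {
    push @report, "The $animal says $sounds{$animal}";
}

# values: just the values, sorted for a stable result.
my @noises = sort values %sounds;

# each: one key/value pair per call until the hash is exhausted.
my $pairs = 0;
while (my ($animal, $sound) = each %sounds) {
    $pairs++;
}

print "$_\n" for @report;
print "Noises: @noises ($pairs pairs)\n";
```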
Subroutines
Subroutines are called by specifying their name, followed by a list of arguments. The list of arguments gets flattened into one long list, stored in a localised @_ array. Copying the arguments to a list of my variables means the routine works as call-by-value: changes to the lexically scoped variables in the subroutine do not affect the values of the variables used in the original call. Using @_ directly in the subroutine makes the subroutine act as call-by-reference, and changes to the elements of @_ will affect the original variables outside the subroutine:
use strict;
use warnings;
my $key = 'cf8baa5112';
print "Before uppercase: $key\n";
&uppercase($key);
print "After uppercase: $key\n";
my $return_key = &lowercase($key);
print "Key after lowercase: $key\n";
print "Return value from lowercase: $return_key\n";
sub uppercase() {
for (@_) { tr/a-z/A-Z/ }
}
sub lowercase() {
my $key = shift;
$key =~ tr/A-Z/a-z/;
return $key;
}
The list of arguments can be empty, or omitted altogether, if the subroutine is defined before the call. The parentheses around the argument list can also be omitted if the subroutine has been defined before the call:
@values = myfunc $first, $last;
Calling a subroutine with the '&' prefix but no argument list means it gets the current @_ argument list instead. This also happens when calling the subroutine with goto:
goto &mysub;
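The difference between these call forms can be sketched with a few hypothetical wrappers (show_args, implicit, explicit and jump are names made up for this example):

```perl
use strict;
use warnings;

sub show_args { return join(',', @_) }

# '&show_args;' with no parentheses reuses the caller's current @_.
sub implicit { return &show_args; }

# '&show_args();' with empty parentheses passes an empty list.
sub explicit { return &show_args(); }

# 'goto &show_args;' replaces the running subroutine with show_args,
# again handing over the current @_.
sub jump { goto &show_args; }

print implicit(1, 2), "\n";
print "[", explicit(1, 2), "]\n";
print jump(3, 4), "\n";
```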
Named arguments can be provided to a subroutine using the '=>' notation when listing the arguments:
checkServer(name => 'server01', date => '-1', cols => 3);
sub checkServer {
%defaults = (name => 'pdc01', date => '-5');
#defaults get overwritten by argument list
%args = (%defaults, @_);
#another way to specify a default
$args{cols} = 5 unless exists $args{cols};
..
}
The calling context of a subroutine can be determined using the wantarray function:
use Carp;
sub variousReturns {
# Do some processing here
# true and defined
return @someArray if wantarray;
# false but defined
return $someScalar if defined(wantarray);
#false and not defined
carp "subroutine &variousReturns was called in void context" unless(defined(wantarray));
}
Carp::carp reports the location of the call. warn would report the location within the subroutine where the error occurred.
When called with a frame number, for example caller(0), the caller function returns a list of values indicating:
- the package from which the current subroutine was called
- the name of the file containing the code that called the current subroutine
- the line in that file from which the current subroutine was called
- the name of the subroutine
- whether the subroutine was passed arguments
- the context in which the subroutine was called
- the actual source code that called the subroutine, but only if the call was part of an eval TEXT statement
- whether the subroutine was called as part of a require or use statement
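As a sketch of how the first few values in this list might be used, the hypothetical context_report subroutine below unpacks caller(0) and reports the calling context:

```perl
use strict;
use warnings;

# Report who called us and in which context, using caller(0).
sub context_report {
    my ($package, $filename, $line, $subname, $hasargs, $wantarray) = caller(0);
    my $context = $wantarray          ? 'list'
                : defined($wantarray) ? 'scalar'
                :                       'void';
    return "$subname called from package $package in $context context";
}

my @list   = context_report();   # list context
my $scalar = context_report();   # scalar context
print "$list[0]\n$scalar\n";
```

The sixth value mirrors what wantarray would return inside the subroutine itself.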
Symbolic references
A symbolic reference is a character string containing the name of a variable or subroutine in a package's symbol table.
package Utils;
sub formatter { print "Subroutine received '$_[0]' as first param\n"}
@formatter = qw! Jamon Queso Pan !;
$formatter = [qw# Ham Cheese Bread#];
package main;
$symref = "Utils::formatter";
print "Dereference symref as array: [@{$symref}] \n";
@{$symref}[0] = ${$symref}->[0];
print "Place first element of scalar ref in array ref: [", join(' ', @{$symref}), "]\n";
&{$symref}(${$symref}->[1]);
Hard References
Hard references point directly to the referent, rather than simply acting as a name for another variable.
To create a reference use the unary '\' operator. References can be made to scalars, arrays, hashes, subroutines, filehandles, patterns, typeglobs and other references. Use 'ref($my_ref)' to determine the type of a reference. Use the '->' operator to dereference a reference:
my $s = "A scalar value";
my @a = qw(a list of vaules);
my %h = (first => 1, second => 2, third => 3);
my ($slr_ref, $arr_ref, $hsh_ref, $sub_ref) =
\($s, @a, %h, &print_results);
# same as (\$s, \@a, \%h, \&print_results);
$sub_ref->($slr_ref, $arr_ref, $hsh_ref, $sub_ref,);
print "Slice an array reference[", join(":", @$arr_ref[1,2]), "]\n";
print "Using arrow operator to access \$arr_ref->[1]: [$arr_ref->[1]]\n";
print "Using arrow notation to access \$hsh_ref->{first}: [$hsh_ref->{first}]\n";
sub print_results() {
my @args = @_;
foreach my $arg (@args) {
if (ref($arg) eq 'SCALAR') { print "${$arg}\n" }
elsif (ref($arg) eq 'ARRAY') { print "@{$arg}\n" }
elsif (ref($arg) eq 'HASH') {
foreach my $key (keys %{$arg}) { print "$key => ${$arg}{$key}\n" }
}
elsif (ref($arg) eq 'CODE') { print "Got a subroutine reference\n" }
else { print $arg, "\n" }
}
}
References can also be created using anonymous array, hash and subroutine literals:
my ($slr_ref, $arr_ref, $hsh_ref) = (
\"Second scalar value",
[qw(another list of values)],
{fourth => 4, fifth => 5, sixth => 6});
my $sub_ref = sub {
my @args = @_;
foreach my $arg (@args) {
if (ref($arg) eq 'SCALAR') { print "${$arg}\n" }
elsif (ref($arg) eq 'ARRAY') { print "@{$arg}\n" }
elsif (ref($arg) eq 'HASH') {
foreach my $key (keys %{$arg}) { print "$key => ${$arg}{$key}\n" }
}
elsif (ref($arg) eq 'CODE') { print "Got a subroutine reference\n" }
else { print $arg, "\n" }
}
};
$sub_ref->($slr_ref, $arr_ref, $hsh_ref, $sub_ref);
References to filehandles or directory handles can be created by referencing the typeglob of the same name, for example \*STDOUT for the STDOUT filehandle.
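A reference to a filehandle's typeglob can be passed around like any other reference. The sketch below assumes a hypothetical log_to subroutine and a LOG filehandle opened on an in-memory buffer, so that it stays self-contained:

```perl
use strict;
use warnings;

# Write a line to whichever handle reference is passed in.
sub log_to {
    my ($fh, $message) = @_;
    print {$fh} "$message\n";
}

# An in-memory file stands in for a real log file.
my $buffer = '';
open(LOG, '>', \$buffer) or die "Cannot open in-memory file: $!";

my $log_ref = \*LOG;    # reference to the typeglob holding LOG
log_to($log_ref, 'written through a typeglob reference');
log_to(\*STDOUT, 'ref($log_ref) is ' . ref($log_ref));
close(LOG);
```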
Dereferencing can be achieved by prepending the scalar value with the appropriate 'funny character' or using the arrow ('->') operator.
use strict;
use warnings;
my ($slr_ref, $arr_ref, $hsh_ref, $sub_ref) = (
\"Second scalar value",
[qw(another list of values)],
{fourth => 4, fifth => 5, sixth => 6},
sub { "Subroutine called" },
);
print "Deref slr_ref: $$slr_ref\n";
print "Deref arr_ref: @$arr_ref\n";
print "Deref hsh_ref: ", (keys %$hsh_ref), "\n";
print "Deref sub_ref: ", &$sub_ref, "\n";
print 'Using deref to access element 2 for $arr_ref: [',
"$$arr_ref[1]", "]\n";
print 'Using arrow operator to access element 2 of $arr_ref: [',
"$arr_ref->[1]", "]\n";
print q/Using deref to access value of 'fourth' key in $hsh_ref: [/,
"$$hsh_ref{fourth}", "]\n";
print q/Using arrow notation to access value of 'fourth' key in $hsh_ref: [/,
"$hsh_ref->{fourth}", "]\n";
print "Slice an array reference[", join(":", @$arr_ref[1,2]), "]\n";
my $matrix = [ [1, 4, 7, 9], [2, 5, 8, 11], [3, 6, 9, 12], ];
print "Element {2,3} of matrix is [${ $$matrix[1] }[2]]\n";
print "Element {2,3} of matrix is [$matrix->[1]->[2]]\n";
print "Element {2,3} of matrix is [$matrix->[1][2]]\n";
my $people = {
John => { age => 60, height => 200, birthday => '1999-04-01'},
Jill => { height => 180, sex => 'female', age => ''},
Fred => { sex => 'male', telephone => '00234283434', height => 60},
};
print "John is ${ $$people{John} }{age} years old\n";
print "John is $people->{John}->{age} years old\n";
print "John is $people->{John}{age} years old\n";
Namespaces
All variables exist in a namespace which is either global (defined in a package) or lexical (declared with 'my').
A symbol table or package is a global hash containing entries for global variables. Inside the symbol table, each key/value pair matches a variable name to its value. To use a package variable outside the package namespace, prefix the variable with the name of the package and a double colon, or use a 'package' declaration to switch back to the package's namespace. A package declaration changes the namespace until another package declaration is encountered, or until the end of the current enclosing block.
The special token '__PACKAGE__' contains the name of the current package.
Symbol tables can be used to store constants:
use strict;
use warnings;
my $r = 4;
package Scalar;
*PI = \(4 * atan2(1, 1));
package Constant;
use constant PI => (4 * atan2(1, 1));
package Sub;
*PI = sub () { 4 * atan2(1, 1) };
package Borrowed;
*PI = \$Scalar::PI;
print "Circumference is ", 2 * $Scalar::PI * $r, "\n";
print "Circumference is ", 2 * Constant::PI * $r, "\n";
print "Circumference is ", 2 * &Sub::PI * $r, "\n";
print "Circumference is ", 2 * $Borrowed::PI * $r, "\n";
A typeglob allows you to refer to all the symbol table entries for a particular package identifier. A typeglob is prefixed with a '*' to indicate that it refers to all other types. Assigning *bar to *foo makes foo an alias for bar, and typeglob assignment forms the basis for module import/export operations. Individual entries in a symbol table can be assigned to another typeglob using a reference as the rvalue.
The '*foo{THING}' syntax returns a reference to an individual symbol table entry, where THING is one of 'SCALAR', 'ARRAY', 'HASH', 'CODE', 'GLOB', 'IO' or 'FILEHANDLE':
package main;
$var = "The rain in Spain";
@var = qw( The rain in Spain);
%var = qw(The rain in Spain);
*newvar = *var;
print "The scalar in var is: [", $var, "]\n";
print "The scalar in newvar is: [", $newvar, "]\n";
print(join ":", @newvar, "\n");
*var = \"The sun in England";
print "The scalar in var is: [", $var, "]\n";
print "The scalar in newvar is: [", $newvar, "]\n";
$typeglob_ref = \*var;
print "The scalar in typeglob_ref is: [", ${*$typeglob_ref}, "]\n";
$slr_ref = *$typeglob_ref{SCALAR};
print "The scalar in slr_ref is: [", $$slr_ref, "]\n";
Typeglobs do not apply to lexical variables, since they do not have a named symbol table. Lexical variables must be defined with 'my', do not belong to any package and are only directly accessible within the defining code block. Lexical variables cease to exist outside the defining block unless some other part of the program holds a reference to them. Lexical variables are not destroyed until their reference count is 0.
Local Variables
The local function takes package variables and replaces their value until the end of the enclosing block. Subroutines called from within the block will see the temporary value of the variable.
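A minimal sketch of this behaviour, assuming a package variable $level and two hypothetical subroutines:

```perl
use strict;
use warnings;

our $level = 'outer';           # package variable

sub current_level { return $level }

sub with_temporary_level {
    local $level = 'inner';     # temporary value until the block ends
    return current_level();     # called subroutines see the temporary value
}

my $during = with_temporary_level();
my $after  = current_level();   # the original value has been restored
print "during: $during, after: $after\n";
```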
Modules
Modules are stored in '.pm' files whose name is the name of the module. Inside the module, the first line should be a package declaration that specifies the name of the module, which will be the name of the file without the '.pm' extension. The final line should be '1;', which means the module will return a TRUE value on execution. A traditional module will use Exporter to export symbols to programs using the module. The @EXPORT array defines the symbols that are exported by default. The @EXPORT_OK array specifies symbols that can be exported on request.
package Underground;
require Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw(get_to_work get_home);
# every symbol named in a tag must also appear in @EXPORT or @EXPORT_OK
our @EXPORT_OK = qw(hurry_to_work hurry_home $home $work);
our %EXPORT_TAGS = (
    get_there => [qw(get_to_work hurry_to_work $work)],
    get_home => [qw(get_home hurry_home $home)],
);
our $home = 'Upton Park';
our $work = 'Euston Square';
sub get_to_work() {
print <<Got_to_work;
Get to $home on foot
wait for the Hammersmith & City train
there's always seats, so relax for 35 minutes
Kings Cross, time to wake up
Mind the gap - you've arrived at $work
Got_to_work
}
sub get_home() {
print <<SOTS;
have a smoke, get the Hammersmith & City line
wait for Kings Cross and take a seat
nearly there
Home, home again
I like to be here when I can
SOTS
}
sub hurry_to_work() {
print <<PHEW;
grab a bus, get the first train
change at Mile End for Liverpool Street,
take anything that gets you to $work,
relax, no one noticed you arrive late
PHEW
}
sub hurry_home() {
print <<hi_honey_im_home;
getout, geton, getoff, getin
hi_honey_im_home
}
1;
To use the module, load it with 'use'.
The 'use' statement loads the module at compile time, so errors will be identified before runtime. A 'require' statement loads the module at runtime. 'use' will import the symbols from the @EXPORT array if you haven't specified a symbol list in the use statement. Otherwise it will import the symbol list you've requested. If a requested symbol is not in either @EXPORT or @EXPORT_OK, a compile-time exception is raised for each failed symbol. A module can provide pre-defined lists of symbols for export using the %EXPORT_TAGS hash, whose keys act as aliases for the lists of symbols they index.
use strict;
use warnings;
use FindBin qw($Bin);
use lib $Bin;
use Underground;
print "Ba Da, Ba Da Da Da\n";
Underground::hurry_to_work;
print "\n\nAnother day\nanother \$\n\n";
get_home;
print "\nTuesday\n";
get_to_work;
Underground::hurry_home;
Export tag aliases are used by preceding their keys with a colon in the use statement's symbol list. To exclude a symbol, precede its name or tag with an exclamation mark:
use Underground; #import @EXPORT
use Underground(); #import nothing
use Underground qw($work $home); #import two scalars
use Underground qw(:get_there); #import tag
use Underground qw(!get_to_work); #don't import symbol
use Underground qw(:DEFAULT); #import @EXPORT
use Underground qw(/work/); #import anything matching
use Underground qw(/^\$/); #import all scalars
use Underground qw(:get_there !get_to_work); #import tag but not function
use Underground qw(:get_there :get_home); #import two tags
Importing a module's symbols makes those symbols available in your module's namespace. This is sometimes described as 'polluting' your namespace. Built-in functions can be overridden by names imported from a module. To override a built-in function, a code reference should be assigned to a typeglob with the same name as the function. The assignment must occur in some other package, which can be achieved with 'use subs'.
package Naughty;
require Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw(chdir );
use subs qw(chdir ); # assigns to *chdir
sub AUTOLOAD() {
my $command = $AUTOLOAD;
*$AUTOLOAD = sub {print "$command called\n"};
goto &$AUTOLOAD;
}
1;
Overridden functions can still be accessed via the CORE package:
use strict;
use warnings;
use FindBin qw($Bin);
use lib $Bin;
use Naughty;
my $command = "ls -l";
chdir;
print qx/$command/;
CORE::chdir;
print qx/$command/;
Modules that override built-in functions should place the re-defined functions in @EXPORT_OK not @EXPORT. This way, the overridden functions have to be explicitly requested.
The @EXPORT_FAIL array specifies symbols that should not be exported. If a program attempts to import a symbol listed in @EXPORT_FAIL, then the export_fail method is called. The default export_fail returns the list of symbols unchanged, and an exception is raised for each symbol returned. Alternatively, you can write your own export_fail:
sub export_fail {
my $class = shift;
for my $symbol (@_) {
print "$symbol not available\n";
}
return @_;
}
If your custom export_fail routine returns an empty list, no errors are recorded and all symbols are exported.
Autoload
If a non-existent package subroutine is called, perl will try to call the package's AUTOLOAD subroutine before issuing an error. AUTOLOAD is called with the argument list passed to the missing subroutine and the $AUTOLOAD variable is assigned the fully-qualified name of the missing subroutine.
sub AUTOLOAD {
my $function_name = our $AUTOLOAD;
*$AUTOLOAD = sub {print "$function_name called with: (@_)\n"};
goto &$AUTOLOAD;
}
blurch("One", "Two", "Three");
blarch(qw/eenie meenie minee/);
blorch(qx/ls -l/);
blerch(q/$bill @at %cornbeef &ecap/);
An interesting but useless application of AUTOLOAD is to run shell commands from a perl script:
# Do shell programming from perl
sub AUTOLOAD {
$AUTOLOAD =~ s/.*:://;
return `$AUTOLOAD @_`;
}
$network = ls('-al');
print "$network\n";
my $result = rm('README.TXT');
print "$result\n";
$cdplayer = play('~/CDtemp/*.wav');
Closures
A closure is a subroutine that refers to one or more lexical variables declared outside the subroutine's block. Lexical variables are scoped within their defining block: subroutines are not. Therefore lexical variables can be made accessible outside their defining block, via a subroutine defined within the block. Closures can thus be used to implement encapsulation.
{
my $locked;
sub lock { return 0 if $locked; $locked = 1};
sub unlock { $locked = 0}
}
lock or die "Resource already in use";
# do some stuff
unlock();
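Anonymous subroutines form closures too, and each call to the enclosing subroutine captures a fresh copy of the lexical. The make_counter helper below is a hypothetical example of this, not part of any library:

```perl
use strict;
use warnings;

# Each call creates a new lexical $count that only the two
# returned subroutines can reach.
sub make_counter {
    my $count = shift // 0;
    return {
        increment => sub { return ++$count },
        value     => sub { return $count },
    };
}

my $first  = make_counter();
my $second = make_counter(10);
$first->{increment}->() for 1 .. 3;
$second->{increment}->();
print "first: ", $first->{value}->(), ", second: ", $second->{value}->(), "\n";
```

The two counters do not interfere with each other, because each pair of subroutines closes over its own $count.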
