Perl on Java? An Introduction to the Sleep Language Blog



    Why Bother with Sleep?
    Built-In Regular Expressions
       Regex Differences
    The Sleep Philosophy
    The Unified I/O API
       Working with Binary Data
    Built-In Data Structures
    Instantiating and Talking to Objects
    What Sleep Brings to the Java Platform
    Sleep Resources
       Real Perl on Java Resources

    The most popular scripting languages are available in some form for the Java platform. We have Jacl for TCL, Jython for Python, and JRuby for Ruby. One offering is missing from this bunch: what Java offering exists for the Perl hackers of the world?

    In this article, I would like to introduce Sleep. Sleep is a Java-based scripting language heavily inspired by Perl. Sleep is what I wrote when nothing like Perl was available to build a scriptable Java IRC client. I would like to introduce Sleep in terms of its similarities to Perl and what it brings to the Java platform. Sleep can be downloaded at the Sleep Scripting Project home page. A simple "Hello World" script in Sleep looks like this:

     println("Hello World");

    To run the above script, copy and paste it into a file and type:

     java -jar sleep.jar

    Why Bother with Sleep?

    There are a zillion scripting languages on the Java platform, each with its own strengths and weaknesses. With all of these scripting language choices, why does Java need a Perl offering? Perl is an incredibly powerful language for text and data processing. Perl excels at taking input, extracting stuff from it, chewing it up several hundred times, and finally, outputting the mess however the programmer would like. Perl is often referred to as the duct tape of the internet. This is due to its many uses as a "glue"-type language.

    Sleep is primarily a glue language and was designed from the ground up to be embedded in Java applications. This is accomplished via a two-pronged approach. Several Sleep APIs allow extension as well as embedding of the language. By extending Sleep, developers can practically design a domain-specific language for their applications. The second prong and the primary focus of this article is Sleep's similarity to Perl. Sleep steals, borrows, and begs features from Perl. One goal of Sleep is to bring Perl's incredible text/data processing and ease of use to the Java platform.

    Some of Perl's best features include built-in regular expression support, powerful data structures, and an easy-to-use I/O API. Ironically enough, Sleep provides built-in regular expressions, handy built-in data structures, and an easy-to-use unified I/O API. Sleep also has a few extra tricks up its sleeve. For example, Sleep can instantiate and talk to Java objects.

    Built-In Regular Expressions

    Regular expressions are a mini-language for describing patterns. Strings can be compared against regex pattern strings to check for a match. If there is a match, certain parts of the matching string can be extracted as described in the pattern string. Perl provides a bunch of operators for dealing with regular expressions. While Sleep does not support everything that Perl does, you'll find that the basics are there, and the operators are a little less esoteric.

    I will use a phone number pattern as an example. This example will be simple, since a full-on regex tutorial is beyond the scope of this article. A phone number in the U.S. might consist of a three-digit area code wrapped in parentheses, followed by a space, followed by three digits, followed by a dash, followed by four digits. Or in short:

    (ddd) ddd-dddd

    Assume the ds above mean "digit." A few changes are needed to build a regular expression string that represents the above pattern. The string \d represents a digit in regular expression speak. Parentheses are special characters, so to specify them literally they have to be escaped with a \ character, as in \( and \). A phone number regex pattern as described above is:

    $pattern = '\(\d\d\d\) \d\d\d-\d\d\d\d';

    The above pattern is great for matching "legal" phone numbers, as described. However it does no good for extracting information from any matching text.

    Remember that I mentioned parentheses are special? They are used in the pattern to identify which matching text to extract. To designate substrings of the pattern for extraction later, simply surround these substrings with parentheses. The extracted substrings will then be available later through the matched() function.

    The following snippet compares a string to the phone number pattern above. Upon finding a match, it extracts the area code and local phone number pieces.

    if ("(654) 555-1212" ismatch '\((\d\d\d)\) (\d\d\d-\d\d\d\d)') {
       ($areaCode, $phoneNumber) = matched();
    }

    In the example above, the scalar $areaCode will have the value 654 and the scalar $phoneNumber will be equal to 555-1212. Pretty cool, eh?

    The function matched() is tied to the last use of the ismatch predicate. The matched() function returns an array of substrings extracted from the matching text. Just like Perl, Sleep allows individual elements of an array to be assigned to scalars using the syntax above.
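
    For readers coming from the Java side, the same extraction can be sketched with java.util.regex. This is only a comparison, not a description of how Sleep implements ismatch. The class name PhoneMatch and the method extract are made up for illustration, and the sketch assumes the predicate tests the whole string against the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PhoneMatch {
    // Same pattern as the Sleep example; backslashes are doubled
    // because this is a Java string literal.
    private static final Pattern PHONE =
            Pattern.compile("\\((\\d\\d\\d)\\) (\\d\\d\\d-\\d\\d\\d\\d)");

    /** Returns {areaCode, localNumber}, or null when the text does not match. */
    static String[] extract(String text) {
        Matcher m = PHONE.matcher(text);
        if (!m.matches()) {   // assumption: the whole string must match
            return null;
        }
        // Group 1 and group 2 correspond to the two parenthesized substrings.
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = extract("(654) 555-1212");
        System.out.println("area code: " + parts[0] + ", number: " + parts[1]);
    }
}
```

    The explicit Pattern and Matcher objects are what the single ismatch line and the matched() call replace.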

    Regex Differences

    For the sake of comparison, I present the phone number extraction example written in Perl.

    if ('(654) 555-1212' =~ /\((\d\d\d)\) (\d\d\d-\d\d\d\d)/) {
       ($areaCode, $phoneNumber) = ($1, $2);
    }

    When I was a Perl beginner, I tripped over the =~ operator. To me, it looked like an assignment being used as a predicate. Really the above is saying "bind the pattern on the right to the text on the left, and make the extracted substrings available as $1, $2, etc." I used my artistic license to make the Sleep syntax a little simpler. Hopefully Perl hackers can forgive me.

    In Perl, regular expression patterns are enclosed in forward slashes. This is more of a convention than a requirement; however, it is a convention nearly everyone follows. In Sleep, regular expression patterns are specified as double or single quoted strings.

    Many of the Perl regular-expression-related goodies are available in Sleep. The Perl functions split and join are both available. The ever-popular s/pattern/string/g regex operation is available in Sleep as the &replace function.
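
    As a point of comparison, here is a rough Java sketch of the three operations just mentioned, built on standard String methods. The class RegexHelpers, its method names, and its parameter orders are illustrative assumptions; they are not the signatures of Sleep's &split, &join, or &replace.

```java
import java.util.Arrays;
import java.util.List;

class RegexHelpers {
    /** Like Perl's split(/regex/, $text): break a string apart on a regex delimiter. */
    static List<String> split(String regex, String text) {
        return Arrays.asList(text.split(regex));
    }

    /** Like Perl's join($separator, @parts): glue the pieces back together. */
    static String join(String separator, List<String> parts) {
        return String.join(separator, parts);
    }

    /** Like Perl's s/regex/replacement/g: replace every match in the text. */
    static String replaceAll(String text, String regex, String replacement) {
        return text.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        List<String> fields = split(",\\s*", "Raphael, Serge,Andreas");
        System.out.println(join(" | ", fields));
        System.out.println(replaceAll("555-1212", "\\d", "#"));
    }
}
```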

    Regexes are a good place to talk about the differences between Java philosophy, Perl philosophy, and the Sleep philosophy.

    The Sleep Philosophy

    The designers of Java designed and built a fairly simple core language. They decided that most of the features would be included in the Java class libraries. Hence, most of the complexity of Java lies within the API and not in the language itself. Perl is kind of the opposite: Perl hackers believe that many of the most commonly used features should be built directly into the language. This way, a lot of built-in syntax (AKA "sugar") can follow, to make common stuff easier to do. While this does result in a lot of power, it has also yielded a language that can be complicated.

    Sleep tries to find a middle ground between these two philosophies. Many things are built into the Sleep language. However, oftentimes an API is relied upon to provide functionality. The decision of where to place functionality is based on the "least complexity" rule. If a function is easier to understand and use as a built-in construct, then Sleep will include it as a built-in construct. Regex matching functionality is built into the Sleep language, as seen in the ismatch operator illustrated above. Other regex functionality is provided with built-in functions. Sleep aims to provide built-in power and flexibility while maintaining a core language that is accessible to novice scripters.

    Unified I/O API

    Sleep has an API for providing uniform access to sockets, processes, and files. These things are all data sources, as far as Sleep is concerned. Sleep also provides functionality similar to Perl's for manipulating byte data.

    The following Perl example opens a file and reads the file contents into an array:

     open(HANDLE, "myfile.txt");
     @data = <HANDLE>;
     close(HANDLE);

    Perl provides a special syntax for dealing with file handles. This special syntax is called the diamond operator. The diamond operator reads either a single line or the entire contents of a HANDLE, depending on to what context the data is assigned. If the data is assigned to a $scalar, then a single line is read. The entire HANDLE is read when the data is assigned to, or used as, an @array.

    Sleep does not use assignment context to define how functions behave. Come to think of it, Sleep does not have special syntax for dealing with I/O handles, either. All Sleep I/O handles are object scalars that reference an I/O stream.

    The following Sleep example opens a file, reads all of the text from the file into an array, and closes the file. This example is the Sleep equivalent of the Perl example above.

     $handle = openf("myfile.txt");
     @data = readAll($handle);
     closef($handle);

    Sleep uses &openf to open a file stream. Other functions exist for opening sockets, creating a listening socket, and executing processes. These functions all return a scalar variable that references an I/O stream. Any of these I/O sources can be read from, and written to, using the same built-in functions.
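
    The idea of one set of read functions for every kind of source can be sketched in Java as well, where files, sockets, and processes can all be wrapped in a Reader. The class UnifiedRead and its methods are hypothetical names for this illustration and say nothing about how Sleep's I/O layer is built.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class UnifiedRead {
    /** Reads every line from any character source and closes it afterwards. */
    static List<String> readAll(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    /** Convenience wrapper: open a file (UTF-8) and reuse the generic reader. */
    static List<String> readAllFromFile(String path) {
        try {
            return readAll(Files.newBufferedReader(Paths.get(path)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

    A socket or process stream could be passed to readAll the same way by wrapping it in an InputStreamReader, which is the kind of uniformity the paragraph above describes.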

    A very cool I/O concept in Sleep is callback reading. Sleep can invoke a specified function or closure whenever data is read from a source.

    The following is an example of a simple echo server written in Sleep:

    sub handleData {
       println($1, "Right back at ya: $2");
    }

    $server = listen(3000);
    read($server, &handleData);

    To connect to the echo server, do this:

     [raffi@beardsley ~]$ telnet 3000
     Trying
     Connected to localhost.
     Escape character is '^]'.
     Hello World
     Right back at ya: Hello World

    A quick explanation is in order. The echo server is created on port 3000 using the &listen function. The &read function is used to tie the function &handleData to the socket stream $server. Internally, Sleep creates a thread that references &handleData and $server. When data is read from $server, the function &handleData is invoked with $server and the read data as arguments. This process continues until $server closes.
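
    The mechanism described above, a background thread that hands each chunk of data to a callback until the source closes, can be sketched in Java as follows. This is a guess at the general shape, not Sleep's actual implementation. CallbackReader, pump, and readInBackground are invented names, and the callback here receives only the line, whereas Sleep also passes the handle.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

class CallbackReader {
    /** Reads lines until the source closes, handing each one to the callback. */
    static void pump(Reader source, Consumer<String> handler) {
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                handler.accept(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs pump() on its own thread so the caller is not blocked. */
    static Thread readInBackground(Reader source, Consumer<String> handler) {
        Thread worker = new Thread(() -> pump(source, handler), "callback-reader");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }
}
```

    Calling readInBackground with a reader wrapped around a socket's input stream would give roughly the behavior of read($server, &handleData).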

    Working with Binary Data

    Sleep provides functionality for dealing with binary data. This functionality is similar in many ways to what Perl offers. In Sleep, an array of byte data is stored as a string. Each character in the string maps to one byte. Sleep provides readb($handle, size) and writeb($handle, "data") to read and write byte strings from an I/O source. For example, to copy a file in Sleep, you'd write:

    # [original file] [new file]
    $in = openf(@ARGV[0]);
    $data = readb($in, lof(@ARGV[0]));
    $out = openf(">" . @ARGV[1]);
    writeb($out, $data);
    closef($in);
    closef($out);

    Sleep also provides Perl-like pack('format', ...) and unpack('format', "data") for storing and retrieving Sleep data to or from a byte data string.

    The following example illustrates Sleep's binary data extraction abilities:

    The wtmp file is used to record information when a user logs in or out on a UNIX system. The information in wtmp is stored as binary data. The Mac OS X wtmp manpage specifies the following C structure for a wtmp record:

     #define _PATH_WTMP "/var/log/wtmp"

     #define UT_NAMESIZE 8
     #define UT_LINESIZE 8
     #define UT_HOSTSIZE 16

     struct utmp {
        char ut_line[UT_LINESIZE];
        char ut_name[UT_NAMESIZE];
        char ut_host[UT_HOSTSIZE];
        time_t ut_time;
     };

    Each wtmp entry consists of 36 bytes of data: three strings and one integer packed together. The following example extracts the contents of the wtmp file on Mac OS X using Sleep's &bread function, which reads a record from a handle and unpacks it according to a format string:

    $handle = openf("/var/log/wtmp");

    while (1) {
       ($tty, $uid, $host, $ctime) = bread($handle, 'Z8 Z8 Z16 I');

       if (-eof $handle) {
          break;
       }

       $date = formatDate($ctime * 1000, "EEE, d MMM yyyy HH:mm:ss Z");
       println("$[10]tty $[10]uid $[20]host $date");
    }

    A shortened snapshot of my wtmp file is below.

    ttyp3 raffi Sun, 5 Jun 2005 09:58:57 +0200
    ttype raffi Sun, 5 Jun 2005 09:59:13 +0200

    Editor's note: the output has been reformatted to better suit this article's web page layout.
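
    To show how much work the format string 'Z8 Z8 Z16 I' saves, here is a sketch of decoding one 36-byte record by hand in Java. It assumes the layout of the C structure above with a 4-byte time field, and it assumes big-endian byte order and an unsigned reading of that field. Neither assumption comes from the manpage excerpt. The class WtmpRecord is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class WtmpRecord {
    static final int SIZE = 8 + 8 + 16 + 4;   // 36 bytes per entry

    final String line;   // ut_line, the terminal
    final String name;   // ut_name, the user
    final String host;   // ut_host
    final long time;     // ut_time, seconds since the epoch

    private WtmpRecord(String line, String name, String host, long time) {
        this.line = line;
        this.name = name;
        this.host = host;
        this.time = time;
    }

    static WtmpRecord parse(byte[] raw) {
        if (raw.length < SIZE) {
            throw new IllegalArgumentException("need " + SIZE + " bytes, got " + raw.length);
        }
        // Assumption: big-endian integers.
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.BIG_ENDIAN);
        String line = field(buf, 8);
        String name = field(buf, 8);
        String host = field(buf, 16);
        long time = buf.getInt() & 0xFFFFFFFFL;   // assumption: treat as unsigned
        return new WtmpRecord(line, name, host, time);
    }

    /** Reads a fixed-width field and cuts it at the first NUL byte. */
    private static String field(ByteBuffer buf, int width) {
        byte[] bytes = new byte[width];
        buf.get(bytes);
        int end = 0;
        while (end < width && bytes[end] != 0) {
            end++;
        }
        return new String(bytes, 0, end, StandardCharsets.ISO_8859_1);
    }
}
```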

    Built-In Data Structures

    Sleep provides two data structures built into the language: the hashtable (usually referred to as just a "hash") and the ever-versatile array. Normal Sleep arrays are versatile in that they can act as lists, stacks, or arrays.

    To use an array and start playing with it:

    @array = array("Raphael", "Serge", "Andreas", "Fuzzy Puppy");
    println("The last element is: " . pop(@array));
    push(@array, "Mr. Anderson");

    foreach $element (@array) {
       println("An element: $element");
    }

    In Sleep, multi-dimensional arrays are easy to create. Just start indexing new dimensions:

    for ($x = 1; $x <= 10; $x++) {
       for ($y = 1; $y <= 10; $y++) {
          @multiplication[$x - 1][$y - 1] = $x * $y;
       }
    }

    # print out our multiplication table
    foreach $row (@multiplication) {
       foreach $column ($row) {
          print(" $[3]column |");
       }
       println();
    }

    As a side note, the string " $[3]column |" is called a parsed literal in Sleep. A parsed literal is a double-quoted string. Scalar variable names are evaluated inside of parsed literals. Within parsed literals, some formatting is available. For example, $[n]var means "append spaces to $var until the string length is n characters". A negative value indicates that spaces should be prepended instead. Single-quoted strings in Sleep are simple no-frills string literals.
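
    The padding rule just described maps closely onto Java's format widths, which may help Java programmers remember which sign does what. The helper below is a sketch with a made-up name (PaddedFields.pad). How Sleep treats values longer than n is not stated here, so the sketch simply leaves them unchanged, which is what String.format does.

```java
class PaddedFields {
    /**
     * Mirrors "$[n]var" as described in the text: a positive width appends
     * spaces, a negative width prepends them.
     */
    static String pad(String value, int width) {
        if (width == 0) {
            return value;
        }
        // "%-5s" left-justifies (spaces appended); "%5s" right-justifies (spaces prepended).
        String format = width > 0 ? "%-" + width + "s" : "%" + (-width) + "s";
        return String.format(format, value);
    }

    public static void main(String[] args) {
        System.out.println("[" + pad("42", 5) + "]");
        System.out.println("[" + pad("42", -5) + "]");
    }
}
```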

    Hashes are another Sleep data type. The Hash interface in Sleep is backed by nothing more than a java.util.HashMap. All scalar keys are converted to strings prior to storage in a hash:

    %dictionary[1] = "The number one";
    %dictionary["1"] = "The string one";
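
    A consequence of converting keys to strings is that the two assignments above land on the same entry. The following Java sketch imitates that rule with a thin wrapper around HashMap. StringKeyedHash is an illustrative name, and the use of String.valueOf for the conversion is an assumption about the general idea rather than Sleep's exact conversion.

```java
import java.util.HashMap;
import java.util.Map;

class StringKeyedHash {
    private final Map<String, Object> table = new HashMap<>();

    /** Every key is turned into a string before it touches the map. */
    void put(Object key, Object value) {
        table.put(String.valueOf(key), value);
    }

    Object get(Object key) {
        return table.get(String.valueOf(key));
    }

    int size() {
        return table.size();
    }

    public static void main(String[] args) {
        StringKeyedHash dictionary = new StringKeyedHash();
        dictionary.put(1, "The number one");
        dictionary.put("1", "The string one");   // same key as above once stringified
        System.out.println(dictionary.size() + " entry: " + dictionary.get(1));
    }
}
```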

    In terms of multi-dimensional data structures, hashes and arrays can be mixed and matched. This is because [] is a special operator in Sleep. It attempts to index data from whatever expression to which it is applied. If it is applied to an expression returning an array, it will index array data. If it is applied to a hash, it will index hash data. This means that technically, any expression that returns array or hash data can be indexed. For example:

    $temp = array("a", "b", "c");
    println("Second element is: " . $temp[1]);


    println("Second element is: " . array("a", "b", "c")[1]);

    Arrays in Sleep are always prefixed with an @, hashes with a %, and scalars with a $. Sleep uses the symbol at the beginning of the variable name to determine which type of data structure to create when referencing a variable that does not exist:

    # do we want a hash or an array in this case?
    $data[0] = "Hello World";

    In the example above, Sleep will silently ignore the attempt to assign a value to a $scalar that doesn't reference a hash or an array.

    The symbols also apply in multi-dimensional data structures. If the symbol at the beginning of the variable name is a %, then any time an index is applied to a nonexistent dimension, a new hash will be created.

    The nice thing about this system is that hashes and arrays are just like any other variables, with no need for special treatment. For example, to pass an array to a subroutine:

    sub multiplyAll {
       foreach $temp ($1) {
          # assigning to $temp is the same as
          # assigning to the individual element
          $temp = $temp * $2;
       }
       return $1;
    }

    @data = array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    printAll(multiplyAll(@data, 3));

    Above, @data was passed to &multiplyAll with no special handling. Perl would normally "flatten" @data and pass each element as a separate parameter unless \ were used to turn @data into a reference. In Sleep, @data is passed as a reference automatically.
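
    Java programmers will recognize this behavior, since Java collections are also handed to methods by reference. A minimal sketch of the same routine in Java, with the class name MultiplyAll chosen only for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MultiplyAll {
    /** Multiplies every element in place and returns the very same list. */
    static List<Integer> multiplyAll(List<Integer> values, int factor) {
        for (int i = 0; i < values.size(); i++) {
            values.set(i, values.get(i) * factor);
        }
        return values;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        multiplyAll(data, 3);
        System.out.println(data);   // the caller's list was modified
    }
}
```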

    Instantiating and Talking to Objects

    Another fun thing in Sleep is the ability to instantiate and talk to Java objects. This ability was added to the language to allow access to APIs that I was too lazy to create.

    The following is a simple web browser created in Sleep:

    #
    # Simple Sleep Based Graphical Web Browser
    # Java's HTML renderer isn't very good, therefore
    # this browser isn't either
    #

    import java.awt.*;
    import javax.swing.*;
    import*;

    $window = [new JFrame:"Sleep Based Web Browser"];
    [$window setDefaultCloseOperation: [JFrame EXIT_ON_CLOSE]];
    [$window setSize:480, 320];

    sub go_to_site {
       [$display setPage: [$address getText] ];

       if (checkError($check)) {
          println("Error: $check");
       }
    }

    sub link_clicked {
       if ([$1 getEventType] eq "ACTIVATED") {
          [$display setPage: [$1 getURL]];
          [$address setText: [$1 getURL]];
       }
    }

    $address = [new JTextField:20];
    [$address addActionListener:&go_to_site];

    $button = [new JButton:"Go!"];
    [$button addActionListener:&go_to_site];

    $panel = [new JPanel];
    [$panel add: $address, [FlowLayout CENTER]];
    [$panel add: $button, [FlowLayout RIGHT]];

    [[$window getContentPane] setLayout: [new BorderLayout]];
    [[$window getContentPane] add: $panel, [BorderLayout NORTH]];

    $display = [new JEditorPane: "text/html", ""];
    [$display addHyperlinkListener:&link_clicked];
    [$display setEditable:0];

    [[$window getContentPane] add: [new JScrollPane: $display], [BorderLayout CENTER]];

    [$window show];

    The calls to the Java API are easy to spot: they are all surrounded by square brackets. Sleep's syntax for using Java objects is similar to that of Objective-C:

     [reference message: argument, argument, ...]

    Each call has a reference, a message, and then a colon, followed by a comma-separated list of arguments.

    The web browser example demonstrates that the Sleep object syntax allows one to get stuff done with the Java API. However, working with Swing this way is a little cumbersome. A Sleep/Swing module is currently in the works to help make UI scripting in Sleep more practical.

    In Sleep, you can't create new Java classes. However, interfaces can be faked by passing a subroutine or a closure to an argument expecting a specific interface. Closures and subroutines are actually one and the same. The topic of closures is covered next.


    In Sleep, functions are considered "first class" types. This means that a scripter can define a new function, assign it to a variable, pass it as a value, invoke a function referenced by a variable, and so on.

    To define a named closure:

     sub foo { println("bar"); }

    The named closure can be invoked as follows:


    OK, that wasn't too exciting. To get technical, a named closure can also be invoked with:


    That was confusing. It will make sense in just a minute. To assign a named closure to a variable and invoke the closure from the variable:

     $var = &foo; [$var];

    Consequently, this could have been written as:

     $var = { println("bar"); }; [$var];


     [{ println("bar"); }];

    Sleep closures are called with the same syntax used with objects. Arguments in closures are available starting at $1 on up to $n for the nth argument. The message parameter (defined before the colon) is passed to closures as $0. This allows you to create some cool interfaces in Sleep. For example:

    sub BuildStack {
       return {
          this('@stack');

          if ($0 eq "push") {
             push(@stack, $1);
          }

          if ($0 eq "pop") {
             return pop(@stack);
          }
       };
    }

    # construct a new stack closure...
    $mystack = BuildStack();

    # push the string "test" onto the stack
    [$mystack push: "test"];

    # pop the top value off of the stack and print it
    println("Top value is: " . [$mystack pop]);

    The example above defines a new subroutine called &BuildStack. The subroutine returns a new closure. Inside of the closure, the variable @stack is put into the this scope. Inside of the this scope, @stack is visible only inside of the owning closure instance. A second call to &BuildStack() would return a new closure instance with its own @stack variable.
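
    For comparison, the same message-dispatching closure can be approximated in Java with a lambda that captures a private stack. This is only an analogy: StackClosure and buildStack are invented names, the message and argument are passed as ordinary parameters instead of $0 and $1, and throwing on an unknown message is a choice made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BiFunction;

class StackClosure {
    /** Each call returns a new closure with its own private stack. */
    static BiFunction<String, Object, Object> buildStack() {
        // Captured by the lambda below; no other closure instance can see it.
        Deque<Object> stack = new ArrayDeque<>();

        return (message, argument) -> {
            if ("push".equals(message)) {
                stack.push(argument);   // note: ArrayDeque rejects null values
                return null;
            }
            if ("pop".equals(message)) {
                return stack.pop();
            }
            throw new IllegalArgumentException("unknown message: " + message);
        };
    }

    public static void main(String[] args) {
        BiFunction<String, Object, Object> mystack = buildStack();
        mystack.apply("push", "test");
        System.out.println("Top value is: " + mystack.apply("pop", null));
    }
}
```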

    Closures can also be passed to Java objects expecting an interface. Any Java method call against the closure interface will result in the entire closure being executed. The message parameter ($0) will contain the name of the method Java is trying to invoke. Closures are the closest thing to objects Sleep has.
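
    One way to get this behavior in plain Java is a dynamic proxy, where every method call on an interface is routed to a single handler along with the method's name. The sketch below shows that technique. Whether Sleep uses java.lang.reflect.Proxy internally is not stated in this article, and ClosureProxy and asInterface are hypothetical names.

```java
import java.lang.reflect.Proxy;
import java.util.function.BiFunction;

class ClosureProxy {
    /**
     * Wraps a closure so it can stand in for any interface. Every call on the
     * returned object runs the closure with the method name (the "$0" of the
     * text above) and the argument array, which is null for no-arg methods.
     */
    @SuppressWarnings("unchecked")
    static <T> T asInterface(Class<T> type, BiFunction<String, Object[], Object> closure) {
        return (T) Proxy.newProxyInstance(
                ClosureProxy.class.getClassLoader(),
                new Class<?>[] { type },
                (proxy, method, args) -> closure.apply(method.getName(), args));
    }

    public static void main(String[] args) {
        Runnable task = asInterface(Runnable.class, (name, arguments) -> {
            System.out.println("Java asked for: " + name);
            return null;
        });
        task.run();
    }
}
```

    Because every method goes through the same closure, including toString and hashCode, a real implementation would need to return values of the right type for each method name.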

    What Sleep Brings to the Java Platform

    Sleep is a language for Perl hackers who also live in the world of Java. Sleep brings the power of Perl to the Java platform. Not only can Sleep extract data, parse it, rework it, and spit it back out, but Sleep can extract data from, and send it back to, Java objects. Sleep is also highly extensible, allowing new functions, operators, and constructs to be added to the language. Sleep's extensibility allows it to fit into new problem domains or be embedded into Java applications. Combine the extensibility to fit into new problem areas with powerful language features, and the possibilities are endless.

    Sleep Resources

    Real Perl on Java Resources