object_regex

Regex searching on Arbitrary Objects

Every Rubyist loves Regular Expressions. Even their most basic features (concatenation, choice, repetition) are quite expressive and can be implemented to run in linear time. The more advanced features, such as backreferences and syntactic sugar like [ab]{3,5}, make many common patterns simple to match.

Regular expressions in Ruby, however, work only on sequences of characters. Since regular expressions have their basis (ignoring features like backreferences) in Finite Automata, we should be able to change the alphabet they operate on to match sequences of arbitrary objects. object_regex does just that.

Usage

The first intended use for object_regex was matching sequences of Ruby tokens to extract comments. Multiple lines of comments end up in the token stream as a comment token, followed by pairs of an optional space token and a comment token. In regular expressions, if 'c' is a comment and 's' is a space, we'd match this pattern like this:

    /c(s?c)*/

However, we want to match on token objects, not 'c' and 's': we need to retain token information such as the text contained and the location of the token. What we'd like to match is more like this:

    /comment (space? comment)*/

and be able to run that regex against an array of tokens.

object_regex makes one guarantee: if every object in an array (really any random-access sequence that responds to #[]) has a reg_desc method returning a string, then you can match against that sequence, using the values returned by reg_desc as the symbols in your pattern.
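As a minimal sketch of that contract, here is a token type that would qualify. The Token struct and its fields (type, text, line) are hypothetical names chosen for illustration and are not part of object_regex; the only thing the contract described above requires is the reg_desc method.

    # Hypothetical token type: only reg_desc is required by the contract.
    Token = Struct.new(:type, :text, :line) do
      # The string the pattern refers to, e.g. "comment" or "space".
      def reg_desc
        type.to_s
      end
    end

    tokens = [
      Token.new(:comment, '# first line', 1),
      Token.new(:space,   '  ',           2),
      Token.new(:comment, '# second line', 2)
    ]

    # The pattern "alphabet" is whatever reg_desc returns for each object.
    descriptions = tokens.map(&:reg_desc)

Each object keeps its own data (text, location), while the pattern only ever sees the short description string.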

This exact implementation is used in Wool for this very purpose.

Implementation

object_regex piggybacks on Ruby's built-in Regexp implementation: it translates the pattern you provide into a string-based search pattern. It can handle arbitrarily large patterns, including those with thousands of possible types. Its implementation is derived from a similar, though more restricted, implementation in Ruby 1.9's Ripper library.
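The general idea of mapping an object pattern onto a string pattern can be sketched as follows. This is an illustration of the technique under stated assumptions, not the gem's actual code: SketchObjectRegex, its match method, and the Token struct are all hypothetical names. The sketch assigns each description one character from Unicode's Private Use Area, so it tops out at a few thousand distinct types.

    # Hypothetical token type used only for this sketch.
    Token = Struct.new(:type, :text, :line) do
      def reg_desc
        type.to_s
      end
    end

    # Illustrative only: not the real object_regex implementation.
    class SketchObjectRegex
      NAME    = /[A-Za-z_]\w*/
      UNKNOWN = "\u0000" # stands in for descriptions the pattern never mentions

      def initialize(pattern)
        # Give every distinct name in the pattern its own single character.
        @char_for = {}
        pattern.scan(NAME).uniq.each_with_index do |name, i|
          @char_for[name] = [0xE000 + i].pack('U')
        end
        # Replace names first, then drop whitespace, so that adjacent
        # names such as "space comment" are never fused together.
        source = pattern.gsub(NAME) { |name| @char_for[name] }.gsub(/\s+/, '')
        @regexp = Regexp.new(source)
      end

      # Returns the matched slice of objects, or nil if nothing matches.
      def match(objects)
        encoded = objects.map { |o| @char_for.fetch(o.reg_desc, UNKNOWN) }.join
        m = @regexp.match(encoded)
        # One character per object, so character offsets are object indices.
        m && objects[m.begin(0)...m.end(0)]
      end
    end

    tokens = [
      Token.new(:keyword, 'def',      1),
      Token.new(:comment, '# first',  2),
      Token.new(:space,   '  ',       3),
      Token.new(:comment, '# second', 3),
      Token.new(:keyword, 'end',      4)
    ]
    comments = SketchObjectRegex.new('comment (space? comment)*').match(tokens)

Because every object becomes exactly one character, the offsets reported by the string match translate directly back into indices of the original sequence, and the matched objects keep all of their own data.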

Requirements

Installation

    gem install object_regex