Shooting errors with Raygun.io

Written on August 28 2013 at 23:05 ∷ permalink

I’ve been playing with Raygun.io over the last day or so. It’s a tool, like Honeybadger, Airbrake or Errbit, for managing exceptions from other web or mobile applications. It will email you when exceptions occur, collapse duplicate errors together, and let a team comment on and resolve exceptions from a nicely designed web interface.

I’ve come to the conclusion that integrating with something like this is basically a minimum requirement for software these days. Previously we might’ve suggested an ‘iterative’ approach of emailing exceptions directly from the application before later using one of these services, but I no longer see the benefit in postponing a better solution when it’s far simpler to work with one of these services than it is to set up email reliably.

It seems pretty trivial to integrate with a Rails application – just run a generator to create the initializer complete with API key. However, I had to do a bit more work to hook it into a Rack application (which is what Vanilla is). In my config.ru:

# Setup Raygun and add it to the middleware
require 'raygun'
Raygun.setup do |config|
  config.api_key = ENV["RAYGUN_API_KEY"]
end
use Raygun::RackExceptionInterceptor

# Raygun will re-raise the exception, so catch it with something else
use Rack::ShowExceptions
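For the curious, here’s a rough sketch of the general pattern an interceptor like this follows. It is not the Raygun gem’s actual code: ExceptionReporter and its reports array are names I’ve made up for illustration, and a real service would send the data over the network rather than keep it in memory.

```ruby
# A hypothetical stand-in for an exception-reporting middleware.
# This is NOT the Raygun gem's implementation, just the usual Rack pattern.
class ExceptionReporter
  attr_reader :reports

  def initialize(app)
    @app = app
    @reports = []
  end

  # Rack calls this for every request, passing the environment hash.
  def call(env)
    @app.call(env)
  rescue StandardError => e
    # A real service would post this somewhere; here we only record it.
    @reports << { class: e.class.name, message: e.message, path: env["PATH_INFO"] }
    raise # re-raise so that whatever wraps this middleware can still respond
  end
end

# Any object that responds to `call` can act as a Rack app, so a lambda will do.
broken_app = ->(_env) { raise ArgumentError, "boom" }
reporter = ExceptionReporter.new(broken_app)

begin
  reporter.call("PATH_INFO" => "/cats")
rescue ArgumentError
  # the exception still propagates after being recorded
end

reporter.reports.first[:message] # => "boom"
```

The important detail is the bare raise: the middleware observes the exception but doesn’t swallow it, which is why something else has to turn it into a response.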

The documentation for this is available on the Raygun.io site, but at the moment the actual documentation link on their site points to a gem which, more confusingly, isn’t actually the gem that you will have installed. Reading the documentation in the gem README also reveals how to integrate with Resque, to catch exceptions in background jobs.

One thing that’s always worth checking when integrating with exception reporting services is whether or not they support SSL, and thankfully it looks like that’s the default (and indeed only option) here.

The Raygun server also sports a few plugins (slightly hidden under ‘Application Settings’) for logging exception data to HipChat, Campfire and the like. I’d like to see a generic webhook plugin supported, so that I could integrate exception notification into other tools that I write; thankfully that’s the number one feature request at the moment.

My other request would be that the gem should try not to depend on activesupport if possible. I realise for usage within Rails, this is a non-issue, but for non-Rails applications, loading ActiveSupport can introduce a number of other gems that bloat the running Ruby process. As far as I can tell, the only methods from ActiveSupport that are used are Hash#blank? (which is effectively the same as Hash#empty?) and String#starts_with? (which is just an alias for the Ruby-default String#start_with?). Pull request submitted.
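As a small illustration of that last point, these are the plain Ruby equivalents, with no gems loaded. The nothing_in? helper is a made-up name; it’s only there to show the one case where the two approaches differ.

```ruby
# Plain Ruby equivalents of the two ActiveSupport helpers mentioned above.
{}.empty?                                   # => true
{ error: "boom" }.empty?                    # => false
"RAYGUN_API_KEY".start_with?("RAYGUN")      # => true

# One difference worth remembering: ActiveSupport also defines blank? on nil,
# whereas nil does not respond to empty?, so a possibly-nil value needs a guard.
# (nothing_in? is a hypothetical helper, not part of any gem.)
def nothing_in?(hash)
  hash.nil? || hash.empty?
end

nothing_in?(nil)        # => true
nothing_in?({})         # => true
nothing_in?({ a: 1 })   # => false
```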


Monitoring our cat with Twine

Written on August 19 2013 at 16:34 ∷ permalink

I was lucky enough to be gifted a Twine by my colleagues at Go Free Range last weekend, and I took the opportunity to put together a very simple service that demonstrates how it can be used.

The Twine

If you haven’t heard of Twine, it’s a hardware and software platform for connecting simple sensors to the internet, and it makes it really very easy to do some fun things bridging the physical and online worlds.

Hardware

On the hardware side, there’s a simple 2.7” square that acts as a bridge between your home WiFi network and a set of sensors.

The twine bridge

Some of the sensors are built in to the square itself: temperature, orientation and vibration can be detected without plugging anything else in. You can also get external sensors, which connect to the square via a simple 3.5mm jack cable. If you buy the full sensor package, you’ll get a magnetic switch sensor, a water sensor and a ‘breakout board’ that lets you connect any other circuit (like a doorbell, photoresistor, button and so on) to the Twine.

Setup

Connecting the Twine to a WiFi network is elegant and features a lovely twist: you flip the Twine on its “back”, like a turtle, and it makes its own WiFi network available.

Twine setup

Connect to this from your computer, and you can then give the Twine the necessary credentials to log on to your home network, and once you’re done, flip it back onto its “belly” again and it will be ready to use. I really loved this simple, physical interaction.

Software

On the software side, Twine runs an online service that lets you define and store ‘rules’ to your connected Twine units. These rules take the form of when <X> then <Y>, in a similar style to If This Then That. So, with a rule like when <vibration stops> then <send an email to my phone>, you could pop the Twine on top of your washing machine and be alerted when it had finished the final spin cycle.

Twine rules

Connectivity

As well as emailing, the Twine can flash its LED, tweet, send you an SMS, call you, or ping a URL via GET or POST requests including some of the sensor information.

Supermechanical, the company that launched Twine about a year and a half ago via Kickstarter, maintains a great blog with lots of example ideas of things that can be done.

All technology tends towards cat

Just as the internet has found its singular purpose as the most efficient conduit for the sharing of cat pictures, so will the Internet of Things realise its destiny by becoming entirely focussed on physical cats, in all their unpredictable, scampish glory.

It’s neat having something in your house tweet or send you an email, but I like making software so I decided to explore building a simple server that the Twine could interact with, and thus, “Pinky Status” was born:

Pinky Status

What follows is a quick explanation of how easy it was.

The sensor

I hooked up the magnetic switch sensor to the Twine, and then used masking tape to secure the sensor to the side of the catflap, and then the magnet to the flap itself.

Catflap sensor

That way, when “Pinky” (that’s our cat) opens the flap, the magnet moves away from the switch sensor and it enters the ‘open’ state. It’s not pretty, but it works.

The Rule

Next, we need a simple rule so that the Twine knows what to do when the sensor changes:

Pinky Status Twine Rule

When the sensor changes to open, two things happen. Firstly, I get an email, which I really only use for debugging and I should probably turn it off, except that it’s pretty fun to have that subject line appear on my phone when I’m out of the house.

Secondly and far more usefully, the Twine pings the URL of a very, very simple server that I wrote.

A simple service

Here’s the code, but it’s probably clearest to view an earlier Sinatra version than the current Rails implementation:

require "rubygems"
require "bundler/setup"
require "sinatra"
require "data_mapper"

DataMapper::setup(:default, ENV['DATABASE_URL'] || 'postgres://localhost/pinky-status')

class Event
  include DataMapper::Resource
  property :id, Serial
  property :source, Enum[:manual, :twine], default: :twine
  property :status, Enum[:in, :out]
  property :created_at, DateTime

  def self.most_recent
    all(order: [:created_at.desc]).first
  end

  def self.most_recent_status
    most_recent ? most_recent.status : nil
  end

  def self.next_status
    if most_recent_status
      most_recent_status == :in ? :out : :in
    end
  end
end

DataMapper.finalize

Event.auto_upgrade!

get "/" do
  @events = Event.all
  @most_recent_status = Event.most_recent_status
  erb :index
end

post "/event" do
  Event.create!({created_at: Time.now, status: Event.next_status}.merge(params[:event] || {}))
  redirect "/"
end

The key part is at the very bottom – as Twine makes a POST request, the server simply creates another Event record with an alternating status (‘in’ or ‘out’), and then some logic in the view (not shown) can tell us whether or not the cat is in or out of the house.

In more recent versions of the code I’ve moved to Rails because it’s more familiar, but also slightly easier to do things like defend against duplicate events (normally when the cat changes her mind about going outside when her head is already through the flap) and other peripheral things.
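The Rails code isn’t shown here, but one way to defend against duplicates can be sketched in plain Ruby: ignore any ping that arrives too soon after the previous event. Everything below is an assumption for illustration, including the EventLog name and the 30 second window.

```ruby
# Hypothetical in-memory version of the alternating-status idea, with a
# simple guard against duplicate pings. The 30 second window is an
# assumption for illustration, not a value from the real application.
class EventLog
  Event = Struct.new(:status, :created_at)

  DUPLICATE_WINDOW = 30 # seconds

  attr_reader :events

  def initialize(initial_status = :in)
    @events = [Event.new(initial_status, Time.at(0))]
  end

  def current_status
    @events.last.status
  end

  # Record a ping from the sensor, flipping the status unless the ping
  # arrived suspiciously soon after the previous one.
  def record(at = Time.now)
    return false if at - @events.last.created_at < DUPLICATE_WINDOW

    next_status = current_status == :in ? :out : :in
    @events << Event.new(next_status, at)
    true
  end
end

log = EventLog.new(:in)
start = Time.at(1_000)
log.record(start)        # flips to :out
log.record(start + 5)    # ignored: too soon after the last event
log.record(start + 600)  # flips back to :in
log.current_status       # => :in
```

A time window is the crudest possible rule; it can’t tell a genuine quick return from a change of mind, so treat it as a starting point only.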

But don’t be dissuaded by Rails - it really was as trivial as the short script above, showing some novel information derived from the simple sensor attached to the Twine. Deploying a server is also very easy thanks to tools like Heroku.

Conclusions

A few hours’ idle work and the secret life of our cat is now a little bit less mysterious than it was. I really enjoyed how quick and painless the Twine was to set up, and I can highly recommend it if you’re perhaps not comfortable enough to dive into the deep sea of Arduinos, soldering and programming in C, but would still like to paddle in the shallower waters of the “internet of things”.


What happens when RSpec runs, or, what I think about testing with blocks

Written on February 18 2013 at 16:41 ∷ permalink

Welcome to part two of the post series which will hopefully cauterize the bleeding stump that is my Ruby Testing Quest.

This time, we will take a not-too-deep dive into how RSpec works. Last time we looked at MiniTest; if you haven’t already read that, it might be a better place to start than this.

Let’s get going.

A simple RSpec example

Here’s a simple RSpec example.

require "rspec/autorun"

describe "an object" do
  before :all do
    @shared_thing = Object.new
  end

  before :each do
    @something = Object.new
  end

  it "should be an Object" do
    @something.should be_an(Object)
  end

  describe "compared to another object" do
    before :each do
      @other = Object.new
    end

    it "should not be equal" do
      @something.should_not == @other
    end
  end

  after do
    @something = nil
  end
end

This is obviously extremely dull and pointless – just like the minitest one – but it contains just enough to exercise the major parts of RSpec that I care about. It’s actually slightly more sophisticated than the example that I used for MiniTest, because RSpec provides a couple of notable features that MiniTest doesn’t provide. Specifically, these are before :all setup blocks, and nested groups of tests12.

I’m not particularly interested in looking at the other obvious distinguishing features of RSpec, like matchers and the BDD-style “should” language, as these aren’t actually a part of the core RSpec implementation3.

The two hallmark attributes here that I am interested in are:

  • grouping test definitions within blocks (as opposed to classes)
  • defining test behaviour using blocks (as opposed to methods)

Running the test spec

The simplest way of running this spec would be to save it as something_spec.rb and run it from the command-line.

$ ruby something_spec.rb
..

Finished in 0.00198 seconds
2 examples, 0 failures
[Finished in 0.5s]

So – what’s actually happening here?

Autorun

As with the minitest example, the first line loads a special file within the test library that not only loads the library, but also installs an at_exit hook for Ruby to run when the interpreter exits.
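To make the at_exit trick concrete, here’s a toy version of the idea. TinyRunner is a made-up name and this is not how RSpec or MiniTest are actually written; it only shows tests being collected first and run when the script finishes.

```ruby
# A toy "autorun": collect tests as they are defined, then run them from an
# at_exit hook. TinyRunner is hypothetical, not RSpec's or MiniTest's code.
module TinyRunner
  @tests = []
  @installed = false

  class << self
    attr_reader :tests

    def installed?
      @installed
    end

    def test(name, &block)
      @tests << [name, block]
    end

    # Run every collected test, returning true only if all of them passed.
    def run(out = $stdout)
      results = @tests.map do |_name, block|
        passed = begin
          block.call ? true : false
        rescue StandardError
          false
        end
        out.print(passed ? "." : "F")
        passed
      end
      out.puts
      results.all?
    end

    # Register the at_exit hook once, however many times this is called.
    def autorun
      return if @installed

      @installed = true
      at_exit { run }
    end
  end
end

TinyRunner.autorun
TinyRunner.test("addition") { 1 + 1 == 2 }
# Nothing has run yet; the hook fires when the interpreter exits.
```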

In RSpec’s case, this is defined in RSpec::Core::Runner.autorun. This calls RSpec::Core::Runner.run with ARGV and the stderr and stdout streams.

In contrast with MiniTest, RSpec parses the options at this point, and will try to determine whether or not to launch using DRb. In most cases it will create an instance of RSpec::Core::CommandLine with the parsed options, and then call run on that instance.

Within the run method, some setup happens (mostly preamble to be output by the reporter, which is set via the configuration). Then we iterate through all of the “example groups”, returned by RSpec::world.example_groups4.

Let’s take a diversion to see how things actually get into RSpec::world.example_groups.

Your example groups

Consider our example spec again. At the top we have a call to describe:

describe "an object" do

The describe method is actually defined within the module RSpec::Core::DSL, but this module is extended into self at the top level of the running Ruby interpreter (which is main, a singleton instance of Object), making the methods in that module available to call in your spec files. You can actually see all of the modules that have been extended into this instance:

require "rspec/core"

self.class.ancestors
# => [Object, Kernel, BasicObject]

class << self
  ancestors
  # => [RSpec::Core::SharedExampleGroup, RSpec::Core::DSL, Object, Kernel, BasicObject]
end

# also, self.singleton_class.ancestors in Ruby 1.9

From this we can tell that the ancestors of Object are still just Kernel and BasicObject, but the ancestors of the specific instance main include a few extra modules from RSpec. Anyway, moving on…
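You can try the same thing without RSpec installed. In this sketch TinyDSL is a made-up stand-in for RSpec::Core::DSL, and a plain Object instance stands in for main.

```ruby
# Extending a module into a single object only changes that object's
# singleton class. TinyDSL is a hypothetical stand-in for RSpec::Core::DSL.
module TinyDSL
  def describe(name)
    "describing #{name}"
  end
end

host = Object.new
host.extend(TinyDSL)

host.describe("an object")                        # => "describing an object"
host.singleton_class.ancestors.include?(TinyDSL)  # => true
Object.ancestors.include?(TinyDSL)                # => false
Object.new.respond_to?(:describe)                 # => false
```

Only the one extended object gains the method; every other Object is left alone.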

describe and RSpec::Core::ExampleGroup

The describe method in RSpec::Core::DSL passes its arguments straight through to RSpec::Core::ExampleGroup.describe. This is where things get a little interesting. Within this inner describe method, a subclass of RSpec::Core::ExampleGroup is created, and given a generated name.

describe "a thing" do
  # your tests, um, I mean specs
end
RSpec::Core::ExampleGroup.constants
# => [:Nested_1, :Extensions, :Pretty, :BuiltIn, :DSL, :OperatorMatcher, :Configuration]

The class that was created is there: Nested_1. For each describe at the top level, you’ll have a new generated class:

describe "a thing" do
  # your specs
end
describe "another thing" do
  # more specs
end
RSpec::Core::ExampleGroup.constants
# => [:Nested_1, :Nested_2, :Extensions, :Pretty, :BuiltIn, :DSL, :OperatorMatcher, :Configuration]

After each subclass is created, it is “set up” via the set_it_up method, which roughly speaking adds a set of metadata about the group (such as which file and line it was defined upon, and perhaps some information about the class if it was called in the form describe SomeClass do ...), and stashes that within the created subclass.
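Here’s a miniature of that process, as I understand it. TinyGroup and the shape of its metadata hash are my own inventions; the real set_it_up collects far more than this.

```ruby
# TinyGroup is a hypothetical miniature of ExampleGroup; the real RSpec
# implementation differs in detail.
class TinyGroup
  class << self
    attr_reader :metadata

    def describe(description, &block)
      @nested_count = (@nested_count || 0) + 1
      subclass = Class.new(self)
      const_set("Nested_#{@nested_count}", subclass) # gives the class its name
      subclass.set_it_up(description, caller(1, 1).first)
      subclass.module_eval(&block) if block
      subclass
    end

    # Stash some information about where and how the group was defined.
    def set_it_up(description, location)
      @metadata = { description: description, location: location }
    end
  end
end

first = TinyGroup.describe("a thing") {}
second = TinyGroup.describe("another thing") {}

first.name                    # => "TinyGroup::Nested_1"
second.name                   # => "TinyGroup::Nested_2"
first.metadata[:description]  # => "a thing"
```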

module_eval

More importantly, however, the block which was passed to describe is evaluated against this new subclass using module_eval.

The effect of using module_eval against a class is that the contents of the passed block are evaluated essentially as if they were within the definition of that class itself:

class Lionel; end

Lionel.module_eval do
  def hello?
    "is it me you're looking for?"
  end
end

Lionel.new.hello?
# => "is it me you're looking for?"

You can see above that the behaviour is effectively the same as if we’d defined the hello? method within the Lionel class without any “metaprogramming magic”5.

It’s because of module_eval that you can define methods within example groups:

describe "a thing" do
  def invert_phase_polarity
    # waggle the flux capacitor or something
  end
end

RSpec::Core::ExampleGroup::Nested_1.instance_methods(false) # false means don't include methods from ancestors
# => [:invert_phase_polarity]

These methods are then effectively defined as part of the Nested_1 class that we are implicitly creating. This means that methods defined in this way can be called from within your specs:

describe "a method in an example group" do
  def the_method_in_question
    :result
  end

  it "can be called from within a spec" do
    the_method_in_question.should == :result
  end
end

We’ll see how this actually works a bit later. Knowing that the contents of the describe block are effectively evaluated within a class definition also explains what’s happening when the before methods are called:

  before :all do
    @shared_thing = Object.new
  end

  before :each do
    @something = Object.new
  end

Because this is evaluated as if it was written in a class definition, then before must be a method available on the ExampleGroup class. And indeed it is – RSpec::Core::ExampleGroup.before.

Well, almost.

Hooks

The before method actually comes from the module RSpec::Core::Hooks, which is extended into ExampleGroup. RSpec has a very complicated behind-the-scenes hook registry, which for the purposes of brevity I’m not going to inspect here.

The before method registers its block within that registry, to be retrieved later when the specs actually run.
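A drastically simplified registry might look something like the following. HookedGroup and the nested-hash layout are assumptions of mine and bear little resemblance to RSpec’s real registry; the point is only that before and after store blocks rather than run them.

```ruby
# A hypothetical, much-simplified hook registry. Calling before or after
# stores the block for later; nothing is evaluated at definition time.
class HookedGroup
  class << self
    def hooks
      @hooks ||= { before: { each: [], all: [] }, after: { each: [], all: [] } }
    end

    def before(scope = :each, &block)
      hooks[:before][scope] << block
    end

    def after(scope = :each, &block)
      hooks[:after][scope] << block
    end
  end
end

# module_eval makes the block behave as if written inside the class body.
HookedGroup.module_eval do
  before(:all)  { @shared_thing = Object.new }
  before(:each) { @something = Object.new }
  after         { @something = nil }
end

HookedGroup.hooks[:before][:all].size   # => 1
HookedGroup.hooks[:before][:each].size  # => 1
HookedGroup.hooks[:after][:each].size   # => 1
```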

I’m not going to look too deeply at hooks, so suffice it to say that the call to the after method works in pretty much the same way. Here it is though, just because:

  after do
    @something = nil
  end

The spec itself

The next method that’s module_eval‘d within our ExampleGroup subclass is the it:

  it "should be an Object" do
    @something.should be_an(Object)
  end

Users of RSpec will know that you can call a number of methods to define a single spec: it, specify, example, and others with additional meaning like pending or focus. These methods are actually all generated while RSpec is being loaded, by calls to define_example_method within the class definition of ExampleGroup. For simplicity’s sake (pending and focussed specs are somewhat outwith the remit of this exploration), we’ll only look at the simplest case.

When it is called, more metadata is assembled about the spec (again, including the line and file), and then both this metadata and the block are passed to RSpec::Core::Example.new, which stashes them for later.
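In sketch form, the stashing looks something like this. TinyExample and SpecGroup are made-up names, and I’ve used Proc#source_location to find the file and line, which is an assumption rather than what RSpec does internally.

```ruby
# A hypothetical miniature of how "it" can stash a block for later.
TinyExample = Struct.new(:description, :block, :metadata)

class SpecGroup
  class << self
    def examples
      @examples ||= []
    end

    # Nothing in the block runs here; it is only wrapped up and stored.
    def it(description, &block)
      file, line = block.source_location
      examples << TinyExample.new(description, block, { file: file, line: line })
    end
    alias_method :specify, :it
  end
end

SpecGroup.module_eval do
  it("adds numbers") { 1 + 1 == 2 }
  specify("is lazy") { raise "never evaluated at definition time" }
end

SpecGroup.examples.map(&:description)  # => ["adds numbers", "is lazy"]
SpecGroup.examples.first.block.call    # => true
```

Notice that the second block would raise if it were ever called, yet defining it is harmless.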

Nesting

Within our outer example group, we’ve nested another group:

  describe "compared to another object" do
    before :each do
      @other = Object.new
    end

    it "should not be equal" do
      @something.should_not == @other
    end
  end

Just as the top-level call to describe invokes a class method on RSpec::Core::ExampleGroup, this call will be invoked against the subclass of ExampleGroup (i.e. Nested_1) that our outer group defined. Accordingly, each call to describe defines a new subclass6, stored as a constant within the top-level class: Nested_1::Nested_1. This subclass is stored within an array of children in the outer Nested_1 class.
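The nesting can be reproduced in a few lines. NestingGroup is a made-up class, and naming constants by counting children is my own shortcut, but the resulting names have the same shape as the ones above.

```ruby
# A hypothetical miniature showing how nested describe calls produce
# nested constants and a tree of children.
class NestingGroup
  class << self
    def children
      @children ||= []
    end

    # self is whichever group describe was called on, so a nested call
    # creates a subclass of, and a constant within, the enclosing group.
    def describe(_description, &block)
      child = Class.new(self)
      const_set("Nested_#{children.size + 1}", child)
      children << child
      child.module_eval(&block) if block
      child
    end
  end
end

outer = NestingGroup.describe("an object") do
  describe("compared to another object") {}
end

outer.name                                # => "NestingGroup::Nested_1"
outer.children.first.name                 # => "NestingGroup::Nested_1::Nested_1"
outer.children.first.superclass == outer  # => true
```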

Within the definition, our before and it calls evaluate as before.

Your spec, as objects

So, for every describe, a new subclass of ExampleGroup is created, with calls to before and after registering hooks within that subclass, and then each it call defines a new instance of RSpec::Core::Example, and these are stored in an array called examples within that subclass.

We can even take a look at these now, for a simplified example:

group = describe "a thing" do
  it "should work" do
    (1 + 1).should_not equal(2)
  end
end

group
# => RSpec::Core::ExampleGroup::Nested_1

group.examples
# => [#<RSpec::Core::Example:0x007ff2523db048
#      @example_block=#<Proc:0x007ff2523db110@example_spec.rb:7>,
#      @options={},
#      @example_group_class=RSpec::Core::ExampleGroup::Nested_1,
#      @metadata={
#        :example_group=>{
#          :description_args=>["a thing"],
#          :caller=>["/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/example_group.rb:291:in `set_it_up'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/example_group.rb:243:in `subclass'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/example_group.rb:230:in `describe'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/dsl.rb:18:in `describe'", "example_spec.rb:6:in `<main>'"]
#        },
#        :example_group_block=>#<Proc:0x007ff255c11430@example_spec.rb:6>,
#        :description_args=>["should work"],
#        :caller=>["/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/metadata.rb:181:in `for_example'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/example.rb:81:in `initialize'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/example_group.rb:67:in `new'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/example_group.rb:67:in `it'", "example_spec.rb:7:in `block in <main>'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/example_group.rb:244:in `module_eval'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/example_group.rb:244:in `subclass'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/example_group.rb:230:in `describe'", "/Users/james/Code/experiments/rspec-investigation/.bundle/gems/ruby/1.9.1/gems/rspec-core-2.12.2/lib/rspec/core/dsl.rb:18:in `describe'", "example_spec.rb:6:in `<main>'"]
#      },
#      @exception=nil,
#      @pending_declared_in_example=false>
#    ]

Where example groups are nested, further subclasses are created, and stored in an array of children within their respective parent groups.

Almost there!

Phew. The detour we took when looking at this aspect of minitest was much shorter, but now that we understand what happened when our actual spec definition was evaluated, we can return to RSpec running and see how it’s actually exercised.

As we saw above, the describe method returns the created subclass of RSpec::Core::ExampleGroup, and when that is returned back in RSpec::Core::DSL#describe, the register method is called on it. This calls world.register with that class as an argument, where world is returned by RSpec.world and is an instance of RSpec::Core::World, which acts as a kind of global object to contain example groups, configuration and that sort of thing.

Calling register on the World instance stashes our Nested_1 class in an example_groups array within that world.
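Stripped right down, the registration step amounts to something like this. TinyWorld and TinySpec are hypothetical names, and the real World object holds a great deal more than a single array.

```ruby
# A hypothetical miniature of the "world" that collects example groups.
class TinyWorld
  attr_reader :example_groups

  def initialize
    @example_groups = []
  end

  def register(group)
    @example_groups << group
    group
  end
end

module TinySpec
  # One shared world for the whole process, created on first use.
  def self.world
    @world ||= TinyWorld.new
  end

  def self.describe(description, &block)
    group = Class.new
    group.define_singleton_method(:description) { description }
    group.module_eval(&block) if block
    world.register(group)
  end
end

group = TinySpec.describe("a thing") {}
TinySpec.world.example_groups.size            # => 1
TinySpec.world.example_groups.first == group  # => true
```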

Our diversion is complete! You deserve a break. Go fetch a cup of your preferred delicious beverage, you have earned it!

Back in RSpec

OK, pop your brain-stack back until we’re in RSpec::Core::CommandLine#run again. Our reporter did its preamble stuff, and we were iterating through @world.example_groups, whose origin we now understand.

For each example group, the run method is called on that class, with the reporter instance passed as an argument.

This gets a bit intricate, so I’m going to step through the method definition itself (for version 2.12.2) to help anchor things.

def self.run(reporter)
  if RSpec.wants_to_quit
    RSpec.clear_remaining_example_groups if top_level?
    return
  end

RSpec has a “fail fast” mode, where any single example failure will cause the execution of specs to finish as quickly as possible. Here, RSpec is checking whether anything has triggered this.

  reporter.example_group_started(self)

Next, the reporter is notified that an example group is about to start. The reporter can use this information to print out the name of the group, for example.

  begin
    run_before_all_hooks(new)

The run of the examples is wrapped in a block so it can catch any exceptions and handle them gracefully as you might expect.

The before :all hooks

The call to run_before_all_hooks is very interesting though, and worth exploring. A new instance of the example group is created. It is then passed into this method, where any “before all” blocks are evaluated against that instance, and then the values of any instance variables are stashed.

Consider our original example:

  before :all do
    @shared_thing = Object.new
  end

Given this, we’ll stash the value of @shared_thing (and the fact that it was called @shared_thing) for later use.

It’s actually quite easy to inspect the instance variables of an object in Ruby; try calling instance_variables, instance_variable_get and instance_variable_set on some objects in an IRB session:

class Thing
  def initialize
    @value = Object.new
  end
end

class OtherThing
end

thing = Thing.new
thing.instance_variables # => [:@value]
ivar = thing.instance_variable_get(:@value) # => #<Object:0x007fe43a050e30>

other_thing = OtherThing.new
other_thing.instance_variables # => []

other_thing.instance_variable_set(:@transplanted_value, ivar)
other_thing.instance_variables # => [:@transplanted_value]
other_thing.instance_variable_get(:@transplanted_value) # => #<Object:0x007fe43a050e30>

As you can see above, we can poke around with the innards of objects to our heart’s content. Who needs encapsulation, eh?

Why did RSpec have to create an instance of the example group class, only to throw it away after the before :all blocks have been evaluated? Because RSpec needs to evaluate the block against an instance of the example group so that it has access to the same scope (e.g. can call the same methods) as any of the specs themselves.
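Putting those two ideas together gives a sketch like the one below. StashGroup, helper and the before_all_ivars hash are illustrative names of mine; the mechanism is instance_eval followed by the introspection methods shown above.

```ruby
# A hypothetical group with a helper method that setup code wants to call.
class StashGroup
  def helper
    :from_helper
  end
end

before_all = proc do
  @shared_thing = Object.new
  @from_method = helper # only works because self is a group instance
end

# Evaluate the block against a throwaway instance...
scratch = StashGroup.new
scratch.instance_eval(&before_all)

# ...then record each instance variable's name and value for later.
before_all_ivars = scratch.instance_variables.each_with_object({}) do |name, stash|
  stash[name] = scratch.instance_variable_get(name)
end

before_all_ivars.keys            # => [:@shared_thing, :@from_method]
before_all_ivars[:@from_method]  # => :from_helper
```

Had the block been evaluated against a bare Object, the call to helper would have raised a NameError.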

Running the example

Now we’re finally ready to run the examples:

    result_for_this_group = run_examples(reporter)

To understand this, we need to look at the definition of run_examples:

def self.run_examples(reporter)
  filtered_examples.ordered.map do |example|
    next if RSpec.wants_to_quit
    instance = new
    set_ivars(instance, before_all_ivars)
    succeeded = example.run(instance, reporter)
    RSpec.wants_to_quit = true if fail_fast? && !succeeded
    succeeded
  end.all?
end

This method iterates over each Example that was stored in the examples array earlier, filtering them according to any command-line parameters (though we are ignoring that here). The most relevant part for us lies in the middle:

    instance = new
    set_ivars(instance, before_all_ivars)
    succeeded = example.run(instance, reporter)

A striking parallel with MiniTest

Another new instance of the ExampleGroup subclass is created. Remember, RSpec created one instance of the class for the before :all blocks, but now it’s creating a fresh instance for this specific spec to be evaluated against.

Thinking back to how MiniTest works, there’s a striking parallel: where MiniTest would instantiate a new instance of the MiniTest::Unit::TestCase for each test method, RSpec is creating a new instance of the ExampleGroup subclass to evaluate each Example block against.

Instances of this class are used so that any methods defined as part of the spec definition are implicitly available as methods to be called in the “setup” and “test” bodies (see the module_eval section above). Not so different after all, eh?

Next, the instance variables that we stashed after evaluating the before :all blocks are injected (effectively using instance_variable_set as we saw above) into this new instance, which will allow the spec to interact with any objects those blocks created. It also means that these values are shared between every spec, and so interactions within one spec that changed the state of one of these instance variables will be present when the next spec runs.
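Both halves of that behaviour, fresh instances but shared before :all objects, show up in a small experiment. IsolationGroup and the shared hash are made up for illustration; the hash stands in for the stashed before :all values.

```ruby
class IsolationGroup; end

# Stands in for the instance variables stashed after the before :all blocks.
shared = { :@shared_thing => [] }

examples = [
  proc { @local = 1; @shared_thing << :first; instance_variables },
  proc { @shared_thing << :second; instance_variables }
]

seen = examples.map do |example|
  instance = IsolationGroup.new # a fresh instance for every example
  shared.each { |name, value| instance.instance_variable_set(name, value) }
  instance.instance_eval(&example)
end

seen.first              # => [:@shared_thing, :@local]
seen.last               # => [:@shared_thing]
shared[:@shared_thing]  # => [:first, :second]
```

The second example never sees @local, because it ran against a different instance, but both examples appended to the very same array.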

Finally, the run method on the Example instance is called, passing the ExampleGroup instance and the reporter. Down one level we go, into Example#run.

The spec finally runs

Here’s the full definition of RSpec::Core::Example#run:

def run(example_group_instance, reporter)
  @example_group_instance = example_group_instance
  @example_group_instance.example = self

  start(reporter)

  begin
    unless pending
      with_around_each_hooks do
        begin
          run_before_each
          @example_group_instance.instance_eval(&@example_block)
        rescue Pending::PendingDeclaredInExample => e
          @pending_declared_in_example = e.message
        rescue Exception => e
          set_exception(e)
        ensure
          run_after_each
        end
      end
    end
  rescue Exception => e
    set_exception(e)
  ensure
    @example_group_instance.instance_variables.each do |ivar|
      @example_group_instance.instance_variable_set(ivar, nil)
    end
    @example_group_instance = nil

    begin
      assign_generated_description
    rescue Exception => e
      set_exception(e, "while assigning the example description")
    end
  end

  finish(reporter)
end

For our purposes, we again only need to consider a small part. Once all the reporter and “around” block housekeeping has taken place, the essential core of the example is run:

          run_before_each
          @example_group_instance.instance_eval(&@example_block)
        rescue Pending::PendingDeclaredInExample => e
          @pending_declared_in_example = e.message
        rescue Exception => e
          set_exception(e)
        ensure
          run_after_each

The call to run_before_each introspects the hook registry and evaluates every relevant before hook against the ExampleGroup instance. In effect, this will find any before blocks registered in this example group, and then any blocks registered in any parent groups, and evaluate them all in order, so that each nested before block runs.
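Ignoring the registry entirely, the outermost-first ordering can be had by walking up the superclass chain. ChainGroup and all_before_each_hooks are hypothetical names; this is a sketch of the idea rather than of RSpec’s code.

```ruby
# A hypothetical group that gathers before hooks from its parents first.
class ChainGroup
  class << self
    def before_each_hooks
      @before_each_hooks ||= []
    end

    def before(&block)
      before_each_hooks << block
    end

    # Parents' hooks come first, so outer setup runs before inner setup.
    def all_before_each_hooks
      inherited = self == ChainGroup ? [] : superclass.all_before_each_hooks
      inherited + before_each_hooks
    end
  end
end

outer = Class.new(ChainGroup) { before { (@log ||= []) << :outer } }
inner = Class.new(outer)      { before { (@log ||= []) << :inner } }

instance = inner.new
inner.all_before_each_hooks.each { |hook| instance.instance_eval(&hook) }
instance.instance_variable_get(:@log) # => [:outer, :inner]
```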

Then, the spec block (stored in @example_block) is evaluated against the ExampleGroup instance. This is where your assertions, or matchers, are finally – finally! – evaluated.

If there was a problem, such as a matcher failing or an exception being raised, then the exception is stored against this Example for later reporting. Just as MiniTest assertions raise an exception when they fail, RSpec matchers raise an RSpec::Expectations::ExpectationNotMetError exception. It seems this is the universal way of halting execution when a test fails7. Another hidden similarity between RSpec and MiniTest!

As in MiniTest, whether the spec failed or an exception occurred, an ensure section is used to guarantee that run_after_each is called, and any teardown is performed.
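The whole failure-as-exception arrangement fits in a few lines. ExpectationFailed, expect_equal and run_example are made-up names, and for safety this sketch rescues StandardError where the RSpec code above rescues Exception.

```ruby
# A hypothetical miniature of "a failed expectation is just an exception".
class ExpectationFailed < StandardError; end

def expect_equal(expected, actual)
  return if expected == actual

  raise ExpectationFailed, "expected #{expected.inspect}, got #{actual.inspect}"
end

# Returns nil on success or the exception on failure; teardown always runs.
def run_example(example, after_each)
  example.call
  nil
rescue StandardError => e
  e
ensure
  after_each.call
end

log = []
teardown = -> { log << :after_each }

passed = run_example(-> { expect_equal(2, 1 + 1); log << :reached_end }, teardown)
failed = run_example(-> { expect_equal(3, 1 + 1); log << :never_reached }, teardown)

passed          # => nil
failed.class    # => ExpectationFailed
failed.message  # => "expected 3, got 2"
log             # => [:reached_end, :after_each, :after_each]
```

Raising is what stops the rest of the failing example from running, and ensure is what keeps the teardown honest.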

After the specs have run

Once all the specs in this example group have run, all the examples in any subclasses are run (recall that the inner describe stashed the nested ExampleGroup subclass in an array called children). We map each ExampleGroup subclass to the result of calling run on it, which starts this whole process again, for every nested example group. Whether or not this group passed or failed overall is then determined using simple boolean logic:

    results_for_descendants = children.ordered.map {|child| child.run(reporter)}.all?
    result_for_this_group && results_for_descendants
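One subtlety is worth drawing out: mapping first and calling all? afterwards means every child group runs even if an earlier one has failed, whereas all? with a block would stop at the first failure. RunGroup below is a hypothetical miniature that demonstrates this.

```ruby
# A hypothetical miniature of the recursive group run.
class RunGroup
  attr_reader :name, :examples, :children

  def initialize(name, examples: [], children: [])
    @name = name
    @examples = examples
    @children = children
  end

  def run(log)
    log << name
    result_for_this_group = examples.map(&:call).all?
    # map runs every child before all? inspects any of the results
    results_for_descendants = children.map { |child| child.run(log) }.all?
    result_for_this_group && results_for_descendants
  end
end

failing = RunGroup.new("failing child", examples: [-> { false }])
passing = RunGroup.new("passing child", examples: [-> { true }])
root = RunGroup.new("root", examples: [-> { true }], children: [failing, passing])

log = []
root.run(log)  # => false
log            # => ["root", "failing child", "passing child"]
```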

As we leave the call to ExampleGroup.run, we run any corresponding after :all blocks, and also clear out our stash of before :all instance variables, because they are no longer necessary.

  ensure
    run_after_all_hooks(new)
    before_all_ivars.clear
    reporter.example_group_finished(self)
  end
end

Finishing up

You can once again pop your brain-stack back until we’re in RSpec::Core::CommandLine#run.

Having run all of the example groups, RSpec will do a little bit of tidying up, and finally return back up through the stack. Along the way the results of the run are printed to the console, before the interpreter finally, properly quits.

Phew. You deserve another rest.

Testing with blocks

In contrast to the class-based implementation with MiniTest, we’ve now seen how a block-based test framework can work. In a nutshell, it can be characterised in a couple of key ways:

  • the stashing of behaviour blocks, later evaluated using instance_eval against clean test-environment instances (see this section of the MiniTest article for what I mean by “test environment”);
  • using module_eval and subclassing to ensure method definition matches programmer expectation.

I would say these two aspects are the hallmark attributes of an RSpec-style test framework. The other notable aspect is the ability to nest example groups, and the subsequent necessity to be able to gather the implicit chain of setup blocks and evaluate them against the test environment instance, but this could be considered another example of using instance_eval.

Supporting method definition in example groups

One thing I’ve found particularly interesting is that RSpec ends up generating classes and subclasses behind the scenes. I believe this is almost entirely a consequence of wanting to support the kind of “natural” method definition within group bodies (see the module_eval section again).

If any test framework chose to not support this, there’s almost certainly no reason to create classes that map to example groups at all, and the setup and test blocks could be evaluated against a bare instance of Object.
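As a rough sketch of that alternative (the setup and test procs here are hypothetical stand-ins for whatever a framework would have stashed), the blocks can be evaluated against a plain Object with no generated classes at all:

```ruby
# Stashed blocks, as a framework without example-group classes might hold them.
setup = proc { @value = 123 }
test  = proc { @value + 1 }

# A bare Object is enough to act as the test environment.
environment = Object.new
environment.instance_eval(&setup)
result = environment.instance_eval(&test)
```

The instance variable set by the setup block is visible to the test block because both are evaluated against the same object. What you give up is anywhere sensible for a def inside the group body to land.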

Supporting nesting and dynamic composition

It’s clear that RSpec has more “features” (e.g. nesting, before :all and so on) than MiniTest (ignoring the many extensions available for MiniTest, the most sophisticated of which end up significantly modifying or replacing the MiniTest::Unit.run behaviour). I’m deliberately ignoring features like matchers, or a built-in mocking framework, because what I’m most interested in here are the features that affect the structure of the tests.

It’s certainly possible to implement features like nesting using subclasses and explicit calls to super, but this is the kind of plumbing work that Ruby programmers are not accustomed to accepting. By separating creation of tests from Ruby’s class implementation, the implicit relationships between groups of tests can take this burden instead, and behaviours like before :all, which have no natural analogue in class-based testing, are possible.
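Here’s a small sketch of what that plumbing looks like, assuming nothing more than plain Ruby classes (OuterCase and InnerCase are invented names, not part of any framework):

```ruby
# Hypothetical class-based "nesting": the inner group is a subclass, and
# every setup must remember to call super to run the outer setup first.
class OuterCase
  attr_reader :log

  def setup
    @log = ["outer setup"]
  end
end

class InnerCase < OuterCase
  def setup
    super # forget this line and the outer setup silently vanishes
    @log << "inner setup"
  end
end

inner = InnerCase.new
inner.setup
inner.log
```

It works, but the ordering of the setup chain now depends on every author remembering that one call to super.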

Now, you may believe that nesting is fundamentally undesirable, and it is not my present intention to disabuse you of that opinion. It’s useful (I think) to understand the constraints we accept by our choice of framework, and I’ve certainly found my explorations of MiniTest and RSpec have helped clarify my own opinions about which approach is ultimately more aligned with my own preferences. While I wouldn’t say that I’m ready to jump wholesale into the RSpec ecosystem, I think it’s fair to say that my advocacy of class-style testing frameworks is at an end.

RSpec and Kintama

I started this exploration because I wanted to understand the relationship between the software I have accidentally produced and what’s already available. I already had strong suspicions that any block-based testing implementation would converge on a few common implementation decisions, and while I have now identified a few interesting (to me) ways in which RSpec and Kintama diverge, the essential approach is the same.

In the final article in this triptych (coming soon, I hope), I’ll walk through Kintama and point those out.

  1. There’s no built-in way to ‘nest’ test groups with MiniTest, or test-unit; the closest simulation would be to create subclasses, and explicitly ensure that super is called within every setup method.

  2. There are other RSpec features like shared examples and global before/after hooks that are definitely interesting, but I need to keep the scope of this article down…

  3. They are actually within a separate gem (rspec-expectations), and it’s quite possible to use rspec-core with test-unit’s assertions (for the curious, hunt for config.expect_with :stdlib).

  4. I’m not sure why some people prefer the syntax Module::method rather than Module.method; as I understand it they are exactly the same, but the former seems more confusing to me, since if you don’t notice the lower-case w in world then you’d assume it was referring to a constant.

  5. It’s not really magic, and it’s not really “metaprogramming”, because it’s all just programming. It just so happens that it’s quite sophisticated programming.

  6. The nested class is a subclass of the outer subclass of ExampleGroup (sorry, I realise that’s confusing), precisely such that any methods defined in the outer class are also available in nested subclasses via the regular mechanisms of inheritance.

  7. Raising an exception might not be the only way to stop a test executing at the point it fails; it could be possible to use fibers/continuations to “pause” failing tests…


The Postal Inter.net Office is shutting down

Written on February 13 2013 at 17:00 ∷ permalink

Two years ago, as a joke and a nod to making things fast, I took a silly domain name and served a few silly “HTTP” requests using the UK postal service as the transport layer. It was called Postal Inter.net.

Postal Inter.net

It was good fun, and I really enjoyed some of the requests that we received, but the “server” has not been accessed for more than a year now, so I think it’s time to put it to rest.

You were fun, postalinter.net, but your time has passed. I release you into the quantum foam.

Here are a few of the requests that we got. Obviously you are only seeing one side of the communication; the responses are now lost in the ether (or in the post boxes of the UK).

A simple request to the root URL Some more requests

The most impressive request thoroughly embraced the nature of TCP/IP, and arrived in a number of packets, out of order and with some data corruption (see the missing data on the envelopes), which we had to reconstitute into the actual request within our ‘server’. Bravo, Tom Stuart!

Packets 1 and 2 of a very elaborate reimplementation of TCP/IP over the postal system Packets 3 and 4 of a very elaborate reimplementation of TCP/IP over the postal system Packets 5 and 6 of a very elaborate reimplementation of TCP/IP over the postal system

Tom was challenged for login details, and here was his response.

A further request after login was challenged

Alas, I cannot remember what was at http://experthuman.com/proof, and whatever was there is gone now. Perhaps that’s for the best.

PostalInter.net

Packet corruption

Packets on the wire

Next request

Bad request


Richard Paterson will kill you

Written on February 07 2013 at 09:51 ∷ permalink

Richard Paterson is Whyte & Mackay’s “master blender”, which means he doubtless knows a lot about whisky, and as a result he’s clearly asked to appear on TV to guide people around the world of whisky and how best to appreciate it.

Let him guide you now:

But what is a fun bit of banter once can quickly become sinister when you hear it again. And again. And again.

“I’ll kill you.”

“I’ll kill you.”

“I’ll kill you.”

… so don’t say you haven’t been warned.


What happens when MiniTest runs, or, what I think about testing using classes

Written on February 01 2013 at 13:44 and updated on February 06 2013 at 04:12 ∷ permalink

I think I can see the end of my Ruby Testing Quest in sight.

As one part of the final leg of this journey, I want to take a not-too-deep dive into how some principal testing frameworks actually work, so that I can better clarify in my own mind what distinguishes them, and perhaps, if we are lucky, draw out some attributes that may help me. Somehow.

We’re going to start with MiniTest. We’ll also look at RSpec and Kintama, but not right now. This is already crazy-long.

(Update: you can now read about how RSpec works if you wish…)

A simple MiniTest example

Let’s say you have the following test case:

require "minitest/autorun"

class SomethingTest < MiniTest::Unit::TestCase
  def setup
    @something = Object.new
  end

  def test_something
    refute_nil @something
  end

  def teardown
    @something = nil
  end
end

This test is obviously extremely dull and pointless, but it contains just enough to exercise the major parts of MiniTest that I care about.

The two hallmark attributes here are:

  • creating an explicit subclass of a framework class (SomethingTest < MiniTest::Unit::TestCase)
  • defining test behaviour within explicit methods (def setup, def test_something and def teardown).

Running the test

The simplest way of running this test would be to save it in some file (something_test.rb) and run it from the command-line.

$ ruby something_test.rb
Run options: --seed 24486

# Running tests:

.

Finished tests in 0.000866s, 1154.7344 tests/s, 1154.7344 assertions/s.

1 tests, 1 assertions, 0 failures, 0 errors, 0 skips

So – what’s actually happening here?

Autorun

The first line in the file (require "minitest/autorun"), when evaluated, loads the MiniTest library and then calls MiniTest::Unit.autorun, which installs an at_exit hook – a block of code that will be run when this Ruby interpreter process starts to exit.

Our command in the shell (ruby something_test.rb) tells Ruby to load the contents of something_test.rb, which after loading MiniTest simply defines a class with some methods, and nothing else, so after the definition of SomethingTest is finished Ruby starts to exit, and the at_exit code is invoked.

Within this block, a few things happen, but only a small part is particularly relevant to us at the moment: the method MiniTest::Unit.new.run is run, with the contents of ARGV from the command line (in this case an empty Array, so we’ll ignore them as we continue).
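If you’ve not used at_exit before, this sketch shows the shape of the trick. TinyAutorun is a made-up module standing in for the autorun feature; it is not MiniTest’s real code, and the "run the tests" step is just a puts.

```ruby
# Hypothetical stand-in for an autorun feature; not MiniTest's real code.
module TinyAutorun
  @installed = false

  class << self
    attr_reader :installed

    def autorun
      return if @installed # only register the hook once

      at_exit do
        # By now the whole file has been loaded, so every test class exists.
        puts "interpreter is exiting: run the tests here"
      end
      @installed = true
    end
  end
end

TinyAutorun.autorun
TinyAutorun.autorun # a second call is a no-op
puts "file finished loading"
```

Running this prints the "file finished loading" line first, and only then the message from the at_exit block, which is exactly why the tests run after your test classes have been defined.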

MiniTest::Unit, a.k.a. the “runner”

The call to MiniTest::Unit.new.run simply calls MiniTest::Unit.runner._run, passing the command-line arguments through. runner is a class method on MiniTest::Unit, which returns an instance of MiniTest::Unit by default, although it can be configured to return anything else by setting MiniTest::Unit.runner = <something else>.

So, an instance of MiniTest::Unit was created in the at_exit block, which then calls _run on another newly-created instance of it. It’s mildly confusing, but I believe the purpose is to allow you to completely customise how the tests run by being able to use any object with a _run method. From here on, we’ll assume that the default runner (an instance of MiniTest::Unit) was used.

The default _run method parses the ARGV into arguments (which we’ll ignore right now since in our example they are empty) and then loops through the plugins (another modifiable property of the MiniTest::Unit class), which is really just an array of strings which correspond to methods on the MiniTest::Unit “runner” instance. By default, this is all methods which match run_*, and unless you’ve loaded extensions to MiniTest, it is just run_tests:

MiniTest::Unit.plugins
# => ["run_tests"]
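The "plugins are just method names matching run_*" idea is easy to sketch. TinyRunner below is hypothetical; only the naming convention is taken from the description above, and the way MiniTest actually builds its list may differ.

```ruby
# Hypothetical runner; only the run_* naming convention is taken from the text.
class TinyRunner
  # Collect the names of public instance methods that look like plugins.
  def self.plugins
    @plugins ||= public_instance_methods(false).grep(/^run_/).map(&:to_s)
  end

  def run_tests
    :ran_tests
  end

  def report
    :not_a_plugin
  end

  # Dispatch to each plugin by name.
  def _run
    self.class.plugins.map { |plugin| send(plugin) }
  end
end

plugin_names = TinyRunner.plugins
outcomes = TinyRunner.new._run
```

Here plugin_names contains only "run_tests", because neither report nor _run starts with run_. Adding a run_benchmarks method to the class would be enough to register a second plugin.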

The run_tests method calls the _run_anything method with the argument :tests. Within _run_anything, the argument is used to select the set of “suites” by kind (“test” suites or “bench” suites, but basically the classes that contain your actual tests).

The actual set of “suites” is returned by calling TestCase.test_suites in this instance. So what does it return? Let’s take a diversion to see what’s going on there.

The test suites, a.k.a. TestCase subclasses, a.k.a. your actual tests

Take another look at the content of our test file:

class SomethingTest < MiniTest::Unit::TestCase
  def setup
    @something = Object.new
  end

  def test_something
    refute_nil @something
  end

  def teardown
    @something = nil
  end
end

When we subclassed MiniTest::Unit::TestCase as SomethingTest, the inherited hook on the superclass is called by Ruby with SomethingTest as an argument.

This stashes a reference to the class SomethingTest in a class variable1. The TestCase.test_suites method that we were looking at above returns all those subclasses, sorted by name:

MiniTest::Unit::TestCase.test_suites
# => [SomethingTest]
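The inherited hook is plain Ruby, so the mechanism can be sketched without MiniTest at all. TinyTestCase is an invented class; it mimics the idea described above (a hash in a class variable, keys sorted by name) rather than reproducing MiniTest’s source.

```ruby
# Hypothetical base class; it mimics the idea, not MiniTest's actual source.
class TinyTestCase
  @@test_suites = {}

  # Ruby calls this automatically whenever TinyTestCase is subclassed.
  def self.inherited(subclass)
    @@test_suites[subclass] = true
    super
  end

  def self.test_suites
    @@test_suites.keys.sort_by { |klass| klass.name }
  end
end

class ZebraTest < TinyTestCase; end
class AardvarkTest < TinyTestCase; end

suites = TinyTestCase.test_suites
```

Even though ZebraTest was defined first, suites comes back as AardvarkTest followed by ZebraTest, because of the sort by name.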

Running a “suite”2

Back in the _run_anything method, those suites are passed to the _run_suites method, which maps them into their results by passing each to the _run_suite method.

The _run_suite method is responsible for selecting those tests within a suite (returned by the test_methods method on your TestCase subclass) which match any filters (i.e. -n /test_something/).

SomethingTest.test_methods
# => ["test_something"]

The default filter is /./, which will match everything that test_methods returns. For each matching method name, it instantiates a new instance of your suite class, with the method name as an argument to the initialiser, i.e. SomethingTest.new("test_something").

The run method is then called on that instance, with the runner (the instance of MiniTest::Unit that was returned by MiniTest::Unit.runner) as an argument. If you wanted to do the same in the console, it basically amounts to this:

runner = MiniTest::Unit.new
suite = SomethingTest.new("test_something")
suite.run(runner)
# => "."
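To see the filtering and the one-instance-per-method step in isolation, here’s a sketch using a made-up TinySuite class and an instances_matching helper (both hypothetical). I’ve sorted the method names to keep the result predictable, which is an assumption of this sketch rather than a claim about MiniTest’s ordering.

```ruby
# Hypothetical suite class; test_methods and the filter mirror the description above.
class TinySuite
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Every public method whose name starts with test_ counts as a test.
  def self.test_methods
    public_instance_methods(true).grep(/^test_/).map(&:to_s).sort
  end

  def test_something; end
  def test_another_thing; end
  def helper; end
end

# Build one fresh instance per matching test method name.
def instances_matching(suite, filter = /./)
  suite.test_methods.grep(filter).map { |method_name| suite.new(method_name) }
end

all_tests = instances_matching(TinySuite)
some_tests = instances_matching(TinySuite, /something/)
```

With the default filter both test methods get an instance each (and helper is ignored); with /something/ only test_something survives.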

An actual test running

We’re now at the point where the code from your test is significantly involved. Within the run method, the following methods are called3:

  • setup – this is the method defined in your TestCase subclass. In our example, this results in the instance variable @something being set:
      def setup
        @something = Object.new
      end
    
  • run_test, with the test name that was passed to the initializer as an argument. This method is simply an alias for __send__, so the effect is that the method corresponding to your test name is invoked. In our case, the body of test_something runs:
      def test_something
        refute_nil @something
      end
    
  • runner.record – this passes information about the name of the test, how long it took and how many assertions were called back to the runner instance

If we reach this point in the method, it means that the test method returned without raising any exceptions, and so the test is recorded as a pass.

However, if an exception was raised – either by the test, or by a failing assertion – then the test is marked as a failure, and the exception is passed as an argument in a corresponding call to runner.record.

  • teardown – This method is run via an ensure block, so that it will be invoked whether or not an exception occurred. In our example, the @something instance variable is set to nil:
      def teardown
        @something = nil
      end
    

Various other things happen, but this is the essential core of how MiniTest works: an instance of your TestCase subclass is created, and then the setup, test and teardown methods are invoked on it.
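Stripped of timing, assertion counting and the runner, that core can be sketched in a few lines. TinyCase and ExampleCase are hypothetical, and I’ve collapsed every kind of failure into a single "F" for brevity.

```ruby
# Hypothetical base class showing the setup / test / teardown sequence.
class TinyCase
  def initialize(name)
    @name = name
  end

  def setup; end
  def teardown; end

  # Returns "." for a pass and "F" for anything that raised.
  def run
    result = "."
    begin
      setup
      __send__(@name)
    rescue StandardError
      result = "F"
    ensure
      teardown # runs whether or not the test raised
    end
    result
  end
end

class ExampleCase < TinyCase
  attr_reader :events

  def setup
    @events = [:setup]
  end

  def test_passes
    @events << :test_passes
  end

  def test_fails
    @events << :test_fails
    raise "assertion failed"
  end

  def teardown
    @events << :teardown
  end
end

passing = ExampleCase.new("test_passes")
failing = ExampleCase.new("test_fails")
passing_result = passing.run
failing_result = failing.run
```

In both cases the events array ends with :teardown, which is the whole point of putting it in the ensure block.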

After the test has run

The run method returns a “result”, which is normally a character like . or F or E. This ultimately gets spat out to whatever is going to be doing the output (normally STDOUT). We saw this output above when we manually instantiated SomethingTest and then called run on it.

Actually, the puke method is called for anything other than a pass, which writes a more detailed string into a @report instance variable, and then returns the first character of that string (Skipped ...S, Failed ...F and so on).

Back up into MiniTest

Once the run method finishes, the result is printed out, and the number of assertions stored on the instance is collected. The test method names that we were iterating over – the result of SomethingTest.test_methods above – are sequentially mapped into this number of assertions, and the final returned value of the _run_suite method is a two element array, the first being the number of tests and the second being the total number of assertions, for each test that ran. In our example, this would be [1,1] – one test and one assertion in total:

runner = MiniTest::Unit.new
runner._run_suite(SomethingTest, :test)
# => [1, 1]

Back up in the _run_suites method, each TestCase is being mapped via _run_suite into this pair of numbers:

runner._run_suites([SomethingTest], :test)
# => [[1, 1]]

Back up one level further in the _run_anything method, those numbers are summed to return the total number of tests and the total number of assertions, across the whole run of test suites. Finally, these numbers are printed out, and then any failures that were gathered by the calls to runner.record when each test was running.
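The summing step is simple enough to try by hand. The numbers below are invented for illustration (three imaginary suites), and this is just one way to total the pairs, not necessarily how MiniTest writes it.

```ruby
# Each pair is [test_count, assertion_count] for one suite; the values are made up.
per_suite = [[1, 1], [3, 7], [2, 2]]

# Turn the list of pairs into two columns, then total each column.
test_count, assertion_count = per_suite.transpose.map { |counts| counts.inject(0, :+) }
```

That leaves test_count as 6 and assertion_count as 10.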

When the _run method itself finally finishes, taking us back into the at_exit block we started in, it returns the number of errors plus failures that were counted. This value doesn’t seem to be used, and disappears into the quantum foam of energy and information to which we all, ultimately, return.


Running tests within the console

We’ve actually seen already how we could start to poke around with tests without running them all. We can run a single test relatively easily, and determine whether or not it passed:

runner = MiniTest::Unit.new
suite = SomethingTest.new("test_something")
suite.run(runner)
# => "."

Unfortunately, there’s no simple way to run a group of tests (a “suite” or a “testcase” or what have you) aside from using the runner to specify a filter based on names. In other words, there’s no behaviour inherent within the TestCase class that lets you examine the result of the tests it contains. The information about which test failed, and why, leaves the instance when runner.record is called, and it’s only the runner that “knows” (in a very, very weak sense) about the state of more than just the test that is running now.

Instead the TestCase subclass is really just a collection of test_ methods along with the underlying behaviour to execute them (the run method that we examined above, and all of the supporting methods it invokes).

A test’s environment

One of the aspects of test frameworks that interests me most is what provides the environment for each test. What I mean by environment here are things like

  • the implicit value of self
  • how instance variables declared outside of the test relate to the code within the test
  • how methods defined outside the test relate to the code within the test

When a TestCase subclass is instantiated, that instance provides the environment for the test to execute. MiniTest, like test-unit before it, is using the familiar conceptual relationship between classes, objects and methods in Ruby to implicitly communicate that instance variables created or modified in one method, like setup, will be available within our tests, just like normal Ruby code.

This is, I believe, the main reason behind some of the preference towards MiniTest or test-unit style frameworks – they use “less magic”, they are “closer to the metal” – because they use the same conceptual relationships between methods, variables and self as we use when doing all other programming.

This may be so familiar as to seem obvious; methods can of course call methods within the same class, and instance variables set in one method (e.g. setup) can of course be accessed by other methods (e.g. test_something) within the same class. Therefore implementing test suites like this is surely only natural!

Yes, indeed. But doing so is not without consequence.

For example, it’s not typical behaviour to create a new instance of a class just to invoke a single method on it, but that happens for every test_ method. I hope you’ll agree that that seems far less natural. But this has to happen so that each test runs within a clean environment, without any of the changes the previous test might have made to the instance variables they both use, and without any trace of the instance variables previous tests may have created.

If your test framework has those hallmark attributes I mentioned above – a class definition to contain tests, and tests defined as methods – then creating a new instance of that class to run each individual test is an inevitable consequence, unless you want to do some incredible gymnastics behind the scenes.
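A tiny sketch shows why. LeakyCase is a made-up class with two "tests", one of which sets an instance variable that the other checks for:

```ruby
# Hypothetical test class: test_one sets state, test_two checks whether it can see it.
class LeakyCase
  def test_one
    @shared = :set_by_test_one
  end

  def test_two
    instance_variable_defined?(:@shared)
  end
end

# One instance per test: test_two cannot see what test_one did.
LeakyCase.new.test_one
isolated = LeakyCase.new.test_two

# One instance reused for both: state leaks between tests.
reused = LeakyCase.new
reused.test_one
leaked = reused.test_two
```

With a fresh instance per test, isolated is false; reuse the instance and leaked is true, which is precisely the cross-test contamination that the one-instance-per-method approach avoids.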

Examining test environments

Before I climb the ivory tower at the end of this post, let’s have one final code interlude, using these test objects we are creating in the console.

I’ve often imagined that it would be very useful if, when a test fails, you got a dump of all of the values of every instance variable in that test. I don’t know about you, but I am very bored of peppering tests with puts statements, or trying to use logs to decipher what happened, when I know that if I could just see the instance variables then I could tell what was failing, and why.

How about this:

require "minitest/unit"

class AnotherTest < MiniTest::Unit::TestCase
  def setup
    @value = 123
  end

  def test_something
    assert_nil @value
  end
end

class MiniTest::Unit::TestCase
  def environment(hide_framework_variables = true)
    variables = instance_variables
    variables.reject! { |v| v.to_s =~ /^@(_|passed)/ } if hide_framework_variables
    variables.inject({}) do |h, v|
      h[v] = instance_variable_get(v)
      h
    end
  end
end

runner = MiniTest::Unit.new
test = AnotherTest.new("test_something")
test.run(runner)
# => "F"

We can see that the test failed, but now we can also look at the instance variables within that test:

test.environment
# => {:@value=>123}

In this test it’s pretty trivial, but maybe you can imagine that being useful when you have a ton of ActiveRecord objects flying around? Particularly if you also patch whatever is outputting your test results to print the contents of environment for all failing tests.

If you’re curious, you can also take a look at the other instance variables that MiniTest has created behind the scenes, mostly prefixed with _ to indicate an informal ‘privacy’:

test.environment(false)
# => {:@__name__=>"test_something", :@__io__=>nil, :@passed=>false, :@value=>123, :@_assertions=>1}

Perhaps this might be worth developing into something useful? Maybe. It’s very much related to the other ideas that I’ve had about Rerunning tests in Ruby.

Using classes for test cases?

So, here we are at the foot of my ivory tower.

There’s nothing wrong with implementing test suites like MiniTest does, but it’s interesting to understand the consequences, both in terms of the impact to the test implementer and the design choices that it forces on the framework implementer. This is particularly obvious if you’re trying to understand the different ways that one could compose test suites.

Using classes and methods is one way, but it’s not the only way to produce blocks of code (indeed, blocks are another) to be run in some specific way.

If we choose Ruby’s existing class system as the mechanism for collecting test behaviour together, we are bound by the rules and limitations of that class system when trying to do anything slightly more out of the ordinary, like dynamically composing abstract behaviour specifications.

Of all languages I’ve used, Ruby is by far the most forgiving regarding this; you can get an amazing amount of mileage out of subclassing, and including modules, and using “class methods” to modify the definition of classes at run-time.

Ruby Testing Diaspora

It’s really a credit to Ruby that, even within the niche of the ecosystem that testing libraries represent, and even within that, the libraries that build on MiniTest or test-unit, so much richness exists. Things like shoulda or coulda or contest could not possibly exist without this flexibility.

But that doesn’t mean that there aren’t occasions where you hit a problem using things like inclusion or inheritance. This has been on my mind for a while.

Hmm

It’s my intuition that these test suites that we’re writing… well, they shouldn’t be classes. They don’t describe things that you can instantiate sensibly and that then have behaviour. They certainly don’t send messages to one another, like “proper objects” do. Classes are just convenient containers for these loosely-related essentially-procedural test bodies.

I believe that this intuition is what lies behind my interest in other test frameworks. From it springs all the ideas about composing or describing the systems under test in more dynamic or more natural ways.

In the next article, we’ll look at how RSpec works under the hood, and finally how Kintama does. Without having done the comparison yet, my guess is that they are very similar, but even within the alternate approach of block-defined tests there are many different paths you can take…

  1. For some reason this collection of classes is stored in a Hash, but it seems like the keys of the hash are the only aspect used, so I don’t understand why it isn’t an Array…

  2. …a.k.a. TestCase subclass, a.k.a. your actual tests. I’m not sure why the MiniTest code is riddled with references to ‘suites’, when the classes that it’s actually running are called TestCases. Perhaps it’s a compromise involving historic names of classes in test-unit?

  3. There are actually quite a few more methods called, but I’m ignoring hooks principally used by plugin authors.


"A framework for making decisions"

Written on January 25 2013 at 09:27 ∷ permalink

A lot of my time, energy and worry with regards to Free Range over the past year or so has been about trying to establish a direction, a goal, a sense of momentum. A shared and explicit purpose, of what we are trying to achieve and how we think about achieving it.

As we entered the new year, this stuff was very much at the front of my mind, and as serendipity would have it I came across this article by Steven Sinofsky (let’s not hold his Microsoft background against him, at least for now), with some words that felt particularly relevant.

Some quotes:

If you think about a team as a set of folks each coming to work to make difficult choices each and every day (and night!) then the critical element for the team is a shared and detailed sense of the overall plan for a product. Historically software planning has not matured to the degree of planning for most other engineering endeavors (construction, transportation). For the most part this is viewed as a positive—it is the “soft” part of software that makes it fun, agile, and in tune with the moment

But if each perspective on a team is maximizing their creativity and agility, it doesn’t take long for chaos to take over. And worse, if things get chaotic and don’t come together well then fingers start pointing.

It is often amazing how quickly the most well-intentioned folks working together can start to have that so-called natural tension turn into a genuine dysfunction.

For a project of any size that goes beyond a handful of people or involves any complexity, detailing the how and why of a product, not just the what, is a critical first step. The reality is that every member of the team benefits from the context and motivation for the project.

The point of a plan is to build a bridge made up of the how and why, not the what.

A plan for what is being built can sound so heavy or burdensome. It can be. It doesn’t have to be. Another word for a plan is a “framework for making decisions”.

It’s this last sentence that really resonates with me.

I don’t want a plan for Free Range so that we can decide rigidly upfront exactly what we’re going to do, followed by dogmatic execution.

I want a plan so that we have a framework for making decisions; so that we have a way of evaluating our choices that is based on some commonly agreed goals and principles. Without this, I can only see chaos taking the reins. And not the good kind either.


Ruby Manor 4 tickets sold out in about 12 hours, and it's great and worrying at the same time

Written on January 25 2013 at 07:29 ∷ permalink

On one hand it’s great that the Ruby Manor tickets this year sold out so quickly - about 12 hours for 250 tickets at £15 each. Our little-conference-that-could must have a good reputation, and people must be excited about it.

On the other hand, I wonder what else it means. I worry when tickets for a conference are selling before there’s any hint of what the content will be, which is almost always the case with Ruby Manor. I know a lot of people will buy tickets for an event based on what they heard about the previous incarnation, but how good a reason is that?

Too cheap to pass up?

And when the tickets are only £15, maybe it’s the rational thing to get a ticket without thinking too hard; even if the conference turns out to be rubbish, or you realise you can’t attend, then you’ve only lost the price of a quiet night out.

With Ruby Manor, though, it’s not enough to buy a ticket on a whim and then turn up on the day hoping for some fun presentations. We need everyone to get involved to build the schedule. Given that people tend to value things based on how much they paid for them, does that say anything about how people are going to invest in helping build our conference?

How many of you have read the Ruby Manor manifesto? I know some people really do get it, but I haven’t met most of the 250 people who will hopefully be joining us this year, so I really don’t know what most people think. I know that some of the people who helped us sell all our tickets won’t know anything about Ruby Manor except that it’s about Ruby, and it’s in London, and other people liked it. And that might be all they know at the end of the day on the 6th of April, too.

You can lead a horse to water…

In the Free Range weeknotes last week, Chris mentioned a parallel that I drew between things like indieweb and Vendor Relationship Management, but what I was thinking about when we had that conversation was that I am coming to terms with how hard (and maybe impossible) it is to change the way people think about conferences (or software), at all.

Chris wants people to realise that they can have much more control over what software services do with their data, and how they are built. I wanted conference attendees to realise that there was an alternative to expensive conferences with tired content and pointless swag.

What I’ve come to realise, however, is that even when you try and demonstrate that different is possible, and maybe even better, a lot of people will still approach what you’re doing in the same way as any other contemporary example (be that a conference or a software service). Even if something is better, and you can produce a coherent, concise and interesting argument for why it is better, and why someone should care, even with the most perfect meme to implant in their minds… it often still doesn’t make a difference.

It’s really hard to get people to recognise (let alone embrace) philosophical agendas; what I’ve come to accept over the 5+ years I’ve been thinking about Ruby Manor, is that some people just don’t care, and never will.

What I need to get better at is finding that minority who do care – even if they disagree – and build on that.

Anyway, there’s your slice of my brain for today. Do with it what you will.


Tests as documentation for Kintama

Written on January 15 2013 at 17:30 ∷ permalink

One of my goals when working on Kintama has been to drive the implementation using examples of actual behaviour as seen from someone using the framework.

In other words, I’ve been writing a lot of acceptance tests to ensure that as I wildly refactor the internals (from class-based to block-based and back), I can be confident that the outward behaviour of the testing DSL still works.

There are some unit tests, and I’d like to add more because it’s also one of my goals to make it clear how developers can use and manipulate Kintama building blocks (contexts, tests) to the best effect when testing their own applications. In my experience, customising a testing DSL to your specific application can really help improve the clarity and robustness of tests, and I’d like to make it as easy as possible to do that.

That said, the acceptance tests are still extremely important to me, and all new features are driven by expressing the behaviour as I’d like to use it.

Acceptance tests as usage documentation

Anyway, I’ve been doing some refactoring of the tests themselves recently, with the new aim of also using them as a sort-of living documentation. My hope is that any developer could quickly look at these “usage” tests and see how the assertions, or the nesting, or even the “action” behaviour that I mentioned in Setup vs. Action for Ruby tests behaves and how it should be used.

I had thought about using Cucumber to do this and then publish the features on Relish, but it seemed like doing so was going to introduce a large roundtrip, where the cucumber tests would end up writing test files, then running them and then slurping in and parsing the output.

One of the main reasons why I’ve been able to get away with such a reliance on acceptance tests is because I can run them very, very quickly (the whole suite is just over 2 seconds), so I am not very keen on slowing that down. I’m also not keen to lose things like syntax highlighting of the Kintama tests themselves, which would also happen if they were written as strings within Cucumber.

Writing Kintama acceptance tests in Test/Unit

Thankfully, we can make some big improvements without having to lose any of the speed or editor-friendliness of the tests. Here’s an example of one of the new Kintama “usage” tests, which are a suite of acceptance tests written to demonstrate the behaviour of various features:

def test_should_pass_when_all_tests_pass
  context "Given a test that passes" do
    should "pass this test" do
      assert true
    end

    should "pass this test too" do
      assert true
    end
  end.
  should_output(%{
    Given a test that passes
      should pass this test: .
      should pass this test too: .
  }).
  and_pass
end

This test covers the basic behaviour of a Kintama test (which you can see follows squarely in the footsteps of shoulda or RSpec style tests). It shows how to define a context, how to write tests in one, what the output you’ll get will be, and that the test should ultimately pass.

You can peruse all of the usage tests on GitHub.

They work up from basic usage through setup and teardown to exceptions, expectations and mocking all the way to the experimental action behaviour that inspired Kintama in the first place.

Running Kintama within existing Ruby processes

This is a Test::Unit test, but it’s actually generating and running the Kintama test it describes, without resorting to writing a file and then shelling out to a new Ruby process. Being able to write tests like this also makes it easier for them to communicate how each feature should be used, because every test contains the actual Kintama expression of how to use that feature.

This is possible in no small part because Kintama is designed to produce independent, executable chunks of code. The context method in Kintama, which defines a collection of tests, also returns an object which has methods that can be used to run and introspect against those tests, and their failures [1]. Only a tiny piece of scaffolding is used, and that is only responsible for adding the syntactically nice methods for asserting output and how many tests passed.
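To make the idea of "a context that returns an object you can run and interrogate" concrete, here is a rough, self-contained sketch. It is not Kintama's implementation: ToyContext, ChainedExpectations and toy_context are names invented for this illustration, and the real scaffolding in the Kintama test suite may well look different. It only shows how a handful of chainable methods can sit on top of a runnable object.

```ruby
# A toy stand-in for a Kintama-style context. Hypothetical: ToyContext,
# ChainedExpectations and toy_context are illustration names, not Kintama API.
class ToyContext
  attr_reader :failures, :ran

  def initialize(name, &block)
    @name = name
    @tests = {}
    @failures = {}
    @ran = 0
    instance_eval(&block) # evaluate the body so that `should` registers tests
  end

  # Register a test under a readable name.
  def should(description, &block)
    @tests["should #{description}"] = block
  end

  # Run each test against a fresh object, recording any exception message.
  def run
    @failures = {}
    @ran = 0
    @tests.each do |test_name, body|
      @ran += 1
      begin
        Object.new.instance_eval(&body)
      rescue StandardError => e
        @failures[test_name] = e.message
      end
    end
    self
  end

  def passed?
    @failures.empty?
  end
end

# The "scaffolding": each method checks something and returns self,
# which is what allows the calls to be chained after `end.`
module ChainedExpectations
  def should_run_tests(expected)
    raise "expected #{expected} tests, ran #{ran}" unless ran == expected
    self
  end

  def and_pass
    raise "expected a pass, got #{failures.inspect}" unless passed?
    self
  end

  def should_fail
    raise "expected a failure" if passed?
    self
  end

  def with_failure(message)
    raise "no failure with message #{message.inspect}" unless failures.value?(message)
    self
  end
end

# Build the context, run it in-process, and add the chainable expectations.
def toy_context(name, &block)
  ToyContext.new(name, &block).run.extend(ChainedExpectations)
end

toy_context "Given a test that passes" do
  should("pass this test") { raise "arithmetic is broken" unless 1 + 1 == 2 }
end.
  should_run_tests(1).
  and_pass
```

Everything happens inside the current Ruby process, so there are no files to write and nothing to shell out to, which is the property that keeps this style of acceptance test fast.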

Anyway, more examples. Here’s a test describing the behaviour of nested contexts with setups and tests at different levels:

def test_should_only_run_necessary_setups_where_tests_at_different_nestings_exist
  context "Given a setup in the outer context" do
    setup do
      @name = "jazz"
    end

    context "and another setup in the inner context" do
      setup do
        @name += " is amazing"
      end

      should "run both setups for tests in the inner context" do
        assert_equal "jazz is amazing", @name
      end
    end

    should "only run the outer setup for tests in the outer context" do
      assert_equal "jazz", @name
    end
  end.
  should_run_tests(2).
  and_pass
end

The test verifies that two tests ran (should_run_tests(2)), and that the context passed (and_pass).
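The behaviour being described, where an inner test sees both setups and an outer test sees only its own, can be sketched with a small chain of scopes. This is an assumption-laden illustration rather than how Kintama does it internally; SetupScope and its methods are hypothetical names.

```ruby
# Hypothetical sketch of nested setups; SetupScope is not a Kintama class.
class SetupScope
  def initialize(parent = nil, &setup)
    @parent = parent
    @setup = setup
  end

  # Collect setups from the outermost scope inwards.
  def setups
    inherited = @parent ? @parent.setups : []
    @setup ? inherited + [@setup] : inherited
  end

  # Run the collected setups against a fresh object, so that each test
  # gets its own instance variables.
  def environment
    env = Object.new
    setups.each { |setup| env.instance_eval(&setup) }
    env
  end
end

outer = SetupScope.new { @name = "jazz" }
inner = SetupScope.new(outer) { @name += " is amazing" }

outer_name = outer.environment.instance_variable_get(:@name)
inner_name = inner.environment.instance_variable_get(:@name)
```

Here outer_name only reflects the outer setup, while inner_name reflects both, in order, which is the property the usage test above pins down.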

Because I already have some unit tests, and other basic usage tests, I can be confident that simple testing behaviour already works and so lots of the higher-level tests can verify their behaviour using the Kintama tests themselves. If any of the assertions within the Kintama test fail, such as the assert_equal "jazz", @name above, then the test will not pass.

Here’s another test that checks exceptions raised in the teardown (which always runs) don’t mask any exceptions within the test itself:

def test_should_not_mask_exceptions_in_tests_with_ones_in_teardown
  context "Given a test and teardown that fails" do
    should "report the error in the test" do
      raise "exception from test"
    end

    teardown do
      raise "exception from teardown"
    end
  end.
  should_fail.
  with_failure("exception from test")
end

Here we can actually make assertions about which test failed, and what the failure message was.
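One plausible way to get this "teardown always runs, but never hides the original error" behaviour is to remember the first exception and only fall back to the teardown's exception if the test itself was clean. The following is a minimal sketch under that assumption; run_with_teardown is a made-up helper, not Kintama's code.

```ruby
# Hypothetical helper: runs a test and then its teardown, returning the
# first exception raised (or nil if neither raised anything).
def run_with_teardown(test, teardown)
  error = nil

  begin
    test.call
  rescue StandardError => e
    error = e
  end

  begin
    teardown.call # always runs, even if the test raised
  rescue StandardError => e
    error ||= e # only report the teardown error if the test was clean
  end

  error
end

failure = run_with_teardown(
  -> { raise "exception from test" },
  -> { raise "exception from teardown" }
)
```

With both lambdas raising, failure holds the exception from the test; if only the teardown raised, it would hold that one instead.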

I am very pleased with how this is turning out.

  1. Lower-level unit tests, as mentioned above, will help demonstrate exactly how to poke these test objects for maximum effect, so that it’s clearer how to compose Kintama objects dynamically.


Weeknotes 1805

Written on January 07 2013 at 12:04 ∷ permalink

Ha! Remember weeknotes? Hilarious.

Anyway, I wanted to keep writing, and this seems like the best structure for doing so. Also, I couldn’t think of a snappy title for what’s been happening this week. By my reckoning, last week was Week 1805. Apparently I started the count at the week I was born; that seems a bit silly in retrospect, but it’s all arbitrary in the end.

Last week I spent a lot of time getting various things in order, including this site. I broke up a bunch of draft posts that had been cluttering up my git st for probably more than a year, and pseudo-published them. They’re still not exactly finished, but at least they are something.

Manor Fo[u]rth

We are achingly close to being able to announce the date and venue for Ruby Manor 4, at which point we can also start selling tickets and begin the Vestibule-powered CFP process. We also need to write some stuff about how the CFP is going to work, particularly in the light of all the recent minority involvement discussion.

Yee-haw

Meanwhile, I’m in Austin for a few weeks, planning my personal geographic strategy for the next year. However that works out, I predict a significant increase in frequent flier miles.

I have a pay-as-you-go SIM via AT&T for use when I’m in the USA, but recently they’ve changed their plans such that the only way to get data if you have a smartphone is to pay, like, $25 a month. There’s no other way to get PAYG data if you have a smartphone. That’s a big pain in the ass when you’re trying to get some information from a tweet while wandering around near 6th Street.

I haven’t done any real programming in 2013 yet. Instead, I’ve been writing a bit more about what my hopes and plans for Free Range are for 2013 and beyond, as have the others, and we’ll be discussing our output on Tuesday. The idea is that we’ll use this as the raw material from which to distil and refine our company direction – our strategy – for the next year.

I’ve been thinking a lot about our company dynamic and direction, and I think it’s fair to say that I’ve been the most vocal about these things internally, but it’s still a challenge to hone my thoughts and intuitions into useful fodder. Who knows, maybe I think too much about it; I don’t think so though. I’m just keen to actually get things done.

So, we’ll talk about that on Tuesday morning (my afternoon) and see what transpires. Video into the office hasn’t been working very well for anyone, which seems to point the finger of connectivity blame pretty squarely at the office DSL; something else to sort out, but the pain is only experienced by those not in the best position to actually do anything about it…