If you haven’t seen the video, then I’d still strongly encourage you to read the blog post. While I can now see the inspiration for wanting to discuss these ideas1, the post really does stand on it’s own.
I’m not going to re-explain it here, so yes, really, go read it now.
What I found really interesting here was the idea of building new enumerators by re-combining existing enumerators. I’ll use a different example, one that is perhaps a bit too simplistic (there are more concise ways of doing this in Ruby), but hopefully it will illustrate the point.
Let’s imagine you have an Enumerator which enumerates the numbers from 1 up to 10:
You can now use that to do all sorts of enumerable things like mapping, selecting, injecting and so on. But you can also build new enumerables using it. Say, for example, we now only want to iterate over the odd numbers between 1 and 10.
We can build a new Enumerator that re-uses our existing one:
> odd_numbers = Enumerator.new do |yielder|
numbers.each do |number|
yielder.yield number if number.odd?
=> #<Enumerator: #<Enumerator::Generator:0x007fc0b38de6b0>:each>
So, that’s quite neat (albeit somewhat convoluted compared to 1.upto(10).select(&:odd)). To extend this further, let’s imagine that I hate the lucky number 7, so I also don’t want that be included. In fact, somewhat perversely, I want to stick it right in the face of superstition by replacing 7 with the unluckiest number, 13.
Yes, I know this is weird, but bear with me. If you have read Tom’s post (go read it), you’ll already know that this can also be achieved with a new enumerator:
> odd_numbers_that_arent_lucky = Enumerator.new do |yielder|
odd_numbers.each do |odd_number|
if number == 7
=> #<Enumerator: #<Enumerator::Generator:0x007fc0b38de6b0>:each>
StopIteration: iteration reached an end
In Tom’s post he shows how this works, and how you can further compose enumerators to to produce new enumerations with specific elements inserted at specific points, or elements removed, or even transformed, and so on.
A hidden history of enumerable transformations
What I find really interesting here is that somewhere in our odd_numbers enumerator, all the numbers still exist. We haven’t actually thrown anything away permanently; the numbers we don’t want just don’t appear while we are enumerating.
The enumerator odd_numbers_that_arent_lucky still contains (in a sense) all of the numbers between 1 and 10, and so in the tree composition example in Tom’s post, all the trees he creates with new nodes, or with nodes removed, still contain (in a sense) all those nodes.
It’s almost as if the history of the tree’s structure is encoded within the nesting of Enumerator instances, or as if those blocks passed to Enumerator.new act as a runnable description of the transformations to get from the original tree to the tree we have now, invoked each time any new tree’s children are enumerated over.
I think that’s pretty interesting.
Notes on ‘Catamorphisms’
In the section on Catamorphisms (go read it now!), Tom goes on to show that recognising similarities in some methods points at a further abstraction that can be made – the fold – which opens up new possibilities when working with different kinds of structures.
What’s interesting to me here isn’t anything about the code, but about the ability to recognise patterns and then exploit them. I am very jealous of Tom, because he’s not only very good at doing this, but also very good at explaining the ideas to others.
Academic vs pragmatic programming
This touches on the tension between the ‘academic’ and ‘pragmatic’ nature of working with software. This is something that comes up time and time again in our little sphere:
Now I’m not going to argue that anyone working in software development should have a degree in Computer Science. I’m pretty sympathetic with the idea that many “Computer Science” degrees don’t actually bear much of direct resemblance to the kinds of work that most software developers do2.
Ways to think
What I think university study provides, more than anything else, is exposure and training in ways to think that aren’t obvious or immediately accessible via our direct experience of the world. Many areas of study provide this, including those outside of what you might consider “science”. Learning a language can be learning a new way to think. Learning to interpret art, or poems, or history is learning a new way to think too.
Learning and internalising those ways to think give perspectives on problems that can yield insights and new approaches, and I propose that that, more than any other thing, is the hallmark of a good software developer.
Going back to the blog post which, as far I know, sparked the tweet storm about “programming and maths”, I’d like to highlight this section:
At most academic CS schools, the explicit intent is that students learn programming as a byproduct of learning CS. Programming itself is seen as rather pedestrian, a sort of exercise left to the reader.
For actual developer jobs, by contrast, the two main skills you need these days are programming and communication. So while CS still does have strong ties to math, the ties between CS and programming are more tenuous. You might be able to say that math skills are required for computer science success, but you can’t necessarily say that they’re required for developer success.
What a good computer science (or maths or any other logic-focussed) education should teach you are ways to think that are specific to computation, algorithms and data manipulation, which then
provide the perspective to recognise patterns in problems and programs that are not obvious, or even easily intuited, and might otherwise be missed.
provide experience applying techniques to formulate solutions to those problems, and refactorings of those programs.
Plus, it’s fun to achieve that kind of insight into a problem. It’s the “a-ha!” moment that flips confusion and doubt into satisfaction and certainty. And these insights are also interesting in and of themselves, in the very same way that, say, study of art history or Shakespeare can be.
So, to be crystal clear, I’m not saying that you need this perspective to be a great programmer. I’m really not. You can build great software that both delights users and works elegantly underneath without any formal training. That is definitely true.
Back to that quote:
the ties between CS and programming are more tenuous … you can’t necessarily say that they’re required for developer success.
All I’m saying is this: the insights and perspectives gained by studying computer science are both useful and interesting. They can help you recognise existing, well-understood problems, and apply robust, well-understood and powerful solutions.
That’s the relevance of computer science to the work we do every day, and it would be a shame to forget that.
In the last 15 minutes or so of the video, the approach Tom uses to add a “child node” to a tree is interesting but there’s not a huge amount of time to explore some of the subtle benefits of that approach↩
Which is, and let’s be honest, a lot of “Get a record out of a database with an ORM, turn it into some strings, save it back into the database”.↩
I’ve been toying around with Docker for a while now, and I really like it.
The reason why I became interested, at first, was because I wanted to move the Printer server and processes to a different VPS which was already running some other software (including this blog), but I was a bit nervous about installing all of the dependencies.
What if something needed an upgraded version of LibXML, but upgrading for one application ended up breaking a different one? What if I got myself in a tangle with the different versions of Ruby that different applications expect?
If you’re anything like me, servers tend to be supremely magical things; I know enough of the right incantations to generally make the right things appear in the right places at the right time, but it’s so easy to utter the wrong spell and completely mess things up1.
Docker provides an elegant solution for these kinds of worries, because it allows you to isolate applications from each other, including as many of their dependencies as you like. Creating images with different versions of LibXML, or Ruby, or even PostgreSQL is almost trivially easy, and you can be confident that running any combination will not cause unexpected crashes and hard-to-trace software version issues.
However, while Docker is simple in principle, it’s not trivial to actually deploy with it, in the pragmatic sense.
What I mean is getting to a point where deploying a new application to your server is as simple (or close enough to it) as platforms like Heroku make it.
Now, to be clear, I don’t specifically mean using git push to deploy; what I mean is all the orchestration that needs to happen in order to move the right software image into the right place, stop existing containers, start updated containers and make sure that nothing explodes as you do that.
But, you might say, there are packages that already help with this! And you’re right. Here are a few:
Dokku, the original minimal Heroku clone for Docker
Orchard, a managed host for your Docker containers
…and many more. I don’t know if you’ve heard, but Docker is, like, everywhere right now.
So why not use one of those?
That’s a good question
Well, a couple of reasons. I think Orchard looks great, but I like using my own servers for software when I can. That’s just my personal preference, but there it stands, so tools like Orchard (or Octohost and so on) are not going to help me at the moment.
Dokku is good, but I’d like to have a little more control of how applications are deployed (as I said above, I don’t really care about git push to deploy, and even in Heroku it can lead to odd issues, particularly with migrations).
Flynn isn’t really for me, or at least I don’t think it is. It’s for dev-ops running apps on multiple machines, balancing loads and migrating datacenters; I’m running some fun little apps on my personal VPS. I’m not interested in using Docker to deploy across multiple, dynamically scaling nodes; I just want to take advantage of Docker’s isolation on my own, existing, all-by-its-lonesome VPS.
But, really more than anything, I wanted to understand what was happening when I deploy a new application, and be confident that it worked, so that I can more easily use (and debug if required) this new moving piece in my software stack.
So I’ve done some exploring
I took a bit of time out from building Harmonia to play around more with Docker, and I’d like to write about what I’ve found out, partly to help me make it concrete, and partly because I’m hoping that there are other people out there like me, who want some of the benefits that Docker can bring without necessarily having to learn so much about using it to it’s full potential in an ‘enterprise’ setting, and without having to abandon running their own VPS.
There ought to be a simple, easy path to getting up and running productively with Docker while still understanding everything that’s going on. I’d like to find and document it, for everyone’s benefit.
If that sounds like the kind of think you’re interested in, and would like reading about, please let me know. If I know there are enough people interested in the idea, then I’ll get to work.
It was really interesting to read the news about the potential cancellation of Wicked Good Ruby, a Ruby conference that was due to run for a second time in Boston this year:
Rather than half-ass a conference we’ve decided to cancel it. All tickets will be fully reimbursed. I’ve already reached out to those that have purchased with how that will be done.
There’s lots of interesting stuff in this thread, but one point that jumped out to me was the financial risk involved:
Zero sponsors. […] I had 2 companies inquire about sponsorship. One asked to sponsor but never paid the sponsorship invoice after 82 days.
Last year we lost $15k. In order to made it worth our effort this year we needed to make a profit. The conference we had in mind required a $50k sponsorship budget with an overall budget of close to $100k. (last year’s conference cost about $125k) Consider to date, after 6 months, we have received $0 in sponsorship the financial risk was too high.
Since announcing this a few hours ago I’ve been contacted by 3 other regional conference organizers. They are all having similar issues this year. Sponsorship is incredibly difficult to come by […] I didn’t get the sense they were going to bail but I think this is a larger issue than just Boston.
With costs in the tens or even hundreds of thousands of dollars, it’s really no wonder that organisers might have second thoughts.
But does running a conference really need to come with such an enormous financial risk? With Ruby Manor, our biggest budget was less than £2,000 (around $3,000). Here’s why:
We don’t use expensive ‘conference’ venues…
… so we aren’t locked into their expensive conference catering.
We run a single track for a single day, so we only need one main room.
We meticulously pare away other expenses like swag, t-shirts, catering and so on, where it’s clear that attendees are perfectly capable of handling those themselves.
Now, it could be that there are a few factors unique to Ruby Manor that make it difficult, or even impossible, for other conferences to follow the same model. For instance, holding the event in the center of a big city like London means there’s already a wide range of lunch options for attendees, right outside the venue.
Another problem could be the availability of university and community venues as an alternative to conference centres. I really don’t know if it’s possible or not to rent spaces like this in other cities or countries. A quick look at my local university indicates that it’s totally feasible to rent a 330-seat auditorium for less than $1,000, and even use their coffee/tea catering for a total of less than $3,0001, all-in.
I would be genuinely fascinated for other conferences to start publishing their costing breakdown. LessConf have already done this, and even though I might not have chosen to spend quite so much money on breakfasts and surprises, I genuinely do applaud their transparency for the benefit of the community.
I think I’ve come up with a plan: reduce the conference from 2 nights to 1 night. Cut out many of the thrills that I was planning. This effectively would reduce our operational costs from ~$100k to around ~$50k. This would also allow us to reduce the ticket prices (I would reimburse current tickets for the difference).
I genuinely wish the organisers the best of luck. It’s a tough gig, running a conference. That said, $50,000 is still an enormous amount of money, and I cannot help but feel that it’s still far higher than it needs to be.
Every hour you spend as a conference organiser worrying about sponsorships or ticket sales or other financial issues, is an hour that could be spent working on making the content and the community aspects as good as they can be.
Let’s not forget: the only really important part of a conference is getting a diverse group of people with shared interests into the same space for a day or two. Everything else is just decoration. A lot of the time, even the presentations are just decoration. It’s getting the community together that’s important.
I realise that many people expect, consciously or otherwise, some or all of the peripheral bells and whistles that surround the core conference experience. For some attendees, going to a conference might even be akin to a ‘vacation’ of sorts. Perhaps a conference without $15,000 parties feels like less of an event… less of an extravaganza.
But consider this: given the choice between a glitzy, dazzling extravaganza, and a solid conference that an organiser can run without paralysing fear of financial ruin, I know which I would choose, and I know which ultimately benefits the community more.
To be clear, they require that you cannot make a profit on events they host, but from what I can tell, most conference organisers don’t wish to run their events for profit anyway.↩
This irregularly-kept blog thing is getting so irregular it’s almost regularly irregular. Lest you fool yourself into presuming any ability to predict when I might write here (i.e. “never”), let me thwart your erroneous notion by serving up this rambling missive, unexpected and with neither warning nor warrant!
Take that, predictability! Chalk another one up for chaos.
I live in Austin now, or at least, for the moment
In summer last year, I moved from London to Austin. My other half had already been living in Austin for two years, and I’d been flying back and forth ever other month or so, but 24 months of that is quite enough, and so I bit the bullet and gave up my London life for Stetsons and spurs. It’s been quite an adventure so far, although I do miss a few people back in the UK.
We don’t know how long we’ll stay here, but for the moment I’m enjoying the weather immensely. You might think it’s a balmy summer in London when temperatures regularly reach 20°C, but during Austin’s summer the temperature never drops below 20°C, even at night. Can you even conceive of that?
Farewell Go Free Range, Howdy Exciting
Leaving the UK also marked the beginning of the end of the Free Range experiment for me. Ever since I started kicking the ideas for Free Range around in late 2008, I’ve always considered it an experiment. I wish I could say that it had been a complete success, and it was very successful in a number of important ways, but if you’re going to pour a lot of energy into trying to make something for yourself, it really needs to be moving in the direction you feel is valuable.
I wrote a bit more on the Exciting blog and the Free Range blog. I could say a lot more about this – indeed, I have actually written thousands of words in various notepads and emails – but I’ll save that for another time; my memoirs, maybe. It takes a lot of effort to keep a small boat moving in a consistent direction through uncharted waters.
So: I’ve spun out a lot of the projects I was driving into a new umbrella-thing called Exciting, and that’s the place to look at what I’m working on.
Right, but what are you doing exactly?
Well, I’m still doing the occasional bit of client work, and I have some tinkering projects that I need to write more about, but this year I have mostly been1 working on Harmonia. Although it was built for Free Range, it’s still both a product and a way of working that I believe lots of teams and companies could benefit from. I’ve added a fair bit of new functionality (more calendar and email functionality, webhooks, Trello integration) and fixed a whole bunch of usability issues, the lion’s share of the work has been in working on how to communicate what Harmonia does and how it could help you.
I really cannot overstate the amount of effort this takes. It’s exhausting. Some of that is doubtless because I am a developer by training, and so I’ve had to learn about design, user experience, copywriting, marketing, and everything in between, as I’ve gone along. It’s very much out of my comfort zone, but it’s simply not possible to succeed without them.
You can build the best product in the world, but if nobody knows it exists or can quickly determine whether or not it’s something they want… nobody will ever use it.
The good thing is that the web is the ideal medium for learning all this, because you can do it iteratively. I’ve completely redesigned the Harmonia landing page four times since it became my main focus, and each time I think it’s getting better at showcasing the benefits that it can bring. It still needs more work though; as of writing this I think it’s too wordy and the next revision will pare it down quite a bit.
It’s also a real challenge to keep up this effort while working alone. It’s hard to wrangle a group of people into a coherent effort, but once you succeed then it provides a great support structure to keep motivation up and to share and develop ideas. Working by yourself, you don’t have to deal with anyone else’s whims or quirks, but you also only have yourself to provide motivation and reassurance that what you’re working on is valuable, and that the decisions you’re making are sensible.
What’s more, it can be challenging to stay motivated in a world seemingly filled by super-confident, naturally-outgoing entrepreneur types. I don’t operate with the unshakable confidence that what I’m building is awesome, or that my ideas are worth sharing, or that they even have any particular merit.
Confronted with the boundless crowd of what-seems-typical Type A entrepreneurs that fill the internet, prolifically, with their reckons and newsletters and podcasts and blog posts and courses, I often wonder about the simpler life of just being an employee; of offloading the risk onto someone else in exchange for a steady incoming and significantly less crippling self-doubt. I’m sure I’m not the only person who feels this way, whose skin crawls as yet another service or blog post or person is cheaply ascribed the accolade “awesome” when really, I mean, really, is it? Is it?
But… there’s still a chance that I might be able to carve out a niche of my own, and maybe build another little company of friends working on small things that we care about, and investing in our future collectively. I still believe that’s a possibility.
Anyway, so that’s what I’m doing, mostly: crushing it2.
Couldn’t resist this turn of phrase; see here if you don’t immediately get it.↩
I’ve been trying to largely ignore the recent TDD discussion prompted by DHH’s RailsConf 2014 keynote. I think I can understand where he’s coming from, and my only real concern with not sharing his point of view is that it makes it less likely that Rails will be steered in a direction which makes TDD easier. But that’s OK, and if my concern grows then I have opportunities to propose improvements.
I don’t even really mind what he’s said in his latest post about unit testing the Basecamp codebase. There are a lot of Rails applications – including ones I’ve written – where a four-minute test suite would’ve been a huge triumph.
I could make some pithy argument like:
… but let’s be honest, four minutes for a substantial and mature codebase is pretty excellent in the Rails world.
So that is actually pretty cool.
A lot of that speed is no doubt because Basecamp is using fixtures: test data that is loaded once at the start of the test run, and then reset by wrapping each test in a transaction and rolling back before starting the next test.
This can be a benefit because the alternative – assuming that you want to get some data into the database before your test runs – is to insert all the data required for each test, which can potentially involve a large tree of related models. Doing this hundreds or thousands of times will definitely slow your test suite down.
(Note that for the purposes of my point below, I’m deliberately not considering the option of not hitting the database at all. In reality, that’s what I’d do, but let’s just imagine that it wasn’t an option for a second, yeah? OK, great.)
So, fixtures will probably make the whole test suite faster. Sounds good, right?
The problem with fixtures
I feel like this is glossing over the real problem with fixtures: unless you are using independent fixtures for each test, your shared fixtures have coupled your tests together. Since I’m pretty sure that nobody is actually using independent fixtures for every test, I am going to go out on a limb and just state:
Fixtures have coupled your tests together.
This isn’t a new insight. This is pain that I’ve felt acutely in the past, and was my primary motivation for leaving fixtures behind.
Say you use the same ‘user’ fixture between two tests in different parts of your test suite. Modifying that fixture to respond to a change in one test can now potentially cause the other test to fail, if the assumptions either test was making about its test data are no longer true (e.g. the user should not be admin, the user should only have a short bio, or so on).
If you use fixtures and share them between tests, you’re putting the burden of managing this coupling on yourself or the rest of your development team.
Going back to DHH’s post:
Why on earth would you run your entire test harness for every single line change in a particular model? If you have so little confidence in the locality of your changes, the tests are indeed telling you that the system has overly high coupling.
What fixtures do is introduce overly high coupling in the test suite itself. If you make any change to your fixtures, I do not think it’s possible to be confident that you haven’t broken a single test unless you run the whole suite again.
Fixtures separate the reason test data is like it is from the tests themselves, rather than keeping them close together.
I might be wrong
Now perhaps I have only been exposed to problematic fixtures, and there are techniques for reliably avoiding this coupling or at least managing it better. If that’s the case, then I’d really love to hear more about them.
Or, perhaps the pain of managing fixture coupling is objectively less than the pain of changing the way you write software to both avoid fixtures AND avoid slowing down the test suite by inserting data into the database thousands of times?
I’ve been playing with Raygun.io over the last day or so. It’s a tool, like Honeybadger, Airbrake or Errbit, for managing exceptions from other web or mobile applications. It will email you when exceptions occur, collapse duplicate errors together, and allows a team to comment and resolve exceptions from their nicely designed web interface.
I’ve come to the conclusion that integrating with something like this is basically a minimum requirement for software these days. Previously we might’ve suggested an ‘iterative’ approach of emailing exceptions directly from the application before later using one of these services, but I no longer see the benefit in postponing a better solution when it’s far simpler to work with one of these services than it is to set up email reliably.
It seems pretty trivial to integrate with a Rails application – just run a generator to create the initializer complete with API key. However, I had to do a bit more work to hook it into a Rack application (which is what Vanilla is). In my config.ru:
# Setup Raygun and add it to the middlewarerequire'raygun'Raygun.setupdo|config|config.api_key=ENV["RAYGUN_API_KEY"]enduseRaygun::RackExceptionInterceptor# Raygun will re-raise the exception, so catch it with something elseuseRack::ShowExceptions
The documentation for this is available on the Raygun.io site, but at the moment the actual documentation link on their site points to a gem, which more-confusingly isn’t actually the gem that you will have installed. Reading the documentation in the gem README also reveals how to integrate with Resque, to catch exceptions in background jobs.
One thing that’s always worth checking when integrating with exception reporting services is whether or not they support SSL, and thankfully it looks like that’s the default (and indeed only option) here.
The Raygun server also sports a few plugins (slightly hidden under ‘Application Settings’) for logging exception data to HipChat, Campfire and the like. I’d like to see a generic webhook plugin supported, so that I could integrate exception notification into other tools that I write; thankfully that’s the number one feature request at the moment.
My other request would be that the gem should try not to depend on activesupport if possible. I realise for usage within Rails, this is a non-issue, but for non-Rails applications, loading ActiveSupport can introduce a number of other gems that bloat the running Ruby process. As far as I can tell, the only methods from ActiveSupport that are used are Hash#blank? (which is effectively the same as Hash#empty?) and String#starts_with? (which is just an alias for the Ruby-default String#start_with?). Pull request submitted.
I was lucky enough to be gifted a Twine by my colleagues at Go Free Range last weekend, and I took the opportunity to put together a very simple service that demonstrates how it can be used.
If you haven’t heard of Twine, it’s a hardware and software platform for connecting simple sensors to the internet, and it makes it really very easy to do some fun things bridging the physical and online worlds.
On the hardware side, there’s a simple 2.7” square that acts as a bridge between your home WiFi network and a set of sensors.
Some of the sensors are built in to the square itself: temperature, orientation and vibration can be detected without plugging anything else in. You can also get external sensors, which connect to the square via a simple 3.5mm jack cable. If you buy the full sensor package, you’ll get a magnetic switch sensor, a water sensor and a ‘breakout board’ that lets you connect any other circuit (like a doorbell, photoresistor, button and so on) to the Twine.
Connecting the Twine to a WiFi network is elegant and features a lovely twist: you flip the Twine on its “back”, like a turtle, and it makes its own WiFi network available.
Connect to this from your computer, and you can then give the Twine the necessary credentials to log on to your home network, and once you’re done, flip it back onto its “belly” again and it will be ready to use. I really loved this simple, physical interaction.
On the software side, Twine runs an online service that lets you define and store ‘rules’ to your connected Twine units. These rules take the form of when <X> then <Y>, in a similar style to If This Then That. So, with a rule like when <vibration stops> then <send an email to my phone>, you could pop the Twine on top of your washing machine and be alerted when it had finished the final spin cycle.
As well as emailing, the Twine can flash it’s LED, tweet, send you an SMS, call you, or ping a URL via GET or POST requests including some of the sensor information.
Supermechanical, the company that launched Twine about a year and a half ago via Kickstarter, maintains a great blog with lots of example ideas of things that can be done.
All technology tends towards cat…
Just as the internet has found its singular purpose as the most efficient conduit for the sharing of cat pictures, so will the Internet of Things realise its destiny by becoming entirely focussed on physical cats, in all their unpredictable, scampish glory.
It’s neat having something in your house tweet or send you an email, but I like making software so I decided to explore building a simple server than the Twine could interact with, and thus, “Pinky Status” was born:
What follows is a quick explanation of how easy it was.
I hooked up the magnetic switch sensor to the Twine, and then used masking tape to secure the sensor to the side of the catflap, and then the magnet to the flap itself.
That way, when “Pinky” (that’s our cat) opened the flap, the magnet moves away from the switch sensor and it enters the ‘open’ state. It’s not pretty, but it works.
Next, we need a simple rule so that the Twine knows what to do when the sensor changes:
When the sensor changes to open, two things happen. Firstly, I get an email, which I really only use for debugging and I should probably turn it off, except that it’s pretty fun to have that subject line appear on my phone when I’m out of the house.
Secondly and far more usefully, the Twine pings the URL of a very, very simple server that I wrote.
The key part is at the very bottom – as Twine makes a POST request, the server simply creates another Event record with an alternating status (‘in’ or ‘out’), and then some logic in the view (not shown) can tell us whether or not the cat is in or out of the house.
In more recent versions of the code I’ve moved to Rails because it’s more familiar, but also slightly easier to do things like defend against duplicate events (normally when the cat changes her mind about going outside when her head is already through the flap) and other peripheral things.
But don’t be dissuaded by Rails - it really was as trivial as the short script above , showing some novel information derived from the simple sensor attached to the Twine. Deploying a server is also very easy thanks to tools like Heroku.
A few hours idle work and the secret life of our cat is now a little bit less mysterious than it was. I really enjoyed how quick and painless the Twine was to setup, and I can highly recommend it if you’re perhaps not comfortable enough to dive into deep sea of Arduinos, soldering and programming in C, but would still like to paddle in the shallower waters of the “internet of things”.
What happens when RSpec runs, or, what I think about testing with blocks
require"rspec/autorun"describe"an object"dobefore:alldo@shared_thing=Object.newendbefore:eachdo@something=Object.newendit"should be an Object"firstname.lastname@example.org_an(Object)enddescribe"compared to another object"dobefore:eachdo@other=Object.newendit"should not be equal"email@example.com_not==@otherendendafterdo@something=nilendend
This is obviously extremely dull and pointless – just like the minitest one – but it contains just enough to exercise the major parts of RSpec that I care about. It’s actually slightly more sophisticated than the example that I used for MiniTest, because RSpec provides a couple of notable features that MiniTest doesn’t provide. Specifically, these are before :all setup blocks, and nested groups of tests12.
I’m not particularly interested in looking at the other obvious distinguishing features of RSpec, like matchers and the BDD-style “should” language, as these aren’t actually a part of the core RSpec implementation3.
The two hallmark attributes here that I am interested in are:
grouping test definitions within blocks (as opposed to classes)
defining test behaviour using blocks (as opposed to methods)
Running the test spec
The simplest way of running this spec would be to save as something_spec.rb and run it from the command-line.
$ ruby something_spec.rb
Finished in 0.00198 seconds
2 examples, 0 failures
[Finished in 0.5s]
So – what’s actually happening here?
As with the minitest example, the first line loads a special file within the test library that not only loads the library, but also installs an at_exit hook for Ruby to run when the interpreter exists.
In RSpec’s case, this is defined in RSpec::Core::Runner.autorun. This calls RSpec::Core::Runner.run with ARGV and the stderr and stdout streams.
In contrast with MiniTest, RSpec parses the options at this point, and will try to determine whether or not to launch using DRb. In most cases it will create an instance of RSpec::Core::CommandLine with the parsed options, and then calls run on that instance.
Within the run method, some setup happens (mostly preamble to be output by the reporter, which is set via the configuration). Then we iterate through all of the “example groups”, returned by RSpec::world.example_groups4.
Let’s take a diversion to see how things actually get intoRSpec::world.example_groups.
Your example groups
Consider our example spec again. At the top we have a call to describe:
The describe method is actually defined within the module RSpec::Code::DSL, but this module is extended into self at the top level of the running Ruby interpreter (which is main, a singleton instance of Object), making the methods in that module available to call in your spec files. You can actually see all of the modules that have been extended into this instance:
From this we can tell that the ancestors of Object are still just Kernel and BasicObject, but the ancestors of the specific instancemain includes a few extra modules from RSpec. Anyway, moving on…
describe and RSpec::Core::ExampleGroup
The describe method in RSpec::Core::DSL passes its arguments straight through to RSpec::Core::ExampleGroup.describe. This is where things get a little interesting. Within this inner describe method, a subclass of RSpec::Code::ExampleGroup is created, and given a generated name.
describe"a thing"do# your tests, um, I mean specsendRSpec::Core::ExampleGroup.constants# => [:Nested_1, :Extensions, :Pretty, :BuiltIn, :DSL, :OperatorMatcher, :Configuration]
The class that was created is there: Nested_1. For each describe at the top level, you’ll have a new generated class:
describe"a thing"do# your specsenddescribe"another thing"do# more specsendRSpec::Core::ExampleGroup.constants# => [:Nested_1, :Nested_2, :Extensions, :Pretty, :BuiltIn, :DSL, :OperatorMatcher, :Configuration]
After each subclass is created, it is “set up” via the set_it_up method, which roughly speaking adds a set of metadata about the group (such as which file and line it was defined upon, and perhaps some information about the class if it was called in the form describe SomeClass do ...), and stashes that within the created subclass.
More importantly, however, the block which was passed to describe is evaluated against this new subclass using module_eval.
The effect of using module_eval against a class is that the contents of the passed block are evaluated essentially as if they were within the definition of that class itself:
classLionel;endLionel.module_evaldodefhello?"is it me you're looking for?"endendLionel.new.hello?# => "is it me you're looking for?"
You can see above that the behaviour is effectively the same as if we’d defined the hello? method within the Lionel class without any “metaprogramming magic”5.
It’s because of module_eval that you can define methods within example groups:
describe"a thing"dodefinvert_phase_polarity# waggle the flux capacitor or somethingendendRSpec::Core::ExampleGroup::Nested_1.instance_methods(false)# false means don't include methods from ancestors# => [:invert_phase_polarity]
These methods are then effectively defined as part of the Nested_1 class that we are implicitly creating. This means that methods defined in this way can be called from within your specs:
describe"a method in an example group"dodefthe_method_in_question:resultendit"can be called from within a spec"dothe_method_in_question.should==:resultendend
We’ll see how this actually works a bit later. Knowing that the contents of the describe block are effectively evaluated within a class definition also explains what’s happening when the before methods are called:
Because this is evaluated as if it was written in a class definition, then before must be a method available on the ExampleGroup class. And indeed it is – RSpec::Code::ExampleGroup.before.
The before method actually comes from the module RSpec::Core::Hooks, which is extended into ExampleGroup. RSpec has a very complicated behind-the-scenes hook registry, which for the purposes of brevity I’m not going to inspect here..
The before method registers its block within that registry, to be retrieved later when the specs actually run.
Because I’m not going to really look too deeply at hooks, the call to the after method works in pretty much the same way. Here it is though, just because:
The spec itself
The next method that’s module_eval‘d within our ExampleGroup subclass is the it:
it"should be an Object"firstname.lastname@example.org_an(Object)end
Users of RSpec will know that you can call a number of methods to define a single spec: it, specifyexample, and others with additional meaning like pending or focus. These methods are actually all generated while RSpec is being loaded, by calls to define_example_method within the class definition of ExampleGroup. For simplicity’s sake (pending and focussed specs are somewhat outwith the remit of this exploration), we’ll only look at the simplest case.
When it is called, more metadata is assembled about the spec (again, including the line and file), and then both this metadata and the block are passed to RSpec::Core::Example.new, which stashes them for later.
Within our outer example group, we’ve nested another group:
describe"compared to another object"dobefore:eachdo@other=Object.newendit"should not be equal"email@example.com_not==@otherendend
Just as the top-level call to describe invokes a class method on RSpec::Core::ExampleGroup, this call will be invoked against the subclass of ExampleGroup (i.e. Nested_1) that our outer group defined. Accordingly, each call to describe defines a new subclass6, stored as a constant within the top-level class: Nested_1::Nested_1. This subclass is stored within an array of children in the outer Nested_1 class.
Within the definition, our before and it calls evaluate as before.
Your spec, as objects
So, for every describe, a new subclass of ExampleGroup is created, with calls to before and after registering hooks within that subclass, and then each it call defines a new instance of RSpec::Core::Example, and these are stored in an array called examples within that subclass.
We can even take a look at these now, for a simplified example:
Where example groups are nested, further subclasses are created, and stored in an array of children within their respective parent groups.
Phew. The detour we took when looking at this aspect of minitest was much shorter, but now that we understand what happened when our actual spec definition was evaluated, we can return to RSpec running and see how it’s actually exercised.
As we saw above, the describe method returns the created subclass of RSpec::Core::ExampleGroup, and when that is returned back in RSpec::Code::DSL#describe, the register method is called on it. This calls world.register with that class as an argument, where world is returned by RSpec.world and is an instance of RSpec::Core::World, which acts as a kind of global object to contain example groups, configuration and that sort of thing.
Calling register on the World instance stashes our Nested_1 class in an example_groups array within that world.
Our diversion is complete! You deserve a break. Go fetch a cup of your preferred delicious beverage, you have earned it!
Back in RSpec
OK, pop your brain-stack back until we’re in RSpec::Core::Commandline#run again. Our reporter did its preamble stuff, and we were iterating through @world.example_groups, whose origin we now understand.
For each example group, the run method is called on that class, with the reporter instance passed as an argument.
This gets a bit intricate, so I’m going to step through the method definition itself (for version 2.12.2) to help anchor things.
RSpec has a “fail fast” mode, where any single example failure will cause the execution of specs to finish as quickly as possible. Here, RSpec is checking whether anything has triggered this.
Next, the reporter is notified that an example group is about to start. The reporter can use this information to print out the name of the group, for example.
The run of the examples is wrapped in a block so it can catch any exceptions and handle them gracefully as you might expect.
The before :all hooks
The call to run_before_all_hooks is very interesting though, and worth exploring. A new instance of the example group is created. It is then passed into this method, where any “before all” blocks are evaluated against that instance, and then the values of any instance variables are stashed.
Consider our original example:
Given this, we’ll stash the value of @shared_thing (and the fact that it was called @shared_thing) for later use.
As you can see above, we can poke around with the innards of objects to our heart’s content. Who needs encapsulation, eh?
Why did RSpec have to create an instance of the example group class, only to throw it away after the before :all blocks have been evaluated? Because RSpec needs to evaluate the block against an instance of the example group so that it has access to the same scope (e.g. can call the same methods) as any of the specs themselves.
Running the example
Now we’re finally ready to run the examples:
To understand this, we need to look at the definition of run_examples:
This method iterates over each Example that was stored in the examples array earlier, filtering them according to any command-line parameters (though we are ignoring that here). The most relevant part for us lies in the middle:
Another new instance of the ExampleGroup subclass is created. Remember, RSpec created one instance of the class for the before :all blocks, but now it’s creating a fresh instance for this specific spec to be evaluated against.
Thinking back to how MiniTest works, there’s a striking parallel: where MiniTest would instantiate a new instance of the MiniTest::Unit::TestCase for each test method, RSpec is creating a new instance of the ExampleGroup subclass to evaluate each Example block against.
Instances of this class are used so that any methods defined as part of the spec definition are implicitly available as methods to be called in the “setup” and “test” bodies (see the module_eval section above). Not so different after all, eh?
Next, the instance variables that we stashed after evaluating the before :all blocks are injected (effectively using instance_variable_set as we saw above) into this new instance, which will allow the spec to interact with any objects those blocks created. It also means that these values are shared between every spec, and so interactions within one spec that changed the state of one of these instance variables will be present when the next spec runs.
Finally, the #run method on the Example subclass is called, passing the ExampleGroup instance and the reporter. Down one level we go, into Example#run…
The spec finally runs
Here’s the full definition of RSpec::Core::Example#run:
defrun(example_group_instance,reporter)@example_group_instance=example_group_instance@example_group_instance.example=selfstart(reporter)beginunlesspendingwith_around_each_hooksdobeginrun_before_each@example_group_instance.instance_eval(&@example_block)rescuePending::PendingDeclaredInExample=>e@pending_declared_in_example=e.messagerescueException=>eset_exception(e)ensurerun_after_eachendendendrescueException=>eset_exception(e)ensure@example_group_instance.instance_variables.eachdo|ivar|@example_group_instance.instance_variable_set(ivar,nil)end@example_group_instance=nilbeginassign_generated_descriptionrescueException=>eset_exception(e,"while assigning the example description")endendfinish(reporter)end
For our purposes, we again only need to consider a small part. Once all the reporter and “around” block housekeeping has taken place, the essential core of the example is run:
The call to run_before_each introspects the hook registry and evaluates every relevant before hook against the ExampleGroup instance. In effect, this will find any before blocks registered in this example group, and then any blocks registered in any parent groups, and evaluate them all in order, so that each nested before block runs.
Then, the spec block (stored in @example_block) is evaluated against the ExampleGroup instance. This is where your assertions, or matchers, are finally – finally! – evaluated.
If there was a problem, such as a matcher failing or an exception being raised, then the exception is stored against this Example for later reporting. Just as MiniTest assertions raise an exception when they fail, RSpec matchers raise an RSpec::Expectations::ExpectationNotMetError exception. It seems this is the universal way of halting execution when a test fails7. Another hidden similarity between RSpec and MiniTest!
As in MiniTest, whether or not the spec failed or an exception occured, an ensure section is used to guarantee that run_after_hooks is called, and any teardown is performed.
After the specs have run
Once all the specs in this example group have run, all the examples in any subclasses are run (recall that the inner describe stashed the nested ExampleGroup subclass in an array called children). We map each ExampleGroup subclass to the result of calling run on it, which starts this whole process again, for every nested example group. Whether or not this group passed or failed overall is then determined using simple boolean logic:
You can once again pop your brain-stack back until we’re in RSpec::Core::Commandline#run.
Having run all of the example groups, RSpec will do a little bit of tidy up, and finally return back up through the stack. Along the way printing the results of the run to the console is performed, before interpreter finally, properly quits.
the stashing of behaviour blocks, later evaluated using instance_eval against clean test-environment instances (see this section of the MiniTest article for what I mean by “test environment”);
using module_eval and subclassing to ensure method definition matches programmer expectation.
I would say these two aspects are the hallmark attributes of an RSpec-style test framework. The other notable aspect is the ability to nest example groups, and the subsequent necessity to be able to gather the implicit chain of setup blocks and evaluate them against the test environment instance, but this could be considered another example of using instance_eval.
Supporting method definition in example groups
One thing I’ve found particularly interesting is that RSpec ends up generating classes and subclasses behind the scenes. I believe this is almost entirely a consequence of wanting to support the kind of “natural” method definition within group bodies (see the module_eval section again).
If any test framework chose to not support this, there’s almost certainly no reason to create classes that map to example groups at all, and the setup and test blocks could be evaluated against a bare instance of Object.
Supporting nesting and dynamic composition
It’s clear that RSpec has more “features” (e.g. nesting, before :all and so on) than MiniTest (ignoring the many extensions available for MiniTest, the most sophisticated of which end up significantly modifying or replacing the MiniTest::Unit.run behaviour). I’m deliberately ignoring features like matchers, or a built-in mocking framework, because what I’m most interested in here are the features that affect the structure of the tests.
It’s certainly possible to implement features like nesting using subclasses and explicit calls to super, but this is the kind of plumbing work that Ruby programmers are not accustomed to accepting. By separating creation of tests from Ruby’s class implementation, the implicit relationships between groups of tests can take this burden instead, and behaviours like before :all, which have no natural analogue in class-based testing, are possible.
Now, you may believe that nesting is fundamentally undesirable, and it is not my present intention to disabuse you of that opinion. It’s useful (I think) to understand the constraints we accept by our choice of framework, and I’ve certainly found my explorations of MiniTest and RSpec have helped clarify my own opinions about which approach is ultimately more aligned with my own preferences. While I wouldn’t say that I’m ready to jump wholesale into the RSpec ecosystem, I think it’s fair to say that my advocacy of class-style testing frameworks is at an end.
RSpec and Kintama
I started this exploration because I wanted to understand the relationship between the software I have accidentally produced and what’s already available. I already had strong suspicions that any block-based testing implementation would converge on a few common implementation decisions, and while I have now identified a few interesting (to me) ways in which RSpec and Kintama diverge, the essential approach is the same.
In the final article in this triptych (coming soon, I hope), I’ll walk through Kintama and point those out.
There’s no built-in way to ‘nest’ test groups with MiniTest, or test-unit; the closest simulation would be to create subclasses, and explicitly ensure that super is called within every setup method.↩
There are other RSpec features like shared examples and global before/after hooks that are definitely interesting, but I need to keep the scope of this article down…↩
I’m not sure why some people prefer the syntax Module::method rather than Module.method; as I understand it they are exactly the same, but the former seems more confusing to me, since if you don’t notice the lower-case w in world then you’d assume it was refering to a constant.↩
It’s not really magic, and it’s not really “metaprogramming”, because it’s all just programming. It just so happens that it’s quite sophisticated programming.↩
The nested class is a subclass of the outer subclass of ExampleGroup (sorry, I realise that’s confusing), precisely such that any methods defined in the outer class are also available in nested subclasses via the regular mechanisms of inheritance.↩
Raising an exception might not be the only way to stop a test executing at the point it fails; it could be possible to use fibers/continuations to “pause” failing tests…↩
Two years ago, as a joke and a nod to making things fast, I took a silly
domain name and served a few silly “HTTP” requests using the UK postal service
as the transport layer. It was called Postal Inter.net.
It was good fun, and I really enjoyed some of the requests that we received,
but the “server” has not been accessed for more than a year now, so I think it’s
time to put it to rest.
You were fun, postalinter.net, but your time has passed. I release you into
the quantum foam.
Here are a few of the requests that we got. Obviously you are only seeing one
side of the communication; the responses are now lost in the ether (or in
the post boxes of the UK).
The most impressive request thoroughly embraced the nature of TCP/IP, and
arrived in a number of packets, out of order and with some data corruption (see
the missing data on the envelopes), which we had to reconstitute into the actual
request within our ‘server’. Bravo, Tom Stuart!
Tom was challenged for login details, and here was his response.
Richard Paterson is Whyte & Mackay’s “master blender”, which means he doubtless knows a lot about whisky, and as a result he’s clearly asked to appear on TV to guide people around the world of whisky and how best to appreciate it.
Let him guide you now:
But what is a fun bit of banter once can quickly become sinister when you hear it again. And again. And again.