Notes on 'Notes on "Counting Tree Nodes"'

Written on July 17 2014 at 16:11 ∷ permalink

Having now finished watching Tom’s episode of Peer to Peer, I finally got around to reading his Notes on “Counting Tree Nodes” supplementary blog post. There are a couple of ideas he presents that are so interesting that I wanted to highlight them again here.

If you haven’t seen the video, then I’d still strongly encourage you to read the blog post. While I can now see the inspiration for wanting to discuss these ideas1, the post really does stand on it’s own.

Notes on ‘Enumerators’

Here’s the relevant section of the blog post. Go read it now!

I’m not going to re-explain it here, so yes, really, go read it now.

What I found really interesting here was the idea of building new enumerators by re-combining existing enumerators. I’ll use a different example, one that is perhaps a bit too simplistic (there are more concise ways of doing this in Ruby), but hopefully it will illustrate the point.

Let’s imagine you have an Enumerator which enumerates the numbers from 1 up to 10:

> numbers = 1.upto(10)
=> #<Enumerator: 1:upto(10)>
> numbers.next
=> 1
> numbers.next
=> 2
> numbers.next
=> 3
...
> numbers.next
=> 9
> numbers.next
=> 10
> numbers.next
StopIteration: iteration reached an end

You can now use that to do all sorts of enumerable things like mapping, selecting, injecting and so on. But you can also build new enumerables using it. Say, for example, we now only want to iterate over the odd numbers between 1 and 10.

We can build a new Enumerator that re-uses our existing one:

> odd_numbers = Enumerator.new do |yielder|
    numbers.each do |number|
      yielder.yield number if number.odd?
    end
  end
=> #<Enumerator: #<Enumerator::Generator:0x007fc0b38de6b0>:each>

Let’s see it in action:

> odd_numbers.next
=> 1
> odd_numbers.next
=> 3
> odd_numbers.next
=> 5
> odd_numbers.next
=> 7
> odd_numbers.next
=> 9
> odd_numbers.next
StopIteration: iteration reached an end

So, that’s quite neat (albeit somewhat convoluted compared to 1.upto(10).select(&:odd)). To extend this further, let’s imagine that I hate the lucky number 7, so I also don’t want that be included. In fact, somewhat perversely, I want to stick it right in the face of superstition by replacing 7 with the unluckiest number, 13.

Yes, I know this is weird, but bear with me. If you have read Tom’s post (go read it), you’ll already know that this can also be achieved with a new enumerator:

> odd_numbers_that_arent_lucky = Enumerator.new do |yielder|
    odd_numbers.each do |odd_number|
      if number == 7
        yielder.yield 13
      else
        yielder.yield number
      end
    end
  end
=> #<Enumerator: #<Enumerator::Generator:0x007fc0b38de6b0>:each>
> odd_numbers.next
=> 1
> odd_numbers.next
=> 3
> odd_numbers.next
=> 5
> odd_numbers.next
=> 13
> odd_numbers.next
=> 9
> odd_numbers.next
StopIteration: iteration reached an end

In Tom’s post he shows how this works, and how you can further compose enumerators to to produce new enumerations with specific elements inserted at specific points, or elements removed, or even transformed, and so on.

So.

A hidden history of enumerable transformations

What I find really interesting here is that somewhere in our odd_numbers enumerator, all the numbers still exist. We haven’t actually thrown anything away permanently; the numbers we don’t want just don’t appear while we are enumerating.

The enumerator odd_numbers_that_arent_lucky still contains (in a sense) all of the numbers between 1 and 10, and so in the tree composition example in Tom’s post, all the trees he creates with new nodes, or with nodes removed, still contain (in a sense) all those nodes.

It’s almost as if the history of the tree’s structure is encoded within the nesting of Enumerator instances, or as if those blocks passed to Enumerator.new act as a runnable description of the transformations to get from the original tree to the tree we have now, invoked each time any new tree’s children are enumerated over.

I think that’s pretty interesting.

Notes on ‘Catamorphisms’

In the section on Catamorphisms (go read it now!), Tom goes on to show that recognising similarities in some methods points at a further abstraction that can be made – the fold – which opens up new possibilities when working with different kinds of structures.

What’s interesting to me here isn’t anything about the code, but about the ability to recognise patterns and then exploit them. I am very jealous of Tom, because he’s not only very good at doing this, but also very good at explaining the ideas to others.

Academic vs pragmatic programming

This touches on the tension between the ‘academic’ and ‘pragmatic’ nature of working with software. This is something that comes up time and time again in our little sphere:

Now I’m not going to argue that anyone working in software development should have a degree in Computer Science. I’m pretty sympathetic with the idea that many “Computer Science” degrees don’t actually bear much of direct resemblance to the kinds of work that most software developers do2.

Ways to think

What I think university study provides, more than anything else, is exposure and training in ways to think that aren’t obvious or immediately accessible via our direct experience of the world. Many areas of study provide this, including those outside of what you might consider “science”. Learning a language can be learning a new way to think. Learning to interpret art, or poems, or history is learning a new way to think too.

Learning and internalising those ways to think give perspectives on problems that can yield insights and new approaches, and I propose that that, more than any other thing, is the hallmark of a good software developer.

Going back to the blog post which, as far I know, sparked the tweet storm about “programming and maths”, I’d like to highlight this section:

At most academic CS schools, the explicit intent is that students learn programming as a byproduct of learning CS. Programming itself is seen as rather pedestrian, a sort of exercise left to the reader.

For actual developer jobs, by contrast, the two main skills you need these days are programming and communication. So while CS still does have strong ties to math, the ties between CS and programming are more tenuous. You might be able to say that math skills are required for computer science success, but you can’t necessarily say that they’re required for developer success.

What a good computer science (or maths or any other logic-focussed) education should teach you are ways to think that are specific to computation, algorithms and data manipulation, which then

  • provide the perspective to recognise patterns in problems and programs that are not obvious, or even easily intuited, and might otherwise be missed.
  • provide experience applying techniques to formulate solutions to those problems, and refactorings of those programs.

Plus, it’s fun to achieve that kind of insight into a problem. It’s the “a-ha!” moment that flips confusion and doubt into satisfaction and certainty. And these insights are also interesting in and of themselves, in the very same way that, say, study of art history or Shakespeare can be.

So, to be crystal clear, I’m not saying that you need this perspective to be a great programmer. I’m really not. You can build great software that both delights users and works elegantly underneath without any formal training. That is definitely true.

Back to that quote:

the ties between CS and programming are more tenuous … you can’t necessarily say that they’re required for developer success.

All I’m saying is this: the insights and perspectives gained by studying computer science are both useful and interesting. They can help you recognise existing, well-understood problems, and apply robust, well-understood and powerful solutions.

That’s the relevance of computer science to the work we do every day, and it would be a shame to forget that.

  1. In the last 15 minutes or so of the video, the approach Tom uses to add a “child node” to a tree is interesting but there’s not a huge amount of time to explore some of the subtle benefits of that approach

  2. Which is, and let’s be honest, a lot of “Get a record out of a database with an ORM, turn it into some strings, save it back into the database”.


You are reading interblah.net, an irregularly-kept blog thing by James Adam.

If this isn’t what you wanted, you can try the index, or search. Or, read about this site.


Docker on your Server

Written on July 01 2014 at 21:22 ∷ permalink

TL;DR: Register at http://www.dockeronyourserver.com if you’re interested in an e-book about using Docker to deploy small apps on your existing VPS.

I’ve been toying around with Docker for a while now, and I really like it.

The reason why I became interested, at first, was because I wanted to move the Printer server and processes to a different VPS which was already running some other software (including this blog), but I was a bit nervous about installing all of the dependencies.

Dependency anxiety

What if something needed an upgraded version of LibXML, but upgrading for one application ended up breaking a different one? What if I got myself in a tangle with the different versions of Ruby that different applications expect?

If you’re anything like me, servers tend to be supremely magical things; I know enough of the right incantations to generally make the right things appear in the right places at the right time, but it’s so easy to utter the wrong spell and completely mess things up1.

Docker isolation

Docker provides an elegant solution for these kinds of worries, because it allows you to isolate applications from each other, including as many of their dependencies as you like. Creating images with different versions of LibXML, or Ruby, or even PostgreSQL is almost trivially easy, and you can be confident that running any combination will not cause unexpected crashes and hard-to-trace software version issues.

However, while Docker is simple in principle, it’s not trivial to actually deploy with it, in the pragmatic sense.

Orchestration woes

What I mean is getting to a point where deploying a new application to your server is as simple (or close enough to it) as platforms like Heroku make it.

Now, to be clear, I don’t specifically mean using git push to deploy; what I mean is all the orchestration that needs to happen in order to move the right software image into the right place, stop existing containers, start updated containers and make sure that nothing explodes as you do that.

But, you might say, there are packages that already help with this! And you’re right. Here are a few:

  • Dokku, the original minimal Heroku clone for Docker
  • Flynn, the successor to Dokku
  • Orchard, a managed host for your Docker containers
  • …and many more. I don’t know if you’ve heard, but Docker is, like, everywhere right now.

So why not use one of those?

That’s a good question

Well, a couple of reasons. I think Orchard looks great, but I like using my own servers for software when I can. That’s just my personal preference, but there it stands, so tools like Orchard (or Octohost and so on) are not going to help me at the moment.

Dokku is good, but I’d like to have a little more control of how applications are deployed (as I said above, I don’t really care about git push to deploy, and even in Heroku it can lead to odd issues, particularly with migrations).

Flynn isn’t really for me, or at least I don’t think it is. It’s for dev-ops running apps on multiple machines, balancing loads and migrating datacenters; I’m running some fun little apps on my personal VPS. I’m not interested in using Docker to deploy across multiple, dynamically scaling nodes; I just want to take advantage of Docker’s isolation on my own, existing, all-by-its-lonesome VPS.

But, really more than anything, I wanted to understand what was happening when I deploy a new application, and be confident that it worked, so that I can more easily use (and debug if required) this new moving piece in my software stack.

So I’ve done some exploring

I took a bit of time out from building Harmonia to play around more with Docker, and I’d like to write about what I’ve found out, partly to help me make it concrete, and partly because I’m hoping that there are other people out there like me, who want some of the benefits that Docker can bring without necessarily having to learn so much about using it to it’s full potential in an ‘enterprise’ setting, and without having to abandon running their own VPS.

There ought to be a simple, easy path to getting up and running productively with Docker while still understanding everything that’s going on. I’d like to find and document it, for everyone’s benefit.

If that sounds like the kind of think you’re interested in, and would like reading about, please let me know. If I know there are enough people interested in the idea, then I’ll get to work.

Sign up here: http://www.dockeronyourserver.com

  1. You might consider this to be an early example of what I mean.


Reducing risk when running a conference

Written on June 25 2014 at 14:37 ∷ permalink

It was really interesting to read the news about the potential cancellation of Wicked Good Ruby, a Ruby conference that was due to run for a second time in Boston this year:

Rather than half-ass a conference we’ve decided to cancel it. All tickets will be fully reimbursed. I’ve already reached out to those that have purchased with how that will be done.

There’s lots of interesting stuff in this thread, but one point that jumped out to me was the financial risk involved:

Zero sponsors. […] I had 2 companies inquire about sponsorship. One asked to sponsor but never paid the sponsorship invoice after 82 days.

[…]

Last year we lost $15k. In order to made it worth our effort this year we needed to make a profit. The conference we had in mind required a $50k sponsorship budget with an overall budget of close to $100k. (last year’s conference cost about $125k) Consider to date, after 6 months, we have received $0 in sponsorship the financial risk was too high.

[…]

Since announcing this a few hours ago I’ve been contacted by 3 other regional conference organizers. They are all having similar issues this year. Sponsorship is incredibly difficult to come by […] I didn’t get the sense they were going to bail but I think this is a larger issue than just Boston.

With costs in the tens or even hundreds of thousands of dollars, it’s really no wonder that organisers might have second thoughts.

But does running a conference really need to come with such an enormous financial risk? With Ruby Manor, our biggest budget was less than £2,000 (around $3,000). Here’s why:

  • We don’t use expensive ‘conference’ venues…
  • … so we aren’t locked into their expensive conference catering.
  • We run a single track for a single day, so we only need one main room.
  • We meticulously pare away other expenses like swag, t-shirts, catering and so on, where it’s clear that attendees are perfectly capable of handling those themselves.

Now, it could be that there are a few factors unique to Ruby Manor that make it difficult, or even impossible, for other conferences to follow the same model. For instance, holding the event in the center of a big city like London means there’s already a wide range of lunch options for attendees, right outside the venue.

Another problem could be the availability of university and community venues as an alternative to conference centres. I really don’t know if it’s possible or not to rent spaces like this in other cities or countries. A quick look at my local university indicates that it’s totally feasible to rent a 330-seat auditorium for less than $1,000, and even use their coffee/tea catering for a total of less than $3,0001, all-in.

I would be genuinely fascinated for other conferences to start publishing their costing breakdown. LessConf have already done this, and even though I might not have chosen to spend quite so much money on breakfasts and surprises, I genuinely do applaud their transparency for the benefit of the community.

In the end it seems there’s hope for Wicked Good:

I think I’ve come up with a plan: reduce the conference from 2 nights to 1 night. Cut out many of the thrills that I was planning. This effectively would reduce our operational costs from ~$100k to around ~$50k. This would also allow us to reduce the ticket prices (I would reimburse current tickets for the difference).

I genuinely wish the organisers the best of luck. It’s a tough gig, running a conference. That said, $50,000 is still an enormous amount of money, and I cannot help but feel that it’s still far higher than it needs to be.

Every hour you spend as a conference organiser worrying about sponsorships or ticket sales or other financial issues, is an hour that could be spent working on making the content and the community aspects as good as they can be.

Let’s not forget: the only really important part of a conference is getting a diverse group of people with shared interests into the same space for a day or two. Everything else is just decoration. A lot of the time, even the presentations are just decoration. It’s getting the community together that’s important.

I realise that many people expect, consciously or otherwise, some or all of the peripheral bells and whistles that surround the core conference experience. For some attendees, going to a conference might even be akin to a ‘vacation’ of sorts. Perhaps a conference without $15,000 parties feels like less of an event… less of an extravaganza.

But consider this: given the choice between a glitzy, dazzling extravaganza, and a solid conference that an organiser can run without paralysing fear of financial ruin, I know which I would choose, and I know which ultimately benefits the community more.

  1. To be clear, they require that you cannot make a profit on events they host, but from what I can tell, most conference organisers don’t wish to run their events for profit anyway.


Read more recent blogish post-things here