Loose Typing

Make your own tools

As developers, we're far too fond of the blunt, primitive tools we find lying around at the bottom of the software development stack - logs, CI GUIs, low-level bash commands, inflexible build scripts. Too rarely do we write bespoke, higher-order tools to make ourselves more productive and the job more enjoyable. We're missing out on real opportunities to go faster.

When we're writing production or test code which gets repetitive or unwieldy, it's not long before we naturally break it up into chunks and compose those chunks together at a higher, more useful level, then as things grow still more we refactor that and compose over it, and so on up the ladder of abstraction. We don't end up writing files of source code thousands of lines long, or at least I hope not. But when it comes to the machinery of software development, we're perfectly happy bolting together the same old steps by hand. We seem unaware that we're death-marching through numerous repetitive, un-factored-out, long-winded, ambiguously-defined, error-prone steps time and time again.

What would be so wrong in daring to create project-specific tools? Not just traditional automation, but an interface which is deliberately, intimately tailored to the software we're currently engaged in producing? It could be something we use at the command line or REPL, or even through its own GUI.
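To make that less abstract, here's the kind of ten-minute tool I mean, sketched in Java. The environment names and health-check URLs are invented for illustration; the point is that the tool knows about this particular project's environments, so nobody has to remember or retype them.

    // A minimal sketch of a bespoke, project-specific tool. The environment names
    // and health-check URLs are invented; the only job of this tool is to answer
    // one boringly specific question: which of our environments are up right now?
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.Map;

    public class Envs {
        public static void main(String[] args) {
            Map<String, String> environments = Map.of(
                "dev",  "https://dev.example.internal/health",
                "sit",  "https://sit.example.internal/health",
                "perf", "https://perf.example.internal/health");

            HttpClient client = HttpClient.newHttpClient();
            environments.forEach((name, url) -> {
                String status;
                try {
                    HttpResponse<String> response = client.send(
                        HttpRequest.newBuilder(URI.create(url)).GET().build(),
                        HttpResponse.BodyHandlers.ofString());
                    status = response.statusCode() == 200 ? "up" : "responding with " + response.statusCode();
                } catch (Exception e) {
                    status = "down (" + e.getMessage() + ")";
                }
                System.out.printf("%-5s %s%n", name, status);
            });
        }
    }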

There are many reasons why we tend not to write our own tools, or tend not to give much love to the ones we do create.

  1. We can get by using the facilities we already have, albeit inefficiently, so we allow ourselves to be drugged by comforting familiarity and forget that it's possible to improve.

  2. When we're not very experienced, we take it as read that everything's been done for us. "Surely Maven is all you need!"

  3. It's quite easy to write a wiki page to capture complex, often-repeated processes. You get to be a published author! Your effort is evident to others, and they'll thank you for it. And it's comforting to follow instructions by rote - at least for the first few times.

  4. We're easily demotivated by the effort required to make something new, even if its utility is blindingly obvious.

  5. Admittedly, automation sometimes hides things which it's useful to see, especially at the start of a project or a new kind of activity. That's why manual testing is still valuable. But this shouldn't stop us continually assessing how to avoid day-to-day drudgery.

  6. When we're working at scales where the effort definitely would be worth our while, there's often too much weight of opinion behind old practices, and we're engaged in so much fire-fighting - ironically often exacerbated by the very lack of tooling - that we can't think clearly anyway.

  7. Software development work is often ultimately dictated or influenced by someone who hasn't experienced the value of taking stock and investing in improving the delivery process itself. "Delivery-focussed" people like this see a team writing its own tools as a self-indulgence.

  8. Writing tools to solve messy software development problems often involves understanding many kinds of technologies - eg in infrastructure or cloud automation - and not everyone has the confidence or the breadth of knowledge required to bring them all together.

  9. There might be a Product out there which Already Does This. Writing something specific to your situation might be seen as unnecessary effort, even if there's every chance the enterprise equivalent is expensive, bloated and doesn't in practice do what you need, or at least not without considerable customisation. And money. And formal procedures for requesting installation. And teams of experts. And time spent finding nothing useful about it on StackOverflow. And restrictions born of the fact that it's shared by many people. And downtime.

  10. Larger companies, especially, tend to mandate the use of specific toolsets, and explicitly making more tools is officially naughty. "We already have everything we need!" Which will either be nothing or one of those bloated enterprise things.

  11. Ground-up efforts to improve software delivery often see developers use tools, languages or patterns which are novel to the rest of the team, and as a result the initiative is frowned upon.

  12. Conversely, when it comes to writing tools we're often suckered by convention into believing we can't use the first-class programming language we're using in our core deliverables.

  13. When we're done writing our tools, there might not be anywhere acceptable to host or distribute whatever we've written, and so it won't get used and can't be promoted.

  14. Finally, some of us give the whole 'invent what you need' principle a bad name because in the past we've applied it for the wrong reasons, preferring to solve an easier, more interesting and mostly unrelated problem rather than the one our clients care about and are paying us for.

.....

Once you start thinking like this, the ideas pile on thick and fast. In a way, that's another reason why I've failed in the past to take the first step, because by the time I've steeled myself to take it, the first step looks decidedly drab compared to the grandiose vision in my head, and it all seems unachievable. "Time travel! Fantastic! Oh, wait - what shall I use for the UI?" Once I realise my first step can't get me straightaway to something like Bret Victor's round-tripping code editor, I give up then and there.

So let's come back down to reality. The first few steps to giving ourselves a tool like the one I've described ought to be really simple. If we give ourselves permission we'll see more and more opportunities for improvement using the skills we already have at our fingertips. And that is the most motivating thing about delivering software: making a useful difference more quickly than last time, getting to powerful and interesting levels of abstraction, and giving yourself more time to think.

================= Irrelevant stuff for a different post:

So here's a concrete example. It's about an idea that's been knocking around in my head for ages. If I'd done anything about it, it would have helped immensely in several widely varying contexts, but instead I've not even put finger to keyboard. I mention it now for a couple of reasons. Firstly, it's an example of a failure on my part to apply the principle I'm talking about here, for most of the reasons above. Secondly, the value of the idea is screaming at me in the current day job, and I don't want to make the same mistake by ignoring that.

The idea is about a little toolset for using test data in complex enterprise systems. It's an idea that starts life really simple. It's easy to begin. But it fertilises the garden of your mind for other ideas to grow.

Nowhere do we need better tools for testing software than in large-scale enterprises. Like it or not, delivering software in big organisations usually means dealing with real legacy systems in unreliable environments which contain (or serve up) test data of questionable provenance and certain unreliability. Test data is often won at significant cost, perhaps involving whole teams of people to mine for it or create it from scratch. It's jealously hoarded once acquired, but rarely safe from interference. It's often intermittently available as the environments which deliver it come and go.

Despite its problems, realistic test data is invaluable in proving the behaviour of complex systems that have parts not under your control - the parts that a friend recently labelled "here be dragons". You'd be a brave or foolish person to take new functionality live without making use of test data and the systems doling it out.

Large-scale enterprises commonly have dedicated test teams - QAs - and it's the QAs who understand, discover and acquire test data. If you've only ever worked in small, 'agile' teams you might scoff at the idea of a separate QA function, but once you get into a big enterprise I'll wager that you'll be happy for those QAs to take on all that messy work! Many is the situation in which my team would have been lost without QAs. Their knowledge and understanding is gold dust, and they're not afraid of panning through mud and silt to get to it.

But information about test data is usually poorly disseminated. It's emailed all over the place, for a start. If you're lucky it will appear as a spreadsheet attachment. If you're luckier still it will be shared in a wiki page, though usually still as an attachment. Someone might even go so far as to make it more readily machine-readable.

And that's usually as far as it goes. It's rarely actually read by machine. The association of your tests with the test data they need is all manual. This is understandable when the testing itself is manual. But even in the nicest of automated tests which rely on that data, expressed in the most sensible way, there's still a huge gap between saying "Given I am a customer with a Volkswagen" and choosing account number 234292325 under the assumption that it really does illustrate a scenario depending on that fact. That gap's filled by the person writing the test, who reads the wiki page or whatever then types "234292325" into their Cucumber or plain-code test, where it's trusted forever afterwards as being useful and relevant in proving that the product works as required in that scenario.

To keep ourselves honest, we'd need a sub-test embedded in the Given step or the test's preconditions: namely, an assertion with proof that 234292325 really is a Volkswagen owner's account. But it's rare in big systems to be able to verify this directly, so typically the best we could do is to assert that 234292325 is mentioned on the test data wiki page in the section marked "user accounts who have Volkswagens". Then we'd know that if the QA was right, we're good to go. But even that is usually too much work, so we give up completely and just type "234292325" into the test, hoping that no-one screws with that account and that the QA was right and that the environment containing it stays up, and so perpetuate the hiding of our assumptions.
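Here's a sketch of that embedded sub-test in plain Java. The TestDataCatalogue interface is entirely hypothetical - it stands in for wherever the QA-curated facts actually live, be that a wiki export, a shared file or a little service:

    // A sketch of the embedded sub-test. TestDataCatalogue is hypothetical: it stands
    // in for wherever the curated facts about test data actually live.
    import java.util.Set;

    interface TestDataCatalogue {
        // Account numbers the catalogue currently claims exhibit the given trait.
        Set<String> accountsWithTrait(String trait);
    }

    class VolkswagenOwnerSteps {
        private final TestDataCatalogue catalogue;
        private final String accountNumber = "234292325"; // the datum we'd otherwise trust blindly

        VolkswagenOwnerSteps(TestDataCatalogue catalogue) {
            this.catalogue = catalogue;
        }

        void givenIAmACustomerWithAVolkswagen() {
            // Fail fast, and loudly, if the catalogue no longer backs our assumption,
            // rather than letting the real test fail (or pass) for the wrong reason.
            if (!catalogue.accountsWithTrait("owns-volkswagen").contains(accountNumber)) {
                throw new AssertionError("Precondition broken: account " + accountNumber
                    + " is no longer listed as a Volkswagen owner");
            }
            // ... go on to drive the system under test with accountNumber ...
        }
    }

Even when the only thing the catalogue can do is echo the wiki page, the assumption is at least written down somewhere the test can check it.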

If someone screws with the account, or the QA was wrong, or something in the way it was acquired or created was wrong, then you may eventually discover the fact. Assuming, that is, that your test does actually fail and isn't passing coincidentally. If your test does fail, you'll grub around for another account number to use, if you can remember where the wiki page or the spreadsheet is, all the while hoping that the list of test data is up to date and the data is still truly available in the environment which allegedly used to contain it.

Worse still, you may well have brought into question the provenance of some carefully curated fossil of test data - but no-one else will know about your discovery, and will carry on wasting their time with it. That may even be you, because you'll have used the same precious artefact in several tests, and several kinds of tests, and might not remember to search-and-replace.

What if there were a place to describe test data and the things about it that are of interest to your team and people working in your domain, and what if your tests had first-class, no-air-gap access to that kind of material? What if you didn't have to grub around for documentation about what test data is available in which environment with which characteristics, and instead could just ask for it? What if you could actually ask for it in code at test run time? What if you could ask for all data with the desired characteristics and automate tests over all of it? What if we could maintain facts about test data in a single shared place?
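Something like this, perhaps - a hypothetical sketch, with the TestDataStore interface and the characteristic names invented for illustration:

    // A hypothetical sketch of "just asking for it" at test run time. The TestDataStore
    // interface and the characteristic names are invented for illustration.
    import java.util.List;
    import java.util.Map;

    interface TestDataStore {
        // All data currently known to exhibit every one of the requested characteristics.
        List<Map<String, Object>> find(String... characteristics);
    }

    class RenewalQuoteTests {
        private final TestDataStore store;

        RenewalQuoteTests(TestDataStore store) {
            this.store = store;
        }

        void aRenewalQuoteIsOfferedToEveryVolkswagenOwner() {
            // No wiki page, no hard-coded account number: ask the shared store at run time,
            // then run the same scenario over every matching item it knows about today.
            for (Map<String, Object> customer : store.find("owns-volkswagen", "active-account")) {
                String account = (String) customer.get("accountNumber");
                // ... exercise the system under test with `account` and assert on the outcome ...
            }
        }
    }

Note that the test doesn't name an account at all; its precondition is the set of characteristics, and the store is responsible for knowing which data currently satisfies them.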

People used to more traditional forms of testing don't like the idea of running tests over a set of data which isn't necessarily fixed. This arises from the idea that tests should be repeatable. But it turns out that tests can be repeatable without exercising the system with the same data each time: you need to know, when a test fails, what data was being used at failure time, and you need to be able to re-run the same tests against the same data until you've fixed the problem or determined that the test data is invalid. Provided you can do that, why on earth wouldn't you want to be able to test with as much data as possible, even as new test data becomes available, given that it's the product's stability in the face of data variance that you're seeking to prove?
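The mechanics needn't be clever. A sketch, with hypothetical names: write down exactly which data a run used, and let a later run pin itself to that record while a failure is investigated:

    // A sketch of repeatability without fixed data, all names hypothetical: record
    // exactly which data a run used, and let a later run pin itself to that record.
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;

    class RunRecord {
        private final Path record = Path.of("last-run-test-data.txt");

        // During a run: write down which accounts were actually exercised.
        void dataUsed(List<String> accountNumbers) throws IOException {
            Files.write(record, accountNumbers);
        }

        // After a failure: re-run against exactly the same accounts, instead of
        // asking the store for whatever happens to match today.
        List<String> pinnedDataFromLastRun() throws IOException {
            return Files.readAllLines(record);
        }
    }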

With a shared, accessible place for test data metadata, everyone benefits at once. As new test data is acquired, all your tests get to know about it. Once you've discovered something about a bit of data - eg that it doesn't after all correspond to a customer who owns a car - you can bank that bit of knowledge, and you, everyone else and all your tests will benefit thereafter, and avoid using it when it's inappropriate to do so.
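Banking that knowledge could be as simple as this sketch (the FactStore interface is hypothetical): retract the fact once, in the shared place, and every test that asks for car owners quietly stops being offered that account:

    // A sketch of banking a discovery. The FactStore interface is hypothetical; the
    // point is that the retraction happens once, in the shared place, and every test
    // asking for car owners stops being offered this account from then on.
    import java.time.Instant;

    interface FactStore {
        void assertFact(String dataId, String trait, String assertedBy, Instant when);
        void retractFact(String dataId, String trait, String reason, Instant when);
    }

    class Discoveries {
        void accountTurnsOutNotToOwnACar(FactStore facts) {
            facts.retractFact("234292325", "owns-car",
                "confirmed with the QA team: no vehicle on this account", Instant.now());
        }
    }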

We can get higher-order pretty quickly. With a declarative description of what you're interested in, you can create test data, for a certain value of 'create'. Creation might mean automatically spitting out the specification of the test data you want, then emailing that to your manual test data creation team. It might mean going off to the underlying databases that are otherwise invisible to the system you're developing, and which are read-only for you and your team, to find the data and highlight what's missing: "in this test suite you say you want customers who've bought a Volkswagen, but we can find none in the system test environment - this needs fixing, please". Once you've found data meeting your expectations, you can assert its viability for everyone to take advantage of, thereby making it useful for any tests which rely on matching preconditions.
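For example - again a hypothetical sketch, reusing the invented TestDataStore interface from earlier - a suite could declare the characteristics it needs and have a small check report exactly what's missing from an environment, instead of that report being an email thread:

    // A hypothetical sketch: the suite declares the characteristics it needs, and a
    // small check reports exactly what's missing from the environment. TestDataStore
    // is the same invented interface as in the earlier sketch.
    import java.util.List;
    import java.util.Map;

    interface TestDataStore {
        List<Map<String, Object>> find(String... characteristics);
    }

    class DataRequirements {
        // What this suite needs, declared independently of any particular account number.
        private static final String[][] REQUIRED = {
            { "owns-volkswagen", "active-account" },
            { "owns-volkswagen", "payment-in-arrears" }
        };

        void reportGaps(TestDataStore store) {
            for (String[] characteristics : REQUIRED) {
                if (store.find(characteristics).isEmpty()) {
                    // The "this needs fixing, please" that would otherwise be an email thread.
                    System.out.println("No data in this environment matching: "
                        + String.join(", ", characteristics));
                }
            }
        }
    }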

We're not dealing here only with the messiness of acquiring realistic test data and confirming its realism before using it in tests. Effective testing of this kind also means squarely facing up to test data instability, to the fact that bits of it are liable to change under your feet, or become unavailable, at any time. Time is in fact the key. We need a first-class notion of the time at which facts are true, because that will help us understand test failures and attribute them to the right cause. Wouldn't it be great never again to have to undergo the dismaying, futile archaeology which so often results when things go wrong, and which gets more time-consuming and exhausting the more people and teams are involved? "Well, it worked yesterday in system test but not today, and we've got to go live tomorrow, and they're saying they haven't changed anything..."

Put another way, we need our testing infrastructure to be able to time-travel and return to the present with answers. Sounds like a perfect job for Datomic!
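To be clear about what I mean by time-travel (this is a generic sketch in modern Java, not Datomic's actual API): if every fact about a piece of test data carries the moment it was asserted or retracted, we can ask what we believed about account 234292325 at any point in the past:

    // A generic sketch of the idea, not Datomic's actual API: every fact about a piece
    // of test data carries the instant it was asserted or retracted, so we can ask what
    // we believed about it at any point in the past.
    import java.time.Instant;
    import java.util.List;

    record Fact(String dataId, String trait, boolean asserted, Instant at) {}

    class TimeAwareFacts {
        private final List<Fact> history; // assumed to be in chronological order

        TimeAwareFacts(List<Fact> history) {
            this.history = history;
        }

        // Was this trait believed true of the datum as of the given instant?
        boolean believedAsOf(String dataId, String trait, Instant asOf) {
            return history.stream()
                .filter(f -> f.dataId().equals(dataId) && f.trait().equals(trait))
                .filter(f -> !f.at().isAfter(asOf))
                .reduce((earlier, later) -> later)   // the latest relevant fact at or before asOf
                .map(Fact::asserted)
                .orElse(false);                      // no facts yet: assume not believed
        }
    }

Then "it worked yesterday in system test" becomes a question you can actually put to the data: what did we believe about that account yesterday, and what do we believe now?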