If you write software for long enough, you begin to develop instincts about the relative difficulty of different problems, and some of those instincts solidify into rules of thumb. The following are some of the guidelines that direct my work on a daily basis.

Choose simpler graph data structures

Many problems are graph problems of one sort or another. Resolving dependencies in a package manager, traversing a file system, exploring friend connections in a social network: all of these are graph problems. There's a hierarchy of increasing complexity of graph types:

  • list
  • tree
  • acyclic graph
  • cyclic graph

All of those are technically graphs, but the work required to operate on them gets noticeably harder as you move down the list. If you can constrain your requirements so that you can use structures nearer the top of the list, you will have fewer implementation bugs.

Sometimes you can't avoid a cyclic graph, but remembering that it's harder than the others should make you stop and think "do I need more tests for this?"
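
To make that concrete, here's a minimal Python sketch (the node objects and their `children`/`neighbours` attributes are hypothetical) showing one of the costs of moving down the hierarchy: a tree walk needs no bookkeeping, but the moment cycles are possible you have to carry a visited set or risk looping forever.

```python
# Walking a tree: no bookkeeping needed, every node is reached exactly once.
def walk_tree(node, visit):
    visit(node)
    for child in node.children:   # `children` is an assumed attribute
        walk_tree(child, visit)

# Walking a possibly-cyclic graph: you must remember where you've been,
# otherwise the traversal may never terminate.
def walk_graph(node, visit, seen=None):
    if seen is None:
        seen = set()
    if id(node) in seen:
        return                    # already visited: break the cycle
    seen.add(id(node))
    visit(node)
    for neighbour in node.neighbours:  # `neighbours` is an assumed attribute
        walk_graph(neighbour, visit, seen)
```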

Default to immutable data

Cache invalidation is hard. If your data is immutable you rarely have to solve that problem. Immutable data is also easier to archive and/or store in cheap blob stores like S3 or in static files on a filesystem.

It can be very hard to make your data truly immutable. For example, the GDPR "right to be forgotten" means that anything containing personal data must be deletable. So, just as with graph data structures, there's a hierarchy of mutability for data storage:

  • Truly immutable. The data is created and is never destroyed.
  • Can be atomically created or destroyed but never modified.
  • Can have portions appended or deleted, but previous data is never modified.
  • Some portions of the data structure are immutable, others can be deleted or mutated in place.
  • All parts of the data structure are mutable.

Stay as far up that list as you can and you will avoid many implementation problems. You can extend this practice into the data structures you use in code (e.g. by declaring everything final in Java or by using Clojure's core data types), but it's much more important to think about it for your data at rest.
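
As an illustration of the append-only tier, here's a sketch in Python. The `Event` record and log class are made up for the example: records can be appended, and whole records deleted (e.g. for GDPR erasure), but an existing record is never modified in place.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)  # frozen=True: each individual record is immutable
class Event:
    event_id: str
    name: str
    occurred_at: datetime

class AppendOnlyLog:
    """Records can be appended, and deleted wholesale (e.g. GDPR erasure),
    but an existing record is never modified in place."""

    def __init__(self):
        self._events = []

    def append(self, event_id, name):
        self._events.append(Event(event_id, name, datetime.now(timezone.utc)))

    def delete(self, event_id):
        self._events = [e for e in self._events if e.event_id != event_id]

    def events(self):
        return tuple(self._events)  # hand callers an immutable view
```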

Choose simpler programming models

There are lots of techniques to improve the performance of software. Use none of them until you are forced to. Concurrency is hard to get right. Event-driven systems can be difficult to reason about. Manual memory management is a notorious source of bugs. Functions without side-effects are easier to test than functions with them. Optimise for simplicity first, not performance.

Start with single-threaded, single-process code in the highest-level language you can use and iterate from there. Use pure functions when you can, but don't be a zealot about it. The "Functional Core, Imperative Shell" pattern can be very useful.
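
Here's a minimal sketch of that pattern in Python, using a made-up discount calculation as the functional core. All the logic worth testing is a pure function; the I/O is pushed out to a thin shell.

```python
# Functional core: a pure function, trivial to unit test in isolation.
def apply_discount(order_total, discount_rate):
    if not 0 <= discount_rate <= 1:
        raise ValueError("discount_rate must be between 0 and 1")
    return round(order_total * (1 - discount_rate), 2)

# Imperative shell: all of the I/O lives out here at the edge.
def main():
    total = float(input("Order total: "))
    print(f"To pay: {apply_discount(total, 0.10)}")

if __name__ == "__main__":
    main()
```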

Focus on your public interfaces

Hyrum's law teaches us that anything your customer can perceive about your software is part of its public interface. If you change it, someone will complain.

When you're designing new software, divide your problem space into features and behaviours that are visible and invisible to customers. Put the bulk of your effort into the visible parts; everything else is fixable later.

If you don't want your customer to depend on something, make it impossible to depend upon. For example, if you have a function that returns a list of items and you don't want the ordering to be part of the contract, you may need to deliberately shuffle the items to prevent people from depending on the order.
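
A sketch of that trick in Python, where `fetch_items` is a stand-in for your real data access layer:

```python
import random

def fetch_items():
    # Stand-in for the real data access layer.
    return ["apple", "banana", "cherry"]

def list_items():
    """Return items in a deliberately undefined order so that callers
    can never come to rely on the ordering."""
    items = fetch_items()
    random.shuffle(items)
    return items
```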

Always remember that your colleagues are also your customers. If they can depend on something they will.

You probably can't avoid distributed systems

Even if you have only one web server running a monolithic application, if you put a rich JavaScript web app in front of it you have a distributed system. If you only work on static, single-binary CLI tools that don't touch the network then you can avoid distributed systems, but you might also be severely limiting your career.

It's extremely hard to avoid distributed systems so you have to get good at working on them. Here are some things to remember:

  • In almost all systems there's no such thing as a zero-downtime atomic deploy where all of your customers instantaneously transfer from one version of your software to another. You can choose downtime if you want, but choosing a safe, non-atomic deployment pattern is much easier.
  • Your system's topology is a graph. The guidance for graph data structures above applies here too. If you have cycles in your graph of services you will regret it.
  • There's no such thing as a reliable network. It will fail, so don't use it unless you need it. If you make your services very "micro" then you are probably hitting the network more than you want.
  • Eventual consistency is often good enough. Use it enthusiastically.
  • Exactly-once delivery of messages is hard. At-least-once delivery of messages that are idempotent is much, much easier (see the sketch after this list).
  • Don't share data stores between different services. Have one, canonical owning service for every piece of data. Other services may cache the data, but they need to defer to the owning service.
  • Use centralised logging and metrics systems from the beginning. Use a tool like Honeycomb if you can. You will learn things about your system every day.
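
Here's a minimal sketch of that idempotency idea in Python. The message shape and the in-memory dedupe set are assumptions; in a real system the set of processed ids would live in durable storage.

```python
processed_ids = set()  # assumption: in production this lives in durable storage

def apply_side_effect(body):
    print("processing", body)  # stand-in for the real work

def handle(message):
    """Safe under at-least-once delivery: a redelivered message with the
    same id is recognised as a duplicate and skipped."""
    if message["id"] in processed_ids:
        return  # already handled, so the redelivery is harmless
    apply_side_effect(message["body"])
    processed_ids.add(message["id"])

handle({"id": "m1", "body": "charge £10"})
handle({"id": "m1", "body": "charge £10"})  # duplicate delivery: ignored
```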

Everything should have limits

The heading is somewhat tongue-in-cheek, but the point stands. It's relatively easy to relax a limit later, but if you don't have one to begin with you will have to negotiate to introduce it.

Types of limits worth thinking about are:

  • Request rate limits. An accidental denial of service by an overzealous customer is embarrassing for everyone. (A sketch of a simple rate limiter follows this list.)
  • Request size limits. These apply on both input and output, for example limiting the size of a text field being inserted and the number of records that can be fetched at once.
  • Pagination. If you return a sequence of anything it should be paginated and there should be a limit on the maximum size of the page.
  • Application-specific limits. For example, if you're building an online store, limit the number of product categories and the number of products per category. Anywhere you see a list, a one-to-many, or a many-to-many relationship, you should attempt to limit it.
  • Time-based retention limits for data at rest. Don't promise to store things forever. If you don't want to implement the deletion immediately, hide the data instead so that customers get used to the idea that data goes away.
  • Security limits. For example browser session lengths, failed login attempts, etc.
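
As an example of how cheaply a request rate limit can be enforced, here's a sketch of a classic token bucket in Python (the rate and capacity numbers are placeholders):

```python
import time

class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at
    `rate` tokens per second."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Top up the bucket in proportion to the time elapsed.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)  # placeholder numbers
if not bucket.allow():
    pass  # reject the request, e.g. with HTTP 429
```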

Trust your gut

There are many other rules of thumb that I use in my day-to-day work, but it's sometimes difficult even to notice that I'm applying them. If you want to become more conscious of your own personal guidelines, trust your gut during code reviews. If something seems wrong, stop and think. Maybe there's actually something wrong with the code, or maybe your instincts are wrong. Either way, use that feeling to refine your own patterns of working and teach them to those who come after you.

Thanks to Nathan Dintenfass for the prompt to write this piece and to Hannah Henderson for some great feedback.