# Focus

Focus is a design element in programming languages that I think deserves more attention than it gets.

A focused language puts emphasis on a set of coherent idioms. Multi-paradigm languages like C++ or C# are unfocused because they lack a single guiding principle.

Take C, for instance. You can do OOP in C, but it’s awkward: you need structs full of function pointers, and the language wasn’t designed for it. The point is that you can, but you shouldn’t.

Focus is not so much what a language has as something it embodies. A single-paradigm language can still be unfocused, because there can be several ways to wield the singular paradigm. A multi-paradigm language can be focused if its paradigms connect at a higher level. Focus can be enforced as a coding standard, or it can be something that everybody understands as the idiomatic way of doing things.

Focus is not always a positive trait, but it is rarely a negative one. A lack of focus, on the other hand, is more often negative than positive.

Take Haskell, a pure functional language, effectively single-paradigm; it is a very special case. The language itself is rigorously focused, but thanks to its flexible type system and vibrant community, there are many ways to program Haskell. Do you absolutely need state? Use state, but be careful. Do you want IO without monads? Well, sure, but be careful.

At a high level, Haskell code is pure. The language permits some inconsistencies with its principal paradigm, but it eschews them, and this is the key difference.

A bigger problem with focus is that it is often intangible. It’s easier to point out languages that are unfocused than those that are focused. Focus is about philosophy. Some languages are very philosophical. Clojure, for instance, is just as much about its particular approach to concurrency, state, and identity as it is a language implementing those ideas. The language caught on not because Rich Hickey, its author, marketed it as the tool that would solve everybody’s problems, but because he marketed the ideas that Clojure represents as a solution to common programming problems.

“If you want to build a ship, don’t drum up the men to gather wood, divide the work and give orders. Instead, teach them to yearn for the vast and endless sea.”

Antoine de Saint-Exupéry

In this context, Clojure can be seen as a focused language. These core philosophies are what constitute the language; the fact that Clojure happens to be a Lisp dialect implementing them is secondary in my mind. That said, I acknowledge that being a Lisp is also a core part of Clojure, but its principles about state and identity could be implemented in any language. Clojure does let you do OOP, but it feels awkward. When you grok Clojure, you understand what that means: the language can be bent for that purpose, but it doesn’t want to be. Its philosophy is like memory foam: if you tamper with it, it coalesces back into its original form. Seeing that is the moment you understand what Clojure—or any other language—is about.

Some languages double down on philosophy by making it part of a coding standard and enforcing it: Go. Go embodies simplicity and intuition, intentionally eschewing features that are modern but complex in order to keep the core language simple. Some chalk this up as a negative trait, others love it; I find it to be both good and bad. Good, because I can jump into any Go codebase and get its purpose in minutes; bad, because sometimes I want abstractions for which Go is unsuitable. I respect its design philosophy, because it has one, and absolutely flaunts it. It’s not just a structural type system, it’s an idea.

Scala is another beast. It began as an experiment in augmenting Java and fixing its deficiencies. It was designed by brilliant PLT theorists, and the language is a beautiful, magnanimous behemoth. Scala has so many features that it forgoes focus, intentionally or not. On the other hand, Scala is capable of many great things. But if you ask two Scala programmers what Scala represents to them, you may get two different answers.

The difference can be technical. To some, it might be about Shapeless and all the cool things that go with it. Macros. DSLs. Code generation. Or it could be how Akka and Spark are amazing tools. It could also be philosophical. Some people want an advanced type system without being constrained by the laziness and purity of Haskell. Others want the JVM. Some just want a better Java. Some just happen to use it for Spark.

I would choose the simpler Scala, the better Java: trait-based generics, sum types, implicits, and functional programming, to name a few. This is not just because it’s less complicated; from a business perspective, it also makes it easier to hire new programmers.

As a professional Scala developer and a long-time functional programming enthusiast, I fear that I may never comfortably jump to another company, confident that because I’ve written Scala, I can understand their Scala. Years of experience might help, but who knows how much is enough? Their Scala, whoever they may be, might not be the simple Scala my colleagues and I prefer.

This is scary. For the future of the language, this is an untenable position. While I absolutely enjoy working with the language, I’m afraid it is fated to be like Banquo in Shakespeare’s Macbeth: “thou shalt get kings, though thou be none”. Scala will inspire a great language and then die. Maybe it already has, and the clock is ticking. Some purport Kotlin to be the successor, but I wouldn’t bet on it just yet.

“Ah, but a man’s reach should exceed his grasp, or what’s a heaven for?”

Robert Browning

The thing about Scala is that this is a conscious design decision. The language is meant to have everything and the kitchen sink. Programming languages don’t have to be simple. Powerful languages are powerful tools: use them well and you can achieve greatness. You have to choose your tool set and hone it.

But with Haskell, Go, and Clojure, you use them and find yourself asking: what is the natural way to do this? Once you find it, you implement ideas using that philosophy, that natural way, and you’re no longer just using a tool. You’re using an idea.

# Imprecision and abstraction

What is the point of abstractions?

We want to hide things. We want to generalize things. We want to extend things.

Why are mathematical abstractions so intractable? Why is the Wikipedia page on functors incomprehensible to someone not used to mathematical formalisms? Why does it sound so vague?

When approaching abstractions for educational purposes, it is sometimes easier to think in analogies or similes. We can conceptualize functors as “procedures” that operate on things inside “boxes”, or we can study relational algebra using Venn diagrams.

These analogies are dangerous because they are vague. Formalisms leave no room for interpretation; they are exact not on a whim, but because of the pervasive imprecision of the human mind.

Let’s take an analogy that’s very approachable but quite dangerous. Modular arithmetic can be explained with clocks. The analogy goes like this:

You see, when you have a number modulo 12, think of it as a clock. If x is over 12, think of it like the hand of a clock looping over, and you’re back where you started.

The problem with such an analogy is that not everybody uses 12-hour clocks. Most of Europe uses a 24-hour clock with no distinction between AM and PM. Of course, they are also taught that “sixteen” means “four”, since nobody builds 24-hour analog clocks (yet). Even so, it’s entirely possible that someone accustomed to 24-hour clocks will be confused by the analogy, since for them what comes after 12 is 13, not 0.
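The definition itself needs no clock at all. A quick sketch in Python, using its `%` remainder operator:

```python
# 16 ≡ 4 (mod 12): the remainder after dividing out whole multiples of 12.
print(16 % 12)  # 4

# The 24-hour clock trips up the analogy, not the arithmetic:
print(13 % 12)  # 1, even though a 24-hour clock happily displays 13
```

The definition gives the same answer regardless of which clock you grew up with.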

This is a basic but fundamental example. Things like functors, semigroups, monads, and categories are intractable for a reason: there is no room left for interpretation.

Mathematical formalisms pare fundamental ideas into pure forms. Your intuition can’t get in the way and corrupt them.

The obvious downside is that these formalisms are harder to understand. I wager that this is for the better, because down the road there are concepts so high-level that one can’t even begin to think in analogies, and trying will only slow one down.

There was a turning point in my math studies when I stopped trying to grok things using analogies. My approach to topology was geometrical. I tried to visualize limit points in my mind, in vain, because the mind can’t bend itself around more than three spatial dimensions. Granted, visualizing hypercubes was possible (“like a cube is made of squares, a hypercube is made of cubes”)… kind of.

Stopping this perilous habit, I started memorizing laws instead. That changed the language of maths for me forever. I was no longer understanding relations via shapes or arrows, but through basic axioms and mathematical laws. Before long, I started to visualize concepts using these laws.

I stopped staring concepts in the eye, looking for hidden meanings behind bounded sets. I simply read the definition, thought “huh”, and memorized it. By building toward other related concepts and set theory, I came to understand what each law meant, without trying to grok a hidden meaning.

Once that became a habit, it became easy and changed my approach forever: let go of intuition. Abstract ideas are hard by definition and they need to be understood piece by piece, from the ground up.

This is why explaining a precise thought, like a mathematical formalism, through something imprecise, like an analogy, is doomed to fail.

When functional programmers try to explain basic ideas like semigroups or functors, they often find themselves in an apologetic mire of simplifications. This is doomed to fail. Give concrete examples of how a functor works. Don’t illustrate functors as operations that map stuff in boxes to other boxes. Invent a problem and solve it using a functor. After all, functors are such a basic concept that even those writing non-functional code end up using them all the time.
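For instance, here is an invented problem sketched in Python (data made up): format a list of prices for display. `map` over a list is a functor operation: it transforms the elements while preserving the structure.

```python
# An invented problem: convert prices in cents to display strings.
prices_cents = [1999, 2499, 599]

def display(cents):
    return f"${cents / 100:.2f}"

# map applies display to each element; the list shape is untouched.
print(list(map(display, prices_cents)))  # ['$19.99', '$24.99', '$5.99']
```

No boxes required: the concrete example shows what a functor does before anyone needs the general definition.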

Let the abstraction sink in, it’s the only thing that will survive.

# Are my services talking to each other?

I am faced with an interesting thought experiment, which asks:

If I can see two of my friends, and I know they should be communicating with each other, what is the simplest way of making sure they are doing so?

Your first instinct is to look at them and listen. But what if their method of communication is subtler than that? What if you are, metaphorically speaking, deaf, and cannot eavesdrop on their conversation?

A problem like this arises when you have a non-trivial number of distributed components talking to each other, forming a complex network. Let’s start from the basics and consider a simple one:

A → B → C (arrows indicate flows of information: x → y means x sends information to y)

You could assume A is an event log, for example of financial transactions; B is a message queue; and C is a fast queryable cache for the transactions. We want to be able to query the cache quickly for log events and rely on the message queue to transport them from A to C, preferably without a hard software dependency from A to C.

The illusion is that while there are neither code nor protocol dependencies between A and C, a semantic dependency exists: the one in our heads! A is content to dump information toward B, but what we’re really interested in is messages getting all the way through to C. So in reality, if we superimpose our perceived dependencies on top of the information flows, we end up with this:

## Tolerating faults

What if the chain breaks? What happens when A can’t push messages onward to B and we get a blackout? Who gets notified? C doesn’t know what’s happening in A; it’s simply not getting information! In line with the original question: if I can see that both A and C are doing fine, but they’re not talking to each other, where, or who, is the broken telephone?

With such a simple case as above, pointing this out is easy, so let’s make our network a bit more complicated.

A - an event log; B - a message queue; C - a cache; E - app back-end; P - a user-facing application; I - a business intelligence system; S - a storage system

Let’s assume each of these components is an independent service, each load balanced and with redundancies that aren’t visible beyond the node itself, and that communication happens over a computer network using some protocol.

The depicted network consists of a set of applications that all, in one way or another, build on top of an event log, A. In one branch, there’s a fast queryable cache for the transaction log, the app back-end is an interface to the cache (like a REST API), and the storage acts as a long-term backup system. The second branch consists of a business intelligence system that analyzes the event log data and does something with it.

Indirectly, there are dependency arrows emanating from the root of the network tree (A) to its leaves S, P, and I. From an observer’s perspective, these are the relationships that matter. These are the implicit dependencies. We can see those dependencies, but we build the code in such a way that it cannot! The event log simply dumps data to a message queue, and that’s it. Worse, the implicit dependencies propagate up the chain: a leaf node depends not only on the root node but also on every intermediate node.

Implicit dependencies

The inherent hazard in all this, of course, is a communication error. Even though we (hopefully) built the system following the robustness principle, data may stop flowing from the root node to the leaf nodes, and we have to quickly identify where the disconnect happened.

## Seeing is not enough

Our first instinct is to peer at the logs. So we go through each edge in the network and check for a fault. For n nodes in a tree, that means inspecting up to n-1 edges for every fault! Moreover, the problem isn’t solved by something that gives me visibility of the nodes, like ZooKeeper or other service discovery tools, because I am interested in the flow of information from one node to another. The thought experiment already assumes the nodes are there; only the communication between them is broken.
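To make the cost concrete, here is a sketch in Python with one plausible topology for the network above (the exact edges are my assumption, since the original diagram shows them only pictorially):

```python
# One plausible tree: A feeds B, B feeds C and I, C feeds E and S,
# E feeds P. (Assumed topology for illustration.)
edges = [("A", "B"), ("B", "C"), ("B", "I"),
         ("C", "E"), ("C", "S"), ("E", "P")]
nodes = {n for edge in edges for n in edge}

# A tree with n nodes has n - 1 edges: every fault hunt can mean
# checking each of them.
print(len(nodes), len(edges))  # 7 6
```

Seven services already mean six links to rule out, and real networks are rarely this small.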

In the Internet world, with the Transmission Control Protocol (TCP), communication is made reliable using error-checking and acknowledgments. If A were a network element sending things over to C, then on a successful delivery C would acknowledge it back to A.

For various reasons, this approach may not be feasible in a distributed service network. This is the cost of abstractions: when you enforce loose coupling, you have to deal with the consequences of looseness. We could make the transaction log aware of the user-facing application, but that may be overkill.

For the particular problem of acknowledging messages from a message-queue root to a consumer leaf, there are various solutions. You can implement this on your own, which, while laborious, essentially follows the principle of error-checking; the caveat is that the complexity grows with every new node. Another option is to use a message queue (one of these things is not like the others) that supports acknowledgments natively.
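A hand-rolled version might track outstanding messages with correlation IDs that the leaf echoes back on consumption. A minimal sketch in Python; all names here are invented for illustration:

```python
import time
import uuid

outstanding = {}  # correlation id -> (payload, time sent)

def send(payload, queue):
    """Producer attaches a correlation id and remembers the message."""
    cid = str(uuid.uuid4())
    outstanding[cid] = (payload, time.time())
    queue.append({"cid": cid, "payload": payload})
    return cid

def on_ack(cid):
    """The leaf consumer echoes the correlation id back when done."""
    outstanding.pop(cid, None)

def unacked(older_than_s):
    """Messages still waiting for an acknowledgment after a timeout."""
    now = time.time()
    return [c for c, (_, sent) in outstanding.items()
            if now - sent > older_than_s]
```

Every hop between producer and leaf is now covered by a single end-to-end check, at the price of the producer knowing that a leaf exists at all.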

## The rescue signal

We could build a centralized logging system to which each node logs its events, so that it contains all events from all nodes. To make the data meaningful, you need a way to determine the flow of information, that is, to group events together semantically. Worse, the system will require manual or semi-automated inspection to determine when an event is missing its acknowledgment, that is, when A logged sending Foo to the message queue but the app back-end E never processed it.

A system like this could work using an FRP approach: since FRP signals map exactly to discrete events, one could build a rule engine on top. By integrating time flow and compositional events, a centralized system could use its rule engine to listen to signals. A signal can be any event, e.g., a financial transaction that was logged into the event log. You can combine this signal with an event from a system that consumes transactions and does something with them, like the business intelligence system. The sum of these two signals implies that “a financial transaction was consumed by the business intelligence system”. This is also a signal!

Building an FRP-based rule engine isn’t easy: you’d need a rule engine that can map diverse data events into high-level signals, and then additional logic for summing the signals.

The sum of two signals is another signal. (Oh hey, this makes it a semigroup!)
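As a sketch in Python (the signal representation is my invention): combining two observed events that share a transaction id yields a derived, higher-level signal, and the combining operation is associative.

```python
def combine(sig_a, sig_b):
    """Associative 'sum' of two signals sharing a transaction id."""
    assert sig_a["tx_id"] == sig_b["tx_id"]
    return {"tx_id": sig_a["tx_id"],
            "seen_by": sig_a["seen_by"] | sig_b["seen_by"]}

logged   = {"tx_id": 7, "seen_by": {"A"}}  # logged in the event log
consumed = {"tx_id": 7, "seen_by": {"I"}}  # consumed by the BI system

derived = combine(logged, consumed)
# If both A and I have seen tx 7, the transaction flowed end to end.
print(derived["seen_by"] == {"A", "I"})  # True
```

The derived signal can itself be combined further, which is exactly what makes the rule engine compositional.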

Once such a system is built, it can be queried to determine the state of the network quite efficiently (and perhaps elegantly), but it introduces no fault tolerance, and it will only tell you where data is moving, not where it isn’t.

I guess most of this underlines the difficulties of unraveling a monolith into microservices. Keeping track of network traffic is really hard even at the hardware level (!), so when we push this abstraction up to the software level, it is no surprise that problems follow.

Playing with some toy solutions, I thought of something I call a shadow network. Let’s say our principal information source is an event monitor X, and we have a leaf node in the information dependency tree that is interested in data originating from X.

Each leaf node sends its data to the shadow node. The shadow node understands the data and can tell where it originated from, thereby seeing the implicit dependencies. The shadow node is effectively a mirror of the root node(s).

In the shadow network, X receives no new dependencies, nor do the intermediaries, but each leaf node pushes its actions to the shadow node. The shadow node contains a rule engine that can parse leaf events. A rule is something that identifies a source. It could be anything from a simple parser (“this looks like Apache logs” → “it came from Apache!”) to something more sophisticated. This introduces a dependency only on the leaf nodes, but the shadow node has to be kept up to date on how to map events to sources correctly: when you change the format of the data traveling across the network, you have to update the rule engine.
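A toy version of such a rule engine in Python; the rules and event formats here are hypothetical:

```python
import re

# Each rule maps an observed event line back to its presumed source.
rules = [
    (re.compile(r'^\d+\.\d+\.\d+\.\d+ - - \['), "apache"),
    (re.compile(r'^\{"tx_id":'), "event-log"),
]

def identify_source(event_line):
    """Return the first source whose rule matches the event."""
    for pattern, source in rules:
        if pattern.match(event_line):
            return source
    return "unknown"

print(identify_source('127.0.0.1 - - [10/Oct/2000] "GET /"'))  # apache
```

The fragility is visible right in the code: change the log format and every affected rule must change with it.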

Unfortunately, this doesn’t really help us: you can query the shadow node for the implied dependencies, but that’s it. So while it requires less effort to develop (disregarding cases where creating rules is difficult), it suffers from the same flaw as the centralized FRP engine: it can only tell when data is flowing, not when it isn’t.

Bolting temporal awareness onto the shadow network works if the data is supposed to be regular. If the consuming leaf expects a tick from the origin(s) every n seconds, the shadow rule engine can be made aware of this. If ticks aren’t happening when they are supposed to, you can raise a fault on the implicit dependency. Alas, this only works for regularly occurring data, so we’re out of luck with irregular events.
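The temporal rule reduces to a watchdog over last-seen timestamps; a sketch in Python with invented names:

```python
def find_faults(last_tick, now, period_s):
    """Report sources whose latest tick is older than the expected period."""
    return [src for src, seen in last_tick.items()
            if now - seen > period_s]

# Last observed tick per source, in seconds since some epoch:
# A ticked 3 s ago (fine), I ticked 9 s ago (overdue).
last_tick = {"A": 103.0, "I": 97.0}
print(find_faults(last_tick, now=106.0, period_s=5))  # ['I']
```

The fault lands on the implicit dependency, which is exactly the edge an observer cares about, but only because the tick period gave us something to expect.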