Clojure Flow - First Impressions
Clojure’s core.async library recently added a namespace called ‘flow’.
The point of flow is to provide abstractions for building composable systems in which separate processes can be easily connected together. It is supposed to make these kinds of stateful, ‘parallel’ systems more ‘functional’, and therefore easier to reason about and test.
The flow docs state that:
The fundamental objective of core.async.flow is to enable a strict separation of your application logic from its topology, execution, communication, lifecycle, monitoring and error handling, all of which are provided by and centralized in, c.a.flow, yielding more consistent, robust, testable, observable and operable systems.
I was curious to try this out, and I had also recently come across Zonestream, a project from OpenINTEL that provides a Kafka stream of real-time DNS zone changes.
I used flow to build a web page that reads newly registered domains from this stream and then groups them by things like TLD, or counts the number of domains per hour. You can view the results at eoin.site/fl. Not sure how long I’ll leave it running for.
The point of this wasn’t necessarily to gather useful or analysable data; it was more to try to create something ‘real-time’ that might fit well as a use case for flow. The data here is intentionally quite ‘ephemeral’; I’m not saving or collecting anything.
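To give a feel for what the ‘group by TLD’ part looks like, here is a minimal sketch of a step function in the four-arity shape described in the flow docs (describe, init, transition, transform). It is not the code running on the site: the names domain->tld and tld-counter, the :domains and :counts ids, and the assumption that each message is a plain domain string are all made up for illustration.

```clojure
(require '[clojure.string :as str])

(defn domain->tld
  "Returns the last label of a domain name, lower-cased."
  [domain]
  (-> domain
      str/lower-case
      (str/replace #"\.$" "")   ; drop a trailing root dot, if any
      (str/split #"\.")
      last))

(defn tld-counter
  "Hypothetical step function that keeps a running count per TLD."
  ;; describe: document the inputs and outputs of the process
  ([] {:ins  {:domains "newly registered domain names (assumed to be strings)"}
       :outs {:counts "map of TLD -> running count"}})
  ;; init: build the initial state from the args
  ([_args] {:counts {}})
  ;; transition: nothing to do on lifecycle changes in this sketch
  ([state _transition] state)
  ;; transform: update the state and emit the new counts on :counts
  ([state _in domain]
   (let [state' (update-in state [:counts (domain->tld domain)] (fnil inc 0))]
     [state' {:counts [(:counts state')]}])))
```

Because the step function is just a function, it can be poked at in the REPL without starting any processes or channels:

```clojure
(let [s0        (tld-counter {})
      [s1 _]    (tld-counter s0 :domains "example.com")
      [s2 out]  (tld-counter s1 :domains "Foo.COM.")]
  [(:counts s2) out])
```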
Overall, I thoroughly enjoyed working with the new flow namespace. Once you understand the basic points about a process, it is very easy and intuitive to use.
One of the main things I really loved was how interactive development was. As someone who uses Clojure a lot in hobby projects (I am not a professional developer), I am of course well used to a high level of interactivity through the REPL. However, this was like expanding that way of thinking to the level of a ‘system’ of separate, concurrent processes. Flow handles all of the backend plumbing in terms of managing channels, so you can easily “wire” different components together, and start, stop or edit them in real time, without interrupting the overall “flow” of the system itself.
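As a rough illustration of that “wiring”, here is a small sketch of two step functions and the connection between them, with the flow calls kept in a comment block for evaluating at the REPL. The process names (:tld, :printer) and step functions are hypothetical, and the calls assume a core.async release that includes the flow namespace and accepts a step function in flow/process; check the docs for the version you have.

```clojure
;; Hypothetical step: emits the last label (TLD) of each domain it receives.
(defn tld-step
  ([] {:ins {:in "domain names"} :outs {:out "TLD strings"}})
  ([_args] {})
  ([state _transition] state)
  ([state _in domain] [state {:out [(re-find #"[^.]+$" domain)]}]))

;; Hypothetical step: prints whatever arrives and emits nothing.
(defn print-step
  ([] {:ins {:in "anything printable"}})
  ([_args] {})
  ([state _transition] state)
  ([state _in msg] (println "tld:" msg) [state nil]))

;; The topology is plain data: [[from-pid out-id] [to-pid in-id]].
(def conns
  [[[:tld :out] [:printer :in]]])

(comment
  ;; Evaluate these forms one at a time at the REPL.
  (require '[clojure.core.async.flow :as flow])

  (def g (flow/create-flow
          {:procs {:tld     {:proc (flow/process #'tld-step)}
                   :printer {:proc (flow/process #'print-step)}}
           :conns conns}))

  (flow/start g)    ; processes start paused
  (flow/resume g)   ; let messages move
  (flow/inject g [:tld :in] ["example.com" "eoin.site"])
  (flow/pause g)
  (flow/stop g))
```

Passing the step functions as vars (#'tld-step) is a choice made here so that re-evaluating a defn at the REPL can be picked up by a running flow; treat that as an assumption to verify rather than a guarantee.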
It is also great for thinking of ways to reuse signals and data from one process in other processes. Processes can be connected and interconnected quite seamlessly.
I’m not sure how it compares to other approaches in terms of performance. I’m not particularly experienced with websockets or even web servers, and I am sure I missed plenty of optimisations and that better design patterns could have been chosen in my particular experiment with flow. I was also a bit confused about how best to package it as a Java application to run in production (and whether there is any way to retain the interactivity when running from a jar). I am sure I made lots of mistakes in designing the ‘flow’, but this was just a first look, and I hope to explore it more in the future.