Categories
elixir gitlab

Simple lesson from writing my first Elixir app

This may come as a surprise, as I haven’t mentioned it here before, but one of my hobbies is economics. The other one is programming, which is kind of obvious. I enjoy teaching as well, and I run a Coding Dojo in my town, but that is a story for another time. These three things, programming, economics, and teaching, mixed together with an idea, might create something interesting.

A few months ago I decided to start calculating an economic indicator called the [Misery Index](https://en.wikipedia.org/wiki/Misery_index_(economics)) for Poland. It is simple, easy to understand, and the data required to calculate it is available from the Polish Bureau of Statistics (GUS).

With this idea in mind I created https://jakjestw.pl. For the first few months I updated the data by hand. New data is not released often, so I could simply check for it and apply the changes myself. That was fine at first but became cumbersome. I happen to fancy Elixir, and to test it out on something real I decided to create a poor man’s static site generator and data parser.

The requirements were ridiculously simple. The app had to fetch data from the GUS website, which came in JSON format, then transform it into something that could be injected into an HTML file. Since I was done with manual labour, it had to run on GitLab pipelines.

Easier said than done. My main language is Python, which influenced how I chose to model my data. This was my undoing. I picked really poorly, following the gut instinct of a Python programmer: maps, tuples, and lists were my choice. In Python that makes sense; for this kind of data transformation it might not be the best fit, but a dict is still the go-to structure. My Elixir data looked like this (what a lovely year for Polish wallets, by the way):

    %{
      2015 => [
        {12, %{cpi: -0.5}},
        {11, %{cpi: -0.6}},
        {10, %{cpi: -0.7}},
        {9, %{cpi: -0.8}},
        {8, %{cpi: -0.6}},
        {7, %{cpi: -0.7}},
        {6, %{cpi: -0.8}},
        {5, %{cpi: -0.9}},
        {4, %{cpi: -1.1}},
        {3, %{cpi: -1.5}},
        {2, %{cpi: -1.6}},
        {1, %{cpi: -1.4}}
      ]
    }

My website displays the latest number, which is recalculated each day. It also explains how the number is calculated by showing both components of the indicator, CPI and unemployment. The last thing is a comparison of the last four years, using data from the last month of each year. Not a perfect approach, but it will do for a comparison.
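For reference, the Misery Index is simply the sum of the inflation rate and the unemployment rate. A minimal sketch of that calculation follows; the module name `MiserySketch` and the input values are hypothetical and are not taken from the site's code or from GUS data.

```elixir
defmodule MiserySketch do
  # Misery index = inflation rate (CPI change) + unemployment rate,
  # both expressed in percent. Rounded to one decimal place to avoid
  # floating-point noise in the displayed number.
  def index(cpi, unemployment) when is_number(cpi) and is_number(unemployment) do
    # Adding 0.0 guarantees a float, which Float.round/2 requires.
    Float.round(cpi + unemployment + 0.0, 1)
  end
end

# Hypothetical inputs, for illustration only.
misery = MiserySketch.index(2.5, 6.0)
IO.inspect(misery)
```

With these made-up inputs the result is 8.5.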

Extracting such information from the nested map shown earlier requires a lot of effort. A lot more than I expected, and I told myself that this was because I’m not fluent in Elixir. After I had finished, I realised that it’s not me, it’s my data structure. Which is my fault, but it’s not me.

That sparked the idea of changing my data structure to something that map/reduce can handle more easily. This time, with some experience in processing data in pipelines, I decided to skip the nested structures and use flat data, a plain list, with proper date objects.

    [
      [~D[2016-12-01], {:unemployment, 8.2}],
      [~D[2016-11-01], {:unemployment, 8.2}],
      [~D[2016-10-01], {:unemployment, 8.2}],
      [~D[2016-09-01], {:unemployment, 8.3}],
      [~D[2016-08-01], {:unemployment, 8.4}],
      [~D[2016-07-01], {:unemployment, 8.5}],
      [~D[2016-06-01], {:unemployment, 8.7}],
      [~D[2016-05-01], {:unemployment, 9.1}],
      [~D[2016-04-01], {:unemployment, 9.4}],
      [~D[2016-03-01], {:unemployment, 9.9}],
      [~D[2016-02-01], {:unemployment, 10.2}],
      [~D[2016-01-01], {:unemployment, 10.2}]
    ]

This is perfect for map/reduce/filter operations. Claiming that the code is simpler would not mean much coming from me, as I have spent a lot of time with it. A more helpful metric is the number of added and removed lines. In total I removed 409 lines and added 244, which is 165 fewer lines than before. After excluding lines that changed in tests we get 82 removed and 67 added, which is around 25% less code doing the same thing. That is good news, but LOC counts alone can be misleading, as not all lines are equal. So here is the code before:

    def second_page(all_stats) do
      Enum.to_list(all_stats)
      |> Enum.map(fn {x, data} -> for d <- data, do: Tuple.insert_at(d, 0, x) end)
      |> List.flatten()
      |> Enum.sort(fn x, y -> elem(x, 0) >= elem(y, 0) && elem(x, 1) >= elem(y, 1) end)
      |> Enum.find(fn x -> map_size(elem(x, 2)) == 2 end)
      |> elem(2)
      |> Map.to_list()
    end

And after.

    def second_page(all_stats) when is_list(all_stats) do
      Enum.drop_while(all_stats, fn e -> length(e) < 3 end)
      |> hd
      |> tl
    end

This is the most striking example from the codebase of how much a change of data structure can simplify the code.
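The year-by-year comparison (the last month of each year) also fits the flat list well. The following is a minimal sketch, assuming each entry is a list of the form `[date | measurements]` as shown earlier; the module and function names are hypothetical, not the ones used in the project, and it relies on the module-based sorters available since Elixir 1.10.

```elixir
defmodule FlatStatsSketch do
  # Returns the latest entry of each year, newest year first.
  # Each entry is assumed to be a list: [date | measurements].
  def last_month_per_year(all_stats) when is_list(all_stats) do
    all_stats
    |> Enum.group_by(fn [date | _] -> date.year end)
    |> Enum.map(fn {_year, entries} ->
      # Passing Date as the sorter compares dates semantically.
      Enum.max_by(entries, fn [date | _] -> date end, Date)
    end)
    |> Enum.sort_by(fn [date | _] -> date end, {:desc, Date})
  end
end

# A few values from the examples above, converted to the flat format.
stats = [
  [~D[2016-12-01], {:unemployment, 8.2}],
  [~D[2016-11-01], {:unemployment, 8.2}],
  [~D[2015-12-01], {:cpi, -0.5}],
  [~D[2015-11-01], {:cpi, -0.6}]
]

yearly = FlatStatsSketch.last_month_per_year(stats)
IO.inspect(yearly)
```

For this sample only the two December entries remain, with 2016 first.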

TIL:

My main takeaway from this experience is that mistakes at the start of a project may lead to disastrous consequences later on. The time spent on design, including writing throwaway code when doing spikes, is the best investment you can make. Think about it before you start.

P.S.

The code is up on GitLab, feel free to look and comment.

Categories
elixir patterns

Actor model

Motivation

One of my resolutions this year is to review my notes from each conference I visit and read/learn/build something. Quite recently I went to Lambda Days in Kraków for the second time, and there I got interested in Elixir. It is functional, which is dear to me, but more importantly it is built for concurrent computation. This is achieved through a shared-nothing architecture and the actor model. It means that each actor, or process in Elixir land, is independent and shares no memory or disk storage with the others. Communication between processes is achieved by sending messages. This allows building concurrent systems with less effort. Since it is so helpful, I’ll try to understand what an actor is and hopefully explain it here.

This is not a full explanation by any means, but rather a primer before I implement my latest project using this computation model. I may well revise what I know in a later post.

What is the Actor Model

The actor model is a model of concurrent computation. An actor has the following properties, or axioms. (I have shuffled them a bit to emphasise messaging, which is IMHO the important part of this model.)

  • Can designate how to handle next received message
  • Can create actors
  • Can send messages to actors

Let’s unpack those properties to make them clearer. "Can designate how to handle next received message" means that actors communicate with messages. Each actor also has an address to which messages can be sent. It is up to the actor how it will respond, if at all.

"Can create actors" is pretty simple: each actor can spawn other actors if the task at hand requires it.

"Can send messages to actors": as mentioned while describing the first axiom, communication is done via messages. Actors send messages to each other.

One actor alone does not really make an actor model; as mentioned in one of the articles, actors come in systems.
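The three axioms above can be sketched with plain Elixir processes. This is a minimal illustration, not a production pattern: the module `ActorSketch` and the message shapes `{:increment, n}`, `{:get, pid}`, and `{:count, n}` are made up for this example, and the calling process plays the role of a second actor.

```elixir
defmodule ActorSketch do
  # A counter actor. Its state lives only in the argument of this
  # receive loop, so nothing is shared with other processes.
  def counter(count) do
    receive do
      {:increment, by} ->
        # Designate how the next message is handled:
        # loop again with the updated state.
        counter(count + by)

      {:get, caller} ->
        # Send a message to another actor (the caller).
        send(caller, {:count, count})
        counter(count)
    end
  end
end

# Create an actor; spawn/3 returns its address (a pid).
pid = spawn(ActorSketch, :counter, [0])

send(pid, {:increment, 2})
send(pid, {:increment, 3})
send(pid, {:get, self()})

# Wait for the reply, giving up after one second.
result =
  receive do
    {:count, value} -> value
  after
    1_000 -> :timeout
  end

IO.inspect(result)
```

Messages between the same pair of processes arrive in the order they were sent, so the reply here carries the count 5.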

This is short and simple; it is the gist of it, with a focus on the most important parts of the actor model. What I find valuable is that this model brings software failures to the front and forces the solution designer to appreciate and expect them.

I find it similar to OOP as conceived by Alan Kay, who described it this way:

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I’m not aware of them.

When to use it?

Consider it if you have a task that can be split into stages, whether linked or independent. In such cases I find actors more palatable than locks and threads. This is the kind of problem that the Broadway library for Elixir is trying to solve. The actor model may also be useful when thinking about OOP: it might not be possible to implement actors that are as independent as the model expects, but thinking in such terms may improve the resilience of a project.
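To make the idea of stages concrete, here is a minimal sketch of a two-stage pipeline built from plain processes. It does not use Broadway and is not how Broadway works internally; the module `StageSketch` and the `{:item, value}` message shape are assumptions made for this illustration.

```elixir
defmodule StageSketch do
  # A stage applies `fun` to every item it receives and forwards
  # the result to `next`, the pid of the following stage.
  def start(fun, next) do
    spawn(fn -> loop(fun, next) end)
  end

  defp loop(fun, next) do
    receive do
      {:item, item} ->
        send(next, {:item, fun.(item)})
        loop(fun, next)
    end
  end
end

# Build the pipeline back to front: the current process
# collects the results of the last stage.
stage2 = StageSketch.start(fn x -> x + 1 end, self())
stage1 = StageSketch.start(fn x -> x * 2 end, stage2)

Enum.each([1, 2, 3], fn x -> send(stage1, {:item, x}) end)

# Collect three results, giving up on each after one second.
results =
  for _ <- 1..3 do
    receive do
      {:item, value} -> value
    after
      1_000 -> :timeout
    end
  end

IO.inspect(results)
```

Each stage holds no shared state and knows only the address of the next one, so a stage can be replaced or restarted without touching the others. No locks are involved; the ordering of results comes from message ordering between each pair of processes.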

Resources

I know I have only skimmed this topic; if you are interested, please have a look at the resources I used to grasp the idea of the actor model.