Categories: patterns

Multi-level cache with DAPR

Quite recently at work, I was tasked with improving the response time of our main service. The service was pretty slow for the work it was doing. There was some low-hanging fruit, and the service got substantially faster without much effort. The next step was to reduce the number of external calls. These calls can be cached, as the responses from the external service do not change that often. I chose a multi-level cache to reduce the latency as much as possible.

Categories: microservices, patterns

Cloud agnostic apps with DAPR

DAPR is cool. As stated on the website, it provides “APIs for building portable and reliable microservices”; it works with many clouds and external services. As a result, you only need to configure the services and then use the DAPR APIs. It is true, and I’ll show you. You will find the code for this article here. It is a must-have tool when working with microservices.

Applications can use DAPR as a sidecar container or as a separate process. I’ll show you a local version of the app, where DAPR is configured to use Redis running in a container. The repo has Azure, AWS, and GCP configurations as well.

Before we can start the adventure you have to install DAPR.

Running app locally

Configuration

You have to start off on the right foot. Because of that, we have to configure a secrets store where you can keep passwords and the like. I could skip this step, but then there is a risk someone would never find out how to do it properly and would ship passwords in plain text. Here we go.

# save under ./components/secrets.yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: local-secret-store
  namespace: default
spec:
  type: secretstores.local.file
  version: v1
  metadata:
  - name: secretsFile
    value: ../secrets.json

The file secrets.json should hold all your secrets, like connection strings, username and password pairs, etc. Don’t commit this file.

{
    "redisPass": "just a password"
}

The next file is the publish/subscribe configuration. Dead simple, but I’d recommend going through the docs as there is much more to pub/sub. Here you can reference your secrets, as shown.

# save under ./components/pubsub.yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: order_pub_sub
spec:
  type: pubsub.redis
  version: v1
  metadata:
  - name: redisHost
    value: localhost:6379
  - name: redisPassword
    secretKeyRef:
      name: redisPass
      key: redisPass
auth:
  secretStore: local-secret-store

Publisher and subscriber

With config out of the way, the only things left are the publisher part and the subscriber part. As mentioned before, the app you write talks to the DAPR API. This means you may use HTTP calls or the Dapr client. The best part is that no matter what is on the other end, be it Redis or PostgreSQL, your code will not change even when you swap your external services.
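If you go the raw HTTP route, publishing is a single POST to the sidecar. A minimal sketch, assuming the sidecar listens on its default HTTP port 3500 and the component defined above:

import requests

# Publish an event through the Dapr sidecar's HTTP API.
# Port 3500 is the default; pass --dapr-http-port to `dapr run` to change it.
requests.post(
    "http://localhost:3500/v1.0/publish/order_pub_sub/orders",
    json={"product": "falafel"},
)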

Here goes the publisher that will send events to a topic. The topic can be hosted anywhere; here is a list of supported brokers. The list is long, however only 3 are stable. I really like how DAPR is approaching component certification though. There are well-defined requirements to pass to advance from Alpha to Beta, and finally to Stable.

# save under ./publisher/app.py

import logging

from dapr.clients import DaprClient
from fastapi import FastAPI
from pydantic import BaseModel

logging.basicConfig(level=logging.INFO)


app = FastAPI()


class Order(BaseModel):
    product: str


@app.post("/orders")
def orders(order: Order):
    logging.info("Received order")
    with DaprClient() as dapr_client:
        dapr_client.publish_event(
            pubsub_name="order_pub_sub",
            topic_name="orders",
            data=order.json(),
            data_content_type="application/json",
        )
    return order

Here is the consumer.

# save under ./consumer/app.py

from dapr.ext.fastapi import DaprApp
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
dapr_app = DaprApp(app)


class CloudEvent(BaseModel):
    datacontenttype: str
    source: str
    topic: str
    pubsubname: str
    data: dict
    id: str
    specversion: str
    tracestate: str
    type: str
    traceid: str


@dapr_app.subscribe(pubsub="order_pub_sub", topic="orders")
def orders_subscriber(event: CloudEvent):
    print("Subscriber received : %s" % event.data["product"], flush=True)
    return {"success": True}

Running the apps

Now you can run both apps together in separate terminal windows and see how they talk to each other using the configured broker. For this example, we are using Redis as the broker. You will see how easy it is to run them on different platforms.

In the first terminal run the consumer.

$ dapr run --app-id order-processor --components-path ../components/ --app-port 8000 -- uvicorn app:app

In the other terminal run the producer, under its own app id.

$ dapr run --app-id order-publisher --components-path ../components/ --app-port 8001 -- uvicorn app:app --port 8001

After you make an HTTP call to the producer, you should see both of them producing log messages as follows.

$ http :8001/orders product=falafel

# producer
== APP == INFO:root:Received order
== APP == INFO:     127.0.0.1:49698 - "POST /orders HTTP/1.1" 200 OK

# subscriber
== APP == Subscriber received : falafel
== APP == INFO:     127.0.0.1:49701 - "POST /events/order_pub_sub/orders HTTP/1.1" 200 OK

Running app in the cloud

It took us a bit to reach the crux of this post. We had to build something and run it, so that we can now run it in the cloud. The example above will run in the cloud with a simple change of configuration.

The simplest configuration is for Azure. Change your pubsub.yaml so it looks as follows, and update your secrets.json as well.

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: order_pub_sub
spec:
  type: pubsub.azure.servicebus
  version: v1
  metadata:
  - name: connectionString
    secretKeyRef:
      name: connectionStrings:azure
      key: connectionStrings:azure
auth:
  secretStore: local-secret-store

Your secrets.json should now look like this:

{
  "connectionStrings": {
    "azure": "YOUR CONNECTION STRING"
  }
}

Rerun both commands in the terminals and the output will look the same as with the local env, but the app will be running on Azure Service Bus.

Bloody magic if you ask me. You can mix and match your dependencies without changing your application. In some cases you may even use features available only in a particular cloud, like message routing based on the message body in Azure Service Bus. That will be another post though.

Here is the repo for this post; it includes all the providers listed below:

  • Azure
  • Google Cloud
  • AWS

Please remember to update your secrets.json.

Have fun 🙂

Categories: microservices, patterns

GitOps workflow

I have been using GitOps in my last project and I like the way it changed my workflow. It had to change, as in the world of microservices the old ways have to go. This is not a post about whether that is good or bad; I may write one someday. This post is about GitOps. If you do not know what GitOps is, read here. TLDR version: in practice, each commit deploys a new version of a service to one’s cluster.

Going back to the subject of the workflow: I’ll focus on the microservices workflow, as that is where in my opinion GitOps is extremely useful. One of the main pain points of microservices architecture is deployment. When deployments are mentioned, you instinctively think about deploying the application. That may be difficult, but it is not as difficult as creating developer and QA environments.

Here comes GitOps. When applied to your project, you immediately get a new service deployment per commit. This applies to each service you have. Having this at your disposal, you can set up your application in numerous combinations of versions. You can also easily replace one of the services in your stack. Sounds good, but nothing beats a demo, so here we go.

Demo

Let’s say I have a task of verifying whether our changes to one of the microservices are correct. I’m using Rio to manage my Kubernetes cluster, as it makes things smoother. A change in one service affects another service, and I have to verify it using the UI. This adds up to 3 services deployed in one namespace and configured so they talk to each other. After I push a commit to the service repository, a namespace is created on the cluster. Each commit creates a new version of the service.

% rio -n bugfix ps -q
bugfix:app

Now I need to add the missing services, and I can do it by branching off from master. The name of the branch must match in all the services involved.

% cd other_services && git checkout -b bugfix 
% git push

After pushing the changes Rio adds them to the same namespace.

% rio -n bugfix ps -q
bugfix:app
bugfix:web
bugfix:other_app

One thing left is to wire them up so the services talk to each other. As I’m following the recommendations from https://12factor.net/config, it is dead easy, and I can use Rio to do it. The edit command allows me to modify the environment variables of each service.

% rio -n bugfix edit web

This opens up your favourite text editor, where you can edit the variables and set up where the web app can find the other services. You can make the same changes in other services if necessary.
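On the application side this costs nothing, because a 12factor-style service reads such settings from the environment at startup. A tiny sketch, with the variable name OTHER_APP_URL made up for illustration:

import os

# Twelve-factor config: the address of a dependent service comes from the
# environment, so Rio (or any platform) can rewire it per namespace.
OTHER_APP_URL = os.environ.get("OTHER_APP_URL", "http://localhost:8080")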

I have wired up the services, they are talking to each other, and I can proceed with my work. This is my workflow using GitOps, Rio, and microservices.

Categories: elixir, patterns

Actor model

Motivation

One of my resolutions this year is to review my notes from each conference I visit and read/learn/build something. Quite recently I have been, for the second time, to Lambda Days in Kraków, where I got interested in Elixir. It is functional, which is dear to me, but more importantly it is built for concurrent computation. This is achieved by a shared-nothing architecture and the actor model. It means that each actor, or process in Elixir land, is independent and shares no memory or disk storage. Communication between processes is achieved by sending messages. This allows building concurrent systems with less effort. Since it is so helpful, I’ll try to understand what an actor is and hopefully explain it here.

This is not a full explanation by any means but rather a primer before I implement my latest project using this computation model. It is possible that my knowledge will be revised along with a new post.

What is the Actor Model

The actor model is a model of concurrent computation. An actor has the following properties, or axioms (I have shuffled them a bit to emphasise messaging as, IMHO, an important part of this model):

  • Can designate how to handle next received message
  • Can create actors
  • Can send messages to actors

Let’s unpack those properties to make them clearer. “Can designate how to handle next received message”: actors communicate with messages. Each actor has an address as well, where messages can be sent. And it is up to the actor how it will respond, if at all.

"Can create actors" is pretty simple, each actor can spawn other actors if required by performed task.

"Can send messages to actors" as mentioned while describing first axiom communication is done via messages. Actors send messages to each other.

One actor is not really an actor model; as mentioned in one of the articles, actors come in systems.
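To make the axioms concrete, here is a toy actor sketched in Python (the lingua franca of this blog) rather than Elixir. It shows private state, a mailbox, message-only communication, and designating new behaviour for the next message:

import queue
import threading
import time


class Actor:
    # A toy actor: private state, a mailbox, and message-only communication.
    def __init__(self, handler):
        self._mailbox = queue.Queue()
        self._handler = handler
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, message):
        # The mailbox (address) is the only way to interact with an actor.
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()
            # A handler may send messages, spawn new actors, or return
            # a new handler for the *next* message (the first axiom).
            new_handler = self._handler(message)
            if new_handler is not None:
                self._handler = new_handler


def counting(count=0):
    # Designates new behaviour for each received message.
    def handle(message):
        print("got", message, "count =", count + 1)
        return counting(count + 1)
    return handle


counter = Actor(counting())
counter.send("hello")
counter.send("world")
time.sleep(0.1)  # give the daemon thread a moment to drain the mailbox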

This is short and simple, but it is the gist of it, with a focus on the most important parts of the actor model. What I find valuable is the fact that this model brings software failures to the front and forces the solution designer to appreciate and expect them.

I find it similar to OOP as conceived by Alan Kay and described here:

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I’m not aware of them.

When to use it?

If you have a task that can be split into stages, which may be linked or independent, I find actors more palatable than locks and threading. This is the kind of thing the Broadway library for Elixir is trying to solve. The actor model may also be used when thinking about OOP; it might not be possible to implement actors in such a way that they are independent at the level this model expects, but thinking in such terms may improve the resilience of the project.

Resources

I know I have only skimmed this topic; if you are interested, please have a look at the resources I used to grasp the idea of the actor model.

Categories: patterns, python

Iterator fun

PyCon US 2017 finished more than a month ago. By the miracle of technology, everyone not able to attend can watch all the talks conveniently in one’s own home. So I did, starting with a list of talks recommended in one of the recent episodes of the Talk Python To Me podcast.

At this point it’s easy to guess that this post will be about PyCon, and most probably about one of the talks I enjoyed. The talk I’d like to focus on is the Instagram Keynote delivered by Lisa Guo and Hui Ding. Really good performance and really good slides; in my opinion, top-notch delivery.

Both speakers talked about the migration of Instagram from Legacy Python (2.7.x) to Modern Python (>3.5). A massive user base and codebase, outdated third-party packages without Python 3 support, an outdated (super old) Django, and Legacy Python. Sounds like a typical software endeavour, except maybe for the user base being massive. When speaking of challenges, Lisa mentioned something that really surprised me: an issue with iterators and the builtin function any. Here is the slide she used to illustrate the problem, and below is the code in question.

CYTHON_SOURCES = ["a.pyx", "b.pyx", "c.pyx"]
builds = map(BuildProcess, CYTHON_SOURCES)
while any(not build.done() for build in builds):
    pending = [build for build in builds if not build.started()]
    <do some work>

This piece of code works fine under Legacy Python, but when run under Modern Python the first element of builds is lost in the while loop: the builds sequence inside of while is missing an element. The cause is the change of the return value of the map function in Modern Python from a list into an iterator. This was something new to me, and I really had to know why the first value is lost.

Why does it happen?

It is much easier to reason about this with a simpler code example.

Python 3.5.3

>>> m = map(str, [1, 2])
>>> any(m)
True
>>> print(list(m))
['2']

Puzzling. I was wondering if the same thing would happen with Legacy Python. In order to reproduce the behaviour there, the result of map has to be converted into an iterator.

Python 2.7.13

>>> m = iter(map(str, [1, 2]))
>>> any(m)
True
>>> print(list(m))
['2']

This behaviour is shared among Python versions and is most likely intended, although not documented, I guess. But why does this happen? The answer, as usual, requires us to look into how Python is implemented; the implementation of the any function needs to be examined. So whip out the Python/bltinmodule.c file; it can be from any version of Python. Below is the mentioned function taken from the source of Python 3.5.1, as it’s the most recent source I have.

static PyObject *
builtin_any(PyModuleDef *module, PyObject *iterable)
{
    PyObject *it, *item;
    PyObject *(*iternext)(PyObject *);
    int cmp;

    it = PyObject_GetIter(iterable);
    if (it == NULL)
        return NULL;
    iternext = *Py_TYPE(it)->tp_iternext;

    for (;;) {
        item = iternext(it);
        if (item == NULL)
            break;
        cmp = PyObject_IsTrue(item);
        Py_DECREF(item);
        if (cmp < 0) {
            Py_DECREF(it);
            return NULL;
        }
        if (cmp == 1) {
            Py_DECREF(it);
            Py_RETURN_TRUE;
        }
    }
    Py_DECREF(it);
    if (PyErr_Occurred()) {
        if (PyErr_ExceptionMatches(PyExc_StopIteration))
            PyErr_Clear();
        else
            return NULL;
    }
    Py_RETURN_FALSE;
}

Looking at the body of any, we can tell that there is no difference whether you call it with an iterator or an iterable. Both objects can be iterated on; the difference is in what PyObject_GetIter returns when a list or an iterator is used.
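You can observe that difference from Python itself, as iter() is the entry point to PyObject_GetIter: a list hands out a fresh iterator every time, while an iterator simply returns itself.

>>> lst = [1, 2]
>>> iter(lst) is iter(lst)
False
>>> it = iter(lst)
>>> iter(it) is it
True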

As we know, iterators work tirelessly until exhausted, but a list is not an iterator. A list is an iterable. A list, when iterated, returns a fresh iterator each time, and thus is never exhausted. Each for loop operating on a list gets a fresh iterator with all the values. In our case the important lines are these:

    it = PyObject_GetIter(iterable);  // This retrieves an iterator
    ...
    
    item = iternext(it);  // This consumes an element

any returns True if any element is true. When running on Modern Python, any will consume each element up to the first one that is true. The cause is the loop and the iternext call that successively consumes elements from the iterator. This simple example illustrates the behaviour.

Python 3.5.3

>>> m = map(bool, [False, False, True, False])
>>> any(m)
True
>>> print(list(m))
[False]
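For contrast, run the same thing on a list: any still consumes a fresh iterator internally, but the list itself keeps all its elements.

>>> m = list(map(bool, [False, False, True, False]))
>>> any(m)
True
>>> print(m)
[False, False, True, False]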

One other function came to my mind: how would all behave? Its mechanics seem to be the same as with any. The answer is pretty simple when we look at the source.

static PyObject *
builtin_all(PyModuleDef *module, PyObject *iterable)
{
    PyObject *it, *item;
    PyObject *(*iternext)(PyObject *);
    int cmp;

    it = PyObject_GetIter(iterable);
    if (it == NULL)
        return NULL;
    iternext = *Py_TYPE(it)->tp_iternext;

    for (;;) {
        item = iternext(it);
        if (item == NULL)
            break;
        cmp = PyObject_IsTrue(item);
        Py_DECREF(item);
        if (cmp < 0) {
            Py_DECREF(it);
            return NULL;
        }
        if (cmp == 0) {
            Py_DECREF(it);
            Py_RETURN_FALSE;
        }
    }
    Py_DECREF(it);
    if (PyErr_Occurred()) {
        if (PyErr_ExceptionMatches(PyExc_StopIteration))
            PyErr_Clear();
        else
            return NULL;
    }
    Py_RETURN_TRUE;
}

As we can see, this function also loops over the iterable using an iterator. The condition, however, is slightly different, as it needs to verify that all elements are true. This will cause all to consume each element up to the first false element. Again, a simple example illustrates this best.

>>> m = map(bool, [True, True, True, False, True])
>>> all(m)
False
>>> print(list(m))
[True]
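And that points to the simple remedy for code like the Instagram snippet: materialize the map result once, so every loop (and every call to any or all) operates on a list rather than competing for elements of a single shared iterator. A sketch against the slide’s code, with BuildProcess standing in for whatever the real class was:

builds = list(map(BuildProcess, CYTHON_SOURCES))  # a list, not a one-shot iterator
while any(not build.done() for build in builds):
    pending = [build for build in builds if not build.started()]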

It had been bugging me since I watched the video; now I have my closure. I learned a bit while investigating this, and I’m happy to share it.

P.S. Yes, it is Legacy Python.