Gontroller: a Go library to create reliable feedback loop controllers

Xabier Larrakoetxea
Published in Spotahome Product
Sep 6, 2019

Say hello to Gontroller, a Go library to create controllers (without the need of Kubernetes resources).

If you are reading this and you know the Kubernetes and CNCF ecosystem, you may be wondering: another library to create Kubernetes controllers?!

Not at all… Don’t worry, you can keep reading whether or not you are familiar with the Kubernetes ecosystem.

Gontroller has nothing to do with Kubernetes. Well, that’s not 100% true… it implements the feedback loop (control loop) pattern in a way similar to Kubernetes controllers/operators.

Before explaining what this means, what Gontroller can solve and how it can be used, let’s start with a bit of control theory and feedback loops.

Control theory

https://pixabay.com/photos/feedback-review-gut-bad-3677258/

So what is this control theory thing? The book Feedback Control for Computer Systems describes feedback like this:

Feedback works by constantly comparing the actual behavior of a system
to its desired behavior. If the actual behavior differs from the desired
one, a corrective action is applied to counteract the deviation and
drive the system back to its target. This process is repeated constantly,
as long as the system is running.

We could say that a controller is an application that tries to meet the desired state of a resource by applying actions (forever).

For example:

  • A thermostat: sets the heater temperature of a room based on the room’s temperature.
  • An autoscaler: runs and deletes instances based on the traffic.
  • A cache system: increases and decreases its size based on the hits and the system it is running on.
  • An application concurrency system: sets the available concurrent limits based on the traffic.
  • An application that maintains the TLS certificates up to date: renews them before they expire based on their expiration date.
  • A load balancer: maintains a healthy set of endpoints for services based on health checks.
  • Distributed file storage: replicates the data in different regions based on the state of the data (e.g. S3).
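
To make the pattern concrete, here is a minimal sketch of the thermostat example as a closed feedback loop. The room type, the step size and the function names are made up for illustration; the only point is the "compare actual with desired, then correct" cycle.

```go
package main

import "fmt"

// room is a toy system: its temperature is the "actual state".
type room struct{ temp float64 }

// reconcile compares the actual temperature with the desired one and applies
// a corrective action, moving at most step degrees per iteration.
func reconcile(r *room, desired, step float64) {
	diff := desired - r.temp
	switch {
	case diff > step:
		r.temp += step // too cold: heat
	case diff < -step:
		r.temp -= step // too hot: cool
	default:
		r.temp = desired // close enough: settle on the target
	}
}

// simulate runs the loop a fixed number of times and returns the final
// temperature. A real controller would loop for as long as it is running.
func simulate(start, desired, step float64, iterations int) float64 {
	r := &room{temp: start}
	for i := 0; i < iterations; i++ {
		reconcile(r, desired, step)
	}
	return r.temp
}

func main() {
	fmt.Println(simulate(15, 21, 1.5, 10))
}
```

Each iteration only looks at the current state, so the loop converges no matter where it starts or how the temperature drifted in between.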

These systems can be of different types and combinations (open and closed feedback loops, linear, proportional, integral, derivative…).

Closed-loop controller diagram (Wikipedia)

If you are using Kubernetes, it is likely that you use this kind of system every day; there they are called controllers or operators. For example:

  • Horizontal pod autoscaler: Scales up and down the number of replicas (pods) based on the CPU usage.
  • Replicaset controller: Maintains the desired number of replicas of a service using pods (if you want 3 pods, it will ensure 3 pods exist all the time).
  • Ingress controller: Ensures that traffic for the configured hosts is routed correctly to the desired services. Depending on the controller, the underlying implementation could be an ELB, ALB, GCLB, Nginx, HAProxy, Envoy…

As you see, this pattern is applied to lots of things and we use it every day.

Creating controllers

Depending on the context, a reliable controller needs a number of basic features: sometimes all of them, sometimes only a few.

Consistent

A controller needs to maintain the desired state all the time. That’s why it runs in a constant loop, checking the state of a resource and taking actions to meet the desired one.

Real-time

Some controllers need to restore the desired state as fast as possible. In this case, apart from the constant loop, they need to act on events or changes in real time.

Level triggering

Explained here.

Scalable

A controller will have to process more and more resources over time, so it needs to act on multiple different resources at the same time. This means that concurrency is required, for example:

  • Multiple AWS autoscaling groups (hundreds of AWS autoscaling groups from different clusters).
  • Multiple TLS certificates (e.g. thousands of automatically created hosts, each with its own certificate).
  • Multiple groups of instances (e.g. Kubernetes deployments).

Safe

Errors exist and will happen, delays exist, scale and concurrency increase the problems… that’s why a safe controller should:

  • Process a given resource in only one worker at a time (while different resources are still processed concurrently).
  • Handle the resource based on its latest state (not a stale one; delays exist…).
  • Retry in case of error.
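
The last two points can be sketched together. The helper below is hypothetical (it is not Gontroller's implementation): it re-reads the resource state before every attempt, so a retry never acts on stale data. A production controller would normally also wait between attempts (backoff), which is left out to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
)

// handleWithRetry re-reads the resource state before every attempt, so a
// retry never acts on stale data. It returns the number of attempts made.
func handleWithRetry(
	id string,
	maxAttempts int,
	get func(id string) (string, error),
	handle func(state string) error,
) (int, error) {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		state, err := get(id)
		if err == nil {
			err = handle(state)
		}
		if err == nil {
			return attempt, nil
		}
		lastErr = err
	}
	return maxAttempts, fmt.Errorf("giving up on %q after %d attempts: %w", id, maxAttempts, lastErr)
}

// demo simulates a handler that fails twice before succeeding. It reports how
// many attempts were needed and which state the last attempt saw.
func demo() (attempts int, lastSeen string, err error) {
	version := 0
	get := func(id string) (string, error) {
		version++ // the resource changes between attempts
		return fmt.Sprintf("%s@v%d", id, version), nil
	}
	handle := func(state string) error {
		lastSeen = state
		if version < 3 {
			return errors.New("temporary failure")
		}
		return nil
	}
	attempts, err = handleWithRetry("asg-1", 5, get, handle)
	return attempts, lastSeen, err
}

func main() {
	fmt.Println(demo())
}
```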

Gontroller should have all these features and, at the same time, be flexible enough to implement all these different kinds of controllers.

Gontroller

At Spotahome we use and develop a lot of Kubernetes controllers for multiple things. They are awesome and keep the system in a correct state, which gives us reliability and resilience.

Motivation

The main problem is that we wanted to create controllers for other kinds of resources that aren’t based on Kubernetes (e.g. Github repositories). We love how the Kubernetes controllers have been designed, so we developed Gontroller with the same idea, except for some small changes that make it more flexible and easier to use.

Features

  • Easy and fast usage.
  • Automatic retries on error.
  • Ensure only one worker is handling the same resource (based on ID) at the same time.
  • Concurrent handling of different resources.
  • A resource that is received multiple times before it has been handled will be processed only once.
  • Handle all resources at regular intervals (reconciliation loop) and updated resources in real time.
  • Observability as a first-class citizen: Metrics with Prometheus implementation.
  • Extensible: everything is based on behavior (Go interfaces) and not on concrete types.
  • Easy to test: business logic is not coupled with infrastructure code (controller implementation/library).
  • If you are used to Kubernetes: no Kubernetes dependencies in the code.
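
As an illustration of the "processed only once if it has not been handled yet" feature, here is a minimal sketch of a queue that keeps at most one pending entry per resource ID. The dedupQueue type is an assumption made for this example; it is not how Gontroller is implemented internally.

```go
package main

import (
	"fmt"
	"sync"
)

// dedupQueue keeps at most one pending entry per resource ID.
type dedupQueue struct {
	mu      sync.Mutex
	pending map[string]bool
	order   []string
}

func newDedupQueue() *dedupQueue {
	return &dedupQueue{pending: map[string]bool{}}
}

// Add enqueues id unless it is already waiting to be handled. It reports
// whether the id was actually enqueued.
func (q *dedupQueue) Add(id string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.pending[id] {
		return false
	}
	q.pending[id] = true
	q.order = append(q.order, id)
	return true
}

// Pop returns the next ID to handle, or false if the queue is empty.
func (q *dedupQueue) Pop() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.order) == 0 {
		return "", false
	}
	id := q.order[0]
	q.order = q.order[1:]
	delete(q.pending, id)
	return id, true
}

// drain enqueues ids and returns the order in which they would be handled.
func drain(ids ...string) []string {
	q := newDedupQueue()
	for _, id := range ids {
		q.Add(id)
	}
	var handled []string
	for {
		id, ok := q.Pop()
		if !ok {
			return handled
		}
		handled = append(handled, id)
	}
}

func main() {
	fmt.Println(drain("cert-a", "cert-b", "cert-a", "cert-a"))
}
```

Because the handler always works on the latest state of a resource, collapsing repeated notifications for the same ID loses no information.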

How does it work

To implement a controller using Gontroller the user needs to implement 3 Go interfaces.

High-level architecture of Gontroller

ListerWatcher

This is the first component. It is responsible for knowing which resources the controller will handle, and it does this by retrieving the IDs of those resources, e.g.:

  • TLS certificates.
  • AWS Autoscaling groups.
  • Github repositories.
  • Users.
  • Services.

This component is composed of 2 methods that need to be implemented:

type ListerWatcher interface {
	List(ctx context.Context) (ids []string, err error)
	Watch(ctx context.Context) (<-chan Event, error)
}

  • List: Lists all the resource IDs. This list is used to process each of the resources at regular intervals, also called the reconciliation loop (e.g. I want to check/handle the expiration of all TLS certificates every 30m).
  • Watch: Watches events on the resources (delete, edit, create). Unlike List, which is used in a constant loop, these events are processed in real time (e.g. I don’t want to wait for the next reconciliation loop for this newly created TLS certificate).
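
As a rough sketch, an in-memory ListerWatcher could look like the following. The Event and EventKind types are assumptions made for this example (the article only shows that Watch returns a channel of Event values), and memoryListerWatcher is a hypothetical name.

```go
package main

import (
	"context"
	"fmt"
)

// EventKind and Event are assumptions for this sketch, not the library's types.
type EventKind int

const (
	EventAddedOrModified EventKind = iota
	EventDeleted
)

type Event struct {
	ID   string
	Kind EventKind
}

// ListerWatcher mirrors the interface shown above.
type ListerWatcher interface {
	List(ctx context.Context) (ids []string, err error)
	Watch(ctx context.Context) (<-chan Event, error)
}

// memoryListerWatcher lists a fixed set of IDs and forwards the events that
// are pushed into its events channel.
type memoryListerWatcher struct {
	ids    []string
	events chan Event
}

// Compile-time check that the sketch satisfies the interface.
var _ ListerWatcher = (*memoryListerWatcher)(nil)

func (m *memoryListerWatcher) List(ctx context.Context) ([]string, error) {
	// Return a copy so callers cannot mutate the internal slice.
	return append([]string(nil), m.ids...), nil
}

func (m *memoryListerWatcher) Watch(ctx context.Context) (<-chan Event, error) {
	out := make(chan Event)
	go func() {
		defer close(out)
		for {
			select {
			case <-ctx.Done():
				return
			case ev, ok := <-m.events:
				if !ok {
					return
				}
				select {
				case out <- ev:
				case <-ctx.Done():
					return
				}
			}
		}
	}()
	return out, nil
}

// demo lists the IDs and receives a single watch event.
func demo() ([]string, Event) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	m := &memoryListerWatcher{
		ids:    []string{"cert-a", "cert-b"},
		events: make(chan Event, 1),
	}
	m.events <- Event{ID: "cert-c", Kind: EventAddedOrModified}

	ids, _ := m.List(ctx)
	ch, _ := m.Watch(ctx)
	return ids, <-ch
}

func main() {
	fmt.Println(demo())
}
```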

Storage

type Storage interface {
	Get(ctx context.Context, id string) (interface{}, error)
}

The storage component transforms the IDs obtained by the ListerWatcher into the resource state/data, e.g.:

  • The Autoscaling group current instances and the cluster free size.
  • The number of replicas of a service.
  • The metadata and files of a Git repository.
  • The location of a package on the truck and road.
  • The expiration date of a TLS certificate.

Once the storage gets the resource state based on the id, the controller will call the handler with this data.
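
A minimal sketch of a Storage for the TLS certificate case, backed by a map. The Certificate type, the certStorage name and the choice to return an error for unknown IDs are assumptions for this example; a real implementation would query whatever system holds the certificates.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Storage mirrors the interface shown above.
type Storage interface {
	Get(ctx context.Context, id string) (interface{}, error)
}

// Certificate is a hypothetical resource state used only in this sketch.
type Certificate struct {
	Host     string
	NotAfter time.Time
}

// certStorage resolves a certificate ID (here, the host name) into its state.
type certStorage struct {
	certs map[string]Certificate
}

func (s certStorage) Get(ctx context.Context, id string) (interface{}, error) {
	c, ok := s.certs[id]
	if !ok {
		return nil, fmt.Errorf("certificate %q not found", id)
	}
	return c, nil
}

// demoStorage returns a storage with one known certificate.
func demoStorage() Storage {
	return certStorage{certs: map[string]Certificate{
		"example.com": {
			Host:     "example.com",
			NotAfter: time.Date(2019, 12, 1, 0, 0, 0, 0, time.UTC),
		},
	}}
}

// expiryOf asks the storage for id and returns the certificate's expiration
// date formatted as YYYY-MM-DD.
func expiryOf(s Storage, id string) (string, error) {
	obj, err := s.Get(context.Background(), id)
	if err != nil {
		return "", err
	}
	cert, ok := obj.(Certificate)
	if !ok {
		return "", fmt.Errorf("unexpected object type %T", obj)
	}
	return cert.NotAfter.Format("2006-01-02"), nil
}

func main() {
	fmt.Println(expiryOf(demoStorage(), "example.com"))
}
```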

Handler

The handler is where most of the controller logic lives; it is the component that takes the actions needed to meet the expected state of the resource. The Handler needs to implement 2 methods.

When handling a resource, the controller will call one of these two methods:

type Handler interface {
	Add(ctx context.Context, obj interface{}) error
	Delete(ctx context.Context, id string) error
}

  • Add: This method is called when the object exists (add or edit events). Here we take actions based on the received data (previously obtained by the Storage).
  • Delete: This method is called when the object does not exist (a delete event was received from Watch). It only receives the ID and is mostly used for cleanup actions.
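
A sketch of a Handler for the TLS certificate example could look like this. The Certificate type, the renewHandler name and its fields are hypothetical; instead of really renewing anything, the handler just records what it would do, which also shows how easily the business logic can be tested on its own.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Handler mirrors the interface shown above.
type Handler interface {
	Add(ctx context.Context, obj interface{}) error
	Delete(ctx context.Context, id string) error
}

// Certificate is a hypothetical resource state used only in this sketch.
type Certificate struct {
	Host     string
	NotAfter time.Time
}

// renewHandler records which certificates it would renew or clean up.
type renewHandler struct {
	now         func() time.Time
	renewBefore time.Duration
	renewed     []string
	cleaned     []string
}

var _ Handler = (*renewHandler)(nil)

// Add renews the certificate when it expires within the renewBefore window.
func (h *renewHandler) Add(ctx context.Context, obj interface{}) error {
	cert, ok := obj.(Certificate)
	if !ok {
		return fmt.Errorf("unexpected object type %T", obj)
	}
	if cert.NotAfter.Sub(h.now()) <= h.renewBefore {
		h.renewed = append(h.renewed, cert.Host) // the real renewal would go here
	}
	return nil
}

// Delete only receives the ID and performs cleanup.
func (h *renewHandler) Delete(ctx context.Context, id string) error {
	h.cleaned = append(h.cleaned, id)
	return nil
}

// demo feeds the handler two certificates and one deletion.
func demo() (renewed, cleaned []string, err error) {
	now := time.Date(2019, 9, 6, 0, 0, 0, 0, time.UTC)
	h := &renewHandler{
		now:         func() time.Time { return now },
		renewBefore: 30 * 24 * time.Hour,
	}
	ctx := context.Background()
	objs := []interface{}{
		Certificate{Host: "a.example.com", NotAfter: now.Add(10 * 24 * time.Hour)},
		Certificate{Host: "b.example.com", NotAfter: now.Add(90 * 24 * time.Hour)},
	}
	for _, obj := range objs {
		if err := h.Add(ctx, obj); err != nil {
			return nil, nil, err
		}
	}
	if err := h.Delete(ctx, "c.example.com"); err != nil {
		return nil, nil, err
	}
	return h.renewed, h.cleaned, nil
}

func main() {
	fmt.Println(demo())
}
```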

For a full example of how you can implement a controller using Gontroller, check the examples.

If you want to know the internal implementation, you can check the repository.

Use case examples

Project description in a repository file on Github

What we want is to have a file (e.g. spotahome.yaml) that describes the project information in a repository (name, team owner, cluster…). This file should be downloaded and parsed so everything about the project can be initialized and registered.

  • ListerWatcher(List): Will list all the Github repositories from an organization and return the name as the unique id (e.g. spotahome/gontroller)
  • ListerWatcher(Watch): Will subscribe to Github webhooks and, when there is a push event, check that it is on the master branch and that it modifies the spotahome.yaml file.
  • Storage: Will get the spotahome.yaml data file from the repo and get the current project initialization state in the cluster, teams… This will be returned in a state object.
  • Handler: Will use the received state object with the expected data and the current state, and create, modify or ignore depending on what we want and what is present.
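
The filtering done by the Watch side can be sketched as below. The pushEvent struct only models the few fields this example reads, and the field names (ref, repository.full_name, commits with added and modified) are assumed from Github's push webhook payload; wiring up the HTTP endpoint that receives the webhook is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pushEvent holds the few fields of a push webhook payload that this sketch
// relies on. The field names are assumptions based on Github's push event.
type pushEvent struct {
	Ref        string `json:"ref"`
	Repository struct {
		FullName string `json:"full_name"`
	} `json:"repository"`
	Commits []struct {
		Added    []string `json:"added"`
		Modified []string `json:"modified"`
	} `json:"commits"`
}

// touches reports whether name appears in files.
func touches(files []string, name string) bool {
	for _, f := range files {
		if f == name {
			return true
		}
	}
	return false
}

// changedRepo returns the repository ID to enqueue when the push targets the
// master branch and adds or modifies spotahome.yaml. The boolean is false for
// every other push.
func changedRepo(payload []byte) (string, bool, error) {
	var ev pushEvent
	if err := json.Unmarshal(payload, &ev); err != nil {
		return "", false, err
	}
	if ev.Ref != "refs/heads/master" {
		return "", false, nil
	}
	for _, c := range ev.Commits {
		if touches(c.Added, "spotahome.yaml") || touches(c.Modified, "spotahome.yaml") {
			return ev.Repository.FullName, true, nil
		}
	}
	return "", false, nil
}

func main() {
	payload := []byte(`{
		"ref": "refs/heads/master",
		"repository": {"full_name": "spotahome/gontroller"},
		"commits": [{"added": [], "modified": ["spotahome.yaml"]}]
	}`)
	fmt.Println(changedRepo(payload))
}
```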

We described one approach, but it could be designed in another way. For example, the storage could return only the spotahome.yaml data and let the handler get the current state, compare and take actions.

With this controller, we would keep every project in sync across our clusters, chat rooms like Slack, monitoring systems, AWS roles...

Note: This needs a garbage collection job in the background in case we miss a deletion event or the controller wasn’t running at the moment of the deletion.

Directory replication

What we want is to have a directory replicated in the file system.

  • ListerWatcher(List): Will list all files in a directory. If an entry is itself a directory, it will also list the files of that subdirectory.
  • ListerWatcher(Watch): Will subscribe to the kernel notifications on the different files, like edit, create and delete (e.g. using fsnotify).
  • Storage: Will get the file hash for verification and check if the file exists in the replicated directory; if it does, it will also get the hash of the replica. Then it will return a state struct with the expected state and the current state.

type State struct {
	OriginalPath string
	OriginalHash string
	ReplicaPath  string
	ReplicaHash  string
}

  • Handler: On an add/edit event, it will replicate the file if it has not been replicated yet or if the hashes of both files differ. On a delete event, it will delete the replica if it exists.
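
The Storage and Handler roles of this use case can be sketched with plain functions. The names getState and replicate and the choice of SHA-256 are assumptions for this example; the State struct follows the one described above, with an empty ReplicaHash meaning that the replica does not exist yet.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

type State struct {
	OriginalPath string
	OriginalHash string
	ReplicaPath  string
	ReplicaHash  string // empty when the replica does not exist yet
}

// hashFile returns the hex-encoded SHA-256 of the file contents.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// getState plays the Storage role: it gathers the expected and current state.
func getState(original, replica string) (State, error) {
	s := State{OriginalPath: original, ReplicaPath: replica}
	var err error
	if s.OriginalHash, err = hashFile(original); err != nil {
		return State{}, err
	}
	s.ReplicaHash, err = hashFile(replica)
	if err != nil && !errors.Is(err, os.ErrNotExist) {
		return State{}, err
	}
	return s, nil
}

// replicate plays the Handler's Add role: it copies the file only when the
// hashes differ and reports whether a copy was made.
func replicate(s State) (bool, error) {
	if s.OriginalHash == s.ReplicaHash {
		return false, nil
	}
	data, err := os.ReadFile(s.OriginalPath)
	if err != nil {
		return false, err
	}
	return true, os.WriteFile(s.ReplicaPath, data, 0o644)
}

// demo handles the same file twice in a temporary directory and reports
// whether each pass had to copy it.
func demo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "replica-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	original := filepath.Join(dir, "original.txt")
	replica := filepath.Join(dir, "replica.txt")
	if err := os.WriteFile(original, []byte("hello"), 0o644); err != nil {
		return false, false, err
	}

	var copied [2]bool
	for i := range copied {
		s, err := getState(original, replica)
		if err != nil {
			return false, false, err
		}
		if copied[i], err = replicate(s); err != nil {
			return false, false, err
		}
	}
	return copied[0], copied[1], nil
}

func main() {
	fmt.Println(demo())
}
```

Handling the same file twice is harmless: the second pass sees equal hashes and does nothing, which is what makes the periodic reconciliation loop safe to run.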

As in the previous use case, the Storage could get only the information of the original file and let the handler compute the hashes and take actions. That depends on how you would like to design your application; the library is flexible enough to let you choose.

With this controller, we could have a directory replicated, like a Dropbox or Google Drive application.

Optimization example: This can be optimized by using a cache for the ListerWatcher and Storage, so the full listing of files is only done the first time (sync.Once). After that, the file notifications (the same ones the watcher uses) update the cached state, and the List calls return that cached state (this is the method used by Kubernetes).
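
A minimal sketch of that cache, assuming a hypothetical cachedLister type: the expensive full listing runs once, and later notifications keep the cached set up to date. Note that in this simplified version an error from the initial listing is cached as well, so a real implementation would want a way to retry it.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// cachedLister runs the expensive full listing once and then serves List
// calls from a cache that is kept up to date by Update.
type cachedLister struct {
	once    sync.Once
	mu      sync.Mutex
	ids     map[string]struct{}
	initErr error
	listAll func() ([]string, error)
}

func newCachedLister(listAll func() ([]string, error)) *cachedLister {
	return &cachedLister{ids: map[string]struct{}{}, listAll: listAll}
}

// List returns the cached IDs in sorted order, filling the cache on first use.
func (c *cachedLister) List() ([]string, error) {
	c.once.Do(func() {
		ids, err := c.listAll()
		c.mu.Lock()
		defer c.mu.Unlock()
		c.initErr = err
		for _, id := range ids {
			c.ids[id] = struct{}{}
		}
	})

	c.mu.Lock()
	defer c.mu.Unlock()
	if c.initErr != nil {
		return nil, c.initErr
	}
	out := make([]string, 0, len(c.ids))
	for id := range c.ids {
		out = append(out, id)
	}
	sort.Strings(out)
	return out, nil
}

// Update applies a file notification (create/edit or delete) to the cache.
func (c *cachedLister) Update(id string, deleted bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if deleted {
		delete(c.ids, id)
		return
	}
	c.ids[id] = struct{}{}
}

// demo lists, applies two notifications and lists again. It also counts how
// many times the full listing was executed.
func demo() (ids []string, fullListings int) {
	c := newCachedLister(func() ([]string, error) {
		fullListings++
		return []string{"a.txt", "b.txt"}, nil
	})
	c.List()
	c.Update("c.txt", false)
	c.Update("a.txt", true)
	ids, _ = c.List()
	return ids, fullListings
}

func main() {
	fmt.Println(demo())
}
```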

Optimization example 2: When the handled file is a directory, create a new replication controller for that directory, and stop that controller when the directory is deleted (a kind of recursion, but with controller runs).

Note: This needs a garbage collection job in the background in case we miss a deletion event or the controller wasn’t running at the moment of the deletion.

Conclusion

We hope you like the library. If you have use cases for it that you want to share, we would love to hear them!

You can reach us on Github to give us feedback about features, errors and pain points, or to contribute by sending PRs.

Thanks for reading!

P.S. Now you don’t have an excuse not to write reliable and clean controllers, even if you are not using Kubernetes resources :)
