I talked about microservices at two conferences last week. On both occasions, attendees asked me how to implement the Saga pattern, so I decided to blog about it.
What is a Saga?
From microservices.io:
A saga is a sequence of local transactions. Each local transaction updates the state (database) and publishes a message or event to trigger the next local transaction in the saga. If a local transaction fails because it violates a business rule then the saga executes a series of compensating transactions that undo the changes that were made by the preceding local transactions.
The original idea is pretty old and comes from this article.
The implementation I recommend is known as “Orchestration-based saga.”
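As a minimal sketch of the orchestration idea, assuming each local transaction and its compensating transaction can be modeled as plain delegates: a central coordinator runs the steps in order and, when one fails, compensates the completed steps in reverse order. The names SagaStep and SagaOrchestrator are hypothetical and exist only for this illustration; they are not part of any library mentioned in this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; not a real library API.
public sealed class SagaStep
{
    public SagaStep(string name, Action execute, Action compensate)
    {
        Name = name;
        Execute = execute;
        Compensate = compensate;
    }

    public string Name { get; }
    public Action Execute { get; }      // the local transaction
    public Action Compensate { get; }   // undoes the local transaction
}

public static class SagaOrchestrator
{
    // Runs the steps in order and returns true when all of them succeed.
    // If a step throws, the steps that already completed are compensated
    // in reverse order and the method returns false.
    public static bool Run(IEnumerable<SagaStep> steps)
    {
        var completed = new Stack<SagaStep>();
        foreach (var step in steps)
        {
            try
            {
                step.Execute();
                completed.Push(step);
            }
            catch (Exception)
            {
                while (completed.Count > 0)
                {
                    completed.Pop().Compensate();
                }
                return false;
            }
        }
        return true;
    }
}
```

In a real system, each step would typically be a message sent to another service, and the coordinator would persist its progress so that it can resume after a crash. The sketch leaves both out to keep the control flow visible.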
Saga as a Flow
I like to think of Sagas as long-running flows. Using BPMN (Business Process Model and Notation), we could represent the preceding Saga as follows:
The good thing about representing Sagas as workflows is that we can run them using a workflow engine, which can then act as the Saga Execution Coordinator.
Which workflow engine should/could I use?
Like everything, the correct answer is: “It depends.” There are several options you could use. Some are powerful (and expensive). There are also a lot of lightweight and cheap alternatives. The right choice depends on your project.
I have been working (a lot) on a lightweight workflow engine library that you could use in your projects to solve a lot of problems (including “how to implement Sagas,” of course). It is still under development, but it will be ready to use soon.
Here is a preview of how you would configure your saga using my library:
var model = ProcessModel.Create()
    .AddEventCatcher<CreateNewOrderCommand>("start")
    .AddActivity<CreateOrderActivity>("CreateOrder")
    .AddActivity<CancelOrderActivity>("CancelOrder")
    .AttachAsCompensationActivity("CancelOrder", "CreateOrder")
    .AddActivity<CreateOrderActivity>("ReserveProducts")
    .AddActivity<CancelOrderActivity>("CancelReservation")
    .AttachAsCompensationActivity("ReserveProducts", "CancelReservation")
    .AddActivity<RequestPaymentActivity>("RequestPayment")
    .AddActivity<CancelPaymentActivity>("CancelPayment")
    .AttachAsCompensationActivity("RequestPayment", "CancelPayment")
    .AddActivity<RequestDeliveryActivity>("RequestDelivery")
    .AddEventThrower("end")
    .AddSequenceFlow("start",
        "CreateOrder",
        "ReserveProducts",
        "RequestPayment",
        "RequestDelivery",
        "end");

var models = new InMemoryProcessModelsStore(model);
var instances = new InMemoryProcessInstancesStore();
var manager = new ProcessManager(models, instances);
Let me know if you are implementing microservices and need help. It would be a pleasure to help your company. Also, I would love to help you start using my library (again, it is under development, and the source code is on GitHub).
Hi Elemar, great post, thanks for sharing.
What would the approach be when, for example, one of the transactions is a web service? Suppose there is a business process with two components (writing to a database and calling one or more web services). If for some reason the web service call times out or fails with some other error (a communication or business error), how can we guarantee (if possible) that all the atomic services are executed as a single service?
What I currently do is define a structure to coordinate the transactions, in which I record the start, end, and execution status of each one and then move on to the next step (the next transaction). If there is an error, I have to roll back the database transactions and call the compensation function when the step is a web service. I try to implement the concepts of the WS-Atomic Transaction standard (https://en.wikipedia.org/wiki/WS-Atomic_Transaction). I would like to know whether you have faced a similar situation and what the solution was.
The Saga Pattern is an alternative to the 2PC algorithm. What is the main reason preventing you from adopting Sagas?
I did not know about Sagas, but I will certainly research and adopt them.
My concern was about the proper structure to coordinate the processes (especially when one of them is a web service call), so that it can be controlled and we can know exactly at which step the process failed and was rolled back.
It is easier to control processes when they are database processes (calling a procedure, executing an INSERT/UPDATE), because when one fails we have a guarantee that the execution did not occur and we can call the compensating transactions for the previous processes with “no fear”.
My fear comes when the process is a web service, because it can return an error (a communication error such as a timeout) and still execute the instruction. Because it returned an error, I will then call the compensating transactions for all previous processes and end up with a corrupt result.
process 1: write in database
process 2: write in database
process 3: call web service
process 4: write in database
Execution of the processes:
1. process 1: success
2. process 2: success
3. process 3: error (timeout) *but the instruction was executed by the third party (a timeout does not guarantee that the web service was not executed, since it is a communication problem and not a business error)
4. compensate process 2
5. compensate process 1
In this case, process 3 was not compensated, which leaves a corrupt result.
I don’t know if it’s a Saga Pattern concept, but maybe a workaround would be to have two categories of errors: errors that assure us that the process did not occur (business errors) and errors that cannot tell us whether the process was executed or not (communication errors). Depending on the category of the error, we would run the proper compensating execution.
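The two-category idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the StepOutcome, Step, and Coordinator names are hypothetical, and the sketch assumes that the compensation of a step is idempotent and harmless when the step never actually took effect, so it is safe to call after a timeout.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; not a real library API.
public enum StepOutcome
{
    Succeeded,
    FailedNotApplied, // e.g. a business error: the step had no effect
    Unknown           // e.g. a timeout: the step may or may not have been applied
}

public sealed class Step
{
    public Step(string name, Func<StepOutcome> execute, Action compensate)
    {
        Name = name;
        Execute = execute;
        Compensate = compensate;
    }

    public string Name { get; }
    public Func<StepOutcome> Execute { get; }
    public Action Compensate { get; }
}

public static class Coordinator
{
    // Runs the steps in order and returns the names of the steps whose
    // compensation was invoked, in the order they were compensated.
    public static IReadOnlyList<string> Run(IEnumerable<Step> steps)
    {
        var toCompensate = new Stack<Step>();
        var compensated = new List<string>();
        foreach (var step in steps)
        {
            var outcome = step.Execute();
            if (outcome == StepOutcome.Succeeded)
            {
                toCompensate.Push(step);
                continue;
            }

            // An unknown outcome is treated as "possibly applied", so the
            // failing step itself is compensated too. This relies on the
            // assumption that its compensation is idempotent.
            if (outcome == StepOutcome.Unknown)
            {
                toCompensate.Push(step);
            }

            while (toCompensate.Count > 0)
            {
                var current = toCompensate.Pop();
                current.Compensate();
                compensated.Add(current.Name);
            }
            break;
        }
        return compensated;
    }
}
```

With the four processes listed above, a timeout in process 3 would lead this sketch to compensate process 3, then process 2, then process 1, while a business error in process 3 would compensate only process 2 and process 1.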
I will test your engine and get back to you with feedback.
I am still working on the library. But it would be a pleasure to help you.