Controlling when a backup happens on a node
Hi Sesam,
Is it possible to control, either through the API or a script, when a node is backed up?
Right now the backup time depends on when the node last started up, but we cannot depend on when the node is started. We need to control when the backup happens, for example so that it runs every day at 15:00.
Thanks in advance.
-
Official comment
Sesam subscriptions (not self-hosted) currently make backups at a randomized time inside the 22:00-03:00 UTC time window. It is currently not possible to override this.
What is the use case for needing to control the backup time?
-
We have another database that needs to be in sync with the Sesam database. If we need to restore our database, we need to choose a backup that matches a Sesam backup in time (down to the minute); otherwise we will get conflicts that are close to impossible to resolve.
-
It sounds to me like that is a next-to-impossible approach to take. Is Sesam writing data to that database or is Sesam reading data from the database? What is the reason for the conflicts? Can't you have Sesam all the data a second time? It would be helpful if you could explain the use case in more detail.
-
Thanks for your reply
The use case is the following: from our system, a request for information from a third-party solution is sent through Sesam. Sesam checks whether the information already exists. If the information is not found, the request is processed and the response is saved both in Sesam and in our local database.
If a crash happens and Sesam ends up two hours ahead of our backup, and the same request is sent again, Sesam will say that the information exists even though it does not exist in our system. These are the kinds of conflicts we want to avoid.
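The conflict described above can be reproduced with a small simulation. This is only a sketch: the two dictionaries stand in for the Sesam dataset and the local database, and `handle_request` and `fetch_from_third_party` are hypothetical names used for illustration, not part of Sesam.

```python
import copy


def fetch_from_third_party(request_id):
    # Hypothetical stand-in for the call to the third-party solution.
    return f"response-for-{request_id}"


def handle_request(request_id, sesam_state, local_db):
    """Process a request the way the flow above describes it."""
    if request_id in sesam_state:
        # Sesam already has the information, so nothing is fetched or saved.
        return "exists"
    response = fetch_from_third_party(request_id)
    sesam_state[request_id] = response
    local_db[request_id] = response
    return "processed"


sesam_state, local_db = {}, {}
handle_request("req-1", sesam_state, local_db)

# The local backup is taken here; both systems keep running afterwards.
local_backup = copy.deepcopy(local_db)
handle_request("req-2", sesam_state, local_db)

# Crash: the local database is restored from the older backup.
local_db = copy.deepcopy(local_backup)

# The same request is sent again after the restore.
result = handle_request("req-2", sesam_state, local_db)
missing_locally = "req-2" not in local_db
print(result, missing_locally)
```

After the restore, Sesam reports the information as existing while the local database no longer contains it, which is the mismatch we are trying to avoid.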
We can set up a meeting to explain in further detail.
-
Thank you for the explanation and sorry for the late reply.
One way to work around this problem is to have Sesam pull in the state from your local database. That pipe needs to perform a full sync of all the data it needs every so often. It could do so on every run, every nth run, or on a schedule. The pipe that receives the data from the third-party system can then hop[1] to the dataset that contains the data from your local database to check whether the information is found. To prevent the second pipe from processing the data too soon, you can use the completeness feature[2] to make it read its input data only after the state has been synced from your local database.
[1] https://docs.sesam.io/DTLReferenceGuide.html#hops-dtl-function
[2] https://docs.sesam.io/product-features.html#completeness
-
The two pipes would look roughly something like this:
[
  {
    "_id": "local-database-state",
    "type": "pipe",
    "source": {
      "type": "sql",
      "system": "local-database",
      "primary_key": ["the_id"],
      "schema": "dbo",
      "table": "some_table_or_view"
    }
  },
  {
    "_id": "processing-pipe",
    "type": "pipe",
    "source": {
      "type": "dataset",
      "dataset": "third-party-requests",
      "completeness": true
    },
    "transform": {
      "type": "dtl",
      "rules": {
        "default": [
          [...filter_or_some_other_logic...,
            ["hops", {
              "datasets": ["local-database-state s"],
              "where": [
                ["eq", "_S.some_id", "s.the_same_id"]
              ]
            }]
          ]
        ]
      }
    }
  }
]
-
Sesam's backup system is an implementation detail, and it is implemented to support disaster recovery. There are technical reasons why we cannot control or expose the exact time when the backup runs. For stronger guarantees against data loss, the durable data feature should be enabled.
The use case you describe is when the local database is restored and is thus in a state that is older than the state in Sesam. Given that there is a processing queue, another workaround is to rewind the pipe that processes that queue. The pipe should be rewound to the point in time right before the timestamp of the backup that was used to restore the local database. Would that work?
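The rewind idea can be sketched roughly as follows. This is a simulation, not Sesam's API: the queue is a plain list ordered by a made-up `updated` timestamp, the dates are invented, and `rewind_offset` and `replay` are hypothetical helpers. It also assumes that re-applying an entry the local database already has is harmless (an upsert).

```python
from bisect import bisect_left
from datetime import datetime, timezone


def ts(hour, minute=0):
    # Helper for building example timestamps (the date itself is arbitrary).
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


# Hypothetical processing queue, ordered by the time each entry was written.
queue = [
    {"id": "req-1", "updated": ts(12, 0), "response": "a"},
    {"id": "req-2", "updated": ts(14, 30), "response": "b"},
    {"id": "req-3", "updated": ts(15, 10), "response": "c"},
]


def rewind_offset(queue, backup_time):
    """Index of the first entry written at or after the backup timestamp."""
    times = [entry["updated"] for entry in queue]
    return bisect_left(times, backup_time)


def replay(queue, offset, local_db):
    """Re-apply every entry from the offset onwards to the local database."""
    for entry in queue[offset:]:
        local_db[entry["id"]] = entry["response"]  # upsert, safe to repeat
    return local_db


# The local database was restored from a backup taken at 14:00 UTC.
restored_db = {"req-1": "a"}
offset = rewind_offset(queue, ts(14, 0))
restored_db = replay(queue, offset, restored_db)
print(offset, sorted(restored_db))
```

Everything written to the queue after the backup timestamp is processed again, so the restored database catches up with the state Sesam already holds.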