Controlling when a backup happens on a node

Answered

May 09, 2022 14:09

Hi Sesam

Is it possible to control, either through the API or a script, when a node is backed up?

Right now backups are affected by when the node last started up, but we cannot depend on when the node is started. We need to control when the backup happens, for example so that it runs every day at 15:00.

Thanks in advance.

Comments

12 comments

Official comment
Geir Ove Grønmo

May 20, 2022 08:09
Sesam subscriptions (not self-hosted) currently make backups at a randomized time inside the 22:00-03:00 UTC time window. It is currently not possible to override this.

What is the use-case for needing to control the backup time?
Jon Bryndorf

May 20, 2022 12:02
We have another database that needs to stay in sync with the Sesam database. If we need to restore our database, we must choose a backup that matches a Sesam backup in time (down to the minute); otherwise we get conflicts that are close to impossible to resolve.

Geir Ove Grønmo

May 31, 2022 10:41
It sounds to me like that is a next-to-impossible approach to take. Is Sesam writing data to that database or is Sesam reading data from the database? What is the reason for the conflicts? Can't you have Sesam process all the data a second time? It would be helpful if you could explain the use case in more detail.

Jon Bryndorf

May 31, 2022 13:39
Thanks for your reply.

The use case is the following:

A request for information from a third-party solution is sent from our system through Sesam. Sesam checks whether the information already exists. If the information is not found, the request is processed and the response is saved both in Sesam and in our local database.

If a crash happens and Sesam ends up two hours ahead of our restored backup, and the same request is sent again, then Sesam will say that the information exists even though it does not exist in our system. These are the kinds of conflicts we want to avoid.
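
To make the conflict concrete, here is a minimal Python sketch of the situation. The two dictionaries are hypothetical in-memory stand-ins for the Sesam dataset and our local database, and `handle_request`, `fetch_from_third_party` and `missing_after_restore` are made-up names for illustration, not Sesam APIs.

```python
# Hypothetical in-memory stand-ins for the two stores; these are not Sesam APIs.

def fetch_from_third_party(key):
    """Pretend to call the third-party solution."""
    return f"response-for-{key}"


def handle_request(key, sesam_store, local_db):
    """Fetch from the third party only if Sesam has not seen the key."""
    if key in sesam_store:
        return "skipped"          # Sesam believes the information exists
    response = fetch_from_third_party(key)
    sesam_store[key] = response   # saved in Sesam ...
    local_db[key] = response      # ... and in the local database
    return "fetched"


def missing_after_restore(sesam_store, local_db):
    """Keys that Sesam knows about but the restored local database lacks."""
    return sorted(set(sesam_store) - set(local_db))


sesam_store, local_db = {}, {}

handle_request("A", sesam_store, local_db)
local_backup = dict(local_db)          # local backup is taken here
handle_request("B", sesam_store, local_db)

local_db = dict(local_backup)          # crash: restore the older local backup
second_attempt = handle_request("B", sesam_store, local_db)
conflicts = missing_after_restore(sesam_store, local_db)
```

After the restore, the repeated request for "B" is skipped because Sesam still holds it, while the local database no longer does.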

We can set up a meeting to explain in further detail.

Jon Bryndorf

July 11, 2022 06:34
Hi Sesam

Are there any updates? We need clarification soon since we go into production in September.

Geir Ove Grønmo

August 01, 2022 06:59
Thank you for the explanation and sorry for the late reply.

One way to work around this problem is to have Sesam pull in the state from your local database. That pipe needs to perform a full sync of all the data it needs every so often. It could do it on every run, every nth run or on a schedule. The pipe that receives the data from the third party system can then hop[1] to the dataset that contains the data from your local database to check if the information is found. To prevent the second pipe from processing the data too soon, you can use the completeness feature[2] to make it read its input data only after the state has been synced from your local database.

[1] https://docs.sesam.io/DTLReferenceGuide.html#hops-dtl-function
[2] https://docs.sesam.io/product-features.html#completeness


Geir Ove Grønmo

August 01, 2022 07:20

The two pipes would look roughly like this:

[
    {
        "_id": "local-database-state",
        "type": "pipe",
        "source": {
            "type": "sql",
            "system": "local-database",
            "primary_key": ["the_id"],
            "schema": "dbo",
            "table": "some_table_or_view"
        }
    },
    {
        "_id": "processing-pipe",
        "type": "pipe",
        "source": {
            "type": "dataset",
            "dataset": "third-party-requests",
            "completeness": true
        },
        "transform": {
            "type": "dtl",
            "rules": {
                "default": [
                    [...filter_or_some_other_logic...,
                     ["hops", {
                         "datasets": ["local-database-state s"],
                         "where": [
                             ["eq", "_S.some_id", "s.the_same_id"]
                         ]
                     }]
                    ]
                ]
            }
        }
    }
]
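
In plain Python terms, the idea behind the two pipes is a lookup against a snapshot of the local database, gated on how fresh that snapshot is. The sketch below is conceptual only and is not how Sesam implements hops or completeness internally. `some_id` and `the_same_id` are the placeholders from the config above; `received_at`, `synced_at` and `process_requests` are hypothetical names.

```python
from datetime import datetime, timezone


def process_requests(requests, local_state, synced_at):
    """Split requests into ready and deferred; flag ready ones as found or not.

    requests    -- dicts with "some_id" and "received_at" (hypothetical fields)
    local_state -- rows synced from the local database, each with "the_same_id"
    synced_at   -- time of the last full sync of local_state
    """
    known_ids = {row["the_same_id"] for row in local_state}
    ready, deferred = [], []
    for request in requests:
        if request["received_at"] > synced_at:
            # The snapshot may be stale for this request; wait for the next sync.
            deferred.append(request)
            continue
        found = request["some_id"] in known_ids  # the hops-style lookup
        ready.append({**request, "found_locally": found})
    return ready, deferred


def utc(hour, minute=0):
    """Helper for building example timestamps on one arbitrary day."""
    return datetime(2022, 8, 1, hour, minute, tzinfo=timezone.utc)


local_state = [{"the_same_id": "A"}]
requests = [
    {"some_id": "A", "received_at": utc(6)},
    {"some_id": "B", "received_at": utc(6, 30)},
    {"some_id": "C", "received_at": utc(8)},
]
ready, deferred = process_requests(requests, local_state, synced_at=utc(7))
```

Here "A" and "B" are processed (found and not found, respectively), while "C" arrived after the last sync and is held back until the local state has been synced again.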

Jon Bryndorf

August 01, 2022 13:10

Edited
Hi Sesam

Thanks for the good solution, but we have a specific system where syncing with the proposed solution would take days, and that is not acceptable.

So we cannot use the proposed solution.

Geir Ove Grønmo

August 01, 2022 14:51
I see. Could you write the response only to the local database and then sync it back to Sesam? Then you could use the same kind of technique as described above.

Jon Bryndorf

August 02, 2022 12:37
Hi Geir

That would defeat the purpose of using Sesam as a backup, since we would then have our own database backup and thus a double backup: our own and yours.

Is it possible to do something custom for our nodes so that you (Sesam) control the time when the backups happen?

Geir Ove Grønmo

August 05, 2022 11:13
Sesam's backup system is an implementation detail and it is implemented to support disaster recovery. There are technical reasons why we cannot control/expose the exact time when the backup runs. To get stronger guarantees against data loss, the durable data feature should be enabled.

The use-case you describe is when the local database is restored and it is thus in a state that is older than the state in Sesam. Given that there is a processing queue, another workaround is to rewind the pipe that processes that queue. The pipe should be rewound to the point in time right before the timestamp of the backup that was used to restore the local database. Would that work?
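
As a rough illustration of picking the rewind point, here is a small Python sketch. It assumes the queue can be listed as (offset, timestamp) pairs sorted by timestamp; that layout and the name `rewind_offset` are hypothetical, and actually applying the offset to the pipe in Sesam is not shown here.

```python
from bisect import bisect_left
from datetime import datetime, timezone


def rewind_offset(queue, backup_time):
    """Return the offset of the last entry written strictly before backup_time.

    queue -- list of (offset, timestamp) tuples sorted by timestamp
    Returns None when every entry has to be reprocessed.
    """
    timestamps = [ts for _, ts in queue]
    index = bisect_left(timestamps, backup_time)  # first entry at or after backup
    if index == 0:
        return None
    return queue[index - 1][0]


def utc(hour):
    """Helper for building example timestamps on one arbitrary day."""
    return datetime(2022, 8, 5, hour, tzinfo=timezone.utc)


queue = [(0, utc(9)), (1, utc(10)), (2, utc(12)), (3, utc(13))]
offset = rewind_offset(queue, backup_time=utc(11))
```

With a backup taken at 11:00, the pipe would resume after offset 1, so the entries written at 12:00 and 13:00 are processed again. In practice it may be worth rewinding a little further back than the backup timestamp to leave a safety margin.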

Jon Bryndorf

August 05, 2022 12:17

Edited
Thank you for the reply.

It would work. We will take it from here.

Thanks for your good replies and help.