Add an option for remote data sources that transfers a local copy of the data to the node and consumes it locally. This is useful for e.g. CSV, REST, SQL and other remote systems, where we would like to protect system resources such as incoming handlers and connection pools, and avoid binding remote resources for prolonged, unnecessary periods of time. Modern storage and service endpoints typically protect themselves against "lingering connections" and kill slow clients. This means that Sesam connections to remote data sources can sometimes be closed by the remote peer because of host platform load or DTL pipe performance issues. The default behaviour allows a number of retries to be specified, but each retry always starts the transfer over from the beginning; this can exacerbate the problem and lead to the Sesam node "hammering" remote peers with connections that will always fail. The default behaviour is otherwise to fail immediately, and there is no recourse or control to remedy this pattern of failures.
If a remote resource, e.g. a CSV file, were transferred locally to the node first, this would greatly improve the Sesam node's behaviour in larger data eco-systems and cloud environments. This should be considered a best practice and the default setting.
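The transfer-then-consume pattern could be sketched roughly as below. This is a minimal, hypothetical illustration of the idea, not actual Sesam configuration or code; the function name, URL handling and chunk size are all assumptions.

```python
import tempfile
import urllib.request


def fetch_to_local(url, chunk_size=64 * 1024):
    """Stream a remote resource into a local temp file and return its path.

    Hypothetical sketch: the remote connection is held only for the duration
    of the raw byte transfer, then released before any (potentially slow)
    downstream processing starts reading the local copy.
    """
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".csv")
    with urllib.request.urlopen(url) as resp, tmp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            tmp.write(chunk)
    # Connection is closed here; all further reads hit the local file only.
    return tmp.name
```

The point of the sketch is the separation of concerns: the remote peer sees one short, fast client, while slow pipe processing runs entirely against the local copy.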
Another useful option would be the ability to supply a local spool buffer size for remote data sources, whereby only a portion of the remote static resource is transferred and processed at a time, after which a new connection (or an HTTP/2 or HTTP/3 stream) resumes the transfer. This requires the remote endpoint to support continuation and to reply with a Content-Length for its resources. Again, this would improve the Sesam platform's behaviour and help it "play nice" with other systems.
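The spool-buffer idea can be approximated today with standard HTTP Range requests, which is what the continuation support above would likely build on. The sketch below is an assumption-laden illustration (function name, window size and error handling are all invented for this example); it fetches a resource in bounded windows, one connection per window, resuming from the last received byte.

```python
import urllib.error
import urllib.request


def spooled_fetch(url, window=1024 * 1024):
    """Yield a remote resource in `window`-sized pieces via HTTP Range requests.

    Hypothetical sketch: each window uses a fresh, short-lived request, so no
    single connection is held open for the whole transfer. Requires the
    endpoint to honour Range; if it ignores Range and sends the full body,
    we yield it once and stop.
    """
    offset = 0
    while True:
        req = urllib.request.Request(
            url, headers={"Range": f"bytes={offset}-{offset + window - 1}"}
        )
        try:
            with urllib.request.urlopen(req) as resp:
                data = resp.read()
                partial = getattr(resp, "status", None) == 206
        except urllib.error.HTTPError as err:
            if err.code == 416:  # Range Not Satisfiable: past end of resource
                return
            raise
        if not data:
            return
        yield data
        if not partial:  # endpoint ignored Range and sent the whole resource
            return
        offset += len(data)
        if len(data) < window:  # short read: end of resource reached
            return
```

A production version would also want to verify the Content-Range header and total Content-Length between windows, since the point of the request above is precisely that the endpoint advertises the resource size.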