Christopher Lagali
Feb 6, 2022

Hi Shiva,

I am thankful for your kind comments and glad that this was a guiding light for you.

With regard to your query: as I do not have a complete view of the intended pipeline, I wanted to ask whether the RAW table is the final destination of the file.

If not, then a stream that sits on top of your RAW table would effectively represent the full contents of the RAW table after every load, since you are performing a truncate-and-load.

In this scenario, CDC is not applicable, as you are performing a complete data refresh every day. You could, however, use streams to trigger the next few ETL steps (with a latency of maybe a few seconds) by using SYSTEM$STREAM_HAS_DATA in the WHEN condition of your task definition. That way, your INT tables would be refreshed whenever your RAW table is loaded.
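To make that concrete, here is a rough sketch of what such a task could look like. Since I have not seen your pipeline, every object name here (raw_table, raw_table_stream, int_table, refresh_int_task, etl_wh) and the column list are placeholders I made up, and the one-minute schedule is just an assumed polling interval, so please adapt it rather than take it as is.

```sql
-- All object and column names below are hypothetical placeholders.

-- A stream on the RAW table records the rows changed by each load.
CREATE OR REPLACE STREAM raw_table_stream ON TABLE raw_table;

-- The task wakes up on its schedule but only runs its statement
-- when the stream actually contains new change records.
CREATE OR REPLACE TASK refresh_int_task
  WAREHOUSE = etl_wh
  SCHEDULE = '1 MINUTE'
WHEN
  SYSTEM$STREAM_HAS_DATA('RAW_TABLE_STREAM')
AS
  -- Reading the stream inside a DML statement consumes it,
  -- so the next run only fires after the next load.
  INSERT INTO int_table (id, payload)
  SELECT id, payload
  FROM raw_table_stream
  WHERE METADATA$ACTION = 'INSERT';  -- skip the DELETE records left by the truncate

-- Tasks are created suspended and must be resumed to start running.
ALTER TASK refresh_int_task RESUME;
```

Because your load is a full refresh, you would probably also want to clear the INT table (or merge into it) rather than only append, which this sketch does not do.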

Does that answer your question?
