Evgenii Karimov - Thoughts about the modern data engineering stack

Random thoughts about the modern data engineering stack: a modern stack should be the best it can be from the beginning, or at least aim to be.

That means any vendor solutions the platform needs should be onboarded as early as possible, within the limits of the platform vision and the budget at hand.

There’s no point reinventing the wheel unless you’re trying to solve a unique use case, which may well be true for huge data systems like those at Netflix or Google. Migrations in the data world are much more complex than regular database migrations in software engineering because of the ecosystem around every component: it’s hard to isolate and replace a single one.

Today, with so many products easing the life of a data engineer (managed ETL, data cataloging, governance, etc.), it’s already a challenge to pick the right ones and integrate them with each other so that the work of data analysts and scientists stays intuitive and easy.

