
At Barracuda, our Enterprise Data Platform team is focused on delivering high-quality, reliable data pipelines that enable analysts and business leaders across the organization to make informed decisions. To drive this initiative, we’ve adopted Databricks Lakeflow Declarative Pipelines (formerly DLT) and Unity Catalog to handle our Extract, Transform, Load (ETL) workflows, enforce data quality and ensure robust governance.
Lakeflow Declarative Pipelines have empowered us to leverage our customer usage data in applications that help renewals and customer success teams deliver better customer experiences. We’ve also used Lakeflow Declarative Pipelines and Unity Catalog to build dashboards for our executive teams, bringing together data from various sources so they can make more informed financial decisions. These use cases depend on highly available, accurate data, and Lakeflow Declarative Pipelines have been instrumental in delivering both.
Why Lakeflow Declarative Pipelines?
Databricks’ core declarative transformation framework – embodied in Lakeflow Declarative Pipelines – allows us to define data transformations and quality constraints. This significantly reduces the operational overhead of managing complex ETL jobs and improves the observability of our data flows. We no longer have to write imperative code to orchestrate tasks; instead, we define what the pipeline should do, and Lakeflow Declarative Pipelines handles the rest. This has made our pipelines easier to build, understand and maintain.
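To give a sense of what this looks like in practice, here is a minimal sketch of a declarative table definition with a quality constraint, written with the dlt Python API. The table, column and upstream names (usage_clean, usage_raw, account_id, event_ts) are hypothetical placeholders, not our production code.

```python
import dlt
from pyspark.sql.functions import col

# We declare what the table should contain; the pipeline engine handles
# orchestration, dependency resolution and retries.
@dlt.table(comment="Cleaned customer usage records (hypothetical example)")
@dlt.expect_or_drop("valid_account", "account_id IS NOT NULL")  # quality constraint: drop rows without an account
def usage_clean():
    return (
        dlt.read_stream("usage_raw")            # upstream table declared elsewhere in the pipeline
        .where(col("event_ts").isNotNull())
    )
```

The expectation doubles as documentation and as an observable data quality metric: the pipeline UI reports how many records were dropped by each constraint on every run.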
From Batch to Streaming
Lakeflow Declarative Pipelines provides robust features that streamline incremental data processing and make data management workflows more efficient. With Auto Loader, which incrementally processes new data files as they arrive in cloud storage, our data team can ingest incoming data continuously without building custom file-tracking logic. Schema inference and schema hints further simplify the process by managing schema evolution and keeping pipelines compatible with changing incoming datasets.
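Before getting into our configuration below, a minimal sketch of the basic Auto Loader pattern inside a pipeline looks like this. The file format, schema hint and landing path are placeholders for illustration, not our production settings.

```python
import dlt

# `spark` is provided by the pipeline runtime.
@dlt.table(comment="Raw events ingested incrementally with Auto Loader (hypothetical example)")
def events_raw():
    return (
        spark.readStream.format("cloudFiles")                    # Auto Loader source
        .option("cloudFiles.format", "json")                     # format of the incoming files
        .option("cloudFiles.schemaHints", "event_ts TIMESTAMP")  # nudge schema inference where needed
        .load("s3://example-bucket/landing/events/")             # placeholder landing path
    )
```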
Here’s how we define a streaming ingestion table using Auto Loader. This example shows advanced configuration options for schema hints and backfill settings – but for many pipelines, the built-in schema inference and defaults are sufficient to get started quickly.
