Elementl, a startup that is building a data platform based on the popular Dagster orchestrator, today announced that it has raised a $33 million Series B round led by Georgian. This round also saw participation from new investors 8VC and Human Capital, as well as existing investors Sequoia, Index, Amplify, Hanover and Slow. The new round brings the company’s total funding to $48.8 million.
As is so often the case, Dagster founder Nick Schrock founded Elementl after many years at Facebook, where he co-created GraphQL. Schrock is currently the company’s CTO and chairman, with his former Facebook colleague Pete Hunt now the company’s CEO. As Hunt told me, he had invested in Elementl as part of its 2017 seed round, mostly as a bet on Schrock. Hunt admitted that at that point, he didn’t really understand the value proposition of Dagster, but as he worked on more data problems at Facebook and later at Smyte, the anti-abuse service he co-founded and sold to Twitter, the need for better data orchestration quickly became clear to him.
“I realized that there are these big complex data pipelines that are making very serious decisions — not just taking down social media posts but also deciding who gets a mortgage, all that stuff. Once you get to a certain size, every company is a data company and every company has a data platform,” Hunt said. This also means that managing their data pipelines is one of the biggest challenges for many companies.
Apache Airflow remains one of the most popular tools for building these pipelines (and there are plenty of startups that bet on it), but Schrock was looking to build a more modern system that was optimized for the world of cloud, DevOps and containers. But the team also rethought data pipelines from a high-level perspective. “The way people have historically built data pipelines is that they think in terms of tasks. So step A to step B — and then do step C. Within those steps, they could do anything and you don’t really know — they could write to some database in a way that you don’t expect and you have no way of controlling that or having observability into that step,” Hunt explained.
Elementl rethought this with what it calls a data asset (which could be a table in a data warehouse or a file sitting in a data lake) at its core. So instead of thinking about tasks as the core abstraction, Elementl (and Dagster) focus on the data assets. “By centering this notion of an asset at the core of our system, we get a ledger of every data asset in the organization and every state transition that it’s gone through, along with all the metadata associated with it. That’s a mental model that developers love,” said Hunt.
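To make the contrast with task-based pipelines concrete, here is a minimal sketch of the asset-centered idea in plain Python. It is not Dagster's API: the AssetLedger and AssetGraph classes, the asset names and the recorded metadata fields are all hypothetical, chosen only to illustrate how declaring assets (rather than steps) can yield a ledger of every state transition.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Sequence, Tuple


@dataclass
class LedgerEntry:
    """One recorded state transition for a named data asset (hypothetical)."""
    asset: str
    event: str
    metadata: Dict[str, Any]
    timestamp: datetime


@dataclass
class AssetLedger:
    """Append-only record of what happened to every asset (hypothetical)."""
    entries: List[LedgerEntry] = field(default_factory=list)

    def record(self, asset: str, event: str, **metadata: Any) -> None:
        now = datetime.now(timezone.utc)
        self.entries.append(LedgerEntry(asset, event, metadata, now))

    def history(self, asset: str) -> List[LedgerEntry]:
        return [e for e in self.entries if e.asset == asset]


class AssetGraph:
    """Toy registry: each asset declares what it is built from."""

    def __init__(self, ledger: AssetLedger) -> None:
        self.ledger = ledger
        self._assets: Dict[str, Tuple[Tuple[str, ...], Callable[..., list]]] = {}

    def asset(self, name: str, deps: Sequence[str] = ()):
        def register(fn: Callable[..., list]) -> Callable[..., list]:
            self._assets[name] = (tuple(deps), fn)
            return fn
        return register

    def materialize(self, name: str, _cache: Dict[str, list] = None) -> list:
        cache = {} if _cache is None else _cache
        if name in cache:
            return cache[name]
        deps, fn = self._assets[name]
        # Build upstream assets first, then this one.
        inputs = [self.materialize(dep, cache) for dep in deps]
        value = fn(*inputs)
        # Every materialization lands in the ledger with its metadata.
        self.ledger.record(name, "materialized", rows=len(value), upstream=list(deps))
        cache[name] = value
        return value


ledger = AssetLedger()
graph = AssetGraph(ledger)


@graph.asset("raw_orders")
def raw_orders() -> list:
    return [{"id": 1, "total": 20}, {"id": 2, "total": 0}, {"id": 3, "total": 35}]


@graph.asset("paid_orders", deps=["raw_orders"])
def paid_orders(orders: list) -> list:
    return [o for o in orders if o["total"] > 0]


result = graph.materialize("paid_orders")
for entry in ledger.entries:
    print(entry.asset, entry.event, entry.metadata)
```

In this sketch the developer never writes "run step A, then step B." The ordering falls out of the declared dependencies, and the ledger can answer which assets were produced, from what, and with what metadata.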
Given that it competes with well-tested tools like Apache Airflow, Dagster also needs to work well for large organizations — and it needs to be a legitimate open source project, too. Like most open source startups, the company is layering enterprise features like single sign-on, role-based access and support for teams on top of the open source project as it builds out its commercial offering. And since Airflow is so popular, the team also recently launched a tool that allows current Airflow users to run data pipelines written for Airflow on Dagster.
Over the course of the last year, the number of active projects that use Dagster has tripled, the company says, and so has the overall open source community around it. Currently, companies like DoorDash, Flexport and Aritzia are using Dagster in production.
“Dagster was built from the ground up to provide a transformative developer experience while supporting the most demanding use cases in data engineering. Our unique abstractions and asset-first approach are really resonating with data practitioners, and we’re seeing this play out in our key growth metrics,” said Schrock.
The company plans to use most of the new funding to build out its go-to-market organization.
“Our R&D team adopted Dagster for data orchestration over a year ago after an evaluation of the solutions in the space. We’ve been impressed with how Dagster has accelerated our engineering team’s productivity and ability to efficiently ship production-grade data pipelines,” said Emily Walsh, lead investor at Georgian.