Snowflake Open Sources Apache Iceberg-related Polaris Data Catalog

Alongside the announcement that Polaris Catalog is now open source, Dremio said it is sharing Project Nessie’s capabilities with Polaris.

Surprise! Surprise! More open source has landed for the enterprise.

On Tuesday, the Montana-based cloud company Snowflake announced that Polaris Catalog is now officially an open source project under the Apache 2.0 license. Released on June 3, Polaris is a catalog for Apache Iceberg, the open source, high-performance format for huge analytic tables that was initially developed by Netflix. It is designed to interoperate with a range of cloud and data platforms, including Amazon Web Services, Confluent, Dremio, Google Cloud, Microsoft Azure, and Salesforce.

The announcement, which was primarily penned by Snowflake’s principal software engineers Tyler Akidau and Russell Spitzer, also promised that Snowflake’s cloud-hosted version of Polaris will essentially be the same as the open source version that can be self-hosted.

“While other vendor-hosted catalogs deviate from the open source specification, which leads to lock-in, Snowflake’s service for Polaris Catalog is designed to be fully compatible with Polaris Catalog’s open source implementation both now and in the future,” they wrote. “Snowflake handles the responsibilities of running the service like providing an endpoint, deploying bug fixes, and users get a completely portable catalog for their data, which can be used with Iceberg REST catalog-compatible tools.”

Tuesday’s announcement was largely a formality that made the open sourcing of the project official. When the project was released in June, the company said, “Polaris Catalog will be both open sourced in the next 90 days and available to run in public preview in Snowflake infrastructure soon.”

Now that’s all a done deal.

‘Merging’ With Project Nessie

Support for Snowflake’s move came quickly from inside the Apache Iceberg community, including a contribution that will help unify the project with established Iceberg-focused technology. That should make it easier for DevOps teams to incorporate Polaris into their workflows.

Included in Tuesday’s announcement was news that the unified lakehouse analytic platform Dremio has contributed the capabilities of its transactional catalog for data lakes, Project Nessie.

“We are delighted to support the launch of Polaris Catalog as open source under the Apache license and look forward to actively contributing to its success,” Tomer Shiran, Dremio’s co-founder and CPO, said in a statement. “With over four years of experience building Project Nessie as an open source Apache Iceberg catalog, we’re excited to share its differentiated capabilities, such as catalog-level versioning, multi-engine support, multi-table transactions and Git for data, with Polaris Catalog and the broader community.”

While this isn’t quite the “merger” that some articles on other sites are claiming it is (nor should it be), it should help remove some pain points for users.
