Overview

Data lake flexibility & data warehouse performance in a single platform.

An open data lakehouse helps organizations run fast analytics on all of their data, structured and unstructured, at massive scale. It eliminates data silos and lets data teams collaborate on the same data with the tools of their choice, on any public or private cloud.

This modern data architecture delivers data reliability with ease of data management. Run BI, AI, ML, and streaming analytics on the same data without ever moving or locking your data.

Cloudera delivers the world's only open data lakehouse providing the following benefits:

Open architecture

Cloudera’s data lakehouse powered by Apache Iceberg is 100% open—open source, open standards based, with wide community adoption. It can store multiple data formats and enables multiple engines to work on the same data.

Ease of adoption

By integrating Iceberg directly into the Shared Data Experience (SDX), Cloudera offers the easiest path to deploying a lakehouse. Additional capabilities such as schema evolution and hidden partitioning simplify data management for large data sets.
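To make hidden partitioning more concrete, here is a minimal sketch in plain Python. It is a toy model, not Iceberg's or Cloudera's API: the `ToyTable` class, its methods, and the `event_time` column are all hypothetical names chosen for illustration. The idea it shows is that the table, not the writer or the reader, derives the partition value from a column, and a filter on that column is enough to skip partitions that cannot match.

```python
from collections import defaultdict
from datetime import datetime


class ToyTable:
    """Toy in-memory table partitioned by day(event_time). Not Iceberg's API."""

    def __init__(self):
        # Partition value (a date) -> list of rows stored in that partition.
        self._partitions = defaultdict(list)

    def append(self, row):
        # The writer supplies only event_time; the table derives the partition.
        self._partitions[row["event_time"].date()].append(row)

    def scan(self, start, end):
        """Return (rows, partitions_read) for start <= event_time < end."""
        rows, partitions_read = [], 0
        for day, part in self._partitions.items():
            # Skip partitions whose day lies outside the requested range.
            if day < start.date() or day > end.date():
                continue
            partitions_read += 1
            rows.extend(r for r in part if start <= r["event_time"] < end)
        return rows, partitions_read


table = ToyTable()
table.append({"id": 1, "event_time": datetime(2024, 1, 1, 9, 0)})
table.append({"id": 2, "event_time": datetime(2024, 1, 2, 10, 0)})
table.append({"id": 3, "event_time": datetime(2024, 1, 5, 11, 0)})

# The query filters on event_time only; no partition column is named.
rows, partitions_read = table.scan(datetime(2024, 1, 2), datetime(2024, 1, 3))
```

In this sketch neither the append nor the scan mentions a partition column, which is the property the word "hidden" refers to.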

Multi-cloud

Build a data lakehouse anywhere, on any public cloud or in your own data center. Build once and run anywhere without any headaches. Cloudera offers the same data services with full portability on all clouds.

Secure and governed

The Iceberg tables in Cloudera integrate within SDX, allowing for unified security, fine-grained policies, governance, lineage, and metadata management across multiple clouds, so you can focus on analyzing your data while we take care of the rest.

Cloudera's Open Data Lakehouse is now available on private cloud.
Key Components
 

Supercharge your data with an open lakehouse

Multifunction analytics

Cloudera provides the full range of data services to run AI, ML, BI, streaming analytics, and data engineering on your data lakehouse. From ingestion and streaming to processing and persistence, orchestration, discovery, and access, powerful and scalable data services deliver key analytic functions. You can also bring your own choice of tools.

Open Table Format, Apache Iceberg

Apache Iceberg is the key building block of the open lakehouse. It is a high-performance open table format for large analytic tables that brings the reliability of SQL tables to big data, while making it possible for multiple compute engines to work concurrently. It offers rich capabilities like time travel, snapshot isolation, schema evolution, hidden partitioning and more.
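Time travel and snapshot isolation both follow from one idea: every commit produces a new immutable snapshot, and a reader works against exactly one snapshot. The sketch below illustrates that idea in plain Python. It is a simplified toy, not Iceberg's API or storage layout; `ToySnapshotTable` and its methods are hypothetical, and it copies rows into each snapshot where a real table format would track data files through metadata.

```python
class ToySnapshotTable:
    """Toy model of snapshot-based commits. Not Iceberg's API or file layout."""

    def __init__(self):
        # Each snapshot is an immutable tuple of rows; list index = snapshot id.
        self._snapshots = [()]

    @property
    def current_snapshot_id(self):
        return len(self._snapshots) - 1

    def append(self, rows):
        # A commit never mutates earlier snapshots; it adds a new one.
        self._snapshots.append(self._snapshots[-1] + tuple(rows))
        return self.current_snapshot_id

    def scan(self, snapshot_id=None):
        # A reader pins one snapshot, so later commits do not change its view.
        if snapshot_id is None:
            snapshot_id = self.current_snapshot_id
        return list(self._snapshots[snapshot_id])


table = ToySnapshotTable()
first = table.append(["order-1", "order-2"])
second = table.append(["order-3"])

latest_view = table.scan()           # reads the current snapshot
historical_view = table.scan(first)  # "time travel" to the earlier snapshot
```

Because old snapshots are never modified, a long-running reader and a concurrent writer do not interfere with each other in this model.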

Shared Data Experience (SDX)

SDX is a fundamental part of Cloudera that delivers unified security and governance technologies built on metadata. Providing full data management across data and analytics on all infrastructures everywhere, SDX reduces risk and operational costs. IT can deploy fully secured and governed data lakehouses faster, giving more users access to more data, without compromise.

Robust Data Catalog

Find, curate, and tag data anywhere across all infrastructures and generate relevant insight with Cloudera Data Catalog: 

  • Search, view and access all your data from a single place 

  • Understand, document, and monitor data and its use

  • Collaborate and share data responsibly with full insight

NEW YORKER: Harnessing Data Insights to Identify Fashion Trends and Reduce Stockouts. The data lakehouse helps global retailer NEW YORKER anticipate customer needs for a better in-store experience.

 

“Cloudera Data Platform’s reputation, reliability, scalability, speed and great customer support were influential factors in its selection.”

—Steffen Minz, Head of Data Science, NEW YORKER

Use AI Via an End-to-End Data Lakehouse to Increase Data Lifecycle Efficiency

GigaOm Radar for Data Lakes & Lakehouses

Cloudera named a 2024 market leader for data lakehouses.
 

Resources
 

Discover more insights on managing data anywhere

Webinar

How the open data lakehouse enables enterprise AI

Whitepaper

Introducing Apache Iceberg: The Case for an Open Data Lakehouse Powered by Cloudera

World-class training, support, & services
