Is it possible to have a data lake in YugabyteDB?


Is it possible to create or have a data lake in YugabyteDB, like the data lake in a Hadoop environment?

Hi @janani_in

YugabyteDB is a “database”: think PostgreSQL, MySQL, or Cassandra. All data is stored in “rows”. See datalake vs database comparison. Data lakes usually store raw files, the way object storage solutions do, while databases store rows.

YugabyteDB is an OLTP-oriented database, which means it is better at small, fast transactions.

Hadoop and other “Data Lake” environments are usually aimed at long-running queries that explore terabytes, if not petabytes, of data by breaking it into chunks and reading it off disk.

YugabyteDB can handle large amounts of data, but it likes to have indexes that let it pull out specific records, and it benefits from keeping hot data tiered in memory. Its core use case is not ad hoc scans that read large datasets.
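
To make the difference between an indexed lookup and an ad hoc scan concrete, here is a rough sketch. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not YugabyteDB, and the `orders` table, the `idx_orders_customer` index and the `query_plan` helper are made-up names for illustration. The idea carries over, though: the first query can use an index to jump to a few rows, while the second has to read every row.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the plan detail strings SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each plan row is (id, parent, notused, detail); keep the detail text.
    return [row[3] for row in rows]


# In-memory database standing in for a row store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)

# OLTP-style access: the index narrows the work to one customer's rows.
point_plan = query_plan(
    conn, "SELECT amount FROM orders WHERE customer_id = ?", (42,)
)

# Analytics-style access: no predicate, so every row has to be read.
scan_plan = query_plan(conn, "SELECT SUM(amount) FROM orders")

print("point lookup:", point_plan)
print("full scan:   ", scan_plan)
```

The first plan should mention a SEARCH using the index and the second a SCAN of the whole table (exact wording varies by SQLite version). The second pattern, repeated over terabytes, is the workload that data lake tooling is built around.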