Hudi
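Overview
Hudi lets you build streaming data lakes with incremental data pipelines on a self-managing database layer, while remaining optimized for lake engines and regular batch processing.
Key features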
Upserts and deletes with fast, pluggable indexing.
Transactions, rollbacks, and concurrency control.
Automatic file sizing, data clustering, compaction, and cleaning.
Built-in metadata tracking for scalable storage access.
Incremental queries and record-level change streams.
SQL reads and writes from Spark, Presto, Trino, Hive, and more.
Streaming ingestion, with built-in CDC sources and tools.
Backwards-compatible schema evolution and enforcement.
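To make the first and fifth features more concrete, here is a minimal conceptual sketch in plain Python of what keyed upserts, deletes, and incremental queries mean. It is not Hudi's API: the class `ToyTable`, its methods, and the integer commit counter are all hypothetical names invented for illustration, and the whole table lives in an in-memory dict rather than in files on a data lake.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple

Row = Optional[Dict[str, Any]]  # None marks a deleted record


@dataclass
class ToyTable:
    """Hypothetical in-memory stand-in for a keyed table; NOT Hudi's API."""

    # record key -> (commit time of the last change, current row or None)
    rows: Dict[str, Tuple[int, Row]] = field(default_factory=dict)
    commit_time: int = 0  # assumption: commits are numbered 1, 2, 3, ...

    def upsert(self, records: List[Dict[str, Any]], key: str = "id") -> int:
        """Insert new keys and overwrite existing ones in one commit."""
        self.commit_time += 1
        for rec in records:
            self.rows[rec[key]] = (self.commit_time, dict(rec))
        return self.commit_time

    def delete(self, keys: List[str]) -> int:
        """Mark existing keys as deleted in one commit."""
        self.commit_time += 1
        for k in keys:
            if k in self.rows:
                self.rows[k] = (self.commit_time, None)
        return self.commit_time

    def snapshot(self) -> List[Dict[str, Any]]:
        """Latest state of every live record."""
        return [row for _, row in self.rows.values() if row is not None]

    def incremental(self, since: int) -> List[Tuple[str, Row]]:
        """Records changed after commit `since`; a None row is a delete."""
        return [(k, row) for k, (t, row) in self.rows.items() if t > since]


table = ToyTable()
c1 = table.upsert([{"id": "a", "v": 1}, {"id": "b", "v": 2}])
table.upsert([{"id": "a", "v": 10}])  # same key, so this is an update
table.delete(["b"])

print(table.snapshot())       # only the live, latest version of each key
print(table.incremental(c1))  # only what changed after the first commit
```

The dict keyed by record key plays the role that an index plays in a real system: it lets an incoming record find the existing version it replaces. The incremental query returns only the changes made after a given commit, which is what allows a downstream pipeline to process new data without rescanning the whole table.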