Apache Iceberg: The Definitive Guide: Data Lakehouse Functionality, Performance, and Scalability on the Data Lake

Title: Apache Iceberg: The Definitive Guide: Data Lakehouse Functionality, Performance, and Scalability on the Data Lake
Authors: Tomer Shiran, Jason Hughes, Alex Merced
Publisher: O’Reilly Media, Inc.
Year: 2024
Pages: 479
Language: English
Format: PDF, EPUB (true)
Size: 14.0 MB

Traditional data architecture patterns are severely limited. To use these patterns, you have to ETL data into each tool, a cost-prohibitive process for making warehouse features available to all of your data. The lack of flexibility with these patterns requires you to lock into a set of proprietary tools and formats, which creates data silos and data drift. This practical book shows you a better way.

Apache Iceberg provides the capabilities, performance, scalability, and savings that fulfill the promise of an open data lakehouse. By following the lessons in this book, you'll be able to achieve interactive, batch, machine learning, and streaming analytics with this high-performance open source format. Authors Tomer Shiran, Jason Hughes, and Alex Merced from Dremio show you how to get started with Iceberg.

In these pages, you’ll learn what Apache Iceberg is, why it exists, how it works, and how to harness its power. Designed for data engineers, architects, scientists, and analysts working with large datasets across various use cases from BI dashboards to AI/ML, this book explores the core concepts, inner workings, and practical applications of Apache Iceberg. By the time you reach the end, you will have grasped the essentials and possess the practical knowledge to implement Apache Iceberg effectively in your data projects. Whether you are a newcomer or an experienced practitioner, Apache Iceberg: The Definitive Guide will be your trusted companion on this enlightening journey into Apache Iceberg.

Part II of the book delves into the practical aspects of using Apache Iceberg with widely used compute engines and standalone APIs, including Apache Spark, Dremio’s SQL Engine, AWS Glue, Apache Flink, and PyIceberg. For a bonus chapter on the Iceberg Java/Python APIs, visit the supplemental repository. The primary focus is to provide in-depth explanations and code examples that demonstrate how Apache Iceberg works with various compute engines, so that you can apply and build on the theoretical concepts discussed in the previous chapters. Visit the book’s GitHub repository to learn how to create a data lakehouse environment on your computer with Docker and to get hands-on with tools such as Apache Spark, Apache Flink, and Dremio.

With this book, you'll learn:
• The architecture of Apache Iceberg tables
• What happens under the hood when you perform operations on Iceberg tables
• How to further optimize Iceberg tables for maximum performance
• How to use Iceberg with popular data engines such as Apache Spark, Apache Flink, and Dremio

Discover why Apache Iceberg is a foundational technology for implementing an open data lakehouse.

Posted by: Ingvar16, 9-05-2024, 21:35

Often downloaded with this publication:

• Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh. Author: James Serra...
• Architecting a Modern Data Warehouse for Large Enterprises. Authors: Anjani Kumar, Abhishek Mishra, Sanjeev Kumar. Publisher: Apress. Year:...
• Delta Lake: Up and Running: Modern Data Lakehouse Architectures with Delta Lake (Final). Authors: Bennie Haelen, Dan Davis. Publisher:...
• Delta Lake: Up and Running: Modern Data Lakehouse Architectures with Delta Lake (5th Early Release). Authors: Bennie Haelen, Dan Davis...
• The Cloud Data Lake: A Guide to Building Robust Cloud Data Architecture (Final Release). Author: Rukmani Gopalan. Publisher: O’Reilly...
• Trino: The Definitive Guide: SQL at Any Scale, on Any Storage, in Any Environment, 2nd Edition (Final). Authors: Matt Fuller, Manfred Moser,...
• The Azure Data Lakehouse Toolkit. Author: Ron L’Esteve. Publisher: Apress. Year: 2022. Format: True PDF. Pages: 467. Size: 26 MB. Language:...
• Beginning Azure Synapse Analytics: Transition from Data Warehouse to Data Lakehouse. Author: Bhadresh Shiyal. Publisher: Apress. Year: 2021...
• Delta Lake: The Definitive Guide (Early Release). Authors: Denny Lee, Tathagata Das & Vini Jaiswal. Publisher: O’Reilly Media. Year:...
• Data Lake for Enterprises. Authors: Tomcy John, Pankaj Misra. Publisher: Packt Publishing. Year: 2017. Pages: 596. Format: True PDF, EPUB,...
