Data Virtualization in the Cloud Era

Title: Data Virtualization in the Cloud Era: Data Lakes and Data Federation At Scale
Author: Daniel Abadi, Andrew Mott
Publisher: O’Reilly Media, Inc.
Year: 2024-07-03
Pages: 184
Language: English
Format: pdf, epub, mobi
Size: 10.1 MB

For decades, data virtualization has been little more than a dream. How nice it would be if we could ignore all the details regarding where data is located and how it is stored, and simply access all data within an organization from a single unified interface! Unfortunately, this dream was held back by fundamental limitations of hardware and the complexity of the necessary software, so data virtualization remained a niche technology. However, in the last decade, advances in networking hardware and machine learning technology have started to transform data virtualization from dream to reality.

Language is a barrier beyond the fact that one dataset may be in English, another in Chinese, and another in Greek. Even if they are all in English, the computing systems that store the data may require questions to be posed in different query languages in order to extract data from, or answer questions about, these datasets. One system may have an SQL interface, another GraphQL, and a third may support only text search. A client who wishes to pose a question to these differing systems needs to learn the language that each system supports as its interface.

The goal of data virtualization (DV) is to eliminate or alleviate these barriers. A DV System creates a central interface through which data can be accessed no matter where it is located, how it is stored, or how it is organized. The system does not physically move the data to a central location. Rather, the data exists there virtually. A user of the system is given the impression that all data is in one place, even though in reality it may be spread across the world. Furthermore, the user is presented with information about what datasets exist, how they are organized, and enough of the semantic details of each dataset to be able to formulate queries over them. The user can then issue commands that access any dataset virtualized by the system without needing to know any of the physical details regarding where data is located, which systems are being used to store it, and how the data is compressed or organized in storage.
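To make the idea of a central interface concrete, here is a minimal sketch of a catalog that shows users which datasets exist and how they are organized while keeping physical details out of view. Everything in it is hypothetical and chosen for illustration: the `DatasetEntry` and `Catalog` names, the fields, and the example locations are assumptions, not taken from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    name: str
    columns: dict          # column name -> type name (logical schema)
    description: str       # semantic detail shown to users
    location: str          # physical detail, hidden from users
    storage_format: str    # physical detail, hidden from users


class Catalog:
    """Toy catalog: users see logical metadata, the engine sees physical details."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def list_datasets(self):
        # What a user browses: dataset names only.
        return sorted(self._entries)

    def describe(self, name):
        # User-facing view: schema and description, no location or format.
        e = self._entries[name]
        return {"name": e.name, "columns": dict(e.columns), "description": e.description}

    def resolve(self, name):
        # Engine-facing view: where and how the data is physically stored.
        e = self._entries[name]
        return e.location, e.storage_format


catalog = Catalog()
catalog.register(DatasetEntry(
    name="orders",
    columns={"customer_id": "int", "total": "float"},
    description="One row per customer order",
    location="s3://example-bucket/orders/",       # hypothetical location
    storage_format="parquet",
))
catalog.register(DatasetEntry(
    name="customers",
    columns={"id": "int", "name": "str", "country": "str"},
    description="One row per customer",
    location="postgres://example-host/crm",       # hypothetical location
    storage_format="table",
))

user_view = catalog.describe("orders")
```

The split between `describe` and `resolve` is the design point: a user formulates queries from the logical view alone, and only the engine ever looks up the physical one.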

The most complex part of a DV System is the data virtualization engine (DV Engine), which receives requests from clients (generated using the client interface) and performs whatever processing these requests require. This typically involves communicating with the specific underlying data sources that contain data relevant to those requests. The DV Engine thus needs to know how to communicate with the variety of systems that may store the data being virtualized. Furthermore, it may need to forward parts of client requests to these underlying source systems. The engine therefore needs to know how to express these requests so that each underlying data source can execute them efficiently and return the results in a form that the DV System can consume at scale. The DV Engine may also need to combine results received from multiple underlying data source systems involved in a client request.
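The flow described above (forward part of a request to each source, then combine the results) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the design of any real DV Engine: two in-memory SQLite databases stand in for remote sources, and the `ToyDVEngine` class, its methods, and the sample data are all hypothetical.

```python
import sqlite3


def make_sqlite_source(table, columns, rows):
    """Create an in-memory SQLite database standing in for a remote source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return conn


class ToyDVEngine:
    def __init__(self):
        self.sources = {}  # dataset name -> (connection, table name)

    def register(self, name, conn, table):
        self.sources[name] = (conn, table)

    def fetch(self, name, columns, where=None, params=()):
        # Push the projection and filter down to the source so that only
        # the relevant rows and columns are returned to the engine.
        conn, table = self.sources[name]
        sql = f"SELECT {', '.join(columns)} FROM {table}"
        if where:
            sql += f" WHERE {where}"
        return conn.execute(sql, params).fetchall()

    @staticmethod
    def join(left_rows, right_rows, left_key, right_key):
        # Combine results from two sources with a simple hash join
        # performed inside the engine.
        index = {}
        for row in right_rows:
            index.setdefault(row[right_key], []).append(row)
        return [l + r for l in left_rows for r in index.get(l[left_key], [])]


engine = ToyDVEngine()
engine.register(
    "customers",
    make_sqlite_source("customers", ["id", "name", "country"],
                       [(1, "Ada", "UK"), (2, "Linus", "FI"), (3, "Grace", "US")]),
    "customers",
)
engine.register(
    "orders",
    make_sqlite_source("orders", ["customer_id", "total"],
                       [(1, 120.0), (3, 75.5), (3, 20.0), (2, 5.0)]),
    "orders",
)

# "Orders over 50 placed by US customers": each filter runs at its source,
# and the engine joins the two partial results.
customers = engine.fetch("customers", ["id", "name"], "country = ?", ("US",))
orders = engine.fetch("orders", ["customer_id", "total"], "total > ?", (50,))
result = engine.join(customers, orders, left_key=0, right_key=0)
```

Filtering at the source rather than in the engine is what keeps the transferred data small. A real engine would also have to decide which operations each source is able to execute at all.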

In general, the goal of data virtualization is to allow clients to express requests over datasets without having to worry about how the underlying data source systems store the source data. Yet most underlying data sources have unique interfaces that require expertise in that particular system before data can be accessed. Therefore, the DV Engine typically has to translate on the fly from the global interface that clients use to access any underlying system into the particular interface of each underlying data source.
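As a rough illustration of this on-the-fly translation, the sketch below turns one source-neutral request into three source-specific forms: a parameterized SQL statement, a GraphQL query, and a plain text-search string. The `Request` type, the translator functions, and the GraphQL field and argument names are assumptions made for the example; a real GraphQL source defines its own schema, and a real translator would handle far more than equality filters.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical source-neutral request: dataset, fields, equality filters."""
    dataset: str
    fields: list
    filters: dict = field(default_factory=dict)


def to_sql(req):
    # Parameterized SQL: values travel separately from the statement text.
    sql = f"SELECT {', '.join(req.fields)} FROM {req.dataset}"
    if req.filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in req.filters)
    return sql, tuple(req.filters.values())


def to_graphql(req):
    # Assumes the source exposes a field named after the dataset that
    # accepts the filter columns as arguments (schema-specific in practice).
    args = ", ".join(f"{k}: {json.dumps(v)}" for k, v in req.filters.items())
    call = f"{req.dataset}({args})" if args else req.dataset
    return f"{{ {call} {{ {' '.join(req.fields)} }} }}"


def to_text_search(req):
    # A text-only source cannot project or filter by column, so the best
    # this sketch can do is search for the filter values as keywords.
    return " ".join(str(v) for v in req.filters.values())


TRANSLATORS = {"sql": to_sql, "graphql": to_graphql, "text": to_text_search}


def translate(req, dialect):
    return TRANSLATORS[dialect](req)


req = Request("customers", ["id", "name"], {"country": "US"})
sql_form = translate(req, "sql")
graphql_form = translate(req, "graphql")
text_form = translate(req, "text")
```

The text-search case shows that translation can be lossy: when a source's interface is less expressive than the global one, the engine has to finish the remaining work (here, projecting columns and checking the filter) itself.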

In this book, we discuss:

What is data virtualization and why is it useful?
What are the technical underpinnings that make virtualization more practical today than it ever has been in the past?
Where does data virtualization fit into the modern data mesh and data fabric paradigms?

Posted by: Ingvar16, 6-07-2024, 19:30
