Title: Hadoop with Python
Authors: Zachary Radtka, Donald Miner
Publisher: O'Reilly
Year: 2015
Format: PDF
Pages: 71
Size: 1.75 MB
Language: English
Hadoop is mostly written in Java, but that doesn't exclude the use of other programming languages with this distributed storage and processing framework, particularly Python. With this concise book, you'll learn how to use Python with the Hadoop Distributed File System (HDFS), MapReduce, the Apache Pig platform and Pig Latin script, and the Apache Spark cluster-computing framework.
Authors Zachary Radtka and Donald Miner from the data science firm Miner & Kasch take you through the basic concepts behind Hadoop, MapReduce, Pig, and Spark. Then, through multiple examples and use cases, you'll learn how to work with these technologies by applying various Python tools.
- Use the Python library Snakebite to access HDFS programmatically from within Python applications
- Write MapReduce jobs in Python with mrjob, the Python MapReduce library
- Extend Pig Latin with user-defined functions (UDFs) in Python
- Use the Spark Python API (PySpark) to write Spark programs with Python
- Learn how to use the Luigi Python workflow scheduler to manage MapReduce jobs and Pig scripts
Zachary Radtka, a platform engineer at Miner & Kasch, has extensive experience creating custom analytics that run on petabyte-scale data sets.
Donald Miner, founder of Miner & Kasch, specializes in Hadoop enterprise architecture and applying machine learning to real-world business problems.
Frequently downloaded with this publication:
Big Data Processing with Apache Spark. Author: Srini Penchikala. Year: 2018. Pages: 104. Format: PDF. Size: 10 MB. Language: English...
Hadoop For Dummies. Author: Dirk deRoos. Publisher: For Dummies. Year: 2014. ISBN: 9781118607558. Series: For Dummies. Format: PDF. Pages: 369...
Learning Spark: Lightning-Fast Big Data Analysis. Authors: Holden Karau, Andy Konwinski, Patrick Wendell. Publisher: O'Reilly Media. ISBN:...