Title: Developing Spark Applications with Python Author: Xavier Morera, Nereo Campos Publisher: Big Data Inc Year: December 16, 2019 Pages: 103 Language: English Format: pdf Size: 14.4 MB
If you are going to work with Big Data or Machine Learning, you need to learn Apache Spark. If you need to learn Spark, you should get this book. Spark is fast, and it integrates with many different platforms and storage systems. It can be used for data engineering or Machine Learning. Knowledge of Apache Spark is an excellent skill to have in your toolbelt.
If you are reading this book, you probably have a pretty good idea of what Spark is, or at least what it is used for. Many applications could benefit from the programming model proposed by Spark, and there are tons of data waiting to be analyzed for insights that can change the course of a business. Perhaps you have seen the catchy phrase on Apache Spark’s site, “lightning-fast unified analytics engine.” What does this really mean? Can Spark solve all problems? What is Apache Spark? How does it work? In this chapter, we will show you the features of Spark that make it amazing. Spark is an open source cluster computing framework for processing and analyzing large amounts of data, like Hadoop MapReduce, but generally faster because it can keep intermediate data in memory. Spark handles both batch and streaming data with high performance, and it scales well. Apart from high performance, Spark offers other benefits that make it powerful and easy to use.
Spark redefines the way we work with Big Data: it is an open source, lightning-fast, general-purpose distributed framework that is easy to use and easy to learn, has a large and vibrant community, and works well with the rest of the Hadoop ecosystem as well as with other platforms, products, and cloud services. Spark can help you process large amounts of data in both data engineering and machine learning. Welcome to the Spark era!
Table of Contents: 1 The Spark Era; 2 Understanding Apache Spark; 3 Getting Technical with Spark; 4 Spark’s RDDs; 5 Going Deeper into Spark Core; 6 Data Frames and Spark SQL; 7 Spark SQL; 8 Understanding Typed API: DataSet; 9 Spark Streaming; 10 Exploring NOAA’s Datasets; 11 Final words; 12 About the Authors
Title: Apache Spark: Invent The Future Author: Ernesto Lee Publisher: Independently published Year: 2021 Pages: 482 Language: English Format:...
Title: Spark with Python Author: Athul Dev Publisher: Independently published Year: 2020 Pages: 154 Language: English Format: pdf (true),...
Title: Big Data Processing with Apache Spark Author: Srini Penchikala Publisher: Year: 2018 Pages: 104 Format: PDF Size: 10 MB Language: English...
Title: Practical Apache Spark: Using the Scala API Author: Subhashini Chellappan, Dharanitharan Ganesan Publisher: Apress Year: 2019 Pages:...
Title: Learning Spark: Lightning-Fast Big Data Analysis Author: Holden Karau, Andy Konwinski, Patrick Wendell Publisher: O'Reilly Media ISBN:...