Learning Spark

Author: Jules S. Damji
Publisher: O'Reilly Media
Total Pages: 400
Release: 2020-07-16
Genre: Computers
ISBN: 1492050016

Data is bigger, arrives faster, and comes in a variety of formats, and it all needs to be processed at scale for analytics or machine learning. But how can you process such varied workloads efficiently? Enter Apache Spark. Updated to include Spark 3.0, this second edition shows data engineers and data scientists why structure and unification in Spark matter. Specifically, this book explains how to perform simple and complex data analytics and employ machine learning algorithms. Through step-by-step walk-throughs, code snippets, and notebooks, you’ll be able to:
• Learn Python, SQL, Scala, or Java high-level Structured APIs
• Understand Spark operations and SQL Engine
• Inspect, tune, and debug Spark operations with Spark configurations and Spark UI
• Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka
• Perform analytics on batch and streaming data using Structured Streaming
• Build reliable data pipelines with open source Delta Lake and Spark
• Develop machine learning pipelines with MLlib and productionize models using MLflow

Building the Data Lakehouse

Author: Bill Inmon
Publisher: Technics Publications
Total Pages: 256
Release: 2021-10
Genre:
ISBN: 9781634629669

The data lakehouse is the next generation of the data warehouse and data lake, designed to meet today's complex and ever-changing analytics, machine learning, and data science requirements. Learn about the features and architecture of the data lakehouse, along with its powerful analytical infrastructure. Appreciate how the universal common connector blends structured, textual, analog, and IoT data. Maintain the lakehouse for future generations through Data Lakehouse Housekeeping and Data Future-proofing. Know how to incorporate the lakehouse into an existing data governance strategy. Incorporate data catalogs, data lineage tools, and open source software into your architecture to ensure your data scientists, analysts, and end users live happily ever after.

Distributed Data Systems with Azure Databricks

Author: Alan Bernardo Palacio
Publisher: Packt Publishing Ltd
Total Pages: 414
Release: 2021-05-25
Genre: Computers
ISBN: 1838642692

Quickly build and deploy massive data pipelines and improve productivity using Azure Databricks

Key Features
• Get to grips with the distributed training and deployment of machine learning and deep learning models
• Learn how ETLs are integrated with Azure Data Factory and Delta Lake
• Explore deep learning and machine learning models in a distributed computing infrastructure

Book Description
Microsoft Azure Databricks helps you to harness the power of distributed computing and apply it to create robust data pipelines, along with training and deploying machine learning and deep learning models. Databricks' advanced features enable developers to process, transform, and explore data. Distributed Data Systems with Azure Databricks will help you to put your knowledge of Databricks to work to create big data pipelines. The book provides a hands-on approach to implementing Azure Databricks and its associated methodologies that will make you productive in no time. Complete with detailed explanations of essential concepts, practical examples, and self-assessment questions, you’ll begin with a quick introduction to Databricks core functionalities, before performing distributed model training and inference using TensorFlow and Spark MLlib. As you advance, you’ll explore MLflow Model Serving on Azure Databricks and implement distributed training pipelines using HorovodRunner in Databricks. Finally, you’ll discover how to transform, use, and obtain insights from massive amounts of data to train predictive models and create entire fully working data pipelines. By the end of this MS Azure book, you’ll have gained a solid understanding of how to work with Databricks to create and manage an entire big data pipeline.

What you will learn
• Create ETLs for big data in Azure Databricks
• Train, manage, and deploy machine learning and deep learning models
• Integrate Databricks with Azure Data Factory for extract, transform, load (ETL) pipeline creation
• Discover how to use Horovod for distributed deep learning
• Find out how to use Delta Engine to query and process data from Delta Lake
• Understand how to use Data Factory in combination with Databricks
• Use Structured Streaming in a production-like environment

Who this book is for
This book is for software engineers, machine learning engineers, data scientists, and data engineers who are new to Azure Databricks and want to build high-quality data pipelines without worrying about infrastructure. Knowledge of Azure Databricks basics is required to learn the concepts covered in this book more effectively. A basic understanding of machine learning concepts and beginner-level Python programming knowledge is also recommended.

Azure Databricks Cookbook

Author: Phani Raj
Publisher: Packt Publishing Ltd
Total Pages: 452
Release: 2021-09-17
Genre: Computers
ISBN: 178961855X

Get to grips with building and productionizing end-to-end big data solutions in Azure and learn best practices for working with large datasets

Key Features
• Integrate with Azure Synapse Analytics, Cosmos DB, and Azure HDInsight Kafka Cluster to scale and analyze your projects and build pipelines
• Use Databricks SQL to run ad hoc queries on your data lake and create dashboards
• Productionize a solution using CI/CD for deploying notebooks and Azure Databricks Service to various environments

Book Description
Azure Databricks is a unified collaborative platform for performing scalable analytics in an interactive environment. The Azure Databricks Cookbook provides recipes to get hands-on with the analytics process, including ingesting data from various batch and streaming sources and building a modern data warehouse. The book starts by teaching you how to create an Azure Databricks instance within the Azure portal, Azure CLI, and ARM templates. You'll work through clusters in Databricks and explore recipes for ingesting data from sources, including files, databases, and streaming sources such as Apache Kafka and EventHub. The book will help you explore all the features supported by Azure Databricks for building powerful end-to-end data pipelines. You'll also find out how to build a modern data warehouse by using Delta tables and Azure Synapse Analytics. Later, you'll learn how to write ad hoc queries and extract meaningful insights from the data lake by creating visualizations and dashboards with Databricks SQL. Finally, you'll deploy and productionize a data pipeline as well as deploy notebooks and Azure Databricks service using continuous integration and continuous delivery (CI/CD). By the end of this Azure book, you'll be able to use Azure Databricks to streamline different processes involved in building data-driven apps.

What you will learn
• Read and write data from and to various Azure resources and file formats
• Build a modern data warehouse with Delta Tables and Azure Synapse Analytics
• Explore jobs, stages, and tasks and see how Spark lazy evaluation works
• Handle concurrent transactions and learn performance optimization in Delta tables
• Learn Databricks SQL and create real-time dashboards in Databricks SQL
• Integrate Azure DevOps for version control, deploying, and productionizing solutions with CI/CD pipelines
• Discover how to use RBAC and ACLs to restrict data access
• Build an end-to-end data processing pipeline for near real-time data analytics

Who this book is for
This recipe-based book is for data scientists, data engineers, big data professionals, and machine learning engineers who want to perform data analytics on their applications. Prior experience of working with Apache Spark and Azure is necessary to get the most out of this book.

Deep Learning with Azure

Author: Mathew Salvaris
Publisher: Apress
Total Pages: 298
Release: 2018-08-24
Genre: Computers
ISBN: 1484236793

Get up-to-speed with Microsoft's AI Platform. Learn to innovate and accelerate with open and powerful tools and services that bring artificial intelligence to every data scientist and developer. Artificial Intelligence (AI) is the new normal. Innovations in deep learning algorithms and hardware are happening at a rapid pace. It is no longer a question of should I build AI into my business, but more about where do I begin and how do I get started with AI? Written by expert data scientists at Microsoft, Deep Learning with the Microsoft AI Platform helps you with the how-to of doing deep learning on Azure and leveraging deep learning to create innovative and intelligent solutions. Benefit from guidance on where to begin your AI adventure, and learn how the cloud provides you with all the tools, infrastructure, and services you need to do AI.

What You'll Learn
• Become familiar with the tools, infrastructure, and services available for deep learning on Microsoft Azure such as Azure Machine Learning services and Batch AI
• Use pre-built AI capabilities (Computer Vision, OCR, gender, emotion, landmark detection, and more)
• Understand the common deep learning models, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), generative adversarial networks (GANs) with sample code and understand how the field is evolving
• Discover the options for training and operationalizing deep learning models on Azure

Who This Book Is For
Professional data scientists who are interested in learning more about deep learning and how to use the Microsoft AI platform. Some experience with Python is helpful.

Spark: The Definitive Guide

Author: Bill Chambers
Publisher: O'Reilly Media, Inc.
Total Pages: 594
Release: 2018-02-08
Genre: Computers
ISBN: 1491912294

Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library.
• Get a gentle overview of big data and Spark
• Learn about DataFrames, SQL, and Datasets (Spark’s core APIs) through worked examples
• Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames
• Understand how Spark runs on a cluster
• Debug, monitor, and tune Spark clusters and applications
• Learn the power of Structured Streaming, Spark’s stream-processing engine
• Learn how you can apply MLlib to a variety of problems, including classification or recommendation

Mastering Azure for Predictive Analytics and Machine Learning

Author: KRISHNA KISHOR TIRUPATI SATISH VADLAMANI SHALU JAIN A RENUKA
Publisher: DeepMisti Publication
Total Pages: 213
Release: 2024-10-09
Genre: Computers
ISBN: 9360447439

In today's data-driven world, the ability to harness the power of predictive analytics and machine learning has become a pivotal force in shaping innovation across industries. This book, Mastering Azure for Predictive Analytics and Machine Learning, aims to bridge the gap between cloud technology and the analytical tools needed to drive insights from complex data. Our objective is to provide readers with the foundational knowledge and advanced techniques necessary to leverage Microsoft Azure for predictive modeling and machine learning applications. The structure of this book offers a comprehensive exploration of the tools, methodologies, and best practices that define modern analytics and machine learning in the cloud. From setting up your Azure environment to deploying machine learning models, we cover each stage with practical examples and detailed guidance. The content is designed for a broad audience, including students, data scientists, IT professionals, and business leaders who seek to use Azure’s capabilities to make data-informed decisions. Drawing from the latest industry research and real-world use cases, this book not only provides theoretical knowledge but also equips readers with hands-on skills they can apply in real-time data projects. Each chapter balances depth with accessibility, covering topics like data preparation, model building, and cloud-based deployment, while also touching on critical issues such as scalability, security, and automation. Additionally, we highlight best practices for managing Azure’s infrastructure and optimizing machine learning workflows within the platform. The inspiration for this book comes from the recognition of the growing role that cloud platforms like Azure play in transforming how organizations use data to innovate and compete.
We are immensely thankful to Chancellor Shri Shiv Kumar Gupta of Maharaja Agrasen Himalayan Garhwal University for his support and commitment to academic and technological excellence, which has been instrumental in making this book a reality. We hope that Mastering Azure for Predictive Analytics and Machine Learning will be a valuable resource for anyone looking to deepen their understanding of how cloud computing and machine learning can converge to unlock the full potential of predictive analytics. The knowledge contained in these pages is intended to empower readers to lead transformative data projects with confidence. Thank you for embarking on this journey with us.
Authors

Automated Machine Learning with Microsoft Azure

Author: Dennis Michael Sawyers
Publisher: Packt Publishing Ltd
Total Pages: 340
Release: 2021-04-23
Genre: Computers
ISBN: 1800561970

A practical, step-by-step guide to using Microsoft's AutoML technology on the Azure Machine Learning service for developers and data scientists working with the Python programming language

Key Features
• Create, deploy, productionalize, and scale automated machine learning solutions on Microsoft Azure
• Improve the accuracy of your ML models through automatic data featurization and model training
• Increase productivity in your organization by using artificial intelligence to solve common problems

Book Description
Automated Machine Learning with Microsoft Azure will teach you how to build high-performing, accurate machine learning models in record time. It will equip you with the knowledge and skills to easily harness the power of artificial intelligence and increase the productivity and profitability of your business. Guided user interfaces (GUIs) enable both novices and seasoned data scientists to easily train and deploy machine learning solutions to production. Using a careful, step-by-step approach, this book will teach you how to use Azure AutoML with a GUI as well as the AzureML Python software development kit (SDK). First, you'll learn how to prepare data, train models, and register them to your Azure Machine Learning workspace. You'll then discover how to take those models and use them to create both automated batch solutions using machine learning pipelines and real-time scoring solutions using Azure Kubernetes Service (AKS). Finally, you will be able to use AutoML on your own data to not only train regression, classification, and forecasting models but also use them to solve a wide variety of business problems. By the end of this Azure book, you'll be able to show your business partners exactly how your ML models are making predictions through automatically generated charts and graphs, earning their trust and respect.

What you will learn
• Understand how to train classification, regression, and forecasting ML algorithms with Azure AutoML
• Prepare data for Azure AutoML to ensure smooth model training and deployment
• Adjust AutoML configuration settings to make your models as accurate as possible
• Determine when to use a batch-scoring solution versus a real-time scoring solution
• Productionalize your AutoML and discover how to quickly deliver value
• Create real-time scoring solutions with AutoML and Azure Kubernetes Service
• Train a large number of AutoML models at once using the AzureML Python SDK

Who this book is for
Data scientists, aspiring data scientists, machine learning engineers, or anyone interested in applying artificial intelligence or machine learning in their business will find this machine learning book useful. You need to have beginner-level knowledge of artificial intelligence and a technical background in computer science, statistics, or information technology before getting started. Familiarity with Python will help you implement the more advanced features found in the chapters, but even data analysts and SQL experts will be able to train ML models after finishing this book.

Mastering Azure Machine Learning

Author: Kaijisse Waaijer
Publisher:
Total Pages: 394
Release: 2020-04-30
Genre: Computers
ISBN: 9781789807554

This book will help you learn how to build a scalable end-to-end machine learning pipeline in Azure from experimentation and training to optimization and deployment. By the end of this book, you will learn to build complex distributed systems and scalable cloud infrastructure using powerful machine learning algorithms to compute insights.

Microsoft Azure Security Center

Author: Yuri Diogenes
Publisher: Microsoft Press
Total Pages: 298
Release: 2018-06-04
Genre: Computers
ISBN: 1509307060

Discover high-value Azure security insights, tips, and operational optimizations. This book presents comprehensive Azure Security Center techniques for safeguarding cloud and hybrid environments. Leading Microsoft security and cloud experts Yuri Diogenes and Dr. Thomas Shinder show how to apply Azure Security Center’s full spectrum of features and capabilities to address protection, detection, and response in key operational scenarios. You’ll learn how to secure any Azure workload, and optimize virtually all facets of modern security, from policies and identity to incident response and risk management. Whatever your role in Azure security, you’ll learn how to save hours, days, or even weeks by solving problems in the most efficient, reliable ways possible. Two of Microsoft’s leading cloud security experts show how to:
• Assess the impact of cloud and hybrid environments on security, compliance, operations, data protection, and risk management
• Master a new security paradigm for a world without traditional perimeters
• Gain visibility and control to secure compute, network, storage, and application workloads
• Incorporate Azure Security Center into your security operations center
• Integrate Azure Security Center with Azure AD Identity Protection Center and third-party solutions
• Adapt Azure Security Center’s built-in policies and definitions for your organization
• Perform security assessments and implement Azure Security Center recommendations
• Use incident response features to detect, investigate, and address threats
• Create high-fidelity fusion alerts to focus attention on your most urgent security issues
• Implement application whitelisting and just-in-time VM access
• Monitor user behavior and access, and investigate compromised or misused credentials
• Customize and perform operating system security baseline assessments
• Leverage integrated threat intelligence to identify known bad actors