Snowflake: The Definitive Guide

Snowflake: The Definitive Guide
Author: Joyce Kay Avila
Publisher: "O'Reilly Media, Inc."
Total Pages: 489
Release: 2022-08-11
Genre: Computers
ISBN: 1098103777

Snowflake's ability to eliminate data silos and run workloads from a single platform creates opportunities to democratize data analytics, allowing users at all levels within an organization to make data-driven decisions. Whether you're an IT professional working in data warehousing or data science, a business analyst or technical manager, or an aspiring data professional wanting to get more hands-on experience with the Snowflake platform, this book is for you. You'll learn how Snowflake users can build modern integrated data applications and develop new revenue streams based on data. Using hands-on SQL examples, you'll also discover how the Snowflake Data Cloud helps you accelerate data science by avoiding replatforming or migrating data unnecessarily. You'll be able to: Efficiently capture, store, and process large amounts of data at an amazing speed Ingest and transform real-time data feeds in both structured and semistructured formats and deliver meaningful data insights within minutes Use Snowflake Time Travel and zero-copy cloning to produce a sensible data recovery strategy that balances system resilience with ongoing storage costs Securely share data and reduce or eliminate data integration costs by accessing ready-to-query datasets available in the Snowflake Marketplace

Jumpstart Snowflake

Jumpstart Snowflake
Author: Dmitry Anoshin
Publisher: Apress
Total Pages: 270
Release: 2019-12-20
Genre: Computers
ISBN: 1484253280

Explore the modern market of data analytics platforms and the benefits of using Snowflake computing, the data warehouse built for the cloud. With the rise of cloud technologies, organizations prefer to deploy their analytics using cloud providers such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform. Cloud vendors are offering modern data platforms for building cloud analytics solutions to collect data and consolidate into single storage solutions that provide insights for business users. The core of any analytics framework is the data warehouse, and previously customers did not have many choices of platform to use. Snowflake was built specifically for the cloud and it is a true game changer for the analytics market. This book will help onboard you to Snowflake, present best practices to deploy, and use the Snowflake data warehouse. In addition, it covers modern analytics architecture and use cases. It provides use cases of integration with leading analytics software such as Matillion ETL, Tableau, and Databricks. Finally, it covers migration scenarios for on-premise legacy data warehouses. What You Will Learn Know the key functionalities of Snowflake Set up security and access with cluster Bulk load data into Snowflake using the COPY command Migrate from a legacy data warehouse to Snowflake integrate the Snowflake data platform with modern business intelligence (BI) and data integration tools Who This Book Is For Those working with data warehouse and business intelligence (BI) technologies, and existing and potential Snowflake users

Snowflake Cookbook

Snowflake Cookbook
Author: Hamid Mahmood Qureshi
Publisher: Packt Publishing Ltd
Total Pages: 330
Release: 2021-02-25
Genre: Computers
ISBN: 1800560184

Develop modern solutions with Snowflake's unique architecture and integration capabilities; process bulk and real-time data into a data lake; and leverage time travel, cloning, and data-sharing features to optimize data operations Key Features Build and scale modern data solutions using the all-in-one Snowflake platform Perform advanced cloud analytics for implementing big data and data science solutions Make quicker and better-informed business decisions by uncovering key insights from your data Book Description Snowflake is a unique cloud-based data warehousing platform built from scratch to perform data management on the cloud. This book introduces you to Snowflake's unique architecture, which places it at the forefront of cloud data warehouses. You'll explore the compute model available with Snowflake, and find out how Snowflake allows extensive scaling through the virtual warehouses. You will then learn how to configure a virtual warehouse for optimizing cost and performance. Moving on, you'll get to grips with the data ecosystem and discover how Snowflake integrates with other technologies for staging and loading data. As you progress through the chapters, you will leverage Snowflake's capabilities to process a series of SQL statements using tasks to build data pipelines and find out how you can create modern data solutions and pipelines designed to provide high performance and scalability. You will also get to grips with creating role hierarchies, adding custom roles, and setting default roles for users before covering advanced topics such as data sharing, cloning, and performance optimization. By the end of this Snowflake book, you will be well-versed in Snowflake's architecture for building modern analytical solutions and understand best practices for solving commonly faced problems using practical recipes. What you will learn Get to grips with data warehousing techniques aligned with Snowflake's cloud architecture Broaden your skills as a data warehouse designer to cover the Snowflake ecosystem Transfer skills from on-premise data warehousing to the Snowflake cloud analytics platform Optimize performance and costs associated with a Snowflake solution Stage data on object stores and load it into Snowflake Secure data and share it efficiently for access Manage transactions and extend Snowflake using stored procedures Extend cloud data applications using Spark Connector Who this book is for This book is for data warehouse developers, data analysts, database administrators, and anyone involved in designing, implementing, and optimizing a Snowflake data warehouse. Knowledge of data warehousing and database and cloud concepts will be useful. Basic familiarity with Snowflake is beneficial, but not necessary.

Flow Architectures

Flow Architectures
Author: James Urquhart
Publisher: "O'Reilly Media, Inc."
Total Pages: 280
Release: 2021-01-06
Genre: Computers
ISBN: 1492075841

Software development today is embracing events and streaming data, which optimizes not only how technology interacts but also how businesses integrate with one another to meet customer needs. This phenomenon, called flow, consists of patterns and standards that determine which activity and related data is communicated between parties over the internet. This book explores critical implications of that evolution: What happens when events and data streams help you discover new activity sources to enhance existing businesses or drive new markets? What technologies and architectural patterns can position your company for opportunities enabled by flow? James Urquhart, global field CTO at VMware, guides enterprise architects, software developers, and product managers through the process. Learn the benefits of flow dynamics when businesses, governments, and other institutions integrate via events and data streams Understand the value chain for flow integration through Wardley mapping visualization and promise theory modeling Walk through basic concepts behind today's event-driven systems marketplace Learn how today's integration patterns will influence the real-time events flow in the future Explore why companies should architect and build software today to take advantage of flow in coming years

Trino: The Definitive Guide

Trino: The Definitive Guide
Author: Matt Fuller
Publisher: "O'Reilly Media, Inc."
Total Pages: 310
Release: 2021-04-14
Genre: Computers
ISBN: 1098107683

Perform fast interactive analytics against different data sources using the Trino high-performance distributed SQL query engine. With this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's Hive, Cassandra, a relational database, or a proprietary data store. Analysts, software engineers, and production engineers will learn how to manage, use, and even develop with Trino. Initially developed by Facebook, open source Trino is now used by Netflix, Airbnb, LinkedIn, Twitter, Uber, and many other companies. Matt Fuller, Manfred Moser, and Martin Traverso show you how a single Trino query can combine data from multiple sources to allow for analytics across your entire organization. Get started: Explore Trino's use cases and learn about tools that will help you connect to Trino and query data Go deeper: Learn Trino's internal workings, including how to connect to and query data sources with support for SQL statements, operators, functions, and more Put Trino in production: Secure Trino, monitor workloads, tune queries, and connect more applications; learn how other organizations apply Trino

Rise of the Data Cloud

Rise of the Data Cloud
Author: Frank Slootman
Publisher: AuthorHouse
Total Pages: 200
Release: 2020-12-18
Genre: Business & Economics
ISBN: 1728373069

The rise of the Data Cloud is ushering in a new era of computing. The world’s digital data is mass migrating to the cloud, where it can be more effectively integrated, managed, and mobilized. The data cloud eliminates data siloes and enables data sharing with business partners, capitalizing on data network effects. It democratizes data analytics, making the most sophisticated data science tools accessible to organizations of all sizes. Data exchanges enable businesses to discover, explore, and easily purchase or sell data—opening up new revenue streams. Business leaders have long dreamed of data driving their organizations. Now, thanks to the Data Cloud, nothing stands in their way.

The Data Warehouse Toolkit

The Data Warehouse Toolkit
Author: Ralph Kimball
Publisher: John Wiley & Sons
Total Pages: 464
Release: 2011-08-08
Genre: Computers
ISBN: 1118082141

This old edition was published in 2002. The current and final edition of this book is The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition which was published in 2013 under ISBN: 9781118530801. The authors begin with fundamental design recommendations and gradually progress step-by-step through increasingly complex scenarios. Clear-cut guidelines for designing dimensional models are illustrated using real-world data warehouse case studies drawn from a variety of business application areas and industries, including: Retail sales and e-commerce Inventory management Procurement Order management Customer relationship management (CRM) Human resources management Accounting Financial services Telecommunications and utilities Education Transportation Health care and insurance By the end of the book, you will have mastered the full range of powerful techniques for designing dimensional databases that are easy to understand and provide fast query response. You will also learn how to create an architected framework that integrates the distributed data warehouse using standardized dimensions and facts.

Snowflake Access Control

Snowflake Access Control
Author: Jessica Megan Larson
Publisher: Apress
Total Pages: 246
Release: 2022-03-03
Genre: Computers
ISBN: 9781484280379

Understand the different access control paradigms available in the Snowflake Data Cloud and learn how to implement access control in support of data privacy and compliance with regulations such as GDPR, APPI, CCPA, and SOX. The information in this book will help you and your organization adhere to privacy requirements that are important to consumers and becoming codified in the law. You will learn to protect your valuable data from those who should not see it while making it accessible to the analysts whom you trust to mine the data and create business value for your organization. Snowflake is increasingly the choice for companies looking to move to a data warehousing solution, and security is an increasing concern due to recent high-profile attacks. This book shows how to use Snowflake's wide range of features that support access control, making it easier to protect data access from the data origination point all the way to the presentation and visualization layer. Reading this book helps you embrace the benefits of securing data and provide valuable support for data analysis while also protecting the rights and privacy of the consumers and customers with whom you do business. What You Will Learn Identify data that is sensitive and should be restricted Implement access control in the Snowflake Data Cloud Choose the right access control paradigm for your organization Comply with CCPA, GDPR, SOX, APPI, and similar privacy regulations Take advantage of recognized best practices for role-based access control Prevent upstream and downstream services from subverting your access control Benefit from access control features unique to the Snowflake Data Cloud Who This Book Is For Data engineers, database administrators, and engineering managers who want to improve their access control model; those whose access control model is not meeting privacy and regulatory requirements; those new to Snowflake who want to benefit from access control features that are unique to the platform; technology leaders in organizations that have just gone public and are now required to conform to SOX reporting requirements

40 Algorithms Every Programmer Should Know

40 Algorithms Every Programmer Should Know
Author: Imran Ahmad
Publisher: Packt Publishing Ltd
Total Pages: 374
Release: 2020-06-12
Genre: Computers
ISBN: 178980986X

Learn algorithms for solving classic computer science problems with this concise guide covering everything from fundamental algorithms, such as sorting and searching, to modern algorithms used in machine learning and cryptography Key Features Learn the techniques you need to know to design algorithms for solving complex problems Become familiar with neural networks and deep learning techniques Explore different types of algorithms and choose the right data structures for their optimal implementation Book DescriptionAlgorithms have always played an important role in both the science and practice of computing. Beyond traditional computing, the ability to use algorithms to solve real-world problems is an important skill that any developer or programmer must have. This book will help you not only to develop the skills to select and use an algorithm to solve real-world problems but also to understand how it works. You’ll start with an introduction to algorithms and discover various algorithm design techniques, before exploring how to implement different types of algorithms, such as searching and sorting, with the help of practical examples. As you advance to a more complex set of algorithms, you'll learn about linear programming, page ranking, and graphs, and even work with machine learning algorithms, understanding the math and logic behind them. Further on, case studies such as weather prediction, tweet clustering, and movie recommendation engines will show you how to apply these algorithms optimally. Finally, you’ll become well versed in techniques that enable parallel processing, giving you the ability to use these algorithms for compute-intensive tasks. By the end of this book, you'll have become adept at solving real-world computational problems by using a wide range of algorithms.What you will learn Explore existing data structures and algorithms found in Python libraries Implement graph algorithms for fraud detection using network analysis Work with machine learning algorithms to cluster similar tweets and process Twitter data in real time Predict the weather using supervised learning algorithms Use neural networks for object detection Create a recommendation engine that suggests relevant movies to subscribers Implement foolproof security using symmetric and asymmetric encryption on Google Cloud Platform (GCP) Who this book is for This book is for programmers or developers who want to understand the use of algorithms for problem-solving and writing efficient code. Whether you are a beginner looking to learn the most commonly used algorithms in a clear and concise way or an experienced programmer looking to explore cutting-edge algorithms in data science, machine learning, and cryptography, you'll find this book useful. Although Python programming experience is a must, knowledge of data science will be helpful but not necessary.

Spark: The Definitive Guide

Spark: The Definitive Guide
Author: Bill Chambers
Publisher: "O'Reilly Media, Inc."
Total Pages: 594
Release: 2018-02-08
Genre: Computers
ISBN: 1491912294

Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. Youâ??ll explore the basic operations and common functions of Sparkâ??s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Sparkâ??s scalable machine-learning library. Get a gentle overview of big data and Spark Learn about DataFrames, SQL, and Datasetsâ??Sparkâ??s core APIsâ??through worked examples Dive into Sparkâ??s low-level APIs, RDDs, and execution of SQL and DataFrames Understand how Spark runs on a cluster Debug, monitor, and tune Spark clusters and applications Learn the power of Structured Streaming, Sparkâ??s stream-processing engine Learn how you can apply MLlib to a variety of problems, including classification or recommendation