Data Processing with Optimus

Author: Dr. Argenis Leon
Publisher: Packt Publishing Ltd
Total Pages: 301
Release: 2021-09-03
Genre: Computers
ISBN: 1801077754

Written by the core Optimus team, this comprehensive guide will help you to understand how Optimus improves the whole data processing landscape.

Key Features:
Load, merge, and save small and big data efficiently with Optimus
Learn Optimus functions for data analytics, feature engineering, machine learning, cross-validation, and NLP
Discover how Optimus improves other data frame technologies and helps you speed up your data processing tasks

Book Description:
Optimus is a Python library that works as a unified API for cleaning, processing, and merging data. It can be used for handling small and big data on your local laptop or on remote clusters using CPUs or GPUs. The book begins by covering the internals of Optimus and how it works in tandem with the existing technologies to serve your data processing needs. You'll then learn how to use Optimus for loading and saving data from text data formats such as CSV and JSON files, exploring binary files such as Excel, and for columnar data processing with Parquet, Avro, and ORC. Next, you'll get to grips with the profiler and its data types, a unique feature of the Optimus Dataframe that assists with data quality. You'll see how to use the plots available in Optimus such as histograms, frequency charts, and scatter and box plots, and understand how Optimus lets you connect to libraries such as Plotly and Altair. You'll also delve into advanced applications such as feature engineering, machine learning, cross-validation, and natural language processing functions and explore the advancements in Optimus. Finally, you'll learn how to create data cleaning and transformation functions and add a hypothetical new data processing engine with Optimus. By the end of this book, you'll be able to improve your data science workflow with Optimus easily.
What you will learn:
Use over 100 data processing functions over columns and other string-like values
Reshape and pivot data to get the output in the required format
Find out how to plot histograms, frequency charts, scatter plots, box plots, and more
Connect Optimus with popular Python visualization libraries such as Plotly and Altair
Apply string clustering techniques to normalize strings
Discover functions to explore, fix, and remove poor quality data
Use advanced techniques to remove outliers from your data
Add engines and custom functions to clean, process, and merge data

Who this book is for:
This book is for Python developers who want to explore, transform, and prepare big data for machine learning, analytics, and reporting using Optimus, a unified API to work with Pandas, Dask, cuDF, Dask-cuDF, Vaex, and Spark. Although not necessary, beginner-level knowledge of Python will be helpful. Basic knowledge of the CLI is required to install Optimus and its requirements. For using GPU technologies, you'll need an NVIDIA graphics card compatible with NVIDIA's RAPIDS library, which is compatible with Windows 10 and Linux.

Data Processing on FPGAs

Author: Jens Teubner
Publisher: Springer Nature
Total Pages: 104
Release: 2022-05-31
Genre: Computers
ISBN: 3031018494

Roughly a decade ago, power consumption and heat dissipation concerns forced the semiconductor industry to radically change its course, shifting from sequential to parallel computing. Unfortunately, improving the performance of applications has now become much more difficult than in the good old days of frequency scaling. This is also affecting databases and data processing applications in general, and has led to the popularity of so-called data appliances: specialized data processing engines, where software and hardware are sold together in a closed box. Field-programmable gate arrays (FPGAs) increasingly play an important role in such systems. FPGAs are attractive because the performance gains of specialized hardware can be significant, while power consumption is much less than that of commodity processors. On the other hand, FPGAs are far more flexible than hard-wired circuits (ASICs) and can be integrated into complex systems in many different ways, e.g., directly in the network for a high-frequency trading application. This book gives an introduction to FPGA technology targeted at a database audience. In the first few chapters, we explain in detail the inner workings of FPGAs. Then we discuss techniques and design patterns that help map algorithms to FPGA hardware so that the inherent parallelism of these devices can be leveraged in an optimal way. Finally, the book illustrates a number of concrete examples that exploit different advantages of FPGAs for data processing.

Table of Contents: Preface / Introduction / A Primer in Hardware Design / FPGAs / FPGA Programming Models / Data Stream Processing / Accelerated DB Operators / Secure Data Processing / Conclusions / Bibliography / Authors' Biographies / Index

NASA SP-7500

Author: United States. National Aeronautics and Space Administration
Publisher:
Total Pages: 140
Release: 1972
Genre:
ISBN:

Computerworld

Author:
Publisher:
Total Pages: 48
Release: 1976-03-22
Genre:
ISBN:

For more than 40 years, Computerworld has been the leading source of technology news and information for IT influencers worldwide. Computerworld's award-winning Web site (Computerworld.com), twice-monthly publication, focused conference series and custom research form the hub of the world's largest global IT media network.

Software Architecture for Big Data and the Cloud

Author: Ivan Mistrik
Publisher: Morgan Kaufmann
Total Pages: 472
Release: 2017-06-12
Genre: Computers
ISBN: 0128093382

Software Architecture for Big Data and the Cloud is designed to be a single resource that brings together research on how software architectures can solve the challenges imposed by building big data software systems. The challenges of big data on the software architecture can relate to scale, security, integrity, performance, concurrency, parallelism, and dependability, amongst others. Big data handling requires rethinking architectural solutions to meet functional and non-functional requirements related to volume, variety and velocity. The book's editors have varied and complementary backgrounds in requirements and architecture, specifically in software architectures for cloud and big data, as well as expertise in software engineering for cloud and big data. This book brings together work across different disciplines in software engineering, including work expanded from conference tracks and workshops led by the editors.

Discusses systematic and disciplined approaches to building software architectures for cloud and big data with state-of-the-art methods and techniques
Presents case studies involving enterprise, business, and government service deployment of big data applications
Shares guidance on theory, frameworks, methodologies, and architecture for cloud and big data

Major Companies of Europe 1991-1992 Vol. 1 : Major Companies of the Continental European Community

Author: R. M. Whiteside
Publisher: Springer Science & Business Media
Total Pages: 1043
Release: 2012-12-06
Genre: Business & Economics
ISBN: 9401130167

Volumes 1 & 2
MAJOR COMPANIES OF EUROPE 1991/92, Volume 1, contains useful information on over 4000 of the top companies in the European Community, excluding the UK, over 1100 companies of which are covered in Volume 2. Volume 3 covers over 1300 of the top companies within Western Europe but outside the European Community. Altogether the three volumes of MAJOR COMPANIES OF EUROPE now provide in authoritative detail, vital information on over 6500 of the largest companies in Western Europe. MAJOR COMPANIES OF EUROPE 1991/92, Volumes 1 & 2 contain many of the largest companies in the world. The area covered by these volumes, the European Community, represents a rich consumer market of over 320 million people. Over one third of the world's imports and exports are channelled through the EC. The Community represents the world's largest integrated market.

Guide to the arrangement of the book
This book has been arranged in order to allow the reader to find any entry rapidly and accurately. Company entries are listed alphabetically within each country section; in addition three indexes are provided in Volumes 1 and 3 on coloured paper at the back of the books, and two indexes in the case of Volume 2. The alphabetical index to companies throughout the Continental EC lists all companies having entries in Volume 1 in alphabetical order irrespective of their main country of operation. The alphabetical index in Volume 1 to companies within each

Design and Synthesis

Author: Hiroyuki Yoshikawa
Publisher: North-Holland
Total Pages: 796
Release: 1985
Genre: Technology & Engineering
ISBN:

Data Management, Analytics and Innovation

Author: Neha Sharma
Publisher: Springer Nature
Total Pages: 476
Release: 2020-08-18
Genre: Technology & Engineering
ISBN: 9811556164

This book presents the latest findings in the areas of data management and smart computing, big data management, artificial intelligence and data analytics, along with advances in network technologies. Gathering peer-reviewed research papers presented at the Fourth International Conference on Data Management, Analytics and Innovation (ICDMAI 2020), held on 17–19 January 2020 at the United Services Institute (USI), New Delhi, India, it addresses cutting-edge topics and discusses challenges and solutions for future development. Featuring original, unpublished contributions by respected experts from around the globe, the book is mainly intended for a professional audience of researchers and practitioners in academia and industry.