Data Processing with Optimus

Author: Dr. Argenis Leon
Publisher: Packt Publishing Ltd
Total Pages: 301
Release: 2021-09-03
Genre: Computers
ISBN: 1801077754

Written by the core Optimus team, this comprehensive guide will help you to understand how Optimus improves the whole data processing landscape.
Key Features: Load, merge, and save small and big data efficiently with Optimus; learn Optimus functions for data analytics, feature engineering, machine learning, cross-validation, and NLP; discover how Optimus improves other data frame technologies and helps you speed up your data processing tasks.
Book Description: Optimus is a Python library that works as a unified API for data cleaning, processing, and merging data. It can be used for handling small and big data on your local laptop or on remote clusters using CPUs or GPUs. The book begins by covering the internals of Optimus and how it works in tandem with the existing technologies to serve your data processing needs. You'll then learn how to use Optimus for loading and saving data from text data formats such as CSV and JSON files, exploring binary files such as Excel, and for columnar data processing with Parquet, Avro, and OCR. Next, you'll get to grips with the profiler and its data types - a unique feature of Optimus Dataframe that assists with data quality. You'll see how to use the plots available in Optimus such as histogram, frequency charts, and scatter and box plots, and understand how Optimus lets you connect to libraries such as Plotly and Altair. You'll also delve into advanced applications such as feature engineering, machine learning, cross-validation, and natural language processing functions and explore the advancements in Optimus. Finally, you'll learn how to create data cleaning and transformation functions and add a hypothetical new data processing engine with Optimus. By the end of this book, you'll be able to improve your data science workflow with Optimus easily.
What you will learn: Use over 100 data processing functions over columns and other string-like values; reshape and pivot data to get the output in the required format; find out how to plot histograms, frequency charts, scatter plots, box plots, and more; connect Optimus with popular Python visualization libraries such as Plotly and Altair; apply string clustering techniques to normalize strings; discover functions to explore, fix, and remove poor quality data; use advanced techniques to remove outliers from your data; add engines and custom functions to clean, process, and merge data.
Who this book is for: This book is for Python developers who want to explore, transform, and prepare big data for machine learning, analytics, and reporting using Optimus, a unified API to work with Pandas, Dask, cuDF, Dask-cuDF, Vaex, and Spark. Although not necessary, beginner-level knowledge of Python will be helpful. Basic knowledge of the CLI is required to install Optimus and its requirements. For using GPU technologies, you'll need an NVIDIA graphics card compatible with NVIDIA's RAPIDS library, which is compatible with Windows 10 and Linux.

Data Processing on FPGAs

Author: Jens Teubner
Publisher: Springer Nature
Total Pages: 104
Release: 2022-05-31
Genre: Computers
ISBN: 3031018494

Roughly a decade ago, power consumption and heat dissipation concerns forced the semiconductor industry to radically change its course, shifting from sequential to parallel computing. Unfortunately, improving performance of applications has now become much more difficult than in the good old days of frequency scaling. This is also affecting databases and data processing applications in general, and has led to the popularity of so-called data appliances—specialized data processing engines, where software and hardware are sold together in a closed box. Field-programmable gate arrays (FPGAs) increasingly play an important role in such systems. FPGAs are attractive because the performance gains of specialized hardware can be significant, while power consumption is much less than that of commodity processors. On the other hand, FPGAs are way more flexible than hard-wired circuits (ASICs) and can be integrated into complex systems in many different ways, e.g., directly in the network for a high-frequency trading application. This book gives an introduction to FPGA technology targeted at a database audience. In the first few chapters, we explain in detail the inner workings of FPGAs. Then we discuss techniques and design patterns that help mapping algorithms to FPGA hardware so that the inherent parallelism of these devices can be leveraged in an optimal way. Finally, the book will illustrate a number of concrete examples that exploit different advantages of FPGAs for data processing. Table of Contents: Preface / Introduction / A Primer in Hardware Design / FPGAs / FPGA Programming Models / Data Stream Processing / Accelerated DB Operators / Secure Data Processing / Conclusions / Bibliography / Authors' Biographies / Index

Computerworld

Author:
Publisher:
Total Pages: 48
Release: 1976-03-22
Genre:
ISBN:

For more than 40 years, Computerworld has been the leading source of technology news and information for IT influencers worldwide. Computerworld's award-winning Web site (Computerworld.com), twice-monthly publication, focused conference series and custom research form the hub of the world's largest global IT media network.

Major Companies of Europe 1991-1992 Vol. 1 : Major Companies of the Continental European Community

Author: R. M. Whiteside
Publisher: Springer Science & Business Media
Total Pages: 1043
Release: 2012-12-06
Genre: Business & Economics
ISBN: 9401130167

Volumes 1 & 2: MAJOR COMPANIES OF EUROPE 1991/92, Volume 1, contains useful information on over 4000 of the top companies in the European Community, excluding the UK, over 1100 companies of which are covered in Volume 2. Volume 3 covers over 1300 of the top companies within Western Europe but outside the European Community. Altogether the three volumes of MAJOR COMPANIES OF EUROPE now provide in authoritative detail, vital information on over 6500 of the largest companies in Western Europe. MAJOR COMPANIES OF EUROPE 1991/92, Volumes 1 & 2 contain many of the largest companies in the world. The area covered by these volumes, the European Community, represents a rich consumer market of over 320 million people. Over one third of the world's imports and exports are channelled through the EC. The Community represents the world's largest integrated market.
Guide to the arrangement of the book: This book has been arranged in order to allow the reader to find any entry rapidly and accurately. Company entries are listed alphabetically within each country section; in addition three indexes are provided in Volumes 1 and 3 on coloured paper at the back of the books, and two indexes in the case of Volume 2. The alphabetical index to companies throughout the Continental EC lists all companies having entries in Volume 1 in alphabetical order irrespective of their main country of operation. The alphabetical index in Volume 1 to companies within each

NASA SP-7500

Author: United States. National Aeronautics and Space Administration
Publisher:
Total Pages: 140
Release: 1972
Genre:
ISBN:

Design and Synthesis

Author: Hiroyuki Yoshikawa
Publisher: North-Holland
Total Pages: 796
Release: 1985
Genre: Technology & Engineering
ISBN:

Intelligent Information and Database Systems

Author: Jeng-Shyang Pan
Publisher: Springer
Total Pages: 593
Release: 2012-03-14
Genre: Computers
ISBN: 3642284906

The three-volume set LNAI 7196, LNAI 7197 and LNAI 7198 constitutes the refereed proceedings of the 4th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2012, held in Kaohsiung, Taiwan in March 2012. The 161 revised papers presented were carefully reviewed and selected from more than 472 submissions. The papers included cover the following topics: intelligent database systems, data warehouses and data mining, natural language processing and computational linguistics, semantic Web, social networks and recommendation systems, collaborative systems and applications, e-business and e-commerce systems, e-learning systems, information modeling and requirements engineering, information retrieval systems, intelligent agents and multi-agent systems, intelligent information systems, intelligent internet systems, intelligent optimization techniques, object-relational DBMS, ontologies and knowledge sharing, semi-structured and XML database systems, unified modeling language and unified processes, Web services and semantic Web, computer networks and communication systems.