R for Data Science

Author: Hadley Wickham
Publisher: "O'Reilly Media, Inc."
Total Pages: 521
Release: 2016-12-12
Genre: Computers
ISBN: 1491910364

Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you've learned along the way. You'll learn how to:
- Wrangle: transform your datasets into a form convenient for analysis
- Program: learn powerful R tools for solving data problems with greater clarity and ease
- Explore: examine your data, generate hypotheses, and quickly test them
- Model: provide a low-dimensional summary that captures true "signals" in your dataset
- Communicate: learn R Markdown for integrating prose, code, and results

Big Data

Author: Min Chen
Publisher: Springer
Total Pages: 100
Release: 2014-05-05
Genre: Computers
ISBN: 331906245X

This Springer Brief provides a comprehensive overview of the background and recent developments of big data. The value chain of big data is divided into four phases: data generation, data acquisition, data storage and data analysis. For each phase, the book introduces the general background, discusses technical challenges and reviews the latest advances. Technologies under discussion include cloud computing, Internet of Things, data centers, Hadoop and more. The authors also explore several representative applications of big data such as enterprise management, online social networks, healthcare and medical applications, collective intelligence and smart grids. This book concludes with a thoughtful discussion of possible research directions and development trends in the field. Big Data: Related Technologies, Challenges and Future Prospects is a concise yet thorough examination of this exciting area. It is designed for researchers and professionals interested in big data or related research. Advanced-level students in computer science and electrical engineering will also find this book useful.

Data Mining: Concepts and Techniques

Author: Jiawei Han
Publisher: Elsevier
Total Pages: 740
Release: 2011-06-09
Genre: Computers
ISBN: 0123814804

Data Mining: Concepts and Techniques presents the concepts and techniques for processing gathered data or information for use in various applications. Specifically, it explains data mining and the tools used in discovering knowledge from the collected data. Data mining is also referred to as knowledge discovery from data (KDD). The book focuses on the feasibility, usefulness, effectiveness, and scalability of techniques for large data sets. After describing data mining, this edition explains the methods for understanding, preprocessing, processing, and warehousing data. It then presents information about data warehouses, online analytical processing (OLAP), and data cube technology. Next, it describes the methods involved in mining frequent patterns, associations, and correlations in large data sets. The book details the methods for data classification and introduces the concepts and methods for data clustering. The remaining chapters discuss outlier detection and the trends, applications, and research frontiers in data mining. This book is intended for computer science students, application developers, business professionals, and researchers who seek information on data mining.
- Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects
- Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields
- Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of your data

Enterprise Information Systems: Concepts, Methodologies, Tools and Applications

Author: Management Association, Information Resources
Publisher: IGI Global
Total Pages: 2042
Release: 2010-09-30
Genre: Computers
ISBN: 1616928530

This three-volume collection, titled Enterprise Information Systems: Concepts, Methodologies, Tools and Applications, provides a complete assessment of the latest developments in enterprise information systems research, including development, design, and emerging methodologies. Experts in the field cover all aspects of enterprise resource planning (ERP), e-commerce, and organizational, social and technological implications of enterprise information systems.

Process Mining

Author: Wil M. P. van der Aalst
Publisher: Springer
Total Pages: 477
Release: 2016-04-15
Genre: Computers
ISBN: 3662498510

This is the second edition of Wil van der Aalst’s seminal book on process mining, which now also discusses the field in the broader context of data science and big data approaches. It includes several additions and updates, e.g. on inductive mining techniques, the notion of alignments, a considerably expanded section on software tools, and a completely new chapter on process mining in the large. It is self-contained, while at the same time covering the entire process-mining spectrum from process discovery to predictive analytics. After a general introduction to data science and process mining in Part I, Part II provides the basics of business process modeling and data mining necessary to understand the remainder of the book. Next, Part III focuses on process discovery as the most important process mining task, while Part IV moves beyond discovering the control flow of processes, highlighting conformance checking and the organizational and time perspectives. Part V offers a guide to successfully applying process mining in practice, including an introduction to the widely used open-source tool ProM and several commercial products. Lastly, Part VI takes a step back, reflecting on the material presented and the key open challenges. Overall, this book provides a comprehensive overview of the state of the art in process mining. It is intended for business process analysts, business consultants, process managers, graduate students, and BPM researchers.

Principles of Database Management

Author: Wilfried Lemahieu
Publisher: Cambridge University Press
Total Pages: 817
Release: 2018-07-12
Genre: Computers
ISBN: 1107186129

An introductory text that balances theory and practice, teaching the fundamentals of databases to advanced undergraduates or graduate students in information systems or computer science.

Big-Data Analytics for Cloud, IoT and Cognitive Computing

Author: Kai Hwang
Publisher: John Wiley & Sons
Total Pages: 432
Release: 2017-03-17
Genre: Computers
ISBN: 1119247292

The definitive guide to successfully integrating social, mobile, Big-Data analytics, cloud and IoT principles and technologies.
The main goal of this book is to spur the development of effective big-data computing operations on smart clouds that are fully supported by IoT sensing, machine learning and analytics systems. To that end, the authors draw upon their original research and proven track record in the field to describe a practical approach integrating big-data theories, cloud design principles, Internet of Things (IoT) sensing, machine learning, data analytics and Hadoop and Spark programming. Part 1 focuses on data science, the roles of clouds and IoT devices and frameworks for big-data computing. Big data analytics and cognitive machine learning, as well as cloud architecture, IoT and cognitive systems are explored, and mobile cloud-IoT-interaction frameworks are illustrated with concrete system design examples. Part 2 is devoted to the principles of and algorithms for machine learning, data analytics and deep learning in big data applications. Part 3 concentrates on cloud programming software libraries from MapReduce to Hadoop, Spark and TensorFlow and describes business, educational, healthcare and social media applications for those tools.
- The first book describing a practical approach to integrating social, mobile, analytics, cloud and IoT (SMACT) principles and technologies
- Covers theory and computing techniques and technologies, making it suitable for use in both computer science and electrical engineering programs
- Offers an extremely well-informed vision of future intelligent and cognitive computing environments integrating SMACT technologies
- Fully illustrated throughout with examples, figures and approximately 150 problems to support and reinforce learning
- Features a companion website with an instructor manual and PowerPoint slides: www.wiley.com/go/hwangIOT
Big-Data Analytics for Cloud, IoT and Cognitive Computing satisfies the demand among university faculty and students for cutting-edge information on emerging intelligent and cognitive computing systems and technologies. Professionals working in data science, cloud computing and IoT applications will also find this book to be an extremely useful working resource.

Corporate Data Quality

Author: Boris Otto
Publisher: epubli
Total Pages: 168
Release: 2015-12-08
Genre: Business & Economics
ISBN: 3737575932

Data is the foundation of the digital economy. Industry 4.0 and digital services are producing previously unseen quantities of data and making new business models possible. Under these circumstances, data quality has become the critical factor for success. This book presents a holistic approach to data quality management, along with ten case studies on the topic. It is intended for practitioners dealing with data quality management and data governance as well as for scientists. The book was written at the Competence Center Corporate Data Quality (CC CDQ) in close cooperation between researchers from the University of St. Gallen and Fraunhofer IML as well as many representatives from more than 20 major corporations. Chapter 1 introduces the role of data in the digitization of business and society and describes the most important business drivers for data quality. It presents the Framework for Corporate Data Quality Management and introduces essential terms and concepts. Chapter 2 presents practical, successful examples of master data quality management based on ten case studies conducted by the CC CDQ. The case studies cover every aspect of the Framework for Corporate Data Quality Management. Chapter 3 describes selected tools for master data quality management. The three tools are distinguished by their broad applicability (methods for DQM strategy development and DQM maturity assessment) and their high level of innovation (Corporate Data League). Chapter 4 summarizes the essential factors for the successful management of master data quality and provides a checklist of measures to take immediately after the start of a data quality management project. This checklist offers a quick start into the topic and provides initial recommendations for actions to be taken by project and line managers. Please also check out the book's homepage at cdq-book.org/

Data Quality

Author: Jack E. Olson
Publisher: Elsevier
Total Pages: 313
Release: 2003-01-09
Genre: Computers
ISBN: 0080503691

Data Quality: The Accuracy Dimension is about assessing the quality of corporate data and improving its accuracy using the data profiling method. Corporate data is increasingly important as companies continue to find new ways to use it. Likewise, improving the accuracy of data in information systems is fast becoming a major goal as companies realize how much it affects their bottom line. Data profiling is a new technology that supports and enhances the accuracy of databases throughout major IT shops. Jack Olson explains data profiling and shows how it fits into the larger picture of data quality.
* Provides an accessible, enjoyable introduction to the subject of data accuracy, peppered with real-world anecdotes.
* Provides a framework for data profiling with a discussion of analytical tools appropriate for assessing data accuracy.
* Is written by one of the original developers of data profiling technology.
* Is a must-read for any data management staff, IT management staff, and CIOs of companies with data assets.