Practical Machine Learning: A New Look at Anomaly Detection

Author: Ted Dunning
Publisher: "O'Reilly Media, Inc."
Total Pages: 65
Release: 2014-07-21
Genre: Computers
ISBN: 1491914181

Finding Data Anomalies You Didn't Know to Look For. Anomaly detection is the detective work of machine learning: finding the unusual, catching the fraud, discovering strange activity in large and complex datasets. But, unlike Sherlock Holmes, you may not know what the puzzle is, much less what “suspects” you’re looking for. This O’Reilly report uses practical examples to explain how the underlying concepts of anomaly detection work. From banking security to natural sciences, medicine, and marketing, anomaly detection has many useful applications in this age of big data. And the search for anomalies will intensify once the Internet of Things spawns even more new types of data. The concepts described in this report will help you tackle anomaly detection in your own project. Use probabilistic models to predict what’s normal and contrast that to what you observe. Set an adaptive threshold to determine which data falls outside of the normal range, using the t-digest algorithm. Establish normal fluctuations in complex systems and signals (such as an EKG) with a more adaptive probabilistic model. Use historical data to discover anomalies in sporadic event streams, such as web traffic. Learn how to use deviations in expected behavior to trigger fraud alerts.

Network Anomaly Detection

Author: Dhruba Kumar Bhattacharyya
Publisher: CRC Press
Total Pages: 364
Release: 2013-06-18
Genre: Computers
ISBN: 146658209X

With the rapid rise in the ubiquity and sophistication of Internet technology and the accompanying growth in the number of network attacks, network intrusion detection has become increasingly important. Anomaly-based network intrusion detection refers to finding exceptional or nonconforming patterns in network traffic data compared to normal behavior.

Practical Machine Learning for Computer Vision

Author: Valliappa Lakshmanan
Publisher: "O'Reilly Media, Inc."
Total Pages: 481
Release: 2021-07-21
Genre: Computers
ISBN: 1098102339

This practical book shows you how to employ machine learning models to extract information from images. ML engineers and data scientists will learn how to solve a variety of image problems including classification, object detection, autoencoders, image generation, counting, and captioning with proven ML techniques. This book provides a great introduction to end-to-end deep learning: dataset creation, data preprocessing, model design, model training, evaluation, deployment, and interpretability. Google engineers Valliappa Lakshmanan, Martin Görner, and Ryan Gillard show you how to develop accurate and explainable computer vision ML models and put them into large-scale production using robust ML architecture in a flexible and maintainable way. You'll learn how to design, train, evaluate, and predict with models written in TensorFlow or Keras. You'll learn how to: design ML architecture for computer vision tasks; select a model (such as ResNet, SqueezeNet, or EfficientNet) appropriate to your task; create an end-to-end ML pipeline to train, evaluate, deploy, and explain your model; preprocess images for data augmentation and to support learnability; incorporate explainability and responsible AI best practices; deploy image models as web services or on edge devices; and monitor and manage ML models.

Real-World Hadoop

Author: Ted Dunning
Publisher: "O'Reilly Media, Inc."
Total Pages: 104
Release: 2015-03-24
Genre: Computers
ISBN: 1491928921

If you’re a business team leader, CIO, business analyst, or developer interested in how Apache Hadoop and Apache HBase-related technologies can address problems involving large-scale data in cost-effective ways, this book is for you. Using real-world stories and situations, authors Ted Dunning and Ellen Friedman show Hadoop newcomers and seasoned users alike how NoSQL databases and Hadoop can solve a variety of business and research issues. You’ll learn about early decisions and pre-planning that can make the process easier and more productive. If you’re already using these technologies, you’ll discover ways to gain the full range of benefits possible with Hadoop. While you don’t need a deep technical background to get started, this book does provide expert guidance to help managers, architects, and practitioners succeed with their Hadoop projects. Examine a day in the life of big data: India’s ambitious Aadhaar project. Review tools in the Hadoop ecosystem such as Apache’s Spark, Storm, and Drill to learn how they can help you. Pick up a collection of technical and strategic tips that have helped others succeed with Hadoop. Learn from several prototypical Hadoop use cases, based on how organizations have actually applied the technology. Explore real-world stories that reveal how MapR customers combine use cases when putting Hadoop and NoSQL to work, including in production.

Intelligent Distributed Computing XI

Author: Mirjana Ivanović
Publisher: Springer
Total Pages: 319
Release: 2017-10-03
Genre: Technology & Engineering
ISBN: 3319663798

This book presents a collection of contributions addressing recent advances and research in synergistic combinations of topics in the joint fields of intelligent computing and distributed computing. It focuses on the following specific topics: distributed data mining and machine learning, reasoning and decision-making in distributed environments, distributed evolutionary algorithms, trust and reputation models for distributed systems, scheduling and resource allocation in distributed systems, intelligent multi-agent systems, advanced agent-based and service-based architectures, and Smart Cloud and Internet of Things (IoT) environments. The book represents the combined peer-reviewed proceedings of the 11th International Symposium on Intelligent Distributed Computing (IDC 2017) and the 7th International Workshop on Applications of Software Agents (WASA 2017), both of which were held in Belgrade, Serbia from October 11 to 13, 2017.

Data Analytics

Author: Juan J. Cuadrado-Gallego
Publisher: Springer Nature
Total Pages: 486
Release: 2023-11-30
Genre: Computers
ISBN: 3031391292

Building upon the knowledge introduced in The Data Science Framework, this book provides a comprehensive and detailed examination of each aspect of Data Analytics, both from a theoretical and practical standpoint. The book explains representative algorithms associated with different techniques, from their theoretical foundations to their implementation and use with software tools. Designed as a textbook for a Data Analytics Fundamentals course, it is divided into seven chapters to correspond with 16 weeks of lessons, including both theoretical and practical exercises. Each chapter is dedicated to a lesson, allowing readers to dive deep into each topic with detailed explanations and examples. Readers will learn the theoretical concepts and then immediately apply them to practical exercises to reinforce their knowledge. And in the lab sessions, readers will learn the ins and outs of the R environment and data science methodology to solve exercises with the R language. With detailed solutions provided for all examples and exercises, readers can use this book to study and master data analytics on their own. Whether you're a student, professional, or simply curious about data analytics, this book is a must-have for anyone looking to expand their knowledge in this exciting field.

Data Science For Cyber-security

Author: Nicholas A Heard
Publisher: World Scientific
Total Pages: 305
Release: 2018-09-26
Genre: Computers
ISBN: 178634565X

Cyber-security is a matter of rapidly growing importance in industry and government. This book provides insight into a range of data science techniques for addressing these pressing concerns. The application of statistical and broader data science techniques provides an exciting growth area in the design of cyber defences. Networks of connected devices, such as enterprise computer networks or the wider so-called Internet of Things, are all vulnerable to misuse and attack, and data science methods offer the promise to detect such behaviours from the vast collections of cyber traffic data sources that can be obtained. In many cases, this is achieved through anomaly detection of unusual behaviour against understood statistical models of normality. This volume presents contributed papers from an international conference of the same name held at Imperial College. Experts from the field have provided their latest discoveries and review state-of-the-art technologies.

Streaming Architecture

Author: Ted Dunning
Publisher: "O'Reilly Media, Inc."
Total Pages: 116
Release: 2016-05-10
Genre: Computers
ISBN: 1491953888

More and more data-driven companies are looking to adopt stream processing and streaming analytics. With this concise ebook, you’ll learn best practices for designing a reliable architecture that supports this emerging big-data paradigm. Authors Ted Dunning and Ellen Friedman (Real World Hadoop) help you explore some of the best technologies to handle stream processing and analytics, with a focus on the upstream queuing or message-passing layer. To illustrate the effectiveness of these technologies, this book also includes specific use cases. Ideal for developers and non-technical people alike, this book describes: key elements in good design for streaming analytics, focusing on the essential characteristics of the messaging layer; new messaging technologies, including Apache Kafka and MapR Streams, with links to sample code; technology choices for streaming analytics (Apache Spark Streaming, Apache Flink, Apache Storm, and Apache Apex); how stream-based architectures are helpful to support microservices; and specific use cases such as fraud detection and geo-distributed data streams. Ted Dunning is Chief Applications Architect at MapR Technologies, and active in the open source community. He currently serves as VP for Incubator at the Apache Foundation, as a champion and mentor for a large number of projects, and as committer and PMC member of the Apache ZooKeeper and Drill projects. Ted is on Twitter as @ted_dunning. Ellen Friedman, a committer for the Apache Drill and Apache Mahout projects, is a solutions consultant and well-known speaker and author, currently writing mainly about big data topics. With a PhD in Biochemistry, she has years of experience as a research scientist and has written about a variety of technical topics. Ellen is on Twitter as @Ellen_Friedman.

Sharing Big Data Safely

Author: Ted Dunning
Publisher: "O'Reilly Media, Inc."
Total Pages: 95
Release: 2015-09-15
Genre: Computers
ISBN: 1491953632

Many big data-driven companies today are moving to protect certain types of data against intrusion, leaks, or unauthorized eyes. But how do you lock down data while granting access to people who need to see it? In this practical book, authors Ted Dunning and Ellen Friedman offer two novel and practical solutions that you can implement right away. Ideal for both technical and non-technical decision makers, group leaders, developers, and data scientists, this book shows you how to: Share original data in a controlled way so that different groups within your organization only see part of the whole. You’ll learn how to do this with the new open source SQL query engine Apache Drill. Provide synthetic data that emulates the behavior of sensitive data. This approach enables external advisors to work with you on projects involving data that you can't show them. If you’re intrigued by the synthetic data solution, explore the log-synth program that Ted Dunning developed as open source code (available on GitHub), along with how-to instructions and tips for best practice. You’ll also get a collection of use cases. Providing lock-down security while safely sharing data is a significant challenge for a growing number of organizations. With this book, you’ll discover new options to share data safely without sacrificing security.

ECML PKDD 2020 Workshops

Author: Irena Koprinska
Publisher: Springer Nature
Total Pages: 619
Release: 2021-02-01
Genre: Computers
ISBN: 3030659658

This volume constitutes the refereed proceedings of the workshops which complemented the 20th Joint European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD, held in September 2020. Due to the COVID-19 pandemic, the conference and workshops were held online. The 43 papers presented in this volume were carefully reviewed and selected from numerous submissions. The volume presents the papers that have been accepted for the following workshops: 5th Workshop on Data Science for Social Good, SoGood 2020; Workshop on Parallel, Distributed and Federated Learning, PDFL 2020; Second Workshop on Machine Learning for Cybersecurity, MLCS 2020; 9th International Workshop on New Frontiers in Mining Complex Patterns, NFMCP 2020; Workshop on Data Integration and Applications, DINA 2020; Second Workshop on Evaluation and Experimental Design in Data Mining and Machine Learning, EDML 2020; Second International Workshop on eXplainable Knowledge Discovery in Data Mining, XKDD 2020; 8th International Workshop on News Recommendation and Analytics, INRA 2020. The papers from INRA 2020 are published open access and licensed under the terms of the Creative Commons Attribution 4.0 International License.