Data Science for Librarians

Author: Yunfei Du
Publisher: Bloomsbury Publishing USA
Total Pages: 181
Release: 2020-03-26
Genre: Language Arts & Disciplines
ISBN: 1440871221

This unique textbook intersects traditional library science with data science principles that readers will find useful in implementing or improving data services within their libraries. Data Science for Librarians introduces data science to students and practitioners in library services. Writing for academic, public, and school library managers; library science students; and library and information science educators, authors Yunfei Du and Hammad Rauf Khan provide a thorough overview of conceptual and practical tools for data librarian practice. Partially due to how quickly data science evolves, libraries have yet to recognize core competencies and skills required to perform the job duties of a data librarian. As society transitions from the information age into the era of big data, librarians and information professionals require new knowledge and skills to stay current and take on new job roles, such as data librarianship. Such skills as data curation, research data management, statistical analysis, business analytics, visualization, smart city data, and learning analytics are relevant in library services today and will become increasingly so in the near future. This text serves as a tool for library and information science students and educators working on data science curriculum design.

Data Science for Librarians

Author: Yunfei Du
Publisher: Libraries Unlimited
Total Pages: 0
Release: 2020-03-26
Genre: Computers
ISBN: 1440871213

More data, more problems -- A new strand of librarianship -- Data creation and collection -- Data for the academic librarian -- Research data services and the library ecosystem -- Data sources -- Data curation (archiving/preservation) -- Data storage, management, and retrieval -- Data analysis and visualization -- Data ethics and policies -- Data for public libraries and special libraries -- Conclusion: library, information, and data science.

Data Science Concepts and Techniques with Applications

Author: Usman Qamar
Publisher: Springer Nature
Total Pages: 492
Release: 2023-04-02
Genre: Computers
ISBN: 3031174429

This textbook comprehensively covers both fundamental and advanced topics related to data science. Data science is an umbrella term that encompasses data analytics, data mining, machine learning, and several other related disciplines. The chapters of this book are organized into three parts. The first part (chapters 1 to 3) is a general introduction to data science. Starting from the basic concepts, the book highlights the types of data, their use and importance, and the issues normally faced in data analytics, followed by a presentation of a wide range of applications and widely used techniques in data science. The second part, which has been updated and considerably extended compared to the first edition, is devoted to various techniques and tools applied in data science. Its chapters 4 to 10 detail data pre-processing, classification, clustering, text mining, deep learning, frequent pattern mining, and regression analysis. Finally, the third part (chapters 11 and 12) presents a brief introduction to Python and R, the two main data science programming languages, and shows in a completely new chapter practical data science in WEKA (Waikato Environment for Knowledge Analysis), an open-source tool for performing different machine learning and data mining tasks. An appendix explaining the basic mathematical concepts of data science completes the book. This textbook is suitable for advanced undergraduate and graduate students as well as for industrial practitioners who carry out research in data science. Both groups will benefit not only from the comprehensive presentation of important topics, but also from the many application examples and the comprehensive list of further readings, which point to additional publications that provide more in-depth research results or more detailed descriptions of related topics.
"This book delivers a systematic, carefully thoughtful material on Data Science." from the Foreword by Witold Pedrycz, U Alberta, Canada.

Data Science Bookcamp

Author: Leonard Apeltsin
Publisher: Simon and Schuster
Total Pages: 702
Release: 2021-12-07
Genre: Computers
ISBN: 1638352305

Learn data science with Python by building five real-world projects! Experiment with card game predictions, tracking disease outbreaks, and more, as you build a flexible and intuitive understanding of data science.
In Data Science Bookcamp you will learn:
- Techniques for computing and plotting probabilities
- Statistical analysis using SciPy
- How to organize datasets with clustering algorithms
- How to visualize complex multi-variable datasets
- How to train a decision tree machine learning algorithm
In Data Science Bookcamp you’ll test and build your knowledge of Python with the kind of open-ended problems that professional data scientists work on every day. Downloadable data sets and thoroughly-explained solutions help you lock in what you’ve learned, building your confidence and making you ready for an exciting new data science career. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology
A data science project has a lot of moving parts, and it takes practice and skill to get all the code, algorithms, datasets, formats, and visualizations working together harmoniously. This unique book guides you through five realistic projects, including tracking disease outbreaks from news headlines, analyzing social networks, and finding relevant patterns in ad click data.
About the book
Data Science Bookcamp doesn’t stop with surface-level theory and toy examples. As you work through each project, you’ll learn how to troubleshoot common problems like missing data, messy data, and algorithms that don’t quite fit the model you’re building. You’ll appreciate the detailed setup instructions and the fully explained solutions that highlight common failure points. In the end, you’ll be confident in your skills because you can see the results.
What's inside
- Web scraping
- Organize datasets with clustering algorithms
- Visualize complex multi-variable datasets
- Train a decision tree machine learning algorithm
About the reader
For readers who know the basics of Python. No prior data science or machine learning skills required.
About the author
Leonard Apeltsin is the Head of Data Science at Anomaly, where his team applies advanced analytics to uncover healthcare fraud, waste, and abuse.
Table of Contents
CASE STUDY 1 FINDING THE WINNING STRATEGY IN A CARD GAME
1 Computing probabilities using Python
2 Plotting probabilities using Matplotlib
3 Running random simulations in NumPy
4 Case study 1 solution
CASE STUDY 2 ASSESSING ONLINE AD CLICKS FOR SIGNIFICANCE
5 Basic probability and statistical analysis using SciPy
6 Making predictions using the central limit theorem and SciPy
7 Statistical hypothesis testing
8 Analyzing tables using Pandas
9 Case study 2 solution
CASE STUDY 3 TRACKING DISEASE OUTBREAKS USING NEWS HEADLINES
10 Clustering data into groups
11 Geographic location visualization and analysis
12 Case study 3 solution
CASE STUDY 4 USING ONLINE JOB POSTINGS TO IMPROVE YOUR DATA SCIENCE RESUME
13 Measuring text similarities
14 Dimension reduction of matrix data
15 NLP analysis of large text datasets
16 Extracting text from web pages
17 Case study 4 solution
CASE STUDY 5 PREDICTING FUTURE FRIENDSHIPS FROM SOCIAL NETWORK DATA
18 An introduction to graph theory and network analysis
19 Dynamic graph theory techniques for node ranking and social network analysis
20 Network-driven supervised machine learning
21 Training linear classifiers with logistic regression
22 Training nonlinear classifiers with decision tree techniques
23 Case study 5 solution

Data Science

Author: Field Cady
Publisher: John Wiley & Sons
Total Pages: 208
Release: 2020-12-30
Genre: Business & Economics
ISBN: 1119544084

Tap into the power of data science with this comprehensive resource for non-technical professionals
Data Science: The Executive Summary – A Technical Book for Non-Technical Professionals is a comprehensive resource for people in non-engineer roles who want to fully understand data science and analytics concepts. Accomplished data scientist and author Field Cady describes both the “business side” of data science, including what problems it solves and how it fits into an organization, and the technical side, including analytical techniques and key technologies.
Data Science: The Executive Summary covers topics like:
- Assessing whether your organization needs data scientists, and what to look for when hiring them
- When Big Data is the best approach to use for a project, and when it actually ties analysts’ hands
- Cutting edge Artificial Intelligence, as well as classical approaches that work better for many problems
- How many techniques rely on dubious mathematical idealizations, and when you can work around them
Perfect for executives who make critical decisions based on data science and analytics, as well as managers who hire and assess the work of data scientists, Data Science: The Executive Summary also belongs on the bookshelves of salespeople and marketers who need to explain what a data analytics product does. Finally, data scientists themselves will improve their technical work with insights into the goals and constraints of the business situation.

Introduction to Data Science and Machine Learning

Author: Keshav Sud
Publisher: BoD – Books on Demand
Total Pages: 233
Release: 2020-03-25
Genre: Computers
ISBN: 1838803335

Introduction to Data Science and Machine Learning was created to give beginners, data enthusiasts, and experienced data professionals a deep understanding of data science application development using open-source programming from start to finish. This book is divided into four sections: the first section contains an introduction to the book; the second covers the field of data science, software development, and open-source based embedded hardware; the third covers algorithms that are the decision engines for data science applications; and the final section brings together the concepts shared in the first three sections and provides several examples of data science applications.

Handbook of Research on Academic Libraries as Partners in Data Science Ecosystems

Author: Mani, Nandita S.
Publisher: IGI Global
Total Pages: 415
Release: 2022-05-06
Genre: Language Arts & Disciplines
ISBN: 1799897044

Beyond providing space for data science activities, academic libraries are often overlooked in the data science landscape that is emerging at academic research institutions. Although some academic libraries are collaborating in specific ways in a small subset of institutions, there is much untapped potential for developing partnerships. As library and information science roles continue to evolve to be more data-centric and interdisciplinary, and as research using a variety of data types continues to proliferate, it is imperative to further explore the dynamics between libraries and the data science ecosystems of which they are a part. The Handbook of Research on Academic Libraries as Partners in Data Science Ecosystems provides a global perspective on current and future trends concerning the integration of data science in libraries. It both provides a foundational base of knowledge around data science and explores numerous ways academicians can reskill their staff, engage in the research enterprise, contribute to curriculum development, and help build a stronger ecosystem where libraries are part of data science. Covering topics such as data science initiatives, digital humanities, and student engagement, this book is an indispensable resource for librarians, information professionals, academic institutions, researchers, academic libraries, and academicians.

Data Science Essentials For Dummies

Author: Lillian Pierson
Publisher: John Wiley & Sons
Total Pages: 199
Release: 2024-12-24
Genre: Computers
ISBN: 1394297009

Feel confident navigating the fundamentals of data science
Data Science Essentials For Dummies is a quick reference on the core concepts of the exploding and in-demand data science field, which involves data collection and working on dataset cleaning, processing, and visualization. This direct and accessible resource helps you brush up on key topics and is right to the point—eliminating review material, wordy explanations, and fluff—so you get what you need, fast.
- Strengthen your understanding of data science basics
- Review what you've already learned or pick up key skills
- Effectively work with data and provide accessible materials to others
- Jog your memory on the essentials as you work and get clear answers to your questions
Perfect for supplementing classroom learning, reviewing for a certification, or staying knowledgeable on the job, Data Science Essentials For Dummies is a reliable reference that's great to keep on hand as an everyday desk reference.

Hands-On Data Science with Anaconda

Author: Yuxing Yan
Publisher: Packt Publishing Ltd
Total Pages: 356
Release: 2018-05-31
Genre: Computers
ISBN: 1788834739

Develop, deploy, and streamline your data science projects with the most popular end-to-end platform, Anaconda
Key Features
- Use Anaconda to find solutions for clustering, classification, and linear regression
- Analyze your data efficiently with the most powerful data science stack
- Use the Anaconda cloud to store, share, and discover projects and libraries
Book Description
Anaconda is an open source platform that brings together the best tools for data science professionals with more than 100 popular packages supporting Python, Scala, and R languages. Hands-On Data Science with Anaconda gets you started with Anaconda and demonstrates how you can use it to perform data science operations in the real world. The book begins with setting up the environment for Anaconda platform in order to make it accessible for tools and frameworks such as Jupyter, pandas, matplotlib, Python, R, Julia, and more. You’ll walk through package manager Conda, through which you can automatically manage all packages including cross-language dependencies, and work across Linux, macOS, and Windows. You’ll explore all the essentials of data science and linear algebra to perform data science tasks using packages such as SciPy, contrastive, scikit-learn, Rattle, and Rmixmod. Once you’re accustomed to all this, you’ll start with operations in data science such as cleaning, sorting, and data classification. You’ll move on to learning how to perform tasks such as clustering, regression, prediction, and building machine learning models and optimizing them. In addition to this, you’ll learn how to visualize data using the packages available for Julia, Python, and R.
What you will learn
- Perform cleaning, sorting, classification, clustering, regression, and dataset modeling using Anaconda
- Use the package manager conda and discover, install, and use functionally efficient and scalable packages
- Get comfortable with heterogeneous data exploration using multiple languages within a project
- Perform distributed computing and use Anaconda Accelerate to optimize computational powers
- Discover and share packages, notebooks, and environments, and use shared project drives on Anaconda Cloud
- Tackle advanced data prediction problems
Who this book is for
Hands-On Data Science with Anaconda is for you if you are a developer who is looking for the best tools in the market to perform data science. It’s also ideal for data analysts and data science professionals who want to improve the efficiency of their data science applications by using the best libraries in multiple languages. Basic programming knowledge with R or Python and introductory knowledge of linear algebra is expected.

Ultimate Data Science Programming in Python

Author: Saurabh Chandrakar
Publisher: BPB Publications
Total Pages: 745
Release: 2024-09-25
Genre: Computers
ISBN: 9365895669

DESCRIPTION
In today's data-driven world, the ability to extract meaningful insights from vast datasets is crucial for success in various fields. This ultimate book for mastering open-source libraries of data science in Python equips you with the essential tools and techniques to navigate the ever-evolving field of data analysis and visualization. Discover how to use Python libraries like NumPy, Pandas, and Matplotlib for data manipulation, analysis, and visualization. This book also covers scientific computing with SciPy and integrates ChatGPT to boost your data science workflow. Designed for data scientists, analysts, and beginners, it offers a practical, hands-on approach to mastering data science fundamentals. With real-world applications and exercises, you will turn raw data into actionable insights, gaining a competitive edge. This book covers everything you need, including open-source libraries, Visual Explorer tools, and ChatGPT, making it a one-stop resource for Python-based data science. Readers will gain confidence after going through this book and we assure you that all the minute details have been taken into consideration while delivering the content. After reading, learning, and practicing from this book, we are sure that all IT professionals, novices, or job seekers will be able to work on data science projects thus proving their mettle.
KEY FEATURES
● Master key Python libraries like NumPy, Pandas, and Seaborn for effective data analysis and visualization.
● Understand complex data science concepts through simple explanations and practical examples.
● Get hands-on experience with 300+ solved examples to solidify your Python data science skills.
WHAT YOU WILL LEARN
● Learn to work with popular IDEs like VS Code and Jupyter Notebook for efficient Python development.
● Master open-source libraries such as NumPy, SciPy, Matplotlib, and Pandas through advanced, real-world examples.
● Utilize automated EDA tools like PyGWalker and AutoViz to simplify complex data analysis.
● Create sophisticated visualizations like heatmaps, FacetGrid, and box plots using Matplotlib and Seaborn.
● Efficiently handle missing data, outliers, and perform filtering, sorting, grouping, and aggregation using Pandas and Polars.
WHO THIS BOOK IS FOR
This book is ideal for diploma, undergraduate, and postgraduate students from engineering and science fields to programming and software professionals. It is also perfect for data science, ML, and AI engineers looking to expand their expertise in cutting-edge technologies.
TABLE OF CONTENTS
1. Environmental Setup for Using Data Science Libraries in Python
2. Exploring Numpy Library for Data Science in Python
3. Exploring Array Manipulations in Numpy
4. Exploring Scipy Library for Data Science in Python
5. Line Plot exploration with Matplotlib Library
6. Charting Data With Various Visuals Using Matplotlib
7. Exploring Pandas Series for Data Science in Python
8. Exploring Pandas Dataframe for Data Science in Python
9. Advanced Dataframe Filtering Techniques
10. Exploring Polars Library for Data Science in Python
11. Exploring Expressions in Polars
12. Exploring Seaborn Library for Data Science in Python
13. Crafting Seaborn Plots: KDE, Line, Violin and Facets
14. Integrating Data Science Libraries with ChatGPT Prompts
15. Exploring Automated EDA Libraries for Machine Learning
16. Case Study Using Python Data Science Libraries