Generate Value from Data with Machine Learning and Data Analytics
E-book, English, 573 pages
ISBN: 978-1-56990-888-4
Publisher: Hanser Publications
Format: EPUB
Copy protection: Watermark
Further Information & Material
1 Introduction

“Data really powers everything that we do.”
Jeff Weiner

Questions Answered in this Chapter:
What makes Data Science, ML, AI and everything else closely connected to generating value from data so fascinating?
Why do organisations need a strategy to become data driven?
What are some everyday use cases in the B2B or NGO world?
How are data projects structured?
What is the composition of a data team?

Data Science and related technologies have been the center of attention since 2010. Various changes in the ecosystem triggered this trend, such as significant advancements in processing vast amounts of unstructured data, a substantial reduction in the cost of disk storage, and the emergence of new data sources such as social media and sensor data. The Harvard Business Review (HBR) called data scientist the sexiest job of the 21st century while quoting Hal Varian from Google.¹ Strategy consultants declared data to be the new oil, and there have been occasional “data rushes” where “enthusiasts in data fever” mined new data sources for yet unknown treasures. This book explores data science and incorporates various views on the discipline.

Figure 1.1 Data Science and related technologies on trends.google.com²

1.1 What are Data Science, Machine Learning and Artificial Intelligence?

There are many views on data science, and stakeholders in data science projects may give different answers when asked what they consider data science to be. They emphasize different aspects and may use different vocabulary since businesses and NGOs, for example, pursue different insights from data science applications. Perhaps the one common denominator is this: everyone expects data science to deliver, with the help of data, some value that was not there before.
Table 1.1 Various views on Data Science

| View | Description |
| --- | --- |
| Definition from Wikipedia | Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data and apply knowledge and actionable insights from data across a broad range of application domains.³ |
| Application-centered view | We collect data and put it into pandas DataFrames or data frames in RStudio. We also use tools such as TensorFlow or Keras. Our goal is to use these tools to explore the data. |
| Platform-oriented view | We create value from the data that we loaded onto our SaaS platform in the cloud. Then, depending on the provided data and its structure, we store it in different storage containers, such as blob storage and distributed databases. |
| Evangelist-oriented view | Data science was the next big thing in 2015. Now, you should look at more specific applications. Looking at the Gartner charts, invest your time exploring cutting-edge trends such as neuromorphic hardware or augmented intelligence. |
| Management-oriented view | These are the ways of working to bring our company into the 21st century as a data-driven enterprise. During and after our transition, we will penetrate new markets and monetize data as a service. |
| Career-oriented view | As a senior data scientist at a major company, I can earn a six-digit yearly salary and explore interesting fields in corporate labs. |
| Use case-oriented view | Tell me your business problem, and we will tell you how we solved it for another customer. From fraud detection to customer retention to social network analysis, feel free to check out our catalog of possible analytics applications. |
| Entrepreneurial/Optimistic view | Data Science is one way to change the world. Using Data Science, we can prevent climate change and fight poverty and hunger on a global scale. |
| Pessimist view | Data Science is one way to change the world. But, unfortunately, power-hungry people will use it to spy on us and suppress us. So Big Brother will be watching you. |
| Statistician’s view | Data Science is just a buzzword. It is just another word for statistics. We might call it statistics on steroids, maybe. But in the end, it’s just more marketing hype, another buzzword created to sell services to someone. |

The essentials of data science lie in mathematics. Data scientists apply statistics to generate new knowledge from data. Besides using algorithms on data, a data scientist must understand the scientific process of exploring data, such as creating reproducible experiments and interpreting the results.

There are many different terms related to data science. For example, professionals talk about Artificial Intelligence, machine learning, or deep learning. Sometimes experts also talk about related terms such as analytics, business intelligence, and simulation. In the following chapters, we will detail and highlight how we distinguish between analytics and data science. We will also highlight various data science applications, such as gaining insights into a text through Natural Language Processing, extracting objects from images via object recognition, or modeling railway networks for optimal pathfinding.

Data Science as Part of a Cultural Shift

Suppose you apply for a job as a data scientist in a company. Imagine that, unlikely as such an answer is, the company’s HR department rejects you because your astrological chart, drawn up from the data you provided in your CV, does not match the position.

Humans decide based on what they believe is right. But, unfortunately, human judgment is flawed through bias⁴, and we have mechanisms, such as confirmation bias, which assure us that we cannot err. For example, some people believe in the flat Earth theory or hollow Earth theory, which shows how powerful mechanisms such as confirmation bias can be.
For many of us, it would be disastrous to realize that a comfortable binary view of the world divided into black and white, good and evil, and right and wrong often does not work out.

Modern sociological ideas such as constructivism⁵ are more connected to data science than many think. The idea is that everyone constructs a reality based on their experience. Within the framework of “our reality”, including its rules and conventions, we make decisions. According to studies, it is not uncommon that we are deeply convinced that we are right even if our choices are questionable to others. For example, suppose we have created mental models for ourselves in which we are confident that astrology must be correct. In that case, it is logical to assume that using zodiac signs for personnel decisions will improve the hiring process. At the same time, people with strong religious beliefs might run into conflicts if they ignore what they might call signs or messages from God. Thanks to the biases mentioned above, our belief systems are often set in stone.

Data Science is not just a method to extract value from data; it also has the potential to be a method for making decisions that avoids or reduces human bias in the process. However, as will be shown in Chapter 18 on Trustworthy AI, data alone cannot solve the problem, because historical data and the model building process itself are often imbued with the very same biases. With that in mind, business leaders can integrate data science, together with transparent and non-discriminatory practices, into corporate culture, and this will substantially impact the company’s DNA. For example, a bias-aware company will adjust its processes.

Hiring a new employee is a good example. Many companies enlarge the hiring teams that decide on the outcome of candidate interviews in order to ensure that the bias of a single interviewer will not affect a hiring decision too much.
In modern hiring processes, data science can be used to generate predictions about candidates to assist the decision-making process. If done with care, these model predictions can help to minimise biases in employment decisions.

In the beginning, every judgment is a theory. A theory is neither right nor wrong but inconclusive until it is proven or disproven. Therefore, the positive effect of hiring personnel using astrological zodiacs would be nothing more than a theory. As long as we cannot prove that an astrological assessment would benefit a hiring process, the statement is inconclusive and, therefore, its use is not recommended. Calling astrology inconclusive rather than wrong might also make the discussion with believers in astrology less...