SEMINARS
Understanding Data Science: An Emerging Discipline for Data Intensive Discovery
Michael L. Brodie
Abstract: Over the past two decades, Data-Intensive Analysis has emerged not only as a basis for the Fourth Paradigm of engineering and scientific discovery but also as a basis for discovery in most human endeavors for which data is available. Although it originated in the 1960s, its recent emergence, driven by Big Data and massive computing power, is leading to widespread deployment; yet its application, our understanding of it, and hence its development are still in their infancy. Given the potential risks and rewards of Data-Intensive Analysis and its breadth of application, it is imperative that we get this right. The perspective taken here is twofold. First, the objective of this emerging Fourth Paradigm, like that of its predecessor, the Scientific Method, is more than merely acquiring data and extracting knowledge; it is to investigate phenomena by acquiring new knowledge and by correcting and integrating it with previous knowledge. Second, Data Science is a body of principles and techniques with which to measure and improve the correctness, completeness, and efficiency of Data-Intensive Analysis. It is now time to identify and understand the fundamentals. This perspective is used to analyze more than 30 very large-scale use cases, to understand current practical aspects and gain insight into the fundamentals, and to address the fourth “V” of Big Data: veracity. This development may take decades.

Language: English