Definition of Big Data

In big data, the amount of data is usually large, often growing or changing constantly, and frequently so large that it is hard to work with using current technologies. Physical size is not the only defining characteristic, however. The amount is relative: a dataset that takes little storage space can still be big data if that kind of data has never before been available in such quantities.

Critical questions for Big Data, Boyd & Crawford, 2011, Information, Communication & Society Volume 15, Issue 5, 2012 Special Issue: A decade in Internet time: the dynamics of the Internet and society

Big Data Classification: Problems and Challenges in Network Intrusion Prediction with Machine Learning, Shan Suthaharan, ACM SIGMETRICS Performance Evaluation Review, Volume 41 Issue 4, March 2014

Mining big data: current status, and forecast to the future, ACM SIGKDD Explorations Newsletter, volume 14 issue 2, December 2012, pages 1-5

Ethics in big data

Ethics in big data is a complicated issue. Collecting the data is one part, but using the data is perhaps an even more critical issue. Because so much data can now be collected and stored in so many ways, there is the problem of how to properly inform people that they are being monitored. Terms of use feel like an inadequate solution, and some major IT companies, such as Facebook, hold such a dominant position in our society that they can demand all kinds of rights from their users. The second issue, using the data, can be even more difficult. It can be hard to predict how the data will be used, and as examples have shown, even anonymizing data sets can be useless if someone with access to enough other data sets can connect the dots. Data holders are responsible for what they sell and to whom. But even with such limitations in place, it can be impossible to verify that the data is used as agreed.
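
How anonymized data can be re-identified by "connecting the dots" can be illustrated with a small sketch. The following is only a toy example under stated assumptions: all records, names, and field names (zip, birth_year, sex) are invented for illustration, and the function link_records is hypothetical rather than taken from any of the sources above. It joins a data set without names to a public data set with names, using fields that both share.

```python
# Toy linkage sketch. All data below is invented for illustration only.

# "Anonymized" data set: names removed, but quasi-identifiers remain.
anonymized_health = [
    {"zip": "12345", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "67890", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "12345", "birth_year": 1991, "sex": "M", "diagnosis": "migraine"},
]

# Second, public data set that contains names and the same fields.
public_register = [
    {"name": "Alice Example", "zip": "12345", "birth_year": 1980, "sex": "F"},
    {"name": "Bob Example", "zip": "67890", "birth_year": 1975, "sex": "M"},
    {"name": "Carl Example", "zip": "12345", "birth_year": 1991, "sex": "M"},
    {"name": "Dave Example", "zip": "12345", "birth_year": 1991, "sex": "M"},
]

# Fields shared by both data sets (an assumption of this sketch).
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(anonymized, public):
    """Return (name, diagnosis) pairs where the shared fields match exactly one person."""
    # Index the public data set by its combination of quasi-identifiers.
    index = {}
    for person in public:
        key = tuple(person[field] for field in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(person["name"])

    matches = []
    for row in anonymized:
        key = tuple(row[field] for field in QUASI_IDENTIFIERS)
        names = index.get(key, [])
        # Only a unique match re-identifies a person; ambiguous keys do not.
        if len(names) == 1:
            matches.append((names[0], row["diagnosis"]))
    return matches


reidentified = link_records(anonymized_health, public_register)
for name, diagnosis in reidentified:
    print(name, "->", diagnosis)
```

In this toy data, two of the three "anonymous" rows are tied back to a single name, while the third stays ambiguous because two people share the same combination of fields. The more data sets are available to combine, the fewer people remain hidden in this way.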

Coursera Diploma

Exam questions

Describe the 3 main actors and their relationships in the big data ecosystem.

  • The question is about data (data owners), ideas (innovators), and expertise (data scientists). The student should understand the changing dynamics and roles of these three: how each has a role in utilizing big data, and how they depend on each other even though they can be separate entities, such as different companies.

Explain why the value of data is difficult to estimate, and how that value can change.

  • The student should understand and explain how data can have unforeseen secondary uses, how it can have different value for different entities, and how it can gain or lose value over time, either by becoming outdated or through new technologies that give value to old data.

What kinds of advantages do large companies have over smaller companies when developing new killer apps?

  • The point is to show an understanding that not only startups can innovate, and that large companies have some advantages, such as time and money.

How and why should the innovation development process be controlled?

  • List methods for monitoring and evaluating the killer app during its development, such as periodic reviews, prototyping, and playing out doomsday scenarios. Understand the reasons for using these methods.