
Big Data

“Big Data” is now a popular term used in many contexts, yet it is hard to define in isolation from any of them.

Doug Laney, an analyst at the company Gartner, defined “Big Data” as follows: “Big data is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”

However, the term can be better understood as a way of doing research, solving problems and thinking. The big data approach to problem solving involves considering as much information about the problem as possible (ideally “all” of it), allowing the information to be “messy”, and focusing on the search for patterns and relations among the variables rather than trying to explain the causal relationships behind the phenomena.

Coursera diploma

Exam questions

1. How can you critically assess an author's statements when you read an article? This question measures the student's knowledge of the critical assessment of literature. Correct answers might include verifying sources, looking for bias on the author's side, or checking that the author presents supporting evidence and that his or her conclusions are logical.

2. How can you differentiate a fact from an assertion and an opinion? This question finds out whether the student knows the difference between these three types of statement.

3. The guidelines for critical thinking present a systematic approach to assessing arguments, while big data proposes a more flexible and empirical paradigm. What are the main differences between these approaches, and how are they related?

4. The growing use of big data raises some ethical issues, such as loss of privacy. Do you think the benefits of big data outweigh the negative aspects associated with it? This is an open question to observe the student's standpoint on big data and its ethical implications. There are no right or wrong answers to it.

Ethics of big data

Some ethical issues come with the use of big data, and they become more critical as big data applications grow in number. Probably the hardest part of discussing ethics is that it is a highly subjective field, where everybody might have a different opinion on a given topic.

The most alarming example of a big data use that threatens human dignity is probably the one where people are judged according to how prone they are to do something (or to fail to do something). The idea of sending someone to jail for a crime he has not committed yet still looks like science fiction, but mild versions of this principle are being applied today. Insurance companies and banks often use statistical approaches based on an individual's background to calculate how likely that person is to have an accident or fail to repay a loan, although no solid proof can be found for such an assertion about the individual.

Granting so much power to big data is, in my opinion, not a healthy practice for a society. It leads to what the author of the book calls the dictatorship of data, where decisions are made only by data and statistics and human aspects are left aside. I think big data should be used as a support tool in decision making, or as an alternative paradigm for looking at a problem or opportunity, but it should never be the only way to make decisions.

Another very important downside of big data is the threat to privacy, which is the issue that currently concerns users the most. Personally, I find that managing information and protecting privacy through policies is a difficult project. Nowadays companies and governments collect information from users and use it for multiple purposes, which are often not known at the time the user provides the information, and going back to each user to ask for permission to reuse his or her information is not feasible. Users should accept the fact that once they provide their information, they no longer have absolute control over it.

The best guides for users and companies in the big data world are solid values and empathy for others.