Big Data Definition

For a long time, Big Data referred to an amount of data so vast that a single machine was unable to process it. With increasing computational power, however, this boundary became blurred, creating a need for a more precise definition. In response, Gartner came up with the 3V model, which defines big data as high-volume, high-velocity and high-variety information.

  • High-Volume: Today, vast amounts of data are generated by users, IT systems and sensors and stored in digital form. The amount of available digital data has increased dramatically over the past years and is expected to keep growing.
  • High-Velocity: Relates to the increased speed at which data is collected nowadays. As our lives become more and more digital and the number of sensors and IT systems increases, so does the speed at which data is collected.
  • High-Variety: Big data analysis draws on data from different sources and in different formats. This is helped by a trend to digitise all kinds of data sources, for example in the fields of communication, climate, media and many more.

McAfee, Andrew, and Erik Brynjolfsson. “Big data: the management revolution.” Harvard Business Review 90 (2012): 60-6.

Russom, Philip. “Big data analytics.” TDWI Best Practices Report, Fourth Quarter (2011).

Manyika, James, et al. “Big data: The next frontier for innovation, competition, and productivity.” (2011).

Ethics in Big Data

Today our lives involve more and more digital interactions, from social media to phones that are always connected to the internet to eCommerce shopping. In these interactions we leave footprints (data) that may contain information about us, and an increasing number of governments and companies have started to collect and analyse such data.

Users are aware that their data is being collected. At the same time, almost nobody knows which data is collected, whether it will be sold to third parties, and which analyses are performed on it. This raises the question of why users still use these services. The answer might differ from country to country, but for European countries it can be said that a few users trust these services, whereas the majority have at least some worries but feel that they can do nothing about it. This seems contradictory, as the European Union has strong privacy regulations. However, in a globalised digital world, borders are less visible. For example, a US company that owns a server in the European Union, where it stores data from EU users, faces a dilemma: the European Union argues that EU privacy rules apply to this data, whereas the US government disagrees. Unfortunately, privacy rules vary a lot from country to country. Russia and the US, for example, have low privacy protection, whereas the EU has strict rules regarding the privacy of its citizens.

In my humble opinion, it is in every company's interest to provide the user with the best possible user experience. This includes not only great design and a rich set of functionality but also transparency about which data is collected and how it is used. Such information should not be hidden in long lists of terms; instead, it should be presented in a graphical, easy-to-understand way. The user should also be able to opt out of such usage, and if they end their contract with the company, all their data should be deleted.


Statement of Accomplishment