The Basics of Big Data

Author: Maria Silvia Martinson, lawyer at RestMark METIDA

On 10 January 2017, the European Commission issued a communication titled "Building a European Data Economy", which explains the need for the free movement of data and explores both the obstacles to a digital single market for data and the possibilities of overcoming them. In the context of the new General Data Protection Regulation, which will apply from 25 May 2018, and of further data protection reforms to come, Big Data, with its benefits and complications, has become the dominant topic of many discussions. What are the basics of this phenomenon that everyone should keep in mind?

Interestingly enough, there is no single agreed-upon definition of Big Data, and in various contexts the term may be understood differently. This article uses the term "Big Data" in line with the explanation offered by the European Commission: "Big Data refers to large amounts of different types of data produced from various types of sources, such as people, machines or sensors". Other definitions note that the data sets considered Big Data are so complex, or simply so large, that they have outgrown traditional processing applications and require innovative data management software. It is, however, standard to define Big Data by three characteristics: volume, velocity and variety.

"Volume" indicates extremely large amounts of data derived from different sources. The extent of collected and stored data has grown explosively since the beginning of the so-called "digital age" in global information storage capacity in 2002. More precisely, a study by Martin Hilbert and Priscila López demonstrated that 2002 was the tipping point after which more data was stored in digital format than in analog formats such as paper and film. By 2007, 94% of all data was stored digitally, and the total amount of data had multiplied several times over. And that was even before the rise of cloud computing (the practice of storing and accessing data on remote servers hosted on the internet, which offers exceptional storage capacity), which started its growth spurt in 2009 and became widespread around 2013.

"Velocity" and "variety" respectively describe the unprecedented speed of data streams and the different formats in which data arrives. Thanks to advanced machine-to-machine communication and the growing number of objects with Internet of Things capabilities (such as embedded sensors), more and more tiny bits of data are transmitted every second. People, too, produce massive amounts of data each day; Facebook users, for example, are said to upload over 900 million photos daily. Since the data comes from very different sources, it may take the form of emails, numeric data, video or something else. Moreover, although some of the data is structured, most of it is not, which makes it more difficult to manage and analyse.

Considering the above, it can easily be agreed that more data is collected, stored and managed than ever before. While additional storage capacity raises some issues, it is far from the most pressing question about Big Data. The objective of most discussions and developments is to find the best way to extract knowledge from Big Data and use it as an accelerant for innovation. Having access to large amounts of data is only half the victory: not all of it is useful, and the parts that are can be as hard to find as a needle in a haystack. Once that needle is found, however, the possibilities are endless.

Extracting value from Big Data requires highly intelligent software that can perform extensive search operations to retrieve the intended results. Those results then need to be analysed to identify trends, often fast enough to allow a real-time response. This process largely is, and should be, automated, as it would otherwise be highly labour-intensive. In his 2016 article in the Huffington Post, entrepreneur James Canton stated that autonomous decision-making is becoming the norm. He added that while identifying trends is useful, combining Big Data with artificial intelligence may make it possible to extract meaning that advises us and determines better outcomes faster.
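To give a flavour of what such automated trend-spotting means in practice, here is a minimal Python sketch. It is purely illustrative (the function name, window size, threshold and sensor readings are all invented for this example, not taken from any real system): it watches a stream of values and flags any value that spikes well above the recent moving average, the kind of decision that real Big Data platforms make at far greater scale and speed.

```python
from collections import deque

def detect_trend(stream, window=5, threshold=1.5):
    """Flag values that exceed `threshold` times the recent moving
    average -- a toy stand-in for automated trend detection."""
    recent = deque(maxlen=window)  # sliding window of recent values
    alerts = []
    for i, value in enumerate(stream):
        if len(recent) == window:
            avg = sum(recent) / window
            if avg > 0 and value > threshold * avg:
                alerts.append((i, value))  # spike relative to recent history
        recent.append(value)
    return alerts

# Hypothetical sensor readings: a steady stream with one sudden spike.
readings = [10, 11, 9, 10, 12, 11, 10, 30, 11, 10]
print(detect_trend(readings))  # -> [(7, 30)]
```

Because the window holds only the last few values, each reading is processed in constant time, which is what makes this style of analysis feasible on fast, continuous data streams.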

The benefits of Big Data are widespread. The knowledge extracted from it can be used by businesses to target their customers better (personalised advertisements) and to optimise supply planning and product quality, or by healthcare specialists to optimise patients' treatment or even to prevent diseases and find new cures for them. With the help of Big Data, traffic flows can be managed and our homes improved. The public sector also uses Big Data to prevent cyber attacks, fight terrorism and prevent various types of criminal activity.

In other words, Big Data is highly valuable, and not only in the sense of the benefits it can offer. According to the European Commission, the value of European citizens' personal data has the potential to grow to EUR 643 billion by 2020, which would correspond to 3.17% of the overall GDP of the European Union. Big Data can be, and already is, monetised by many enterprises, as large amounts of data, both personal and non-personal, are available to them. This, however, raises the question of data protection and privacy. Transparency and trust are the keywords here, since according to a 2015 Eurobarometer study, 81% of Europeans feel that they do not have complete control over their personal data online.

Some of these concerns have already been addressed by the General Data Protection Regulation (GDPR), which was adopted on 27 April 2016 and will apply from 25 May 2018. More precisely, the GDPR provides a single set of rules for the entire European Union regulating the processing of personal data. This also covers machine-generated data that is sufficient to identify a natural person. One of the biggest changes the GDPR brings to the table is its expanded territorial reach: even companies located outside the European Union, but targeting consumers in the European Union, will be subject to the regulation. However, as stated previously, the GDPR neither answers all of the concerns in this matter nor entirely clears the way for an optimal "data economy". The European Commission is therefore launching a wide dialogue on the issues raised in the aforementioned 10 January 2017 communication, starting with a public consultation.
