Is your data big data?

The amount of data in the world keeps growing. What makes data high-quality, and how can it best be utilised? Held at the Kuopio Campus, UEF’s Big Data Day presented various viewpoints on the topic, as well as examples of how big data has been applied.

“What is big data, ultimately? The amount of stored raw data is growing every day, as we save massive amounts of it. Most new data is generated by various pieces of research equipment – not by us humans. Some data, on the other hand, is generated through social networks, phone calls and medical registers,” said Professor Emeritus Erkki Oja from Aalto University. He has been researching artificial intelligence for 40 years.

According to Oja, one quarter of the world’s population is already on Facebook, with more than 200 terabits of data stored there. YouTube, on the other hand, boasts 300 hours of video being uploaded every minute, attracting more than 5 billion views each day.

“So, when does your data become big data? How valuable is your data, when volumes grow and targets keep moving?” Oja asked.

“In fact, big data is technology comprising the acquisition, storage, sharing, analysis, processing and applications of data. Artificial intelligence, on the other hand, is the collection of search engines, translation software, games, pattern recognition, robotics, machine learning and so on inside the digital world.”

“In the past few years, the biggest victories for artificial intelligence have come when it beat the world’s best chess and Go players, and its most efficient use has been witnessed in various Q&A systems. Today’s hot topic – deep learning – is actually just a component of artificial neural networks, or computer programmes.”

The development of artificial intelligence is currently at a turning point: development can either stop completely, or it can enter a stage of rapid, even dangerous, growth. Oja is optimistic about the future course of development, and he believes that the truth will be somewhere in the middle ground.

“Data volumes are growing exponentially, and efficient algorithms and equipment will allow us to make the most of data. In five years’ time, everything will be more practical, and the applications of big data will no longer be as isolated as they are today,” he said.

“The mere production of big data is not enough: we need to generate new knowledge and theories relating to it,” Associate Professor Lauri Mehtätalo from the University of Eastern Finland School of Computing said.

“Big data is not experiment-based science or research. Understanding the topic comes from investing time in reflecting on it.”

The topics discussed in the seminar also included whether big data should be used at the level of populations or individuals in health care planning. If planning takes place at the level of individuals, all earlier data and test results should be available. Large homogeneous groups may be difficult to find, but variation within such groups should nevertheless be smaller.

Text and photo: Marianne Mustonen