How to data mine

Published on: 29 February 2016

Data mining is part of the knowledge discovery in databases (KDD) process.

To undertake data mining, you need to acquire a dataset and then analyse the data, for example by identifying anomalies, clustering data into groups, classifying data, undertaking regression analysis and undertaking dependency modelling (looking at the relationship between two variables). Data mining is a useful tool for finding trends that would not have been predicted otherwise.
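As a rough illustration of two of these tasks, the Python sketch below flags anomalies using a simple z-score rule and fits a straight line to paired values as a basic regression analysis. The function names, the threshold of 2.0 and the sample numbers are all made up for illustration; real projects would normally use a dedicated statistics or machine learning library.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be flagged as unusual.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Made-up readings containing one obvious outlier (50).
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = find_anomalies(readings)

# Made-up paired data lying exactly on the line y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

The z-score rule is only one of many ways to define an anomaly, and a straight-line fit only captures a linear relationship between two variables.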

In 1987 Copenhagen Water undertook a data mining process when it started replacing 1 per cent of its pipe network each year. To decide which pipes to replace each year, it created a risk model that ranked pipes by how urgently they needed replacing, based on the characteristics of previously burst pipes. It found that the biggest factors in whether a pipe was likely to burst were whether that pipe had previously burst and the number of houses in the area where the pipe was laid. Through data mining it also discovered that some pipes were laid in high-intensity areas.
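A risk model of this general kind can be sketched in a few lines of Python. This is not Copenhagen Water's actual model: the `Pipe` fields, the weights and the sample network are hypothetical, and the sketch selects a fraction of pipes by count rather than by length of network.

```python
from dataclasses import dataclass


@dataclass
class Pipe:
    pipe_id: str
    previously_burst: bool
    houses_served: int


def risk_score(pipe, burst_weight=10.0, house_weight=0.1):
    """Hypothetical linear score: a higher value means more urgent replacement."""
    return burst_weight * pipe.previously_burst + house_weight * pipe.houses_served


def pipes_to_replace(pipes, fraction=0.01):
    """Return the highest-risk pipes, covering `fraction` of the pipes (at least one)."""
    count = max(1, int(len(pipes) * fraction))
    ranked = sorted(pipes, key=risk_score, reverse=True)
    return ranked[:count]


# Made-up network of four pipes.
network = [
    Pipe("A", previously_burst=False, houses_served=20),
    Pipe("B", previously_burst=True, houses_served=5),
    Pipe("C", previously_burst=True, houses_served=40),
    Pipe("D", previously_burst=False, houses_served=60),
]

# Select the riskiest half of this tiny network for illustration.
selected = pipes_to_replace(network, fraction=0.5)
```

In practice the weights would be estimated from historical burst records rather than chosen by hand, which is where the data mining comes in.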
