About data mining
Databases today can range in size into the terabytes — more than 1,000,000,000,000 bytes of data. Within these masses of data lies hidden information of strategic importance. But when there are so many trees, how do you draw meaningful conclusions about the forest?
Data mining, sometimes called predictive analytics, is used both to increase revenues (through improved marketing) and to reduce costs (through detecting and preventing waste and fraud). Organizations of all types, worldwide, are achieving measurable payoffs from this technology.
Data mining finds patterns and relationships in data by using sophisticated techniques to build models — abstract representations of reality. A good model is a useful guide to understanding your business and making decisions.
There are two main kinds of models in data mining: predictive and descriptive. Predictive models can be used to forecast explicit values, based on patterns determined from known results. For example, from a database of customers who have already responded to a particular offer, a model can be built that predicts which prospects are likeliest to respond to the same offer. Descriptive models describe patterns in existing data, and are generally used to create meaningful subgroups such as demographic clusters.
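The difference between the two kinds of model can be illustrated with a minimal Python sketch. Everything in it is a made-up assumption for illustration: the customer records, the age bands, the prospect names, and the function names. The predictive part simply learns the observed response rate per age band and ranks prospects by it, and the descriptive part uses a plain grouping as a stand-in for a real clustering algorithm. Actual data mining tools use far more sophisticated techniques.

```python
from collections import defaultdict

# Hypothetical historical records: (age_band, responded_to_offer)
history = [
    ("18-34", True), ("18-34", False), ("18-34", True), ("18-34", True),
    ("35-54", False), ("35-54", False), ("35-54", True), ("35-54", False),
    ("55+", False), ("55+", False),
]

def fit_response_rates(records):
    """Predictive step: learn the observed response rate for each age band."""
    counts = defaultdict(lambda: [0, 0])  # band -> [responses, total]
    for band, responded in records:
        counts[band][0] += int(responded)
        counts[band][1] += 1
    return {band: hits / total for band, (hits, total) in counts.items()}

def rank_prospects(prospects, rates):
    """Order prospects from most to least likely to respond.

    Prospects in an age band never seen in the history score 0.0.
    """
    return sorted(prospects, key=lambda p: rates.get(p[1], 0.0), reverse=True)

def describe_segments(records):
    """Descriptive step: summarize existing data as subgroup sizes."""
    sizes = defaultdict(int)
    for band, _ in records:
        sizes[band] += 1
    return dict(sizes)

rates = fit_response_rates(history)

# Hypothetical prospects: (name, age_band)
prospects = [("Ana", "55+"), ("Ben", "18-34"), ("Cy", "35-54")]
ranked = rank_prospects(prospects, rates)

segments = describe_segments(history)
```

Here `ranked` lists the prospects the model considers likeliest to respond first (a forecast about people not yet contacted), while `segments` only summarizes the customers already in the data.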
In addition to algorithms, data mining software usually includes visualization tools for representing the data graphically, plus interfaces to common database formats.
Data mining is only one step in the knowledge discovery process. Other steps include identifying the problem to be solved, collecting and preparing the right data, interpreting and deploying models, and monitoring the results. The real key to success, however, is to have a thorough understanding of your data and of your business. Algorithms can provide meaningful results only when sensibly directed.
The potential payoffs are enormous. Innovative organizations are already using data mining to locate and appeal to higher-value customers, reconfigure their product offerings to increase sales, and minimize losses due to error or fraud. The list of data mining applications is surprisingly broad.
To learn more:
- Download a copy of our free tutorial booklet (PDF). It’s a non-technical overview of the uses and techniques of data mining.
- Consult our glossary of data mining terms.
Useful websites for further information about data mining include:
- KDnuggets: a directory of data mining and knowledge discovery resources (the best site we know for pointers to tools, references, organizations, and much more).
- CRISP-DM: the Cross-Industry Standard Process for Data Mining, an industry-standard methodology for data mining and predictive analytics.
- The BI Verdict (formerly The OLAP Report): a source of information on Business Intelligence products.