Data mining
Today data mining is a buzzword, but many people who use the
term constantly have only a vague idea of what it means.
This section tries to give a better understanding of it.
Data mining is a new, fast-growing discipline in today's world, where
many organizations accumulate very large data sets electronically.
The aim of data mining is essentially the same as that of
traditional statistics: analyzing historical data and using
it to make forecasts for events in the future.
Traditional statistics is closely linked to probability theory and
was typically used to analyze rather small, structured data sets.
By contrast, data mining is typically applied to large,
unstructured data sets, and its techniques are derived partly from
statistics but, more importantly, from newer techniques of
artificial intelligence, machine learning and numerical optimization.
Artificial intelligence tries to simulate processes that occur in
nature and biology in order to make computers perform tasks that
so far only humans or other intelligent beings can perform.
Machine learning applies techniques of mathematics and computer
science to make computers perform intelligent tasks.
The distinction between artificial intelligence and machine learning
is quite subtle and often it is not possible to clearly say
whether an approach belongs to one or the other.
Numerical optimization is used in statistics, artificial
intelligence and machine learning to solve sub-problems.
The most important techniques for data mining are:
- Bayesian statistics
- decision trees
- neural networks
- logic programming
- genetic algorithms (used in conjunction with decision trees or neural networks)
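To make one of the techniques listed above concrete, here is a minimal sketch of the simplest possible decision tree: a single split, often called a decision stump. The applicant data, the `APPLICANTS` name and the `best_threshold` function are all hypothetical and invented for illustration. Real decision tree learners usually choose splits with an impurity measure such as entropy or the Gini index and apply the procedure recursively; this sketch only counts training errors for one split.

```python
# Hypothetical applicants, invented for illustration:
# (annual income in thousands, defaulted on a loan?)
APPLICANTS = [
    (18, True), (22, True), (25, False), (31, True),
    (40, False), (48, False), (55, False), (70, False),
]


def best_threshold(samples):
    """Find the income threshold that misclassifies the fewest samples.

    The learned rule is: predict "defaults" when income < threshold.
    Returns the threshold and the number of training errors it makes.
    """
    incomes = sorted(income for income, _ in samples)
    # Candidate thresholds: midpoints between consecutive distinct incomes.
    candidates = [(a + b) / 2 for a, b in zip(incomes, incomes[1:]) if a != b]
    best, best_errors = None, len(samples) + 1
    for threshold in candidates:
        # A sample is an error when the prediction differs from the outcome.
        errors = sum((income < threshold) != defaulted
                     for income, defaulted in samples)
        if errors < best_errors:  # on ties, the lowest threshold is kept
            best, best_errors = threshold, errors
    return best, best_errors


threshold, errors = best_threshold(APPLICANTS)
print(f"predict default if income < {threshold} ({errors} training errors)")
```

A full decision tree would repeat this search on each of the two resulting groups, possibly using other attributes, until the groups are pure enough or too small to split further.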
The possible applications of data mining are vast. Some examples are:
- pharmacy --- drug testing
- text mining --- classification of the content of a text (for example, email: is a message spam or not?)
- marketing --- analysis and forecasting of customer behaviour
- banks --- credit scoring: is a credit applicant a risk or not?
- insurance --- what is the risk class of an applicant for an insurance policy?
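The spam question from the text mining example can be sketched with a naive Bayes classifier, one simple way of applying Bayesian statistics to text. The training messages and the `train` and `classify` functions below are hypothetical and invented for illustration; a real filter would be trained on many thousands of messages. The sketch assumes that words occur independently of each other given the class, which is the "naive" part of the method.

```python
import math
from collections import Counter

# Hypothetical toy training data, invented for illustration only.
TRAINING = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]


def train(examples):
    """Count words per class and messages per class."""
    word_counts = {}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts, class_counts


def classify(text, word_counts, class_counts):
    """Return the class with the highest posterior log-probability."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_messages = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Prior: the fraction of training messages in this class.
        score = math.log(class_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one (Laplace) smoothing avoids a zero probability
            # for words never seen in this class.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, class_counts = train(TRAINING)
print(classify("cheap money", word_counts, class_counts))
print(classify("agenda for lunch", word_counts, class_counts))
```

Working with sums of logarithms rather than products of probabilities is a common design choice here, since multiplying many small probabilities quickly underflows in floating-point arithmetic.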