Two Crows Corp.

 Home  |  About Data Mining  |  Publications  |  Seminars  |  Consulting  |  About Two Crows 

Two-day course
Public presentations
 

Course Outline: "Successful Data Mining"

The following subjects will be covered in the context of the sample data sets.

OVERVIEW OF DATA MINING

  • Definition
  • Business problems
  • Types of analysis: descriptive (clustering, association detection, sequence detection) and predictive (classification, regression, time series)
  • The Two Crows data mining process
  • Why building predictive models is a difficult problem

IDENTIFYING THE BUSINESS PROBLEM

  • Targeting data mining applications
  • Uses of data mining

PREPARING THE DATA FOR MINING

  • Building the mining database: architecture, types of data, collecting the data
  • Data quality: data consolidation, missing values, erroneous values, outliers
  • Understanding and transforming the data: visualizations, statistical profiling
  • Selecting data: columns (reducing dimensionality); rows (sampling)
  • Transforming the data: data representation (scaling, binning, encoding, time series)
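
As a minimal illustration of the data representation steps named above (scaling, binning, encoding), the following Python sketch uses only the standard library. The function names and sample values are hypothetical and are not taken from the course materials; real projects would likely choose among several scaling and binning schemes.

```python
def min_max_scale(values):
    """Rescale numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread, so map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bin(values, n_bins):
    """Assign each value the index of one of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def one_hot_encode(labels):
    """Encode categorical labels as 0/1 indicator vectors (sorted category order)."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Hypothetical sample columns, for illustration only.
ages = [20, 30, 40, 60]
regions = ["east", "west", "east"]

scaled = min_max_scale(ages)
binned = equal_width_bin(ages, 2)
encoded = one_hot_encode(regions)
```

Here `scaled` holds the ages mapped onto [0, 1], `binned` holds a bin index per age, and `encoded` holds one indicator vector per region.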

MINING E-COMMERCE DATA

  • Building the e-commerce database
  • Personalizing the interaction
  • Making recommendations
  • What can go wrong

BUILDING THE MODEL

  • The train, test, validate cycle
  • Validating the model
  • Model evaluation
  • What can go wrong
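
One way to picture the train, test, validate cycle is as a three-way split of the available rows. The sketch below is an assumption-laden illustration, not the course's procedure: the function name `three_way_split`, the 60/20/20 proportions, and the fixed random seed are all hypothetical choices.

```python
import random


def three_way_split(rows, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle rows and split them into train, test, and validation subsets."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    # Whatever remains is held back for the final validation step.
    validate = shuffled[n_train + n_test:]
    return train, test, validate


rows = list(range(100))  # stand-in for 100 records
train, test, validate = three_way_split(rows)
```

Under these assumptions, a model would be fitted on `train`, tuned against `test`, and checked once against `validate`, which plays no part in fitting or tuning.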

MODEL TYPES AND ALGORITHMS

  • Classical regression (linear and non-linear), logistic regression
  • Decision trees
  • Neural nets
  • K-nearest neighbor
  • MARS

DATA MINING PRODUCT SELECTION

  • Types of data mining tools
  • Analytic applications
  • Market overview
