For predictive analytics algorithms to work, we must have access to historical data that exhibits known customer behavior. We must also know what problem we are trying to solve: in our case, the probability of default.
These fields are called predictors and are combined into
a predictive model which you can use in your business processes.
Building a Predictive Model involves 5 steps: Data preparation, Data analysis, Model development, Model analysis, and Model export.
Predictive Analytics Director supports two types of models:
· Scoring Models - for the prediction of binary behavior
· Spectrum Models - for the prediction of continuous behavior
Scoring Models:
The value calculated by the model, known as the score,
places a case on a numerical scale. High scores are associated with better
business (good behavior) and low scores are associated with worse (bad
behavior). Typically, the range of scores is broken into intervals of
increasing likelihood of one of the two types of behavior. Scoring models
require behavior to be classified into two distinct forms like positive and
negative. Classic examples of such behavior are:
· Responding to a mailing or not
· Repaying loans or going into arrears
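A scoring model of the kind described above can be sketched with logistic regression, a standard technique for binary good/bad outcomes. The data, feature count, and good/bad encoding here are hypothetical, not from the source.

```python
# A minimal sketch of a scoring model: logistic regression on a binary
# good/bad outcome. Data and feature names are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                          # three hypothetical predictors
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)   # 1 = good, 0 = bad behavior

model = LogisticRegression().fit(X, y)
scores = model.predict_proba(X)[:, 1]  # probability of good behavior per case
```

The probabilities can then be rescaled onto any numerical score range and broken into intervals, as the text describes.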
Spectrum Models:
Spectrum models extend the ideas of scoring models to the prediction of continuous behavior, for example:
· Likely purchase value of responders to a direct mail campaign
· The likely eventual write-off of cases recently falling into arrears
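A spectrum model predicts a continuous value rather than a binary class; a minimal sketch is ordinary linear regression on a hypothetical continuous target such as purchase value (the data and coefficients below are invented for illustration).

```python
# A minimal sketch of a spectrum model: linear regression on a
# continuous outcome (e.g. likely purchase value). Data are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 50 + 10 * X[:, 0] + rng.normal(scale=2, size=200)  # continuous target

model = LinearRegression().fit(X, y)
pred = model.predict(X)  # continuous predictions, one per case
```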
Model Template:
In the Predictive Analytics Director portal we have
defined a number of model templates which you can use. These include Risk,
Retention, Recruitment, and Recommendation.
Select the appropriate model template and start working on the project.
1) Model Creation:
· Select the source: a CSV file or a database.
· Set the sampling size using a percentage and the total number of cases. Define the properties to be used and the type of each property.
· Identify the field which you are trying to predict. In this example, the field is behavior. The field exhibits binary characteristics (N/Y), so the scoring model is the most appropriate.
· Now define good and bad behavior under the outcome definition.
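The model-creation steps above can be sketched in pandas under stated assumptions: the column names, the sampling fraction, and the convention that Y maps to good are all hypothetical stand-ins for what you would configure in the portal.

```python
# A sketch of model creation: pick a data source, sample it, and define
# good/bad outcomes from the binary behavior field. All names hypothetical.
import pandas as pd

df = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4],
    "Income": [30, 55, 42, 61],
    "Behavior": ["Y", "N", "Y", "N"],  # the field we are trying to predict
})

sample = df.sample(frac=0.5, random_state=0)  # sampling size set as a percentage

# Outcome definition: assumed mapping of Y -> good, N -> bad.
df["Outcome"] = df["Behavior"].map({"Y": "good", "N": "bad"})
```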
2) Data Analysis:
· Predictive Analytics Director facilitates automatic discovery of correlation patterns of individual predictors and their ability to predict the outcome. Unique identifiers are never valid predictors: customer ID may appear to be a reasonably well-performing predictor, but since we know that customer ID is a random number or a member of a sequence, it cannot have any impact on good or bad behavior. Thus we remove IDs from the candidate list of predictors. Any property which can be used to differentiate good and bad behavior can serve as a predictor.
· So data analysis involves defining the properties, binning (splitting the data into buckets), and then grouping the predictors.
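The two ideas in this step, dropping unique identifiers and binning a predictor into buckets, can be sketched as follows; the column names and cut points are hypothetical.

```python
# A sketch of data analysis: remove candidate identifiers (columns where
# every value is unique, so they cannot generalize), then bin a numeric
# predictor and compare the good rate per bucket. Data are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "CustomerID": range(8),  # unique ID: not a valid predictor
    "Income": [20, 25, 40, 40, 60, 65, 80, 85],
    "Outcome": ["bad", "bad", "bad", "good", "good", "bad", "good", "good"],
})

# Flag columns whose values are all unique as likely identifiers.
ids = [c for c in df.columns if df[c].nunique() == len(df)]
predictors = df.drop(columns=ids + ["Outcome"])

# Binning: split Income into buckets and inspect the good rate per bucket.
df["IncomeBin"] = pd.cut(df["Income"], bins=[0, 40, 70, 100])
good_rate = df.groupby("IncomeBin", observed=True)["Outcome"].apply(
    lambda s: (s == "good").mean()
)
```

A predictor whose buckets show clearly different good rates, as Income does here, is a useful candidate; a flat profile suggests a weak predictor.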
3) Model Development:
· Predictive Analytics Director provides a rich model factory supporting industry-standard models such as regression and decision tree models.
· The system automatically creates two models: the Regression and the Decision Tree-CHAID model. At this stage, you can create additional models if required.
· We can combine predictors into a group, which means the outcome of either predictor A, B, or C will be considered, e.g. HouseOwner/Rented.
· Also look at the scorecard representation of the model to check the scores of each predictor and define the weight of the predictor on a scale from 0 to 1000.
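The model-development step can be sketched with scikit-learn. Note one substitution: Predictive Analytics Director builds a CHAID decision tree, which scikit-learn does not implement, so a CART tree stands in here; the scorecard rescaling to the 0-1000 range is likewise an illustrative assumption, not the product's exact formula.

```python
# A sketch of model development: fit the two automatic models (regression
# and a decision tree; CART stands in for CHAID) and view regression
# weights on a 0-1000 scorecard-style scale. Data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
y = (X[:, 0] - X[:, 1] + rng.normal(size=300) > 0).astype(int)

reg = LogisticRegression().fit(X, y)
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Scorecard-style view: rescale absolute regression coefficients so the
# largest weight maps to 1000 (the 0-1000 range mentioned in the text).
weights = np.abs(reg.coef_[0])
scorecard = (1000 * weights / weights.max()).round()
```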
4) Model Analysis:
· Model analysis enables you to create a shortlist of models and then select the best model for your use case. At this stage, you also group the scores into statistically significant sets of score bands. First, let's examine the relative performance of the individual models.
· A very important aspect of each model is its performance, i.e. how good a model or a given predictor is at predicting the required behavior. We use the term Coefficient of Concordance (CoC) as the measure of the performance of predictors and models. You could describe CoC as a measure of how good the model is at discriminating good cases from bad cases. The value of CoC ranges from 50% (a random distribution) to 100% (perfect discrimination).
· Analyze the models created and select those that best discriminate good cases from bad cases.
· Model analysis consists of the following steps: Score Comparison, Score Distribution, and Class Comparison.
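Two of the ideas above can be sketched directly. CoC is computed here as the share of good/bad case pairs in which the good case receives the higher score (ties counting half), which matches the stated 50%-random to 100%-perfect range; the score-band grouping uses simple quartiles as a stand-in for the product's statistically significant bands. Scores and outcomes are hypothetical.

```python
# A sketch of two model-analysis ideas: the Coefficient of Concordance
# (CoC) and grouping scores into score bands. Data are hypothetical.
import numpy as np

def coc(scores, outcomes):
    """CoC in percent for binary outcomes (1 = good, 0 = bad):
    fraction of good/bad pairs where the good case scores higher."""
    s, o = np.asarray(scores, float), np.asarray(outcomes)
    diffs = s[o == 1][:, None] - s[o == 0][None, :]
    return 100 * ((diffs > 0).mean() + 0.5 * (diffs == 0).mean())

scores = np.array([310, 450, 520, 580, 640, 700, 760, 820])
outcomes = np.array([0, 0, 0, 1, 0, 1, 1, 1])
print(coc(scores, outcomes))  # prints 93.75

# Group the scores into four bands using quartile edges.
edges = np.quantile(scores, [0.25, 0.5, 0.75])
bands = np.digitize(scores, edges)  # band index 0..3 per case
```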
5) Model Export:
· In the final stage, you can produce reports about the model and export the model into a model file or an instance of the Predictive Model rule. Check the customer properties and the predictors created in the model under input mapping.
· The report contains:
1. A project summary
2. A visualization of the whole decision tree
3. The sensitivity of the model for each of the input fields
4. Model segmentation
5. Detailed insight into the analysis, grouping, and validation of each of the attributes
6. The date when the model was developed and by whom
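The export format itself is product-specific (a model file or a Predictive Model rule), so as a generic stand-in this sketch serializes a fitted model to a file and reloads it, illustrating only the idea that the trained model leaves the workbench as a reusable artifact.

```python
# A generic stand-in for model export: serialize a fitted model to a
# file and restore it. The real export target (model file or Predictive
# Model rule) is Pega-specific. Data are hypothetical.
import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
model = LogisticRegression().fit(X, y)

with open("model.pkl", "wb") as f:
    pickle.dump(model, f)          # "export" the model to a file
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)      # later, load it for use in a process
```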