Subject: Re[2]: DM: discretization
From: Troy_Haines
Date: Tue, 19 Aug 1997 04:02:30 -0400 (EDT)
To my knowledge, discretization strategies are usually performed either
with the value of an outcome variable explicitly taken into account (a
bivariate framework maximizing some association metric) or by clustering
on a few selected variables deemed important a priori.

The real trick is to design a discretization strategy that is optimal in
a multivariate world, one that takes into consideration interactions
among several variables simultaneously. There is no reason to think that
a discretization scheme optimized in a bivariate sense will be optimal
for multivariate models (such as a multivariate logistic regression
model). Of course, if tree induction is the algorithm of choice, a
bivariate discretization strategy optimized at each node may be
appropriate.
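
As a rough illustration of the bivariate case described above, here is a
minimal Python sketch that picks a single cut point on one continuous
variable by maximizing information gain against the outcome. Information
gain is only one possible association metric, chosen here as an
assumption; the function names entropy and best_cut are hypothetical and
do not come from any particular package.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Find the binary split of `values` with the highest information gain.

    Returns (cut_point, gain). The cut point is the midpoint between two
    adjacent distinct values. Returns (None, 0.0) when no split helps,
    for example when every value is identical.
    """
    # Sort cases by the continuous variable only (labels need not be orderable).
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    base = entropy([y for _, y in pairs])
    best_point, best_gain = None, 0.0
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot cut between two equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) \
                    - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_point, best_gain = (lo + hi) / 2, gain
    return best_point, best_gain


# Toy data (made up for illustration): low values are class 0, high are class 1.
cut, gain = best_cut([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0, 0, 0, 1, 1, 1])
print(cut, gain)
```

Applied once per variable before modeling, this is bivariate in exactly
the sense criticized above: each cut ignores every other predictor.
Re-running it on the subset of cases reaching each node is roughly what
a tree inducer does when it handles continuous variables itself.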
Troy.
troy_haines@mail.amsinc.com
______________________________ Reply Separator _________________________________
Subject: Re: DM: discretization
Author: ronnyk@cthulhu.engr.sgi.com at AMS-Internet
Date: 8/18/97 3:18 PM
Bob> as decision trees are much easier to induce than generalized
Bob> classifiers, many people automatically (and blindly) discretize
Bob> their continuous variables prior to the induction process.
Bob> does anyone know of general discussions of this discretizing or
Bob> quantizing process? how should variables that represent counts or
Bob> frequencies be treated? what about the situation where all but
Bob> one of the cases have the same value for a variable, should it be
Bob> treated as continuous?
There's an overview paper of discretization methods in
  Dougherty, J., Kohavi, R. and Sahami, M., Supervised and unsupervised
  discretization of continuous features. Machine Learning 1995.
and another paper that compares the newer optimal error minimizer T2 in
  Kohavi, R. and Sahami, M., Error-Based and Entropy-Based Discretization
  of Continuous Features. KDD-96.
Both are available at:
http://robotics.stanford.edu/users/ronnyk/ronnyk-bib.html
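
For readers who want something concrete on the unsupervised side, here
is a minimal Python sketch of two common schemes, equal-width and
equal-frequency binning, run on the degenerate case raised in the
question (all but one case sharing the same value). The helper names and
the toy data are assumptions made for illustration; this is not code
from either paper.

```python
import bisect


def equal_width_edges(values, k):
    """Interior cut points splitting [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return []  # constant variable: nothing to cut
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_edges(values, k):
    """Interior cut points aiming at roughly n/k cases per bin.

    Candidate edges that repeat, or that would leave the lowest bin
    empty, are dropped, so heavily tied data can yield fewer than k bins.
    """
    s = sorted(values)
    n = len(s)
    edges = []
    for i in range(1, k):
        e = s[(i * n) // k]
        if e > s[0] and (not edges or e > edges[-1]):
            edges.append(e)
    return edges


def assign_bin(x, edges):
    """Bin index of x; a value equal to an edge falls in the upper bin."""
    return bisect.bisect_right(edges, x)


# Toy variable (made up): nine cases at 0 and a single case at 100.
data = [0] * 9 + [100]

ew = equal_width_edges(data, 4)
ef = equal_frequency_edges(data, 4)
print(ew, [assign_bin(x, ew) for x in data])
print(ef, [assign_bin(x, ef) for x in data])
```

On this toy variable equal-width binning keeps the lone case apart but
leaves the two middle bins empty, while equal-frequency binning finds no
usable cut at all and puts every case in one bin. That suggests a
near-constant variable is probably better handled as a special case than
pushed through either scheme blindly.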
--
Ronny Kohavi (ronnyk@sgi.com, http://robotics.stanford.edu/~ronnyk)