Re: DM: CHAID vs. CART
From: Ronny Kohavi
Date: Fri, 26 Sep 1997 09:58:51 -0400 (EDT)
Mike> Ronny Kohavi wrote:
>> Since the theory says that it's impossible for one algorithm to
>> uniformly beat any other on generalization accuracy in
>> classification tasks (where uniformly is for all possible target
>> concepts), the question is *when* (under what conditions) one
>> algorithm is better than another, not *whether*.
>>
Mike> Do you have a reference to the paper or book that describes such
Mike> a theory? Thanks.
Nicholas mentioned Schaffer's paper, which I won't repeat. The theorem
is actually trivial and has been mentioned over the years in
various papers (Tom Mitchell's version spaces and some older refs from
the late 1960s).
David Wolpert makes things very formal in his paper, including
interesting repercussions for meta-level learning:

@incollection{wolpert-relationship,
  author    = "David H. Wolpert",
  title     = "The Relationship between {PAC}, the statistical physics
               framework, the {Bayesian} framework, and the {VC}
               framework",
  year      = 1994,
  booktitle = {The Mathematics of Generalization},
  publisher = {Addison Wesley},
  editor    = {David H. Wolpert}
}
The proof is based on the following simple idea for Boolean labels:
without any assumptions, if algorithm A beats algorithm B on a
dataset, then there exists a dataset with the opposite labels that is
just as likely, and B will beat A on that one. Averaged over all
possible targets, the two algorithms therefore come out even.
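The symmetry argument can be checked exhaustively on a tiny domain.
The sketch below (my own illustration, not from Wolpert's paper; the
two toy "learners" are arbitrary choices) enumerates every Boolean
labeling of five points and shows that both learners average exactly
50% accuracy on the off-training-set points:

```python
from itertools import product

# Tiny No-Free-Lunch demo: over ALL Boolean targets, every learner
# has the same average off-training-set accuracy.
domain = range(5)       # five input points
train_idx = [0, 1, 2]   # points seen during "training"
test_idx = [3, 4]       # off-training-set points

def majority_learner(train_labels):
    # Predict the majority training label everywhere.
    guess = 1 if sum(train_labels) * 2 > len(train_labels) else 0
    return lambda x: guess

def constant_learner(train_labels):
    # Ignore the data; always predict 0.
    return lambda x: 0

def avg_test_accuracy(learner):
    # Average accuracy on test_idx over all 2^5 = 32 target labelings.
    targets = list(product([0, 1], repeat=len(domain)))
    total = 0.0
    for f in targets:
        h = learner([f[i] for i in train_idx])
        total += sum(h(i) == f[i] for i in test_idx) / len(test_idx)
    return total / len(targets)

print(avg_test_accuracy(majority_learner))  # 0.5
print(avg_test_accuracy(constant_learner))  # 0.5
```

For every target on which the majority learner wins, the label-flipped
target (which appears in the same enumeration) hands the win to the
constant learner, so the averages must agree.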
In real life, there are smoothness assumptions, especially on
real-valued attributes, which is why the theorem isn't too interesting
and why work in the field is still going on. Identifying those
smoothness assumptions (or other types of assumptions) is the big
problem. There are very few rules of thumb as to when one
should use one algorithm over another.
--
Ronny Kohavi (ronnyk@sgi.com, http://robotics.stanford.edu/~ronnyk)
Engineering Manager, Analytical Data Mining.
Silicon Graphics, Inc.