Nautilus Systems, Inc. logo and menu bar Site Index Home
News Books
Button Bar Menu- Choices also at bottom of page About Nautilus Services Partners Case Studies Contact Us
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Subscribe]

Re: DM: Ross Quinlan's pessimistic pruning mechanism


From: Ronny Kohavi
Date: Sun, 2 Nov 1997 15:49:20 -0500 (EST)

Sergei> Dear colleagues I am desperately seeking for the exact formula
Sergei> for Ross Quinlan's pessimistic pruning mechanism used in his
Sergei> decision tree algorithms. It was said in some papers that this
Sergei> formula can be found in the book "C4.5: algorithms for machine
Sergei> learning" by R.Quinlan but I failed to find it there.

The formula is based on the standard confidence interval for binomial
distributions.  You correct the error at the leaves, taking the
pessimistic bound, and sum them up the tree.  If the pessimistic
error for a node is lower than for the subtree, you prune.

If all you want is the formula, I have it and the
derivation in one of my papers:
   Kohavi R, A Study of Cross-Validation and Bootstrap for Accuracy
   Estimation and Model Selection. IJCAI-95,
which you can get off:
   http://robotics.stanford.edu/users/ronnyk/
under publication.

MLC++ has the pruning code available freely for research purposes:
   http://www.sgi.com/Technology/mlc
Relevant routines are ML/CatTestResult.c, MTree/TDDTInducer.c,
MTree/ID3Inducer.c

--

   Ronny Kohavi (ronnyk@sgi.com, http://robotics.stanford.edu/~ronnyk)
   Engineering Manager, Analytical Data Mining.




[ Home | About Nautilus | Case Studies | Partners | Contact Nautilus ]
[ Subscribe to Lists | Recommended Books ]

logo Copyright © 1998 Nautilus Systems, Inc. All Rights Reserved.
Email: nautilus-info@nautilus-systems.com
Mail converted by MHonArc 2.2.0