
Re: DM: binary vs. multiway splits in classification trees

From: Dan Steinberg
Date: Fri, 26 Mar 1999 16:36:57 -0500 (EST)
On Thu, 25 Mar 1999, Tjen-Sien Lim wrote:

> I was wondering if anyone has an example where a classification tree
> with multiway splits provides a much better explanation of the data
> than a tree with binary splits does. Using KnowledgeSEEKER ('cluster'
> and 'exhaustive' methods), I haven't found data sets showing multiway
> splits are better than binary. Thanks.

Of course you know that a series of binary splits can always reproduce a multi-way split, and thus, in theory, binary splits should never do worse than multi-way splits. If the binary splitting rule does not reproduce the multi-way split, it is because the multi-way split is not the best choice (from a myopic perspective that sees only one split at a time). Further, since multi-way splits fragment the data much faster than binary splits, they ought to do worse. Similar observations have been made by Breiman, Friedman, Olshen and Stone in their monograph Classification and Regression Trees (1984) and by Usama Fayyad in one of his earlier articles (1991? -- can't locate reference right now).
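To make the first point concrete, here is a minimal sketch in Python (not taken from any particular tree package) showing that a k-way split on a categorical attribute can be reproduced by a chain of k-1 binary splits, each peeling off one value. The function names, the `color` attribute, and the toy rows are all made up for illustration.

```python
from collections import defaultdict

def multiway_split(rows, attr):
    """Partition rows into one group per distinct value of attr."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row)
    return dict(groups)

def binary_split(rows, attr, value):
    """Partition rows into (attr == value, attr != value)."""
    left = [r for r in rows if r[attr] == value]
    right = [r for r in rows if r[attr] != value]
    return left, right

def chained_binary_splits(rows, attr):
    """Reproduce the multiway partition with k-1 successive binary splits."""
    values = sorted({r[attr] for r in rows})
    if not values:
        return {}
    groups = {}
    remaining = rows
    for v in values[:-1]:
        # Peel off the rows with this value; keep splitting the rest.
        groups[v], remaining = binary_split(remaining, attr, v)
    # Whatever is left after k-1 splits holds the last value only.
    groups[values[-1]] = remaining
    return groups

# Hypothetical toy data: one three-valued attribute and a class label.
rows = [
    {"color": "red", "label": 1},
    {"color": "green", "label": 0},
    {"color": "blue", "label": 1},
    {"color": "red", "label": 0},
    {"color": "blue", "label": 0},
]

same = multiway_split(rows, "color") == chained_binary_splits(rows, "color")
print("partitions identical:", same)
```

The sketch only shows that the binary chain CAN reach the same partition. Whether a greedy binary rule actually chooses those splits depends on the splitting criterion and the data.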

I do have an example in which a multi-way split is significantly worse than binary splitting. The multi-way tree is smaller and cannot find the important splitters, which become invisible once the data have been fragmented.
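As a rough back-of-the-envelope illustration of the fragmentation (not the example referred to above), the following Python sketch computes the average number of cases per node after a given number of split levels. It assumes an idealized tree in which every node is split into equally sized children; the sample size of 1000 and the 5-way branching factor are hypothetical.

```python
def avg_cases_per_node(n_cases, branching, depth):
    """Average cases per node after `depth` levels of `branching`-way splits,
    assuming every node is split and the children are equally sized."""
    return n_cases / branching ** depth

n_cases = 1000  # hypothetical learning sample size
for depth in range(1, 4):
    binary = avg_cases_per_node(n_cases, 2, depth)
    five_way = avg_cases_per_node(n_cases, 5, depth)
    print(f"depth {depth}: binary {binary:g} cases/node, 5-way {five_way:g} cases/node")
```

Under these assumptions the 5-way tree is down to 8 cases per node after three levels, while the binary tree still has 125, so later splitters have far less data to show themselves in the multi-way tree.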

An important question for FINITE data sets is which method works better in a specific context. The multi-way split is a form of constraint, and there will be occasions on which a constrained tree (multi-way) outperforms an unconstrained one (binary). However, there will also be many circumstances in which the constrained tree is worse.

 | Dan Steinberg             | FAX (619) 543 8888              |
 | Salford Systems           | VOICE (619) 543-8880            |
 | 8880 Rio San Diego Dr     |                                 |
 | San Diego, CA 92120       |                                 |

