
DM: missing attribute values in classification trees


From: Tjen-Sien Lim
Date: Thu, 29 Oct 1998 13:56:30 -0500 (EST)
Hi, I'd like to get some advice from those of you who have analyzed
datasets with missing attribute values in supervised learning.

What's the "typical" proportion of cases with missing values? How large
does the proportion have to be before it causes problems?

We're conducting a project comparing classification tree classifiers
on datasets with missing values. We'd like to simulate
missing-at-random on datasets that contain no missing values to
increase the number of datasets. Should we simulate 5%, 10%, 20%, or
30% missing at random? Any preferred way to induce those missing
values?
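
To make the question concrete, here is a minimal sketch of one possible
way to induce the missing values. It is an assumption, not an established
procedure: each attribute value is deleted independently with the same
probability, which strictly speaking is missing completely at random. The
names induce_missing and fraction_of_cases_with_missing are hypothetical,
None stands in for a missing value, and class labels are assumed to be
stored separately so they are never deleted.

```python
import random


def induce_missing(rows, rate, seed=0, missing=None):
    """Return a copy of `rows` with attribute values deleted at random.

    Assumption: every attribute value is deleted independently with
    probability `rate`. `rows` holds attribute values only (no class
    labels), and `missing` is the marker written in place of a value.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    return [
        [missing if rng.random() < rate else value for value in row]
        for row in rows
    ]


def fraction_of_cases_with_missing(rows, missing=None):
    """Proportion of cases (rows) with at least one missing value."""
    return sum(any(v is missing for v in row) for row in rows) / len(rows)


# Hypothetical complete dataset: 1000 cases, 3 numeric attributes.
data = [[float(i), float(i % 7), float(i % 3)] for i in range(1000)]

for rate in (0.05, 0.10, 0.20, 0.30):
    masked = induce_missing(data, rate, seed=1)
    print(rate, fraction_of_cases_with_missing(masked))
```

Under this scheme the rate applies to individual values, not to cases.
With k attributes, the expected proportion of cases with at least one
missing value is 1 - (1 - rate)^k, so 10% per value over 3 attributes
already affects about 27% of cases.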

Thanks in advance for any advice/pointers/suggestions.

-- 
Tjen-Sien Lim                (608) 262-8181                        
Ph.D. candidate              limt@stat.wisc.edu                    
Dept. of Statistics          http://www.stat.wisc.edu/~limt
Univ. of Wisconsin-Madison
1210 West Dayton Street       
Madison, WI 53706


