
Re: DM: Data Mining in Small Databases


From: Megan Conklin
Date: Sat, 08 Jan 2000 12:30:37 -0500
I am relatively new to data mining and KDD as well, but I believe the
emphasis on large databases is because a lot of the original algorithms
designed to find patterns in data are just too slow to run on larger data
sets. A program relying on an O(n^2) algorithm can be all right on a small
dataset, but not on a large one.
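
As a rough illustration of that scaling, here is a minimal sketch that assumes an algorithm doing about n^2 steps for n records. The function name quadratic_steps and the two dataset sizes are hypothetical, chosen only to show the growth rate.

```python
def quadratic_steps(n: int) -> int:
    """Rough step count for an algorithm that does about n^2 work on n records."""
    return n * n


# Hypothetical sizes: a "small" and a "large" dataset.
small, large = 1_000, 1_000_000

# How much more work the large dataset needs under the n^2 assumption.
growth = quadratic_steps(large) // quadratic_steps(small)
print(f"{large // small}x more records -> {growth:,}x more steps")
```

Under this assumption, a thousandfold increase in records means a millionfold increase in work, which is why a run that finishes quickly on a small dataset may not finish in any useful time on a large one.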

For example, I am doing a large paper on clustering algorithms right now.
Clustering is an old technique used to find data elements which are
"similar" to each other in some way. And while there are tons of clustering
algorithms, and some of them are really old, a lot of them are simply
impractical for use on large databases. At the same time, as disk space
becomes cheaper, and data becomes easier to get (think: Internet),
databases just keep getting bigger.
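
To show where the cost comes from in a clustering setting, here is a minimal sketch of one brute-force step: finding the two most similar points by checking every pair. It assumes Euclidean distance as the similarity measure and is not taken from any particular published algorithm. The function name closest_pair and the sample points are hypothetical.

```python
import math
from itertools import combinations


def closest_pair(points):
    """Return ((distance, i, j), evaluations) for the two closest points.

    Every pair is checked, so n points cost n * (n - 1) / 2 distance
    evaluations.
    """
    best = (math.inf, -1, -1)
    evaluations = 0
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        evaluations += 1
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        if d < best[0]:
            best = (d, i, j)
    return best, evaluations


# Hypothetical sample data: four 2-D points.
points = [(0.0, 0.0), (5.0, 5.0), (0.0, 1.0), (9.0, 9.0)]
(best_d, i, j), evaluations = closest_pair(points)
print(f"closest: points {i} and {j}, distance {best_d}, "
      f"{evaluations} distance evaluations")
```

Four points need only 6 distance evaluations, but the count grows with the square of the number of points, and a naive method that repeats this search after every merge multiplies the cost again.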

So a lot of the algorithms have to be rethought to handle larger data. In
my opinion, this is why so much research (especially the newer research)
focuses on larger data sets.

-megan conklin
Nova Southeastern University
PhD student (computer science)

At 04:46 PM 1/7/00 +0200, Bostjan Brumen wrote:

 >Hi!
 >
 >I've been doing some research on Data Mining and have come into the
 >twilight zone: why is everybody talking only about "large" databases?
 >What about "small" databases - don't they have anything valuable inside?
 >Don't they hide nuggets, useful patterns?
 >
 >And, nobody (best to my knowledge) has come up with a definition of
 >"small" and "large" - not in terms of bits and bytes, but something more
 >persistent to the change.
 >
 >If you have an opinion about the themes I outlined in the questions
 >please drop me a note. I will appreciate your comments.
 >
 >Best,
 >Bostjan Brumen
 >




