Re: DM: Data Mining in Small Databases
From: Megan Conklin
Date: Sat, 08 Jan 2000 12:30:37 -0500

I am relatively new to data mining and KDD as well, but I believe the emphasis on large databases is because a lot of the original algorithms designed to find patterns in data are just too slow to run on larger data sets. A program relying on an O(n^2) algorithm can be all right on a small dataset, but not on a large one.

For example, I am doing a large paper on clustering algorithms right now. Clustering is an old technique used to find data elements which are "similar" to each other in some way. And while there are tons of clustering algorithms, and some of them are really old, a lot of them are simply impractical for use on large databases.

At the same time, as disk space becomes cheaper, and data becomes easier to get (think: Internet), databases just keep getting bigger. So a lot of the algorithms have to be rethought to handle larger data. In my opinion, this is why so much research (especially the newer research) is on larger data sets.

-megan conklin
Nova Southeastern University
PhD student (computer science)

At 04:46 PM 1/7/00 +0200, Bostjan Brumen wrote:
>Hi!
>
>I've been doing some research on Data Mining and have come into the
>twilight zone: why is everybody talking only about "large" databases?
>What about "small" databases - don't they have anything valuable inside?
>Don't they hide nuggets, useful patterns?
>
>And, nobody (best to my knowledge) has come up with a definition of
>"small" and "large" - not in terms of bits and bytes, but something more
>persistent to the change.
>
>If you have an opinion about the themes I outlined in the questions
>please drop me a note. I will appreciate your comments.
>
>Best,
>Bostjan Brumen
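
To put a rough number on the O(n^2) point made in the reply, here is a minimal Python sketch. It assumes a brute-force approach that compares every record with every other record, which is how many naive similarity or clustering procedures start; the function names and the one-dimensional toy data are hypothetical, chosen only for illustration, and are not taken from any particular algorithm discussed in the thread.

```python
from itertools import combinations


def pairwise_comparisons(n):
    """Number of distinct pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def closest_pair(values):
    """Brute-force search for the two closest 1-D values.

    Compares every pair once, so the work grows quadratically with
    the number of values. Returns (best_pair, comparisons_made).
    """
    best = None
    count = 0
    for a, b in combinations(values, 2):
        count += 1
        if best is None or abs(a - b) < abs(best[0] - best[1]):
            best = (a, b)
    return best, count


if __name__ == "__main__":
    # Growing the data 1000x grows the pair count roughly 1,000,000x.
    for n in (1_000, 1_000_000):
        print(n, "records ->", pairwise_comparisons(n), "pairs")
```

A thousand records need about half a million comparisons, which is trivial; a million records need about half a trillion, which is why an approach that is fine on a small database can become impractical on a large one.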