
DM: RE: Why is Singular Vector Decomposition for OLS?


From: Cunningham, Scott W
Date: Thu, 9 Apr 1998 13:26:43 -0400 (EDT)
Krishnadas,

Singular value decomposition DOES have clear advantages over the plain
inverse(X'X) X'Y, or "normal equations," approach.  The reason has to do
with the so-called "multicollinearity" problem.  When two independent
variables are closely correlated, it is very difficult to accurately
assess the correct regression parameters for each: the variance of the
estimates is inflated.
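
As a rough illustration of how quickly that variance grows: in a
regression with exactly two predictors whose correlation is r, the
variance of each coefficient estimate is multiplied by 1 / (1 - r^2)
relative to the uncorrelated case (the "variance inflation factor").  The
function name below is a hypothetical helper, not something from any
particular package.

```python
def variance_inflation_factor(r):
    """Variance multiplier for either slope in a two-predictor regression
    whose predictors have correlation r (standard result: 1 / (1 - r^2))."""
    if abs(r) >= 1.0:
        # Exact linear dependence: X'X is singular, so no finite variance.
        raise ValueError("predictors are exactly collinear")
    return 1.0 / (1.0 - r * r)


for r in (0.0, 0.9, 0.99, 0.999):
    print(f"r = {r:<6} variance multiplier = {variance_inflation_factor(r):.1f}")
```

The multiplier is 1 for uncorrelated predictors and grows without bound
as r approaches 1.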


In fact, some regression packages (such as Excel 97) will REFUSE to
perform regressions when two variables are completely linearly dependent.
Statistically, there is nothing wrong with the procedure.  In large
datasets (where you can't eyeball the data for linear dependency), the
refusal of the software to perform the regression is a major difficulty.
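
When the data are too large to inspect by eye, the singular values of X
can do the looking.  The sketch below assumes NumPy is available; the
function name `dependency_report`, the tolerance `tol`, and the synthetic
data are all illustrative choices, not part of any package mentioned
above.

```python
import numpy as np


def dependency_report(X, tol=1e-10):
    """Count the numerically independent columns of X via its singular values."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values, largest first
    rank = int(np.sum(s > tol * s[0]))      # values above the relative cutoff
    return {"columns": X.shape[1], "rank": rank, "deficient": rank < X.shape[1]}


rng = np.random.default_rng(1)
A = rng.normal(size=(1000, 4))
# Hypothetical fifth column that is an exact combination of two others.
X = np.column_stack([A, A[:, 0] - 3.0 * A[:, 2]])
report = dependency_report(X)
print(report)
```

A report with `deficient` set means at least one column is (numerically)
a linear combination of the others, which is the situation in which a
normal-equations solver fails.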

Contrast this with singular value decomposition (SVD), as applied to
ordinary least squares.  SVD finds linear combinations of the variables
such that the resulting "eigenvectors" (strictly speaking, singular
vectors) are mutually orthogonal, and therefore linearly independent.
Then, when linear regression is performed, it is algorithmically very
clear which of these vectors are responsible for which percentage of the
original variance.  The regression parameters on these vectors are then
converted back into regression parameters on the original variables, and
the output is returned to the user.
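
The steps above can be sketched as follows.  This is a minimal sketch
assuming NumPy, not the routine from any particular package: the function
name `svd_least_squares`, the cutoff `rcond`, and the synthetic data are
illustrative assumptions.  Directions whose singular value falls below
the cutoff are dropped rather than inverted.

```python
import numpy as np


def svd_least_squares(X, y, rcond=1e-10):
    """Least-squares coefficients via SVD, dropping near-null directions."""
    # X = U @ diag(s) @ Vt, with orthonormal columns in U and rows in Vt.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    keep = s > rcond * s[0]
    # Regression on the orthogonal directions: one coefficient per direction.
    safe_s = np.where(keep, s, 1.0)          # avoid dividing by ~0
    z = np.where(keep, (U.T @ y) / safe_s, 0.0)
    # Convert back to coefficients on the original variables.
    beta = Vt.T @ z
    return beta, s


rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = 2.0 * x1                                # exactly dependent on x1
X = np.column_stack([np.ones(50), x1, x2])
y = 1.0 + 3.0 * x1 + rng.normal(scale=0.1, size=50)

beta, s = svd_least_squares(X, y)
print("singular values:", s)
print("coefficients:   ", beta)
```

With x2 an exact multiple of x1, the individual coefficients on x1 and x2
are not identifiable; this sketch returns the minimum-norm choice among
all coefficient vectors that give the same fitted values, where a
normal-equations solver would fail on the singular X'X.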

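Both procedures return regression estimates.  The difference:  SVD
regression estimates are much more tightly bounded when the variables are
nearly collinear, because the near-zero singular values that inflate the
variance can be identified and set aside.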
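
To make the comparison concrete, here is the standard algebra, written
with symbols that are introduced only for this note: U, V and the
singular values s_1, ..., s_p come from the SVD of X (assumed to have
full column rank so that the inverse exists), u_j and v_j are the columns
of U and V, and sigma^2 is the error variance under the usual OLS
assumptions.

```latex
\begin{align*}
X &= U \,\mathrm{diag}(s_1,\dots,s_p)\, V', \\
\hat{\beta} &= (X'X)^{-1}X'Y = \sum_{j=1}^{p} \frac{u_j'Y}{s_j}\, v_j, \\
\operatorname{Var}(\hat{\beta}) &= \sigma^2 (X'X)^{-1}
  = \sigma^2 \sum_{j=1}^{p} \frac{v_j v_j'}{s_j^{2}} .
\end{align*}
```

A singular value s_j close to zero makes its term dominate the variance.
Treating 1/s_j as zero for such terms removes the blow-up at the cost of
a small bias, which is the remedy Numerical Recipes describes.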

An excellent book on the topic, which discusses "scientific computing"
rather than "statistics," is:

Press, et al. (1992). Numerical Recipes in C:  The Art of Scientific
Computing, Second Edition.  Cambridge University Press: Cambridge. 

It has a number of algorithms in C that are of special interest to data
miners.  There are versions of the book for other languages, including
Fortran.

Best wishes,


Scott Cunningham, D.Phil.
NCR Corporation 
Human Interface Technology Center


        -----Original Message-----
        From:   Krishnadas [SMTP:ckkrish@cyberspace.org]
        Sent:   Thursday, April 09, 1998 9:55 AM
        To:     Datamining Mailing List
        Subject:        DM: Why is Singular Vector Decomposition for OLS?


        Hello,

        Since SVD is used widely for OLS, I guess it has clear advantages
        over plain Inverse(X'X) X'Y.  Can anyone tell me about it?  Any good
        books or references on the motivation for SVD and the application of
        other matrix decompositions in statistics?

        Thanks.

          -- Krishnadas

        
        -----------------------------------------------------------------
        C. K. Krishnadas                c k krish at cyberspace dot o r g
        ckkrish@cyberspace.org          http://www.cyberspace.org/~ckkrish
        na.kck@na-net.ornl.gov
        -----------------------------------------------------------------
        



Copyright © 1998 Nautilus Systems, Inc. All Rights Reserved.
Email: nautilus-info@nautilus-systems.com
Mail converted by MHonArc 2.2.0