
- PLS_Toolbox FAQ

…or not the categorical variables are complementary (e.g., just inverses of each other). The key to understanding what is useful and what isn't is how categorical variables are encoded as a y-block for classification in PLSDA. For a single categorical variable, each class is encoded into a separate true/false column in the y-block:

    A    1 0
    A    1 0
    A    1 0
    B    0 1
    B    0 1
    B    0 1

Column 1 is "Is A"; column 2 is "Is B". Thus the following two-level categorical variables, if combined, would be redundant and provide a trivial solution:

    A C    1 0 1 0
    A C    1 0 1 0
    A C    1 0 1 0
    B D    0 1 0 1
    B D    0 1 0 1
    B D    0 1 0 1

Using only one of these categories would give you the same answer as using both. If you have two non-trivially different categorical variables (still two levels each), you can encode these similarly, creating a four-column y-block:

    A C    1 0 1 0
    A D    1 0 0 1
    A C    1 0 1 0
    B D    0 1 0 1
    B C    0 1 1 0
    B D    0 1 0 1

If you wanted to create this y-block in PLS_Toolbox, you would use the command-line function class2logical with each of the separate categorical variables and combine the results:

    y = [class2logical(cat_AB) class2logical(cat_CD)];

NOTE: PLSDA in the Analysis GUI automatically handles converting classes into logical y-blocks. It is better to use this automatic management rather than a hand-constructed y-block. However, notice that the first two y columns are orthogonal to the second two y columns. This is a
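A minimal numeric sketch of this encoding, using a hypothetical numpy stand-in for class2logical (not the PLS_Toolbox function itself; data and variable names are illustrative):

```python
import numpy as np

def class_to_logical(classes):
    """One-hot encode a categorical variable into a logical y-block:
    one true/false column per class level (hypothetical stand-in for
    PLS_Toolbox's class2logical)."""
    levels = sorted(set(classes))
    return np.array([[1 if c == lev else 0 for lev in levels] for c in classes])

cat_ab = ["A", "A", "A", "B", "B", "B"]
cat_cd = ["C", "D", "C", "D", "C", "D"]  # non-trivially different from cat_ab

# Combine the two encodings into a four-column y-block, as in the FAQ table.
y = np.hstack([class_to_logical(cat_ab), class_to_logical(cat_cd)])
```

Note that if cat_cd were just the inverse of cat_ab, columns 3 and 4 would duplicate columns 2 and 1, and only one variable would be needed.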

Original URL path: http://eigenvector.com/faq/index.php?id=96 (2016-04-27)

Open archived version from archive - PLS_Toolbox FAQ

Issue: Can you give me more information on the R-Squared statistic?

Possible Solutions: R-Squared (R²) is an assessment of how well the model does the prediction; it is similar to RMSEC except that it doesn't show if there is a bias. You can access the R² by right-clicking on a scores plot of predicted vs. measured; it is one of the items which show up in the information box ("Show on figure" puts it on the figure). Note: in other software, R² is for the MODELED data only. In PLS_Toolbox we calculate it for the DISPLAYED data. That means that if you show excluded data, or if you show predicted test data with calibration data ("Show Cal with Test"), the R² will be for what is shown and will be different from the calibration data. Turn off the "Show Cal with Test" checkbox on the Plot Controls window to view the R² for only the test data. R² is calculated as the square of the correlation coefficient between the X and Y axes plotted in the figure. If the only data shown is the estimation
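A minimal sketch of that calculation: R² as the squared correlation coefficient between the two plotted axes (the measured and predicted values below are made-up illustrative numbers):

```python
import numpy as np

# Made-up measured vs. predicted values for a predicted-vs-measured plot.
measured  = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
predicted = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# R^2 = square of the correlation coefficient between the plotted axes.
r = np.corrcoef(measured, predicted)[0, 1]
r_squared = r ** 2
```

Because this R² is a squared correlation, adding a constant offset to the predictions leaves it unchanged, which is why, unlike RMSEC, it does not reveal a bias.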

Original URL path: http://eigenvector.com/faq/index.php?id=68 (2016-04-27)


Issue: Convergence of PARAFAC: how much variation between models is expected when a particular PARAFAC model is fit multiple times with the same settings?

Possible Solutions: Correctly converged models can vary in the loadings (e.g., permutation of components), but the fit should be exactly the same (e.g., as expressed by the sum of the squared residuals). If repeatedly fitted models are not identical in
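A toy illustration of why permuted components still give an identical fit (this is only the trilinear PARAFAC reconstruction with random made-up factors, not a PARAFAC fitting routine):

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up two-component factor matrices for a 4 x 5 x 3 array.
A, B, C = rng.normal(size=(4, 2)), rng.normal(size=(5, 2)), rng.normal(size=(3, 2))

def reconstruct(A, B, C):
    # Trilinear PARAFAC structure: X_ijk = sum_r A_ir * B_jr * C_kr
    return np.einsum('ir,jr,kr->ijk', A, B, C)

X = reconstruct(A, B, C) + 0.01 * rng.normal(size=(4, 5, 3))

# Permuting the components in all modes changes the loadings but not the
# reconstruction, so the sum of squared residuals is identical.
perm = [1, 0]
ssq1 = np.sum((X - reconstruct(A, B, C)) ** 2)
ssq2 = np.sum((X - reconstruct(A[:, perm], B[:, perm], C[:, perm])) ** 2)
```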

Original URL path: http://eigenvector.com/faq/index.php?id=125 (2016-04-27)


Issue: Does the software stop working if my maintenance expires?

Possible Solutions: Many of our software products come with a maintenance agreement. The details of this service are given on the Maintenance Agreement Page. If your maintenance expires, the software does not stop working. Expired maintenance means you no longer have free access to new

Original URL path: http://eigenvector.com/faq/index.php?id=168 (2016-04-27)


Issue: How and where do I report a problem with PLS_Toolbox?

Possible Solutions: See our documentation wiki support page

Original URL path: http://eigenvector.com/faq/index.php?id=32 (2016-04-27)


…values R2Y and Q2Y are reported for regression models. The R2Y value is equivalent to the y-block cumulative variance captured, as reported in the 5th column of the variance captured table or the detail.ssq field of a model. The Q2Y value is analogous to R2Y except it is based on the cross-validated results. It is inversely proportional to the RMSECV values according to this equation (but only if the y-block is mean-centered before the model is built):

    Q2Y = 1 - (m * RMSECV^2) / sum_i (y_i - ybar)^2

where RMSECV is the root mean square error of cross-validation, m is the number of samples, y_i is the actual (aka measured) y value for sample i, and ybar is the mean of the measured y values. R2Y and Q2Y represent fractions of variance captured, while the cumulative variance captured table and detail.ssq field represent percentages; they are identical except for a factor-of-100 difference between fraction and percentage. Given a PLS model named m which used only mean-centering or autoscaling on the y-block, the following code calculates Q2Y:

    incl = m.detail.include{1,2};
    y    = m.detail.data{2}.data(incl);
    my   = length(incl);
    Q2Y  = 1 - m.rmsecv.^2*my ./ sum(mncn(y).^2);

The practical aspects of these statistics
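A minimal numeric sketch of the Q2Y equation, assuming a mean-centered y-block as the text requires (the y values and cross-validated predictions are made up for illustration):

```python
import numpy as np

# Made-up measured y and cross-validated predictions (each y_cv value is the
# prediction for that sample when it was left out of the calibration set).
y    = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y_cv = np.array([2.3, 3.8, 6.4, 7.6, 10.2])
m    = len(y)

rmsecv = np.sqrt(np.mean((y - y_cv) ** 2))

# Q2Y = 1 - (m * RMSECV^2) / sum_i (y_i - ybar)^2
q2y = 1 - (m * rmsecv ** 2) / np.sum((y - y.mean()) ** 2)

# Equivalent form: 1 - PRESS / (total sum of squares about the mean),
# since m * RMSECV^2 is exactly the PRESS.
press = np.sum((y - y_cv) ** 2)
```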

Original URL path: http://eigenvector.com/faq/index.php?id=150 (2016-04-27)


…original data. In other words, this is how the original variables project into the normalized multivariate space of the model. To calculate the T contributions for a given sample in a PLS_Toolbox PCA model, use the tconcalc function. Given the sample's data in variable data and the model in variable model, the following will calculate T contributions:

    Tcon = tconcalc(data, model);

Note that if data is a matrix of all your data and you want only a single sample's T contributions, pass only that sample's row: data(row_number,:).

Numerical Calculation Details: To calculate the T contributions for the i-th sample, Tcon_i (in Matlab notation):

    Tcon_i = t_i * L * U';

where t_i is a row vector of the scores (size 1 x k) for a given sample, U is the transposed matrix of loadings (size n x k), and L is a diagonal matrix containing the inverse of the square root of the eigenvalues for the k components. For example, with a three-PC model, L would be:

    L = [ 1/λ1^(1/2)   0            0
          0            1/λ2^(1/2)   0
          0            0            1/λ3^(1/2) ]
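A minimal numeric sketch of this formula, building the PCA pieces by hand with an SVD rather than with tconcalc (random made-up data; variable names follow the text):

```python
import numpy as np

rng = np.random.default_rng(1)
X  = rng.normal(size=(20, 6))          # made-up data: 20 samples, 6 variables
Xc = X - X.mean(axis=0)                # mean-center before PCA

k = 3                                   # keep three PCs
u, s, vt = np.linalg.svd(Xc, full_matrices=False)
U   = vt[:k].T                          # loadings, size n x k
T   = Xc @ U                            # scores, size m x k
lam = (s[:k] ** 2) / (X.shape[0] - 1)   # eigenvalues of the k components
L   = np.diag(1.0 / np.sqrt(lam))       # diagonal inverse-sqrt-eigenvalue matrix

i = 0
t_i    = T[i]                           # 1 x k score vector for sample i
Tcon_i = t_i @ L @ U.T                  # Tcon_i = t_i * L * U', size 1 x n
```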

Original URL path: http://eigenvector.com/faq/index.php?id=47 (2016-04-27)


…side of the ROC figure comes from calculating the sensitivity and specificity for a given threshold value. Specificity is calculated as the fraction of not-in-class samples which are below the given threshold. Sensitivity is calculated as the fraction of in-class samples which are above the given threshold. These are empirical curves, in that they are calculated from the data directly and not from a model of the distribution of the data, so there will be some stepping. In fact, with smaller sample sizes the curves may NEVER be smooth, because sensitivity and specificity only change (up or down) when the threshold moves past a sample's predicted y value. For example, if the number of not-in-class samples above a threshold of 0.46 is no different than the number above 0.45, these two thresholds technically give the same specificity. As of version 3.5.4 of PLS_Toolbox, we actually calculate only "critical" thresholds (those that actually make a difference in the sensitivity and specificity curves) and interpolate between them. Even then, a multi-modal distribution of y predictions for either in- or out-of-class samples will lead to non-smooth curves. The cross-validated versions of the curves are determined by using the same procedure outlined above, except that we use the y value predicted for each sample when it was left out of the calibration set during cross-validation. One might assume that doing multiple replicate cross-validation subsets would lead to smoother cross-validation curves. Two things keep this from happening. First, before version 4.0 of PLS_Toolbox, the software doesn't actually average the predicted y values from multiple replicates; it only remembers the predicted y value from the LAST time a given sample was left out. Secondly,
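A minimal sketch of the empirical calculation (made-up predicted y values and class labels; not the PLS_Toolbox ROC code):

```python
import numpy as np

# Made-up predicted y values and in-class membership for eight samples.
y_pred   = np.array([0.9, 0.8, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1])
in_class = np.array([1,   1,   1,   0,    1,   0,   0,   0], dtype=bool)

def sens_spec(threshold):
    # Sensitivity: fraction of in-class samples above the threshold.
    sensitivity = np.mean(y_pred[in_class] > threshold)
    # Specificity: fraction of not-in-class samples below the threshold.
    specificity = np.mean(y_pred[~in_class] <= threshold)
    return sensitivity, specificity

# Any two thresholds that do not cross a sample's predicted y value give the
# same (sensitivity, specificity) pair, which is why only "critical"
# thresholds change the empirical curves.
```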

Original URL path: http://eigenvector.com/faq/index.php?id=69 (2016-04-27)
