# Questions tagged [caret]

Caret is an R package containing a set of functions that attempt to streamline the process of creating predictive models.

418
questions

**0**

votes

**2**answers

51 views

### In a neural network, why can't there be more weights than the number of observations?

After having this exact same issue with caret, I arrived at this thread. However, I do not intuitively understand why this answer is correct.
Why can't there be ...

**1**

vote

**1**answer

26 views

### Predicted probabilities seem too low with Gradient Boosting Machine on `iris` data

I'm doing a test run of the Gradient Boosting Machine algorithm on the iris data with the caret package.
...

**1**

vote

**1**answer

18 views

### What method does varImp.gam use in caret package in R?

I see that the caret package has support for gam objects for the varImp function. I was wondering if there was documentation about which method the function uses when gam is the input?

**0**

votes

**0**answers

29 views

### Result reproducibility using time series cross validation with Caret in R

I'm using the caret package in R to train a Lasso regression on a time series dataset. My problem is that even if I set a seed before the ...

**0**

votes

**0**answers

26 views

### Using caret::sbf to apply feature selection and classification

I'm aiming to use caret::sbf to filter a large number of predictors before using different machine learning models to predict a binary outcome. I would also like to optimise tuning parameters and do ...

**0**

votes

**0**answers

13 views

### Why does createDataPartition() split my dataset unevenly? [migrated]

This is what I did in R using Galton's data from the HistData package:
The code is:
...

**1**

vote

**2**answers

50 views

### Multi-class probabilities of Random Forest inside caret model

I'm facing a problem with the results of a multi-class random forest model.
I want to use a) the predictions of the model and b) the class probabilities of these predictions for further work.
I did a ...

**0**

votes

**0**answers

16 views

### Techniques to account for differences in misclassification “cost” on variables other than the outcome

Suppose you're in a classic classification context: you want to predict whether a patient has a certain virus. You are working in multiple regions (let's say 2 for simplicity: Region A and Region B) ...

**0**

votes

**0**answers

21 views

### R: caret (elasticnet): ridge regression: understanding the returned parameters

I wanted to play around with the ridge regression in caret (which apparently uses elasticnet), so I did two experiments:
use the original data
use the modified data where the values of ...

**1**

vote

**1**answer

37 views

### Machine learning with univariate time series

I am trying to make predictions with daily data with time series in R.
This time series is univariate and contains only daily sales data over a four-year period.
My intention is to ...

**0**

votes

**0**answers

14 views

### Can we compare class probabilities between different methods in caret?

I am using the caret package in R for binary classification and I want to compare different methods (e.g., random forest, SVM, ANN, PLS-DA...). I consider the class probabilities as a "certainty score"...

**0**

votes

**0**answers

12 views

### Infer estimation accuracy from the percentage of number of trees voting for the majority class?

I am performing a classification with caret in R. With the predict() function I can also get the percentage of trees voting for the majority class. In my case I have 4 possible classes ...

**0**

votes

**0**answers

62 views

### R: interpreting output of caret::train( ) with method glmStepAIC

I am having trouble understanding the output of caret::train( ) with method = "glmStepAIC". Here is the sample code on the ...

**0**

votes

**1**answer

115 views

### Knn using caret: how to specify k?

I'm using caret package to train a knn model with the following R code:
...

**1**

vote

**1**answer

34 views

### How can I separate the overall variable importance values when using Random forest?

I implemented a random forest model in R using the 'ranger' package through the 'caret' package with 10-fold CV. My outcome is binary (0,1) and I have a couple of numeric predictor variables. I used the ...

**0**

votes

**0**answers

21 views

### R: Random Forest with count-data: hand it over as a quantitative integer or an ordered factor?

I am using the "rf" method as implemented in the caret package for R.
In R you can differentiate between qualitative and quantitative variables. I am unsure how to view count-data. E.g. the number of ...

**0**

votes

**0**answers

19 views

### Questionnaire to predict physical activity - scale?

We defined a questionnaire which gives great ICC, correlation, BA-plots etc. compared to the questionnaires currently in use.
We want to improve it a little more, but I can't find literature/method for ...

**1**

vote

**0**answers

21 views

### Plot Actual vs Predicted SVM Regression

I am building an SVM regression model using caret package, however, I am not sure what is the best approach to plot predicted vs actual values. I have the code below. You can reproduce the output by ...

**2**

votes

**1**answer

43 views

### PCA in R: different results for caret and prcomp

Can someone tell me why the preProcess function from the caret package gives a different result than the ...

**2**

votes

**1**answer

23 views

### How do I set reasonable initial gridsearch parameters?

I'm training a number of models with the aim of identifying those which will perform well on my data. As such, I'm using a lot of models that I am unfamiliar with, and they all have their own tuning ...

**0**

votes

**1**answer

28 views

### Minimum number of obs. for machine learning and training/test sets?

Is there a minimum number of observations for ML techniques (classification, regression) in psychology/cognitive neuroscience? In particular for training and test datasets?
I found this article for ...

**-1**

votes

**1**answer

31 views

### R-Caret: Not-meaningful class probabilities and AUC value

I am very new to ML, so my question might be primitive. I am working on a binary-class problem. The response (target) variable is occurrence: a factor ...

**0**

votes

**0**answers

24 views

### How to use the caret package in R to tune the detailed structure of a neural network

I am new to ANN, and I am struggling with how to tune my ANN. My question is how can I determine which ANN structure (like how many hidden layers and nodes in each layer) is the best, or at least good ...

**0**

votes

**0**answers

112 views

### SVR feature selection

I'm trying to go through feature selection with SVR (through the caret package in R). I'm working on a dataset with 400+ points, 20+ features, and 2 target variables. Can I use correlation coefficient for ...

**2**

votes

**1**answer

328 views

### glmStepAIC model is doing better than other models

I am training a model on an imbalanced dataset (about 5-20% of positive class) and trying out different algorithms in R using caret package.
I have 57 predictors and around 2000-3000 observations in ...

**1**

vote

**1**answer

496 views

### R - xgbTree, xgbDart and gbm in Caret predict Small Range

I am currently facing an issue in my regression model. I have tried the models in the title (xgbTree, xgbDART, and gbm in caret), and they tend to predict a very small range for the output variable.
...

**1**

vote

**0**answers

75 views

### Data splitting with caret: Can we remove the ID column after folds are created?

We have a dependent sample (two observations for each participant). To prevent data from one participant being in both the training set and the unseen fold in cross-validation, we used the groupKFold ...

**0**

votes

**0**answers

21 views

### How does caret package choose the optimal parameters when there are several such values?

First Example
I fitted the following penalized logistic regression model using caret package as follows,
...

**1**

vote

**1**answer

233 views

### R caret classification - why doesn't model accuracy equal accuracy given by predict()?

I have a dataset with 1000 samples, and each sample is 1 of 3 classes. I'm training classifiers on the dataset and predicting classes (5-fold cross-validated) and I'd like to know how well each ...

**5**

votes

**1**answer

248 views

### How to interpret coefficients of a multinomial elastic net (glmnet) regression

I'm trying to model a membership in one of three well-being clusters (flourisher, normative, languisher) based on a set of predictors, using elastic net for both variable selection & modelling. I ...

**0**

votes

**1**answer

40 views

### How do I predict future scenarios after training and validating my model?

Problem
I'm new to machine learning and need a little activation energy to get me past this sticking point. I've trained/validated/tuned, and tested a random forest model. Therefore, I've used my ...

**0**

votes

**0**answers

75 views

### How to find the Prediction Intervals of a Gaussian Process Regression via caret kernlab packages?

I am trying to use a Gaussian Process Regression (GPR) model to predict hourly streamflow discharges in a river. I've got good results applying the caret::kernlab train() function.
Since the ...

**0**

votes

**0**answers

267 views

### How to improve specificity with unbalanced data? (R caret package)

I am working on a classification problem where my outcome variable is either "Approved" or "Denied". The % of approvals in my dataset is roughly 60% and the denials make up roughly 30%. I have tried ...

**0**

votes

**0**answers

224 views

### caret() and glmnet() give different coefficients

I have understood from this post (http://stackoverflow.com/questions/48653465/r-coefficients-from-glmnet-and-caret-are-different-for-the-same-lambda) that caret() and glmnet() may not use the same ...

**1**

vote

**1**answer

53 views

### Predictions based on k-fold Cross Validation, which model is used (Caret)

I am sorry if there is an obvious or intuitive answer to this, which I missed.
We have tuned the hyperparameters of an RF using grouped 10-fold CV (repeated 5 times), to obtain the values for mtry ...

**0**

votes

**1**answer

400 views

### Grouped 7-fold Cross Validation in R

I am searching for a grouped 7-fold cross validation function. I couldn't find it in the caret package.
I have 70 subjects performing 7 trials (Outcome variable: categorical with 7 values) = 490 ...

**0**

votes

**0**answers

83 views

### R-Caret, Regression, different number of PCA components for finalModel and resampling with PCA?

In my project I train models with the "Timeslice" method from the caret package but this question also fits in with other methods, such as cross-validation.
Imagine you have 2584 records and split ...

**0**

votes

**0**answers

9 views

### Any project on iterative evaluation of dimensionality reduction and model selection strategies?

Caret and Scikit-learn offer a great many alternatives for various steps in machine learning. Is there any project that aims at trying all (or most) available alternatives in these packages (or other ...

**1**

vote

**1**answer

60 views

### Testing set accuracy by using cross validation using xgboost with caret

I am working on an xgboost model using caret. I'm using cross validation, but don't know if I'm understanding it correctly. As I understand, it creates multiple training and test sets. Does this mean ...

**0**

votes

**0**answers

85 views

### Is the use of Nested Cross Validation and train-test CV necessary or overkill?

I have been relatively obsessed lately with the proper way of selecting a model (including tuning hyperparameters) and then assessing model performance.
I have read various posts and the approach I ...

**1**

vote

**1**answer

98 views

### How does caret resolve ties in the KNN classification? [closed]

I have a multi-class classification problem (4 classes), for which I'm using the caret package's k-nearest-neighbour classifier, which means that an odd number for k won't prevent classification ties.
So ...

**1**

vote

**0**answers

26 views

### How is this standard error obtained?

I am working through the exercises in Kuhn and Johnson's "Applied Predictive Modelling" and cannot reproduce one of their results in the exercises.
Looking at 4.3 we have
... find the number of ...

**0**

votes

**0**answers

215 views

### Can we calculate Variable Importance in Projection (VIP) scores for PLS-DA in R-caret? Is it comparable with coefficients?

Can we calculate Variable Importance in Projection (VIP) scores for PLS-DA in caret (R)? Are VIP scores comparable with PLS-DA coefficients?

**1**

vote

**0**answers

27 views

### How to select the most important features, categorical & numerical data

I need to find out which factors are relevant when predicting low birth weight. My model looks like this:
...

**0**

votes

**0**answers

85 views

### Identifying important variables in a PLSDA model using caret in R: are coefficients standardized?

I am doing a PLSDA using the caret package in R. My objective is to predict a status of a cow (0 vs 1) using spectral data. I want to compare the coefficients to know which spectral points contribute ...

**1**

vote

**0**answers

60 views

### Caret rfe varImp: scaled variable importance for rfe results [closed]

I want to plot the scaled variable importance of an rfe object (recursive feature elimination). With the following code I compute the rfe model and the variable importance:
...

**0**

votes

**0**answers

31 views

### When to use which classification model?

This is something that continues to give me trouble.
Assuming I am working to extract a classification from a dataset and assuming I have the computing resources to do the necessary calculations (in ...

**0**

votes

**0**answers

85 views

### How does the train function in caret choose lambda for elastic net?

I'm a beginner in elastic net. I'm using the following code for elastic net in R
...

**0**

votes

**1**answer

55 views

### Correct calculation of repeated cross-validation classification metrics

We can obtain a resampled estimate of training set classification accuracy from caret::confusionMatrix.train(model)
e.g.,
...

**1**

vote

**1**answer

66 views

### How do I avoid time leakage in my KNN model?

I am building a KNN model to predict housing prices. I'll go through my data and my model and then my problem.
Data -
...