Questions tagged [boosting]

A family of algorithms combining weakly predictive models into a strongly predictive model. The most common approach is called gradient boosting, and the most commonly used weak models are classification/regression trees.

0 votes · 1 answer · 11 views

How do you interpret your features when you standardize your data?

Let's say I have built a boosting tree or neural network and I standardized my features beforehand. When I built my model, I split my data into training, validation, and test sets - each with their ...
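A pattern worth noting here (a minimal sketch with an invented single "age"-like feature): fit the scaler on the training split only, and keep its statistics so that effects measured in standardized units can be mapped back to raw units; one scaled unit is one training-set standard deviation.

```python
# Sketch: fit the scaler on the training split only, then map
# standardized units back to raw units for interpretation.
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(loc=40, scale=12, size=(100, 1))  # invented "age"-like feature
X_test = rng.normal(loc=40, scale=12, size=(20, 1))

scaler = StandardScaler().fit(X_train)   # statistics come from the training split only
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)      # reuse the same statistics

# One scaled unit equals one training-set standard deviation of the feature:
print(f"1 scaled unit = {scaler.scale_[0]:.2f} raw units (mean {scaler.mean_[0]:.2f})")
```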
0 votes · 0 answers · 14 views

XGBoost classifier returning different predict_proba and predict (with output_margin=True) results [closed]

It's my understanding that for an XGBoost classifier with objective='multi:softprob', the output of ...
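For reference, with objective='multi:softprob' the raw scores from predict(..., output_margin=True) should map onto predict_proba via a per-row softmax; a minimal sketch on synthetic data (sklearn wrapper assumed):

```python
import numpy as np
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=300, n_classes=3, n_informative=5, random_state=0)
clf = XGBClassifier(objective="multi:softprob", n_estimators=20).fit(X, y)

proba = clf.predict_proba(X)
margins = clf.predict(X, output_margin=True)             # raw per-class scores
softmax = np.exp(margins) / np.exp(margins).sum(axis=1, keepdims=True)
print(np.allclose(proba, softmax, atol=1e-6))            # expected: True
```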
0 votes · 0 answers · 13 views

Regression with zero-inflated outcome

I am trying to fit and tune a gradient boosting regression model where my target variable is zero-inflated (80% zeros) and the remaining values are distributed as positive and negative values (not ...
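One common way to handle this is a two-part (hurdle) model: a classifier for P(y ≠ 0) combined with a regressor fitted on the nonzero cases. A sketch on synthetic data (everything here is illustrative, not a tuned recipe):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.where(rng.random(1000) < 0.8, 0.0, rng.normal(size=1000))  # ~80% zeros

nonzero = y != 0
clf = GradientBoostingClassifier().fit(X, nonzero)             # models P(y != 0)
reg = GradientBoostingRegressor().fit(X[nonzero], y[nonzero])  # models E[y | y != 0]

y_hat = clf.predict_proba(X)[:, 1] * reg.predict(X)            # combined expectation
```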
0 votes · 0 answers · 8 views

Gradient Boosting vs Forward Stagewise Additive Model

Given that the famous AdaBoost and Gradient Boosting are both some kind of approximation to Forward Stagewise Additive Modeling, why not directly fit a model using Forward Stagewise Additive Modeling? On ...
0 votes · 0 answers · 10 views

Why does gradient boosting use a first-order Taylor expansion approximation?

The target of boosting at step $m$ is (see Wikipedia): $$F_{m}(x)=F_{m-1}(x)+\underset{h_{m} \in \mathcal{H}}{\arg \min }\left[\sum_{i=1}^{n} L\left(y_{i}, F_{m-1}\left(x_{i}\right)+h_{m}\left(x_{i}\right)\right)\right]$$
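For context, the first-order Taylor expansion that turns this argmin into a gradient-fitting step is $$L\bigl(y_i, F_{m-1}(x_i) + h_m(x_i)\bigr) \approx L\bigl(y_i, F_{m-1}(x_i)\bigr) + g_i\, h_m(x_i), \qquad g_i = \left.\frac{\partial L(y_i, F)}{\partial F}\right|_{F=F_{m-1}(x_i)},$$ so minimizing over $h_m$ is approximated by fitting $h_m$ to the negative gradients $-g_i$.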
1 vote · 1 answer · 26 views

Predicted probabilities seem too low with Gradient Boosting Machine on `iris` data

I'm doing a test run of the Gradient Boosting Machine algorithm on the iris data with the caret package. ...
0 votes · 0 answers · 8 views

Combine CatBoost with deep learning classifier

I'm using CatBoost to solve a binary classification problem. Most of my features are binary, but the order of features does matter. I've come up with a Recurrent Neural Network that has similar ...
0 votes · 0 answers · 14 views

Spark Gradient Boosted Tree gives predicted probabilities wildly different from actual probabilities

In your experience of using GBT (in Spark or in general) for binary classification, have you encountered predicted probabilities very different from the actual probabilities? Train and test have the same ...
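When GBT scores diverge from observed frequencies, one standard remedy is to recalibrate on held-out data. A sketch with scikit-learn rather than Spark (illustrative only; cv="prefit" assumes an already-fitted base model and is deprecated in the newest scikit-learn releases):

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)

base = GradientBoostingClassifier().fit(X_tr, y_tr)
# Fit a sigmoid (Platt) calibration layer on held-out data
calibrated = CalibratedClassifierCV(base, method="sigmoid", cv="prefit").fit(X_cal, y_cal)
proba = calibrated.predict_proba(X_cal)[:, 1]   # calibrated probabilities
```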
0 votes · 0 answers · 5 views

Splits in Decision Trees vs Dendrograms

Gradient boosting is a supervised learning algorithm that splits/grows decision trees to improve predictions iteratively. Hierarchical clustering is an unsupervised learning algorithm that splits/...
1 vote · 0 answers · 22 views

Question about [The Elements of Statistical Learning], page 357 [closed]

Here is the book link: http://web.stanford.edu/~hastie/Papers/ESLII.pdf. I am very confused about the statement here: I am familiar with CART and gradient boosting machines, but I have no idea what we ...
0 votes · 0 answers · 17 views

How to deal with unbalanced time series data for machine learning?

My understanding when it comes to unbalanced datasets is that we can randomly sample from the dominant class. What are some ways to deal with unbalanced data when we have time series data and the ...
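One option that preserves temporal order, sketched below on synthetic data: reweight classes inside time-ordered folds instead of randomly resampling, which would break the time structure.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import TimeSeriesSplit
from sklearn.utils.class_weight import compute_sample_weight

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (rng.random(500) < 0.1).astype(int)      # rare positive class

for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    w = compute_sample_weight("balanced", y[train_idx])   # upweight the rare class
    model = GradientBoostingClassifier().fit(X[train_idx], y[train_idx], sample_weight=w)
```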
2 votes · 1 answer · 37 views

How does LightGBM deal with incremental learning (and concept drift)?

From some research I found that it updates the leaves (it does not create new ones or remove old ones); is that right? How does this happen? Another question is when incremental learning is done in concept ...
1 vote · 1 answer · 37 views

Steps in gradient boosting algorithm

Can someone please explain step 2(c) in the gradient boosting algorithm below? I was under the impression that the 2(c) computation is nothing but the mean of the corresponding terminal node ...
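For reference, step 2(c) is the terminal-node update $$\gamma_{jm} = \arg\min_{\gamma} \sum_{x_i \in R_{jm}} L\bigl(y_i,\; F_{m-1}(x_i) + \gamma\bigr),$$ which equals the mean residual of node $R_{jm}$ only in the special case of squared-error loss; for other losses it is a one-step line search per node.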
0 votes · 0 answers · 18 views

How is the first tree in a gradient boosting classifier constructed, and what is its split criterion? [duplicate]

I am aware of how GB classifiers are constructed as regression trees and how predictions are made, but I am not sure how the initial tree and its node splitting are done. Can someone please explain how the ...
2 votes · 1 answer · 30 views

Performance drops when adding a feature using XGBoost

I did some feature engineering with my data set. When I added one of the new features, the performance dropped significantly. How is this possible? I thought XGBoost was robust to irrelevant variables.
2 votes · 0 answers · 27 views

Low OOB error but high CV error with MABoost

I am using Mirror Ascent Boosting (R package maboost) to learn a 3-class predictor over a set of 123 patients (very small, I know). Classes are almost balanced. I am getting excellent OOB errors (...
1 vote · 1 answer · 38 views

Int vs. float in regression modeling

This is a general question to understand a concept. I have a dataframe in which all columns have float values (precision varies from 2 to 8 digits). I use GBM to train my model. When I train my model ...
0 votes · 0 answers · 18 views

How to compare regression-based feature selection algorithms with tree-based algorithms?

I'm trying to compare which feature selection model is more efficient for a specific domain. Nowadays the state of the art in this domain (GWAS) is regression-based algorithms (LR, LMM, SAIGE, etc.), ...
1 vote · 1 answer · 52 views

CatBoost does not overfit - how is that possible?

I'm fitting and evaluating a CatBoostRegressor and an XGBRegressor on the same regression problem. I tried matching their ...
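One thing worth checking in such a comparison is that both models are evaluated at their early-stopped best iteration; a CatBoost sketch with illustrative parameter values:

```python
from catboost import CatBoostRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = CatBoostRegressor(iterations=2000, learning_rate=0.05, verbose=False)
model.fit(X_tr, y_tr, eval_set=(X_val, y_val),
          early_stopping_rounds=100, use_best_model=True)
print(model.get_best_iteration())   # the iteration actually used for prediction
```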
0 votes · 0 answers · 16 views

Termination Condition for AdaBoost.R2

I can't quite wrap my head around the termination condition of AdaBoost.R2 as defined by Drucker in this paper. On page 2 of the paper he states to "repeat the following while the average loss* $\bar{...
1 vote · 0 answers · 32 views

Decision tree - alternative model to predict this data?

My data looks something like this (for example): ...
0 votes · 1 answer · 53 views

What is minimized/optimized when we use AdaBoost?

When I learned about CART, we learned that at each split, we try to minimize some measure (usually Gini index) of the split. That is, we determine the predictor and threshold that decreases the Gini ...
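For context, AdaBoost can be read as forward stagewise minimization of the exponential loss, $$(\beta_m, G_m) = \arg\min_{\beta, G} \sum_{i=1}^{N} \exp\bigl(-y_i\,(f_{m-1}(x_i) + \beta\, G(x_i))\bigr),$$ (ESL, Section 10.4); the Gini index appears only inside the weak learner, when the base classifier happens to be a tree.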
1 vote · 1 answer · 68 views

How do you interpret prediction output from gbm() in R for a classification problem?

I created a model using the gbm() function in library(gbm). Within the gbm() function, I set the distribution as "adaboost". I have a binary response [0, 1]. I used the predict.gbm function for ...
0 votes · 0 answers · 16 views

How to increase accuracy and reduce overfitting in XGBoost [duplicate]

I am doing a multi-classification problem. I got 95% accuracy on the validation data set, but on the test data set I got 25% accuracy, and when submitting my predictions I got a 75% score. Please help me figure out how to fix ...
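A standard first defense against a large train/test gap is early stopping on a held-out validation set; a sketch with illustrative parameter values (early_stopping_rounds as a constructor argument assumes a recent xgboost version):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_classes=3, n_informative=6, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

clf = XGBClassifier(n_estimators=1000, learning_rate=0.05,
                    max_depth=4, subsample=0.8, colsample_bytree=0.8,
                    early_stopping_rounds=50)
clf.fit(X_tr, y_tr, eval_set=[(X_val, y_val)], verbose=False)
print(clf.best_iteration)   # number of rounds actually kept
```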
2 votes · 1 answer · 53 views

Combining XGBoost and LightGBM

I'm working on a text classification problem and I am comparing LightGBM and XGBoost performance. On both train and test sets I get roughly the same accuracy metrics, but what looks curious to me is ...
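A simple way to combine the two, sketched on synthetic data: average (or weight-average) their predicted probabilities, with the weight tuned on validation data.

```python
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, random_state=0)
xgb_clf = XGBClassifier(n_estimators=100).fit(X, y)
lgbm_clf = LGBMClassifier(n_estimators=100).fit(X, y)

# Equal-weight blend of the two probability estimates
p_blend = (xgb_clf.predict_proba(X) + lgbm_clf.predict_proba(X)) / 2
y_pred = np.argmax(p_blend, axis=1)
```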
0 votes · 0 answers · 19 views

Categorical encoding on test set using h2o (R)

I trained a GBM with the following parameters ...
0 votes · 0 answers · 28 views

Using the test population as an eval_set when doing hyperparameter optimization

I'm looking at this guide for hyperparameter optimization of boosting regressors using hyperopt. I noticed that for each trial, it uses the following code for the ...
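Using the test set as eval_set leaks it into model selection; a safer pattern is a three-way split, sketched here with illustrative sizes:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

X, y = make_regression(n_samples=1000, random_state=0)
# Split off the test set first; it is touched only once, at the very end
X_tr, X_test, y_tr, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# A separate validation set drives early stopping / hyperparameter search
X_fit, X_val, y_fit, y_val = train_test_split(X_tr, y_tr, test_size=0.25, random_state=0)

model = XGBRegressor(n_estimators=500, early_stopping_rounds=25)
model.fit(X_fit, y_fit, eval_set=[(X_val, y_val)], verbose=False)
print(model.score(X_test, y_test))   # reported once, after tuning
```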
1 vote · 1 answer · 41 views

What is the difference between Gini index and Gini coefficient?

I am building a decision tree from scratch. I have been using entropy so far (calculated this way): ...
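For comparison, the Gini index (impurity) used for tree splits is distinct from the Gini coefficient used for inequality or ranking measures; minimal versions of the two node criteria:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gini_impurity(labels):
    """Gini index (impurity) used for splits: 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(entropy([0, 0, 1, 1]), gini_impurity([0, 0, 1, 1]))  # 1.0 0.5
```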
1 vote · 0 answers · 24 views

“Jumping” among several interpolation techniques?

I am comparing several interpolation methods using monthly climatic data, through RMSE and a 10-fold cross-validation scheme. What I'm observing is that the performances vary from one month to ...
1 vote · 1 answer · 33 views

The cumulative tree structure in tree-based gradient boosting

I'm playing with gradient boosting methods and the Python packages out there. I tried LightGBM, starting with a high-dimensional input for a prediction task. A left ...
0 votes · 0 answers · 19 views

What's the split criterion used by CatBoost?

I'm trying to understand the split criterion used by CatBoost in the "plain" boosting mode (I'm not interested in the "ordered" mode complication). In "Algorithm 2 - Building a tree" they are saying that ...
0 votes · 0 answers · 14 views

Required sample size for establishing equivalence of a gradient boosting model on a different population

I have a trained gradient boosted trees (regression) model with a given R² metric (obtained via cross-validation). Now I want to verify that the same model is valid for a very different population. Is ...
3 votes · 1 answer · 76 views

Spelling out a detail in the gradient boosting machine algorithm for binary classification

This is a very long question, but perhaps people who are trying to deeply understand the Gradient Boosting Machine algorithm will think it's interesting. I've been working on understanding the ...
0 votes · 0 answers · 23 views

Which is the best classification algorithm to use for finding the "second-best class"?

I have a dataframe containing skillsets of players in different positions. I can build a classification problem for predicting the position of a player based on the skillsets. However, the problem ...
0 votes · 0 answers · 23 views

Estimate distribution from mean and prediction intervals

I'm using an ML-model (gradient boosting) to predict mean, upper and lower quantiles of a target variable which is gamma distributed. I want to construct distributions for the predictions and figured ...
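One way to back out gamma parameters from a predicted mean and quantiles is to solve for the shape and scale that reproduce them; a sketch with made-up prediction values:

```python
import numpy as np
from scipy import optimize, stats

pred_mean, pred_q10, pred_q90 = 5.0, 2.0, 9.0   # illustrative model outputs

def residuals(params):
    shape, scale = params
    return [shape * scale - pred_mean,                            # gamma mean = k * theta
            stats.gamma.ppf(0.1, shape, scale=scale) - pred_q10,
            stats.gamma.ppf(0.9, shape, scale=scale) - pred_q90]

sol = optimize.least_squares(residuals, x0=[2.0, 2.0], bounds=(1e-6, np.inf))
shape, scale = sol.x   # best-fitting gamma parameters (least-squares compromise)
```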
1 vote · 1 answer · 39 views

Comparison of regression models in terms of the importance of variables

I would like to compare models (multiple regression, LASSO, Ridge, GBM) in terms of the importance of variables. But I'm not sure if the procedure is correct, because the values obtained are not on ...
3 votes · 1 answer · 53 views

How to avoid overfitting in an XGBoost model

I am trying to classify data from a dataset of 35K data points and 12 features. First I divided the data into train and test sets for cross-validation. After cross-validation I built an XGBoost ...
1 vote · 0 answers · 38 views

Gradient boosting (GB) splitting methods (categorical features)

Regarding categorical features: ordinary trees treat categorical features in two main ways. CART considers only binary splits, computing the mean response value (y_mean_i) per category ...
0 votes · 0 answers · 25 views

How does using decision stumps lead to an additive model?

In chapter 8 of ISLR it says boosting using stumps leads to an additive model. How would I derive $$f(X) = \sum^p_{j=1} f_j(X_j)$$ from $$\hat{f}(x) = \sum^B_{b=1} \lambda \hat{f}^b(x)$$?
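The key observation is that each stump splits on exactly one variable, so the boosted sum can be regrouped by split variable: $$\hat{f}(x) = \sum_{b=1}^{B} \lambda \hat{f}^b(x) = \sum_{j=1}^{p} \sum_{b:\ \text{stump } b \text{ splits on } X_j} \lambda \hat{f}^b(x_j) =: \sum_{j=1}^{p} f_j(X_j),$$ since a stump that splits on $X_j$ is a function of $x_j$ alone.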
0 votes · 0 answers · 39 views

Why is the step length by default equal to 1 in gradient boosting?

On ESL p. 359, it explains steepest descent. But in (10.37), it is trying to minimize the distance to $g_{im}$. It looks like the default step length is 1. Why is that?
0 votes · 0 answers · 89 views

Tuning threshold from multiclass ROC for Gradient Boosting Classifier?

I have created a ROC curve based on the output of a multiclass Gradient Boosting Classifier (See Figure below implemented from Yellowbrick ROCAUC: http://www.scikit-yb.org/en/latest/api/classifier/...
0 votes · 0 answers · 12 views

What does “a distribution is consistent with a hypothesis class” mean?

What does "a distribution is consistent with a hypothesis class" mean? I came across the following statement in this pdf To see this, first note that for every distribution $P$ consistent with $...
3 votes · 0 answers · 24 views

When should one use Bradley-Terry instead of gradient boosted trees for pairwise ranking

Both the Bradley-Terry model and gradient boosted trees can be used to learn a ranking from pairwise comparisons (e.g. with the libraries choix and XGBoost). How do they relate to each other? Is there ...
3 votes · 1 answer · 76 views

XGBoost objective function derivation algebra

I need some help please with the derivation of xgboost objective function. I am following this online tutorial (Math behind GBM and XGBoost) How do you go from here $$ loss = \sum_{i=1}^{n} \left( ...
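For orientation, the usual second-order expansion in that derivation, with $g_i$ and $h_i$ the first and second derivatives of the loss at the previous round's prediction, is $$\mathcal{L}^{(t)} \approx \sum_{i=1}^{n}\left[g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2\right] + \Omega(f_t), \qquad w_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda},$$ where $I_j$ indexes the instances in leaf $j$ and $w_j^{*}$ is the optimal leaf weight.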
2 votes · 2 answers · 161 views

Overfitting in extreme gradient boosting

My situation is: 36,197 observations / 125 outcomes in the training data; 26 predictors. A relatively successful prediction model has been built on a similar dataset using just logistic regression; I ...
2 votes · 1 answer · 31 views

Calculate minimum accuracy for a boosting algorithm

Suppose you are working on a binary classification problem, and there are 3 models, each with 70% accuracy. If you want to ensemble these models using majority voting, what will be the minimum ...
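A worked bound: let $q_k$ be the fraction of instances on which exactly $k$ of the three models err; each model errs on 30% of instances, so $$q_1 + 2q_2 + 3q_3 = 3 \times 0.3 = 0.9 \;\Rightarrow\; 2(q_2 + q_3) \le 0.9 \;\Rightarrow\; q_2 + q_3 \le 0.45.$$ The majority vote errs exactly when at least two models err, so its accuracy is at least $1 - 0.45 = 55\%$, attained when the three pairwise error overlaps are disjoint 15% slices.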
2 votes · 1 answer · 53 views

weak learning of 3-piece classifiers using decision stumps

I have a question about Example 10.1 in Shalev-Shwartz and Ben-David's "Understanding Machine Learning." The example means to illustrate weak learning of 3-piece classifiers $\mathcal H$ using ...
4 votes · 0 answers · 58 views

How to explain that a random forest doesn't learn at all, while logistic regression learns very well?

My prediction task is as follows: use names to predict people's ethnicity (into 4 categories: "English", "French", "Chinese", and "All others") as a multiclass classification problem. The name ...
1 vote · 0 answers · 53 views

Calculate Gini Importance for Boosting Trees

From my understanding, Gini Importance means Mean Decrease in MSE for regression objectives, and Mean Decrease in Impurity for classification objectives. Typical random forest packages like ...
0 votes · 0 answers · 10 views

What's a good error measurement when trying to predict values inside two bands?

I am using gradient boosting to predict two quantiles (upper and lower). The predicted value can be above, below, or in bounds. The problem I am facing is that counting the number of values in bound ...
