Questions tagged [regression]

Techniques for analyzing the relationship between one (or more) "dependent" variables and "independent" variables.

0
votes
1answer
13 views

sum of the x's and the sample mean issues

Quick simple question as I must have missed the explanation. Why $$\sum_{j=1}^n (x_j - \bar{x}) = \sum_{j=1}^n x_j - n\bar{x} = (n\bar{x}- n\bar{x})$$ I understand why $\bar{x}$ turns out to be $n\...
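The identity in this excerpt can be checked numerically. The following is a minimal sketch, assuming an arbitrary illustrative sample (the values are hypothetical, not taken from the question): since $\bar{x} = \frac{1}{n}\sum_j x_j$, the sum of the $x_j$ equals $n\bar{x}$, and subtracting $\bar{x}$ from each of the $n$ terms removes exactly $n\bar{x}$.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
x = np.array([2.0, 3.5, 5.0, 7.5, 11.0])
n = x.size
x_bar = x.mean()

# By definition x_bar = sum(x) / n, so sum(x) equals n * x_bar.
sum_x = x.sum()
n_x_bar = n * x_bar

# Subtracting x_bar from each of the n terms removes n * x_bar in total,
# so the deviations from the mean sum to zero (up to rounding error).
sum_dev = (x - x_bar).sum()

print(sum_x, n_x_bar, sum_dev)
```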
0
votes
1answer
5 views

Is there any formal guideline that would indicate the necessity for adjustment for baseline when analysing change from baseline?

As in the topic. I saw many critical discussions along with mathematical explanation on why the baseline shouldn't (or mustn't) be employed as a covariate, when analysing the change from baseline. ...
1
vote
1answer
26 views

Regression - Interpretation of coefficients and probability

I am very confused about the output of my regressions. First of all, I am not even sure if I could divide my sample as I did, meaning that by subsampling as I did the variable ESG score is both ...
0
votes
0answers
17 views

Using the STAN math library [closed]

I would like to use a Matern Covariance Function for Gaussian process regression in STAN. (Through RStan) The standard exponential covariance function works without issues ...
1
vote
0answers
9 views

Compare linear regression models for same variables but different data

I have created a linear regression model for height and weight using UK data, and want to compare this with the height and weight relationship of other countries. What would be an appropriate method ...
2
votes
0answers
8 views

When do I have to correct for multiple comparisons when computing different multiple regression models?

I've got a problem with my thesis. I use a set of 3 blood biomarkers plus age as a covariate to predict Marker of Functional Connectivity on the one hand and Structural Connectivity on the other hand. ...
5
votes
1answer
26 views

Calculating Diagonal Elements of $(X^TX)^{-1}$ From R Output

With $X$ being the design matrix, calculate the diagonal elements of the matrix $(X^TX)^{-1}$ using only the R output. I found the diagonal elements to be $$\frac{1}{n SSX} \bigg[n,\sum X_{i1}^2, \...
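One standard relation that may be relevant here, sketched under assumptions: for OLS, $\widehat{\mathrm{Var}}(\hat\beta) = \hat\sigma^2 (X^TX)^{-1}$, so each diagonal element equals the squared ratio of a coefficient's standard error to the residual standard error. The excerpt does not show which R output is available, so the sketch assumes a `summary(lm(...))`-style output with a "Std. Error" column and a "Residual standard error", and uses simulated (hypothetical) data in Python to check the relation.

```python
import numpy as np

# Simulated data; the design and coefficients are hypothetical.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

# OLS fit and the two quantities a regression summary typically reports.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
p = X.shape[1]
sigma_hat = np.sqrt(resid @ resid / (n - p))      # residual standard error
XtX_inv = np.linalg.inv(X.T @ X)
std_err = sigma_hat * np.sqrt(np.diag(XtX_inv))   # coefficient standard errors

# Recover diag((X^T X)^{-1}) from the reported quantities alone.
diag_from_output = (std_err / sigma_hat) ** 2
print(diag_from_output)
```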
0
votes
0answers
18 views

How to compare change in regression

I am trying to determine the effect of a program on a single measurable value over time. I am a computer programmer with only a very elementary understanding of statistics, so please forgive me for ...
0
votes
0answers
11 views

Proof for sigma caret squared = (sum of residuals squared)/ (N-1) [duplicate]

The general proof is that sigma caret squared = (sum of residuals squared)/ (N-k), where k is the number of parameters in the regression equation. I am unable to prove for the regression equation: y=...
0
votes
0answers
7 views

Identical IV and Moderator in two different models - Problematic? (e.g. endogeneity / causality)

I’m currently working on two research papers. I want to make sure that they do not stand in conflict with each other (methodologically). Both papers are based on multiple linear regressions (OLS). ...
0
votes
0answers
6 views

How to factor in the effect of the ongoing Covid-19 situation in a sales volume prediction model?

I have a regression model (log-log) which predicts the volume in the presence of variables like price and other promotion causals. Now given the current situation how do we factor in the effect of ...
1
vote
0answers
11 views

Regression equation given a joint distribution

Let $X$ and $Y$ be two random variables with joint probability density function $f(x,y)$ = $ \left\{ \begin{array}{ll} 1 & -y\leq x\leq y , & 0 < y <1\\ 0 & ...
0
votes
0answers
11 views

How can I tell if I have a spurious regression for panel data?

I have a strange case. In a model N=150, T=17 (i.e. 150 countries observed across 17 years) I run the model in which I regress GDP against some variables. I get R-2=0.95 and all independent variables ...
0
votes
0answers
14 views

Is a regression model the best model to use for predicting stock prices based on other stocks considering the multicollinearity problem?

I'm just learning how to use Sklearn and Python for data analysis. One of the projects I'm working on is trying to predict the price of a stock over a short time span (a few minutes) based on the ...
0
votes
1answer
15 views

Partial derivative of a linear regression with correlated predictors

Let's set up the situation of having some $Y$ that I think depends on a linear combination of $X_1$ and $X_2$. I could fit a regression model: $$y_i = \beta_0 + \beta_1x_{i1} + \beta_2x_{i2}$$ We ...
0
votes
0answers
15 views

Chow test or simple t-test for the interaction term in regression?

If I want to check whether a specific regression coefficient differs between two samples, let us say men and women, I believe the right approach is to combine the samples, add a men/women dummy, and ...
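The pooled approach described in this excerpt can be sketched as follows, assuming simulated data (all values and group labels below are hypothetical). With a group dummy and a dummy-by-predictor interaction, the interaction coefficient equals the difference between the slopes of the two separate regressions; note that a t-test on it in the pooled model additionally assumes a common error variance across groups.

```python
import numpy as np

# Simulated two-group data; coefficients are hypothetical.
rng = np.random.default_rng(1)
n = 40
x_m, x_w = rng.normal(size=n), rng.normal(size=n)
y_m = 1.0 + 2.0 * x_m + rng.normal(size=n)
y_w = 0.5 + 1.2 * x_w + rng.normal(size=n)

def ols(X, y):
    """Return least-squares coefficients for design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Separate regressions per group: columns are [intercept, slope].
b_m = ols(np.column_stack([np.ones(n), x_m]), y_m)
b_w = ols(np.column_stack([np.ones(n), x_w]), y_w)

# Pooled regression with a group dummy (1 = first group) and interaction.
x = np.concatenate([x_m, x_w])
y = np.concatenate([y_m, y_w])
d = np.concatenate([np.ones(n), np.zeros(n)])
b = ols(np.column_stack([np.ones(2 * n), x, d, d * x]), y)

interaction = b[3]  # difference in slopes between the two groups
print(interaction, b_m[1] - b_w[1])
```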
0
votes
0answers
8 views

How to regress variables in quantiles in a Fixed-effects model? (one large model or multiple separate ones)

For my thesis I examine whether generating renewable energy influences a company's cost of capital (beta, cost of debt & leverage). In this, I split my variables into 3 quantiles of Firm size by ...
0
votes
0answers
10 views

How does an unbalanced nuisance factor influence my outcomes

I am retrospectively analyzing a specialized dataset of low sample size with only four subjects. All of the subjects' dependent variables were recorded in three conditions (levels: A, B, C) within ...
1
vote
1answer
22 views

How should I use interaction variables to compare two logistic regressions?

I am working on creating a predictive model using logistic regressions. I am hoping to compare two different populations, using the same set of variables but different data sets with different sample ...
1
vote
0answers
29 views

linear regression with log transform

I have (time-series) data consisting of 15 points: (1,0.3555),(2,0.2739),(3,0.2379),(4,0.2254),(5,0.1826),(6,0.1715),(7,0.1549),(8,0.1314),(9,0.1452),(10,0.1148),(11,0.1051),(12,0.0996),(13,0.0941),(14,0.101),(...
1
vote
0answers
15 views

Regression for a non-representative sample

Can I use regression analysis for data which are not representative of the whole population? I want to describe relations between Y and Xs but I have no ambition whatsoever to extrapolate beyond this ...
2
votes
1answer
18 views

Inverse Regression vs Reverse Regression

I'm aware there's a great number of questions which deal with the mathematical difference between the two, but I'm still confused as to best practice. Basically I'm looking at a situation where we ...
0
votes
0answers
13 views

Impute a Continuous Predictor for which 0 or the median is not an option

I have a dataframe of the following patients: PatientID Days.To.Develop.Symptoms 1 0 2 1 3 3 4 NA ...
0
votes
1answer
19 views

How to interpret a third-order regression?

I read some questions about this subject, but I couldn't find an answer. I'm having trouble interpreting the practical effect of the polynomial predictor variable on the response variable. My model ...
2
votes
0answers
22 views

Normalizing logistic regression probabilities to fixed number

I have run a logistic regression model on a target variable and get a list of probabilities like [0.50, 0.30, 0.20, 0.10, 0.05, 0.05, 0.01]. For the target situation, I know that there are always ...
0
votes
0answers
5 views

Gaussian Process Regression for High Dimensional Data: Vanishing predicted y values

I am working on a dataset with roughly 196 dimensions. I have been trying to fit Gaussian Process Regression to this dataset but it does not perform well. Mathematically speaking, I found out that ...
0
votes
0answers
7 views

Interpretation of Johansen co-integration test results

If the Max-eigenvalue test indicates no cointegration at the 0.05 level, while the Trace test indicates 2 cointegrating eqn(s) at the 0.05 level, how do I interpret the results?
1
vote
1answer
22 views

Can we express the following unconditional probability as follows?

Some of you may be aware that I have been asking a nagging question for quite a while on this forum, in different shapes and forms. Although I may have been a nuisance, may I thank you as this has ...
0
votes
0answers
7 views

Understanding the false discovery rate and positive selection rate

I want to know the relationship with EBIC and the FDR and PSR of some GLM, and what these two formulas mean? EBIC is defined as $$ EBIC(s) = -2\ell(\hat{\beta}(s)) + v(s)log(n)+2v(s)\gamma\...
0
votes
1answer
30 views

Regression analysis: Log-transformation to meet assumptions?

For my master's thesis I'm exploring the relationship between attitude towards the advertisement (Aad), brand types (boutiques and high street) and willingness to recommend (willing or not). ...
1
vote
1answer
17 views

How to plot density for repeated k-fold cross validation?

Long story short, I conducted regression using repeated k-fold cross validation. While messing around I decided to plot the density of the R-squared distribution for the resampling. Obviously there ...
3
votes
1answer
25 views

A regression model for ratio of two Binomial success probabilities

There are two series of observations of two Binomial trials. Observation $i$ of series 1 includes $n_{1i}$, the number of Bernoulli draws and $\overline{p}_{1i}$, the ratio between number of success ...
0
votes
1answer
38 views

Interpretation of linear regression output when one categorical variable is represented as several dummies

I have a question regarding the interpretation of a linear regression output. In my data I have one independent categorical variable (condition) with five values which I represented as four dummy ...
0
votes
1answer
12 views

Bayesian regression kernel

I am reading Bishops book "pattern recognition and machine learning". In chapter 3 "linear models for regression" section 3.3.3 "equivalent kernel" equation 3.63 on page 160 is given as follows: $$...
0
votes
0answers
19 views

when the DV ranges from +1 to -1, which regression model?

I would like to ask which regression model I have to use when the DV ranges from +1 to -1. One of the IVs also ranges from +1 to -1. The other IV ranges from 0 to 1. These are all continuous variables but ...
0
votes
1answer
25 views

Hypothesis Test on the Difference between two random vectors

Each of my vectors consists of beta estimates for two separate models of the same data and the same number of explanatory variables. The question is asking whether the difference between these two ...
2
votes
1answer
35 views

How can you do regression when two groups of variables sum to each other?

Suppose I have a model like this: $$ y = \beta_1x_1 + \beta_2x_2 + \beta_3z_1 +\beta_4z_2 + \epsilon $$ where $\epsilon$ is noise. It so happens that $$x_1 + x_2 = z_1 + z_2$$ but there is no ...
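The constraint in this excerpt makes the four regressors linearly dependent, which can be illustrated with a short sketch on simulated (hypothetical) data: the design matrix with columns $x_1, x_2, z_1, z_2$ has rank 3 rather than 4, so the four coefficients are not separately identified by OLS without a further restriction.

```python
import numpy as np

# Simulated regressors; z2 is forced to satisfy x1 + x2 = z1 + z2.
rng = np.random.default_rng(2)
n = 100
x1, x2, z1 = rng.normal(size=(3, n))
z2 = x1 + x2 - z1

# Design matrix without an intercept, matching the model in the excerpt.
X = np.column_stack([x1, x2, z1, z2])

# One exact linear dependency among four columns leaves rank 3.
rank = np.linalg.matrix_rank(X)
print(rank)
```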
2
votes
1answer
20 views

How do we obtain the probability density of a truncated regression with an upper and lower bound

I know my density for $y$ is supposed to be something of this form $$g(y|x_{i},t)=\frac{f(y|x'\beta, \sigma^{2})}{F(t|x'\beta, \sigma^{2})}$$ where the numerator is the density of the normal ...
2
votes
1answer
21 views

Propensity scores and linearity in logistic regression

I know (and have read in other posts) that logistic regression isn't the only way to calculate propensity scores. But if you do want to use logistic regression for that, must you then check the ...
1
vote
0answers
16 views

Is it valid to use company/year combinations for (financial) data analysis?

I want to do a multiple regression analysis on some financial data. Sadly I only have 30 observations (the 30 different companies) and I would like to have more. My friend asked me why I don't add the ...
0
votes
0answers
19 views

Whether to cap the dependent variable while treating the outliers?

So I am trying to run a linear regression model in R where the objective is to identify what's driving the credit card spends including both primary and secondary. I have a dataset with 10000 obs I ...
-1
votes
0answers
10 views

Logistic regression in R: Deviance is much higher for one variable than for another [closed]

When I run a logistic regression in R, and I run the model using the anova function anova(MODEL0), two of the variables are much higher than the others; they come up as significant (barely). Any ...
1
vote
1answer
18 views

Expectation of Linear Regression Coefficients

Let the entity ${\widehat{\boldsymbol\beta}}$ be a linear estimator (not necessarily the least squares estimator) of the true coefficient ${{\boldsymbol\beta}}$ in the regression of $\mathbf{y}$ on $\mathbf{X}$. In this ...
0
votes
1answer
20 views

Treatment of main terms when adding interaction terms to linear regression

Suppose I have a model Y~A+B, where only A is significant (low p-value). After adding an interaction term to form a new model Y~A+B+(AB), B remains not significant but the interaction term (AB) is ...
0
votes
0answers
18 views

Nested Variables - Insignificant estimate. What's Next?

Note: for more background about nested variables please see: this link. I have 2 variables (one nested in the other) and I am trying to model them in R. The variables are as follows: ...
0
votes
1answer
39 views

Interpret loadings in PCA Regression

Reading about PCR I've found many people claiming that PCR is not so good since it doesn't allow you to evaluate the final loading over each base factors. I can't really understand where this problem ...
0
votes
0answers
27 views

Linear regression with empty cells for variable

The following graph has been obtained using a repeated measures general linear model. As can be seen from the table, there are a fair number of empty cells for the variable educational level. ...
0
votes
1answer
31 views

What to do/tests to run when population size is unknown [closed]

Shall I stick to descriptive statistics? Everything I know about making inferences about a population seems pretty redundant. Feel free to roast me. I'm working with some non-experimental cross-...
0
votes
2answers
65 views

Statistical illusion? What's statistically happening, when regression analysis results get significant only with all predictors and interaction term?

I have a research question where the hypothesis (derived from theory) postulates that the relationship between predictor X and outcome Y is moderated by W (X+W+X*W -> Y). All variables are sums of ...
0
votes
1answer
18 views

Calculate mean of dependent variable from a given linear regression model based on given values of independent variables in R

I have built an interactive model to predict child.iq with mother.age and mother.highschool: ...
