All Questions

1
vote
0answers
3 views

User-based vs Clustering-based Collaborative Filtering

Reading about recommender systems in this blog, I found that KNN (k-nearest neighbors) can be used for user-item (user-based) collaborative filtering to find similar users. But in another category of ...
0
votes
0answers
3 views

Confusion on scikit-learn nested cross validation example

There are a ton of threads on nested cross-validation. "An intuitive understanding of each fold of a nested cross validation for parameter/model tuning" gives a good explanation. scikit-learn has an ...
0
votes
0answers
5 views

LOOCV in Caret package (randomForest example) - not unique results

Here are my doubts: as far as I know, there is only one way to perform LOOCV for a model (i.e. testing each of the N elements against the model trained on the other N-1 elements). Namely, ...
0
votes
0answers
9 views

How does this solution relate to the actual real-life stats problem?

4.62 in Newbold (8th ed.): A new warehouse is being designed and a decision concerning the number of loading docks is required. There are two models based on truck-arrival assumptions for the ...
0
votes
0answers
17 views

Statistical Model used for predicting number of deaths due to COVID-19 [closed]

What statistical model is used to predict number of deaths due to COVID-19? Suppose I have a dataset with number of deaths for last two months and would like to predict number of deaths for next month?...
1
vote
0answers
7 views

How to analyze repeated measures when condition changes at each time point?

I have a dataset from a repeated measures experiment that I am trying to analyze. The experiment had 4 possible conditions. Participants were measured on 1 condition and then again on a second ...
0
votes
0answers
8 views

Monte Carlo integration and mixture distribution

How could I estimate an integral using Monte Carlo method when I have a mixture distribution? For example I want to estimate the below integral: $I=\int_{0}^{1}f(x)dx$ And my distribution is: $p(x)=...
0
votes
0answers
7 views

Mixed Effects Model (3 level model?)

Consider the following problem. The dataset that I am considering has $n=1800$ units (high-end copying machines). Label the units $i = 1,\dots,n$. Unit $i$ has $n_i$ recordings. It is of interest to ...
0
votes
0answers
3 views

How can I plot fitted lines of a mixed model with a large number of clusters? [migrated]

I am working with a longitudinal data set with measurements of growth of several hundred individuals. I am using the lme4 and pedigreemm R packages for the modelling part. I would like to plot the ...
0
votes
1answer
12 views

sum of the x's and the sample mean issues

Quick simple question, as I must have missed the explanation. Why is $$\sum_{j=1}^n (x_j - \bar{x}) = \sum_{j=1}^n x_j - n\bar{x} = (n\bar{x}- n\bar{x})$$ I understand why $\bar{x}$ turns out to be $n\...
0
votes
0answers
5 views

Cox frailty model in R

I ran a Cox frailty model (model 1) in R. After adding a new covariate to the model (model 2), the AIC decreases, which suggests the new model is better than model 1, but the variance of the random effect of model 2 is ...
1
vote
0answers
14 views

Iterative solution to Gamma distribution MLE problem

I'm trying to follow the derivation for the MLE parameters of the gamma distribution in [1]. The standard approach is to derive an expression for the log likelihood, differentiate with respect to ...
0
votes
1answer
5 views

Is there any formal guideline that would indicate the necessity for adjustment for baseline when analysing change from baseline?

As in the topic. I saw many critical discussions along with mathematical explanation on why the baseline shouldn't (or mustn't) be employed as a covariate, when analysing the change from baseline. ...
2
votes
0answers
6 views

Formula of the Chebyshev's inequality for an asymmetric interval

The formula for Chebyshev's inequality for the asymmetric two-sided case is: $$\Pr( l < X < h ) \ge \frac{ 4 [ ( \mu - l )( h - \mu ) - \sigma^2 ] }{ ( h - l )^2 } \ .$$ What I don't understand ...
0
votes
0answers
8 views

What is the ACF plot of $x_t = 0.9 x_{t-2} + w_t$

I am just learning time series, and I am wondering about the following AR(2) model: $x_t = 0.9 x_{t-2} + w_t, w_t \sim N(0, \sigma_w^2)$ Please show me the plot of its Autocorrelation Function, or ...
0
votes
0answers
5 views

How to calculate correlation coefficient and AIC for non-linear estimation in R or Statistica?

I need to compare two non-linear models of growth. The first one is calculated with nls function in R and with non-linear estimation function in Statistica - both programs gave identical results and ...
0
votes
0answers
10 views

R: read.csv imports my numeric columns with lots of missing as NULL, how to prevent? [closed]

My data has 63 columns. One column, 'hours', has more than 50% missing values and is converted to NULL on import. But the column is very important and needs to be used after cleaning....
0
votes
0answers
3 views

KL divergence of categorical distribution with continuous inputs

I want to simulate a process. I have a probability distribution and I have d classes to choose from. The inputs of my distribution are 3d points and it maps each of these points to a d-dimensional ...
0
votes
0answers
6 views

Virtual seminars and workshops in Statistics and/or Machine Learning [closed]

I was wondering if there are any webinars in Statistics or Machine Learning one could join through Zoom during these bizarre times. I know that economists have a list of online resources here: http:...
1
vote
0answers
19 views

No significant p values after multiple comparison of 126 tests

I am wondering if there are any other reasonable ways to adjust for multiple comparisons when you have such a large number of tests. I have a study with 126 brain regions being scanned in a group (N=20)...
0
votes
0answers
7 views

Properties Of Bivariate Distribution Function [closed]

I am having trouble with the problem below. Actually, I have no idea how to solve it. I tried to apply the 4 properties of a bivariate distribution function, but I couldn't. I have derived to ...
0
votes
0answers
3 views

mixed models vs anova for within-subject design

I know this topic may have been handled before and I apologise for being so lazy in checking all the other posts. Here's my concern: I have a group of babies (so everything is within-subjects) whose ...
0
votes
0answers
5 views

AIC model selection for group studies

In some areas, it is common to fit a model separately to multiple clusters in a data set, for instance fitting a cognitive model separately to data from each participant in an experiment. Model ...
0
votes
0answers
6 views

Is there a way to normalize my data of multiple groups to use as a random effect or to incorporate it into my model

I have Retention Efficiency (RE) percentages of 5 food types for sponges. RE is found by ((incurrent food - excurrent food)/incurrent food). I am running a generalized mixed effect model to see ...
0
votes
0answers
6 views

variation partitioning with a GAMM model including an auto-correlation structure in R

I would like to undertake variation partitioning in a GAM framework in R, as described here: http://r.789695.n4.nabble.com/variance-explained-by-each-term-in-a-GAM-td836513.html However, my gam ...
1
vote
1answer
26 views

Regression - Interpretation of coefficients and probability

I am very confused about the output of my regressions. First of all, I am not even sure if I could divide my sample as I did, meaning that by subsampling as I did the variable ESG score is both ...
0
votes
0answers
9 views

Estimating ratio of two PDFs where one of them is noisy

I have a list $L_1$ of positive integers, such as $[1, 2, 1, 3, 10, ...]$. There are repetitions. From this list, I sample (with repetition) according to some method (not relevant to my question), and ...
0
votes
0answers
17 views

Using the STAN math library [closed]

I would like to use a Matern covariance function for Gaussian process regression in STAN (through RStan). The standard exponential covariance function works without issues ...
4
votes
2answers
31 views

Why does component-wise median not make sense in higher dimensions?

I would like to compute the median of a higher-dimensional point set by computing the component-wise median for each individual dimension. The point that consists of the medians of each individual ...
0
votes
0answers
14 views

When and when not to use an A/A test?

I'm curious about the circumstances under which an A/A test is appropriate vs. when it is not. Here is my current understanding: A/A testing is an empirical method, and the point is to ...
0
votes
1answer
15 views

Can a random variable be expressed as a sum of deterministic and random variable?

Say we have a sequence of random variables $\{X_t:t\geq 0\}$ following an unknown stochastic process with distribution $X_t\sim N(\mu_X,\sigma_X^2)$. This idea came to me from the additive noise model....
0
votes
0answers
9 views

Compare linear regression models for same variables but different data

I have created a linear regression model for height and weight using UK data, and want to compare this with the height and weight relationship of other countries. What would be an appropriate method ...
2
votes
1answer
19 views

Combining information from multiple distributions

I have 13 classes. For each class, I have a different distribution: e.g. For each distribution, the y-axis indicates the probability and the x-axis indicates a count value. Given some input data, I ...
0
votes
0answers
8 views

graph convolution network

I am trying to understand papers and lectures on graph convolution networks but whenever I open some paper, I get lost on the very first page. I started with some videos like this and this and papers ...
0
votes
1answer
11 views

Custom metrics for multiclass classification when class errors have different weights

I have a multiclass classification problem (e.g. the target variable has 4 different outcomes: Product A, Product B, Product C and NO Product). Not all the errors are equal: for example, if the ...
1
vote
0answers
7 views

How to test Multinomial Logistic Regression assumption in R

So I'm currently trying to use a multinomial logistic regression model in R on a data set with 13 variables (mix of continuous and categorical) and 33,000 observations, where the dependent variable ...
1
vote
0answers
7 views

First-difference and lags

I am a newbie to time-series econometrics. I want to estimate a model for the association between greenhouse gas emissions and new green technologies. The estimation equation I want to use is $$CO_{t} = ...
0
votes
0answers
13 views

Choosing a rotation method for ESEM

I am trying to decide on which oblique rotation method to use for my ESEM analysis (with MLR estimator). MPLUS provides a number of options (GEOMIN, QUARTIMIN, OBLIMIN, CRAWFER, etc.). I was wondering ...
0
votes
0answers
6 views

Linking Correlated Dependent Variable with Independent Variable

I have a Monte Carlo model that generates a distribution of possibilities $X_i$ for the non-normal stochastic process $Z$ it describes. The distribution of $X$ and $Z$ is fat tailed but for the most ...
0
votes
0answers
6 views

How can the prediction of a model be assessed?

I just played around with the VGG16 and ResNet56 models trained on the ImageNet dataset and realized, after running some tests, that the prediction confidence of both networks is really high even if ...
5
votes
1answer
24 views

What does “version” mean here?

In a paper I read the following statement: "Assumption 1. There is a version of $f(x)$ that is twice continuously differentiable" Note that $f(x)=E(Y|x)$ is an unknown function to be estimated ...
0
votes
0answers
7 views

Calculating Confidence Interval for Estimated Parameters of SEIR model

I used a log-likelihood (Poisson) objective function to fit a curve to data on reported infected cases of COVID-19 using an SEIR model, in order to estimate its coefficients. How ...
0
votes
1answer
15 views

Transforming a random sample [duplicate]

For a distribution $p(X)$, let $x_1,\ldots,x_n$ be an independent sample from $p(X)$. Consider the one-to-one transformation $h(\cdot)$ such that $Y = h(X)$. If we apply the transformation to each of ...
1
vote
0answers
12 views

Determining the power of the test in this question

The following problem is from Devore's Probability and Statistics for Engineering and the Sciences, 8th edition, exercise 8.1 question 33: Reconsider the accompanying sample data on expense ratio(%...
-1
votes
1answer
32 views

How to use linear regression for prediction [closed]

A taxi company monitoring the safety of its cabs kept track of the number of miles tires had been driven (in thousands) and the depth of the tread remaining (in mm). Their data are displayed in the ...
0
votes
0answers
7 views

Can't seem to do random slope intercept model because I am missing values, any way around it?

I have a data set with 3 fixed-effect categories: region (2 levels), genus (2 levels), and food (5 levels). I am looking to see if sponges have different retention efficiency of the different food type ...
1
vote
1answer
23 views

R: Question about central limit theorem

Hello everyone :) Can you help me please? I really don't understand my teacher's videos, and this is the last part of our 20-page assignment :O In question 1 they ask us to create a Poisson distribution ...
0
votes
0answers
4 views

Measures of dispersion which are scale invariant and can handle a mean of 0

Is there a measure of dispersion which is scale-invariant, such that I can compare it between datasets of different scales, and which does not have the problem of the coefficient of variation, which is undefined ...
0
votes
0answers
5 views

How to attempt the continuous MC with generator matrix Q part c [closed]

continuous MC with generator matrix Q
0
votes
1answer
20 views

Hypothesis test for arbitrary distribution

Given a sample of $N$ observed values I'd like to test the null hypothesis that they arose from an arbitrary PDF (for which I have the analytical form). There are tests in place that can handle some ...
