Questions tagged [skewness]

Skewness measures (or refers to) a degree of asymmetry in the distribution of a variable.
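
As a rough illustration of the tag description, the sketch below computes one common measure, the moment-based sample skewness $g_1 = m_3 / m_2^{3/2}$. The function name `sample_skewness` and the toy data are made up for this example, and no small-sample correction is applied.

```python
def sample_skewness(values):
    """Moment-based sample skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    # Second and third central sample moments (divide by n, not n - 1).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical toy data: one symmetric sample, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 20]

print(sample_skewness(symmetric))     # zero for symmetric data
print(sample_skewness(right_skewed))  # positive for a long right tail
```

A positive value points to a longer right tail and a negative value to a longer left tail; other definitions (for example, bias-adjusted versions) give somewhat different numbers on small samples.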

0
votes
0 answers
5 views

Trying to determine which distribution to use for my percentage data for mixed effects model

I am seeing a lot of different answers about percentage data: either beta or binomial with a logit link, and not to use a Poisson distribution because it isn't count data. My response variable is retention ...
3
votes
1 answer
87 views

In R, how to detect possible outliers in right skewed data assuming Poisson distribution?

I am attempting to identify possible outliers in data which is skewed to the right and I assume it is Poisson distributed. I am a novice in all things statistics, and the following may be utterly ...
1
vote
0 answers
21 views

Which calculation is correct/used in which case for adjusted Skewness?

I was looking at the Wikipedia page for skewness here: http://en.wikipedia.org/wiki/Skewness and under the section on sample skewness the following modification is shown for sample skewness: However,...
0
votes
1 answer
29 views

Using paired t-test or Sign test to compare two groups of correlated measures on the same subject?

I have conducted a survey where participants are shown 8 different advertisements: 4 of the ads attempt to evoke the feeling of guilt, 4 others attempt to evoke the feeling of shame. After seeing each ...
0
votes
0 answers
8 views

Using standard deviation and mean on skewed distribution

I know there are multiple posts about this matter; however, I am still confused. http://web.ma.utexas.edu/users/mks/statmistakes/skeweddistributions.html http://courses.lumenlearning.com/wmopen-...
1
vote
2 answers
55 views

PDF Formula for distribution with mean, standard deviation, skew, and kurtosis

What would the probability density function be for a graph with input variables: mean, standard deviation, skewness, and kurtosis? For example, if the inputs were confined only to mean and standard ...
0
votes
0 answers
27 views

confidence interval for mean based on small sample when CLT does not hold

I have looked at similar questions but could not find a satisfactory answer. Please forgive me if I'm wrong. I have a small sample (n = 24) and use the sample mean as an estimator of the true mean. I want ...
0
votes
0 answers
19 views

Deriving skew t density function through convolution representation?

I am studying the skew t distribution, so I need its density function. I want to derive it via the integral of its convolution representation. Could you please help me and recommend a good source?
0
votes
0 answers
28 views

Correlation between a normal distribution and a highly positively skewed distribution

I would like to test the correlation between a quantitative continuous variable normally distributed (body mass index) and a quantitative continuous variable positively skewed (kurtosis=5, skewness=2)(...
0
votes
0 answers
27 views

Necessary to deal with skewness of response variable for Random Forest?

I am dealing with a regression problem for predicting the first-year production volumes of oil wells. My response variable is quite heavily right-skewed, as can be seen from the following distribution ...
0
votes
0 answers
10 views

Kernel for Skewness in U-statistics

How to find a kernel for the parameter $\theta = E[(X-E[X])^3]$ and use it later for the calculation of U-statistics?
1
vote
1 answer
34 views

How to represent skewness(X) in terms of the expected value?

Let $X$ be a random variable and $E(X)$ the expected value of $X$. Then $Var(X) = E(X^2) - [E(X)]^2$, where $Var(X)$ is the variance of $X$. Then how to represent skewness(X) in terms of the ...
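
For reference, a sketch of the analogous expansion for the standardized third moment, using the shorthand $\mu = E(X)$ and $\sigma^2 = Var(X)$ (this notation is introduced here and is not in the question excerpt):

```latex
\operatorname{skewness}(X)
  = E\!\left[\left(\frac{X-\mu}{\sigma}\right)^{3}\right]
  = \frac{E(X^3) - 3\mu\,E(X^2) + 2\mu^3}{\sigma^3},
\qquad \mu = E(X), \quad \sigma^2 = E(X^2) - [E(X)]^2 .
```

The numerator follows from expanding $(X-\mu)^3$ and taking expectations term by term: $E(X^3) - 3\mu E(X^2) + 3\mu^2 E(X) - \mu^3$, which simplifies because $E(X) = \mu$.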
0
votes
0 answers
20 views

Creating robust intervals from highly skewed data?

I am using factor analysis to model the underlying structure of social capital. My data consists of individual responses expressing how often they interacted with other individuals in a specific year, ...
0
votes
0 answers
16 views

Metric (distance) for highly skewed data

I am currently investigating some points in 4D in relation to some reference point (also in 4D). That is, I want to test how the distance to the reference point depends on some other variables. ...
1
vote
1 answer
36 views

How to deal with differently skewed biological data?

I have a single-cell data set with around 40 variables per cell (protein expression, all variables are measured simultaneously). The expression distributions for the single channels look quite ...
1
vote
0 answers
47 views

Skewness and kurtosis using quantiles and mean/variance

I would like to ask if there is a way to get skewness and kurtosis measures if we only know the distribution's mean, variance, and certain quantiles. Basically, the problem that I am facing is I have ...
0
votes
0 answers
34 views

Generate random values to mimic skewness

I have an actual set of data where the variables are heavily skewed, both positively and negatively. I need to generate random sample data for the values going forward. The data needs to be similarly ...
1
vote
1 answer
76 views

Estimating Quartiles with Moments

The Wikipedia article on Skewness indicates that the median of a distribution can be estimated from the mean, standard deviation, and skewness with an error term that goes as $O(skewness^2)$. ...
0
votes
1 answer
19 views

How to conduct a test for independence in case of skewed classes (experiment design)?

The setting is as follows. We have a population of size $N$. Each subject has two properties $A$ and $B$, which can be either true or false. The question is: if for a random subject $A$ holds, is the ...
0
votes
3 answers
94 views

Performing t-test on highly skewed financial data + outlier treatment?

I need some advice on performing statistical tests on financial ratios and highly skewed data. I have gathered a large sample of several financial ratios for two groups. The sample size is + 40,000 (...
0
votes
0 answers
26 views

Z-score from Skewed Student T

I'm implementing the following method. The text is provided for background, but my question is about line (8). Am I understanding this as "a z-score generated from a standardized skewed Student t?" ...
0
votes
0 answers
11 views

R st.mple versus sstdFit versus python ss.nct.fit

Working through David Ruppert Statistics and Data Analysis second edition. Trying to determine difference between the st.mple output and the sstdFit output. ...
1
vote
1 answer
138 views

Skewness statistic - how close to zero should it be?

I am working my way through Chapter 3 of Applied Predictive Modeling by Kuhn and Johnson. In section 3.2, the discussion notes that values close to zero indicate symmetry. My question is - how close to zero? ...
3
votes
1 answer
94 views

Notation for skewness and kurtosis

Traditionally (see Johnson et al., 2005, for instance), the (population) standardised skewness and kurtosis can be denoted as $\sqrt{\beta_1}= \frac{\mu_3}{\sigma^3}\;\;$ and $\;\;\beta_2= \frac{\...
0
votes
1 answer
44 views

How can I describe a heavily skewed population?

I have a set of data on the size of the group a person is a part of, for 100 persons. member group_size bob 16 sally 30 jim 5 The size of the group a person ...
1
vote
1 answer
56 views

Regression with skewed/half-normal distribution - should I transform, undersample, or something else?

I'm trying to create a regression model using data that are heavily skewed, almost like a half-normal distribution with a mode at zero. The lower scores are over-represented due to sampling technique -...
0
votes
0 answers
22 views

Correlation and skewness

I have a data set on which I want to run a machine learning algorithm. Some of the columns are skewed. If I apply a transformation (let's say log) to those columns and I want to display the matrix of ...
0
votes
1 answer
29 views

Skewness and correlation [duplicate]

I have a data set on which I want to run a machine learning algorithm. Some of the columns are skewed. If I apply a transformation (let's say log) to those columns and I want to display the matrix ...
0
votes
2 answers
69 views

Parametrization of a skew-normal distribution such that negative part is constant

I was wondering, how the parameters of the skew-normal distribution (http://en.wikipedia.org/wiki/Skew_normal_distribution) would be constrained when I require that a constant part of its support is ...
1
vote
2 answers
162 views

Ratio of median to mean

Is the ratio of the median to the mean of a distribution used for any descriptors e.g. measure of skewness?
2
votes
1 answer
30 views

How does taking the log of a variable to reduce its skewness change how we interpret that variable?

I understand that taking the log of a variable is important to reduce the skewness of that variable, which thus allows it to be better suited for statistical tests such as OLS regression. However, ...
0
votes
0 answers
26 views

Hypothesis Testing Skewness in two-sample cases

Does anyone know how to do hypothesis testing on skewness for two samples? Preferably in R, please. The null hypothesis is that the skewness of the two groups is the same. I've found the below two relevant posts, but ...
0
votes
0 answers
32 views

Using central moments (like skewness and kurtosis) for machine learning

I am working on a problem where we model two events, let's call them $A$ and $B$. The overall goal is to figure out if $A$ and $B$ have any relationship with each other, such that $A$ and $B$ happen ...
0
votes
0 answers
30 views

Measures of Uncertainty in Higher Order Moments [duplicate]

Suppose I have a sample of data and compute the mean. I am able to quantify uncertainty in this by computing the variance. I am wondering if known methods exist for evaluating the uncertainty ...
0
votes
0 answers
24 views

How to detect outliers in skewed data (left-skewed distribution) [duplicate]

I have studied a lot of ways of dealing with outliers of normal or multidimensional data. But my problem is about skewed data. How can I find outliers for data with a skewed distribution to the left?
0
votes
1 answer
84 views

Skewed outcome variable, sem model: is it a problem?

My outcome variable is really skewed, and I want to include it in a SEM model (I am using lavaan - R). It is measured with a 7-point Likert scale (agreement) and consists of 5 items. If the model ...
0
votes
1 answer
58 views

What is the type of this distribution: $f(t,\beta,\mu,\theta)= \frac{\sqrt{\pi}}{2} e^{\theta(t-\mu)^{\beta}},\ \theta >0$ on the unit interval?

This question is related to my question here, which was answered by Martijn Weterings, showing that it is close to a truncated generalised Gaussian distribution for negative $\theta$ values. My question ...
1
vote
1 answer
98 views

How to identify if my data set is skewed or not?

I think my assumptions are a bit naive regarding this matter. I have two metrics about my data set: the number of items and the cardinality of the items. Low cardinality means a lot of repeated items ...
0
votes
1 answer
92 views

Find probability between $-\infty$ and $0$

The graph shown below is the numerical result of the difference of a normal distribution ($N(15.5, 0.60^2)$) and an exponential distribution ($\exp(0.5)$) (both are independent). I am trying to find the ...
0
votes
0 answers
20 views

Pooling Skewness, Kurtosis, and/or Shapiro Wilk Values in a Multiply Imputed Database

I am currently screening a database that has undergone multiple imputation (20 iterations). Unfortunately, SPSS does not provide pooled values for the Skewness, Kurtosis, or Shapiro Wilk. I am ...
0
votes
0 answers
17 views

Transform right skewed distribution to normal [duplicate]

My dataset's histogram is the following: It contains a lot of zeros, which is why there is a high bar around zero. How can I transform this to a normal distribution? My problem is that the high bar coming ...
0
votes
1 answer
60 views

What if we take the logarithm of $X$? How do skewness and kurtosis change?

Motivated by this question: let $X$ be a random variable. What if we take the logarithm of $X$? How do the skewness and kurtosis of $X$ change compared to those of $\ln{X}$?
1
vote
0 answers
200 views

Interpreting Shapiro-Wilk-Test, skewness and kurtosis in the light of a qqplot and histogram

Currently I'm trying to find out if my data with n=11 follows a normal distribution to decide how I process further. To find this out I use the Shapiro-Wilk-Test which gives me p < 0.05 and thus I ...
0
votes
0 answers
21 views

Small sample size with very skewed right response [duplicate]

I have a dataset of 300 observations with 7 predictor variables with 1 continuous response variable. The response is strongly skewed to the right and there are no significant correlations among any ...
0
votes
0 answers
34 views

How to Incorporate Skew Into Simulated Data?

Suppose I have a dataset $\mathbf{X}$ which is an $n \times m$ matrix of $n$ independent realizations of some $m$-dimensional random vector $\mathbf{x}$. I want to generate a new dataset $\mathbf{X}'$ ...
0
votes
1 answer
24 views

Measuring the effect of a variable across a threshold

Within my data I am trying to assess whether the response variable increases as we move across different thresholds. The difficulty is that the response variable also increases exponentially as a ...
0
votes
0 answers
17 views

Modelling births as an outcome of population diversity as opposed to population size

I wish to model and estimate the relationship between population diversity and births (or birth rates) across populations, with panel data, and I face several challenges. 1) Births, population size ...
0
votes
1 answer
132 views

Log-Normalization of skewed data before feeding to neural network models (autoencoders)

If your input data has a few columns that are extremely skewed, it is well known that one would log-normalize (take the log and then normalize or standardize) the data before passing it to regression ...
3
votes
1 answer
83 views

Are the skew-normal distribution and the skew-Cauchy distribution heavy-tailed?

I think the title is self-explanatory. I understand that the skewness and the tail behavior of some distribution are completely unrelated as any symmetric distribution will have a skewness of zero ...
0
votes
0 answers
41 views

Kfold validation of skewed data

I'm working on data which is skewed. I was trying to use cross validation, but since most of my data is in just 1 of 6 classes, when using cross validation some folds don't contain samples from all 6 ...
