# Tags

A tag is a keyword or label that categorizes your question with other, similar questions. Using the right tags makes it easier for others to find and answer your question.

Use this tag for any *on-topic* question that (a) involves `R` either as a critical part of the question or expected answer, & (b) is not *just* about how to use `R`.

Techniques for analyzing the relationship between one (or more) "dependent" variables and "independent" variables.

Machine learning algorithms build a model of the training data. The term "machine learning" is vaguely defined; it includes what is also called statistical learning, reinforcement learning, unsupervis…

Data observed over time (either in continuous time or at discrete time periods).

A probability provides a quantitative description of the likely occurrence of a particular event.

Assesses whether data are inconsistent with a given hypothesis rather than being an effect of random fluctuations.

A mathematical description of probabilities or frequencies.

A routine exercise from a textbook, course, or test used for a class or self-study. This community's policy is to "provide helpful hints" for such questions rather than complete answers.

Refers generally to statistical procedures that utilize the logistic function, most commonly various forms of logistic regression.

A method of statistical inference that relies on treating the model parameters as random variables and applying Bayes' theorem to deduce subjective probability statements about t…

A broad class of computational models loosely based on biological neural networks. They encompass feedforward NNs (including "deep" NNs), convolutional NNs, recu…

The problem of identifying the sub-population to which new observations belong, where the identity of the sub-population is unknown, on the basis of a training set of dat…

Mathematical theory of statistics, concerned with formal definitions and general results.

A measure of the degree of linear association among a pair of variables.

Statistical significance refers to the probability that, if the true effect in the population from which the sample was drawn were 0 (or some hypothesized value), a test statistic as extreme or more…

The normal, or Gaussian, distribution has a density function that is a symmetrical bell-shaped curve. It is one of the most important distributions in statistics. Use the [normality] tag for asking ab…

ANOVA stands for ANalysis Of VAriance, a statistical model and set of procedures for comparing multiple group means. The independent variables in an ANOVA model are categorical, but an ANOVA table can…

Regression that includes two or more non-constant independent variables.

Mixed (aka multilevel or hierarchical) models are linear models that include both fixed effects and random effects. They are used to model longitudinal or nested data.

The task of partitioning data into subsets of objects according to their mutual "similarity," without using preexisting knowledge such as class labels. [Clustered-standard-errors a…

An interval that covers an unknown parameter with $100(1-\alpha)\%$ confidence. Confidence intervals are a frequentist concept. They are often confused with credible intervals wh…

A generalization of linear regression allowing for nonlinear relationships via a "link function" and for the variance of the response to depend on the predicted value. (Not to be confused with "genera…

A programming language commonly used for machine learning. Use this tag for any *on-topic* question that (a) involves `Python` either as a critical part of the question or expected answer, &…

The expected squared deviation of a random variable from its mean; or, the average squared deviation of data about their mean.
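As a minimal sketch of the second sense above (the average squared deviation of data about their mean), the following uses a small made-up dataset and Python's standard `statistics` module. The variable names are illustrative assumptions. Note that `statistics.variance` divides by n - 1 (the usual sample estimator) rather than n, so it differs slightly from the plain average.

```python
from statistics import pvariance, variance

# Hypothetical data, chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Average squared deviation of the data about their mean (divides by n).
mean = sum(data) / len(data)
avg_sq_dev = sum((x - mean) ** 2 for x in data) / len(data)

# The same quantity from the standard library.
population_var = pvariance(data)

# The n - 1 version, commonly used to estimate the variance of the
# underlying random variable from a sample.
sample_var = variance(data)

print(avg_sq_dev, population_var, sample_var)
```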

A special case of [prediction], in the context of [time-series].

Categorical (also called nominal) data can take on a limited number of possible values called categories. Categorical values "label"; they do not "measure". Please use the [ordinal-data] tag for discrete …

A linear dimensionality reduction technique. It reduces a multivariate dataset to a smaller set of constructed variables preserving as much information (as much v…

Repeatedly withholding subsets of the data during model fitting in order to quantify the model performance on the withheld data subsets.
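One common form of this idea is k-fold cross-validation. The sketch below is a hedged illustration only: the "model" is simply the mean of the training values, the data are made up, and the helper names `k_fold_indices` and `cross_validate_mean_model` are hypothetical, not part of any library.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate_mean_model(y, k=5, seed=0):
    """Estimate out-of-sample MSE of a predict-the-training-mean model."""
    folds = k_fold_indices(len(y), k, seed)
    fold_mse = []
    for held_out in folds:
        held = set(held_out)
        # Fit on everything except the withheld fold.
        train = [y[i] for i in range(len(y)) if i not in held]
        prediction = sum(train) / len(train)
        # Score on the withheld fold only.
        mse = sum((y[i] - prediction) ** 2 for i in held_out) / len(held_out)
        fold_mse.append(mse)
    # Average the per-fold scores into one performance estimate.
    return sum(fold_mse) / len(fold_mse)

# Hypothetical response values, for illustration only.
y = [3.1, 2.9, 3.4, 3.0, 2.7, 3.3, 3.2, 2.8, 3.5, 3.1]
cv_mse = cross_validate_mean_model(y, k=5)
print(cv_mse)
```

Each observation is withheld exactly once, so every data point contributes to the performance estimate without ever being scored by a model that was fitted on it.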

A test for comparing the means of two samples, or the mean of one sample (or even parameter estimates) with a specified value; also known as the "Student t-test" after the pseudonym of its inventor.

This tag is too general; please provide a more specific tag. For questions about the properties of specific estimators, use the [estimators] tag instead.

Constructing meaningful and useful graphical representations of data. (If your question is only about how to get particular software to produce a specific effect, then it is likely not on topic here.)

A method of estimating parameters of a statistical model by choosing the parameter value that maximizes the probability of observing the given sample.

Creating samples from a well-specified population using a probabilistic method and/or producing random numbers from a specified distribution. As this tag is ambiguous, please consider [survey-sampling…

A test (typically of distribution, independence, or goodness of fit) or a family of distributions related to such a test.

R packages used for fitting linear, generalized linear, and nonlinear mixed effects models. For general questions about mixed models, use the [mixed-model] tag.

Refers to the AutoRegressive Integrated Moving Average model used in time series modeling both for data description and for forecasting. This model generalizes the ARMA model by including a term for d…