Questions tagged [predictive-models]

Predictive models are statistical models whose primary purpose is to predict other observations of a system optimally, as opposed to models whose purpose is to test a particular hypothesis or explain a phenomenon mechanistically. As such, predictive models place less emphasis on interpretability and more emphasis on performance.

0
votes
0answers
14 views

Statistical Model used for predicting number of deaths due to COVID-19

What statistical model is used to predict the number of deaths due to COVID-19? Suppose I have a dataset with the number of deaths for the last two months and would like to predict the number of deaths for the next month?...
0
votes
0answers
6 views

How can the prediction of a model be assessed?

I just played around with the VGG16 and ResNet56 models trained on the ImageNet dataset and realized, after running some tests, that the prediction confidence of both networks is really high even if ...
0
votes
2answers
12 views

Most likely sources of divergence between (adjusted)-R squared and out-of-sample predictive performance

I'm wondering which invalid assumptions are most likely to explain the wild discrepancies between a model's R-squared as a measure of predictive performance, and the actual out-of-sample predictive ...
1
vote
0answers
21 views

Using quantile regression results to select and weight variables for models

Linear regression is commonly used to identify predictor(s) (e.g., scores on cognitive ability or personality assessments) of job performance. Typically, predictors that exhibit a significant ...
1
vote
1answer
22 views

Can we express the following unconditional probability as follows?

Some of you may be aware that I have been asking a nagging question for quite a while on this forum, in different shapes and forms. Although I may have been a nuisance, may I thank you as this has ...
0
votes
0answers
13 views

Isn't a simulation a great model for model-based reinforcement learning?

Most reinforcement learning agents are trained in simulated environments. And the goal is often to maximize performance in this same environment. Why is the simulation not used for planning in these ...
0
votes
1answer
13 views

How do you interpret your features when you standardize your data?

Let's say I have built a boosting tree or neural network and I standardized my features beforehand. When I built my model, I split my data into training, validation, and test sets - each with their ...
1
vote
0answers
18 views

In predictive regressions, can we assume that the predictor follows the Ornstein-Uhlenbeck process?

A predictive regression is a regression of the form \begin{equation} y_t=\beta x_{t-1}+\varepsilon_t \end{equation} where $x_{t-1}$ is generally assumed to be a highly persistent stochastic variable, ...
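A minimal sketch of the setup in this excerpt, assuming the predictor is an Ornstein-Uhlenbeck process observed at equally spaced times. Sampled at step `dt`, such a process is an AR(1) with coefficient `exp(-theta * dt)`, which is what makes it "highly persistent" when `theta` is small. The function names, parameter names, and default values below are hypothetical and chosen only for illustration; the innovations of the predictor and of the regression are drawn independently here, which is a simplifying assumption.

```python
import math
import random


def simulate_predictive_regression(n=500, beta=0.5, theta=0.05,
                                   sigma_x=1.0, sigma_e=1.0, dt=1.0, seed=0):
    """Simulate y_t = beta * x_{t-1} + eps_t with an OU-sampled predictor.

    All names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # An OU process with mean-reversion speed theta, sampled every dt,
    # is an AR(1) with this autoregressive coefficient.
    phi = math.exp(-theta * dt)
    # Standard deviation of the exact OU transition over one step.
    sd_step = sigma_x * math.sqrt((1.0 - phi ** 2) / (2.0 * theta))

    x = [0.0]  # predictor path, started at its long-run mean of zero
    y = []
    for _ in range(n):
        # The response depends on the previous value of the predictor.
        y.append(beta * x[-1] + rng.gauss(0.0, sigma_e))
        # Advance the predictor by one exact OU step.
        x.append(phi * x[-1] + rng.gauss(0.0, sd_step))
    # Drop the final predictor value so x[t] is the lag paired with y[t].
    return x[:-1], y


def ols_slope_no_intercept(x, y):
    """Least-squares slope for a regression through the origin."""
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / sxx


x_lag, y = simulate_predictive_regression(n=5000)
beta_hat = ols_slope_no_intercept(x_lag, y)
```

With a long sample, `beta_hat` should land close to the `beta` used in the simulation; correlated innovations, which this sketch deliberately leaves out, are what complicate inference in practice.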
0
votes
0answers
17 views

Accounting for extreme events in machine learning models

I am researching ways to account for extreme or anomalous events in predictive models. For example, if I am predicting revenue or consumer demand, what are some ways to account for events, like ...
0
votes
0answers
12 views

Model to predict coronavirus (covid19) spread [closed]

I'm new to data science and machine learning, but I have some mathematical and statistical background. I really just want some information about models (like papers or raw models). So if you have any ...
0
votes
0answers
18 views

Unexpected Y predicted value using Generalized additive model with smooth term on predictor in R

I have built a GAM of the relationship between marine debris concentration (as the Y variable) and beach features and the distance from a point location to a river, port, tourism object, and city (as X ...
1
vote
0answers
23 views

Build and evaluate a prediction model with the same data

I have a dataset with a sample size of n=30, one dependent variable and 31 possible predictors. Now I want to build a regression model as part of a regression kriging model to predict my dependent ...
0
votes
0answers
14 views

Hedge fund rank on their returns or rating predictions modeling problem - How to find patterns between return and metrics

Problem: Hi, I'm a new machine learning practitioner. I have a dataset about hedge funds. It contains monthly hedge fund returns and some financial metrics. I calculated metrics for every month from ...
0
votes
0answers
11 views

Using curve estimation for prediction

Using Java, I have acquired raw data over time for two and a half days. The data are numeric and increasing in value over time, divided into coherent categories. Data is also collected from multiple ...
1
vote
1answer
26 views

How to deal with intentionally missing data

I have a dataset describing a vehicle's sensors. One of the sensors records the distance from cars in other lanes. Sometimes there are no cars to the right or the left of the vehicle and this is ...
0
votes
0answers
9 views

Are two machine learning models preferable to one when predicting a time series with several zeros?

I have a data series of positive integers. The majority of values are zero, but occasionally the values can be quite high. The task is to predict the following value given a set of features. Let's ...
1
vote
1answer
10 views

After training an XGBoost classifier on a set of features, (how) can I use it to make new predictions based on one of those features?

Forgive me if this is a somewhat naïve question. I have trained an XGBoost classifier that uses COVID-19 patients' age, sex, location, etc. to predict their mortality risk (here is the dataset). The ...
0
votes
0answers
20 views

Can we fit a piecewise exponential function to coronavirus cases over time?

The naive assumption about the number of cases is that it grows exponentially over time. However, different countries have different policies, for example travel bans or stay-at-home orders. So, does it make ...
0
votes
0answers
9 views

Does it make sense to use propensity scores for reweighting samples in prediction tasks?

When reading the literature on propensity scores, the focus is mainly on estimating treatment effects (be it ATE, ATT, or else). But that, in linear models terms, is equivalent to asking questions ...
0
votes
0answers
19 views

Predictive probability, 95% CI, and odds ratio for Firth logistic regression

I am doing a logistic regression, but I am getting a very large confidence interval for the logistic equation. Because I have quasi-separation (due to a small sample and lots of variables) and people ...
1
vote
1answer
27 views

Predicted probabilities seem too low with Gradient Boosting Machine on `iris` data

I'm doing a test run of the Gradient Boosting Machine algorithm on the iris data with the caret package. ...
0
votes
0answers
12 views

Model specification through machine learning (neural network)

Assumptions: (1) I observe the entire population N, so in my setting there is no overfitting; a model that fits my data perfectly is a "perfect" model. (2) I know that the variables at play are the only ...
0
votes
1answer
34 views

In a linear regression, how can I test whether a given variable is driving the regression's predictive power?

Let's say I have a linear regression of income per capita (PPP) and other variables on food prices. I suspect that the variable that is driving most, if not all, of the predictive power of the ...
0
votes
0answers
8 views

Interpreting error in imputation: missForest

Are there norms for how much error in imputation indicates too much error? I'm using missForest and missRanger to impute ...
0
votes
0answers
20 views

How to apply different weights on independent variables in lmer?

I am trying to build a mixed-effects model to predict an outcome based on three independent variables. Below is the line of code I have written so far. ...
0
votes
0answers
12 views

Predictions all coming out the same

Sorry for this basic question, I'm a total noob. I have input data in the format of 6000 x 3 x 256, that is 6000 rows with 3 features, which themselves contain 256 ...
0
votes
0answers
16 views

Alternatives to Genetic Search to find optimal values?

I have a small program that I can configure via a simple input number (integer) called a 'grain size'. I've noticed that the run time can vary based on this number's value. I wrote up my search like ...
1
vote
1answer
23 views

What are benchmarks for precision when working with unbalanced data?

I have a dataset where the positive class is 1.7%, which equates to about 40k positive cases and a total basis of approx 2.5m. What is a realistic precision to achieve for the most likely to cancel? ...
1
vote
0answers
5 views

Build prediction intervals for glmmTMB

I'm using glmmTMB to build a mixed effect logistic model from which I want to draw predictions. The predict() function applied to a glmmTMB model allows extracting the ML prediction and the relative ...
0
votes
0answers
11 views

Unconditional distribution of a predictive regression

May I preface my question by first apologizing, as I have asked this question in different forms on various occasions. Some questions were perhaps formed incorrectly, predicated on wrong assumptions ...
0
votes
0answers
28 views

Are positively biased bootstrap-derived GAM predictions indicative of model issues?

All, I am using a negative binomial GAM fitted with mgcv::gam to estimate counts for new data, and I wanted to use bootstrapping to find a 95% confidence interval for point estimates. In my ...
0
votes
0answers
19 views

Creating a prediction model with non-normal residuals on purpose

I am trying to build a model with non-normal residuals as it is supposed to overestimate rather than underestimate while being as accurate as it can be. So far I have not experienced a scenario where ...
0
votes
0answers
28 views

Elastic Net scikit-learn: same model and input data but different prediction values

I was training an Elastic Net using scikit-learn and I bumped into the following problem. I am getting different prediction values for the same input data and model. What is happening? Am I missing ...
1
vote
0answers
12 views

What is the effect of datasets having an Events Per Variable (EPV) value less than 10?

I read some studies on software defect prediction (published in good journals) that mentioned that we should use datasets with an Events Per Variable (EPV) greater than 10; otherwise the results will ...
0
votes
0answers
29 views

Can neural ODEs “fit” an ODE from just measurements?

The neural ODE technique, to my knowledge, presents a neural network based way of solving ODEs efficiently, which implies it needs an ODE and an initial value in order to construct the evolution over ...
1
vote
0answers
32 views

Handling rare levels in a categorical variable? (or maybe it's not categorical at all)

I have a dataset where I'm trying to predict completion time of an application. There are a number of numeric and categorical predictors, with one group of predictors being holds. An application may ...
0
votes
1answer
15 views

Getting different results when separating the data. What is the reason?

I am currently running some linear models and generalized least squares in R to detect latitudinal effects in the body size of a group of marine invertebrates. The response variable is a measurement ...
1
vote
1answer
30 views

Scoring rules for time series data

I have found quite a lot of articles about scoring rules that seem to first work out theorems and proofs for scoring rules in an iid setting, after which they proceed to apply them to some time series ...
0
votes
0answers
14 views

Lack of accuracy for sales forecasting [duplicate]

I'm an intern in a company that sells about 900 products. We are trying to forecast our sales for the next year. I already have the monthly data for the 3 previous years, cleaned it, and analysed ...
2
votes
1answer
28 views

How to evaluate the quality of predictions?

Say I have a big sample of data (simple positive numbers), and I want to check my prediction quality by calibrating my "model" (predictor?) based on the initial part of the sample, generating a ...
0
votes
0answers
110 views

Reporting the average log-probability the model assigns to some examples

I am currently studying Deep Learning by Goodfellow, Bengio, and Courville. In chapter 5.1.2 The Performance Measure, $P$, the authors say the following: To evaluate the abilities of a machine ...
0
votes
0answers
10 views

Multiple changes in statistical significance of predictor across steps in hierarchical regression - your insights please?

I have three IVs (A, B, C) and have entered them in three separate blocks (steps) in regression using SPSS. In step 1, Variable A independently predicts the DV. In step 2, after the addition of ...
3
votes
1answer
34 views

Definition of predictive hazard function?

In a Bayesian context, the posterior predictive probability density function is $$f_p(t) = \int f(t\mid \theta)\pi(\theta\mid \text{Data})d\theta,$$ where $\pi(\theta\mid \text{Data})$ is the ...
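As a hedged sketch only (the excerpt is truncated, so this is one assumed convention and not necessarily the asker's), a predictive hazard can be built from the predictive density above and the matching predictive survival function. The symbols $S(t\mid\theta)$, $S_p(t)$, $h(t\mid\theta)$, and $h_p(t)$ are introduced here for illustration, with $S(t\mid\theta) = \int_t^\infty f(u\mid\theta)\,du$ and $h(t\mid\theta) = f(t\mid\theta)/S(t\mid\theta)$:

$$S_p(t) = \int S(t\mid \theta)\pi(\theta\mid \text{Data})d\theta, \qquad h_p(t) = \frac{f_p(t)}{S_p(t)}.$$

Under this convention $h_p(t)$ is generally not the posterior mean of $h(t\mid\theta)$. Writing $f(t\mid\theta) = h(t\mid\theta)S(t\mid\theta)$ shows that it is instead a weighted average of the conditional hazards:

$$h_p(t) = \frac{\int h(t\mid \theta)\,S(t\mid \theta)\pi(\theta\mid \text{Data})d\theta}{\int S(t\mid \theta)\pi(\theta\mid \text{Data})d\theta}.$$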
0
votes
0answers
4 views

Predictive modelling to determine takeover targets

I am interested in building a model to predict the binary outcome, retention (1 - takeover target; 0 - no takeover target) with various potential predictor variables. I have a dataset containing ...
0
votes
0answers
8 views

Tensorflow classification model - Training data questions

I am working on a classification problem. I have a set of data that is classified into different hierarchies. As an example, we could have Description: whole grain brown bread with seeds 12 slices ...
0
votes
0answers
7 views

Will out-of-sample tests be significant when the corresponding in-sample tests are not significant?

I am relatively new to the concept of the out-of-sample tests. I understand that the out-of-sample test is conducted through the following steps. (1) Split the data, (2) Use the data in the first ...
0
votes
0answers
13 views

Does inclusion of categorical dummy variables impact OLS prediction?

Say I am trying to predict city price levels of apartments and my dataset contains a variable coded as 'region' (which is a larger geographical variable than city) for 4 levels: region N, region S, ...
0
votes
0answers
19 views

Predicting game outcomes with moving averages of goals scored

I decided to make a sports gambling script so I could quit my job and never work again. I just started and I read about Poisson distributions (which kind of approximate the chance of X goals getting ...
0
votes
1answer
61 views

Why is the autoregression (AR) model in Python giving inaccurate negative predictions?

I have time series data with 8 points. I used the AR model from the statsmodels.tsa.ar_model module. I trained the model on the 8 points and predicted the next 3 points. Though all values in the series are in ...
