# Questions tagged [feature-engineering]

Feature engineering is the process of using domain knowledge of the data to create features for machine learning models. This tag is meant for both theoretical and practical questions regarding feature engineering, excluding questions that ask for code, which are off-topic on Cross Validated.

558
questions

**1**

vote

**1**answer

34 views

### Expected Counts in Chi-Squared Goodness-of-Fit Tests of Normality

I have a variable with 200 values that I would like to test for normality using the Chi-square Goodness of Fit test. To do this, I have to calculate, for each value, the expected value in a normal ...
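For readers skimming this excerpt, here is a minimal sketch of how expected counts are commonly obtained for such a test. It assumes the values are grouped into bins and that the mean and standard deviation are estimated from the sample. The sample data, the bin edges, and the helper names `normal_cdf` and `expected_counts` are hypothetical and do not come from the question.

```python
import math
import numpy as np

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def expected_counts(values, inner_edges):
    """Expected count per bin under a normal fitted to `values`.

    `inner_edges` are the interior bin boundaries; the first and last
    bins are open-ended, so the expected counts sum to the sample size.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    mu = values.mean()
    sigma = values.std(ddof=1)  # sample standard deviation
    cdf = [0.0] + [normal_cdf(e, mu, sigma) for e in inner_edges] + [1.0]
    return n * np.diff(cdf)

# Hypothetical data: 200 synthetic values and four interior bin edges.
rng = np.random.default_rng(1)
sample = rng.normal(loc=50.0, scale=10.0, size=200)
edges = [35.0, 45.0, 55.0, 65.0]
expected = expected_counts(sample, edges)
```

The observed counts per bin would then be compared with `expected`. Because two parameters are estimated from the data, the degrees of freedom are usually reduced accordingly.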

**0**

votes

**0**answers

7 views

### Priority between feature engineering and normalisation

My question is related to the priority between feature engineering (for example a simple transformation) and normalisation.
It is a general question and I am not sure I understand all the ...

**0**

votes

**0**answers

6 views

### Feature extraction for exponentially damped signals

I am looking into exponentially damped signals that are stationary (according to the Adfuller statistical test), and I would like to look into how I can extract meaningful features ...

**0**

votes

**0**answers

14 views

### How to approach the Feature Extraction and Feature Selection parts in machine learning in Python?

I am a bit new to machine learning and I have the following questions:
Question 1:
When dealing with feature extraction with signals from sensors, what is the typical approach to extract features ...

**0**

votes

**0**answers

6 views

### TSFRESH - features extracted by a symmetric sliding window [closed]

As raw data we have measurements m_{i,j}, measured every 30 seconds (i=0, 30, 60, 90,...720,..) for every subject ...

**0**

votes

**0**answers

6 views

### Is there a guide to decide which transformation to choose for different scenarios/types of data and distribution?

1) How do I decide which transformation or scaling to use before passing our data into a machine learning model? Can someone please guide me on which transformation to use in different situations? There ...

**0**

votes

**0**answers

10 views

### Handling counting features in a classification model

I'm working on training a binary classification model. In my data I have 29 numerical features, continuous and discrete, apart from the target. Discrete features are all count features. I know that ...

**0**

votes

**0**answers

13 views

### Should I impute the missing values of timeseries data?

I have the following task - predicting the next 12 hours of PM10 particles based on historical data of previous 24 hours of PM10, O3 (ozone), CO (carbon monoxide), and others (not included) using RNN'...

**0**

votes

**1**answer

13 views

### How do you interpret your features when you standardize your data?

Let's say I have built a boosting tree or neural network and I standardized my features beforehand. When I built my model, I split my data into training, validation, and test sets - each with their ...

**0**

votes

**0**answers

31 views

### How do you code missing values if 0 is meaningful?

Per this deep learning book I am reading:
In general, with neural networks, it's safe to input missing values as 0, with the condition that 0 isn't already a meaningful value. The network will ...

**2**

votes

**1**answer

29 views

### Handling zeros in features of a binary classification problem

I'm working on training a binary classification model. In my data I have 29 numerical features, continuous and discrete, apart from the target which is categorical. I have 29 features, 8 of them have ...

**1**

vote

**0**answers

22 views

### How to deal with features that outweigh others in a regression?

I have been facing a problem that has been taking quite a while to overcome. In my problem I have basically 3 input features in my model and one single output. I have been using GP to fit my model to data,...

**0**

votes

**0**answers

8 views

### Implementing Scikit Learn's FeatureHasher for High Cardinality Categorical Data

Background: I am working on a binary classification of health insurance claims. The data I am working with has approximately 1 million rows and a mix of numeric features and categorical features (all ...

**0**

votes

**0**answers

9 views

### Is there a good score function for finding stationary-covariance features from time series via variational inference?

There are various ad-hoc methods for picking differencing orders or fractional difference orders of time series. I am looking for sound scoring functions and discussions that target automatic stationary ...

**0**

votes

**0**answers

8 views

### How to select the best features for Support Vector Classifier in sklearn

I have a range of different technical analysis indicators as a feature set for my SVM. I would like to think some indicators are better than others at predicting and that there must be some sort of ...

**2**

votes

**2**answers

27 views

### Is constructing the target variable manually a form of data leakage?

Let's say I have a data table with numerical features A, B, C. I do not have the target variable, but I extract the target variable Y from the features A, B, and C, like so:
...

**0**

votes

**0**answers

19 views

### How to find features for my model?

I am a newbie in all machine learning stuff.
I am trying to build a model for detecting fake news articles.
As a starting point, I am just trying to build a simple model using known classifiers (...

**1**

vote

**0**answers

39 views

### Standardization vs dividing uncentered data by Standard Deviation

I am working with a dataset that involves a collection of one-hot encoded, ordinal, and numerical features. I am using a LASSO model. As the difference in scales can influence the estimates, I am ...

**1**

vote

**1**answer

15 views

### Feature generation for anomaly detection

I have room temperature data (T1) and outside temperature data (T2) for various houses that have an HVAC system installed. I am building a system that detects faulty HVAC (heating, ...

**0**

votes

**2**answers

45 views

### Feature extraction definition

I have difficulty understanding the concept of feature extraction since there are two main ways to describe it.
The first one refers to mapping the raw data into a vector in R^d or the translation of ...

**0**

votes

**0**answers

15 views

### Why is it said that neural networks can extract implicit feature combinations?

I can't understand why neural network layers are said to extract implicit feature interactions in the DeepFM model.
What does the keyword, implicit feature interactions, exactly mean here?
And ...

**0**

votes

**0**answers

10 views

### new features selection

I am working on a project in which I have a specific description of a certain binary profile, for which I have about 200 positive examples and another 200 negative ones. This description is given from about 60 ...

**0**

votes

**0**answers

11 views

### Should I reduce data points for a feature if there are many inputs?

I have an assignment for college based on the MediaEval Competition, to predict video memorability.
We have 8000 videos, each with a score in terms of its memorability. As well as this, we have ...

**6**

votes

**1**answer

76 views

### Feature Engineering: combine a categorical feature and a continuous feature

When we analyze data, we can observe several variables that may contain mutual information. For example, there can be a binary variable such as Y = "Have you ever smoked?" And then there will be a ...

**0**

votes

**1**answer

41 views

### Dealing with over 1000 categorical values (which are also unique identifiers)

I am preparing my dataset for a logistic regression and need to check how best to handle a column with categorical values. As the dataset is for sales transactions, the column in question is the ...

**0**

votes

**0**answers

9 views

### Feature Selection by individual AUC

I am creating a model for classification and I have several ways to get a subset of features, but I was wondering if the following is reasonable:
Use the train set to calculate LOOCV or LPOCV AUC values ...

**0**

votes

**1**answer

24 views

### How to set up feature engineering for the day of the week?

Apologies if this is a very basic question. I'm currently learning data science and was hoping for help validating what I'm trying to do.
So I have a model set up to predict event duration by ...

**0**

votes

**1**answer

12 views

### Best way to encode information for input to neural network?

If I have a 20x20 grid of cells, each cell can take on one of four values (Red, Green, Black, Blue)
What is the best way to encode this information?
My first guess would be that one hot encoding is ...

**0**

votes

**0**answers

21 views

### Increased computation time for training and prediction with reduced feature space?

I implemented a PCA algorithm to reduce the input feature space of my neural network from 230 to 110 features.
My naive expectation was that if I train a neural network using the same hyper ...

**2**

votes

**1**answer

30 views

### Performance drops when adding a feature using XGBoost

I did some feature engineering with my data set. When I added one of the new features, the performance dropped significantly. How is this possible? I thought XGBoost was robust to irrelevant variables.

**0**

votes

**0**answers

15 views

### Feature expansion (multiplication) - What to do with higher correlations?

If I have a set of features {x1,x2,x3} and I expand the feature set by multiplying all pairs of features to obtain the following: {x1,x2,x3,x1*x2,x2*x3,x1*x3}. Now, I find that two of my features {x1*x2 and ...
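As a rough illustration of the expansion this excerpt describes, the sketch below builds the pairwise products and computes the correlation matrix of the expanded set, so that strongly correlated columns can be inspected. The data are synthetic and the helper name `expand_with_products` is hypothetical. It is one possible way to set this up, not the asker's code.

```python
from itertools import combinations
import numpy as np

def expand_with_products(X, names):
    """Append every pairwise product x_i * x_j (i < j) to the columns of X."""
    cols = [X[:, i] for i in range(X.shape[1])]
    out_names = list(names)
    for i, j in combinations(range(X.shape[1]), 2):
        cols.append(X[:, i] * X[:, j])
        out_names.append(f"{names[i]}*{names[j]}")
    return np.column_stack(cols), out_names

# Synthetic stand-in for {x1, x2, x3}.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
X_expanded, expanded_names = expand_with_products(X, ["x1", "x2", "x3"])

# Correlation between every pair of columns in the expanded set.
corr = np.corrcoef(X_expanded, rowvar=False)
```

Entries of `corr` with a large absolute value off the diagonal point to the expanded features that overlap most.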

**1**

vote

**1**answer

24 views

### Encoding cyclical feature minutes and hours

I'm working with time-series data to train a binary classification model that predicts if an event is going to happen or not in the future. The likelihood of the event depends on the specific time ...

**0**

votes

**0**answers

10 views

### Time-dependent feature analysis

I have a linear relation between variable A and variable B. Variable A is an area under the curve, where the curve is a Gaussian fitted to a time series evolution. Now, apart from the time series data ...

**0**

votes

**0**answers

10 views

### How to use target encoding: expanding mean on the test set

The expanding mean is a way to prevent overfitting when performing target encoding. But what I do not understand is how to use ...

**0**

votes

**0**answers

22 views

### Modeling an electrical pulse which is technically time dependent

I have an electrical pulse, and I need to fit a curve to a certain region of it, but not to the entire pulse.
The whole pulse looks like this
However the only part that I need to model is this
My boss ...

**1**

vote

**0**answers

20 views

### Feature engineering: including counter-parties of a transaction in a dataset

Background
Say I have a dataset of transfers between bank accounts structured like so:
...

**2**

votes

**2**answers

243 views

### Difference between feature engineering and hyperparameter optimization?

Hyperparameter optimization and feature engineering can (in my understanding) both be used to create a machine learning model. But what is the difference? And what is done to the y = wx + b formula in ...

**0**

votes

**0**answers

11 views

### Extracting Features to Determine Periodicity

I have accelerometer time series data sampled at 30 Hz from participants and am extracting features from each separate movement in the collection period per person to use for machine learning. I have ~...

**0**

votes

**1**answer

21 views

### How to handle potential ambiguity when one-hot encoding?

Let's say I have two categorical features: Movie, Director. I one-hot encode both the Movie and Director features for use in a linear regression model.
The problem is that two or more movies may be ...

**1**

vote

**1**answer

19 views

### Does mislabeling due to adversarial noise in features count as adversarial machine learning?

According to the traditional definition, Adversarial machine learning is a technique employed in the field of machine learning which attempts to fool models through malicious input. However, I have ...

**1**

vote

**1**answer

24 views

### Different scales of input features for stacking ensembles?

I have two models to predict future stock market behavior based on historical data:
- ARIMA time series model
- LSTM model (including data from various other sources)

ARIMA tries to model the daily ...

**3**

votes

**1**answer

59 views

### How to construct a function with given local minima?

I need to construct a function $f(x,y)$ in which there are 3 minima: 2 local and 1 global as given below.
Locals are: z = f(0.2,0.3) = 0.7 | z = f(0.6,0.8) = 0.8
Global is: z = f(0.85,0.5) = ...

**1**

vote

**0**answers

20 views

### Can RFs find a product interaction between two independent variables?

I'm doing the FastAI course on ML, and the main topic that is currently being discussed is random forests. Jeremy Howard explains how random forests, unlike something such as logistic regression, can ...

**0**

votes

**0**answers

24 views

### Combine TFIDF with non-textual features

I am dealing with an email classification problem in which I have email requests coming from different groups of people. I am building a classifier to classify these emails based on historical email ...

**1**

vote

**0**answers

32 views

### Machine learning: use benchmark as a feature

The project I am doing is to predict surgery lengths. The benchmark I am trying to compare against is the average of the most recent 20 cases with the same ID.
What I tested is to use this ...
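A benchmark of this kind could be computed as a feature along the following lines. This is only a sketch under stated assumptions: the column names `case_id` and `duration`, the helper `benchmark_feature`, and the toy data are all hypothetical, and the rows are assumed to be sorted chronologically. The current case is excluded from its own average by shifting, which is one common way to avoid leaking the target.

```python
import pandas as pd

def benchmark_feature(df, id_col="case_id", target_col="duration", window=20):
    """Mean of up to `window` previous cases with the same ID.

    Assumes `df` is already sorted in time order. `shift(1)` drops the
    current row from its own window, so the first case of each ID gets NaN.
    """
    return df.groupby(id_col)[target_col].transform(
        lambda s: s.shift(1).rolling(window, min_periods=1).mean()
    )

# Toy data with two hypothetical IDs.
cases = pd.DataFrame(
    {
        "case_id": ["a", "a", "a", "b", "b"],
        "duration": [10.0, 20.0, 30.0, 5.0, 15.0],
    }
)
cases["benchmark"] = benchmark_feature(cases)
```

The resulting column can be used both as the baseline prediction and as an input feature, which makes it easy to check whether a model adds anything beyond the benchmark.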

**0**

votes

**0**answers

13 views

### The most basic question about Feature Importance or Permutation_Importance

Consider the XOR gate with three inputs. The truth table will be:
Now all the variables on their own are near random as far as the model is concerned. Each input 1 or 0 has a 50% chance of being ...

**0**

votes

**0**answers

17 views

### Reduction in the number of observations by extracting piecemeal signal features, while keeping the number of features the same. Can it be called feature extraction?

I have a dataset generated from 9 sensors in an E-nose system for a binary class classification problem. The system provides a response for 240 seconds for each sample. i.e. I have a data set of 240 * ...

**1**

vote

**1**answer

73 views

### “Deep learning removes the need for feature engineering”?

I have seen it written in several papers and currently see it written in Deep Learning with Python by Francois Chollet that
Deep learning removes the need for feature engineering
What does this ...

**0**

votes

**0**answers

20 views

### Should I engineer features from coordinates (positional data)?

I am trying to do a regression on housing price (price/m^2). Apart from the lat and lng of the property, I also have city_code, district_id, street_id.
I am thinking whether I should remove city_code,...

**1**

vote

**0**answers

11 views

### Imputing null values for metrics used as features in ML model

I have a data set from a live application. Each row is a user interacting with the app. We are predicting a feature for which we currently have a deterministic solution. There is ample training ...