#### Explain the concepts of one hot encoding and label encoding. What effect do they have on the dimensionality of the given dataset?

Categorical variables are represented as binary vectors using one-hot encoding. Label encoding is the process of turning labels/words into numerical representation.

Using one-shot encoding increases the data set’s dimensionality. The dimensionality of the data collection is unaffected by label encoding.

One-hot encoding generates a new variable for each level of the variable, whereas Label encoding encodes the levels of a variable as 1 and 0.

#### What exactly is overfitting?

Overfitting is a sort of modeling error that results in the inability to accurately forecast future observations or fit extra data into the present model.

It happens when a function is too closely fit to a small amount of data points, and it frequently results in additional parameters.

#### What is the difference between bias and variance, and what do you mean by the Bias-Variance Tradeoff?

Both are Machine Learning Algorithm faults. Bias occurs when the algorithm’s ability to deduce the correct observation from the dataset is limited. Variation, on the other hand, happens when the model is particularly sensitive to small perturbations.

When additional characteristics are added to a model, it becomes more complex, and we lose bias while gaining some variation. We execute a tradeoff between bias and variance based on the needs of a business in order to retain the appropriate degree of mistake.

Bias refers to the inaccuracy caused by incorrect or too simplified assumptions in the learning algorithm. Because of this assumption, the model may underfit the data, making it difficult for it to have high predicted accuracy and for you to generalize your knowledge from the training set to the test set.

Variance is also an error caused by the learning algorithm’s overcomplexity. This could explain why the algorithm is very sensitive to high levels of variation in training data, causing your model to overfit the data. There is too much noise in the training data for your model to be relevant for the test data.

Image Source: https://www.kdnuggets.com/2016/08/bias-variance-tradeoff-overview.html

The bias-variance decomposition essentially decomposes any algorithm’s learning error by adding the bias, the variance, and a bit of irreducible error due to noise in the underlying dataset. Essentially, as the model becomes more sophisticated and more variables are added, you will lose bias but acquire some variation – in order to achieve the best minimized amount of error, you must trade off bias and variance. You don’t want your model to have either a strong bias or a big variance.

#### What is the relationship between standard deviation and variance?

The standard deviation of your data is the spread of your data from the mean. The average degree to which each point deviates from the mean, i.e. the average of all data points, is defined as variance.

Standard deviation and Variance are related because Standard deviation is the square root of Variance.

#### What exactly do you mean by “machine learning”?

Machine learning is a type of Artificial Intelligence that works with system programming and automates data analysis to allow computers to learn and behave without being explicitly programmed.

Robots, for example, are programmed in such a way that they can perform tasks based on data collected through sensors. They learn programming automatically from data and improve as they gain experience.

#### What is the difference between inductive and deductive learning?

Inductive learning involves the model learning by example from a group of observed instances in order to get a generalized conclusion. With contrast, in deductive learning, the model first applies the conclusion and then draws the conclusion.

- Inductive learning is the process of drawing inferences from observations.
- Deductive learning is the process of forming observations based on inferences.

For instance, suppose we had to explain to a child that playing with fire can result in burns. We may convey this to a child in two ways: we can display training examples of various fire accidents or photographs of burnt people and label them as “Hazardous.”

In this situation, a child will understand from examples and will not play with fire. This is a type of inductive machine learning. Another method for teaching the same concept is to let the child play with fire and see what happens. If the child gets burned, it will teach him or her not to play with fire and to stay away from it. It is a type of inductive learning.

#### What exactly is the distinction between Data Mining and Machine Learning?

Data mining can be defined as the process of abstracting information or interesting unknown patterns from organized data. Machine learning algorithms are used during this procedure.

Machine learning is the study, creation, and development of algorithms that enable processors to learn without being explicitly programmed.

#### What is the difference between classification and regression?

Classification | Regression |

The aim of classifying is to predict a discrete class label. | The objective of regression is to predict a continuous quantity. |

Data is labeled into one of two or more classes in a classification issue. | A regression problem necessitates the estimation of a quantity. |

A classification with two classes is known as binary classification, while a classification with more than two classes is known as multi-class classification. | A multivariate regression problem is a regression problem with numerous input variables. |

A classification problem is determining whether an email is spam or not. | A regression challenge involves predicting the price of a stock over time. |

#### What exactly do you mean by “ensemble learning”?

Ensemble learning is the process through which a large number of models, such as classifiers, are intentionally created and merged to solve a specified computational problem. Ensemble methods are also referred to as committee-based learning and learning multiple classifier systems. It trains different theories to solve the same problem.

The random forest trees are a good example of ensemble modeling since they use numerous decision trees to forecast outcomes. It is used to improve a model’s classification, function approximation, prediction, and so on.

**Additional Reading**: Data Mining for the Internet of Things: Literature Review and Challenges