## Machine Learning: Question Set – 07

#### State few examples explaining importance/effect of False Positive and False Negative

Notion of False positive, False negative is demonstrated in following diagram. They have different impact on different applications.

Giving loans is the primary source of income in the banking industry, but if your repayment rate is poor, you will not make a profit and will instead risk enormous losses.

Banks do not want to lose excellent clients, but they also do not want to gain undesirable customers. In this case, measuring both false positives and false negatives becomes critical.

Nowadays, we hear a lot about athletes utilizing drugs during sports tournaments. Before the game, each player must undergo a steroid test. A false positive can terminate a great athlete’s career, while a false negative can make the game unfair.

#### Discuss the strengths and weaknesses of Linear Regression

Strengths:

• The space complexity is quite simple; all that is required is to preserve the weights at the end of training. As a result, it has a high delay algorithm.
• It’s really simple to grasp. Excellent interpretability
• At the time of model construction, feature significance is created. You can handle feature selection with the help of hyperparameter lamba, and thus we can achieve dimensionality reduction.

Weaknesses:

The algorithm assumes that data is normally distributed, although it is not. Multicollinearity should be avoided while developing a model.

Linear regression has two general limits for data analysis:

• Is the model sufficiently descriptive of the processes that generated the data?
• Is the output truly linear across all inputs?
• Are the inputs truly independent of one another?
• Does the model take into account all of the inputs?
• Is the data sufficient to estimate the model’s coefficients?
• Is there sufficient data?
• Does the data have a high signal-to-noise ratio? (A and B are related.) You might be able to adjust for a reduced signal-to-noise ratio by gathering additional data.)

#### Differentiate validation set and test set.

The validation set might be regarded a subset of the training set because it is used to choose parameters and avoid overfitting of the model being developed. A test set, on the other hand, is used to test or evaluate the performance of a trained machine learning model.

In a nutshell, the distinctions are as follows:

• Training Set is used to fit the parameters, such as weights
• The validation set is used to fine-tune the parameters.
• Test Set is used to evaluate the model’s performance, such as predictive power and generalization

The process of model validation and model selection is depicted in following figure

#### State few examples of when a False Negative was more crucial than a False Positive

Assume an airport has received high security threats, and based on certain features, they determine if a specific traveler is a threat or not. Due to a staff shortfall, they opted to check passengers who were judged to have risk positives by their predictive model.
What happens if an airport model flags an actual threat customer as non-threat?

Another example is the legal system. What happens if a jury or a judge decides to let a criminal walk free?

What if you rejected a highly nice person based on your prediction model, only to meet him/her a few years later and learn you had a false negative?

#### State few examples of when a False Positive was more crucial than a False Negative

Before we get started, let’s define false positives and false negatives.

False Positives occur when you incorrectly classify a non-event as an event, often known as a Type I error.

False Negatives are circumstances in which you incorrectly label occurrences as non-events, often known as a Type II error.

Another example could come from marketing. Assume an ecommerce company decides to issue a $1000 gift voucher to consumers who are expected to spend at least$5000 on merchandise. They send free voucher mail to 100 consumers with no minimum purchase requirement since they expect to make at least a 20% profit on things sold for more than \$5,000.
What if they accidentally sent it to a false positive case?

Assume you have to deliver chemo therapy to people in the medical industry. Your lab collects critical information from patients and decides whether or not to treat them with radiation therapy based on the results.

Assume a patient arrives at that hospital and is tested positive for cancer (despite the fact that he does not have cancer) based on lab prediction. What will become of him? (With Sensitivity set to 1)

Additional Reading: How to interpret confusion matrix. Click to read