#### Q41: Both supervised learning and unsupervised clustering necessitate the use of at least one

- (A) Hidden attribute
- (B) Output attribute
- (C) Input attribute
- (D) Categorical attribute

#### Q42: Grid search is

- (A) Linear in D and Polynomial in D
- (B) Polynomial in D
- (C) Exponential in D and Linear in N
- (D) Polynomial in D and Linear in N
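
To reason about Q42, it can help to count the work a full grid search does. The sketch below assumes that D is the number of dimensions being searched and N is the number of data points scored at each grid point; the question does not define either symbol, so both readings are assumptions. The function names and candidate values are hypothetical.

```python
from itertools import product

def grid_points(values_per_dim, num_dims):
    """Return every point of a full grid that uses the same candidate values on each axis."""
    return list(product(values_per_dim, repeat=num_dims))

def evaluation_count(values_per_dim, num_dims, num_samples):
    """Count per-sample evaluations when each grid point is scored once on every sample."""
    return len(grid_points(values_per_dim, num_dims)) * num_samples

candidates = [0.1, 1.0, 10.0]  # hypothetical candidate values for each dimension

# Print how the grid size and the total work change as dimensions are added.
for d in (1, 2, 3, 4):
    points = len(grid_points(candidates, d))
    work = evaluation_count(candidates, d, num_samples=100)
    print(f"D={d}: grid points={points}, evaluations={work}")
```

Running the loop with a different `num_samples` shows how the total changes with N while the grid itself stays the same.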

#### Q43: The leaf nodes of a model tree are

- (A) averages of numeric output attribute values
- (B) nonlinear regression equations
- (C) linear regression equations
- (D) sums of numeric output attribute values

#### Q44: Which of the following statements regarding outliers is correct?

- (A) Outliers should be identified and removed from a dataset
- (B) Outliers should be part of the training dataset but should not be present in the test data
- (C) Outliers should be part of the test dataset but should not be present in the training data
- (D) The nature of the problem determines how outliers are used

#### Q45: Suppose you are working on a classification problem with severely imbalanced classes. In the training data, the majority class accounts for 99 percent of the observations. After making predictions on the test data, your model has 99 percent accuracy. Which of the following are true in this situation? (1) Accuracy is not a good metric for imbalanced class problems. (2) Accuracy is a good metric for imbalanced class problems. (3) Precision and recall are useful metrics for imbalanced class problems. (4) Precision and recall are not useful metrics for imbalanced class problems.

- (A) 1 and 3
- (B) 1 and 4
- (C) 2 and 3
- (D) 2 and 4
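
The scenario in Q45 can be reproduced with a small experiment. The sketch below is a minimal illustration, assuming a hypothetical test set with 99 majority-class labels and 1 minority-class label, and a trivial model that always predicts the majority class. The helper names are made up for this example, and precision is reported as 0.0 by convention when the model makes no positive predictions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the class of interest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def precision(y_true, y_pred, positive=1):
    """tp / (tp + fp), or 0.0 when no positive predictions were made."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(y_true, y_pred, positive=1):
    """tp / (tp + fn), or 0.0 when there are no positive labels."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical test set: 99 majority-class (0) labels and 1 minority-class (1) label.
y_true = [0] * 99 + [1]
# A trivial model that always predicts the majority class.
y_pred = [0] * 100

print("accuracy: ", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
print("recall:   ", recall(y_true, y_pred))
```

Comparing the three printed values shows which metrics notice that the minority class is never detected.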

#### Q46: Which factors influence the choice of depth for a neural network? (1) Type of neural network (e.g. MLP, CNN) (2) Input data (3) Computation power, i.e. hardware and software capabilities (4) Learning rate

- (A) 1, 2, 4, 5
- (B) 2, 3, 4, 5
- (C) 1, 3, 4, 5
- (D) All of these

#### Q47: In latent semantic indexing, we compute a low-rank approximation to a term-document matrix. Which of the following are reasons for the low-rank reconstruction?

- (A) Finding documents that are related to each other, e.g. of a similar genre
- (B) The low-rank approximation provides a lossless method for compressing an input matrix
- (C) In many applications, some principal components encode noise rather than meaningful structure
- (D) Low-rank approximation enables discovery of nonlinear relations
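
For readers who want to see what the low-rank reconstruction in Q47 looks like in practice, here is a minimal sketch using a truncated singular value decomposition in NumPy. The term-document counts are invented for illustration, and `low_rank_approximation` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def low_rank_approximation(matrix, rank):
    """Keep only the `rank` largest singular values and rebuild the matrix."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]

# Hypothetical term-document matrix: rows are terms, columns are documents.
term_doc = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
    [1, 0, 0, 1],
], dtype=float)

approx = low_rank_approximation(term_doc, rank=2)

# Frobenius norm of the difference: how much information the truncation discards.
error = np.linalg.norm(term_doc - approx)

print(np.round(approx, 2))
print("Frobenius reconstruction error:", round(float(error), 3))
```

Comparing `approx` with `term_doc` shows that the reconstruction keeps the dominant structure of the matrix while discarding the contribution of the smaller singular values.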

#### Q48: Consider rolling a tetrahedral die twice. What is the probability that the number on the first roll is strictly greater than the number on the second roll? Note that a tetrahedral die has only four sides (1, 2, 3, and 4).

- (A) 1/2
- (B) 3/8
- (C) 7/16
- (D) 9/16
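
An answer to Q48 can be checked by brute force, since two rolls of a four-sided die have only 16 equally likely outcomes. The sketch below assumes a fair die; `prob_first_greater` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

def prob_first_greater(sides):
    """Probability that the first of two fair rolls is strictly greater than the second."""
    outcomes = list(product(range(1, sides + 1), repeat=2))
    favorable = sum(1 for first, second in outcomes if first > second)
    return Fraction(favorable, len(outcomes))

# Enumerate all ordered pairs of rolls for a tetrahedral (four-sided) die.
print(prob_first_greater(4))
```

Using `Fraction` keeps the result exact, so it can be compared directly with the listed options.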

#### Q49: A jar contains four marbles: 3 red and 1 white. Two marbles are drawn one at a time, with the first marble replaced before the second draw. What is the probability of drawing the same color marble both times?

- (A) 1/2
- (B) 1/3
- (C) 5/8
- (D) 1/8
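
Q49 can be checked the same way, by listing every ordered pair of draws. This sketch assumes the first marble is returned to the jar before the second draw, so the two draws are independent and each marble is equally likely both times. `prob_same_color` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

def prob_same_color(jar):
    """Probability that two draws with replacement give marbles of the same color."""
    pairs = list(product(jar, repeat=2))  # every ordered (first draw, second draw) pair
    same = sum(1 for first, second in pairs if first == second)
    return Fraction(same, len(pairs))

jar = ["red", "red", "red", "white"]
print(prob_same_color(jar))
```

Each marble appears as its own entry in `jar`, so the three red marbles are counted as three distinct equally likely draws.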

#### Q50: Which of the following are not classification problems?

- (A) Predicting the price of a house
- (B) Predicting whether a patient has a tumor
- (C) Predicting which team will win the football league title
- (D) Predicting a student's percentage for the next semester

## Answers:

Question | Q41 | Q42 | Q43 | Q44 | Q45 | Q46 | Q47 | Q48 | Q49 | Q50 |
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
Answer | C | C | C | D | A | D | A, C | B | C | A, D |