# Data mining

1. Show that the entropy of a node never increases after splitting it into smaller successor nodes.
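The claim follows from the concavity of the entropy function; a formal proof uses Jensen's inequality. As an illustrative (not rigorous) check, the sketch below repeatedly splits a random two-class node into two children and verifies that the weighted child entropy never exceeds the parent entropy. The helper name `entropy` and the random-split setup are my own, not from the exercise.

```python
import math
import random

def entropy(pos, neg):
    """Binary entropy (in bits) of a node with pos/neg class counts."""
    total = pos + neg
    h = 0.0
    for c in (pos, neg):
        if c:  # 0 * log(0) is taken as 0
            p = c / total
            h -= p * math.log2(p)
    return h

random.seed(0)
for _ in range(1000):
    pos, neg = random.randint(1, 100), random.randint(1, 100)
    # distribute the node's records arbitrarily between two children
    lp, ln = random.randint(0, pos), random.randint(0, neg)
    children = [(lp, ln), (pos - lp, neg - ln)]
    n = pos + neg
    weighted = sum((a + b) / n * entropy(a, b) for a, b in children)
    # weighted child entropy never exceeds the parent entropy
    assert weighted <= entropy(pos, neg) + 1e-12
```

Passing every trial is evidence, not proof; the exercise asks for the general argument via concavity.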

2. Compute a two-level decision tree using the greedy approach described in this chapter. Use the classification error rate as the criterion for splitting. What is the overall error rate of the induced tree?

Note: To determine the test condition at the root node, you first need to compute the error rates for attributes X, Y, and Z.

For attribute X the corresponding counts are:

| x | c1 | c2 |
|---|----|----|
| 0 | 60 | 60 |
| 1 | 40 | 40 |

For Y the corresponding counts are:

| y | c1 | c2 |
|---|----|----|
| 0 | 40 | 60 |
| 1 | 60 | 40 |

For Z the corresponding counts are:

| z | c1 | c2 |
|---|----|----|
| 0 | 30 | 70 |
| 1 | 70 | 30 |
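For each candidate attribute, the error rate of a split is the weighted sum over its branches of the misclassified fraction, i.e. one minus the majority-class fraction in each branch. A minimal sketch (the helper name `split_error` is mine) computes this from the count tables above:

```python
def split_error(counts):
    """Weighted classification error of a split.

    counts: list of (c1, c2) pairs, one per attribute value.
    In each branch, everything not in the majority class is an error.
    """
    total = sum(c1 + c2 for c1, c2 in counts)
    errors = sum((c1 + c2) - max(c1, c2) for c1, c2 in counts)
    return errors / total

tables = {
    "X": [(60, 60), (40, 40)],
    "Y": [(40, 60), (60, 40)],
    "Z": [(30, 70), (70, 30)],
}
for name, counts in tables.items():
    print(f"{name}: {split_error(name and counts):.2f}")
# X gives 0.50, Y gives 0.40, Z gives 0.30, so Z has the
# lowest error rate among the three candidates at the root.
```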

3. Consider a binary classification problem with the following set of attributes and attribute values:

• Air Conditioner = {Working, Broken}
• Engine = {Good, Bad}
• Mileage = {High, Medium, Low}
• Rust = {Yes, No}

Suppose a rule-based classifier produces the following rule set:

Mileage = High −→ Value = Low

Mileage = Low −→ Value = High

Air Conditioner = Working, Engine = Good −→ Value = High

Air Conditioner = Working, Engine = Bad −→ Value = Low

Air Conditioner = Broken −→ Value = Low

(a) Are the rules mutually exclusive?

(b) Is the rule set exhaustive?

(c) Is ordering needed for this set of rules?

(d) Do you need a default class for the rule set?
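Because every attribute is discrete, parts (a) and (b) can be checked by brute force: enumerate all attribute combinations and count how many rules fire on each. The sketch below does exactly that; the dictionary encoding of the rules and the abbreviation `AC` for Air Conditioner are my own.

```python
from itertools import product

# Each rule is (condition-dict, predicted Value).
rules = [
    ({"Mileage": "High"}, "Low"),
    ({"Mileage": "Low"}, "High"),
    ({"AC": "Working", "Engine": "Good"}, "High"),
    ({"AC": "Working", "Engine": "Bad"}, "Low"),
    ({"AC": "Broken"}, "Low"),
]

# All 2 * 2 * 3 * 2 = 24 possible records.
records = [
    dict(zip(("AC", "Engine", "Mileage", "Rust"), combo))
    for combo in product(("Working", "Broken"), ("Good", "Bad"),
                         ("High", "Medium", "Low"), ("Yes", "No"))
]

def matches(rule, rec):
    cond, _ = rule
    return all(rec[attr] == val for attr, val in cond.items())

# Number of rules triggered by each record.
hits = [sum(matches(r, rec) for r in rules) for rec in records]
print("mutually exclusive:", max(hits) <= 1)  # False: some records fire >1 rule
print("exhaustive:", min(hits) >= 1)          # True: every record fires a rule
```

For example, a record with Mileage = High, Air Conditioner = Working, and Engine = Good triggers both the first and third rules, which settles part (a).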

4. Consider the one-dimensional data set shown below:

| X | 0.5 | 3.0 | 4.5 | 4.6 | 4.9 | 5.2 | 5.3 | 5.5 | 7.0 | 9.5 |
|---|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| Y | −   | −   | +   | +   | +   | −   | −   | +   | −   | −   |

(a) Classify the data point x = 5.0 according to its 1-, 3-, 5-, and 9-nearest neighbors (using majority vote).

(b) Repeat the previous analysis using the distance-weighted voting approach.
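Both parts can be checked with a few lines of code: sort the points by distance to x = 5.0, keep the k closest, and vote either uniformly (part a) or with weight 1/d² (part b, the standard distance-weighted scheme). The helper name `knn` is mine, and the sketch assumes the query point does not coincide with a training point (otherwise 1/d² is undefined).

```python
from collections import Counter

X = [0.5, 3.0, 4.5, 4.6, 4.9, 5.2, 5.3, 5.5, 7.0, 9.5]
Y = ['-', '-', '+', '+', '+', '-', '-', '+', '-', '-']

def knn(x, k, weighted=False):
    """Predict the label of x from its k nearest neighbors."""
    nbrs = sorted(zip(X, Y), key=lambda p: abs(p[0] - x))[:k]
    votes = Counter()
    for xi, yi in nbrs:
        # uniform vote for part (a); 1/d^2 weighting for part (b)
        votes[yi] += 1 / abs(xi - x) ** 2 if weighted else 1
    return votes.most_common(1)[0][0]

for k in (1, 3, 5, 9):
    print(k, knn(5.0, k), knn(5.0, k, weighted=True))
# Majority vote: +, -, +, - for k = 1, 3, 5, 9.
# Distance-weighted vote: + for every k, because the nearest
# neighbor at distance 0.1 dominates with weight 100.
```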
