decision tree
-
Problem: This is our first programming assignment. It has two parts:
• Part I: I ask you to write a program that builds a decision tree, using Gini impurity as the measure that guides tree generation. The data set is the Poker Hand Data Set archived at the UCI Machine Learning Repository:
  Data Set Characteristics: Multivariate
  Number of Instances: 1025010
  Area: Game
  Attribute Characteristics: Categorical, Integer
  Number of Attributes: 11
  Date Donated: 2007-01-01
  Associated Tasks: Classification
  Missing Values? No
  Number of Web Hits: 212827
  You shall use the training data set to build your decision tree and then use the testing data set to evaluate it. You need to report the classification accuracy in a bar chart and compare it with that of the distance-based classification described in Part II.
• Part II: For this part, I ask you to use the same training data set as in Part I to build a distance-based classification model. Here, you need to find a good distance metric and a parameter k that serves as the threshold bounding the nearest neighbors of any given data item (or point). Then, you need to apply your model to the testing data set to evaluate it. You shall record the classification accuracy and compare it in a bar chart with that of the decision tree model built in Part I.
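For the decision tree part, the quantity that guides each split is the Gini impurity, 1 minus the sum of squared class proportions. The following is only a minimal sketch of that calculation, assuming class labels are stored as plain integers; the function names giniImpurity and splitImpurity are made up for illustration and are not part of the assignment.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Gini impurity of a set of class labels: 1 - sum over classes of p_c^2.
// Returns 0 for an empty or perfectly pure set.
double giniImpurity(const std::vector<int>& labels) {
    if (labels.empty()) return 0.0;
    std::map<int, std::size_t> counts;
    for (int label : labels) ++counts[label];

    const double n = static_cast<double>(labels.size());
    double sumSquares = 0.0;
    for (const auto& entry : counts) {
        const double p = static_cast<double>(entry.second) / n;
        sumSquares += p * p;
    }
    return 1.0 - sumSquares;
}

// Weighted Gini impurity of a candidate two-way split.
// A tree builder would evaluate this for each candidate split
// and keep the one with the lowest value.
double splitImpurity(const std::vector<int>& left,
                     const std::vector<int>& right) {
    const double total = static_cast<double>(left.size() + right.size());
    if (total == 0.0) return 0.0;
    return (static_cast<double>(left.size()) / total) * giniImpurity(left)
         + (static_cast<double>(right.size()) / total) * giniImpurity(right);
}
```

A split that separates the classes completely scores 0, while an even two-class mix scores 0.5, so lower is better when comparing candidate splits.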
Programming Language:
C++ or Java, but C++ is preferred.
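For the distance-based part, one possible starting point is a k-nearest-neighbor vote. The sketch below is built on assumptions, not on anything the assignment prescribes: it assumes each hand has 10 predictive attributes (suit and rank of five cards) plus a class label, and it uses Hamming distance, which is just one candidate metric for categorical data. The names Hand, hammingDistance, and classify are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Assumed layout: 10 predictive attributes plus one class label per hand.
struct Hand {
    std::array<int, 10> features;
    int label;
};

// Hamming distance: number of attribute positions where two hands differ.
int hammingDistance(const std::array<int, 10>& a,
                    const std::array<int, 10>& b) {
    int distance = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) ++distance;
    }
    return distance;
}

// Majority vote among the k training hands closest to the query.
// Returns -1 if there is nothing to vote on (empty training set or k == 0).
// Ties in the vote go to the smallest label.
int classify(const std::vector<Hand>& training,
             const std::array<int, 10>& query,
             std::size_t k) {
    std::vector<std::pair<int, int>> distLabel;  // (distance, label)
    distLabel.reserve(training.size());
    for (const Hand& hand : training) {
        distLabel.emplace_back(hammingDistance(hand.features, query), hand.label);
    }

    k = std::min(k, distLabel.size());
    // Only the k smallest distances need to be in order.
    std::partial_sort(distLabel.begin(),
                      distLabel.begin() + static_cast<std::ptrdiff_t>(k),
                      distLabel.end());

    std::map<int, int> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[distLabel[i].second];

    int bestLabel = -1;
    int bestVotes = 0;
    for (const auto& vote : votes) {
        if (vote.second > bestVotes) {
            bestLabel = vote.first;
            bestVotes = vote.second;
        }
    }
    return bestLabel;
}
```

This brute-force version scans the whole training set for every query, so trying several values of k and other metrics on a subset of the data first may be more practical.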
-
Was this handed out in the 10am or 11am CS class with Professor Lewis?