{"cells":[{"cell_type":"markdown","metadata":{"id":"bB7k4iHZVyRp"},"source":["\n","NETID: PLEASE FILL ME IN\n"]},{"cell_type":"markdown","metadata":{"id":"t10Fmd_VVyRv"},"source":["# Introduction to Classifiers"]},{"cell_type":"markdown","metadata":{"id":"zJFjeouTgrai"},"source":["### Problems\n","- Problem 1 (4 points)\n","- Problem 2 (3 points)\n","- Problem 3 (2 points)\n","- Problem 4 (1 point)"]},{"cell_type":"markdown","metadata":{"id":"WhGluyWlVyRx"},"source":["Two lectures ago we covered linear regression and predicting the value of a continuous variable. We use __classifiers__ to predict binary or categorical variables. Classifiers can help us answer yes/no questions or categorize an observation into one of several categories.\n","\n","## kNN Classifier\n","\n","There are various classification algorithms, each of which is better suited to some situations than others. In this lecture we are learning about __kNN__, which is one of these classifiers"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"jXnamTZwVyRy","outputId":"9bbbee22-4708-4365-abfb-a6bfc8978404","scrolled":true},"outputs":[{"data":{"text/html":["
\n"," | diagnosis | \n","radius_mean | \n","texture_mean | \n","perimeter_mean | \n","area_mean | \n","smoothness_mean | \n","compactness_mean | \n","concavity_mean | \n","concave points_mean | \n","symmetry_mean | \n","... | \n","radius_worst | \n","texture_worst | \n","perimeter_worst | \n","area_worst | \n","smoothness_worst | \n","compactness_worst | \n","concavity_worst | \n","concave points_worst | \n","symmetry_worst | \n","fractal_dimension_worst | \n","
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n","M | \n","17.99 | \n","10.38 | \n","122.80 | \n","1001.0 | \n","0.11840 | \n","0.27760 | \n","0.3001 | \n","0.14710 | \n","0.2419 | \n","... | \n","25.38 | \n","17.33 | \n","184.60 | \n","2019.0 | \n","0.1622 | \n","0.6656 | \n","0.7119 | \n","0.2654 | \n","0.4601 | \n","0.11890 | \n","
1 | \n","M | \n","20.57 | \n","17.77 | \n","132.90 | \n","1326.0 | \n","0.08474 | \n","0.07864 | \n","0.0869 | \n","0.07017 | \n","0.1812 | \n","... | \n","24.99 | \n","23.41 | \n","158.80 | \n","1956.0 | \n","0.1238 | \n","0.1866 | \n","0.2416 | \n","0.1860 | \n","0.2750 | \n","0.08902 | \n","
2 | \n","M | \n","19.69 | \n","21.25 | \n","130.00 | \n","1203.0 | \n","0.10960 | \n","0.15990 | \n","0.1974 | \n","0.12790 | \n","0.2069 | \n","... | \n","23.57 | \n","25.53 | \n","152.50 | \n","1709.0 | \n","0.1444 | \n","0.4245 | \n","0.4504 | \n","0.2430 | \n","0.3613 | \n","0.08758 | \n","
3 | \n","M | \n","11.42 | \n","20.38 | \n","77.58 | \n","386.1 | \n","0.14250 | \n","0.28390 | \n","0.2414 | \n","0.10520 | \n","0.2597 | \n","... | \n","14.91 | \n","26.50 | \n","98.87 | \n","567.7 | \n","0.2098 | \n","0.8663 | \n","0.6869 | \n","0.2575 | \n","0.6638 | \n","0.17300 | \n","
4 | \n","M | \n","20.29 | \n","14.34 | \n","135.10 | \n","1297.0 | \n","0.10030 | \n","0.13280 | \n","0.1980 | \n","0.10430 | \n","0.1809 | \n","... | \n","22.54 | \n","16.67 | \n","152.20 | \n","1575.0 | \n","0.1374 | \n","0.2050 | \n","0.4000 | \n","0.1625 | \n","0.2364 | \n","0.07678 | \n","
5 rows × 31 columns
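To make the kNN idea concrete, here is a minimal sketch of fitting a k-nearest-neighbors classifier with scikit-learn. It assumes scikit-learn's built-in copy of this breast cancer dataset (`load_breast_cancer`) as a stand-in for the CSV loaded above, and an arbitrary choice of k = 5.

```python
# A minimal sketch of fitting a kNN classifier, assuming scikit-learn's
# built-in copy of the breast cancer dataset as a stand-in for the CSV above.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set so the classifier is scored on observations it has not seen.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# k = 5 (an arbitrary choice here): each point is labeled by a majority vote
# of its 5 nearest training points.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

print("test accuracy:", knn.score(X_test, y_test))
```

Because each prediction is a majority vote over the k nearest training observations, the choice of k directly controls how smooth the decision boundary is.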
\n","\n"," | Positive' (Predicted) | \n"," Negative' (Predicted) | \n","
Positive (Actual) | \n"," True Positive | \n","False Negative | \n","
Negative (Actual) | \n"," False Positive | \n","True Negative | \n","
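A confusion matrix like the one above can be computed directly for a fitted classifier. The sketch below assumes the hypothetical `knn`, `X_test`, and `y_test` names from the earlier kNN sketch; scikit-learn's `confusion_matrix` puts actual classes on the rows and predicted classes on the columns, matching the layout of the table.

```python
# A minimal sketch of computing a confusion matrix with scikit-learn,
# assuming `knn`, `X_test`, and `y_test` from the kNN sketch above.
from sklearn.metrics import confusion_matrix

y_pred = knn.predict(X_test)

# Rows are the actual classes, columns are the predicted classes,
# matching the layout of the table above.
print(confusion_matrix(y_test, y_pred))
```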
\n"," | Positive' (Predicted) | \n"," Negative' (Predicted) | \n","
Positive (Actual) | \n"," 146 | \n","32 | \n","
Negative (Actual) | \n"," 21 | \n","590 | \n","
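Given the counts in this confusion matrix, the usual summary metrics follow from simple arithmetic; the sketch below just plugs in the four cells (146 true positives, 32 false negatives, 21 false positives, 590 true negatives).

```python
# Metrics implied by the confusion matrix above:
# 146 true positives, 32 false negatives, 21 false positives, 590 true negatives.
tp, fn, fp, tn = 146, 32, 21, 590

accuracy = (tp + tn) / (tp + fn + fp + tn)  # 736 / 789 ≈ 0.933
precision = tp / (tp + fp)                  # 146 / 167 ≈ 0.874
recall = tp / (tp + fn)                     # 146 / 178 ≈ 0.820

print(f"accuracy={accuracy:.3f}, precision={precision:.3f}, recall={recall:.3f}")
```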