The iris data set is one of the earliest data sets used in the literature on classification methods, and it remains widely used in statistics and machine learning. This famous data set (Fisher’s or Anderson’s) gives the measurements, in centimeters, of sepal length, sepal width, petal length, and petal width for 50 flowers from each of 3 species of iris: Iris setosa, Iris versicolor, and Iris virginica.
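As a quick check of the description above, the data set can be inspected directly, since it ships with base R in the datasets package. This is only a minimal sketch; the variable names `species_counts` and `species_means` are introduced here for illustration.

```r
# iris ships with base R (datasets package), so no download is needed.
data(iris)

# 150 rows: four numeric measurements (cm) plus the Species factor.
str(iris)

# Each of the three species should contribute 50 flowers.
species_counts <- table(iris$Species)
print(species_counts)

# Per-species means of the four measurements.
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(species_means)
```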
caret (Classification and Regression Training), version 6.0-94
Functions for training and plotting classification and regression models.
Kuhn, M. (2008). Building Predictive Models in R Using the caret Package. Journal of Statistical Software, 28(5), 1–26. https://doi.org/10.18637/jss.v028.i05
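The model summaries in this section report 120 samples and 10-fold cross-validation. The original setup code is not shown, so the following is only a plausible sketch: it assumes a stratified 80/20 split of the 150 flowers and an arbitrary seed, and the names `train_set`, `test_set`, `control`, and `metric` are illustrative.

```r
library(caret)

data(iris)
set.seed(7)  # arbitrary seed (an assumption), used only for reproducibility

# Stratified 80/20 split (assumed): 40 flowers per species for training,
# which gives 120 rows, with the remaining 30 rows held out.
train_index <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
train_set <- iris[train_index, ]
test_set  <- iris[-train_index, ]

# 10-fold cross-validation, with accuracy as the selection metric.
control <- trainControl(method = "cv", number = 10)
metric  <- "Accuracy"

dim(train_set)
dim(test_set)
```

Each fold then trains on 108 of the 120 rows and validates on the other 12, which is consistent with the "Summary of sample sizes" lines in the output.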
k-Nearest Neighbors
120 samples
4 predictor
3 classes: 'setosa', 'versicolor', 'virginica'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 108, 108, 108, 108, 108, 108, ...
Resampling results across tuning parameters:
k  Accuracy   Kappa
5  0.9500000  0.9250
7  0.9500000  0.9250
9  0.9416667  0.9125
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was k = 7.
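A summary like the k-Nearest Neighbors one above is what `print()` shows for a fitted caret `train` object. The sketch below assumes an 80/20 stratified split and an arbitrary seed (neither is stated in the source), so the accuracy figures it prints will generally differ from those listed. With the default tuning grid, caret tries k = 5, 7, and 9.

```r
library(caret)

data(iris)
set.seed(7)  # arbitrary seed (assumption)
train_index <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
train_set <- iris[train_index, ]

control <- trainControl(method = "cv", number = 10)

# Fit k-nearest neighbors; caret's default grid evaluates three values of k.
set.seed(7)
fit_knn <- train(Species ~ ., data = train_set, method = "knn",
                 metric = "Accuracy", trControl = control)

# Prints sample counts, the resampling scheme, and accuracy/kappa per k.
print(fit_knn)
```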
Support Vector Machines with Radial Basis Function Kernel
120 samples
4 predictor
3 classes: 'setosa', 'versicolor', 'virginica'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 108, 108, 108, 108, 108, 108, ...
Resampling results across tuning parameters:
C     Accuracy   Kappa
0.25  0.9166667  0.8750
0.50  0.9250000  0.8875
1.00  0.9416667  0.9125
Tuning parameter 'sigma' was held constant at a value of 0.6558599
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were sigma = 0.6558599 and C = 1.
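The radial-kernel SVM summary above can be approximated with caret's `"svmRadial"` method, which requires the kernlab package to be installed. In the default grid only the cost `C` varies (0.25, 0.5, 1), while `sigma` is estimated once from the training data and then held constant, which matches the note in the output. The split and seed below are assumptions, so the estimated `sigma` and the accuracies will not match the listed values exactly.

```r
library(caret)

data(iris)
set.seed(7)  # arbitrary seed (assumption)
train_index <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
train_set <- iris[train_index, ]

control <- trainControl(method = "cv", number = 10)

# Radial-basis SVM (needs the kernlab package); default grid tunes C only.
set.seed(7)
fit_svm <- train(Species ~ ., data = train_set, method = "svmRadial",
                 metric = "Accuracy", trControl = control)

print(fit_svm)

# The selected sigma and C.
fit_svm$bestTune
```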
CART
120 samples
4 predictor
3 classes: 'setosa', 'versicolor', 'virginica'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 108, 108, 108, 108, 108, 108, ...
Resampling results across tuning parameters:
cp      Accuracy   Kappa
0.0000  0.9083333  0.8625
0.4375  0.7333333  0.6000
0.5000  0.3333333  0.0000
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was cp = 0.
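The cp = 0.5 row in the CART table is a useful sanity check on the Kappa column: an accuracy of one third is exactly chance level for three equally frequent classes, so kappa is 0. Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy and p_e the agreement expected by chance. The sketch below assumes the three species are equally represented in every validation fold, so that p_e = 1/3; the helper `kappa_from_accuracy` is a hypothetical name, not a caret function.

```r
# Cohen's kappa from observed accuracy and chance-level agreement.
# expected = 1/3 assumes three equally frequent classes (an assumption here).
kappa_from_accuracy <- function(accuracy, expected = 1 / 3) {
  (accuracy - expected) / (1 - expected)
}

# Accuracy values copied from the CART table above (cp = 0, 0.4375, 0.5).
cart_accuracy <- c(0.9083333, 0.7333333, 0.3333333)
cart_kappa <- kappa_from_accuracy(cart_accuracy)

# Rounded to four decimals for comparison with the table's Kappa column.
print(round(cart_kappa, 4))
```

Under that assumption the same relation should also hold for the kappa values of the other models in this section.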
Random Forest
120 samples
4 predictor
3 classes: 'setosa', 'versicolor', 'virginica'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 108, 108, 108, 108, 108, 108, ...
Resampling results across tuning parameters:
mtry  Accuracy   Kappa
2     0.9333333  0.9
3     0.9333333  0.9
4     0.9333333  0.9
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
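A random forest summary like the one above can be sketched with caret's `"rf"` method, which requires the randomForest package. The tuning parameter `mtry` is the number of predictors sampled as split candidates at each node; with four predictors the default grid is 2, 3, and 4. As before, the split and seed are assumptions, and the accuracies will vary from run to run.

```r
library(caret)

data(iris)
set.seed(7)  # arbitrary seed (assumption)
train_index <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
train_set <- iris[train_index, ]

control <- trainControl(method = "cv", number = 10)

# Random forest (needs the randomForest package); default grid tunes mtry.
set.seed(7)
fit_rf <- train(Species ~ ., data = train_set, method = "rf",
                metric = "Accuracy", trControl = control)

print(fit_rf)
```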
Select the best model
Select the best model by comparing the resampling results of the fitted models.
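One way to make that comparison is caret's `resamples()` function, which collects the per-fold accuracy and kappa of several fitted models. To keep this sketch light, only two of the four models (k-NN and CART) are refit; the split, the seed, and the object names are assumptions. Resetting the seed before each `train()` call gives both models the same folds, so their results are directly comparable.

```r
library(caret)

data(iris)
set.seed(7)  # arbitrary seed (assumption)
train_index <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
train_set <- iris[train_index, ]

control <- trainControl(method = "cv", number = 10)

# Same seed before each fit, so both models are evaluated on identical folds.
set.seed(7)
fit_knn <- train(Species ~ ., data = train_set, method = "knn",
                 metric = "Accuracy", trControl = control)
set.seed(7)
fit_cart <- train(Species ~ ., data = train_set, method = "rpart",
                  metric = "Accuracy", trControl = control)

# Collect the per-fold results and summarize accuracy and kappa per model.
results <- resamples(list(knn = fit_knn, cart = fit_cart))
summary(results)
```

Calling `dotplot(results)` afterwards draws the same comparison as a plot with confidence intervals, which can make the ranking easier to read than the numeric summary.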