Source: UCI Machine Learning repository (http://www.ics.uci.edu/~mlearn/MLSummary.html).
First, let's load the dataset and look at which features are collected for the CPUs:
| | Vendor | Model | MYCT | MMIN | MMAX | CACH | CHMIN | CHMAX | PRP | ERP |
---|---|---|---|---|---|---|---|---|---|---|
0 | adviser | 32/60 | 125 | 256 | 6000 | 256 | 16 | 128 | 198 | 199 |
1 | amdahl | 470v/7 | 29 | 8000 | 32000 | 32 | 8 | 32 | 269 | 253 |
2 | amdahl | 470v/7a | 29 | 8000 | 32000 | 32 | 8 | 32 | 220 | 253 |
3 | amdahl | 470v/7b | 29 | 8000 | 32000 | 32 | 8 | 32 | 172 | 253 |
4 | amdahl | 470v/7c | 29 | 8000 | 16000 | 32 | 8 | 16 | 132 | 132 |
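
A minimal sketch of the loading step, assuming the raw file is comma-separated with no header row and that the column names are the ones shown in the table above. To stay self-contained it parses only the five rows displayed here from an in-memory string, using the standard library; the `load_rows` helper is hypothetical, and the original notebook presumably read the full file instead.

```python
import csv
import io

# Column names as shown in the table above.
COLUMNS = ["Vendor", "Model", "MYCT", "MMIN", "MMAX",
           "CACH", "CHMIN", "CHMAX", "PRP", "ERP"]

# The five rows displayed above, standing in for the full data file.
SAMPLE = """\
adviser,32/60,125,256,6000,256,16,128,198,199
amdahl,470v/7,29,8000,32000,32,8,32,269,253
amdahl,470v/7a,29,8000,32000,32,8,32,220,253
amdahl,470v/7b,29,8000,32000,32,8,32,172,253
amdahl,470v/7c,29,8000,16000,32,8,16,132,132
"""

def load_rows(handle):
    """Parse headerless CSV text into dicts, converting numeric fields to int."""
    rows = []
    for record in csv.reader(handle):
        row = dict(zip(COLUMNS, record))
        for name in COLUMNS[2:]:  # everything after Vendor and Model is numeric
            row[name] = int(row[name])
        rows.append(row)
    return rows

rows = load_rows(io.StringIO(SAMPLE))
print(list(rows[0].keys()))  # the collected features
print(rows[0])               # first record
```

Passing an open file handle instead of the `io.StringIO` object would parse a file on disk in the same way.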
In the 1982 landscape, Amdahl delivers 9 CPUs of the premium class.
| | MYCT | MMIN | MMAX | CACH | CHMIN | CHMAX | PRP |
---|---|---|---|---|---|---|---|
0 | 125 | 256 | 6000 | 256 | 16 | 128 | 198 |
1 | 29 | 8000 | 32000 | 32 | 8 | 32 | 269 |
2 | 29 | 8000 | 32000 | 32 | 8 | 32 | 220 |
3 | 29 | 8000 | 32000 | 32 | 8 | 32 | 172 |
4 | 29 | 8000 | 16000 | 32 | 8 | 16 | 132 |
The original comparison of ERP against BYTE's PRP, as reported on UCI, was measured as a mean percentage deviation. That metric is not standard today, so later authors have chosen to use RMSE instead.
Apparently, the RMSE results of later linear regressions did not come close to the accuracy of the original linear regression, whose predictions are stored in the dataset field ERP (estimated relative performance).
Mean Absolute Error: 24.330143540669855 Mean Squared Error: 1737.3349282296651 Root Mean Squared Error: 41.68134988492653
Mean Absolute Percentage Error (MAPE): 33.91 Accuracy: 66.09
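
The metrics printed above can be sketched as follows. This example compares ERP against PRP on only the five rows listed earlier, so its numbers differ from the full-dataset output; the helper name `error_metrics` is hypothetical, and reading "Accuracy" as 100 minus MAPE is an inference from the two printed values summing to 100.

```python
import math

# PRP (published) and ERP (estimated) for the five rows shown earlier.
prp = [198, 269, 220, 172, 132]
erp = [199, 253, 253, 253, 132]

def error_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for paired values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the actual value, so it is undefined if any actual is 0.
    mape = 100 * sum(abs(e) / a for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}

metrics = error_metrics(prp, erp)
# Assumption: "Accuracy" is 100 - MAPE, as the printed values suggest.
metrics["Accuracy"] = 100 - metrics["MAPE"]
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because RMSE squares each error before averaging, a single large miss (such as row 3 here) raises it far more than it raises MAE.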
Next, the ERP and PRP values are isolated before splitting:
| | MYCT | MMIN | MMAX | CACH | CHMIN | CHMAX |
---|---|---|---|---|---|---|
10 | 400 | 1000 | 3000 | 0 | 1 | 2 |
14 | 350 | 64 | 64 | 0 | 1 | 4 |
15 | 200 | 512 | 16000 | 0 | 4 | 32 |
17 | 143 | 512 | 5000 | 0 | 7 | 32 |
18 | 143 | 1000 | 2000 | 0 | 5 | 16 |
... | ... | ... | ... | ... | ... | ... |
202 | 180 | 262 | 4000 | 0 | 1 | 3 |
203 | 180 | 512 | 4000 | 0 | 1 | 3 |
204 | 124 | 1000 | 8000 | 0 | 1 | 8 |
206 | 125 | 2000 | 8000 | 0 | 2 | 14 |
208 | 480 | 1000 | 4000 | 0 | 0 | 0 |
69 rows × 6 columns
| | MYCT | MMIN | MMAX | CACH | CHMIN | CHMAX |
---|---|---|---|---|---|---|
122 | 1500 | 768 | 1000 | 0 | 0 | 0 |
123 | 1500 | 768 | 2000 | 0 | 0 | 0 |
124 | 800 | 768 | 2000 | 0 | 0 | 0 |
207 | 480 | 512 | 8000 | 32 | 0 | 0 |
208 | 480 | 1000 | 4000 | 0 | 0 | 0 |
| | MYCT | MMIN | MMAX | CACH | CHMIN | CHMAX | PRP |
---|---|---|---|---|---|---|---|
0 | 125 | 256 | 6000 | 256 | 16 | 128 | 198 |
1 | 29 | 8000 | 32000 | 32 | 8 | 32 | 269 |
2 | 29 | 8000 | 32000 | 32 | 8 | 32 | 220 |
3 | 29 | 8000 | 32000 | 32 | 8 | 32 | 172 |
4 | 29 | 8000 | 16000 | 32 | 8 | 16 | 132 |
Feature engineering was not successful, so I'll use the KFold method...
test_size=0.22, random_state=42
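
As a rough illustration of k-fold splitting, here is a dependency-free sketch. It is not the notebook's actual code, which is not shown; the function name `kfold_indices`, the choice of 5 folds, and reusing 42 as the shuffle seed are all assumptions, while 209 is the row count reported in the summary statistics.

```python
import random

def kfold_indices(n_samples, n_splits=5, seed=42):
    """Yield (train, test) index lists so each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps folds reproducible
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(kfold_indices(209))
for number, (train, test) in enumerate(folds, start=1):
    print(f"fold {number}: {len(train)} train rows, {len(test)} test rows")
```

A model would then be fit on each fold's train indices and scored on its test indices, with the per-fold scores averaged. Unlike a single hold-out split, this uses every row for evaluation once, which matters for a dataset of only 209 rows.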
| | MYCT | MMIN | MMAX | CACH | CHMIN | CHMAX |
---|---|---|---|---|---|---|
count | 209.000000 | 209.000000 | 209.000000 | 209.000000 | 209.000000 | 209.000000 |
mean | 203.822967 | 2867.980861 | 11796.153110 | 25.205742 | 4.698565 | 18.267943 |
std | 260.262926 | 3878.742758 | 11726.564377 | 40.628722 | 6.816274 | 25.997318 |
min | 17.000000 | 64.000000 | 64.000000 | 0.000000 | 0.000000 | 0.000000 |
25% | 50.000000 | 768.000000 | 4000.000000 | 0.000000 | 1.000000 | 5.000000 |
50% | 110.000000 | 2000.000000 | 8000.000000 | 8.000000 | 2.000000 | 8.000000 |
75% | 225.000000 | 4000.000000 | 16000.000000 | 32.000000 | 6.000000 | 24.000000 |
max | 1500.000000 | 32000.000000 | 64000.000000 | 256.000000 | 52.000000 | 176.000000 |