Model Selection
Model selection is central to TorchKM. In a standard workflow, a kernel machine is trained repeatedly over a grid of tuning parameters and cross-validation folds. This can be expensive because each fit may require solving a large kernel system.
TorchKM changes this workflow by integrating training and tuning into the solver.
Standard workflow
In a standard scikit-learn workflow, users might combine an estimator with GridSearchCV:
# Conceptual standard workflow
# for C in grid:
#     for fold in folds:
#         fit a separate model
This is easy to use, but it requires one fit for every combination of candidate value and fold, and each fit can involve its own kernel solve.
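As a rough illustration of that cost, the sketch below runs the conventional search with scikit-learn's `SVC` and `GridSearchCV` on a small synthetic dataset and counts the fits. The dataset, the grid, and the fold count are assumptions chosen only for this example; they are not part of TorchKM.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Small synthetic binary problem, used only for illustration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Four candidate values of C, from 100 down to 0.01.
param_grid = {"C": np.logspace(2, -2, num=4)}

search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

# One fit per (C, fold) pair, plus the final refit on the full data
# that GridSearchCV performs by default (refit=True).
n_fits = len(param_grid["C"]) * search.n_splits_ + 1
print("selected C:", search.best_params_["C"])
print("total fits:", n_fits)
```

With 4 candidates and 5 folds this already amounts to 21 separate fits, and the count grows with the product of grid size and fold count.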
TorchKM workflow
In TorchKM, users pass a sequence of candidate regularization values to the estimator:
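import numpy as np

# Assumes TorchKMSVC is already imported and that Xtr, ytr, and device are defined.
Cs = np.logspace(2, -2, num=4)  # four candidates from 100 down to 0.01
clf = TorchKMSVC(
    kernel="rbf",
    Cs=Cs,
    nC=len(Cs),
    cv=5,
    device=device,
)
clf.fit(Xtr, ytr)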
After fitting, the selected value is available as:
clf.best_C_
The cross-validation scores are available as:
clf.cv_mis_
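One way to read these results is to line up each candidate value with its cross-validation error. The sketch below assumes that `cv_mis_` holds one misclassification rate per entry of `Cs`, in the same order; that layout is an assumption, not something confirmed here. The `cv_mis` array is a hypothetical stand-in with made-up numbers, so the snippet runs without a fitted model.

```python
import numpy as np

Cs = np.logspace(2, -2, num=4)

# Hypothetical stand-in for clf.cv_mis_: one cross-validated
# misclassification rate per candidate C (made-up numbers).
cv_mis = np.array([0.12, 0.08, 0.15, 0.31])

# The candidate with the lowest cross-validation error.
best_idx = int(np.argmin(cv_mis))
best_C = Cs[best_idx]

for i, (C, err) in enumerate(zip(Cs, cv_mis)):
    marker = "  <- lowest error" if i == best_idx else ""
    print(f"C={C:10.4f}  cv error={err:.2f}{marker}")
```

With a fitted estimator, `cv_mis` would be replaced by `clf.cv_mis_`, and the result could be compared against `clf.best_C_`.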
Parameters
| Parameter | Meaning |
|---|---|
| `Cs` | Candidate regularization values under the scikit-learn/LIBSVM convention |
| `nC` | Number of candidate regularization values |
| `cv` | Number of cross-validation folds |
| `foldid` | Optional user-specified fold assignments |
| `random_state` | Random seed for deterministic fold construction |
| `device` | `"cpu"`, `"cuda"`, or `None` for automatic selection |
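For user-specified folds, a balanced and reproducible assignment can be built with NumPy alone. The sketch below assumes `foldid` is an integer vector with one fold label per training sample; whether TorchKM expects labels starting at 0 or at 1 is not stated here, so the 0-based labels are an assumption to check against the API reference.

```python
import numpy as np

n_samples, n_folds = 103, 5
rng = np.random.default_rng(0)  # fixed seed for a reproducible assignment

# Cycle through the labels 0..n_folds-1 so fold sizes differ by at most one,
# then shuffle so fold membership does not follow the row order.
foldid = rng.permutation(np.arange(n_samples) % n_folds)

counts = np.bincount(foldid, minlength=n_folds)
print("fold sizes:", counts)
```

The resulting vector would then be passed as `foldid=foldid`, so that repeated runs evaluate every candidate on identical splits.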
Notes
- Larger `nC` gives a finer regularization grid but increases computation.
- Larger `cv` can give a more stable estimate of predictive performance but also increases work.
- The selected parameter depends on the fold assignment and the candidate grid.
- For small examples and tests, use a short grid and a small number of folds.
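
The dependence on the fold assignment can be checked empirically by repeating the search with different shuffles. The sketch below uses scikit-learn's `SVC`, `GridSearchCV`, and `StratifiedKFold` as a stand-in rather than TorchKM itself; the noisy synthetic dataset, the short grid, and the three folds are assumptions picked to keep the run small.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Noisy synthetic data (flip_y adds label noise), used only for illustration.
X, y = make_classification(
    n_samples=150, n_features=10, flip_y=0.2, random_state=0
)
grid = {"C": np.logspace(2, -2, num=4)}

selected = []
for seed in range(3):
    # Same data and grid each time; only the fold assignment changes.
    folds = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    search = GridSearchCV(SVC(kernel="rbf"), grid, cv=folds)
    search.fit(X, y)
    selected.append(search.best_params_["C"])

print("selected C per fold assignment:", selected)
```

The selected values may or may not agree across seeds; when they differ, fixing `random_state` or supplying explicit fold assignments keeps runs comparable.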