Skip to content

Model Selection

Model selection is central to TorchKM. In a standard workflow, a kernel machine is trained repeatedly over a grid of tuning parameters and cross-validation folds. This can be expensive because each fit may require solving a large kernel system.

TorchKM changes this workflow by integrating training and tuning into the solver.

Standard workflow

In a standard scikit-learn workflow, users might combine an estimator with GridSearchCV:

# Conceptual standard workflow
# for C in grid:
#     for fold in folds:
#         fit a separate model

This is easy to use, but it can require many repeated kernel solves.

TorchKM workflow

In TorchKM, users pass a sequence of candidate regularization values to the estimator:

Cs = np.logspace(2, -2, num=4)

clf = TorchKMSVC(
    kernel="rbf",
    Cs=Cs,
    nC=len(Cs),
    cv=5,
    device=device,
)
clf.fit(Xtr, ytr)

After fitting, the selected value is available as:

clf.best_C_

The cross-validation scores are available as:

clf.cv_mis_

Parameters

Parameter Meaning
Cs Candidate regularization values under the scikit-learn/LIBSVM convention
nC Number of candidate regularization values
cv Number of cross-validation folds
foldid Optional user-specified fold assignments
random_state Random seed for deterministic fold construction
device "cpu", "cuda", or None for automatic selection

Notes

  • Larger nC gives a finer regularization grid but increases computation.
  • Larger cv can give a more stable estimate of predictive performance but also increases work.
  • The selected parameter depends on the fold assignment and the candidate grid.
  • For small examples and tests, use a short grid and a small number of folds.