Feature Selectors (tm.fs)
The tablemage.fs module contains the feature selectors used by the
tablemage.Analyzer.regress() and tablemage.Analyzer.classify() methods.
tm.fs.KBestFSR
- class tablemage.fs.KBestFSR(scorer: Literal['f_regression', 'r_regression', 'mutual_info_regression'], k: int, name: str | None = None)
Selects the k best features based on the f_regression, r_regression, or mutual_info_regression score.
- __init__(scorer: Literal['f_regression', 'r_regression', 'mutual_info_regression'], k: int, name: str | None = None)
Constructs a KBestFSR.
- Parameters:
scorer (Literal['f_regression', 'r_regression', 'mutual_info_regression']) – The score used to rank features.
k (int) – Number of features to select. Must be less than the number of predictors.
name (str | None) – Default: None. If None, the class name is used.
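As a rough illustration of what k-best selection does, the sketch below calls scikit-learn's SelectKBest with f_regression directly. That tablemage builds on these scikit-learn functions is an assumption based on the matching scorer names; the synthetic data and variable names are made up for the example.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))  # six candidate predictors (synthetic)
# Only columns 0 and 3 influence the target.
y = 4.0 * X[:, 0] - 3.0 * X[:, 3] + rng.normal(scale=0.1, size=200)

# Score every column against y and keep the k=2 highest-scoring ones.
selector = SelectKBest(score_func=f_regression, k=2).fit(X, y)
selected = np.flatnonzero(selector.get_support())  # indices of kept columns
```

Here the two informative columns score far above the four noise columns, so they are the ones retained.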
tm.fs.LassoFSR
- class tablemage.fs.LassoFSR(max_n_features: int, alpha: float | None = None, name: str | None = None)
Selects at most max_n_features features via Lasso regression model-inherent feature selection.
- __init__(max_n_features: int, alpha: float | None = None, name: str | None = None)
Constructs a LassoFSR.
- Parameters:
max_n_features (int) – Maximum number of features to select. Must be less than the number of predictors.
alpha (float | None) – Default: None. Regularization term weight. If None, then alpha is selected via five-fold cross validation from a default grid of candidate alphas.
name (str | None) – Default: None. If None, a default name is used.
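The idea of model-inherent selection with a cap can be sketched with scikit-learn's LassoCV: fit with five-fold cross validation, keep the predictors whose coefficients are nonzero, and truncate to the largest max_n_features by absolute coefficient. This is a minimal sketch under the assumption that the selector works roughly this way; the ranking rule, data, and variable names are illustrative, not taken from tablemage.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))  # eight candidate predictors (synthetic)
y = (5.0 * X[:, 1] + 2.0 * X[:, 4] - 1.0 * X[:, 6]
     + rng.normal(scale=0.1, size=200))

max_n_features = 2  # hypothetical cap for this example

# alpha is chosen by five-fold cross validation over LassoCV's default grid.
model = LassoCV(cv=5).fit(X, y)

# Keep predictors with nonzero coefficients, largest magnitude first.
nonzero = np.flatnonzero(model.coef_)
ranked = nonzero[np.argsort(-np.abs(model.coef_[nonzero]))]
selected = np.sort(ranked[:max_n_features])
```

Because the L1 penalty can zero out coefficients, fewer than max_n_features predictors may survive, which is why the selection is "at most" that many.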
tm.fs.BorutaFSR
- class tablemage.fs.BorutaFSR(estimator: Literal['random_forest', 'xgboost'] = 'random_forest', n_estimators: int = 100, max_depth: int = 5, model_random_state: int = 42, n_jobs: int = -1, name: str | None = None)
- __init__(estimator: Literal['random_forest', 'xgboost'] = 'random_forest', n_estimators: int = 100, max_depth: int = 5, model_random_state: int = 42, n_jobs: int = -1, name: str | None = None)
Constructs a BorutaFSR.
- Parameters:
estimator (Literal["random_forest", "xgboost"]) – Default: “random_forest”. The estimator to use for Boruta. Default hyperparameters are used for the estimator.
n_estimators (int) – Default: 100. The number of estimators to use for Boruta.
max_depth (int) – Default: 5. The maximum depth of the trees in the ensemble.
model_random_state (int) – Default: 42. The random state to use for the estimator.
n_jobs (int) – Default: -1. The number of jobs to run in parallel.
name (str | None) – Default: None. If None, a default name is used.
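For readers unfamiliar with Boruta, the sketch below shows a single round of its core idea using scikit-learn only: append shuffled "shadow" copies of every predictor, fit a tree ensemble, and count a real predictor as a hit when its importance exceeds the best shadow importance. The full algorithm repeats such rounds and applies a statistical test to the hit counts. This is not tablemage's implementation; the data and names are assumptions, and only the forest settings mirror the documented defaults.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))  # five candidate predictors (synthetic)
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] + rng.normal(scale=0.1, size=300)

# Shadow features: each column shuffled on its own, which breaks any link to y.
X_shadow = np.column_stack(
    [rng.permutation(X[:, j]) for j in range(X.shape[1])]
)

# Settings mirror the documented defaults for n_estimators, max_depth,
# and model_random_state.
forest = RandomForestRegressor(n_estimators=100, max_depth=5, random_state=42)
forest.fit(np.hstack([X, X_shadow]), y)

n_real = X.shape[1]
real_importance = forest.feature_importances_[:n_real]
shadow_max = forest.feature_importances_[n_real:].max()

# A real predictor scores a "hit" if it beats the strongest shadow feature.
hits = np.flatnonzero(real_importance > shadow_max)
```

A noise column can beat the shadows by chance in any one round, which is why Boruta aggregates hits over many rounds before confirming or rejecting a predictor.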
tm.fs.KBestFSC
- class tablemage.fs.KBestFSC(scorer: Literal['f_classif', 'mutual_info_classif', 'chi2'], k: int, name: str | None = None)
Selects the k best features based on the f_classif, mutual_info_classif, or chi2 score.
- __init__(scorer: Literal['f_classif', 'mutual_info_classif', 'chi2'], k: int, name: str | None = None)
Constructs a KBestFSC.
- Parameters:
scorer (Literal['f_classif', 'mutual_info_classif', 'chi2']) – The score used to rank features.
k (int) – Number of features to select. Must be less than the number of predictors.
name (str | None) – Default: None. If None, a default name is used.
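The classification scorers share their names with functions in scikit-learn's feature_selection module, so a plain scikit-learn sketch may help show what the chi2 option computes. Whether tablemage delegates to these functions is an assumption. Note that scikit-learn's chi2 requires non-negative feature values, which the iris measurements satisfy.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

# Four non-negative measurements per flower, three classes.
X, y = load_iris(return_X_y=True)

# Rank the four predictors by their chi-squared statistic against the class
# label and keep the top two.
selector = SelectKBest(score_func=chi2, k=2).fit(X, y)
X_selected = selector.transform(X)
scores = selector.scores_  # one chi-squared statistic per predictor
```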
tm.fs.LassoFSC
- class tablemage.fs.LassoFSC(max_n_features: int, c: float | None = None, name: str | None = None)
Selects at most max_n_features features via Lasso (L1-penalized) model-inherent feature selection.
- __init__(max_n_features: int, c: float | None = None, name: str | None = None)
Constructs a LassoFSC.
- Parameters:
max_n_features (int) – Maximum number of features to select. Must be less than the number of predictors.
c (float | None) – Default: None. Inverse of regularization strength. If None, then c is selected via five-fold cross validation from a grid of 10 candidate values, on a log scale from 1e-4 to 1e4.
name (str | None) – Default: None. If None, a default name is used.
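To make the default search grid for c concrete, the following sketch computes 10 values evenly spaced in the exponent between 1e-4 and 1e4. It assumes that "on a log scale" means equal steps in log10, as in scikit-learn's LogisticRegressionCV; the variable names are illustrative.

```python
n_candidates = 10
low_exp, high_exp = -4.0, 4.0  # exponents of the grid endpoints

# Equal steps in log10 space: 8 / 9 of a decade between neighbours.
step = (high_exp - low_exp) / (n_candidates - 1)
c_grid = [10 ** (low_exp + i * step) for i in range(n_candidates)]

for c in c_grid:
    print(f"{c:.4g}")
```

Smaller values of c mean stronger regularization, so the low end of the grid drives more coefficients to zero and keeps fewer predictors.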
tm.fs.BorutaFSC
- class tablemage.fs.BorutaFSC(estimator: Literal['random_forest', 'xgboost'] = 'random_forest', n_estimators: int = 100, max_depth: int = 5, model_random_state: int = 42, name: str | None = None)
- __init__(estimator: Literal['random_forest', 'xgboost'] = 'random_forest', n_estimators: int = 100, max_depth: int = 5, model_random_state: int = 42, name: str | None = None)
Constructs a BorutaFSC.
- Parameters:
estimator (Literal["random_forest", "xgboost"]) – Default: “random_forest”. The estimator to use for Boruta. Default hyperparameters are used for the estimator.
n_estimators (int) – Default: 100. The number of estimators to use for Boruta’s estimator.
max_depth (int) – Default: 5. The maximum depth of the trees in the ensemble.
model_random_state (int) – Default: 42. The random state to use for the estimator.
name (str | None) – Default: None. If None, a default name is used.