Title: Learning with Subset Stacking
Description: "Learning with Subset Stacking" (LESS) is a supervised learning algorithm that trains many local estimators on subsets of a given dataset and then passes their predictions to a global estimator. Details about LESS are available in our manuscript at <arXiv:2112.06251>.
Authors: Ilker Birbil [aut], Burhan Ozer Cavdar [aut, trl, cre]
Maintainer: Burhan Ozer Cavdar <[email protected]>
License: MIT + file LICENSE
Version: 0.1.0
Built: 2025-02-06 04:11:55 UTC
Source: https://github.com/cran/less
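A minimal end-to-end sketch (not part of the package manual itself; it assumes the train_test_split helper and the LESSRegressor and abalone objects documented below):

library(less)
data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3, random_state = 2022)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
lessRegressor <- LESSRegressor$new(random_state = 2022)
lessRegressor$fit(X_train, y_train)
preds <- lessRegressor$predict(X_test)
print(mean((y_test - preds)^2)) # test-set mean squared error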
abalone

Abalone dataset. The number of rings is the value to predict.

A data frame with 4177 rows and 8 variables (variable names follow the cited UCI documentation):

Length: longest shell measurement in mm
Diameter: measurement perpendicular to length in mm
Height: height with meat in shell in mm
Whole weight: weight of the whole abalone in grams
Shucked weight: weight of the meat in grams
Viscera weight: gut weight (after bleeding) in grams
Shell weight: shell weight after being dried in grams
Rings: +1.5 gives the age in years

Source: Dua, D. and Graff, C. (2019). UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/. Irvine, CA: University of California, School of Information and Computer Science.

data(abalone)
A dummy base R6 class that provides get_all_fields, get_attributes and set_random_state functionalities for estimators
R6 Class of BaseEstimator
get_all_fields()
Auxiliary function returning the names of all private and public fields of the class
BaseEstimator$get_all_fields()
TestClass <- R6::R6Class(classname = "TestClass", inherit = BaseEstimator, private = list(random_state = NULL))
exampleClass <- TestClass$new()
exampleClass$get_all_fields()
get_attributes()
Auxiliary function returning the names and values of all private and public fields of the class
BaseEstimator$get_attributes()
exampleClass$get_attributes()
set_random_state()
Auxiliary function that sets the random_state attribute of the class
BaseEstimator$set_random_state(random_state)
random_state
seed number to be set as random state
self
exampleClass$set_random_state(2022)
clone()
The objects of this class are cloneable with this method.
BaseEstimator$clone(deep = FALSE)
deep
Whether to make a deep clone.
Checks if the given estimator is fitted
check_is_fitted(estimator)
estimator: An estimator with an is_fitted attribute
Returns TRUE if the estimator is fitted, FALSE if not.
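A brief usage sketch (assuming the LinearRegression estimator documented later in this manual):

lr <- LinearRegression$new()
check_is_fitted(lr) # FALSE: the estimator has not been fitted yet
X <- matrix(rnorm(40), ncol = 4)
y <- rnorm(10)
lr$fit(X, y)
check_is_fitted(lr) # TRUE after a successful fit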
Plots the fitted functions obtained with various regressors (using their default values) on the one-dimensional dataset (X, y).
comparison_plot(X, y, model_list)
X: Predictors
y: Response variables
model_list: List of models to be compared
sine_data_list <- less::synthetic_sine_curve()
X_sine <- sine_data_list[[1]]
y_sine <- sine_data_list[[2]]
model_list <- c(DecisionTreeRegressor$new(), LinearRegression$new(), KNeighborsRegressor$new())
comparison_plot(X_sine, y_sine, model_list)
Wrapper R6 Class of FNN::get.knnx function that can be used for LESSRegressor and LESSClassifier
The cover tree is an O(n) space data structure that allows us to answer queries in the same O(log n) time as a kd-tree, given a fixed intrinsic dimensionality. Templated code from https://hunch.net/~jl/projects/cover_tree/cover_tree.html is used.
R6 Class of CoverTree
new()
Creates a new instance of R6 Class of CoverTree
CoverTree$new(X = NULL)
X
An M x d data.frame or matrix, where each of the M rows is a point or a (column) vector (where d=1).
data(abalone)
ct <- CoverTree$new(abalone[1:100,])
query()
Finds the k nearest neighbours for each point in an input/output dataset.
CoverTree$query(query_X = private$X, k = 1)
query_X
A set of N x d points that will be queried against X. d, the number of columns, must be the same as in X. If missing, defaults to X.
k
The maximum number of nearest neighbours to compute (defaults to 1).
A list of length 2 with elements:
nn.idx: An N x k integer matrix returning the near neighbour indices.
nn.dists: An N x k matrix returning the near neighbour Euclidean distances.
res <- ct$query(abalone[1:3,], k=2)
print(res)
clone()
The objects of this class are cloneable with this method.
CoverTree$clone(deep = FALSE)
deep
Whether to make a deep clone.
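Since CoverTree exposes the same interface as KDTree, it can be passed to LESS through the tree_method parameter; a sketch, assuming the tree_method argument documented under LESSRegressor$new below:

data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
lessRegressor <- LESSRegressor$new(tree_method = function(X) CoverTree$new(X))
lessRegressor$fit(split_list[[1]], split_list[[3]])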
Wrapper R6 Class of rpart::rpart function that can be used for LESSRegressor and LESSClassifier
R6 Class of DecisionTreeClassifier
less::BaseEstimator
-> less::SklearnEstimator
-> DecisionTreeClassifier
new()
Creates a new instance of R6 Class of DecisionTreeClassifier
DecisionTreeClassifier$new(
  min_samples_split = 2,
  min_samples_leaf = 1,
  cp = 0.001,
  xval = 10,
  surrogate_style = 0,
  max_depth = 30
)
min_samples_split
The minimum number of observations that must exist in a node in order for a split to be attempted (defaults to 2).
min_samples_leaf
The minimum number of observations in any terminal (leaf) node (defaults to 1).
cp
Complexity Parameter. Any split that does not decrease the overall lack of fit by a factor of cp is not attempted. This means that the overall R-squared must increase by cp at each step. The main role of this parameter is to save computing time by pruning off splits that are obviously not worthwhile. (defaults to 0.001)
xval
Number of cross-validations (defaults to 10)
surrogate_style
Controls the selection of a best surrogate. If set to 0 (default) the program uses the total number of correct classification for a potential surrogate variable, if set to 1 it uses the percent correct, calculated over the non-missing values of the surrogate. The first option more severely penalizes covariates with a large number of missing values.
max_depth
The maximum depth of any node of the final tree, with the root node counted as depth 0. Values greater than 30 will give nonsense results on 32-bit machines.
dt <- DecisionTreeClassifier$new()
dt <- DecisionTreeClassifier$new(min_samples_split = 10)
dt <- DecisionTreeClassifier$new(min_samples_leaf = 6, cp = 0.01)
fit()
Builds a decision tree classifier from the training set (X, y).
DecisionTreeClassifier$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes labels
Fitted R6 Class of DecisionTreeClassifier
data(iris)
split_list <- train_test_split(iris, test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
dt <- DecisionTreeClassifier$new()
dt$fit(X_train, y_train)
predict()
Predict class labels for X0.
DecisionTreeClassifier$predict(X0)
X0
2D matrix or dataframe that includes predictors
Factor of the predicted classes.
dt <- DecisionTreeClassifier$new()
dt$fit(X_train, y_train)
preds <- dt$predict(X_test)
dt <- DecisionTreeClassifier$new()
preds <- dt$fit(X_train, y_train)$predict(X_test)
preds <- DecisionTreeClassifier$new()$fit(X_train, y_train)$predict(X_test)
print(caret::confusionMatrix(data=preds, reference = factor(y_test)))
get_estimator_type()
Auxiliary function returning the estimator type, e.g., 'regressor' or 'classifier'
DecisionTreeClassifier$get_estimator_type()
dt$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
DecisionTreeClassifier$clone(deep = FALSE)
deep
Whether to make a deep clone.
Wrapper R6 Class of rpart::rpart function that can be used for LESSRegressor and LESSClassifier
R6 Class of DecisionTreeRegressor
less::BaseEstimator
-> less::SklearnEstimator
-> DecisionTreeRegressor
new()
Creates a new instance of R6 Class of DecisionTreeRegressor
DecisionTreeRegressor$new(
  min_samples_split = 2,
  min_samples_leaf = 1,
  cp = 0.001,
  xval = 10,
  surrogate_style = 0,
  max_depth = 30
)
min_samples_split
The minimum number of observations that must exist in a node in order for a split to be attempted (defaults to 2).
min_samples_leaf
The minimum number of observations in any terminal (leaf) node (defaults to 1).
cp
Complexity Parameter. Any split that does not decrease the overall lack of fit by a factor of cp is not attempted. This means that the overall R-squared must increase by cp at each step. The main role of this parameter is to save computing time by pruning off splits that are obviously not worthwhile. (defaults to 0.001)
xval
Number of cross-validations (defaults to 10)
surrogate_style
Controls the selection of a best surrogate. If set to 0 (default) the program uses the total number of correct classification for a potential surrogate variable, if set to 1 it uses the percent correct, calculated over the non-missing values of the surrogate. The first option more severely penalizes covariates with a large number of missing values.
max_depth
The maximum depth of any node of the final tree, with the root node counted as depth 0. Values greater than 30 will give nonsense results on 32-bit machines.
dt <- DecisionTreeRegressor$new()
dt <- DecisionTreeRegressor$new(min_samples_split = 10)
dt <- DecisionTreeRegressor$new(min_samples_leaf = 6, cp = 0.01)
fit()
Builds a decision tree regressor from the training set (X, y).
DecisionTreeRegressor$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes response variables
Fitted R6 Class of DecisionTreeRegressor
data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
dt <- DecisionTreeRegressor$new()
dt$fit(X_train, y_train)
predict()
Predict regression value for X0.
DecisionTreeRegressor$predict(X0)
X0
2D matrix or dataframe that includes predictors
The predicted values.
dt <- DecisionTreeRegressor$new()
dt$fit(X_train, y_train)
preds <- dt$predict(X_test)
dt <- DecisionTreeRegressor$new()
preds <- dt$fit(X_train, y_train)$predict(X_test)
preds <- DecisionTreeRegressor$new()$fit(X_train, y_train)$predict(X_test)
print(head(matrix(c(y_test, preds), ncol = 2, dimnames = (list(NULL, c("True", "Prediction"))))))
get_estimator_type()
Auxiliary function returning the estimator type, e.g., 'regressor' or 'classifier'
DecisionTreeRegressor$get_estimator_type()
dt$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
DecisionTreeRegressor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Prints the available regressors, clustering methods, tree functions and helper functions within LESS package.
get_functions()
Wrapper R6 Class of stats::hclust function that can be used for LESSRegressor and LESSClassifier
R6 Class of HierarchicalClustering
less::BaseEstimator
-> HierarchicalClustering
new()
Creates a new instance of R6 Class of HierarchicalClustering
HierarchicalClustering$new(linkage = "ward.D2", n_clusters = 8)
linkage
the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC) (defaults to ward.D2).
n_clusters
the number of clusters (defaults to 8).
hc <- HierarchicalClustering$new()
hc <- HierarchicalClustering$new(n_clusters = 10)
hc <- HierarchicalClustering$new(n_clusters = 10, linkage = "complete")
fit()
Perform hierarchical clustering on a data matrix.
HierarchicalClustering$fit(X)
X
numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns).
Fitted R6 class of HierarchicalClustering() that has 'labels' attribute
data(abalone)
hc <- HierarchicalClustering$new()
hc$fit(abalone[1:100,])
get_cluster_centers()
Auxiliary function returning the cluster centers
HierarchicalClustering$get_cluster_centers()
print(hc$get_cluster_centers())
get_labels()
Auxiliary function returning a vector of integers (from 1:k) indicating the cluster to which each point is allocated.
HierarchicalClustering$get_labels()
print(hc$get_labels())
clone()
The objects of this class are cloneable with this method.
HierarchicalClustering$clone(deep = FALSE)
deep
Whether to make a deep clone.
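As the headline above notes, HierarchicalClustering is meant to be plugged into LESS; a sketch, assuming the cluster_method argument documented under LESSRegressor$new below:

data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
lessRegressor <- LESSRegressor$new(cluster_method = HierarchicalClustering$new(n_clusters = 5))
lessRegressor$fit(split_list[[1]], split_list[[3]])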
Applies k-Fold cross validation to the given model on the given data
k_fold_cv(data = NULL, model = NULL, random_state = NULL, k = 5, y_index = ncol(data))
data: The dataset to be used
model: A classification or a regression model (from the LESS package)
random_state: A seed number to get reproducible results
k: Number of splits on the training set (defaults to 5)
y_index: Column index of the response variable in the given data. Default is the last column.
A vector consisting of the metric for each individual fold and the average metric over the folds
k_fold_cv(data = iris, model = KNeighborsClassifier$new(), k = 3)
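The same helper works for regression; a sketch, assuming the abalone data and LESSRegressor documented elsewhere in this manual:

data(abalone)
k_fold_cv(data = abalone[1:100,], model = LESSRegressor$new(), random_state = 2022, k = 5)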
Wrapper R6 Class of RANN::nn2 function that can be used for LESSRegressor and LESSClassifier
R6 Class of KDTree
new()
Creates a new instance of R6 Class of KDTree
KDTree$new(X = NULL)
X
An M x d data.frame or matrix, where each of the M rows is a point or a (column) vector (where d=1).
data(abalone)
kdt <- KDTree$new(abalone[1:100,])
query()
Finds the k nearest neighbours for each point in an input/output dataset. The advantage of the kd-tree is that it runs in O(M log M) time.
KDTree$query(query_X = private$X, k = 1)
query_X
A set of N x d points that will be queried against X. d, the number of columns, must be the same as in X. If missing, defaults to X.
k
The maximum number of nearest neighbours to compute (defaults to 1).
A list of length 2 with elements:
nn.idx: An N x k integer matrix returning the near neighbour indices.
nn.dists: An N x k matrix returning the near neighbour Euclidean distances.
res <- kdt$query(abalone[1:3,], k=2)
print(res)
clone()
The objects of this class are cloneable with this method.
KDTree$clone(deep = FALSE)
deep
Whether to make a deep clone.
Wrapper R6 Class of stats::kmeans function that can be used for LESSRegressor and LESSClassifier
R6 Class of KMeans
less::BaseEstimator
-> KMeans
new()
Creates a new instance of R6 Class of KMeans
KMeans$new(n_clusters = 8, n_init = 10, max_iter = 300, random_state = NULL)
n_clusters
the number of clusters. A random set of (distinct) rows in X is chosen as the initial centres (defaults to 8)
n_init
the number of random initial centre sets to choose (defaults to 10)
max_iter
the maximum number of iterations allowed (defaults to 300)
random_state
seed number to be used for fixing the randomness (defaults to NULL)
km <- KMeans$new()
km <- KMeans$new(n_clusters = 10)
km <- KMeans$new(n_clusters = 10, random_state = 100)
fit()
Perform k-means clustering on a data matrix.
KMeans$fit(X)
X
numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns).
Fitted R6 class of KMeans() that has 'cluster_centers' and 'labels' attributes
data(abalone)
km <- KMeans$new()
km$fit(abalone[1:100,])
get_cluster_centers()
Auxiliary function returning the cluster centers
KMeans$get_cluster_centers()
print(km$get_cluster_centers())
get_labels()
Auxiliary function returning a vector of integers (from 1:k) indicating the cluster to which each point is allocated.
KMeans$get_labels()
print(km$get_labels())
clone()
The objects of this class are cloneable with this method.
KMeans$clone(deep = FALSE)
deep
Whether to make a deep clone.
Wrapper R6 Class of caret::knnreg function that can be used for LESSRegressor and LESSClassifier
R6 Class of KNeighborsClassifier
less::BaseEstimator
-> less::SklearnEstimator
-> KNeighborsClassifier
new()
Creates a new instance of R6 Class of KNeighborsClassifier
KNeighborsClassifier$new(k = 5)
k
Number of neighbors considered (defaults to 5).
knc <- KNeighborsClassifier$new()
knc <- KNeighborsClassifier$new(k = 5)
fit()
Fit the k-nearest neighbors classifier from the training set (X, y).
KNeighborsClassifier$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes labels
Fitted R6 Class of KNeighborsClassifier
data(iris)
split_list <- train_test_split(iris, test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
knc <- KNeighborsClassifier$new()
knc$fit(X_train, y_train)
predict()
Predict class labels for X0.
KNeighborsClassifier$predict(X0)
X0
2D matrix or dataframe that includes predictors
Factor of the predicted classes.
knc <- KNeighborsClassifier$new()
knc$fit(X_train, y_train)
preds <- knc$predict(X_test)
knc <- KNeighborsClassifier$new()
preds <- knc$fit(X_train, y_train)$predict(X_test)
preds <- KNeighborsClassifier$new()$fit(X_train, y_train)$predict(X_test)
print(caret::confusionMatrix(data=factor(preds), reference = factor(y_test)))
get_estimator_type()
Auxiliary function returning the estimator type, e.g., 'regressor' or 'classifier'
KNeighborsClassifier$get_estimator_type()
knc$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
KNeighborsClassifier$clone(deep = FALSE)
deep
Whether to make a deep clone.
Wrapper R6 Class of caret::knnreg function that can be used for LESSRegressor and LESSClassifier
R6 Class of KNeighborsRegressor
less::BaseEstimator
-> less::SklearnEstimator
-> KNeighborsRegressor
new()
Creates a new instance of R6 Class of KNeighborsRegressor
KNeighborsRegressor$new(k = 5)
k
Number of neighbors considered (defaults to 5).
knr <- KNeighborsRegressor$new()
knr <- KNeighborsRegressor$new(k = 5)
fit()
Fit the k-nearest neighbors regressor from the training set (X, y).
KNeighborsRegressor$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes response variables
Fitted R6 Class of KNeighborsRegressor
data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
knr <- KNeighborsRegressor$new()
knr$fit(X_train, y_train)
predict()
Predict regression value for X0.
KNeighborsRegressor$predict(X0)
X0
2D matrix or dataframe that includes predictors
The predicted values.
knr <- KNeighborsRegressor$new()
knr$fit(X_train, y_train)
preds <- knr$predict(X_test)
knr <- KNeighborsRegressor$new()
preds <- knr$fit(X_train, y_train)$predict(X_test)
preds <- KNeighborsRegressor$new()$fit(X_train, y_train)$predict(X_test)
print(head(matrix(c(y_test, preds), ncol = 2, dimnames = (list(NULL, c("True", "Prediction"))))))
get_estimator_type()
Auxiliary function returning the estimator type, e.g., 'regressor' or 'classifier'
KNeighborsRegressor$get_estimator_type()
knr$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
KNeighborsRegressor$clone(deep = FALSE)
deep
Whether to make a deep clone.
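As with the other wrappers, KNeighborsRegressor can act as the local estimator inside LESS; a sketch, assuming the local_estimator argument documented under LESSRegressor$new below:

data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
lessRegressor <- LESSRegressor$new(local_estimator = KNeighborsRegressor$new(k = 3))
lessRegressor$fit(split_list[[1]], split_list[[3]])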
An alternative distance function that can be used in LESS.
laplacian(data, center, coeff = 0.01)
data: Data that includes points in shape of (M x d)
center: A constant point in shape of (1 x d)
coeff: Coefficient value for the Laplacian kernel
A numeric vector containing the Laplacian kernel distance between each point in data and center.
data <- matrix(1:12, nrow=3)
center <- c(2, 7, 1, 3)
distances <- laplacian(data, center)
print(distances)
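Because laplacian has the df(data, center) signature that LESS expects for distance functions, it can replace the default RBF kernel; a sketch (the coeff value 0.05 is an arbitrary illustration):

df <- function(data, center) laplacian(data, center, coeff = 0.05)
lessRegressor <- LESSRegressor$new(distance_function = df)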
The base class for LESSRegressor and LESSClassifier
R6 class of LESSBase
less::BaseEstimator
-> less::SklearnEstimator
-> LESSBase
new()
Creates a new instance of R6 Class of LESSBase
LESSBase$new(replications = NULL, scobject = NULL, isFitted = FALSE)
replications
List to store the replications
scobject
Scaling object used for normalization (less::StandardScaler)
isFitted
Flag to check whether LESS is fitted
set_random_state()
Auxiliary function that sets the random_state attribute of the class
LESSBase$set_random_state(random_state)
random_state
seed number to be set as random state
self
get_n_subsets()
Auxiliary function returning the number of subsets
LESSBase$get_n_subsets()
get_n_neighbors()
Auxiliary function returning the number of neighbors
LESSBase$get_n_neighbors()
get_frac()
Auxiliary function returning the percentage of samples used to set the number of neighbors
LESSBase$get_frac()
get_n_replications()
Auxiliary function returning the number of replications
LESSBase$get_n_replications()
get_d_normalize()
Auxiliary function returning the flag for normalization
LESSBase$get_d_normalize()
get_scaling()
Auxiliary function returning the flag for scaling
LESSBase$get_scaling()
get_val_size()
Auxiliary function returning the validation set size
LESSBase$get_val_size()
get_random_state()
Auxiliary function returning the random seed
LESSBase$get_random_state()
get_isFitted()
Auxiliary function returning the isFitted flag
LESSBase$get_isFitted()
get_replications()
Auxiliary function returning the replications
LESSBase$get_replications()
clone()
The objects of this class are cloneable with this method.
LESSBase$clone(deep = FALSE)
deep
Whether to make a deep clone.
Auxiliary binary classifier for Learning with Subset Stacking (LESS)
R6 class of LESSBinaryClassifier
less::BaseEstimator
-> less::SklearnEstimator
-> less::LESSBase
-> LESSBinaryClassifier
less::BaseEstimator$get_all_fields()
less::BaseEstimator$get_attributes()
less::SklearnEstimator$get_type()
less::SklearnEstimator$predict()
less::LESSBase$get_d_normalize()
less::LESSBase$get_frac()
less::LESSBase$get_isFitted()
less::LESSBase$get_n_neighbors()
less::LESSBase$get_n_replications()
less::LESSBase$get_n_subsets()
less::LESSBase$get_random_state()
less::LESSBase$get_replications()
less::LESSBase$get_scaling()
less::LESSBase$get_val_size()
new()
Creates a new instance of R6 Class of LESSBinaryClassifier
LESSBinaryClassifier$new(
  frac = NULL,
  n_neighbors = NULL,
  n_subsets = NULL,
  n_replications = 20,
  d_normalize = TRUE,
  val_size = NULL,
  random_state = NULL,
  tree_method = function(X) KDTree$new(X),
  cluster_method = NULL,
  local_estimator = LinearRegression$new(),
  global_estimator = DecisionTreeClassifier$new(),
  distance_function = NULL,
  scaling = TRUE,
  warnings = TRUE
)
frac
fraction of total samples used for the number of neighbors (default is 0.05)
n_neighbors
number of neighbors (default is NULL)
n_subsets
number of subsets (default is NULL)
n_replications
number of replications (default is 20)
d_normalize
distance normalization (default is TRUE)
val_size
percentage of samples used for validation (default is NULL - no validation)
random_state
initialization of the random seed (default is NULL)
tree_method
method used for constructing the nearest neighbor tree, e.g., less::KDTree (default)
cluster_method
method used for clustering the subsets, e.g., less::KMeans (default is NULL)
local_estimator
estimator for the local models (default is less::LinearRegression)
global_estimator
estimator for the global model (default is less::DecisionTreeClassifier)
distance_function
distance function evaluating the distance from a subset to a sample, e.g., df(subset, sample) which returns a vector of distances (default is RBF(subset, sample, 1.0/n_subsets^2))
scaling
flag to normalize the input data (default is TRUE)
warnings
flag to turn on (TRUE) or off (FALSE) the warnings (default is TRUE)
fit()
Dummy fit function that calls the proper method according to the validation and clustering parameters. Options are:
Default fitting (no validation set, no clustering)
Fitting with validation set (no clustering)
Fitting with clustering (no validation set)
Fitting with validation set and clustering
LESSBinaryClassifier$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes response variables
Fitted R6 Class of LESSBinaryClassifier
predict_proba()
Prediction probabilities are evaluated for the test samples in X0
LESSBinaryClassifier$predict_proba(X0)
X0
2D matrix or dataframe that includes predictors
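The manual gives no example for predict_proba; a minimal sketch, assuming a two-class subset of iris and the fit method above:

data(iris)
binary_iris <- iris[iris$Species != "virginica",]
binary_iris$Species <- factor(binary_iris$Species)
split_list <- train_test_split(binary_iris, test_size = 0.3)
lessbinary <- LESSBinaryClassifier$new()
lessbinary$fit(split_list[[1]], split_list[[3]])
probs <- lessbinary$predict_proba(split_list[[2]])
head(probs)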
get_global_estimator()
Auxiliary function returning the global_estimator
LESSBinaryClassifier$get_global_estimator()
set_random_state()
Auxiliary function that sets the random_state attribute of the class
LESSBinaryClassifier$set_random_state(random_state)
random_state
seed number to be set as random state
self
clone()
The objects of this class are cloneable with this method.
LESSBinaryClassifier$clone(deep = FALSE)
deep
Whether to make a deep clone.
Classifier for Learning with Subset Stacking (LESS)
R6 class of LESSClassifier
less::BaseEstimator
-> less::SklearnEstimator
-> less::LESSBase
-> LESSClassifier
less::BaseEstimator$get_all_fields()
less::BaseEstimator$get_attributes()
less::SklearnEstimator$get_type()
less::LESSBase$get_d_normalize()
less::LESSBase$get_frac()
less::LESSBase$get_isFitted()
less::LESSBase$get_n_neighbors()
less::LESSBase$get_n_replications()
less::LESSBase$get_n_subsets()
less::LESSBase$get_random_state()
less::LESSBase$get_replications()
less::LESSBase$get_scaling()
less::LESSBase$get_val_size()
new()
Creates a new instance of R6 Class of LESSClassifier
LESSClassifier$new(
  frac = NULL,
  n_neighbors = NULL,
  n_subsets = NULL,
  n_replications = 20,
  d_normalize = TRUE,
  val_size = NULL,
  random_state = NULL,
  tree_method = function(X) KDTree$new(X),
  cluster_method = NULL,
  local_estimator = LinearRegression$new(),
  global_estimator = DecisionTreeClassifier$new(),
  distance_function = NULL,
  scaling = TRUE,
  warnings = TRUE,
  multiclass = "ovr"
)
frac
fraction of total samples used for the number of neighbors (default is 0.05)
n_neighbors
number of neighbors (default is NULL)
n_subsets
number of subsets (default is NULL)
n_replications
number of replications (default is 20)
d_normalize
distance normalization (default is TRUE)
val_size
percentage of samples used for validation (default is NULL - no validation)
random_state
initialization of the random seed (default is NULL)
tree_method
method used for constructing the nearest neighbor tree, e.g., less::KDTree (default)
cluster_method
method used for clustering the subsets, e.g., less::KMeans (default is NULL)
local_estimator
estimator for the local models (default is less::LinearRegression)
global_estimator
estimator for the global model (default is less::DecisionTreeClassifier)
distance_function
distance function evaluating the distance from a subset to a sample, e.g., df(subset, sample) which returns a vector of distances (default is RBF(subset, sample, 1.0/n_subsets^2))
scaling
flag to normalize the input data (default is TRUE)
warnings
flag to turn on (TRUE) or off (FALSE) the warnings (default is TRUE)
multiclass
available strategies are 'ovr' (one-vs-rest), 'ovo' (one-vs-one), and 'occ' (output-code-classifier) (defaults to 'ovr')
lessclassifier <- LESSClassifier$new()
lessclassifier <- LESSClassifier$new(multiclass = "ovo")
fit()
Dummy fit function that calls the fit method of the multiclass strategy
LESSClassifier$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes response variables
Fitted R6 Class of LESSClassifier
data(iris)
set.seed(2022)
shuffled_iris <- iris[sample(1:nrow(iris)),]
split_list <- train_test_split(shuffled_iris[1:10,], test_size = 0.3, random_state = 1)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
lessclassifier <- LESSClassifier$new()
lessclassifier$fit(X_train, y_train)
predict()
Dummy predict function that calls the predict method of the multiclass strategy
LESSClassifier$predict(X0)
X0
2D matrix or dataframe that includes predictors
Predicted values of the given predictors
preds <- lessclassifier$predict(X_test)
print(caret::confusionMatrix(data=factor(preds), reference = factor(y_test)))
get_estimator_type()
Auxiliary function returning the estimator type, e.g., 'regressor' or 'classifier'
LESSClassifier$get_estimator_type()
lessclassifier$get_estimator_type()
set_random_state()
Auxiliary function that sets the random_state attribute of the class
LESSClassifier$set_random_state(random_state)
random_state
seed number to be set as random state
self
lessclassifier$set_random_state(2022)
clone()
The objects of this class are cloneable with this method.
LESSClassifier$clone(deep = FALSE)
deep
Whether to make a deep clone.
Regressor for Learning with Subset Stacking (LESS)
R6 class of LESSRegressor
less::BaseEstimator
-> less::SklearnEstimator
-> less::LESSBase
-> LESSRegressor
less::BaseEstimator$get_all_fields()
less::BaseEstimator$get_attributes()
less::SklearnEstimator$get_type()
less::LESSBase$get_d_normalize()
less::LESSBase$get_frac()
less::LESSBase$get_isFitted()
less::LESSBase$get_n_neighbors()
less::LESSBase$get_n_replications()
less::LESSBase$get_n_subsets()
less::LESSBase$get_random_state()
less::LESSBase$get_replications()
less::LESSBase$get_scaling()
less::LESSBase$get_val_size()
less::LESSBase$set_random_state()
new()
Creates a new instance of R6 Class of LESSRegressor
LESSRegressor$new(
  frac = NULL,
  n_neighbors = NULL,
  n_subsets = NULL,
  n_replications = 20,
  d_normalize = TRUE,
  val_size = NULL,
  random_state = NULL,
  tree_method = function(X) KDTree$new(X),
  cluster_method = NULL,
  local_estimator = LinearRegression$new(),
  global_estimator = DecisionTreeRegressor$new(),
  distance_function = NULL,
  scaling = TRUE,
  warnings = TRUE
)
frac
fraction of total samples used for the number of neighbors (default is 0.05)
n_neighbors
number of neighbors (default is NULL)
n_subsets
number of subsets (default is NULL)
n_replications
number of replications (default is 20)
d_normalize
distance normalization (default is TRUE)
val_size
percentage of samples used for validation (default is NULL - no validation)
random_state
initialization of the random seed (default is NULL)
tree_method
method used for constructing the nearest neighbor tree, e.g., less::KDTree (default)
cluster_method
method used for clustering the subsets, e.g., less::KMeans (default is NULL)
local_estimator
estimator for the local models (default is less::LinearRegression)
global_estimator
estimator for the global model (default is less::DecisionTreeRegressor)
distance_function
distance function evaluating the distance from a subset to a sample, e.g., df(subset, sample), which returns a vector of distances (default is RBF(subset, sample, 1.0/n_subsets^2)); a hand-written equivalent is sketched after the examples below
scaling
flag to normalize the input data (default is TRUE)
warnings
flag to turn on (TRUE) or off (FALSE) the warnings (default is TRUE)
lessRegressor <- LESSRegressor$new()
lessRegressor <- LESSRegressor$new(val_size = 0.3)
lessRegressor <- LESSRegressor$new(cluster_method = less::KMeans$new())
lessRegressor <- LESSRegressor$new(val_size = 0.3, cluster_method = less::KMeans$new())
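As referenced under distance_function above, the default distance is an RBF kernel; a hand-written equivalent might look as follows (my_rbf is a hypothetical helper, not part of the package):

my_rbf <- function(data, center, coeff = 0.01) {
  # squared Euclidean distance of each row of `data` to `center`,
  # passed through a radial basis function kernel
  exp(-coeff * rowSums(sweep(as.matrix(data), 2, as.numeric(center))^2))
}
lessRegressor <- LESSRegressor$new(distance_function = my_rbf)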
fit()
Dummy fit function that calls the proper method according to the validation and clustering parameters. Options are:
Default fitting (no validation set, no clustering)
Fitting with validation set (no clustering)
Fitting with clustering (no validation set)
Fitting with validation set and clustering
LESSRegressor$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes response variables
Fitted R6 Class of LESSRegressor
data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
lessRegressor <- LESSRegressor$new()
lessRegressor$fit(X_train, y_train)
predict()
Predictions are evaluated for the test samples in X0
LESSRegressor$predict(X0)
X0
2D matrix or dataframe that includes predictors
Predicted values of the given predictors
preds <- lessRegressor$predict(X_test)
print(head(matrix(c(y_test, preds), ncol = 2, dimnames = (list(NULL, c("True", "Prediction"))))))
get_estimator_type()
Auxiliary function returning the estimator type, e.g., 'regressor' or 'classifier'
LESSRegressor$get_estimator_type()
lessRegressor$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
LESSRegressor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Wrapper R6 Class of stats::lm function that can be used for LESSRegressor and LESSClassifier
R6 Class of LinearRegression
less::BaseEstimator
-> less::SklearnEstimator
-> LinearRegression
fit()
Fits a linear model (y ~ X)
LinearRegression$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes response variables
Fitted R6 Class of LinearRegression
data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
lr <- LinearRegression$new()
lr$fit(X_train, y_train)
predict()
Predict regression value for X0.
LinearRegression$predict(X0)
X0
2D matrix or dataframe that includes predictors
The predicted values.
lr <- LinearRegression$new()
lr$fit(X_train, y_train)
preds <- lr$predict(X_test)
lr <- LinearRegression$new()
preds <- lr$fit(X_train, y_train)$predict(X_test)
preds <- LinearRegression$new()$fit(X_train, y_train)$predict(X_test)
print(head(matrix(c(y_test, preds), ncol = 2, dimnames = (list(NULL, c("True", "Prediction"))))))
get_estimator_type()
Auxiliary function returning the estimator type, e.g., 'regressor' or 'classifier'
LinearRegression$get_estimator_type()
lr$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
LinearRegression$clone(deep = FALSE)
deep
Whether to make a deep clone.
Takes X and y datasets and merges them into a dataframe with column names (y, X_1, X_2 ...)
prepareDataset(X, y)
X
Independent variables
y
Response variables
A named dataframe which consists of X and y combined
X <- matrix(1:20, nrow = 4)
y <- c(5:8)
prepareDataset(X, y)
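For intuition, the sketch below reproduces the documented naming convention (y, X_1, X_2, ...) with base R. It is an illustration of the behavior described above, not the package source.

# Hedged sketch: bind y and X into one named data frame
X <- matrix(1:20, nrow = 4)
y <- 5:8
combined <- data.frame(y, X)
colnames(combined) <- c("y", paste0("X_", seq_len(ncol(X))))
combined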
Takes the X dataset and converts it into a dataframe with column names (X_1, X_2 ...)
prepareXset(X)
X
Independent variables
A named dataframe which consists of X
X <- matrix(1:20, nrow = 4)
prepareXset(X)
Wrapper R6 Class of randomForest::randomForest function that can be used for LESSRegressor and LESSClassifier
R6 Class of RandomForestClassifier
less::BaseEstimator
-> less::SklearnEstimator
-> RandomForestClassifier
new()
Creates a new instance of R6 Class of RandomForestClassifier
RandomForestClassifier$new(
  n_estimators = 100,
  random_state = NULL,
  min_samples_leaf = 1,
  max_leaf_nodes = NULL
)
n_estimators
Number of trees to grow. This should not be set to too small a number, to ensure that every input row gets predicted at least a few times (defaults to 100).
random_state
Seed number to be used for fixing the randomness (defaults to NULL).
min_samples_leaf
Minimum size of terminal nodes. Setting this number larger causes smaller trees to be grown (and thus take less time) (defaults to 1).
max_leaf_nodes
Maximum number of terminal nodes that trees in the forest can have. If not given, trees are grown to the maximum possible (subject to limits by nodesize). If set larger than the maximum possible, a warning is issued (defaults to NULL).
rf <- RandomForestClassifier$new()
rf <- RandomForestClassifier$new(n_estimators = 500)
rf <- RandomForestClassifier$new(n_estimators = 500, random_state = 100)
fit()
Builds a random forest classifier from the training set (X, y).
RandomForestClassifier$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes labels
Fitted R6 Class of RandomForestClassifier
data(iris)
split_list <- train_test_split(iris, test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
rf <- RandomForestClassifier$new()
rf$fit(X_train, y_train)
predict()
Predict class labels for X0.
RandomForestClassifier$predict(X0)
X0
2D matrix or dataframe that includes predictors
Factor of the predicted classes.
rf <- RandomForestClassifier$new()
rf$fit(X_train, y_train)
preds <- rf$predict(X_test)

rf <- RandomForestClassifier$new()
preds <- rf$fit(X_train, y_train)$predict(X_test)

preds <- RandomForestClassifier$new()$fit(X_train, y_train)$predict(X_test)

print(caret::confusionMatrix(data = preds, reference = factor(y_test)))
get_estimator_type()
Auxiliary function returning the estimator type, e.g. 'regressor', 'classifier'
RandomForestClassifier$get_estimator_type()
rf$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
RandomForestClassifier$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `RandomForestClassifier$new`
## ------------------------------------------------

rf <- RandomForestClassifier$new()
rf <- RandomForestClassifier$new(n_estimators = 500)
rf <- RandomForestClassifier$new(n_estimators = 500, random_state = 100)

## ------------------------------------------------
## Method `RandomForestClassifier$fit`
## ------------------------------------------------

data(iris)
split_list <- train_test_split(iris, test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
rf <- RandomForestClassifier$new()
rf$fit(X_train, y_train)

## ------------------------------------------------
## Method `RandomForestClassifier$predict`
## ------------------------------------------------

rf <- RandomForestClassifier$new()
rf$fit(X_train, y_train)
preds <- rf$predict(X_test)

rf <- RandomForestClassifier$new()
preds <- rf$fit(X_train, y_train)$predict(X_test)

preds <- RandomForestClassifier$new()$fit(X_train, y_train)$predict(X_test)

print(caret::confusionMatrix(data = preds, reference = factor(y_test)))

## ------------------------------------------------
## Method `RandomForestClassifier$get_estimator_type`
## ------------------------------------------------

rf$get_estimator_type()
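The constructor arguments follow scikit-learn naming. Based on the argument descriptions above, they presumably map onto randomForest::randomForest() arguments as sketched below; the mapping is an inference from the matching descriptions, not the package's internal code.

# Hedged sketch of the presumed argument mapping:
#   n_estimators     -> ntree
#   min_samples_leaf -> nodesize
#   max_leaf_nodes   -> maxnodes
rf_direct <- randomForest::randomForest(
  x = X_train, y = factor(y_train),
  ntree = 500,      # n_estimators
  nodesize = 1,     # min_samples_leaf
  maxnodes = NULL   # max_leaf_nodes (NULL = grow to maximum)
)
direct_preds <- predict(rf_direct, X_test)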
Wrapper R6 Class of randomForest::randomForest function that can be used for LESSRegressor and LESSClassifier
R6 Class of RandomForestRegressor
less::BaseEstimator
-> less::SklearnEstimator
-> RandomForestRegressor
new()
Creates a new instance of R6 Class of RandomForestRegressor
RandomForestRegressor$new(
  n_estimators = 100,
  random_state = NULL,
  min_samples_leaf = 1,
  max_leaf_nodes = NULL
)
n_estimators
Number of trees to grow. This should not be set to too small a number, to ensure that every input row gets predicted at least a few times (defaults to 100).
random_state
Seed number to be used for fixing the randomness (defaults to NULL).
min_samples_leaf
Minimum size of terminal nodes. Setting this number larger causes smaller trees to be grown (and thus take less time) (defaults to 1).
max_leaf_nodes
Maximum number of terminal nodes that trees in the forest can have. If not given, trees are grown to the maximum possible (subject to limits by nodesize). If set larger than the maximum possible, a warning is issued (defaults to NULL).
rf <- RandomForestRegressor$new()
rf <- RandomForestRegressor$new(n_estimators = 500)
rf <- RandomForestRegressor$new(n_estimators = 500, random_state = 100)
fit()
Builds a random forest regressor from the training set (X, y).
RandomForestRegressor$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes response variables
Fitted R6 Class of RandomForestRegressor
data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
rf <- RandomForestRegressor$new()
rf$fit(X_train, y_train)
predict()
Predict regression value for X0.
RandomForestRegressor$predict(X0)
X0
2D matrix or dataframe that includes predictors
The predicted values.
preds <- rf$predict(X_test)
print(head(matrix(c(y_test, preds), ncol = 2,
                  dimnames = list(NULL, c("True", "Prediction")))))
get_estimator_type()
Auxiliary function returning the estimator type, e.g. 'regressor', 'classifier'
RandomForestRegressor$get_estimator_type()
rf$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
RandomForestRegressor$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `RandomForestRegressor$new`
## ------------------------------------------------

rf <- RandomForestRegressor$new()
rf <- RandomForestRegressor$new(n_estimators = 500)
rf <- RandomForestRegressor$new(n_estimators = 500, random_state = 100)

## ------------------------------------------------
## Method `RandomForestRegressor$fit`
## ------------------------------------------------

data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
rf <- RandomForestRegressor$new()
rf$fit(X_train, y_train)

## ------------------------------------------------
## Method `RandomForestRegressor$predict`
## ------------------------------------------------

preds <- rf$predict(X_test)
print(head(matrix(c(y_test, preds), ncol = 2,
                  dimnames = list(NULL, c("True", "Prediction")))))

## ------------------------------------------------
## Method `RandomForestRegressor$get_estimator_type`
## ------------------------------------------------

rf$get_estimator_type()
The default distance function in LESS.
rbf(data, center, coeff = 0.01)
data
Data that includes points in shape of (M x d)
center
A constant point in shape of (1 x d)
coeff
Coefficient value for RBF kernel
A numeric vector containing the Radial basis function kernel distance between each point in data and center.
data <- matrix(1:12, nrow = 3)
center <- c(2, 7, 1, 3)
distances <- rbf(data, center)
print(distances)
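To make the computation concrete: an RBF kernel distance of this form evaluates exp(-coeff * d^2), where d is the Euclidean distance between a row of data and center. The sketch below is consistent with the description above but is an illustration, not the package's internal code.

# Hedged sketch of an RBF kernel distance as described above
rbf_sketch <- function(data, center, coeff = 0.01) {
  sq_dist <- rowSums(sweep(data, 2, center)^2)  # squared Euclidean distance per row
  exp(-coeff * sq_dist)                         # RBF kernel value per row
}
data <- matrix(1:12, nrow = 3)
center <- c(2, 7, 1, 3)
rbf_sketch(data, center)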
A dummy base R6 class that includes fit, predict functions for estimators
R6 Class of SklearnEstimator
less::BaseEstimator
-> SklearnEstimator
fit()
Dummy fit function
SklearnEstimator$fit()
sklearn <- SklearnEstimator$new()
sklearn$fit()
predict()
Dummy predict function
SklearnEstimator$predict()
sklearn$predict()
get_type()
Auxiliary function returning the type of the class, e.g. 'estimator'
SklearnEstimator$get_type()
sklearn$get_type()
get_isFitted()
Auxiliary function returning the isFitted flag
SklearnEstimator$get_isFitted()
sklearn$get_isFitted()
clone()
The objects of this class are cloneable with this method.
SklearnEstimator$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `SklearnEstimator$fit`
## ------------------------------------------------

sklearn <- SklearnEstimator$new()
sklearn$fit()

## ------------------------------------------------
## Method `SklearnEstimator$predict`
## ------------------------------------------------

sklearn$predict()

## ------------------------------------------------
## Method `SklearnEstimator$get_type`
## ------------------------------------------------

sklearn$get_type()

## ------------------------------------------------
## Method `SklearnEstimator$get_isFitted`
## ------------------------------------------------

sklearn$get_isFitted()
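Because fit() and predict() are dummies, the class is mainly useful as a parent for custom estimators that follow the same interface. The sketch below defines a hypothetical MeanRegressor; the class name and its internals are illustrative assumptions, not part of the package.

# Hedged sketch: a hypothetical estimator built on SklearnEstimator
MeanRegressor <- R6::R6Class(
  classname = "MeanRegressor",
  inherit = SklearnEstimator,
  private = list(mean_y = NULL),
  public = list(
    fit = function(X, y) {
      private$mean_y <- mean(y)      # remember the training mean
      invisible(self)                # allow chaining fit()$predict()
    },
    predict = function(X0) {
      rep(private$mean_y, nrow(X0))  # constant prediction for every row
    }
  )
)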
Wrapper R6 Class of e1071::svm function that can be used for LESSRegressor and LESSClassifier
R6 Class of SVC
less::BaseEstimator
-> less::SklearnEstimator
-> SVC
new()
Creates a new instance of R6 Class of SVC
SVC$new(
  scale = TRUE,
  kernel = "radial",
  degree = 3,
  gamma = NULL,
  coef0 = 0,
  cost = 1,
  cache_size = 40,
  tolerance = 0.001,
  epsilon = 0.1,
  shrinking = TRUE,
  cross = 0,
  probability = FALSE,
  fitted = TRUE
)
scale
A logical vector indicating the variables to be scaled. If scale is of length 1, the value is recycled as many times as needed. Per default, data are scaled internally (both x and y variables) to zero mean and unit variance. The center and scale values are returned and used for later predictions (default: TRUE)
kernel
The kernel used in training and predicting. Possible values are: "linear", "polynomial", "radial", "sigmoid" (default is "radial")
degree
Parameter needed for kernel of type polynomial (default: 3)
gamma
Parameter needed for all kernels except linear (default: 1/(data dimension))
coef0
Parameter needed for kernels of type polynomial and sigmoid (default: 0)
cost
Cost of constraint violation; the 'C'-constant of the regularization term in the Lagrange formulation (default: 1)
cache_size
Cache memory in MB (default: 40)
tolerance
Tolerance of termination criterion (default: 0.001)
epsilon
Epsilon in the insensitive-loss function (default: 0.1)
shrinking
Option whether to use the shrinking-heuristics (default: TRUE)
cross
If an integer value k > 0 is specified, k-fold cross-validation is performed on the training data to assess the quality of the model: the accuracy rate for classification and the mean squared error for regression (default: 0)
probability
Logical indicating whether the model should allow for probability predictions (default: FALSE)
fitted
Logical indicating whether the fitted values should be computed and included in the model or not (default: TRUE)
svc <- SVC$new()
svc <- SVC$new(kernel = "polynomial")
fit()
Fit the SVM model from the training set (X, y).
SVC$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes labels
Fitted R6 Class of SVC
data(iris)
split_list <- train_test_split(iris, test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
svc <- SVC$new()
svc$fit(X_train, y_train)
predict()
Predict class labels for X0.
SVC$predict(X0)
X0
2D matrix or dataframe that includes predictors
Factor of the predicted classes.
svc <- SVC$new()
svc$fit(X_train, y_train)
preds <- svc$predict(X_test)

svc <- SVC$new()
preds <- svc$fit(X_train, y_train)$predict(X_test)

preds <- SVC$new()$fit(X_train, y_train)$predict(X_test)

print(caret::confusionMatrix(data = preds, reference = factor(y_test)))
get_estimator_type()
Auxiliary function returning the estimator type, e.g. 'regressor', 'classifier'
SVC$get_estimator_type()
svc$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
SVC$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `SVC$new`
## ------------------------------------------------

svc <- SVC$new()
svc <- SVC$new(kernel = "polynomial")

## ------------------------------------------------
## Method `SVC$fit`
## ------------------------------------------------

data(iris)
split_list <- train_test_split(iris, test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
svc <- SVC$new()
svc$fit(X_train, y_train)

## ------------------------------------------------
## Method `SVC$predict`
## ------------------------------------------------

svc <- SVC$new()
svc$fit(X_train, y_train)
preds <- svc$predict(X_test)

svc <- SVC$new()
preds <- svc$fit(X_train, y_train)$predict(X_test)

preds <- SVC$new()$fit(X_train, y_train)$predict(X_test)

print(caret::confusionMatrix(data = preds, reference = factor(y_test)))

## ------------------------------------------------
## Method `SVC$get_estimator_type`
## ------------------------------------------------

svc$get_estimator_type()
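Since SVC wraps e1071::svm, the call below sketches an approximately equivalent direct invocation for the classification case. The type choice and argument values are assumptions inferred from the parameter descriptions above, not the wrapper's internal code.

# Hedged sketch: approximate direct e1071 call for classification
svm_fit <- e1071::svm(
  x = X_train, y = factor(y_train),
  type = "C-classification",       # classification variant (assumed)
  kernel = "radial", cost = 1,
  cachesize = 40, tolerance = 0.001
)
svm_preds <- predict(svm_fit, X_test)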
Wrapper R6 Class of e1071::svm function that can be used for LESSRegressor and LESSClassifier
R6 Class of SVR
less::BaseEstimator
-> less::SklearnEstimator
-> SVR
new()
Creates a new instance of R6 Class of SVR
SVR$new(
  scale = TRUE,
  kernel = "radial",
  degree = 3,
  gamma = NULL,
  coef0 = 0,
  cost = 1,
  cache_size = 40,
  tolerance = 0.001,
  epsilon = 0.1,
  shrinking = TRUE,
  cross = 0,
  probability = FALSE,
  fitted = TRUE
)
scale
A logical vector indicating the variables to be scaled. If scale is of length 1, the value is recycled as many times as needed. Per default, data are scaled internally (both x and y variables) to zero mean and unit variance. The center and scale values are returned and used for later predictions (default: TRUE)
kernel
The kernel used in training and predicting. Possible values are: "linear", "polynomial", "radial", "sigmoid" (default is "radial")
degree
Parameter needed for kernel of type polynomial (default: 3)
gamma
Parameter needed for all kernels except linear (default: 1/(data dimension))
coef0
Parameter needed for kernels of type polynomial and sigmoid (default: 0)
cost
Cost of constraint violation; the 'C'-constant of the regularization term in the Lagrange formulation (default: 1)
cache_size
Cache memory in MB (default: 40)
tolerance
Tolerance of termination criterion (default: 0.001)
epsilon
Epsilon in the insensitive-loss function (default: 0.1)
shrinking
Option whether to use the shrinking-heuristics (default: TRUE)
cross
If an integer value k > 0 is specified, k-fold cross-validation is performed on the training data to assess the quality of the model: the accuracy rate for classification and the mean squared error for regression (default: 0)
probability
Logical indicating whether the model should allow for probability predictions (default: FALSE)
fitted
Logical indicating whether the fitted values should be computed and included in the model or not (default: TRUE)
svr <- SVR$new()
svr <- SVR$new(kernel = "polynomial")
fit()
Fit the SVM model from the training set (X, y).
SVR$fit(X, y)
X
2D matrix or dataframe that includes predictors
y
1D vector or (n,1) dimensional matrix/dataframe that includes response variables
Fitted R6 Class of SVR
data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
svr <- SVR$new()
svr$fit(X_train, y_train)
predict()
Predict regression value for X0.
SVR$predict(X0)
X0
2D matrix or dataframe that includes predictors
The predicted values.
svr <- SVR$new()
svr$fit(X_train, y_train)
preds <- svr$predict(X_test)

svr <- SVR$new()
preds <- svr$fit(X_train, y_train)$predict(X_test)

preds <- SVR$new()$fit(X_train, y_train)$predict(X_test)

print(head(matrix(c(y_test, preds), ncol = 2,
                  dimnames = list(NULL, c("True", "Prediction")))))
get_estimator_type()
Auxiliary function returning the estimator type, e.g. 'regressor', 'classifier'
SVR$get_estimator_type()
svr$get_estimator_type()
clone()
The objects of this class are cloneable with this method.
SVR$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `SVR$new`
## ------------------------------------------------

svr <- SVR$new()
svr <- SVR$new(kernel = "polynomial")

## ------------------------------------------------
## Method `SVR$fit`
## ------------------------------------------------

data(abalone)
split_list <- train_test_split(abalone[1:100,], test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
svr <- SVR$new()
svr$fit(X_train, y_train)

## ------------------------------------------------
## Method `SVR$predict`
## ------------------------------------------------

svr <- SVR$new()
svr$fit(X_train, y_train)
preds <- svr$predict(X_test)

svr <- SVR$new()
preds <- svr$fit(X_train, y_train)$predict(X_test)

preds <- SVR$new()$fit(X_train, y_train)$predict(X_test)

print(head(matrix(c(y_test, preds), ncol = 2,
                  dimnames = list(NULL, c("True", "Prediction")))))

## ------------------------------------------------
## Method `SVR$get_estimator_type`
## ------------------------------------------------

svr$get_estimator_type()
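For the regression counterpart, e1071's eps-regression type uses the epsilon argument documented above. As with SVC, the direct call below is an illustrative assumption rather than the wrapper's internal code.

# Hedged sketch: approximate direct e1071 call for regression
svm_reg <- e1071::svm(
  x = X_train, y = y_train,
  type = "eps-regression",  # epsilon-insensitive regression (assumed)
  epsilon = 0.1, cost = 1
)
reg_preds <- predict(svm_reg, X_test)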
A simple function to generate n_samples data points from a sine curve in the range (-10, 10) with some amplitude. It returns the dataset (X, y) and plots the underlying curve along with the generated points (circles).
synthetic_sine_curve(n_samples = 200)
n_samples
Number of data points to be generated
sine_data_list <- synthetic_sine_curve()
X_sine <- sine_data_list[[1]]
y_sine <- sine_data_list[[2]]
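A comparable dataset can be generated by hand, as sketched below. This is illustrative only: the amplitude (10) and the noise level are assumptions, since the function's exact values are not documented here.

# Hedged sketch: hand-rolled sine-curve data over (-10, 10)
n_samples <- 200
X_sine <- matrix(sort(runif(n_samples, -10, 10)), ncol = 1)
y_sine <- 10 * sin(X_sine[, 1]) + rnorm(n_samples)   # amplitude/noise assumed
plot(X_sine, y_sine)                                 # generated points (circles)
curve(10 * sin(x), from = -10, to = 10, add = TRUE)  # underlying curve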
Plots a histogram showing the fitting times obtained from various regressors/classifiers (using their default settings) on the given dataset (X, y).
test_timing(type = 1, X, y)
type
1 to compare regressors, 2 to compare classifiers
X
Predictors
y
Response variables
X <- matrix(sample(100, 20), nrow = 10)
y <- sample(100, 10)
test_timing(1, X, y)
Splits dataframes or matrices into random train and test subsets. Takes the column at y_index of data as the response variable (y) and the rest as the independent variables (X).
train_test_split(
  data,
  test_size = 0.3,
  random_state = NULL,
  y_index = ncol(data)
)
data
Dataset that is going to be split
test_size
Represents the proportion of the dataset to include in the test split. Should be between 0.0 and 1.0 (defaults to 0.3)
random_state
Controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls (defaults to NULL)
y_index
Corresponding column index of the response variable y (defaults to the last column of data)
A list of length 4 with elements:
X_train
Training input variables
X_test
Test input variables
y_train
Training response variables
y_test
Test response variables
data(abalone)
split_list <- train_test_split(abalone, test_size = 0.3)
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]
print(head(X_train))
print(head(X_test))
print(head(y_train))
print(head(y_test))
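In terms of semantics, the split amounts to sampling row indices and partitioning the columns around y_index. The base-R sketch below mirrors the documented defaults but is an illustration, not the package source.

# Hedged sketch of the split semantics with the documented defaults
set.seed(42)                                  # stands in for random_state
n <- nrow(abalone)
test_idx <- sample(n, size = floor(0.3 * n))  # test_size = 0.3
y_col <- ncol(abalone)                        # y_index defaults to last column
X_train <- abalone[-test_idx, -y_col]
X_test  <- abalone[test_idx, -y_col]
y_train <- abalone[-test_idx, y_col]
y_test  <- abalone[test_idx, y_col]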