| Type: | Package | 
| Title: | Data Preprocessing, Binning for Classification and Regression | 
| Version: | 0.2.1 | 
| Date: | 2018-01-05 | 
| Author: | Chapman Siu | 
| Maintainer: | Chapman Siu <chpmn.siu@gmail.com> | 
| Description: | Various supervised and unsupervised binning tools including using entropy, recursive partition methods and clustering. | 
| LazyData: | TRUE | 
| Imports: | stats, rpart | 
| Suggests: | discretization, Formula, testthat, BAMMtools, earth | 
| RoxygenNote: | 5.0.1 | 
| License: | MIT + file LICENSE | 
| URL: | https://github.com/jules-and-dave/binst | 
| NeedsCompilation: | no | 
| Packaged: | 2018-01-04 23:48:21 UTC; chapm | 
| Repository: | CRAN | 
| Date/Publication: | 2018-01-05 04:08:02 UTC | 
Creates bins given breaks
Description
Creates bins given breaks
Usage
create_bins(x, breaks, method = "cuts")
Arguments
| x | X is a numeric vector which is to be discretized | 
| breaks | Breaks are the interior cut points at which the vector X is split. Endpoints are excluded | 
| method | The approach used to bin the variable, either "cuts" or "hinge" | 
Value
A vector of the same length as X containing the numeric discretization
Examples
create_bins(1:10, c(3, 5))
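As a rough illustration of what binning on interior breaks means, the sketch below reproduces the idea with base R's cut(). It is not the package's implementation: the helper name bins_sketch is hypothetical, and which side of each interval is closed is an assumption (here a value equal to a break falls in the lower bin).

```r
# Hypothetical helper, not part of binst: assign each value to a bin index.
bins_sketch <- function(x, breaks) {
  # Pad the interior breaks with infinite endpoints so every value is covered.
  edges <- c(-Inf, sort(unique(breaks)), Inf)
  # cut() builds right-closed intervals by default: (-Inf, b1], (b1, b2], ...
  as.integer(cut(x, breaks = edges))
}

bins_sketch(1:10, c(3, 5))
# 1 1 1 2 2 3 3 3 3 3
```

Two interior breaks give three bins, which is why the endpoints do not need to be supplied.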
A convenience function for creating breaks with various methods.
Description
A convenience function for creating breaks with various methods.
Usage
create_breaks(x, y = NULL, method = "kmeans", control = NULL, ...)
Arguments
| x | X is a numeric vector to be discretized | 
| y | Y is the response vector used for calculating metrics for discretization | 
| method | Method is the type of discretization approach used. Possible methods are: "dt", "earth", "entropy", "kmeans", "jenks" | 
| control | Control is a list of optional parameters for the chosen method | 
| ... | Instead of passing a list into control, arguments can be passed as is. | 
Value
A vector containing the breaks
Examples
kmeans_breaks <- create_breaks(1:10)
create_bins(1:10, kmeans_breaks)
# passing the k-means parameter "centers" = 4 through control
kmeans_breaks <- create_breaks(1:10, control = list(centers = 4))
create_bins(1:10, kmeans_breaks)
entropy_breaks <- create_breaks(1:10, rep(c(1,2), each = 5), method="entropy")
create_bins(1:10, entropy_breaks)
dt_breaks <- create_breaks(iris$Sepal.Length, iris$Species, method="dt")
create_bins(iris$Sepal.Length, dt_breaks)
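The control and ... arguments are described as two routes to the same optional parameters. One way such a merge could work is sketched below in base R. The helper collect_control is hypothetical and not part of binst, and the rule that arguments in ... win over entries of the same name in control is an assumption of this sketch.

```r
# Hypothetical helper, not part of binst: fold `...` into a control list.
collect_control <- function(control = NULL, ...) {
  if (is.null(control)) control <- list()
  # In this sketch, named arguments in `...` override same-named control entries.
  utils::modifyList(control, list(...))
}

via_list <- collect_control(control = list(centers = 4))
via_dots <- collect_control(centers = 4)
identical(via_list, via_dots)
```

Naming the control argument explicitly matters here, because the second positional argument of create_breaks is y.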
Create breaks using decision trees (recursive partitioning)
Description
Create breaks using decision trees (recursive partitioning)
Usage
create_dtbreaks(x, y, control = NULL)
Arguments
| x | X is a numeric vector to be discretized | 
| y | Y is the response vector used for calculating metrics for discretization | 
| control | Control is used for optional parameters for the method | 
Value
A vector containing the breaks
Examples
dt_breaks <- create_breaks(iris$Sepal.Length, iris$Species, method="dt")
create_bins(iris$Sepal.Length, dt_breaks)
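Since rpart is listed under Imports, one plausible reading of this function is that it fits a single-predictor tree and reads off the split points. The sketch below works under that assumption only. The helper name dt_breaks_sketch and the choice to switch off competitor and surrogate splits are not taken from the package.

```r
library(rpart)

# Hypothetical helper, not part of binst: split points of a one-variable tree.
dt_breaks_sketch <- function(x, y) {
  # Keep only primary splits so every row of fit$splits is a real cut point.
  ctrl <- rpart.control(maxcompete = 0, maxsurrogate = 0)
  fit <- rpart(y ~ x, data = data.frame(x = x, y = y), control = ctrl)
  # A tree that never splits has no splits matrix.
  if (is.null(fit$splits)) return(numeric(0))
  sort(unique(unname(fit$splits[, "index"])))
}

dt_breaks_sketch(iris$Sepal.Length, iris$Species)
```

A factor response gives a classification tree and a numeric response a regression tree, so the same sketch covers both use cases named in the package title.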
Create breaks using earth (i.e. MARS)
Description
Create breaks using earth (i.e. MARS)
Usage
create_earthbreaks(x, y, control = NULL)
Arguments
| x | X is a numeric vector to be discretized | 
| y | Y is the response vector used for calculating metrics for discretization | 
| control | Control is used for optional parameters for the method | 
Value
A vector containing the breaks
Examples
earth_breaks <- create_breaks(x=iris$Sepal.Length, y=iris$Sepal.Width, method="earth")
create_bins(iris$Sepal.Length, earth_breaks)
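MARS models are built from hinge functions of the form max(0, x - c) and max(0, c - x), and the knots c are the natural candidates for breaks. The base-R sketch below only illustrates that basis at a hypothetical knot. It does not call earth and makes no claim about how create_earthbreaks extracts its knots.

```r
# Hypothetical helpers, not part of binst or earth: the two MARS hinge terms.
hinge_right <- function(x, knot) pmax(0, x - knot)  # active above the knot
hinge_left  <- function(x, knot) pmax(0, knot - x)  # active below the knot

x <- 1:10
knot <- 5  # illustrative knot, chosen by hand
cbind(x, right = hinge_right(x, knot), left = hinge_left(x, knot))
```

Each hinge is zero on one side of the knot and linear on the other, so a fitted knot marks a point where the relationship between x and y changes slope.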
Create Jenks breaks
Description
Create Jenks breaks
Usage
create_jenksbreaks(x, control = NULL)
Arguments
| x | X is a numeric vector to be discretized | 
| control | Control is used for optional parameters for the method | 
Value
A vector containing the breaks
Examples
jenks_breaks <- create_breaks(1:10, method="jenks")
create_bins(1:10, jenks_breaks)
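Jenks natural breaks split the sorted data into contiguous classes that minimise the total within-class squared deviation. The dynamic-programming sketch below shows that idea in base R. It is not the routine the package calls: the helper name, the argument k (number of classes), and reporting each break as the midpoint between neighbouring classes are all assumptions of this sketch, and other implementations report class boundaries differently.

```r
# Hypothetical helper, not part of binst: optimal 1-D partition into k classes.
jenks_breaks_sketch <- function(x, k) {
  x <- sort(x)
  n <- length(x)
  stopifnot(k >= 1, k <= n)
  cs  <- c(0, cumsum(x))
  cs2 <- c(0, cumsum(x^2))
  # Sum of squared deviations of sorted values i..j from their mean.
  sse <- function(i, j) {
    s <- cs[j + 1] - cs[i]
    (cs2[j + 1] - cs2[i]) - s^2 / (j - i + 1)
  }
  cost <- matrix(Inf, n, k)  # cost[j, m]: best cost of first j values in m classes
  back <- matrix(0L, n, k)   # back[j, m]: start position of the last class
  for (j in 1:n) {
    cost[j, 1] <- sse(1, j)
    back[j, 1] <- 1L
  }
  if (k >= 2) for (m in 2:k) for (j in m:n) for (i in m:j) {
    cand <- cost[i - 1, m - 1] + sse(i, j)
    if (cand < cost[j, m]) {
      cost[j, m] <- cand
      back[j, m] <- i
    }
  }
  # Walk back from the last class to recover where each class starts.
  starts <- integer(k)
  j <- n
  for (m in k:1) {
    starts[m] <- back[j, m]
    j <- starts[m] - 1L
  }
  s <- starts[-1]
  (x[s - 1] + x[s]) / 2
}

jenks_breaks_sketch(c(1, 2, 3, 10, 11, 12, 20, 21, 22), k = 3)
# 6.5 16.0
```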
Create kmeans breaks.
Description
Create kmeans breaks.
Usage
create_kmeansbreaks(x, control = NULL)
Arguments
| x | X is a numeric vector to be discretized | 
| control | Control is used for optional parameters for the method | 
Value
A vector containing the breaks
Examples
kmeans_breaks <- create_breaks(1:10)
create_bins(1:10, kmeans_breaks)
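For one-dimensional data, k-means assigns each value to its nearest centre, so the boundary between two neighbouring clusters lies halfway between their centres. The sketch below turns that observation into breaks using stats::kmeans. The helper name and its default arguments are hypothetical, and the package may derive its breaks differently.

```r
# Hypothetical helper, not part of binst: breaks from 1-D k-means clusters.
kmeans_breaks_sketch <- function(x, centers = 3, nstart = 25) {
  fit <- stats::kmeans(x, centers = centers, nstart = nstart)
  ctr <- sort(as.numeric(fit$centers))
  # One break halfway between each pair of neighbouring centres.
  (ctr[-length(ctr)] + ctr[-1]) / 2
}

set.seed(42)  # k-means starts from randomly chosen centres
x <- c(1, 2, 3, 10, 11, 12, 20, 21, 22)
kmeans_breaks_sketch(x, centers = 3)
```

Requesting k centres yields k - 1 interior breaks, which can then be handed to a binning step.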
Create breaks using mdlp
Description
Create breaks using mdlp
Usage
create_mdlpbreaks(x, y)
Arguments
| x | X is a numeric vector to be discretized | 
| y | Y is the response vector used for calculating metrics for discretization | 
Value
A vector containing the breaks
Examples
entropy_breaks <- create_breaks(1:10, rep(c(1,2), each = 5), method="entropy")
create_bins(1:10, entropy_breaks)
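Entropy-based discretization looks for cut points that leave the response as pure as possible on each side. The sketch below shows only that core step, a single best cut by weighted class entropy, in base R. It omits the recursion and the minimum description length stopping rule of MDLP, and the helper names are hypothetical rather than part of the package.

```r
# Hypothetical helpers, not part of binst.
class_entropy <- function(y) {
  p <- table(y) / length(y)
  p <- p[p > 0]
  -sum(p * log2(p))
}

best_entropy_cut <- function(x, y) {
  ux <- sort(unique(x))
  if (length(ux) < 2) return(numeric(0))
  # Candidate cuts sit halfway between consecutive distinct values.
  candidates <- (ux[-length(ux)] + ux[-1]) / 2
  weighted <- sapply(candidates, function(cut_point) {
    left  <- y[x <= cut_point]
    right <- y[x > cut_point]
    (length(left) * class_entropy(left) +
       length(right) * class_entropy(right)) / length(y)
  })
  candidates[which.min(weighted)]
}

best_entropy_cut(1:10, rep(c(1, 2), each = 5))
# 5.5
```

In the example the two classes are perfectly separated at 5.5, so the weighted entropy after the cut is zero.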
Gets the default parameters for each method.
Description
Gets the default parameters for each method.
Usage
get_control(method = "kmeans", control = NULL)
Arguments
| method | Method is the type of discretization approach used | 
| control | Control is the list of control parameters for the algorithm | 
Value
List of default parameters
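A common pattern for this kind of helper is to look up per-method defaults and let user-supplied entries override them. The sketch below shows that pattern in base R. The helper name and every default value in it are hypothetical placeholders, not the defaults binst actually uses.

```r
# Hypothetical helper, not part of binst: defaults overridden by user entries.
control_sketch <- function(method = "kmeans", control = NULL) {
  defaults <- switch(method,
    kmeans = list(centers = 3),  # placeholder default, not from binst
    jenks  = list(k = 3),        # placeholder default, not from binst
    list()                       # methods without defaults in this sketch
  )
  if (is.null(control)) defaults else utils::modifyList(defaults, control)
}

control_sketch("kmeans")
control_sketch("kmeans", list(centers = 4))
```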