Code Documentation¶
class GeneralRegression.GenericRegressor(funcs, regressor=None, ci=0.95, **kwargs)[source]¶
Uses a linear regression algorithm and a transformer to perform nonlinear regression. Given a set of functions \((f_0,\dots,f_n)\) and a point \(x\), it lifts the point \(x\) to \((x, f_0(x),\dots,f_n(x))\) and applies a linear regression. The result is a nonlinear regression based on \(f_0,\dots,f_n\).
Parameters: - funcs – a function that transforms the data points
- regressor – the linear regression method, which should be scikit-learn compatible; default: BayesianRidge
- ci – confidence interval; a float between 0 and 1
- kwargs – arguments to be passed to funcs
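The lifting step described above can be sketched in plain NumPy (hypothetical data and helper names; the actual class wraps a scikit-learn regressor such as BayesianRidge):

```python
import numpy as np

def lift(x, funcs):
    """Lift a 1-D sample array x to feature columns (x, f_0(x), ..., f_n(x))."""
    cols = [x] + [f(x) for f in funcs]
    return np.column_stack(cols)

# Hypothetical noiseless data: y = 2*sin(x) + 0.5*x
x = np.linspace(0.0, 3.0, 50)
y = 2.0 * np.sin(x) + 0.5 * x

# Lift with f_0 = sin, then fit ordinary least squares on the lifted features.
X = lift(x, [np.sin])
X1 = np.column_stack([np.ones_like(x), X])      # prepend an intercept column
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)

# The linear fit in the lifted space recovers the nonlinear relationship.
assert np.allclose(coef, [0.0, 0.5, 2.0], atol=1e-8)
```

The linear model sees three features (intercept, \(x\), \(\sin x\)), so a plain least-squares fit yields a nonlinear model in the original variable.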
Hilbert Space based regression¶
exception NpyProximation.Error(*args)[source]¶
Generic errors that may occur in the course of a run.
class NpyProximation.FunctionSpace(dim=1, measure=None, basis=None)[source]¶
A class that facilitates a few types of computations over function spaces of type \(L_2(X, \mu)\).
Parameters: - dim – the dimension of \(X\) (default: 1)
- measure – an object of type Measure representing \(\mu\)
- basis – a finite basis of functions spanning a subspace of \(L_2(X, \mu)\)
form_basis()[source]¶
Call this method to generate the orthogonal basis corresponding to the given basis. The result is stored in a property called orth_base, which is a list of functions that are mutually orthogonal with respect to the measure measure over the given domain.
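A minimal sketch of how such an orthogonal basis can be produced is classical Gram–Schmidt against a discrete measure (hypothetical names and sample points; the library's implementation may differ):

```python
import numpy as np

def gram_schmidt(basis_vals, weights):
    """Orthogonalize the arrays in basis_vals w.r.t. <u, v> = sum_i w_i u_i v_i."""
    inner = lambda u, v: np.sum(weights * u * v)
    orth = []
    for b in basis_vals:
        v = b.astype(float).copy()
        for q in orth:
            v -= inner(v, q) / inner(q, q) * q   # subtract the projection on q
        orth.append(v)
    return orth

# Discrete measure: uniform weights on 5 sample points of [-1, 1].
x = np.linspace(-1.0, 1.0, 5)
w = np.full_like(x, 1.0 / 5)

# Monomial basis 1, x, x^2 evaluated at the sample points.
basis = [np.ones_like(x), x, x**2]
orth = gram_schmidt(basis, w)

# Every pair of distinct elements is now orthogonal w.r.t. the measure.
for i in range(len(orth)):
    for j in range(i):
        assert abs(np.sum(w * orth[i] * orth[j])) < 1e-12
```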
inner(f, g)[source]¶
Computes the inner product of the two arguments with respect to the measure measure, i.e., \(\int_X f\cdot g\, d\mu\).
Parameters: - f – callable
- g – callable
Returns: the value of \(\int_X f\cdot g\, d\mu\)
project(f, g)[source]¶
Finds the projection of f on g with respect to the inner product induced by the measure measure.
Parameters: - f – callable
- g – callable
Returns: the function \(\frac{\langle f, g\rangle}{\|g\|_2^2}g\)
series(f)[source]¶
Given a function f, this method finds and returns the coefficients of the series that approximates f as a linear combination of the elements of the orthogonal basis \(B\); in symbols, \(\sum_{b\in B}\langle f, b\rangle b\).
Returns: the list of coefficients \(\langle f, b\rangle\) for \(b\in B\)
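For an orthonormal basis the coefficients are plain weighted inner products, which can be sketched as follows (hypothetical names, discrete uniform measure):

```python
import numpy as np

# Discrete measure: uniform weights on 9 sample points of [-1, 1].
x = np.linspace(-1.0, 1.0, 9)
w = np.full_like(x, 1.0 / x.size)
inner = lambda u, v: np.sum(w * u * v)

# A small orthonormal system: the normalized constant and normalized x.
b0 = np.ones_like(x)
b0 = b0 / np.sqrt(inner(b0, b0))
b1 = x / np.sqrt(inner(x, x))
B = [b0, b1]

# Series coefficients <f, b> for f(x) = 3 + 2x.
f = 3.0 + 2.0 * x
coeffs = [inner(f, b) for b in B]

# Reconstructing sum_b <f, b> b recovers f exactly, since f lies in span(B).
approx = sum(c * b for c, b in zip(B and coeffs, B))
assert np.allclose(approx, f)
```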
class NpyProximation.HilbertRegressor(deg=3, base=None, meas=None, f_space=None, c_limit=0.95)[source]¶
Scikit-learn-style regression using Hilbert space techniques.
Parameters: - deg – int, default=3; the degree of the polynomial regression. Only used if base is None
- base – list, default=None; a list of functions forming an orthogonal function basis
- meas – NpyProximation.Measure, default=None; the measure defining the \(L_2(\mu)\) space. If None, a discrete measure is constructed from the fit inputs
- f_space – NpyProximation.FunctionSpace, default=None; the function subspace of \(L_2(\mu)\). If None, it is initiated according to self.meas
- c_limit – the confidence level for confidence intervals
- apprx – a callable constructed by the fit method, used to evaluate the approximation after fitting
fit(X, y)[source]¶
Calculates an orthonormal basis according to the given function-space basis and the discrete measure derived from the training points.
Parameters: - X – Training data
- y – Target values
Returns: self
class NpyProximation.Measure(density=None, domain=None)[source]¶
Constructs a measure \(\mu\) based on density and domain.
Parameters: - density – the density over the domain:
  - if None is given, it assumes the uniform distribution
  - if a callable h is given, then \(d\mu=h(x)dx\)
  - if a dictionary is given, then \(\mu=\sum w_x\delta_x\), a discrete measure; the points \(x\) are the keys of the dictionary (tuples) and the weights \(w_x\) are the values
- domain – if density is a dictionary, the domain is set by its keys. If density is callable, then domain must be a list of tuples defining the domain's box. If None is given, it is set to \([-1, 1]^n\)
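The discrete case can be illustrated with a minimal sketch (hypothetical helper; a discrete measure integrates a function as a weighted sum over its support points):

```python
# Discrete measure mu = sum_x w_x * delta_x: keys are points (tuples),
# values are the weights w_x.
density = {(0.0,): 0.25, (0.5,): 0.5, (1.0,): 0.25}

def integrate(f, density):
    """Integral of f w.r.t. the discrete measure: sum_x w_x * f(x)."""
    return sum(w * f(*x) for x, w in density.items())

# Integrate f(x) = x against the weights above: 0.25*0 + 0.5*0.5 + 0.25*1.
total = integrate(lambda x: x, density)
assert abs(total - 0.5) < 1e-12
```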
class NpyProximation.Regression(points, dim=None)[source]¶
Given a set of points, i.e., a list of tuples of equal length P, this class computes the best approximation of a function that fits the data, in the following sense:
- if no extra parameters are provided, meaning that an object is initiated like R = Regression(P), then calling R.fit() returns the linear regression that fits the data.
- if at initiation the parameter deg=n is set, then R.fit() returns the polynomial regression of degree n.
- if a basis of functions is provided by means of an OrthSystem object (R.SetOrthSys(orth)), then calling R.fit() returns the best approximation that can be found using the basis functions of the orth object.
Parameters: - points – a list of points to be fitted or a callable to be approximated
- dim – dimension of the domain
fit()[source]¶
Fits the best curve based on the optionally provided orthogonal basis. If no basis is provided, it fits a polynomial of the degree given at initiation.
Returns: The fit.
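The default polynomial behaviour can be sketched with NumPy least squares (hypothetical data; the class itself may use a different solver internally):

```python
import numpy as np

# Points sampled from y = 1 + 2x + 3x^2 (no noise, for a clear check).
x = np.linspace(-1.0, 1.0, 7)
y = 1.0 + 2.0 * x + 3.0 * x**2

# Degree-2 polynomial regression via least squares on the Vandermonde matrix.
V = np.vander(x, 3, increasing=True)        # columns: 1, x, x^2
coef, *_ = np.linalg.lstsq(V, y, rcond=None)

# The coefficients of the generating polynomial are recovered exactly.
assert np.allclose(coef, [1.0, 2.0, 3.0])
```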
Time Series Tools¶
class ModelSelection.TimeSeriesCV(test_ratio=0.2, train_ratio=None, index=0)[source]¶
This is a very naive cross-validator for time series. It simply sorts the given index (default 0) and splits the sorted index into a train and a test index set according to the given ratios.
Parameters: - test_ratio – (default .2) float between 0. and 1., the portion of test data
- train_ratio – (default None -> .8) float between 0. and 1., the portion of train data
- index – (default 0) the index of the column that corresponds to a time parameter in the data
get_n_splits(X=None, y=None, groups=None)[source]¶
Returns the number of splitting iterations in the cross-validator.
Parameters: - X – Always ignored, exists for compatibility.
- y – Always ignored, exists for compatibility.
- groups – Always ignored, exists for compatibility.
Returns: the number of splitting iterations in the cross-validator, which is 1 for time series.
split(X, y=None, groups=None)[source]¶
Generate indices to split data into training and test sets.
Parameters: - X – array-like of shape (n_samples, n_features); training data, where n_samples is the number of samples and n_features is the number of features.
- y – array-like of shape (n_samples,), default=None; the target variable for supervised learning problems.
- groups – array-like of shape (n_samples,), default=None; group labels for the samples used while splitting the dataset into train/test set.
Returns: - train – the training set indices for that split.
- test – the testing set indices for that split.
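The split described above can be sketched as follows (hypothetical helper: sort by the time column, then cut at the train ratio):

```python
import numpy as np

def time_series_split(X, test_ratio=0.2, index=0):
    """Sort rows by the time column and return (train, test) index arrays."""
    order = np.argsort(X[:, index])                  # chronological order
    n_train = int(round(len(order) * (1.0 - test_ratio)))
    return order[:n_train], order[n_train:]

# Ten samples whose time column holds the shuffled values 0..9.
t = np.array([3, 0, 7, 1, 9, 4, 6, 2, 8, 5], dtype=float)
X = t.reshape(-1, 1)
train, test = time_series_split(X, test_ratio=0.2, index=0)

# The earliest 80% of timestamps form the training set, the rest the test set.
assert sorted(X[train, 0]) == [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
assert sorted(X[test, 0]) == [8.0, 9.0]
```

Unlike a shuffled split, this keeps all test samples strictly later in time than every training sample, which is what a time-series validator needs.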