Classes
class	MinuitDensityFitFcn1D

class	JohnsonFit

class	MinuitLocalLogisticFcn

class	MinuitLOrPEBgCVFcn1D

class	MinuitNeymanOSDE1DFcn

class	MinuitPLogliOSDE1DFcn

class	MinuitQuantileRegression1DFcn

class	MinuitQuantileRegressionNDFcn

class	MinuitSemiparametricFitFcn1D

class	MinuitUnbinnedFitFcn1D

class	ScalableDensityConstructor1D

Functions
template<typename InputData , typename OutputData >
void	fitCompositeJohnson (const InputData *input, unsigned long nInput, unsigned nBins, double xmin, double xmax, double qmin, double qmax, double minlog, const npstat::LocalPolyFilter1D *const *filters, unsigned nFilters, OutputData *smoothedCurve, unsigned lenCurve, bool *intitialFitConverged, unsigned *filterUsed)

template<class Numeric1 , class Numeric2 >
void	minuitLocalQuantileRegression1D (std::vector< std::pair< Numeric1, Numeric1 > > inputPoints, double symbetaPower, double bandwidthInCDFSpace, unsigned polyDegree, double cdfValue, double xmin, double xmax, Numeric2 *result, unsigned nResultPoints, bool verbose=false)

template<class Point , class Numeric , class BooleanFunctor , typename Num2 , unsigned StackLen2, unsigned StackDim2>
void	minuitUnbinnedLogisticRegression (npstat::LogisticRegressionOnKDTree< Point, Numeric, BooleanFunctor > &reg, const unsigned maxdeg, npstat::ArrayND< Num2, StackLen2, StackDim2 > *result, const npstat::BoxND< Numeric > &resultBox, unsigned reportProgressEvery=0)

template<typename Numeric , unsigned StackLen, unsigned StackDim, typename Num2 , unsigned StackLen2, unsigned StackDim2>
void	minuitLogisticRegressionOnGrid (npstat::LogisticRegressionOnGrid< Numeric, StackLen, StackDim > &reg, const unsigned maxdeg, npstat::ArrayND< Num2, StackLen2, StackDim2 > *result, unsigned reportProgressEvery=0)

template<typename Numeric , typename Num2 , unsigned StackLen2, unsigned StackDim2>
void	minuitQuantileRegression (npstat::QuantileRegressionBase< Numeric > &qrb, unsigned polyDegree, npstat::ArrayND< Num2, StackLen2, StackDim2 > *result, const npstat::BoxND< Numeric > &resultBox, unsigned reportProgressEvery=0, double upFactor=1.0)

template<typename Numeric , typename Num2 , unsigned StackLen2, unsigned StackDim2, typename NumHisto >
void	minuitQuantileRegressionIncrBW (npstat::QuantileRegressionBase< Numeric > &qrb, unsigned polyDegree, npstat::ArrayND< Num2, StackLen2, StackDim2 > *result, const npstat::BoxND< Numeric > &resultBox, const npstat::HistoND< NumHisto > &predictorHisto, double minimalSampleFraction, unsigned reportProgressEvery=0, double upFactor=1.0)

template<typename Numeric >
double	boundaryBandwidth1D (const npstat::HistoND< Numeric > &histo, const double filterDeg, const int m)

double	featureBandwidth1D (const double featureSize, const double filterDeg, const double effectiveNBg, const int m)

template<typename Numeric >
double	minHistoBandwidth1D (const npstat::HistoND< Numeric > &histo, const double featureSize, const double filterDeg, const double nbg, const int m)

template<class Numeric1 , class Numeric2 >
void	weightedLocalQuantileRegression1D (std::vector< npstat::Triple< Numeric1, Numeric1, double > > inputPoints, double symbetaPower, double bandwidthInCDFSpace, unsigned polyDegree, double cdfValue, double xmin, double xmax, Numeric2 *result, unsigned nResultPoints, bool verbose=false)

Detailed Description

Namespace "npsi" (nonparametric statistics interface) is used for classes and functions in the NPStat package which rely on the Minuit function minimization package. See http://www.cern.ch/minuit/

Function Documentation

◆ fitCompositeJohnson()

template<typename InputData , typename OutputData >

void npsi::fitCompositeJohnson	(	const InputData *	input,
		unsigned long	nInput,
		unsigned	nBins,
		double	xmin,
		double	xmax,
		double	qmin,
		double	qmax,
		double	minlog,
		const npstat::LocalPolyFilter1D *const *	filters,
		unsigned	nFilters,
		OutputData *	smoothedCurve,
		unsigned	lenCurve,
		bool *	intitialFitConverged,
		unsigned *	filterUsed
	)

Density estimation by the transformation method using the following sequence of steps:

1. The Johnson system is fitted to the input sample between the quantiles that correspond to parameters "qmin" and "qmax". Typical values of these parameters are 0.05 and 0.95.
2. The sample is transformed according to the cumulative distribution of the fitted Johnson system.
3. The transformed sample is smoothed with a set of filters with different bandwidth values. The best filter (bandwidth) is then chosen using pseudo-likelihood cross-validation.
4. A BinnedCompositeJohnson density is made using the results of these fits. This density is scanned into the "smoothedCurve" array.
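
The transformation step can be illustrated with a minimal sketch. It is not NPStat code: a normal CDF is used purely as a stand-in for the cumulative distribution of the fitted Johnson system, and the names "standInCdf" and "transformSample" are hypothetical. The point is only that the transformed sample lies in [0, 1], where it can then be smoothed.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Normal CDF with the given mean and sigma. ASSUMPTION: this is only a
// stand-in for the CDF of the fitted Johnson system, which is not shown here.
double standInCdf(double x, double mean, double sigma)
{
    assert(sigma > 0.0);
    return 0.5 * std::erfc(-(x - mean) / (sigma * std::sqrt(2.0)));
}

// Map every sample point through the CDF. The output values lie in [0, 1]
// and keep the ordering of the input values.
std::vector<double> transformSample(const std::vector<double>& sample,
                                    double mean, double sigma)
{
    std::vector<double> transformed;
    transformed.reserve(sample.size());
    for (double x : sample)
        transformed.push_back(standInCdf(x, mean, sigma));
    return transformed;
}
```

If the fitted distribution described the sample perfectly, the transformed sample would be uniform on [0, 1]; the smoothing step models whatever structure remains.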

Function arguments are as follows:

input, nInput – Array of input data points (typically floats or doubles) and the number of points in this array.

nBins – Number of bins for the histogram which will be used for fitting parameters of the Johnson system.

xmin, xmax – Range (support) of the estimated density.

qmin, qmax, minlog – Parameters passed to the JohnsonFit class.

filters, nFilters – A collection of smoothers to try on the transformed density. All of them will be used and the smoother with the best cross-validation pseudo-likelihood will be chosen to build the final result.

smoothedCurve, lenCurve – The array in which the smoothed values will be stored. The coordinates correspond to the bin centers of a histogram with "lenCurve" bins between "xmin" and "xmax".

intitialFitConverged – Can be used to find out whether the initial Johnson system fit converged successfully. This parameter can also be NULL.

filterUsed – On output, will contain the number of the best filter from "filters" (or can be NULL).
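
The coordinate convention for "smoothedCurve" can be sketched as follows. The helper "binCenters" is hypothetical (it is not part of NPStat) and simply computes the centers of "lenCurve" equal-width bins between "xmin" and "xmax", which is the layout described for the output array.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of NPStat: the x coordinate associated with
// each element of a curve sampled at the centers of nBins equal-width bins
// spanning [xmin, xmax].
std::vector<double> binCenters(double xmin, double xmax, unsigned nBins)
{
    assert(nBins > 0 && xmax > xmin);
    const double width = (xmax - xmin) / nBins;
    std::vector<double> centers(nBins);
    for (unsigned i = 0; i < nBins; ++i)
        centers[i] = xmin + (i + 0.5) * width;
    return centers;
}
```

Under this convention, element i of "smoothedCurve" would be paired with element i of binCenters(xmin, xmax, lenCurve), for example when plotting the result.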

◆ minuitLocalQuantileRegression1D()

template<class Numeric1 , class Numeric2 >

void npsi::minuitLocalQuantileRegression1D	(	std::vector< std::pair< Numeric1, Numeric1 > >	inputPoints,
		double	symbetaPower,
		double	bandwidthInCDFSpace,
		unsigned	polyDegree,
		double	cdfValue,
		double	xmin,
		double	xmax,
		Numeric2 *	result,
		unsigned	nResultPoints,
		bool	verbose = `false`
	)

High-level driver function for performing local 1-d quantile regression fits using Minuit2 as a minimization engine.

The arguments are as follows:

inputPoints – the points for which the regression should be performed. The predictor is the first member of each pair and the response is the second. As a side effect, this function sorts the input points in increasing order, which is why the vector of input points is non-const.

symbetaPower – the power parameter for "SymmetricBeta1D". 3 and 4 are good values to try.

bandwidthInCDFSpace – Approximate fraction of sample points which will participate in each fit. Due to robustness requirements (obtaining limited bandwidth in coordinate space), the bandwidth in the CDF space must be less than 0.5 (and, of course, positive).

polyDegree – this defines the degree of the polynomial that will be fitted to the quantile curve. It does not make much sense to go beyond 3 here.

cdfValue – the quantile to fit, expressed as a CDF value (for example, 0.5 for the median)

xmin, xmax – the result will be calculated between xmin and xmax in equidistant steps

result – array where the result will be stored

nResultPoints – number of coordinate points to use to build the result. The interval (xmin, xmax) will be split into "nResultPoints" bins. The coordinates at which the fits are performed are taken from the middle of those bins (as in a histogram). Naturally, array "result" must have at least "nResultPoints" elements.

verbose – this switch can be turned on for debugging purposes
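
For orientation, quantile regression is conventionally formulated as the minimization of the check ("pinball") loss. The sketch below shows that textbook loss for a constant quantile estimate. It is an assumption that the fits described here follow this general formulation; the kernel weighting, the local polynomial, and the Minuit2 interface are not reproduced, and the names "checkLoss" and "meanCheckLoss" are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Textbook check ("pinball") loss for the residual r = response - estimate.
// Positive residuals are weighted by cdfValue, negative ones by 1 - cdfValue.
double checkLoss(double r, double cdfValue)
{
    assert(cdfValue > 0.0 && cdfValue < 1.0);
    return r >= 0.0 ? cdfValue * r : (cdfValue - 1.0) * r;
}

// Average check loss of a constant quantile estimate q over the responses.
// ASSUMPTION: unweighted points; a local fit would apply kernel weights.
double meanCheckLoss(const std::vector<double>& responses, double q,
                     double cdfValue)
{
    assert(!responses.empty());
    double sum = 0.0;
    for (double y : responses)
        sum += checkLoss(y - q, cdfValue);
    return sum / responses.size();
}
```

With cdfValue = 0.5 the loss is proportional to the mean absolute deviation, so it is minimized by the sample median.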

◆ minuitQuantileRegression()

template<typename Numeric , typename Num2 , unsigned StackLen2, unsigned StackDim2>

void npsi::minuitQuantileRegression	(	npstat::QuantileRegressionBase< Numeric > &	qrb,
		unsigned	polyDegree,
		npstat::ArrayND< Num2, StackLen2, StackDim2 > *	result,
		const npstat::BoxND< Numeric > &	resultBox,
		unsigned	reportProgressEvery = `0`,
		double	upFactor = `1.0`
	)

High-level driver function for performing local quantile regression fits using Minuit2 as a minimization engine. The weight function is assumed to be symmetric in each dimension.

Function arguments are as follows:

qrb – Naturally, an instance of the npstat::QuantileRegressionBase template. Carries the information about the dataset, the kernel, the bandwidth, and the quantile to fit for. For more details, look at the LocalQuantileRegression.hh header.

polyDegree – Degree of the local polynomial to fit. Can be 0, 1 (local linear regression), or 2 (local quadratic regression).

result – Grid which will hold the results on exit. It defines the number of points in each dimension and provides the storage space.

resultBox – Coordinates of the grid boundaries. The points for which the regression is performed will be positioned inside this box just like histogram bin centers.

reportProgressEvery – Print out a message about the number of grid points processed to the standard output every "reportProgressEvery" points. The default value of 0 means that such printouts are disabled.

upFactor – A factor for the Minuit UP parameter, to multiply by the value estimated internally. Don't change the default unless you really understand what you are doing.

For this function, it is assumed that the constant bandwidth is set up already, with the weight function which was used to create the orthogonal polynomials.
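
The placement of regression points inside "resultBox" can be sketched in several dimensions as shown below. The type "AxisRange" and the function "gridPointCoords" are hypothetical stand-ins and do not use the npstat::BoxND or npstat::ArrayND interfaces. The sketch assumes only what is stated above: along each axis, the points sit where histogram bin centers would.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one axis of the result box (not npstat::BoxND).
struct AxisRange { double lo; double hi; };

// Coordinates of the grid point with the given multi-index, for a grid with
// shape[d] points along axis d, placed like bin centers inside the box.
std::vector<double> gridPointCoords(const std::vector<AxisRange>& box,
                                    const std::vector<unsigned>& shape,
                                    const std::vector<unsigned>& index)
{
    const std::size_t dim = box.size();
    assert(shape.size() == dim && index.size() == dim);
    std::vector<double> coords(dim);
    for (std::size_t d = 0; d < dim; ++d)
    {
        assert(shape[d] > 0 && index[d] < shape[d]);
        const double step = (box[d].hi - box[d].lo) / shape[d];
        coords[d] = box[d].lo + (index[d] + 0.5) * step;
    }
    return coords;
}
```

Note that no grid point falls exactly on the box boundary: the first and last points along each axis are half a step inside it.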

◆ minuitQuantileRegressionIncrBW()

template<typename Numeric , typename Num2 , unsigned StackLen2, unsigned StackDim2, typename NumHisto >

void npsi::minuitQuantileRegressionIncrBW	(	npstat::QuantileRegressionBase< Numeric > &	qrb,
		unsigned	polyDegree,
		npstat::ArrayND< Num2, StackLen2, StackDim2 > *	result,
		const npstat::BoxND< Numeric > &	resultBox,
		const npstat::HistoND< NumHisto > &	predictorHisto,
		double	minimalSampleFraction,
		unsigned	reportProgressEvery = `0`,
		double	upFactor = `1.0`
	)

High-level driver function for performing local quantile regression fits using Minuit2 as a minimization engine. The weight function is assumed to be symmetric in each dimension.

This function is similar to minuitQuantileRegression. However, it increases the bandwidth automatically where necessary, so that the regression box contains at least the fraction of points specified by the "minimalSampleFraction" parameter. The fraction is calculated from the "predictorHisto" histogram, whose dimensionality and axis order should coincide with those of the regression predictors. It is expected that this histogram will contain the predictor variables for the sample actually used in the regression.

"minimalSampleFraction" must not exceed 1.0. A zero or negative value results in the use of a constant bandwidth, just as in the minuitQuantileRegression function.
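
The idea of growing the regression box until it holds a minimal fraction of the sample can be sketched in one dimension. This is an illustration only: the actual NPStat algorithm is not shown in this documentation, and the function "growWindow", its bin-based window, and its symmetric growth rule are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-d illustration (not the NPStat algorithm). Returns the
// smallest half-width, in bins and not less than startHalfWidth, of a window
// centered on bin `center` that holds at least minFraction of all entries.
unsigned growWindow(const std::vector<unsigned long>& counts,
                    std::size_t center, unsigned startHalfWidth,
                    double minFraction)
{
    assert(!counts.empty() && center < counts.size());
    assert(minFraction <= 1.0);
    unsigned long total = 0;
    for (unsigned long c : counts) total += c;
    // Zero or negative fraction: keep the constant (starting) bandwidth.
    if (minFraction <= 0.0 || total == 0) return startHalfWidth;
    const std::size_t last = counts.size() - 1;
    for (unsigned h = startHalfWidth; ; ++h)
    {
        const std::size_t lo = center >= h ? center - h : 0;
        const std::size_t hi = center + h < last ? center + h : last;
        unsigned long inside = 0;
        for (std::size_t i = lo; i <= hi; ++i) inside += counts[i];
        if (static_cast<double>(inside) >= minFraction * total) return h;
        // The window already covers the whole histogram: nothing left to add.
        if (lo == 0 && hi == last) return h;
    }
}
```

In sparsely populated regions of the predictor space the window grows, while in dense regions the starting bandwidth is kept.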

◆ minuitUnbinnedLogisticRegression()

template<class Point , class Numeric , class BooleanFunctor , typename Num2 , unsigned StackLen2, unsigned StackDim2>

void npsi::minuitUnbinnedLogisticRegression	(	npstat::LogisticRegressionOnKDTree< Point, Numeric, BooleanFunctor > &	reg,
		const unsigned	maxdeg,
		npstat::ArrayND< Num2, StackLen2, StackDim2 > *	result,
		const npstat::BoxND< Numeric > &	resultBox,
		unsigned	reportProgressEvery = `0`
	)

High-level driver function for performing local logistic regression fits using Minuit2 as a minimization engine. It is assumed that the constant bandwidth is set up already, with the weight function which was used to create the orthogonal polynomials. The weight function is assumed to be symmetric in each dimension.

◆ weightedLocalQuantileRegression1D()