guint | length | Read / Write / Construct Only |
gboolean | save-x | Read / Write / Construct Only |
NcmStatsVecType | type | Read / Write / Construct Only |
This object calculates some basic statistics (mean, variance and covariance) of a set of random variables.
The mean can be calculated online using the following formula: $$\bar{x}_n = \bar{x}_{n-1} + (x_n - \bar{x}_{n-1})\frac{w_n}{W_n},$$ where $\bar{x}_n$ is the mean calculated using the first $n$ elements, $x_n$ is the $n$-th element, $w_n$ the $n$-th weight and finally $W_n$ is the sum of the first $n$ weights.
Using the expression above, the variance is obtained as follows: $$M_n = M_{n-1} + (x_n - \bar{x}_{n-1})^2w_n\frac{W_{n-1}}{W_n},$$ where the variance of the first $n$ elements is $$V_n = \frac{M_n}{W^\text{bias}_{n}}, \quad W^\text{bias}_{n} \equiv \frac{W_n^2 - \sum^n_{i=1}w_i^2}{W_n}.$$ Here $W^\text{bias}_{n}$ denotes the bias-corrected weight.
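The two recurrences above can be sketched in plain C as follows. This is a minimal standalone illustration, not the NcmStatsVec implementation: the `OnlineVar` struct and the `online_var_*` functions are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator for one random variable (not NcmStatsVec internals). */
typedef struct {
  double W;    /* W_n: sum of the weights seen so far */
  double W2;   /* sum of the squared weights */
  double mean; /* \bar{x}_n */
  double M;    /* M_n */
} OnlineVar;

void
online_var_init (OnlineVar *s)
{
  s->W = 0.0;
  s->W2 = 0.0;
  s->mean = 0.0;
  s->M = 0.0;
}

/* Adds the element x with weight w > 0 using the recurrences for M_n and \bar{x}_n. */
void
online_var_update (OnlineVar *s, double x, double w)
{
  const double W_prev = s->W;
  const double W_new  = W_prev + w;
  const double delta  = x - s->mean; /* x_n - \bar{x}_{n-1} */

  assert (w > 0.0);

  s->M    += delta * delta * w * W_prev / W_new;
  s->mean += delta * w / W_new;
  s->W     = W_new;
  s->W2   += w * w;
}

/* V_n = M_n / W^bias_n; only meaningful after at least two elements. */
double
online_var_get (const OnlineVar *s)
{
  const double W_bias = (s->W * s->W - s->W2) / s->W;

  return s->M / W_bias;
}
```

With unit weights the bias-corrected weight reduces to $n - 1$, so feeding the values 1.0, 2.0 and 1.5 yields a mean of 1.5 and a variance of 0.25.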
Finally, the covariance is computed through the following expression: $$N(x,y)_n = N(x,y)_{n-1} + (x_n - \bar{x}_n)(y_n - \bar{y}_{n-1})w_n,$$ where the covariance of two variables $x$, $y$ is given by $$Cov(x,y)_n = \frac{N(x,y)_n}{W^\text{bias}_{n}}.$$
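The covariance recurrence can be sketched in the same way. Again this is only a standalone illustration under stated assumptions: `OnlineCov` and the `online_cov_*` functions are hypothetical names, and the code tracks a single pair of variables rather than a full matrix. Note that the update mixes the new mean of $x$ with the old mean of $y$, exactly as in the expression above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator for a pair of variables (not NcmStatsVec internals). */
typedef struct {
  double W;      /* W_n: sum of the weights */
  double W2;     /* sum of the squared weights */
  double mean_x; /* \bar{x}_n */
  double mean_y; /* \bar{y}_n */
  double N;      /* N(x,y)_n */
} OnlineCov;

void
online_cov_init (OnlineCov *s)
{
  s->W = 0.0;
  s->W2 = 0.0;
  s->mean_x = 0.0;
  s->mean_y = 0.0;
  s->N = 0.0;
}

/* Adds the pair (x, y) with weight w > 0. */
void
online_cov_update (OnlineCov *s, double x, double y, double w)
{
  const double W_new  = s->W + w;
  const double dy_old = y - s->mean_y; /* y_n - \bar{y}_{n-1} */

  assert (w > 0.0);

  s->mean_x += (x - s->mean_x) * w / W_new; /* now holds \bar{x}_n */
  s->mean_y += dy_old * w / W_new;
  s->N      += (x - s->mean_x) * dy_old * w; /* (x_n - \bar{x}_n)(y_n - \bar{y}_{n-1}) w_n */
  s->W       = W_new;
  s->W2     += w * w;
}

/* Cov(x,y)_n = N(x,y)_n / W^bias_n; only meaningful after at least two pairs. */
double
online_cov_get (const OnlineCov *s)
{
  const double W_bias = (s->W * s->W - s->W2) / s->W;

  return s->N / W_bias;
}
```

For the unit-weight pairs (1, 2), (2, 4), (3, 6) this gives a covariance of 2, which matches the usual sample covariance with the $n - 1$ denominator.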
// Creates a new one-dimensional NcmStatsVec to calculate the mean and variance.
NcmStatsVec *svec = ncm_stats_vec_new (1, NCM_STATS_VEC_VAR, FALSE);

// Sets and updates three different values of the only random variable.
ncm_stats_vec_set (svec, 0, 1.0);
ncm_stats_vec_update (svec);

ncm_stats_vec_set (svec, 0, 2.0);
ncm_stats_vec_update (svec);

ncm_stats_vec_set (svec, 0, 1.5);
ncm_stats_vec_update (svec);

{
  gdouble mean = ncm_stats_vec_get_mean (svec, 0);
  gdouble var = ncm_stats_vec_get_var (svec, 0);
  ...
}
void (*NcmStatsVecUpdateFunc) (NcmStatsVec *svec
,const gdouble w
,NcmVector *x
);
NcmStatsVec * ncm_stats_vec_new (guint len
,NcmStatsVecType t
,gboolean save_x
);
Creates a new NcmStatsVec.
NcmStatsVec *
ncm_stats_vec_ref (NcmStatsVec *svec
);
Increases the reference count of svec by one.
void
ncm_stats_vec_free (NcmStatsVec *svec
);
Decreases the reference count of svec by one.
void
ncm_stats_vec_clear (NcmStatsVec **svec
);
Decreases the reference count of svec by one and sets the pointer *svec to NULL.
void ncm_stats_vec_reset (NcmStatsVec *svec
,gboolean rm_saved
);
Resets all data in svec. If rm_saved is TRUE and svec has saved data, the saved data is also removed from the object.
void ncm_stats_vec_update_weight (NcmStatsVec *svec
,gdouble w
);
Updates the statistics using the values currently set in svec->x and the weight w, then resets svec->x to zero.
void ncm_stats_vec_append_weight (NcmStatsVec *svec
,NcmVector *x
,gdouble w
,gboolean dup
);
Appends the vector x and updates the statistics using weight w.
It assumes that x is an NcmVector of size “length” with contiguous allocation, i.e., NcmVector:stride == 1.
If svec was created with save_x TRUE, the parameter dup determines whether the vector x is duplicated or just a reference to x is saved.
void ncm_stats_vec_prepend_weight (NcmStatsVec *svec
,NcmVector *x
,gdouble w
,gboolean dup
);
Prepends the vector x and updates the statistics using weight w.
It assumes that x is an NcmVector of size “length” with contiguous allocation, i.e., NcmVector:stride == 1.
If svec was created with save_x TRUE, the parameter dup determines whether the vector x is duplicated or just a reference to x is saved.
void ncm_stats_vec_append (NcmStatsVec *svec
,NcmVector *x
,gboolean dup
);
Appends the vector x and updates the statistics using weight 1.0.
It assumes that x is an NcmVector of size “length” with contiguous allocation, i.e., NcmVector:stride == 1.
If svec was created with save_x TRUE, the parameter dup determines whether the vector x is duplicated or just a reference to x is saved.
void ncm_stats_vec_prepend (NcmStatsVec *svec
,NcmVector *x
,gboolean dup
);
Prepends the vector x and updates the statistics using weight 1.0.
It assumes that x is an NcmVector of size “length” with contiguous allocation, i.e., NcmVector:stride == 1.
If svec was created with save_x TRUE, the parameter dup determines whether the vector x is duplicated or just a reference to x is saved.
void ncm_stats_vec_append_data (NcmStatsVec *svec
,GPtrArray *data
,gboolean dup
);
Appends the vectors contained in data and updates the statistics using weight 1.0.
It assumes that each element of data is an NcmVector of size “length” with contiguous allocation, i.e., NcmVector:stride == 1.
If svec was created with save_x TRUE, the parameter dup determines whether the vectors from data are duplicated or just references to the current vectors in data are saved.
svec | |
data | a GPtrArray containing NcmVectors to be added. | [element-type NcmVector]
dup | a boolean |
void ncm_stats_vec_prepend_data (NcmStatsVec *svec
,GPtrArray *data
,gboolean dup
);
Prepends the vectors contained in data and updates the statistics using weight 1.0.
It assumes that each element of data is an NcmVector of size “length” with contiguous allocation, i.e., NcmVector:stride == 1.
If svec was created with save_x TRUE, the parameter dup determines whether the vectors from data are duplicated or just references to the current vectors in data are saved.
svec | |
data | a GPtrArray containing NcmVectors to be added. | [element-type NcmVector]
dup | a boolean |
void ncm_stats_vec_enable_quantile (NcmStatsVec *svec
,gdouble p
);
Enables the calculation of the $p$ quantile. Warning: weighted samples are not supported, so the results ignore the weights.
void
ncm_stats_vec_disable_quantile (NcmStatsVec *svec
);
Disables quantile calculation.
gdouble ncm_stats_vec_get_quantile (NcmStatsVec *svec
,guint i
);
Returns the current estimate of the quantile initialized through ncm_stats_vec_enable_quantile().
gdouble ncm_stats_vec_get_quantile_spread (NcmStatsVec *svec
,guint i
);
Returns the current estimate of the quantile spread for the probability $p$ initialized through ncm_stats_vec_enable_quantile(), i.e., it returns the difference between the $(p + 1)/2$ quantile and the $p/2$ quantile. For example, if $p = 0.5$ then it returns the interquartile range.
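To make the definition of the spread concrete, here is a small sort-based sketch in plain C. It is only an illustration: the function names are hypothetical, the quantiles are computed by sorting a full copy of the data and interpolating linearly between order statistics, and nothing here is claimed to match the estimator that the library actually uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback for qsort on doubles. */
static int
cmp_double (const void *a, const void *b)
{
  const double da = *(const double *) a;
  const double db = *(const double *) b;

  return (da > db) - (da < db);
}

/* q-quantile of already sorted data, linear interpolation between order statistics. */
double
sorted_quantile (const double *sorted, size_t n, double q)
{
  const double pos  = q * (double) (n - 1);
  const size_t lo   = (size_t) pos;
  const size_t hi   = (lo + 1 < n) ? lo + 1 : lo;
  const double frac = pos - (double) lo;

  assert (n > 0 && q >= 0.0 && q <= 1.0);

  return sorted[lo] + frac * (sorted[hi] - sorted[lo]);
}

/* Difference between the (p + 1)/2 quantile and the p/2 quantile; sorts data in place. */
double
quantile_spread (double *data, size_t n, double p)
{
  qsort (data, n, sizeof (double), cmp_double);

  return sorted_quantile (data, n, (p + 1.0) / 2.0) - sorted_quantile (data, n, p / 2.0);
}
```

For the five values 1, 2, 3, 4, 5 and $p = 0.5$, the 0.75 quantile is 4 and the 0.25 quantile is 2 under this interpolation rule, so the spread is 2.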
NcmVector * ncm_stats_vec_get_autocorr (NcmStatsVec *svec
,guint p
);
Calculates the autocorrelation vector; the j-th element represents the autocorrelation at lag j.
The returned vector uses internal memory and will change with subsequent calls to ncm_stats_vec_get_autocorr().
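As an illustration of what a lag-j autocorrelation is, the following plain C sketch computes it directly from a series. The function name is hypothetical, and the normalization (every lag divided by the lag-0 sum of squared deviations) is an assumption of this example; the library may use a different estimator or a faster algorithm.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fills rho[0..max_lag] with the sample autocorrelation of x[0..n-1].
 * Direct O(n * max_lag) evaluation; the series must not be constant.
 */
void
naive_autocorr (const double *x, size_t n, size_t max_lag, double *rho)
{
  double mean = 0.0;
  double c0 = 0.0;
  size_t t, j;

  assert (n > 1 && max_lag < n);

  for (t = 0; t < n; t++)
    mean += x[t];
  mean /= (double) n;

  /* Lag-0 sum of squared deviations, used to normalize every lag. */
  for (t = 0; t < n; t++)
    c0 += (x[t] - mean) * (x[t] - mean);

  for (j = 0; j <= max_lag; j++)
  {
    double cj = 0.0;

    for (t = 0; t + j < n; t++)
      cj += (x[t] - mean) * (x[t + j] - mean);

    rho[j] = cj / c0;
  }
}
```

For the series 1, 2, 3, 4, 5 this gives 1 at lag 0, 0.4 at lag 1 and -0.1 at lag 2.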
NcmVector * ncm_stats_vec_get_subsample_autocorr (NcmStatsVec *svec
,guint p
,guint subsample
);
Calculates the autocorrelation vector using the subsample parameter; the j-th element represents the autocorrelation at lag j.
The returned vector uses internal memory and will change with subsequent calls to ncm_stats_vec_get_autocorr().
gdouble ncm_stats_vec_get_autocorr_tau (NcmStatsVec *svec
,guint p
,const guint max_lag
);
Calculates the integrated autocorrelation time for the parameter p using all rows of data.
If max_lag is 0 or larger than the current number of items, then the current number of items is used as max_lag.
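A common convention for the integrated autocorrelation time is $\tau = 1 + 2\sum_{j=1}^{L}\rho_j$, where $\rho_j$ is the lag-j autocorrelation and $L$ is the maximum lag. The sketch below assumes this convention, which the text above does not confirm, and uses a hypothetical function name. It also mimics the fallback for max_lag described above, using the number of available lags in place of the number of items.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Integrated autocorrelation time from rho[0..nlags-1] (rho[0] is lag 0).
 * If max_lag is 0 or exceeds the largest available lag, all lags are used.
 */
double
iact_from_autocorr (const double *rho, size_t nlags, size_t max_lag)
{
  double tau = 1.0;
  size_t j;

  assert (nlags > 0);

  if ((max_lag == 0) || (max_lag > nlags - 1))
    max_lag = nlags - 1;

  for (j = 1; j <= max_lag; j++)
    tau += 2.0 * rho[j];

  return tau;
}
```

With the autocorrelations 1, 0.5, 0.25 the result is 2.5 when all lags are used and 2.0 when the sum is truncated at lag 1.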
gdouble ncm_stats_vec_get_subsample_autocorr_tau (NcmStatsVec *svec
,guint p
,guint subsample
,const guint max_lag
);
Calculates the integrated autocorrelation time for the parameter p using the subsample parameter.
gboolean ncm_stats_vec_fit_ar_model (NcmStatsVec *svec
,guint p
,const guint order
,NcmStatsVecARType ar_crit
,NcmVector **rho
,NcmVector **pacf
,gdouble *ivar
,guint *c_order
);
If order is zero, the value $\left\lfloor 10 \log_{10}(s) \right\rfloor$ is used, where $s$ is the number of points.
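As a quick numerical check of the expression above, reading it as the order used when order is zero (an interpretation of the sentence, not confirmed elsewhere in the text): for $s = 500$ points, $$\left\lfloor 10 \log_{10}(500) \right\rfloor = \lfloor 26.9897\ldots \rfloor = 26,$$ and for $s = 1000$ points the value is $30$.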
gdouble ncm_stats_vec_ar_ess (NcmStatsVec *svec
,guint p
,NcmStatsVecARType ar_crit
,gdouble *spec0
,guint *c_order
);
Calculates the effective sample size for the parameter p.
gdouble ncm_stats_vec_estimate_const_break (NcmStatsVec *svec
,guint p
);
Estimates the mean $\mu$ and the standard deviation $\sigma$ by fitting the parameter p using robust regression. Computes the time $t_0$ where the parameter p falls within $\alpha\sigma$ of $\mu$, where $\alpha$ is implicitly defined by
$$ \int_\alpha^\infty\chi_1(X)\mathrm{d}X = 1/N,$$
and $N$ is the size of the sample.
NcmVector * ncm_stats_vec_max_ess_time (NcmStatsVec *svec
,const guint ntests
,gint *bindex
,guint *wp
,guint *wp_order
,gdouble *wp_ess
);
Calculates the time $t_m$ that maximizes the Effective Sample Size (ESS).
The variable ntests controls the number of divisions where the ESS will be calculated; if it is zero, the default of 10 tests is used.
NcmVector * ncm_stats_vec_heidel_diag (NcmStatsVec *svec
,const guint ntests
,const gdouble pvalue
,gint *bindex
,guint *wp
,guint *wp_order
,gdouble *wp_pvalue
);
Applies the Heidelberger and Welch convergence diagnostic by applying ntests Schruben tests sequentially; if ntests == 0, the default of 10 tests is used. The variable bindex will contain the smallest index where all p-values are smaller than pvalue; if pvalue is zero, the default value of $0.05$ is used.
If the test is not satisfied by any index, bindex will contain -1 and the returned vector will contain the p-values considering the whole system.
See:
NcmVector * ncm_stats_vec_visual_heidel_diag (NcmStatsVec *svec
,const guint p
,const guint fi
,gdouble *mean
,gdouble *var
);
Computes the empirical cumulative and the mean used to build the Heidelberger and Welch’s convergence diagnostic.
GPtrArray *
ncm_stats_vec_dup_saved_x (NcmStatsVec *svec
);
Creates a copy of the internal saved_x array.
NcmMatrix *
ncm_stats_vec_compute_cov_robust_diag (NcmStatsVec *svec
);
Computes the covariance using the saved data, applying a robust scale estimator for each degree of freedom.
NcmMatrix *
ncm_stats_vec_compute_cov_robust_ogk (NcmStatsVec *svec
);
Compute the covariance matrix employing the Orthogonalized Gnanadesikan-Kettenring (OGK) method. This method utilizes saved data and incorporates a robust scale estimator for each degree of freedom. The OGK method provides a robust and efficient approach to compute covariance, ensuring reliable estimates even in the presence of outliers or skewed distributions.
NcmVector *
ncm_stats_vec_peek_x (NcmStatsVec *svec
);
Returns the vector containing the current value of the random variables.
void ncm_stats_vec_set (NcmStatsVec *svec
,guint i
,gdouble x_i
);
Sets the value of the current i-th random variable to x_i.
gdouble ncm_stats_vec_get (NcmStatsVec *svec
,guint i
);
Returns the value of the current i-th random variable.
void
ncm_stats_vec_update (NcmStatsVec *svec
);
Same as ncm_stats_vec_update_weight() assuming a weight equal to one.
gdouble ncm_stats_vec_get_mean (NcmStatsVec *svec
,guint i
);
Returns the current mean of the i-th variable, i.e., $\bar{x}_n$.
gdouble ncm_stats_vec_get_var (NcmStatsVec *svec
,guint i
);
Returns the current variance of the i-th variable, i.e., $V_n$.
gdouble ncm_stats_vec_get_sd (NcmStatsVec *svec
,guint i
);
Returns the current standard deviation of the i-th variable, i.e., $\sigma_n \equiv \sqrt{V_n}$.
gdouble ncm_stats_vec_get_cov (NcmStatsVec *svec
,guint i
,guint j
);
Returns the current value of the covariance between the i-th and the j-th variables, i.e., $Cov_{ij}$.
gdouble ncm_stats_vec_get_cor (NcmStatsVec *svec
,guint i
,guint j
);
Returns the current value of the correlation between the i-th and the j-th variables, i.e., $$Cor_{ij} \equiv \frac{Cov_{ij}}{\sigma_i\sigma_j}.$$
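As a worked example with illustrative data (not taken from the library): for the unit-weight samples $x = (1, 2, 3)$ and $y = (2, 4, 6)$ one finds $Cov_{xy} = 2$, $\sigma_x = 1$ and $\sigma_y = 2$, so that $$Cor_{xy} = \frac{2}{1 \times 2} = 1,$$ as expected for two variables related by an exact increasing linear map.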
gdouble
ncm_stats_vec_get_weight (NcmStatsVec *svec
);
Returns the current total weight; for unweighted samples this is simply the number of elements.
void ncm_stats_vec_get_mean_vector (NcmStatsVec *svec
,NcmVector *mean
,guint offset
);
Copies the current value of the means to the vector mean starting from parameter offset.
NcmVector *
ncm_stats_vec_peek_mean (NcmStatsVec *svec
);
Gets the local mean vector.
void ncm_stats_vec_get_cov_matrix (NcmStatsVec *svec
,NcmMatrix *m
,guint offset
);
Copies the current value of the covariance between the variables to the matrix m starting from parameter offset.
NcmMatrix * ncm_stats_vec_peek_cov_matrix (NcmStatsVec *svec
,guint offset
);
Gets the internal covariance matrix starting from parameter offset.
This is the internal matrix of svec and can change with further additions to svec. It is not guaranteed to be valid after new additions.
guint
ncm_stats_vec_nrows (NcmStatsVec *svec
);
Gets the number of saved rows. This function fails if the object was not created with save_x == TRUE.
guint
ncm_stats_vec_nitens (NcmStatsVec *svec
);
Gets the number of items added to the object.
NcmVector * ncm_stats_vec_peek_row (NcmStatsVec *svec
,guint i
);
Gets the i-th data row used in the statistics. This function fails if the object was not created with save_x == TRUE.
gdouble ncm_stats_vec_get_param_at (NcmStatsVec *svec
,guint i
,guint p
);
Gets the p-th parameter in the i-th data row used in the statistics. This function fails if the object was not created with save_x == TRUE.
“length”
property “length” guint
Number of random variables.
Owner: NcmStatsVec
Flags: Read / Write / Construct Only
Allowed values: >= 1
Default value: 1
“save-x”
property “save-x” gboolean
Whether to save each vector x at each iteration.
Owner: NcmStatsVec
Flags: Read / Write / Construct Only
Default value: FALSE
“type”
property “type” NcmStatsVecType
The statistics to be calculated.
Owner: NcmStatsVec
Flags: Read / Write / Construct Only
Default value: NCM_STATS_VEC_MEAN