bayesml.linearregressionmixture package#

[Image: linearregressionmixture_example.png]

Module contents#

The linear regression mixture model with the Gauss-Gamma prior distribution and the Dirichlet prior distribution.

The stochastic data generative model is as follows:

  • \(K \in \mathbb{N}\): number of latent classes

  • \(\boldsymbol{z} \in \{ 0, 1 \}^K\): a one-hot vector representing the latent class (latent variable)

  • \(\boldsymbol{\pi} \in [0, 1]^K\): a parameter for the latent classes (\(\sum_{k=1}^K \pi_k=1\))

  • \(D \in \mathbb{N}\): a dimension of data

  • \(y\in\mathbb{R}\): an objective variable

  • \(\boldsymbol{x} \in \mathbb{R}^D\): a data point

  • \(\boldsymbol{\theta}_k\in\mathbb{R}^{D}\): a parameter

  • \(\boldsymbol{\theta} = \{ \boldsymbol{\theta}_k \}_{k=1}^K\)

  • \(\tau_k \in \mathbb{R}_{>0}\) : a parameter

  • \(\boldsymbol{\tau} = \{ \tau_k \}_{k=1}^K\)

\[\begin{split}p(\boldsymbol{z} | \boldsymbol{\pi}) &= \mathrm{Cat}(\boldsymbol{z}|\boldsymbol{\pi}) = \prod_{k=1}^K \pi_k^{z_k},\\ p(y | \boldsymbol{x}, \boldsymbol{\theta}, \boldsymbol{\tau}, \boldsymbol{z}) &= \prod_{k=1}^K \mathcal{N}(y | \boldsymbol{\theta}^\top_k \boldsymbol{x},\tau_k^{-1})^{z_k} \\ &= \prod_{k=1}^K \left( \sqrt{\frac{\tau_k}{2\pi}} \exp \left\{ -\frac{\tau_k}{2}(y - \boldsymbol{\theta}^\top_k\boldsymbol{x})^2 \right\} \right)^{z_k}.\end{split}\]
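For concreteness, the following minimal NumPy sketch (a standalone illustration, not part of the bayesml package; all variable names are hypothetical) draws one sample batch from this generative model with fixed parameters:

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> K, D, n = 2, 2, 5                              # classes, dimension, sample size
>>> pi = np.array([0.5, 0.5])                      # mixing proportions pi_k
>>> theta = np.array([[1., 3.], [-1., -3.]])       # coefficients theta_k, shape (K, D)
>>> tau = np.array([1.0, 1.0])                     # precisions tau_k
>>> x = rng.standard_normal((n, D))                # explanatory variables
>>> z = rng.multinomial(1, pi, size=n)             # one-hot latent classes, shape (n, K)
>>> k = z.argmax(axis=1)                           # class index of each data point
>>> y = rng.normal((x * theta[k]).sum(axis=1),     # mean: theta_k^T x_i
>>>                1.0 / np.sqrt(tau[k]))          # standard deviation: tau_k^{-1/2}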

The prior distribution is as follows:

  • \(\boldsymbol{\mu}_0 \in \mathbb{R}^{D}\): a hyperparameter

  • \(\boldsymbol{\Lambda}_0 \in \mathbb{R}^{D\times D}\): a hyperparameter (a positive definite matrix)

  • \(\alpha_0 \in \mathbb{R}_{> 0}\): a hyperparameter

  • \(\beta_0\in \mathbb{R}_{>0}\): a hyperparameter

  • \(\boldsymbol{\gamma}_0 \in \mathbb{R}_{>0}^K\): a hyperparameter

  • \(\Gamma (\cdot)\): the gamma function

\[\begin{split}p(\boldsymbol{\theta},\boldsymbol{\tau},\boldsymbol{\pi}) &= \left\{ \prod_{k=1}^K \mathcal{N}(\boldsymbol{\theta}_k|\boldsymbol{\mu}_0,(\tau_k \boldsymbol{\Lambda}_0)^{-1})\mathrm{Gam}(\tau_k|\alpha_0, \beta_0) \right\} \mathrm{Dir}(\boldsymbol{\pi}|\boldsymbol{\gamma}_0) \\ &= \Biggl[ \prod_{k=1}^K \frac{|\tau_k \boldsymbol{\Lambda}_0|^{1/2}}{(2\pi)^{D/2}} \exp \left\{ -\frac{\tau_k}{2}(\boldsymbol{\theta}_k -\boldsymbol{\mu}_0)^\top \boldsymbol{\Lambda}_0 (\boldsymbol{\theta}_k - \boldsymbol{\mu}_0) \right\} \\ &\qquad \times \frac{\beta_0^{\alpha_0}}{\Gamma(\alpha_0)}\tau_k^{\alpha_0-1}\exp\{-\beta_0\tau_k\} \Biggr] C(\boldsymbol{\gamma}_0)\prod_{k=1}^K \pi_k^{\gamma_{0,k}-1},\end{split}\]

where \(C(\boldsymbol{\gamma}_0)\) is defined as follows:

\[C(\boldsymbol{\gamma}_0) = \frac{\Gamma(\sum_{k=1}^K \gamma_{0,k})}{\Gamma(\gamma_{0,1})\cdots\Gamma(\gamma_{0,K})}.\]

The approximate posterior distribution in the \(t\)-th iteration of a variational Bayesian method is as follows:

  • \(\boldsymbol{X} = [\boldsymbol{x}_1, \boldsymbol{x}_2, \dots , \boldsymbol{x}_n]^\top \in \mathbb{R}^{n \times D}\): given explanatory variables

  • \(\boldsymbol{z}^n = (\boldsymbol{z}_1, \boldsymbol{z}_2, \dots , \boldsymbol{z}_n) \in \{ 0, 1 \}^{K \times n}\): latent classes of given data

  • \(\boldsymbol{r}_i^{(t)} = (r_{i,1}^{(t)}, r_{i,2}^{(t)}, \dots , r_{i,K}^{(t)}) \in [0,1]^K\): a parameter for the latent class of the \(i\)-th data point (\(\sum_{k=1}^K r_{i,k}^{(t)} = 1\))

  • \(\boldsymbol{y} = [y_1, y_2, \dots , y_n]^\top \in \mathbb{R}^n\): given objective variables

  • \(\boldsymbol{\mu}_{n,k}^{(t)} \in \mathbb{R}^{D}\): a hyperparameter

  • \(\boldsymbol{\Lambda}_{n,k}^{(t)} \in \mathbb{R}^{D\times D}\): a hyperparameter (a positive definite matrix)

  • \(\alpha_{n,k}^{(t)} \in \mathbb{R}_{> 0}\): a hyperparameter

  • \(\beta_{n,k}^{(t)} \in \mathbb{R}_{>0}\): a hyperparameter

  • \(\boldsymbol{\gamma}_n^{(t)} \in \mathbb{R}_{>0}^K\): a hyperparameter

  • \(\psi (\cdot)\): the digamma function

\[\begin{split}&q(\boldsymbol{z}^n, \boldsymbol{\theta},\boldsymbol{\tau},\boldsymbol{\pi}) \nonumber \\ &= \left\{ \prod_{i=1}^n \mathrm{Cat}(\boldsymbol{z}_i|\boldsymbol{r}_i^{(t)}) \right\} \left\{ \prod_{k=1}^K \mathcal{N}(\boldsymbol{\theta}_k|\boldsymbol{\mu}_{n,k}^{(t)},(\tau_k \boldsymbol{\Lambda}_{n,k}^{(t)})^{-1})\mathrm{Gam}(\tau_k|\alpha_{n,k}^{(t)}, \beta_{n,k}^{(t)}) \right\} \\ &\qquad \times \mathrm{Dir}(\boldsymbol{\pi}|\boldsymbol{\gamma}_{n}^{(t)}) \\ &= \Biggl[ \prod_{i=1}^n \prod_{k=1}^K (r_{i,k}^{(t)})^{z_{i,k}} \Biggr] \Biggl[ \prod_{k=1}^K \frac{|\tau_k \boldsymbol{\Lambda}_{n,k}^{(t)}|^{1/2}}{(2\pi)^{D/2}} \exp \left\{ -\frac{\tau_k}{2}(\boldsymbol{\theta}_k -\boldsymbol{\mu}_{n,k}^{(t)})^\top \boldsymbol{\Lambda}_{n,k}^{(t)} (\boldsymbol{\theta}_k - \boldsymbol{\mu}_{n,k}^{(t)}) \right\} \\ &\qquad \times \frac{(\beta_{n,k}^{(t)})^{\alpha_{n,k}^{(t)}}}{\Gamma(\alpha_{n,k}^{(t)})}\tau_k^{\alpha_{n,k}^{(t)}-1}\exp\{-\beta_{n,k}^{(t)}\tau_k\} \Biggr] C(\boldsymbol{\gamma}_n^{(t)})\prod_{k=1}^K \pi_k^{\gamma_{n,k}^{(t)}-1},\end{split}\]

where the updating rules of the hyperparameters are as follows:

\[\begin{split}N_{k}^{(t)} &= \sum_{i=1}^{n} r_{i,k}^{(t)}, \\ \boldsymbol{R}_k^{(t)} &= \mathrm{diag} (r_{1,k}^{(t)}, r_{2,k}^{(t)}, \dots , r_{n,k}^{(t)}), \\ \boldsymbol{\Lambda}_{n,k}^{(t+1)} &= \boldsymbol{\Lambda}_0 + \boldsymbol{X}^\top \boldsymbol{R}_k^{(t)} \boldsymbol{X}, \\ \boldsymbol{\mu}_{n,k}^{(t+1)} &= \left( \boldsymbol{\Lambda}_{n,k}^{(t+1)} \right)^{-1} \left( \boldsymbol{\Lambda}_0 \boldsymbol{\mu}_0 + \boldsymbol{X}^\top \boldsymbol{R}_k^{(t)} \boldsymbol{y} \right), \\ \alpha_{n,k}^{(t+1)} &= \alpha_0 + \frac{1}{2} N_k^{(t)}, \\ \beta_{n,k}^{(t+1)} &= \beta_0 + \frac{1}{2} \left( -(\boldsymbol{\mu}_{n,k}^{(t+1)})^\top \boldsymbol{\Lambda}_{n,k}^{(t+1)} \boldsymbol{\mu}_{n,k}^{(t+1)} + \boldsymbol{y}^\top \boldsymbol{R}_k^{(t)} \boldsymbol{y} + \boldsymbol{\mu}_0^\top \boldsymbol{\Lambda}_0 \boldsymbol{\mu}_0 \right), \\ \gamma_{n,k}^{(t+1)} &= \gamma_{0,k} + N_{k}^{(t)}, \\ \ln \rho_{i,k}^{(t+1)} &= \psi (\gamma_{n,k}^{(t+1)}) - \psi \left( {\textstyle \sum_{k'=1}^K \gamma_{n,k'}^{(t+1)}} \right) \nonumber \\ &\qquad - \frac{1}{2} \ln (2 \pi) + \frac{1}{2} \left( \psi (\alpha_{n,k}^{(t+1)}) - \ln \beta_{n,k}^{(t+1)} \right) \nonumber \\ &\qquad -\frac{1}{2} \left( \frac{\alpha_{n,k}^{(t+1)}}{\beta_{n,k}^{(t+1)}} \left(y_i - (\boldsymbol{\mu}_{n,k}^{(t+1)})^\top \boldsymbol{x}_i \right)^2 + \boldsymbol{x}_i^\top \left( \boldsymbol{\Lambda}_{n,k}^{(t+1)} \right)^{-1} \boldsymbol{x}_i \right), \\ r_{i,k}^{(t+1)} &= \frac{\rho_{i,k}^{(t+1)}}{\sum_{k'=1}^K \rho_{i,k'}^{(t+1)}}.\end{split}\]
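To make these rules concrete, here is a minimal NumPy sketch of a single iteration (a standalone illustration under the notation above, not the library's internal implementation; X, y, the prior hyperparameters mu0, lam0, alpha0, beta0, gamma0, and the current responsibilities r are hypothetical variables):

>>> import numpy as np
>>> from scipy.special import digamma
>>> # Hypothetical inputs: X (n, D), y (n,), mu0 (D,), lam0 (D, D),
>>> # alpha0, beta0 (scalars), gamma0 (K,), r (n, K) with rows summing to 1.
>>> N = r.sum(axis=0)                                        # N_k, shape (K,)
>>> lam_n = lam0 + np.einsum('ik,id,ie->kde', r, X, X)       # Lambda_{n,k}, shape (K, D, D)
>>> lam_n_inv = np.linalg.inv(lam_n)
>>> mu_n = np.einsum('kde,ke->kd', lam_n_inv,
>>>                  lam0 @ mu0 + np.einsum('ik,id,i->kd', r, X, y))
>>> alpha_n = alpha0 + N / 2
>>> beta_n = beta0 + 0.5 * (np.einsum('ik,i->k', r, y**2) + mu0 @ lam0 @ mu0
>>>                         - np.einsum('kd,kde,ke->k', mu_n, lam_n, mu_n))
>>> gamma_n = gamma0 + N
>>> ln_rho = (digamma(gamma_n) - digamma(gamma_n.sum()) - 0.5 * np.log(2 * np.pi)
>>>           + 0.5 * (digamma(alpha_n) - np.log(beta_n))
>>>           - 0.5 * ((alpha_n / beta_n) * (y[:, None] - X @ mu_n.T)**2
>>>                    + np.einsum('id,kde,ie->ik', X, lam_n_inv, X)))
>>> r = np.exp(ln_rho - ln_rho.max(axis=1, keepdims=True))   # stabilized exponentiation
>>> r /= r.sum(axis=1, keepdims=True)                        # r_{i,k}, rows sum to 1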

The predictive distribution is as follows:

  • \(\boldsymbol{x}_{n+1}\in \mathbb{R}^D\): a new data point

  • \(y_{n+1}\in \mathbb{R}\): a new objective variable

  • \(m_{\mathrm{p},k}\in \mathbb{R}\): a parameter

  • \(\lambda_{\mathrm{p},k}\in \mathbb{R}_{>0}\): a parameter

  • \(\nu_{\mathrm{p},k}\in \mathbb{R}_{>0}\): a parameter

\[\begin{split}&p(y_{n+1} | \boldsymbol{X}, \boldsymbol{y}, \boldsymbol{x}_{n+1} ) \nonumber \\ &= \frac{1}{\sum_{k=1}^K \gamma_{n,k}^{(t)}} \sum_{k=1}^K \gamma_{n,k}^{(t)} \mathrm{St}\left(y_{n+1} \mid m_{\mathrm{p},k}, \lambda_{\mathrm{p},k}, \nu_{\mathrm{p},k}\right) \\ &= \frac{1}{\sum_{k=1}^K \gamma_{n,k}^{(t)}} \sum_{k=1}^K \gamma_{n,k}^{(t)} \frac{\Gamma (\nu_{\mathrm{p},k} / 2 + 1/2 )}{\Gamma (\nu_{\mathrm{p},k} / 2)} \left( \frac{\lambda_{\mathrm{p},k}}{\pi \nu_{\mathrm{p},k}} \right)^{1/2} \left( 1 + \frac{\lambda_{\mathrm{p},k} (y_{n+1} - m_{\mathrm{p},k})^2}{\nu_{\mathrm{p},k}} \right)^{-\nu_{\mathrm{p},k}/2 - 1/2},\end{split}\]

where the parameters are obtained from the hyperparameters of the posterior distribution as follows.

\[\begin{split}m_{\mathrm{p},k} &= \boldsymbol{x}_{n+1}^{\top} \boldsymbol{\mu}_{n,k}^{(t)}, \\ \lambda_{\mathrm{p},k} &= \frac{\alpha_{n,k}^{(t)}}{\beta_{n,k}^{(t)}}\left(1+\boldsymbol{x}_{n+1}^{\top} \left( \boldsymbol{\Lambda}_{n,k}^{(t)} \right)^{-1} \boldsymbol{x}_{n+1}\right)^{-1}, \\ \nu_{\mathrm{p},k} &= 2 \alpha_{n,k}^{(t)}.\end{split}\]
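Continuing the hypothetical variables from the sketches above, these predictive parameters and the resulting predictive mean for a new point can be evaluated as follows (again a hedged sketch, not the library's code):

>>> x_new = np.ones(D)                             # a hypothetical new data point
>>> m_p = mu_n @ x_new                             # m_{p,k}, shape (K,)
>>> lam_p = (alpha_n / beta_n) / (1.0 + np.einsum('d,kde,e->k', x_new, lam_n_inv, x_new))
>>> nu_p = 2.0 * alpha_n
>>> w = gamma_n / gamma_n.sum()                    # mixing weights of the predictive mixture
>>> y_pred = w @ m_p                               # predictive mean (each t-component has mean m_{p,k})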
class bayesml.linearregressionmixture.GenModel(c_num_classes, c_degree, *, pi_vec=None, theta_vecs=None, taus=None, h_gamma_vec=None, h_mu_vecs=None, h_lambda_mats=None, h_alphas=None, h_betas=None, seed=None)#

Bases: Generative

The stochastic data generative model and the prior distribution

Parameters:
c_num_classesint

A positive integer

c_degreeint

A positive integer

pi_vecnumpy.ndarray, optional

A vector of real numbers in \([0, 1]\), by default [1/c_num_classes, 1/c_num_classes, … , 1/c_num_classes]. The sum of its elements must be 1.0.

theta_vecsnumpy.ndarray, optional

Vectors of real numbers, by default zero vectors.

tausfloat or numpy.ndarray, optional

Positive real numbers, by default [1.0, 1.0, … , 1.0]. If a single real number is input, it will be broadcasted.

h_gamma_vecfloat or numpy.ndarray, optional

A vector of positive real numbers, by default [1/2, 1/2, … , 1/2]. If a single real number is input, it will be broadcasted.

h_mu_vecsnumpy.ndarray, optional

Vectors of real numbers, by default zero vectors.

h_lambda_matsnumpy.ndarray, optional

Positive definite symmetric matrices, by default the identity matrices. If a single matrix is input, it will be broadcasted.

h_alphasfloat or numpy.ndarray, optional

Positive real numbers, by default [1.0, 1.0, … , 1.0]. If a single real number is input, it will be broadcasted.

h_betasfloat or numpy.ndarray, optional

Positive real numbers, by default [1.0, 1.0, … , 1.0]. If a single real number is input, it will be broadcasted.

seed{None, int}, optional

A seed to initialize numpy.random.default_rng(), by default None

Methods

gen_params()

Generate the parameter from the prior distribution.

gen_sample([sample_size, x, constant])

Generate a sample from the stochastic data generative model.

get_constants()

Get constants of GenModel.

get_h_params()

Get the hyperparameters of the prior distribution.

get_params()

Get the parameter of the stochastic data generative model.

load_h_params(filename)

Load the hyperparameters to h_params.

load_params(filename)

Load the parameters saved by save_params.

save_h_params(filename)

Save the hyperparameters using python pickle module.

save_params(filename)

Save the parameters using python pickle module.

save_sample(filename[, sample_size, x, constant])

Save the generated sample as NumPy .npz format.

set_h_params([h_gamma_vec, h_mu_vecs, ...])

Set the hyperparameters of the prior distribution.

set_params([pi_vec, theta_vecs, taus])

Set the parameter of the stochastic data generative model.

visualize_model([sample_size, constant])

Visualize the stochastic data generative model and generated samples.

get_constants()#

Get constants of GenModel.

Returns:
constantsdict of {str: int, numpy.ndarray}
  • "c_num_classes" : the value of self.c_num_classes

  • "c_degree" : the value of self.c_degree

set_params(pi_vec=None, theta_vecs=None, taus=None)#

Set the parameter of the stochastic data generative model.

Parameters:
pi_vecnumpy.ndarray, optional

A vector of real numbers in \([0, 1]\), by default [1/c_num_classes, 1/c_num_classes, … , 1/c_num_classes]. The sum of its elements must be 1.0.

theta_vecsnumpy.ndarray, optional

Vectors of real numbers, by default zero vectors.

tausfloat or numpy.ndarray, optional

Positive real numbers, by default [1.0, 1.0, … , 1.0]. If a single real number is input, it will be broadcasted.

set_h_params(h_gamma_vec=None, h_mu_vecs=None, h_lambda_mats=None, h_alphas=None, h_betas=None)#

Set the hyperparameters of the prior distribution.

Parameters:
h_gamma_vecfloat or numpy.ndarray, optional

A vector of positive real numbers, by default [1/2, 1/2, … , 1/2]. If a single real number is input, it will be broadcasted.

h_mu_vecsnumpy.ndarray, optional

Vectors of real numbers, by default zero vectors.

h_lambda_matsnumpy.ndarray, optional

Positive definite symmetric matrices, by default the identity matrices. If a single matrix is input, it will be broadcasted.

h_alphasfloat or numpy.ndarray, optional

Positive real numbers, by default [1.0, 1.0, … , 1.0]. If a single real number is input, it will be broadcasted.

h_betasfloat or numpy.ndarray, optional

Positive real numbers, by default [1.0, 1.0, … , 1.0]. If a single real number is input, it will be broadcasted.

get_params()#

Get the parameter of the stochastic data generative model.

Returns:
params{str: numpy.ndarray}
  • "pi_vec" : The value of self.pi_vec

  • "theta_vecs" : The value of self.theta_vecs

  • "taus" : The value of self.taus

get_h_params()#

Get the hyperparameters of the prior distribution.

Returns:
h_params{str:float, np.ndarray}
  • "h_gamma_vec" : The value of self.h_gamma_vec

  • "h_mu_vecs" : The value of self.h_mu_vecs

  • "h_lambda_mats" : The value of self.h_lambda_mats

  • "h_alphas" : The value of self.h_alphas

  • "h_betas" : The value of self.h_betas

gen_params()#

Generate the parameter from the prior distribution.

The generated values are set to self.pi_vec, self.theta_vecs, and self.taus.

gen_sample(sample_size=None, x=None, constant=True)#

Generate a sample from the stochastic data generative model.

If x is given, it will be used as the explanatory variables as it is (independently of the other options sample_size and constant).

If x is not given, it will be generated from the i.i.d. standard normal distribution. The size of the generated sample is defined by sample_size. If constant is True, the last element of the generated explanatory variables will be overwritten by 1.0.

Parameters:
sample_sizeint, optional

A positive integer, by default None.

xnumpy ndarray, optional

float array whose shape is (sample_size,c_degree), by default None.

constantbool, optional

A boolean value, by default True.

Returns:
xnumpy ndarray

2-dimensional array whose shape is (sample_size, c_degree) and whose elements are real numbers.

znumpy ndarray

2-dimensional array whose shape is (sample_size, c_num_classes) and whose rows are one-hot vectors.

ynumpy ndarray

1-dimensional float array whose size is sample_size.
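For instance, the following hedged usage sketch (the exact values depend on the seed) draws 100 points:

>>> from bayesml import linearregressionmixture
>>> model = linearregressionmixture.GenModel(c_num_classes=2, c_degree=2, seed=0)
>>> x, z, y = model.gen_sample(sample_size=100)
>>> x.shape, z.shape, y.shape
((100, 2), (100, 2), (100,))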

save_sample(filename, sample_size=None, x=None, constant=True)#

Save the generated sample as NumPy .npz format.

If x is given, it will be used as the explanatory variables as it is (independently of the other options sample_size and constant).

If x is not given, it will be generated from the i.i.d. standard normal distribution. The size of the generated sample is defined by sample_size. If constant is True, the last element of the generated explanatory variables will be overwritten by 1.0.

The generated sample is saved as a NpzFile with the keywords "x", "z", and "y".

Parameters:
filenamestr

The filename to which the sample is saved. .npz will be appended if it isn’t there.

sample_sizeint, optional

A positive integer, by default None.

xnumpy ndarray, optional

float array whose shape is (sample_size,c_degree), by default None.

constantbool, optional

A boolean value, by default True.

visualize_model(sample_size=100, constant=True)#

Visualize the stochastic data generative model and generated samples.

Explanatory variables are generated from the i.i.d. standard normal distribution. The size of the generated sample is defined by sample_size. If constant is True, the last element of the generated explanatory variables will be overwritten by 1.0.

Parameters:
sample_sizeint, optional

A positive integer, by default 100

constantbool, optional

A boolean value, by default True.

Examples

>>> from bayesml import linearregressionmixture
>>> import numpy as np
>>> model = linearregressionmixture.GenModel(
>>>     c_num_classes=2,
>>>     c_degree=2,
>>>     theta_vecs=np.array([[1,3],
>>>                          [-1,-3]]),
>>> )
>>> model.visualize_model()

pi_vec: [0.5 0.5]
theta_vecs: [[ 1.  3.]
 [-1. -3.]]
taus: [1. 1.]

[Image: linearregressionmixture_example.png]
class bayesml.linearregressionmixture.LearnModel(c_num_classes, c_degree, *, h0_gamma_vec=None, h0_mu_vecs=None, h0_lambda_mats=None, h0_alphas=None, h0_betas=None, seed=None)#

Bases: Posterior, PredictiveMixin

The posterior distribution and the predictive distribution.

Parameters:
c_num_classesint

A positive integer

c_degreeint

A positive integer

h0_gamma_vecfloat or numpy.ndarray, optional

A vector of positive real numbers, by default [1/2, 1/2, … , 1/2]. If a single real number is input, it will be broadcasted.

h0_mu_vecsnumpy.ndarray, optional

Vectors of real numbers, by default zero vectors.

h0_lambda_matsnumpy.ndarray, optional

Positive definite symmetric matrices, by default the identity matrices. If a single matrix is input, it will be broadcasted.

h0_alphasfloat or numpy.ndarray, optional

Positive real numbers, by default [1.0, 1.0, … , 1.0]. If a single real number is input, it will be broadcasted.

h0_betasfloat or numpy.ndarray, optional

Positive real numbers, by default [1.0, 1.0, … , 1.0]. If a single real number is input, it will be broadcasted.

seed{None, int}, optional

A seed to initialize numpy.random.default_rng(), by default None

Attributes:
hn_gamma_vecfloat or numpy.ndarray

A vector of positive real numbers. If a single real number is input, it will be broadcasted.

hn_mu_vecsnumpy.ndarray

Vectors of real numbers.

hn_lambda_matsnumpy.ndarray

Positive definite symmetric matrices.

hn_lambda_mats_invnumpy.ndarray

Positive definite symmetric matrices.

hn_alphasfloat or numpy.ndarray

Positive real numbers.

hn_betasfloat or numpy.ndarray

Positive real numbers.

r_vecsnumpy.ndarray

Vectors of real numbers. The sum of the elements of each vector is 1.

nsnumpy.ndarray

Positive real numbers

vlfloat

A real number (the variational lower bound)

p_pi_vecsnumpy.ndarray

Vectors of real numbers in \([0, 1]\). The sum of the elements of each vector is 1.0.

p_msnumpy.ndarray

Real numbers

p_lambdasnumpy.ndarray

Positive real numbers

p_nusnumpy.ndarray

Positive real numbers

Methods

calc_pred_dist(x)

Calculate the parameters of the predictive distribution.

estimate_latent_vars(x, y[, loss])

Estimate latent variables corresponding to x under the given criterion.

estimate_latent_vars_and_update(x, y[, ...])

Estimate latent variables and update the posterior sequentially.

estimate_params([loss])

Estimate the parameter of the stochastic data generative model under the given criterion.

fit(x, y[, max_itr, num_init, tolerance, ...])

Fit the model to the data.

get_constants()

Get constants of LearnModel.

get_h0_params()

Get the hyperparameters of the prior distribution.

get_hn_params()

Get the hyperparameters of the posterior distribution.

get_p_params()

Get the parameters of the predictive distribution.

load_h0_params(filename)

Load the hyperparameters to h0_params.

load_hn_params(filename)

Load the hyperparameters to hn_params.

make_prediction([loss])

Predict a new data point under the given criterion.

overwrite_h0_params()

Overwrite the initial values of the hyperparameters of the posterior distribution by the learned values.

pred_and_update(x, y[, loss, max_itr, ...])

Update the hyperparameters of the posterior distribution using training data.

predict(x)

Predict the data.

reset_hn_params()

Reset the hyperparameters of the posterior distribution to their initial values.

save_h0_params(filename)

Save the hyperparameters using python pickle module.

save_hn_params(filename)

Save the hyperparameters using python pickle module.

set_h0_params([h0_gamma_vec, h0_mu_vecs, ...])

Set the hyperparameters of the prior distribution.

set_hn_params([hn_gamma_vec, hn_mu_vecs, ...])

Set the hyperparameter of the posterior distribution.

update_posterior(x, y[, max_itr, num_init, ...])

Update the hyperparameters of the posterior distribution using training data.

visualize_posterior()

Visualize the posterior distribution for the parameter.

get_constants()#

Get constants of LearnModel.

Returns:
constantsdict of {str: int, numpy.ndarray}
  • "c_num_classes" : the value of self.c_num_classes

  • "c_degree" : the value of self.c_degree

set_h0_params(h0_gamma_vec=None, h0_mu_vecs=None, h0_lambda_mats=None, h0_alphas=None, h0_betas=None)#

Set the hyperparameters of the prior distribution.

Parameters:
h0_gamma_vecfloat or numpy.ndarray, optional

A vector of positive real numbers, by default None. If a single real number is input, it will be broadcasted.

h0_mu_vecsnumpy.ndarray, optional

Vectors of real numbers, by default None.

h0_lambda_matsnumpy.ndarray, optional

Positive definite symmetric matrices, by default None. If a single matrix is input, it will be broadcasted.

h0_alphasfloat or numpy.ndarray, optional

Positive real numbers, by default None. If a single real number is input, it will be broadcasted.

h0_betasfloat or numpy.ndarray, optional

Positive real numbers, by default None. If a single real number is input, it will be broadcasted.

get_h0_params()#

Get the hyperparameters of the prior distribution.

Returns:
h0_paramsdict of {str: numpy.ndarray}
  • "h0_gamma_vec" : the value of self.h0_gamma_vec

  • "h0_mu_vecs" : the value of self.h0_mu_vecs

  • "h0_lambda_mats" : the value of self.h0_lambda_mats

  • "h0_alphas" : the value of self.h0_alphas

  • "h0_betas" : the value of self.h0_betas

set_hn_params(hn_gamma_vec=None, hn_mu_vecs=None, hn_lambda_mats=None, hn_alphas=None, hn_betas=None)#

Set the hyperparameter of the posterior distribution.

Parameters:
hn_gamma_vecfloat or numpy.ndarray, optional

A vector of positive real numbers, by default None. If a single real number is input, it will be broadcasted.

hn_mu_vecsnumpy.ndarray, optional

Vectors of real numbers, by default None.

hn_lambda_matsnumpy.ndarray, optional

Positive definite symmetric matrices, by default None. If a single matrix is input, it will be broadcasted.

hn_alphasfloat or numpy.ndarray, optional

Positive real numbers, by default None. If a single real number is input, it will be broadcasted.

hn_betasfloat or numpy.ndarray, optional

Positive real numbers, by default None. If a single real number is input, it will be broadcasted.

get_hn_params()#

Get the hyperparameters of the posterior distribution.

Returns:
hn_paramsdict of {str: numpy.ndarray}
  • "hn_gamma_vec" : the value of self.hn_gamma_vec

  • "hn_mu_vecs" : the value of self.hn_mu_vecs

  • "hn_lambda_mats" : the value of self.hn_lambda_mats

  • "hn_alphas" : the value of self.hn_alphas

  • "hn_betas" : the value of self.hn_betas

update_posterior(x, y, max_itr=100, num_init=10, tolerance=1e-08, init_type='random_responsibility')#

Update the hyperparameters of the posterior distribution using training data.

Parameters:
xnumpy ndarray

float array. The size along the last dimension must coincide with c_degree. If you want to use a constant term, it should be included in x.

ynumpy ndarray

float array.

max_itrint, optional

maximum number of iterations, by default 100

num_initint, optional

number of initializations, by default 10

tolerancefloat, optional

convergence criterion of variational lower bound, by default 1.0E-8

init_typestr, optional
  • 'random_responsibility': randomly assign responsibility to r_vecs

  • 'subsampling': for each latent class, extract a subsample whose size is int(np.sqrt(x.shape[0])), and use it to update q(theta_k, tau_k).

Type of initialization, by default 'random_responsibility'
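For example, a hedged usage sketch with the non-default initialization (learn_model, x, and y as in the example further below):

>>> learn_model.update_posterior(x, y, num_init=5, init_type='subsampling')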

estimate_params(loss='squared')#

Estimate the parameter of the stochastic data generative model under the given criterion.

Note that the criterion is applied to estimating pi_vec, theta_vecs, and taus independently. Therefore, a tuple of the Dirichlet distribution, the Student's t-distributions, and the gamma distributions will be returned when loss="KL".

Parameters:
lossstr, optional

Loss function underlying the Bayes risk function, by default “squared”. This function supports “squared”, “0-1”, and “KL”.

Returns:
Estimatesa tuple of {numpy ndarray, float, None, or rv_frozen}
  • pi_vec_hat : the estimate for pi_vec

  • theta_vecs_hat : the estimate for theta_vecs

  • taus_hat : the estimate for taus

The estimated values under the given loss function. If the estimate does not exist, np.nan will be returned. If the loss function is "KL", the posterior distribution itself will be returned as an rv_frozen object of scipy.stats.
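For example, with a fitted learn_model as in the example below, the two kinds of return values are obtained as follows (a hedged sketch):

>>> pi_hat, theta_hat, tau_hat = learn_model.estimate_params(loss='squared')
>>> pi_dist, theta_dists, tau_dists = learn_model.estimate_params(loss='KL')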

visualize_posterior()#

Visualize the posterior distribution for the parameter.

Examples

>>> import numpy as np
>>> from bayesml import linearregressionmixture
>>> gen_model = linearregressionmixture.GenModel(
>>>     c_num_classes=2,
>>>     c_degree=2,
>>>     theta_vecs=np.array([[1,3],[-1,-3]]),
>>>     taus=np.array([0.5,1.0]),
>>>     )
>>> x,z,y = gen_model.gen_sample(100)
>>> learn_model = linearregressionmixture.LearnModel(
>>>     c_num_classes=2,
>>>     c_degree=2,
>>>     )
>>> learn_model.update_posterior(x,y)
>>> learn_model.visualize_posterior()
hn_gamma_vec:
[53.46589867 47.53410133]
E[pi_vec]:
[0.52936533 0.47063467]
hn_mu_vecs:
[[-1.12057057 -3.14175971]
 [ 1.15046197  2.72935847]]
hn_lambda_mats:
[[[ 73.28683786  -1.18874056]
  [ -1.18874056  53.96589867]]

 [[ 39.13313893 -10.37075427]
  [-10.37075427  48.03410133]]]
hn_alphas:
[27.48294934 24.51705066]
hn_betas:
[27.13542998 43.09024752]
E[taus]:
[1.01280685 0.56896983]

[Image: linearregressionmixture_posterior.png]
get_p_params()#

Get the parameters of the predictive distribution.

Returns:
p_paramsdict of {str: numpy.ndarray}
  • "p_pi_vecs" : the value of self.p_pi_vecs

  • "p_ms" : the value of self.p_ms

  • "p_lambdas" : the value of self.p_lambdas

  • "p_nus" : the value of self.p_nus

calc_pred_dist(x)#

Calculate the parameters of the predictive distribution.

Parameters:
xnumpy ndarray

float array. The size along the last dimension must coincide with c_degree. If you want to use a constant term, it should be included in x.

make_prediction(loss='squared')#

Predict a new data point under the given criterion.

Parameters:
lossstr, optional

Loss function underlying the Bayes risk function, by default “squared”. This function supports “squared” and “0-1”.

Returns:
predicted_valuenumpy.ndarray

The predicted value under the given loss function. The size of the predicted values is the same as the sample size of x when you called calc_pred_dist(x).
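A typical call sequence, continuing the fitted learn_model from the example above (a hedged sketch; x_new is a hypothetical array of new points):

>>> x_new = np.array([[0.5, 1.0]])        # shape (m, c_degree)
>>> learn_model.calc_pred_dist(x_new)
>>> y_pred = learn_model.make_prediction(loss='squared')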

pred_and_update(x, y, loss='squared', max_itr=100, num_init=10, tolerance=1e-08, init_type='random_responsibility')#

Update the hyperparameters of the posterior distribution using training data.

h0_params will be overwritten by the current hn_params before updating hn_params by x and y.

Parameters:
xnumpy ndarray

float array. The size along the last dimension must coincide with c_degree. If you want to use a constant term, it should be included in x.

ynumpy ndarray

float array.

lossstr, optional

Loss function underlying the Bayes risk function, by default “squared”. This function supports “squared” and “0-1”.

max_itrint, optional

maximum number of iterations, by default 100

num_initint, optional

number of initializations, by default 10

tolerancefloat, optional

convergence criterion of variational lower bound, by default 1.0E-8

init_typestr, optional
  • 'random_responsibility': randomly assign responsibility to r_vecs

  • 'subsampling': for each latent class, extract a subsample whose size is int(np.sqrt(x.shape[0])), and use it to update q(theta_k, tau_k).

Type of initialization, by default 'random_responsibility'

Returns:
predicted_valuenumpy.ndarray

The predicted value under the given loss function. The size of the predicted values is the same as the sample size of x when you called calc_pred_dist(x).

fit(x, y, max_itr=1000, num_init=10, tolerance=1e-08, init_type='random_responsibility')#

Fit the model to the data.

This function is a wrapper of the following functions:

>>> self.reset_hn_params()
>>> self.update_posterior(x,y,max_itr,num_init,tolerance,init_type)
>>> return self
Parameters:
xnumpy ndarray

float array. The size along the last dimension must coincide with c_degree. If you want to use a constant term, it should be included in x.

ynumpy ndarray

float array.

max_itrint, optional

maximum number of iterations, by default 1000

num_initint, optional

number of initializations, by default 10

tolerancefloat, optional

convergence criterion of variational lower bound, by default 1.0E-8

init_typestr, optional
  • 'random_responsibility': randomly assign responsibility to r_vecs

  • 'subsampling': for each latent class, extract a subsample whose size is int(np.sqrt(x.shape[0])), and use it to update q(theta_k, tau_k).

Type of initialization, by default 'random_responsibility'

Returns:
selfLearnModel

The fitted model.

predict(x)#

Predict the data.

This function is a wrapper of the following functions:

>>> self.calc_pred_dist(x)
>>> return self.make_prediction(loss="squared")
Parameters:
xnumpy ndarray

float array. The size along the last dimension must coincide with c_degree. If you want to use a constant term, it should be included in x.

Returns:
predicted_valuesnumpy ndarray

The predicted values under the squared loss function. The size of the predicted values is the same as the sample size of x.
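For instance, reusing the x and y generated in the earlier example, the fit and predict wrappers chain naturally (a hedged sketch):

>>> y_pred = linearregressionmixture.LearnModel(
>>>     c_num_classes=2,
>>>     c_degree=2,
>>> ).fit(x, y).predict(x)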

estimate_latent_vars(x, y, loss='0-1')#

Estimate latent variables corresponding to x under the given criterion.

Note that the criterion is independently applied to each data point.

Parameters:
xnumpy ndarray

float array. The size along the last dimension must coincide with c_degree. If you want to use a constant term, it should be included in x.

ynumpy ndarray

float array.

lossstr, optional

Loss function underlying the Bayes risk function, by default “0-1”. This function supports “squared”, “0-1”, and “KL”.

Returns:
estimatesnumpy.ndarray

The estimated values under the given loss function. If the loss function is "KL", the posterior distribution will be returned as a numpy.ndarray whose elements consist of occurrence probabilities.
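For instance, with the fitted learn_model from the earlier example (a hedged sketch; the shape comments describe the expected outputs):

>>> z_hat = learn_model.estimate_latent_vars(x, y, loss='0-1')   # one-hot estimates per data point
>>> probs = learn_model.estimate_latent_vars(x, y, loss='KL')    # class probabilities per data point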

estimate_latent_vars_and_update(x, y, loss='0-1', max_itr=100, num_init=10, tolerance=1e-08, init_type='random_responsibility')#

Estimate latent variables and update the posterior sequentially.

h0_params will be overwritten by the current hn_params before updating hn_params by x and y.

Parameters:
xnumpy ndarray

float array. The size along the last dimension must coincide with c_degree. If you want to use a constant term, it should be included in x.

ynumpy ndarray

float array.

lossstr, optional

Loss function underlying the Bayes risk function, by default “0-1”. This function supports “squared” and “0-1”.

max_itrint, optional

maximum number of iterations, by default 100

num_initint, optional

number of initializations, by default 10

tolerancefloat, optional

convergence criterion of variational lower bound, by default 1.0E-8

init_typestr, optional
  • 'subsampling': for each latent class, extract a subsample whose size is int(np.sqrt(x.shape[0])), and use it to update q(theta_k, tau_k).

  • 'random_responsibility': randomly assign responsibility to r_vecs

Type of initialization, by default 'random_responsibility'

Returns:
estimatesnumpy.ndarray

The estimated values under the given loss function. If the loss function is "KL", the posterior distribution will be returned as a numpy.ndarray whose elements consist of occurrence probabilities.