laplace.baselaplace
#
Classes:
- ParametricLaplace – Parametric Laplace class.
- DiagLaplace – Laplace approximation with diagonal log likelihood Hessian approximation.
- KronLaplace – Laplace approximation with Kronecker factored log likelihood Hessian approximation.
- LowRankLaplace – Laplace approximation with low-rank log likelihood Hessian (approximation).
- FullLaplace – Laplace approximation with full, i.e., dense, log likelihood Hessian approximation.
ParametricLaplace
#
ParametricLaplace(model: Module, likelihood: Likelihood | str, sigma_noise: float | Tensor = 1.0, prior_precision: float | Tensor = 1.0, prior_mean: float | Tensor = 0.0, temperature: float = 1.0, enable_backprop: bool = False, dict_key_x: str = 'input_ids', dict_key_y: str = 'labels', backend: type[CurvatureInterface] | None = None, backend_kwargs: dict[str, Any] | None = None, asdl_fisher_kwargs: dict[str, Any] | None = None)
Bases: BaseLaplace
Parametric Laplace class.
Subclasses need to specify how the Hessian approximation is initialized, how to add up curvature over training data, how to sample from the Laplace approximation, and how to compute the functional variance.
A Laplace approximation is represented by a MAP estimate, given by the model parameters, and a posterior precision or covariance specifying a Gaussian distribution \(\mathcal{N}(\theta_{MAP}, P^{-1})\). The goal of this class is to compute the posterior precision \(P\), which sums as
\[P = \sum_{n=1}^N \nabla^2_\theta \log p(\mathcal{D}_n \mid \theta) \vert_{\theta_{MAP}} + \nabla^2_\theta \log p(\theta) \vert_{\theta_{MAP}}.\]
Every subclass implements a different approximation to the log likelihood Hessians, for example, a diagonal one. The prior is assumed to be Gaussian and therefore we have a simple form for \(\nabla^2_\theta \log p(\theta) \vert_{\theta_{MAP}} = P_0\). In particular, we assume a scalar, layer-wise, or diagonal prior precision so that in all cases \(P_0 = \textrm{diag}(p_0)\), and the structure of \(p_0\) can be varied.
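For orientation, here is a minimal sketch of the intended workflow using the DiagLaplace subclass documented below; the two-layer regression model and data are purely illustrative stand-ins, not part of the library.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from laplace.baselaplace import DiagLaplace  # any ParametricLaplace subclass is used the same way

# Illustrative MAP-trained model and data (stand-ins)
model = torch.nn.Sequential(torch.nn.Linear(2, 10), torch.nn.Tanh(), torch.nn.Linear(10, 1))
X, y = torch.randn(128, 2), torch.randn(128, 1)
train_loader = DataLoader(TensorDataset(X, y), batch_size=32)

la = DiagLaplace(model, likelihood="regression", sigma_noise=1.0, prior_precision=1.0)
la.fit(train_loader)  # accumulates the log likelihood Hessian approximation over the data

# The approximation is N(theta_MAP, P^{-1}); P is exposed via `posterior_precision`.
print(la.posterior_precision.shape)  # one entry per parameter for the diagonal approximation
```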
Methods:
- fit – Fit the local Laplace approximation at the parameters of the model.
- square_norm – Compute the squared norm under the posterior precision with value - self.mean as \(\Delta\).
- log_prob – Compute the log probability under the (current) Laplace approximation.
- log_marginal_likelihood – Compute the Laplace approximation to the log marginal likelihood subject to specific Hessian approximations.
- __call__ – Compute the posterior predictive on input data x.
- functional_samples – Sample from the function-space posterior on input data x.
- predictive_samples – Sample from the posterior predictive on input data x, i.e., the respective inverse-link function is applied to the functional samples.
- functional_variance – Compute functional variance for the 'glm' predictive.
- functional_covariance – Compute functional covariance for the 'glm' predictive.
- sample – Sample from the Laplace posterior approximation, i.e., \(\theta \sim \mathcal{N}(\theta_{MAP}, P^{-1})\).
Attributes:
- log_likelihood (Tensor) – Compute log likelihood on the training data after .fit() has been called.
- prior_precision_diag (Tensor) – Obtain the diagonal prior precision \(p_0\) constructed from either a scalar, layer-wise, or diagonal prior precision.
- scatter (Tensor) – Computes the scatter, a term of the log marginal likelihood that corresponds to L-2 regularization.
- log_det_prior_precision (Tensor) – Compute log determinant of the prior precision.
- log_det_posterior_precision (Tensor) – Compute log determinant of the posterior precision.
- log_det_ratio (Tensor) – Compute the log determinant ratio, a part of the log marginal likelihood.
- posterior_precision (Tensor) – Compute or return the posterior precision \(P\).
Source code in laplace/baselaplace.py
log_likelihood
#
Compute log likelihood on the training data after .fit()
has been called.
The log likelihood is computed on-demand based on the loss and, for example,
the observation noise which makes it differentiable in the latter for
iterative updates.
Returns:
- log_likelihood (Tensor)
prior_precision_diag
#
Obtain the diagonal prior precision \(p_0\) constructed from either a scalar, layer-wise, or diagonal prior precision.
Returns:
- prior_precision_diag (Tensor)
scatter
#
Computes the scatter, a term of the log marginal likelihood that corresponds to L-2 regularization:
scatter = \((\theta_{MAP} - \mu_0)^{T} P_0 (\theta_{MAP} - \mu_0) \).
Returns:
- scatter (Tensor)
log_det_prior_precision
#
Compute log determinant of the prior precision \(\log \det P_0\)
Returns:
- log_det (Tensor)
log_det_posterior_precision
#
Compute log determinant of the posterior precision \(\log \det P\) which depends on the subclasses structure used for the Hessian approximation.
Returns:
- log_det (Tensor)
log_det_ratio
#
Compute the log determinant ratio, a part of the log marginal likelihood.
Returns:
- log_det_ratio (Tensor)
posterior_precision
#
Compute or return the posterior precision \(P\).
Returns:
- posterior_prec (Tensor)
_glm_forward_call
#
_glm_forward_call(x: Tensor | MutableMapping, likelihood: Likelihood | str, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
for "glm" pred type.
Parameters:
- x (Tensor or MutableMapping) – (batch_size, input_shape) if tensor. If MutableMapping, must contain the said tensor.
- likelihood (Likelihood or str in {'classification', 'regression', 'reward_modeling'}) – determines the log likelihood Hessian approximation.
- link_approx (('mc', 'probit', 'bridge', 'bridge_norm'), default: 'mc') – how to approximate the classification link function for the 'glm' predictive. For pred_type='nn', only 'mc' is possible.
- joint (bool, default: False) – whether to output a joint predictive distribution in regression with pred_type='glm'. If set to True, the predictive distribution has the same form as a GP posterior, i.e. N([f(x1), ..., f(xm)], Cov[f(x1), ..., f(xm)]). If False, only the marginal predictive distribution is output. Only available for regression and the GLM predictive.
- n_samples (int, default: 100) – number of samples for link_approx='mc'.
- diagonal_output (bool, default: False) – whether to use a diagonalized posterior predictive on the outputs. Only works for pred_type='glm' and link_approx='mc'.
Returns:
- predictive (Tensor or tuple[Tensor]) – For likelihood='classification', a torch.Tensor is returned with a distribution over classes (similar to a softmax). For likelihood='regression', a tuple of torch.Tensor is returned with the mean and the predictive variance. For likelihood='regression' and joint=True, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
_glm_functional_samples
#
_glm_functional_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior functional on input data x
using "glm" prediction
type.
Parameters:
- f_mu (Tensor or MutableMapping) – glm predictive mean (batch_size, output_shape)
- f_var (Tensor or MutableMapping) – glm predictive covariances (batch_size, output_shape, output_shape)
- n_samples (int) – number of samples
- diagonal_output (bool, default: False) – whether to use a diagonalized glm posterior predictive on the outputs.
- generator (Generator, default: None) – random number generator to control the samples (if sampling used)
Returns:
- samples (Tensor) – samples (n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
_glm_predictive_samples
#
_glm_predictive_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x using "glm" prediction type. I.e., the inverse-link function corresponding to the likelihood is applied on top of the functional sample.
Parameters:
- f_mu (Tensor or MutableMapping) – glm predictive mean (batch_size, output_shape)
- f_var (Tensor or MutableMapping) – glm predictive covariances (batch_size, output_shape, output_shape)
- n_samples (int) – number of samples
- diagonal_output (bool, default: False) – whether to use a diagonalized glm posterior predictive on the outputs.
- generator (Generator, default: None) – random number generator to control the samples (if sampling used)
Returns:
- samples (Tensor) – samples (n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
fit
#
fit(train_loader: DataLoader, override: bool = True, progress_bar: bool = False) -> None
Fit the local Laplace approximation at the parameters of the model.
Parameters:
- train_loader (DataLoader) – each iterate is a training batch, either (X, y) tensors or a dict-like object containing keys as expressed by self.dict_key_x and self.dict_key_y. train_loader.dataset needs to be set to access \(N\), the size of the data set.
- override (bool, default: True) – whether to initialize H, loss, and n_data again; setting to False is useful for online learning settings to accumulate a sequential posterior approximation.
- progress_bar (bool, default: False) – whether to show a progress bar; updated at every batch-Hessian computation. Useful for very large models and large amounts of data, esp. when subset_of_weights='all'.
Source code in laplace/baselaplace.py
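A minimal sketch of sequential (online) fitting with override=False, assuming `la` is a fitted-or-fresh ParametricLaplace subclass as in the example above; `new_data_loader` is a hypothetical second DataLoader over additional data:

```python
la.fit(train_loader, progress_bar=True)   # initializes H, loss, and n_data, then accumulates over batches
la.fit(new_data_loader, override=False)   # online setting: curvature from the new data is added on top
```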
square_norm
#
Compute the squared norm \(\Delta^T P \Delta\) under the posterior precision \(P\), with value - self.mean as \(\Delta\).
Returns:
- square_form
log_prob
#
log_prob(value: Tensor, normalized: bool = True) -> Tensor
Compute the log probability under the (current) Laplace approximation.
Parameters:
- value (Tensor)
- normalized (bool, default: True) – whether to return the log of a properly normalized Gaussian or just the terms that depend on value.
Returns:
- log_prob (Tensor)
Source code in laplace/baselaplace.py
log_marginal_likelihood
#
log_marginal_likelihood(prior_precision: Tensor | None = None, sigma_noise: Tensor | None = None) -> Tensor
Compute the Laplace approximation to the log marginal likelihood subject
to specific Hessian approximations that subclasses implement.
Requires that the Laplace approximation has been fit before.
The resulting torch.Tensor is differentiable in prior_precision
and
sigma_noise
if these have gradients enabled.
By passing prior_precision
or sigma_noise
, the current value is
overwritten. This is useful for iterating on the log marginal likelihood.
Parameters:
- prior_precision (Tensor, default: None) – prior precision if it should be changed from the current prior_precision value
- sigma_noise (Tensor, default: None) – observation noise standard deviation if it should be changed
Returns:
- log_marglik (Tensor)
Source code in laplace/baselaplace.py
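Since the returned tensor is differentiable in prior_precision and sigma_noise, a common pattern is to tune these hyperparameters by gradient-based maximization of the marginal likelihood. A hedged sketch, assuming `la` has already been fit as in the example above (the log-parameterization is a choice, not a requirement):

```python
import torch

log_prior = torch.zeros(1, requires_grad=True)
log_sigma = torch.zeros(1, requires_grad=True)
hyper_optimizer = torch.optim.Adam([log_prior, log_sigma], lr=1e-1)

for _ in range(100):
    hyper_optimizer.zero_grad()
    # passing the values overwrites la.prior_precision and la.sigma_noise
    neg_marglik = -la.log_marginal_likelihood(log_prior.exp(), log_sigma.exp())
    neg_marglik.backward()
    hyper_optimizer.step()
```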
__call__
#
__call__(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None, fitting: bool = False, **model_kwargs: dict[str, Any]) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
.
Parameters:
- x (Tensor or MutableMapping) – (batch_size, input_shape) if tensor. If MutableMapping, must contain the said tensor.
- pred_type (('glm', 'nn'), default: 'glm') – type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here. When Laplace is done only on a subset of parameters (i.e., some grads are disabled), only the 'nn' predictive is supported.
- link_approx (('mc', 'probit', 'bridge', 'bridge_norm'), default: 'mc') – how to approximate the classification link function for the 'glm' predictive. For pred_type='nn', only 'mc' is possible.
- joint (bool, default: False) – whether to output a joint predictive distribution in regression with pred_type='glm'. If set to True, the predictive distribution has the same form as a GP posterior, i.e. N([f(x1), ..., f(xm)], Cov[f(x1), ..., f(xm)]). If False, only the marginal predictive distribution is output. Only available for regression and the GLM predictive.
- n_samples (int, default: 100) – number of samples for link_approx='mc'.
- diagonal_output (bool, default: False) – whether to use a diagonalized posterior predictive on the outputs. Only works for pred_type='glm' when joint=False in regression. In the case of last-layer Laplace with a diagonal or Kron Hessian, setting this to True makes computation much(!) faster for a large number of outputs.
- generator (Generator, default: None) – random number generator to control the samples (if sampling used).
- fitting (bool, default: False) – whether or not this predictive call is done during fitting. Only useful for reward modeling: the likelihood is set to "regression" when False and "classification" when True.
Returns:
- predictive (Tensor or tuple[Tensor]) – For likelihood='classification', a torch.Tensor is returned with a distribution over classes (similar to a softmax). For likelihood='regression', a tuple of torch.Tensor is returned with the mean and the predictive variance. For likelihood='regression' and joint=True, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
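A short usage sketch of the predictive call, reusing the illustrative regression `la` from above; the classification variant is shown commented out since it assumes a classifier:

```python
import torch

x_test = torch.randn(16, 2)

# Regression: per-point predictive mean and variance
f_mu, f_var = la(x_test, pred_type="glm")

# Regression with a joint (GP-like) predictive over the whole batch
f_mu, f_cov = la(x_test, pred_type="glm", joint=True)

# Classification (for a classifier `la`): class probabilities via the probit link approximation
# probs = la(x_test, pred_type="glm", link_approx="probit")
```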
functional_samples
#
functional_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the function-space posterior on input data x
.
Can be used, for example, for Thompson sampling or to compute an arbitrary
expectation.
Parameters:
- x (Tensor or MutableMapping) – input data (batch_size, input_shape)
- pred_type (('glm', 'nn'), default: 'glm') – type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
- n_samples (int, default: 100) – number of samples
- diagonal_output (bool, default: False) – whether to use a diagonalized glm posterior predictive on the outputs. Only applies when pred_type='glm'.
- generator (Generator, default: None) – random number generator to control the samples (if sampling used)
Returns:
- samples (Tensor) – samples (n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
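For instance, a Thompson-sampling style selection can be sketched as follows, assuming the illustrative `la` from above and a hypothetical candidate pool:

```python
import torch

candidates = torch.randn(50, 2)                                        # hypothetical candidate inputs
fs = la.functional_samples(candidates, pred_type="glm", n_samples=1)   # (1, 50, output_shape)
chosen = fs[0].squeeze(-1).argmax()                                     # act greedily on the sampled function
```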
predictive_samples
#
predictive_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x
. I.e., the respective
inverse-link function (e.g. softmax) is applied on top of the functional
sample.
Parameters:
- x (Tensor or MutableMapping) – input data (batch_size, input_shape)
- pred_type (('glm', 'nn'), default: 'glm') – type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
- n_samples (int, default: 100) – number of samples
- diagonal_output (bool, default: False) – whether to use a diagonalized glm posterior predictive on the outputs. Only applies when pred_type='glm'.
- generator (Generator, default: None) – random number generator to control the samples (if sampling used)
Returns:
- samples (Tensor) – samples (n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
functional_variance
#
functional_variance(Js: Tensor) -> Tensor
Compute functional variance for the 'glm' predictive:
f_var[i] = Js[i] @ P.inv() @ Js[i].T, which is an output x output predictive covariance matrix.
Mathematically, we have for a single Jacobian
\(\mathcal{J} = \nabla_\theta f(x;\theta)\vert_{\theta_{MAP}}\)
the output covariance matrix
\( \mathcal{J} P^{-1} \mathcal{J}^T \).
Parameters:
- Js (Tensor) – Jacobians of model output wrt parameters (batch, outputs, parameters)
Returns:
- f_var (Tensor) – output covariance (batch, outputs, outputs)
Source code in laplace/baselaplace.py
functional_covariance
#
functional_covariance(Js: Tensor) -> Tensor
Compute functional covariance for the 'glm' predictive:
f_cov = Js @ P.inv() @ Js.T, which is a (batch*outputs) x (batch*outputs) predictive covariance matrix.
This emulates the GP posterior covariance N([f(x1), ..., f(xm)], Cov[f(x1), ..., f(xm)]). Useful for joint predictions, such as in batched Bayesian optimization.
Parameters:
- Js (Tensor) – Jacobians of model output wrt parameters (batch*outputs, parameters)
Returns:
- f_cov (Tensor) – output covariance (batch*outputs, batch*outputs)
Source code in laplace/baselaplace.py
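To make the two formulas above concrete, a small shape-level sketch; `posterior_cov` is a stand-in for \(P^{-1}\), not an attribute of this class:

```python
import torch

Js = torch.randn(8, 3, 20)        # per-point Jacobians (batch, outputs, parameters)
posterior_cov = torch.eye(20)     # stand-in for P^{-1}

# functional_variance: one (outputs x outputs) covariance per input point
f_var = torch.einsum("bop,pq,bkq->bok", Js, posterior_cov, Js)    # (8, 3, 3)

# functional_covariance: one joint covariance across the whole batch
Js_flat = Js.reshape(-1, 20)                                      # (batch*outputs, parameters)
f_cov = Js_flat @ posterior_cov @ Js_flat.T                       # (24, 24)
```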
sample
#
Sample from the Laplace posterior approximation, i.e., \( \theta \sim \mathcal{N}(\theta_{MAP}, P^{-1})\).
Parameters:
- n_samples (int, default: 100) – number of samples
- generator (Generator, default: None) – random number generator to control the samples
Returns:
- samples (Tensor)
Source code in laplace/baselaplace.py
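A brief sketch of drawing weight-space samples, assuming the illustrative `la` from above:

```python
thetas = la.sample(n_samples=10)    # parameter vectors drawn from N(theta_MAP, P^{-1})
per_param_std = thetas.std(dim=0)   # rough per-parameter posterior spread
```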
DiagLaplace
#
DiagLaplace(model: Module, likelihood: Likelihood | str, sigma_noise: float | Tensor = 1.0, prior_precision: float | Tensor = 1.0, prior_mean: float | Tensor = 0.0, temperature: float = 1.0, enable_backprop: bool = False, dict_key_x: str = 'input_ids', dict_key_y: str = 'labels', backend: type[CurvatureInterface] | None = None, backend_kwargs: dict[str, Any] | None = None, asdl_fisher_kwargs: dict[str, Any] | None = None)
Bases: ParametricLaplace
Laplace approximation with diagonal log likelihood Hessian approximation
and hence posterior precision.
Mathematically, we have \(P \approx \textrm{diag}(P)\).
See BaseLaplace
for the full interface.
Methods:
- fit – Fit the local Laplace approximation at the parameters of the model.
- log_marginal_likelihood – Compute the Laplace approximation to the log marginal likelihood subject to specific Hessian approximations.
- __call__ – Compute the posterior predictive on input data x.
- log_prob – Compute the log probability under the (current) Laplace approximation.
- functional_samples – Sample from the function-space posterior on input data x.
- predictive_samples – Sample from the posterior predictive on input data x, i.e., the respective inverse-link function is applied to the functional samples.
Attributes:
- log_likelihood (Tensor) – Compute log likelihood on the training data after .fit() has been called.
- prior_precision_diag (Tensor) – Obtain the diagonal prior precision \(p_0\) constructed from either a scalar, layer-wise, or diagonal prior precision.
- scatter (Tensor) – Computes the scatter, a term of the log marginal likelihood that corresponds to L-2 regularization.
- log_det_prior_precision (Tensor) – Compute log determinant of the prior precision.
- log_det_ratio (Tensor) – Compute the log determinant ratio, a part of the log marginal likelihood.
- posterior_precision (Tensor) – Diagonal posterior precision \(p\).
- posterior_scale (Tensor) – Diagonal posterior scale \(\sqrt{p^{-1}}\).
- posterior_variance (Tensor) – Diagonal posterior variance \(p^{-1}\).
Source code in laplace/baselaplace.py
log_likelihood
#
Compute log likelihood on the training data after .fit()
has been called.
The log likelihood is computed on-demand based on the loss and, for example,
the observation noise which makes it differentiable in the latter for
iterative updates.
Returns:
-
log_likelihood
(Tensor
) –
prior_precision_diag
#
Obtain the diagonal prior precision \(p_0\) constructed from either a scalar, layer-wise, or diagonal prior precision.
Returns:
-
prior_precision_diag
(Tensor
) –
scatter
#
Computes the scatter, a term of the log marginal likelihood that
corresponds to L-2 regularization:
scatter
= \((\theta_{MAP} - \mu_0)^{T} P_0 (\theta_{MAP} - \mu_0) \).
Returns:
-
scatter
(Tensor
) –
log_det_prior_precision
#
Compute log determinant of the prior precision \(\log \det P_0\)
Returns:
-
log_det
(Tensor
) –
log_det_ratio
#
Compute the log determinant ratio, a part of the log marginal likelihood.
Returns:
-
log_det_ratio
(Tensor
) –
posterior_precision
#
Diagonal posterior precision \(p\).
Returns:
- precision (tensor) – (parameters)
posterior_scale
#
Diagonal posterior scale \(\sqrt{p^{-1}}\).
Returns:
- precision (tensor) – (parameters)
posterior_variance
#
Diagonal posterior variance \(p^{-1}\).
Returns:
- precision (tensor) – (parameters)
fit
#
fit(train_loader: DataLoader, override: bool = True, progress_bar: bool = False) -> None
Fit the local Laplace approximation at the parameters of the model.
Parameters:
-
train_loader
#DataLoader
) –each iterate is a training batch, either
(X, y)
tensors or a dict-like object containing keys as expressed byself.dict_key_x
andself.dict_key_y
.train_loader.dataset
needs to be set to access \(N\), size of the data set. -
override
#bool
, default:True
) –whether to initialize H, loss, and n_data again; setting to False is useful for online learning settings to accumulate a sequential posterior approximation.
-
progress_bar
#bool
, default:False
) –whether to show a progress bar; updated at every batch-Hessian computation. Useful for very large model and large amount of data, esp. when
subset_of_weights='all'
.
Source code in laplace/baselaplace.py
log_marginal_likelihood
#
log_marginal_likelihood(prior_precision: Tensor | None = None, sigma_noise: Tensor | None = None) -> Tensor
Compute the Laplace approximation to the log marginal likelihood subject
to specific Hessian approximations that subclasses implement.
Requires that the Laplace approximation has been fit before.
The resulting torch.Tensor is differentiable in prior_precision
and
sigma_noise
if these have gradients enabled.
By passing prior_precision
or sigma_noise
, the current value is
overwritten. This is useful for iterating on the log marginal likelihood.
Parameters:
-
prior_precision
#Tensor
, default:None
) –prior precision if should be changed from current
prior_precision
value -
sigma_noise
#Tensor
, default:None
) –observation noise standard deviation if should be changed
Returns:
-
log_marglik
(Tensor
) –
Source code in laplace/baselaplace.py
__call__
#
__call__(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None, fitting: bool = False, **model_kwargs: dict[str, Any]) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
.
Parameters:
-
x
#Tensor or MutableMapping
) –(batch_size, input_shape)
if tensor. If MutableMapping, must contain the said tensor. -
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here. When Laplace is done only on subset of parameters (i.e. some grad are disabled), only
nn
predictive is supported. -
link_approx
#('mc', 'probit', 'bridge', 'bridge_norm')
, default:'mc'
) –how to approximate the classification link function for the
'glm'
. Forpred_type='nn'
, only 'mc' is possible. -
joint
#bool
, default:False
) –Whether to output a joint predictive distribution in regression with
pred_type='glm'
. If set toTrue
, the predictive distribution has the same form as GP posterior, i.e. N([f(x1), ...,f(xm)], Cov[f(x1), ..., f(xm)]). IfFalse
, then only outputs the marginal predictive distribution. Only available for regression and GLM predictive. -
n_samples
#int
, default:100
) –number of samples for
link_approx='mc'
. -
diagonal_output
#bool
, default:False
) –whether to use a diagonalized posterior predictive on the outputs. Only works for
pred_type='glm'
whenjoint=False
in regression. In the case of last-layer Laplace with a diagonal or Kron Hessian, setting this toTrue
makes computation much(!) faster for large number of outputs. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used).
-
fitting
#bool
, default:False
) –whether or not this predictive call is done during fitting. Only useful for reward modeling: the likelihood is set to
"regression"
whenFalse
and"classification"
whenTrue
.
Returns:
-
predictive
(Tensor or tuple[Tensor]
) –For
likelihood='classification'
, a torch.Tensor is returned with a distribution over classes (similar to a Softmax). Forlikelihood='regression'
, a tuple of torch.Tensor is returned with the mean and the predictive variance. Forlikelihood='regression'
andjoint=True
, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
_glm_forward_call
#
_glm_forward_call(x: Tensor | MutableMapping, likelihood: Likelihood | str, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
for "glm" pred type.
Parameters:
-
x
#Tensor or MutableMapping
) –(batch_size, input_shape)
if tensor. If MutableMapping, must contain the said tensor. -
likelihood
#Likelihood or str in {'classification', 'regression', 'reward_modeling'}
) –determines the log likelihood Hessian approximation.
-
link_approx
#('mc', 'probit', 'bridge', 'bridge_norm')
, default:'mc'
) –how to approximate the classification link function for the
'glm'
. Forpred_type='nn'
, only 'mc' is possible. -
joint
#bool
, default:False
) –Whether to output a joint predictive distribution in regression with
pred_type='glm'
. If set toTrue
, the predictive distribution has the same form as GP posterior, i.e. N([f(x1), ...,f(xm)], Cov[f(x1), ..., f(xm)]). IfFalse
, then only outputs the marginal predictive distribution. Only available for regression and GLM predictive. -
n_samples
#int
, default:100
) –number of samples for
link_approx='mc'
. -
diagonal_output
#bool
, default:False
) –whether to use a diagonalized posterior predictive on the outputs. Only works for
pred_type='glm'
andlink_approx='mc'
.
Returns:
-
predictive
(Tensor or tuple[Tensor]
) –For
likelihood='classification'
, a torch.Tensor is returned with a distribution over classes (similar to a Softmax). Forlikelihood='regression'
, a tuple of torch.Tensor is returned with the mean and the predictive variance. Forlikelihood='regression'
andjoint=True
, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
_glm_functional_samples
#
_glm_functional_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior functional on input data x
using "glm" prediction
type.
Parameters:
-
f_mu
#Tensor or MutableMapping
) –glm predictive mean
(batch_size, output_shape)
-
f_var
#Tensor or MutableMapping
) –glm predictive covariances
(batch_size, output_shape, output_shape)
-
n_samples
#int
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs.
-
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
_glm_predictive_samples
#
_glm_predictive_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x
using "glm" prediction
type. I.e., the inverse-link function corresponding to the likelihood is applied
on top of the functional sample.
Parameters:
-
f_mu
#Tensor or MutableMapping
) –glm predictive mean
(batch_size, output_shape)
-
f_var
#Tensor or MutableMapping
) –glm predictive covariances
(batch_size, output_shape, output_shape)
-
n_samples
#int
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs.
-
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
log_prob
#
log_prob(value: Tensor, normalized: bool = True) -> Tensor
Compute the log probability under the (current) Laplace approximation.
Parameters:
-
value
#Tensor
) – -
normalized
#bool
, default:True
) –whether to return log of a properly normalized Gaussian or just the terms that depend on
value
.
Returns:
-
log_prob
(Tensor
) –
Source code in laplace/baselaplace.py
functional_samples
#
functional_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the function-space posterior on input data x
.
Can be used, for example, for Thompson sampling or to compute an arbitrary
expectation.
Parameters:
-
x
#Tensor or MutableMapping
) –input data
(batch_size, input_shape)
-
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
-
n_samples
#int
, default:100
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs. Only applies when
pred_type='glm'
. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
predictive_samples
#
predictive_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x
. I.e., the respective
inverse-link function (e.g. softmax) is applied on top of the functional
sample.
Parameters:
-
x
#Tensor or MutableMapping
) –input data
(batch_size, input_shape)
-
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
-
n_samples
#int
, default:100
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs. Only applies when
pred_type='glm'
. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
KronLaplace
#
KronLaplace(model: Module, likelihood: Likelihood | str, sigma_noise: float | Tensor = 1.0, prior_precision: float | Tensor = 1.0, prior_mean: float | Tensor = 0.0, temperature: float = 1.0, enable_backprop: bool = False, dict_key_x: str = 'input_ids', dict_key_y: str = 'labels', backend: type[CurvatureInterface] | None = None, damping: bool = False, backend_kwargs: dict[str, Any] | None = None, asdl_fisher_kwargs: dict[str, Any] | None = None)
Bases: ParametricLaplace
Laplace approximation with Kronecker factored log likelihood Hessian approximation and hence posterior precision.
Mathematically, we have for each parameter group, e.g., torch.nn.Module, that \(P \approx Q \otimes H\).
See BaseLaplace for the full interface, and see laplace.utils.matrix.Kron and laplace.utils.matrix.KronDecomposed for the structure of the Kronecker factors. Kron is used to aggregate factors by summing up, and KronDecomposed is used to add the prior, a Hessian factor (e.g. temperature), and to compute posterior covariances, the marginal likelihood, etc.
Damping can be enabled by setting damping=True.
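As a rough illustration of why the Kronecker structure \(Q \otimes H\) is cheap to store (this is not how Kron or KronDecomposed are implemented internally):

```python
import torch

# For an illustrative Linear(in_features=20, out_features=10) layer, the dense Hessian block
# would be (20*10) x (20*10) = 200 x 200, whereas the Kronecker factors are only 20x20 and 10x10.
Q = torch.randn(20, 20)     # input-based factor (illustrative values)
H = torch.randn(10, 10)     # output/GGN-based factor (illustrative values)
P_block = torch.kron(Q, H)  # materializing the product is usually avoided in practice
print(P_block.shape)        # torch.Size([200, 200])
```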
Methods:
- log_marginal_likelihood – Compute the Laplace approximation to the log marginal likelihood subject to specific Hessian approximations.
- __call__ – Compute the posterior predictive on input data x.
- log_prob – Compute the log probability under the (current) Laplace approximation.
- functional_samples – Sample from the function-space posterior on input data x.
- predictive_samples – Sample from the posterior predictive on input data x, i.e., the respective inverse-link function is applied to the functional samples.
Attributes:
- log_likelihood (Tensor) – Compute log likelihood on the training data after .fit() has been called.
- prior_precision_diag (Tensor) – Obtain the diagonal prior precision \(p_0\) constructed from either a scalar, layer-wise, or diagonal prior precision.
- scatter (Tensor) – Computes the scatter, a term of the log marginal likelihood that corresponds to L-2 regularization.
- log_det_prior_precision (Tensor) – Compute log determinant of the prior precision.
- log_det_ratio (Tensor) – Compute the log determinant ratio, a part of the log marginal likelihood.
- posterior_precision (KronDecomposed) – Kronecker factored posterior precision \(P\).
Source code in laplace/baselaplace.py
log_likelihood
#
Compute log likelihood on the training data after .fit()
has been called.
The log likelihood is computed on-demand based on the loss and, for example,
the observation noise which makes it differentiable in the latter for
iterative updates.
Returns:
-
log_likelihood
(Tensor
) –
prior_precision_diag
#
Obtain the diagonal prior precision \(p_0\) constructed from either a scalar, layer-wise, or diagonal prior precision.
Returns:
-
prior_precision_diag
(Tensor
) –
scatter
#
Computes the scatter, a term of the log marginal likelihood that
corresponds to L-2 regularization:
scatter
= \((\theta_{MAP} - \mu_0)^{T} P_0 (\theta_{MAP} - \mu_0) \).
Returns:
-
scatter
(Tensor
) –
log_det_prior_precision
#
Compute log determinant of the prior precision \(\log \det P_0\)
Returns:
-
log_det
(Tensor
) –
log_det_ratio
#
Compute the log determinant ratio, a part of the log marginal likelihood.
Returns:
-
log_det_ratio
(Tensor
) –
posterior_precision
#
posterior_precision: KronDecomposed
Kronecker factored posterior precision \(P\).
Returns:
- precision (`laplace.utils.matrix.KronDecomposed`)
log_marginal_likelihood
#
log_marginal_likelihood(prior_precision: Tensor | None = None, sigma_noise: Tensor | None = None) -> Tensor
Compute the Laplace approximation to the log marginal likelihood subject
to specific Hessian approximations that subclasses implement.
Requires that the Laplace approximation has been fit before.
The resulting torch.Tensor is differentiable in prior_precision
and
sigma_noise
if these have gradients enabled.
By passing prior_precision
or sigma_noise
, the current value is
overwritten. This is useful for iterating on the log marginal likelihood.
Parameters:
-
prior_precision
#Tensor
, default:None
) –prior precision if should be changed from current
prior_precision
value -
sigma_noise
#Tensor
, default:None
) –observation noise standard deviation if should be changed
Returns:
-
log_marglik
(Tensor
) –
Source code in laplace/baselaplace.py
__call__
#
__call__(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None, fitting: bool = False, **model_kwargs: dict[str, Any]) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
.
Parameters:
-
x
#Tensor or MutableMapping
) –(batch_size, input_shape)
if tensor. If MutableMapping, must contain the said tensor. -
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here. When Laplace is done only on subset of parameters (i.e. some grad are disabled), only
nn
predictive is supported. -
link_approx
#('mc', 'probit', 'bridge', 'bridge_norm')
, default:'mc'
) –how to approximate the classification link function for the
'glm'
. Forpred_type='nn'
, only 'mc' is possible. -
joint
#bool
, default:False
) –Whether to output a joint predictive distribution in regression with
pred_type='glm'
. If set toTrue
, the predictive distribution has the same form as GP posterior, i.e. N([f(x1), ...,f(xm)], Cov[f(x1), ..., f(xm)]). IfFalse
, then only outputs the marginal predictive distribution. Only available for regression and GLM predictive. -
n_samples
#int
, default:100
) –number of samples for
link_approx='mc'
. -
diagonal_output
#bool
, default:False
) –whether to use a diagonalized posterior predictive on the outputs. Only works for
pred_type='glm'
whenjoint=False
in regression. In the case of last-layer Laplace with a diagonal or Kron Hessian, setting this toTrue
makes computation much(!) faster for large number of outputs. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used).
-
fitting
#bool
, default:False
) –whether or not this predictive call is done during fitting. Only useful for reward modeling: the likelihood is set to
"regression"
whenFalse
and"classification"
whenTrue
.
Returns:
-
predictive
(Tensor or tuple[Tensor]
) –For
likelihood='classification'
, a torch.Tensor is returned with a distribution over classes (similar to a Softmax). Forlikelihood='regression'
, a tuple of torch.Tensor is returned with the mean and the predictive variance. Forlikelihood='regression'
andjoint=True
, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
_glm_forward_call
#
_glm_forward_call(x: Tensor | MutableMapping, likelihood: Likelihood | str, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
for "glm" pred type.
Parameters:
-
x
#Tensor or MutableMapping
) –(batch_size, input_shape)
if tensor. If MutableMapping, must contain the said tensor. -
likelihood
#Likelihood or str in {'classification', 'regression', 'reward_modeling'}
) –determines the log likelihood Hessian approximation.
-
link_approx
#('mc', 'probit', 'bridge', 'bridge_norm')
, default:'mc'
) –how to approximate the classification link function for the
'glm'
. Forpred_type='nn'
, only 'mc' is possible. -
joint
#bool
, default:False
) –Whether to output a joint predictive distribution in regression with
pred_type='glm'
. If set toTrue
, the predictive distribution has the same form as GP posterior, i.e. N([f(x1), ...,f(xm)], Cov[f(x1), ..., f(xm)]). IfFalse
, then only outputs the marginal predictive distribution. Only available for regression and GLM predictive. -
n_samples
#int
, default:100
) –number of samples for
link_approx='mc'
. -
diagonal_output
#bool
, default:False
) –whether to use a diagonalized posterior predictive on the outputs. Only works for
pred_type='glm'
andlink_approx='mc'
.
Returns:
-
predictive
(Tensor or tuple[Tensor]
) –For
likelihood='classification'
, a torch.Tensor is returned with a distribution over classes (similar to a Softmax). Forlikelihood='regression'
, a tuple of torch.Tensor is returned with the mean and the predictive variance. Forlikelihood='regression'
andjoint=True
, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
_glm_functional_samples
#
_glm_functional_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior functional on input data x
using "glm" prediction
type.
Parameters:
-
f_mu
#Tensor or MutableMapping
) –glm predictive mean
(batch_size, output_shape)
-
f_var
#Tensor or MutableMapping
) –glm predictive covariances
(batch_size, output_shape, output_shape)
-
n_samples
#int
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs.
-
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
_glm_predictive_samples
#
_glm_predictive_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x
using "glm" prediction
type. I.e., the inverse-link function corresponding to the likelihood is applied
on top of the functional sample.
Parameters:
-
f_mu
#Tensor or MutableMapping
) –glm predictive mean
(batch_size, output_shape)
-
f_var
#Tensor or MutableMapping
) –glm predictive covariances
(batch_size, output_shape, output_shape)
-
n_samples
#int
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs.
-
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
log_prob
#
log_prob(value: Tensor, normalized: bool = True) -> Tensor
Compute the log probability under the (current) Laplace approximation.
Parameters:
-
value
#Tensor
) – -
normalized
#bool
, default:True
) –whether to return log of a properly normalized Gaussian or just the terms that depend on
value
.
Returns:
-
log_prob
(Tensor
) –
Source code in laplace/baselaplace.py
functional_samples
#
functional_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the function-space posterior on input data x
.
Can be used, for example, for Thompson sampling or to compute an arbitrary
expectation.
Parameters:
-
x
#Tensor or MutableMapping
) –input data
(batch_size, input_shape)
-
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
-
n_samples
#int
, default:100
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs. Only applies when
pred_type='glm'
. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
predictive_samples
#
predictive_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x
. I.e., the respective
inverse-link function (e.g. softmax) is applied on top of the functional
sample.
Parameters:
-
x
#Tensor or MutableMapping
) –input data
(batch_size, input_shape)
-
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
-
n_samples
#int
, default:100
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs. Only applies when
pred_type='glm'
. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
LowRankLaplace
#
LowRankLaplace(model: Module, likelihood: Likelihood | str, backend: type[CurvatureInterface] = AsdfghjklHessian if find_spec('asdfghjkl') is not None else CurvatureInterface, sigma_noise: float | Tensor = 1, prior_precision: float | Tensor = 1, prior_mean: float | Tensor = 0, temperature: float = 1, enable_backprop: bool = False, dict_key_x: str = 'input_ids', dict_key_y: str = 'labels', backend_kwargs: dict[str, Any] | None = None)
Bases: ParametricLaplace
Laplace approximation with low-rank log likelihood Hessian (approximation).
The low-rank matrix is represented by an eigendecomposition (vecs, values). Based on the chosen backend, either a true Hessian or, for example, a GGN approximation can be used.
The posterior precision is computed as
\( P = V \textrm{diag}(l) V^T + P_0. \)
To sample, compute the functional variance, and compute the log determinant, algebraic tricks are used to reduce the cost of inversion to that of a \(K \times K\) matrix if we have a rank of \(K\).
Note that only the AsdfghjklHessian backend is supported. Install it via:
pip install git+https://git@github.com/wiseodd/asdl@asdfghjkl
See BaseLaplace for the full interface.
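The \(K \times K\) trick mentioned above is essentially the Woodbury identity. A self-contained sketch with illustrative shapes (not the class's internal code):

```python
import torch

D, K = 500, 10                                     # number of parameters, rank
V = torch.randn(D, K, dtype=torch.float64)         # eigenvectors of the low-rank Hessian
l = torch.rand(K, dtype=torch.float64) + 0.1       # positive eigenvalues
p0 = torch.full((D,), 0.5, dtype=torch.float64)    # diagonal prior precision

# P = V diag(l) V^T + diag(p0); invert via Woodbury with only a K x K solve:
P0_inv_V = V / p0.unsqueeze(1)                     # diag(p0)^{-1} V, shape (D, K)
small = torch.diag(1.0 / l) + V.T @ P0_inv_V       # K x K system
P_inv = torch.diag(1.0 / p0) - P0_inv_V @ torch.linalg.solve(small, P0_inv_V.T)

# Check against the dense inverse
P = V @ torch.diag(l) @ V.T + torch.diag(p0)
print(torch.allclose(P_inv, torch.linalg.inv(P), atol=1e-6))
```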
Methods:
- log_marginal_likelihood – Compute the Laplace approximation to the log marginal likelihood subject to specific Hessian approximations.
- __call__ – Compute the posterior predictive on input data x.
- square_norm – Compute the squared norm under the posterior precision with value - self.mean as \(\Delta\).
- log_prob – Compute the log probability under the (current) Laplace approximation.
- functional_samples – Sample from the function-space posterior on input data x.
- predictive_samples – Sample from the posterior predictive on input data x, i.e., the respective inverse-link function is applied to the functional samples.
Attributes:
- log_likelihood (Tensor) – Compute log likelihood on the training data after .fit() has been called.
- prior_precision_diag (Tensor) – Obtain the diagonal prior precision \(p_0\) constructed from either a scalar, layer-wise, or diagonal prior precision.
- scatter (Tensor) – Computes the scatter, a term of the log marginal likelihood that corresponds to L-2 regularization.
- log_det_prior_precision (Tensor) – Compute log determinant of the prior precision.
- log_det_ratio (Tensor) – Compute the log determinant ratio, a part of the log marginal likelihood.
- posterior_precision (tuple[tuple[Tensor, Tensor], Tensor]) – Return the correctly scaled posterior precision that would be constructed as H[0] @ diag(H[1]) @ H[0].T + self.prior_precision_diag.
Source code in laplace/baselaplace.py
log_likelihood
#
Compute log likelihood on the training data after .fit()
has been called.
The log likelihood is computed on-demand based on the loss and, for example,
the observation noise which makes it differentiable in the latter for
iterative updates.
Returns:
-
log_likelihood
(Tensor
) –
prior_precision_diag
#
Obtain the diagonal prior precision \(p_0\) constructed from either a scalar, layer-wise, or diagonal prior precision.
Returns:
-
prior_precision_diag
(Tensor
) –
scatter
#
Computes the scatter, a term of the log marginal likelihood that
corresponds to L-2 regularization:
scatter
= \((\theta_{MAP} - \mu_0)^{T} P_0 (\theta_{MAP} - \mu_0) \).
Returns:
-
scatter
(Tensor
) –
log_det_prior_precision
#
Compute log determinant of the prior precision \(\log \det P_0\)
Returns:
-
log_det
(Tensor
) –
log_det_ratio
#
Compute the log determinant ratio, a part of the log marginal likelihood.
Returns:
-
log_det_ratio
(Tensor
) –
posterior_precision
#
Return correctly scaled posterior precision that would be constructed as H[0] @ diag(H[1]) @ H[0].T + self.prior_precision_diag.
Returns:
- H (tuple(eigenvectors, eigenvalues)) – scaled self.H with temperature and loss factors.
- prior_precision_diag (Tensor) – diagonal prior precision of shape (parameters,) to be added to H.
log_marginal_likelihood
#
log_marginal_likelihood(prior_precision: Tensor | None = None, sigma_noise: Tensor | None = None) -> Tensor
Compute the Laplace approximation to the log marginal likelihood subject
to specific Hessian approximations that subclasses implement.
Requires that the Laplace approximation has been fit before.
The resulting torch.Tensor is differentiable in prior_precision
and
sigma_noise
if these have gradients enabled.
By passing prior_precision
or sigma_noise
, the current value is
overwritten. This is useful for iterating on the log marginal likelihood.
Parameters:
-
prior_precision
#Tensor
, default:None
) –prior precision if should be changed from current
prior_precision
value -
sigma_noise
#Tensor
, default:None
) –observation noise standard deviation if should be changed
Returns:
-
log_marglik
(Tensor
) –
Source code in laplace/baselaplace.py
__call__
#
__call__(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None, fitting: bool = False, **model_kwargs: dict[str, Any]) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
.
Parameters:
-
x
#Tensor or MutableMapping
) –(batch_size, input_shape)
if tensor. If MutableMapping, must contain the said tensor. -
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here. When Laplace is done only on subset of parameters (i.e. some grad are disabled), only
nn
predictive is supported. -
link_approx
#('mc', 'probit', 'bridge', 'bridge_norm')
, default:'mc'
) –how to approximate the classification link function for the
'glm'
. Forpred_type='nn'
, only 'mc' is possible. -
joint
#bool
, default:False
) –Whether to output a joint predictive distribution in regression with
pred_type='glm'
. If set toTrue
, the predictive distribution has the same form as GP posterior, i.e. N([f(x1), ...,f(xm)], Cov[f(x1), ..., f(xm)]). IfFalse
, then only outputs the marginal predictive distribution. Only available for regression and GLM predictive. -
n_samples
#int
, default:100
) –number of samples for
link_approx='mc'
. -
diagonal_output
#bool
, default:False
) –whether to use a diagonalized posterior predictive on the outputs. Only works for
pred_type='glm'
whenjoint=False
in regression. In the case of last-layer Laplace with a diagonal or Kron Hessian, setting this toTrue
makes computation much(!) faster for large number of outputs. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used).
-
fitting
#bool
, default:False
) –whether or not this predictive call is done during fitting. Only useful for reward modeling: the likelihood is set to
"regression"
whenFalse
and"classification"
whenTrue
.
Returns:
-
predictive
(Tensor or tuple[Tensor]
) –For
likelihood='classification'
, a torch.Tensor is returned with a distribution over classes (similar to a Softmax). Forlikelihood='regression'
, a tuple of torch.Tensor is returned with the mean and the predictive variance. Forlikelihood='regression'
andjoint=True
, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
_glm_forward_call
#
_glm_forward_call(x: Tensor | MutableMapping, likelihood: Likelihood | str, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
for "glm" pred type.
Parameters:
-
x
#Tensor or MutableMapping
) –(batch_size, input_shape)
if tensor. If MutableMapping, must contain the said tensor. -
likelihood
#Likelihood or str in {'classification', 'regression', 'reward_modeling'}
) –determines the log likelihood Hessian approximation.
-
link_approx
#('mc', 'probit', 'bridge', 'bridge_norm')
, default:'mc'
) –how to approximate the classification link function for the
'glm'
. Forpred_type='nn'
, only 'mc' is possible. -
joint
#bool
, default:False
) –Whether to output a joint predictive distribution in regression with
pred_type='glm'
. If set toTrue
, the predictive distribution has the same form as GP posterior, i.e. N([f(x1), ...,f(xm)], Cov[f(x1), ..., f(xm)]). IfFalse
, then only outputs the marginal predictive distribution. Only available for regression and GLM predictive. -
n_samples
#int
, default:100
) –number of samples for
link_approx='mc'
. -
diagonal_output
#bool
, default:False
) –whether to use a diagonalized posterior predictive on the outputs. Only works for
pred_type='glm'
andlink_approx='mc'
.
Returns:
-
predictive
(Tensor or tuple[Tensor]
) –For
likelihood='classification'
, a torch.Tensor is returned with a distribution over classes (similar to a Softmax). Forlikelihood='regression'
, a tuple of torch.Tensor is returned with the mean and the predictive variance. Forlikelihood='regression'
andjoint=True
, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
_glm_functional_samples
#
_glm_functional_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior functional on input data x
using "glm" prediction
type.
Parameters:
-
f_mu
#Tensor or MutableMapping
) –glm predictive mean
(batch_size, output_shape)
-
f_var
#Tensor or MutableMapping
) –glm predictive covariances
(batch_size, output_shape, output_shape)
-
n_samples
#int
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs.
-
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
_glm_predictive_samples
#
_glm_predictive_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x
using "glm" prediction
type. I.e., the inverse-link function corresponding to the likelihood is applied
on top of the functional sample.
Parameters:
-
f_mu
#Tensor or MutableMapping
) –glm predictive mean
(batch_size, output_shape)
-
f_var
#Tensor or MutableMapping
) –glm predictive covariances
(batch_size, output_shape, output_shape)
-
n_samples
#int
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs.
-
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
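Conceptually, this helper draws from the Gaussian output distribution and pushes each draw through the inverse link. A standalone sketch of that computation (not the library's internal code; shown for a classification likelihood, regression would return the raw draws):

```python
import torch

def glm_predictive_samples_sketch(f_mu, f_var, n_samples):
    # f_mu: (batch_size, output_shape), f_var: (batch_size, output_shape, output_shape)
    dist = torch.distributions.MultivariateNormal(f_mu, covariance_matrix=f_var)
    fs = dist.sample(torch.Size([n_samples]))  # (n_samples, batch_size, output_shape)
    return torch.softmax(fs, dim=-1)           # inverse link for classification
```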
square_norm
#
Compute the square norm under the posterior precision with value - self.mean
as \(\Delta\), i.e. \(\Delta^\top P \Delta\).
Returns:
-
square_form
–
log_prob
#
log_prob(value: Tensor, normalized: bool = True) -> Tensor
Compute the log probability under the (current) Laplace approximation.
Parameters:
-
value
#Tensor
) – -
normalized
#bool
, default:True
) –whether to return log of a properly normalized Gaussian or just the terms that depend on
value
.
Returns:
-
log_prob
(Tensor
) –
Source code in laplace/baselaplace.py
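A small usage sketch, assuming `la` is a fitted Laplace object; here the query point is simply the MAP estimate itself:

```python
# Log density at the MAP (the mode of the Gaussian posterior approximation).
lp_map = la.log_prob(la.mean, normalized=True)

# Unnormalized variant: only the terms that depend on the query point.
lp_unnorm = la.log_prob(la.mean, normalized=False)
```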
functional_samples
#
functional_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the function-space posterior on input data x
.
Can be used, for example, for Thompson sampling or to compute an arbitrary
expectation.
Parameters:
-
x
#Tensor or MutableMapping
) –input data
(batch_size, input_shape)
-
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
-
n_samples
#int
, default:100
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs. Only applies when
pred_type='glm'
. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
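As a usage sketch of the Thompson-sampling use case mentioned above, assuming a fitted single-output regression/reward model `la` and a hypothetical candidate batch `X_candidates`:

```python
# One function draw over all candidates; shape (1, batch_size, output_shape).
f_sample = la.functional_samples(X_candidates, pred_type="glm", n_samples=1)

# Pick the candidate with the largest sampled function value.
best_idx = f_sample.squeeze(0).squeeze(-1).argmax().item()
```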
predictive_samples
#
predictive_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x
. I.e., the respective
inverse-link function (e.g. softmax) is applied on top of the functional
sample.
Parameters:
-
x
#Tensor or MutableMapping
) –input data
(batch_size, input_shape)
-
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
-
n_samples
#int
, default:100
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs. Only applies when
pred_type='glm'
. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
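A sketch of Monte Carlo predictive estimates built from these samples, assuming a fitted classification Laplace object `la` and a hypothetical test batch `X_test`:

```python
samples = la.predictive_samples(X_test, pred_type="glm", n_samples=100)
# samples: (n_samples, batch_size, n_classes), already pushed through the softmax

probs_mean = samples.mean(dim=0)  # Monte Carlo estimate of the posterior predictive
probs_std = samples.std(dim=0)    # per-class spread as a rough uncertainty signal
```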
FullLaplace
#
FullLaplace(model: Module, likelihood: Likelihood | str, sigma_noise: float | Tensor = 1.0, prior_precision: float | Tensor = 1.0, prior_mean: float | Tensor = 0.0, temperature: float = 1.0, enable_backprop: bool = False, dict_key_x: str = 'input_ids', dict_key_y: str = 'labels', backend: type[CurvatureInterface] | None = None, backend_kwargs: dict[str, Any] | None = None)
Bases: ParametricLaplace
Laplace approximation with full, i.e., dense, log likelihood Hessian approximation
and hence posterior precision. Based on the chosen backend
parameter, the full
approximation can be, for example, a generalized Gauss-Newton matrix.
Mathematically, we have \(P \in \mathbb{R}^{D \times D}\), where \(D\) is the number of parameters.
See BaseLaplace
for the full interface.
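A minimal end-to-end sketch, assuming `model`, `train_loader`, and `X_test` exist; note that the dense Hessian quickly becomes infeasible for large networks:

```python
from laplace.baselaplace import FullLaplace

la = FullLaplace(model, likelihood="classification")
la.fit(train_loader)

# Linearized (GLM) posterior predictive over a test batch.
probs = la(X_test, pred_type="glm", link_approx="probit")
```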
Methods:
-
log_marginal_likelihood
–Compute the Laplace approximation to the log marginal likelihood subject
-
__call__
–Compute the posterior predictive on input data
x
. -
log_prob
–Compute the log probability under the (current) Laplace approximation.
-
functional_samples
–Sample from the function-space posterior on input data
x
. -
predictive_samples
–Sample from the posterior predictive on input data
x
. I.e., the respective
Attributes:
-
log_likelihood
(Tensor
) –Compute log likelihood on the training data after
.fit()
has been called. -
prior_precision_diag
(Tensor
) –Obtain the diagonal prior precision \(p_0\) constructed from either
-
scatter
(Tensor
) –Computes the scatter, a term of the log marginal likelihood that
-
log_det_prior_precision
(Tensor
) –Compute log determinant of the prior precision
-
log_det_ratio
(Tensor
) –Compute the log determinant ratio, a part of the log marginal likelihood.
-
posterior_scale
(Tensor
) –Posterior scale (square root of the covariance), i.e.,
-
posterior_covariance
(Tensor
) –Posterior covariance, i.e., \(P^{-1}\).
-
posterior_precision
(Tensor
) –Posterior precision \(P\).
Source code in laplace/baselaplace.py
log_likelihood
#
Compute log likelihood on the training data after .fit()
has been called.
The log likelihood is computed on-demand based on the loss and, for example,
the observation noise which makes it differentiable in the latter for
iterative updates.
Returns:
-
log_likelihood
(Tensor
) –
prior_precision_diag
#
Obtain the diagonal prior precision \(p_0\) constructed from either a scalar, layer-wise, or diagonal prior precision.
Returns:
-
prior_precision_diag
(Tensor
) –
scatter
#
Computes the scatter, a term of the log marginal likelihood that
corresponds to L-2 regularization:
scatter
= \((\theta_{MAP} - \mu_0)^{T} P_0 (\theta_{MAP} - \mu_0) \).
Returns:
-
scatter
(Tensor
) –
log_det_prior_precision
#
Compute log determinant of the prior precision \(\log \det P_0\)
Returns:
-
log_det
(Tensor
) –
log_det_ratio
#
Compute the log determinant ratio, a part of the log marginal likelihood.
Returns:
-
log_det_ratio
(Tensor
) –
posterior_scale
#
Posterior scale (square root of the covariance), i.e., \(P^{-\frac{1}{2}}\).
Returns:
-
scale
(Tensor
) –(parameters, parameters)
posterior_covariance
#
Posterior covariance, i.e., \(P^{-1}\).
Returns:
-
covariance
(Tensor
) –(parameters, parameters)
posterior_precision
#
Posterior precision \(P\).
Returns:
-
precision
(Tensor
) –(parameters, parameters)
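These three quantities are mutually consistent. A quick sanity-check sketch on a fitted FullLaplace `la` (a small model is assumed, since all matrices are dense over the parameters):

```python
import torch

P = la.posterior_precision      # (parameters, parameters)
cov = la.posterior_covariance   # approximately P^{-1}
scale = la.posterior_scale      # a square root of the covariance

# Both checks should be close to zero up to numerical error.
print(torch.dist(scale @ scale.T, cov))
print(torch.dist(P @ cov, torch.eye(P.shape[0], dtype=P.dtype, device=P.device)))
```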
log_marginal_likelihood
#
log_marginal_likelihood(prior_precision: Tensor | None = None, sigma_noise: Tensor | None = None) -> Tensor
Compute the Laplace approximation to the log marginal likelihood subject
to specific Hessian approximations that subclasses implement.
Requires that the Laplace approximation has been fit before.
The resulting torch.Tensor is differentiable in prior_precision
and
sigma_noise
if these have gradients enabled.
By passing prior_precision
or sigma_noise
, the current value is
overwritten. This is useful for iterating on the log marginal likelihood.
Parameters:
-
prior_precision
#Tensor
, default:None
) –prior precision if should be changed from current
prior_precision
value -
sigma_noise
#Tensor
, default:None
) –observation noise standard deviation if should be changed
Returns:
-
log_marglik
(Tensor
) –
Source code in laplace/baselaplace.py
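A common use of this differentiability is post-hoc tuning of the prior precision and, for regression, the observation noise by gradient descent on the negative log marginal likelihood. A sketch assuming a fitted regression Laplace object `la` (the log-parameterization keeps both hyperparameters positive):

```python
import torch

log_prior = torch.zeros(1, requires_grad=True)
log_sigma = torch.zeros(1, requires_grad=True)
hyper_optimizer = torch.optim.Adam([log_prior, log_sigma], lr=1e-1)

for _ in range(100):
    hyper_optimizer.zero_grad()
    neg_marglik = -la.log_marginal_likelihood(log_prior.exp(), log_sigma.exp())
    neg_marglik.backward()
    hyper_optimizer.step()
```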
__call__
#
__call__(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None, fitting: bool = False, **model_kwargs: dict[str, Any]) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
.
Parameters:
-
x
#Tensor or MutableMapping
) –(batch_size, input_shape)
if a tensor. If a MutableMapping, it must contain that tensor. -
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here. When the Laplace approximation is done only on a subset of parameters (i.e. some gradients are disabled), only the
nn
predictive is supported. -
link_approx
#('mc', 'probit', 'bridge', 'bridge_norm')
, default:'probit'
) –how to approximate the classification link function for the
'glm'
. For pred_type='nn'
, only 'mc' is possible. -
joint
#bool
, default:False
) –Whether to output a joint predictive distribution in regression with
pred_type='glm'
. If set to True, the predictive distribution has the same form as a GP posterior, i.e. N([f(x1), ..., f(xm)], Cov[f(x1), ..., f(xm)]). If False, only the marginal predictive distribution is returned. Only available for regression and the GLM predictive. -
n_samples
#int
, default:100
) –number of samples for
link_approx='mc'
. -
diagonal_output
#bool
, default:False
) –whether to use a diagonalized posterior predictive on the outputs. Only works for
pred_type='glm'
when joint=False
in regression. In the case of last-layer Laplace with a diagonal or Kron Hessian, setting this to True
makes computation much(!) faster for a large number of outputs. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used).
-
fitting
#bool
, default:False
) –whether or not this predictive call is done during fitting. Only useful for reward modeling: the likelihood is set to
"regression"
when False
and "classification"
when True
.
Returns:
-
predictive
(Tensor or tuple[Tensor]
) –For
likelihood='classification'
, a torch.Tensor is returned with a distribution over classes (similar to a Softmax). For likelihood='regression', a tuple of torch.Tensor is returned with the mean and the predictive variance. For likelihood='regression' and joint=True, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
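As an illustration of the joint regression predictive described above, assuming `la` is a FullLaplace fitted with likelihood='regression' and `X_test` is a hypothetical batch of m inputs:

```python
# Marginal predictive: per-input mean and variance.
f_mu, f_var = la(X_test, pred_type="glm")

# Joint predictive: mean and full covariance across all m test points,
# i.e. N([f(x1), ..., f(xm)], Cov[f(x1), ..., f(xm)]).
f_mu_joint, f_cov = la(X_test, pred_type="glm", joint=True)
```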
_glm_forward_call
#
_glm_forward_call(x: Tensor | MutableMapping, likelihood: Likelihood | str, joint: bool = False, link_approx: LinkApprox | str = PROBIT, n_samples: int = 100, diagonal_output: bool = False) -> Tensor | tuple[Tensor, Tensor]
Compute the posterior predictive on input data x
for "glm" pred type.
Parameters:
-
x
#Tensor or MutableMapping
) –(batch_size, input_shape)
if a tensor. If a MutableMapping, it must contain that tensor. -
likelihood
#Likelihood or str in {'classification', 'regression', 'reward_modeling'}
) –determines the log likelihood Hessian approximation.
-
link_approx
#('mc', 'probit', 'bridge', 'bridge_norm')
, default:'probit'
) –how to approximate the classification link function for the
'glm'
. For pred_type='nn'
, only 'mc' is possible. -
joint
#bool
, default:False
) –Whether to output a joint predictive distribution in regression with
pred_type='glm'
. If set to True, the predictive distribution has the same form as a GP posterior, i.e. N([f(x1), ..., f(xm)], Cov[f(x1), ..., f(xm)]). If False, only the marginal predictive distribution is returned. Only available for regression and the GLM predictive. -
n_samples
#int
, default:100
) –number of samples for
link_approx='mc'
. -
diagonal_output
#bool
, default:False
) –whether to use a diagonalized posterior predictive on the outputs. Only works for
pred_type='glm'
andlink_approx='mc'
.
Returns:
-
predictive
(Tensor or tuple[Tensor]
) –For
likelihood='classification'
, a torch.Tensor is returned with a distribution over classes (similar to a Softmax). For likelihood='regression', a tuple of torch.Tensor is returned with the mean and the predictive variance. For likelihood='regression' and joint=True, a tuple of torch.Tensor is returned with the mean and the predictive covariance.
Source code in laplace/baselaplace.py
_glm_functional_samples
#
_glm_functional_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior functional on input data x
using "glm" prediction
type.
Parameters:
-
f_mu
#Tensor or MutableMapping
) –glm predictive mean
(batch_size, output_shape)
-
f_var
#Tensor or MutableMapping
) –glm predictive covariances
(batch_size, output_shape, output_shape)
-
n_samples
#int
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs.
-
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
_glm_predictive_samples
#
_glm_predictive_samples(f_mu: Tensor, f_var: Tensor, n_samples: int, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x
using "glm" prediction
type. I.e., the inverse-link function corresponding to the likelihood is applied
on top of the functional sample.
Parameters:
-
f_mu
#Tensor or MutableMapping
) –glm predictive mean
(batch_size, output_shape)
-
f_var
#Tensor or MutableMapping
) –glm predictive covariances
(batch_size, output_shape, output_shape)
-
n_samples
#int
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs.
-
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
log_prob
#
log_prob(value: Tensor, normalized: bool = True) -> Tensor
Compute the log probability under the (current) Laplace approximation.
Parameters:
-
value
#Tensor
) – -
normalized
#bool
, default:True
) –whether to return log of a properly normalized Gaussian or just the terms that depend on
value
.
Returns:
-
log_prob
(Tensor
) –
Source code in laplace/baselaplace.py
functional_samples
#
functional_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the function-space posterior on input data x
.
Can be used, for example, for Thompson sampling or to compute an arbitrary
expectation.
Parameters:
-
x
#Tensor or MutableMapping
) –input data
(batch_size, input_shape)
-
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
-
n_samples
#int
, default:100
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs. Only applies when
pred_type='glm'
. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)
Source code in laplace/baselaplace.py
predictive_samples
#
predictive_samples(x: Tensor | MutableMapping[str, Tensor | Any], pred_type: PredType | str = GLM, n_samples: int = 100, diagonal_output: bool = False, generator: Generator | None = None) -> Tensor
Sample from the posterior predictive on input data x
. I.e., the respective
inverse-link function (e.g. softmax) is applied on top of the functional
sample.
Parameters:
-
x
#Tensor or MutableMapping
) –input data
(batch_size, input_shape)
-
pred_type
#('glm', 'nn')
, default:'glm'
) –type of posterior predictive, linearized GLM predictive or neural network sampling predictive. The GLM predictive is consistent with the curvature approximations used here.
-
n_samples
#int
, default:100
) –number of samples
-
diagonal_output
#bool
, default:False
) –whether to use a diagonalized glm posterior predictive on the outputs. Only applies when
pred_type='glm'
. -
generator
#Generator
, default:None
) –random number generator to control the samples (if sampling used)
Returns:
-
samples
(Tensor
) –samples
(n_samples, batch_size, output_shape)