SVB module

Stochastic Bayesian inference of a nonlinear model

Infers:
  • Posterior mean values of model parameters
  • A posterior covariance matrix (which may be diagonal or a full positive-definite matrix)
The general order for tensor dimensions is:
  • Voxel indexing (V=number of voxels / W=number of parameter vertices)
  • Parameter indexing (P=number of parameters)
  • Sample indexing (S=number of samples)
  • Data point indexing (B=batch size, i.e. the number of time points being trained on; in some cases T=total number of time points in the full data)

This ordering is chosen to allow the use of TensorFlow batch matrix operations. However, it is inconvenient for the model, which needs to index its input by parameter. For this reason we transpose the tensor before calling the model's evaluate function, putting the P dimension first.
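A minimal sketch of this transpose, assuming TensorFlow and illustrative shape values (the fwd_model.evaluate call is hypothetical and only stands in for the model's actual evaluate method):

    import tensorflow as tf

    W, P, S = 100, 3, 20              # vertices, parameters, samples
    samples = tf.zeros([W, P, S])     # samples in the standard [W, P, S] ordering

    # Put the parameter dimension first so the model can index by parameter
    params_first = tf.transpose(samples, perm=[1, 0, 2])   # shape [P, W, S]
    # prediction = fwd_model.evaluate(params_first, tpts)  # hypothetical call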

The parameter vertices, W, are the set of points on which parameters are defined and output. They may be voxel centres or surface element vertices. The data voxels, V, on the other hand, are the points on which the data being fitted is defined. Typically these are volumetric voxels, since that is what most imaging experiments produce as raw data.

In many cases W will be the same as V, since we are inferring volumetric parameter maps from volumetric data. Alternatively, we might want to infer surface-based parameter maps while still comparing against the measured volumetric data; in this case V and W will differ. The key point at which this difference is handled is the model evaluation, which takes parameters defined on W and outputs a prediction defined on V.

In the current implementation V and W are always identical, but they may differ in the future: for example, we may want to estimate parameters on a surface (W=number of surface vertices) using data defined on a volume (V=number of voxels).
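One way the W-to-V mapping could be handled at model evaluation time is a linear projection, e.g. a [V, W] matrix mapping vertex values onto voxels. The sketch below is purely illustrative (random placeholder weights) and does not describe the package's actual mechanism:

    import numpy as np

    V, W = 50, 80                       # voxels, parameter vertices
    vertex_values = np.random.rand(W)   # a parameter map defined on vertices

    # Illustrative projector: each voxel value is a weighted sum of vertex values
    projector = np.random.rand(V, W)
    projector /= projector.sum(axis=1, keepdims=True)   # normalise rows

    voxel_values = projector @ vertex_values            # prediction space, shape [V]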

Ideas for per voxel/vertex convergence:

  • Maintain vertex_mask as a member variable, initially all ones (see the sketch after this list)
  • Mask vertices when generating samples and evaluating the model. The latent cost will then be computed over unmasked vertices only.
  • PROBLEM: the reconstruction cost is defined over the full voxel set, so the model evaluation must be projected onto all voxels. Masked vertices therefore still need to keep their previous model evaluation output.
  • Define criteria for masking vertices after each epoch
  • PROBLEM: spatial interactions make per-voxel convergence difficult. Maybe only do full-set convergence in this case (like Fabber)
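A minimal sketch of the masking idea above, assuming TensorFlow; vertex_mask and the surrounding names are hypothetical and not part of the current implementation:

    import tensorflow as tf

    W, P, S = 100, 3, 20
    vertex_mask = tf.Variable(tf.ones([W], dtype=tf.bool))   # initially all vertices active

    samples = tf.random.normal([W, P, S])
    active_samples = tf.boolean_mask(samples, vertex_mask)   # unmasked vertices only
    # ... the latent cost would be computed over active_samples; masked vertices
    # would retain their previous model evaluation for the reconstruction cost ...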
class svb.svb.SvbFit(data_model, fwd_model, **kwargs)

Stochastic Bayesian model fitting

Variables:
  • model – Model instance to be fitted to some data
  • prior – svb.prior.Prior instance defining the prior parameter distribution
  • post – svb.posterior.Posterior instance defining the posterior parameter distribution
  • params – Sequence of Parameter instances for the parameters to be inferred, including the model parameters and the noise parameter(s)
evaluate(*tensors)

Evaluate tensor values

Parameters: tensors – Sequence of tensors or names of tensors
Returns: If a single tensor is requested, its value as a Numpy array; otherwise a tuple of Numpy arrays
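Hypothetical usage (svb_fit is an existing SvbFit instance; the tensor names passed here are illustrative only, not guaranteed to exist in the graph):

    cost = svb_fit.evaluate("cost")                        # single tensor -> Numpy array
    mean, cov = svb_fit.evaluate("post_mean", "post_cov")  # multiple -> tuple of arrays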
fit_batch()

Train the model on a mini-batch of the input data.

Returns: Tuple of (total cost, latent cost, reconstruction cost) for the mini-batch
set_state(state)

Set the state of the optimization

Parameters: state – State as returned by the state() method
state()

Get the current state of the optimization.

This can be used to restart from a previous state if a numerical error occurs.
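A sketch of the restart pattern described above, where svb_fit is an existing SvbFit instance (the exception type caught here is illustrative):

    import tensorflow as tf

    saved_state = svb_fit.state()
    try:
        total_cost, latent_cost, reconst_cost = svb_fit.fit_batch()
    except tf.errors.InvalidArgumentError:
        svb_fit.set_state(saved_state)   # roll back to the last good state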

train(tpts, data, batch_size=None, sequential_batches=False, epochs=100, fit_only_epochs=0, display_step=1, learning_rate=0.1, lr_decay_rate=1.0, sample_size=None, ss_increase_factor=1.0, revert_post_trials=50, revert_post_final=True, **kwargs)

Train the graph to infer the posterior distribution given timeseries data

Parameters:
  • tpts – Time series values. Should have shape [T] or [V, T] depending on whether the timeseries is constant across voxels or varies voxelwise
  • data – Full timeseries data, shape [V, T]

Optional arguments:
  • batch_size – Batch size to use when training the model. Need not be a factor of T; however, if it is not, the batches will not all be the same size. If not specified, the full data size is used (i.e. no mini-batch optimization)
  • sequential_batches – If True, form batches from consecutive time points rather than by striding across the timeseries
  • epochs – Number of training epochs
  • fit_only_epochs – If specified, this number of initial epochs will be restricted to fitting only, ignoring prior information. In practice this means only the reconstruction loss is considered, not the latent cost
  • display_step – How many steps to execute for each display line
  • learning_rate – Initial learning rate
  • lr_decay_rate – When adjusting the learning rate, the factor to reduce it by
  • sample_size – Number of samples to use when estimating expectations over the posterior
  • ss_increase_factor – Factor by which to increase the sample size over the course of the epochs
  • revert_post_trials – How many epochs to continue for without an improvement in the mean cost before reverting the posterior to the previous best parameters
  • revert_post_final – If True, revert to the state giving the best cost achieved after the final epoch
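An illustrative training call, assuming volumetric data with V voxels and T time points; data_model and fwd_model are assumed to have been constructed elsewhere, and the keyword values are arbitrary examples:

    import numpy as np
    from svb.svb import SvbFit

    V, T = 1000, 50
    tpts = np.linspace(0, 5.0, T)    # shape [T]: same time points for all voxels
    data = np.random.rand(V, T)      # placeholder for the real timeseries data

    svb_fit = SvbFit(data_model, fwd_model)   # data_model/fwd_model: application-specific
    svb_fit.train(tpts, data, batch_size=10, epochs=500, learning_rate=0.05)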