SVB module

Stochastic Bayesian inference of a nonlinear model

Infers:
  • Posterior mean values of model parameters
  • A posterior covariance matrix (which may be diagonal or a full positive-definite matrix)
The general order for tensor dimensions is:
  • Voxel indexing (V=number of voxels / W=number of parameter vertices)
  • Parameter indexing (P=number of parameters)
  • Sample indexing (S=number of samples)
  • Data point indexing (B=batch size, i.e. the number of time points being trained on; in some cases T=total number of time points in the full data)

This ordering is chosen to allow the use of TensorFlow batch matrix operations. However, it is inconvenient for the model, which needs to index its input by parameter. For this reason we transpose the tensor before calling the model's evaluate function, putting the P dimension first.
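A minimal sketch of this transpose, assuming TensorFlow and illustrative shape values (the fwd_model.evaluate call is hypothetical and only stands in for the model's actual evaluate method):

    import tensorflow as tf

    W, P, S = 100, 3, 20              # vertices, parameters, samples
    samples = tf.zeros([W, P, S])     # samples in the standard [W, P, S] ordering

    # Put the parameter dimension first so the model can index by parameter
    params_first = tf.transpose(samples, perm=[1, 0, 2])   # shape [P, W, S]
    # prediction = fwd_model.evaluate(params_first, tpts)  # hypothetical call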

The parameter vertices, W, are the set of points on which parameters are defined and output. They may be voxel centres or surface element vertices. The data voxels, V, on the other hand, are the points on which the data being fitted is defined. Typically these are volumetric voxels, since that is what most imaging experiments produce as raw data.

In many cases W will be the same as V, since we are inferring volumetric parameter maps from volumetric data. Alternatively, we might want to infer surface-based parameter maps while still comparing against the measured volumetric data; in this case V and W will differ. The key point at which this difference is handled is the model evaluation, which takes parameters defined on W and outputs a prediction defined on V.

In the current implementation V and W are always identical, but they may differ in the future: for example, we may want to estimate parameters on a surface (W=number of surface vertices) using data defined on a volume (V=number of voxels).
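One way the W-to-V mapping could be handled at model evaluation time is a linear projection, e.g. a [V, W] matrix mapping vertex values onto voxels. The sketch below is purely illustrative (random placeholder weights) and does not describe the package's actual mechanism:

    import numpy as np

    V, W = 50, 80                       # voxels, parameter vertices
    vertex_values = np.random.rand(W)   # a parameter map defined on vertices

    # Illustrative projector: each voxel value is a weighted sum of vertex values
    projector = np.random.rand(V, W)
    projector /= projector.sum(axis=1, keepdims=True)   # normalise rows

    voxel_values = projector @ vertex_values            # prediction space, shape [V]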

Ideas for per voxel/vertex convergence:

  • Maintain vertex_mask as a member variable, initially all ones (see the sketch after this list)
  • Mask vertices when generating samples and evaluating the model. The latent cost will then be computed over unmasked vertices only.
  • PROBLEM: the reconstruction cost is defined over the full voxel set, so the model evaluation must be projected onto all voxels. Masked vertices therefore still need to keep their previous model evaluation output.
  • Define criteria for masking vertices after each epoch
  • PROBLEM: spatial interactions make per-voxel convergence difficult. Maybe only do full-set convergence in this case (like Fabber)
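A minimal sketch of the masking idea above, assuming TensorFlow; vertex_mask and the surrounding names are hypothetical and not part of the current implementation:

    import tensorflow as tf

    W, P, S = 100, 3, 20
    vertex_mask = tf.Variable(tf.ones([W], dtype=tf.bool))   # initially all vertices active

    samples = tf.random.normal([W, P, S])
    active_samples = tf.boolean_mask(samples, vertex_mask)   # unmasked vertices only
    # ... the latent cost would be computed over active_samples; masked vertices
    # would retain their previous model evaluation for the reconstruction cost ...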
class svb.svb.SvbFit(data_model, fwd_model, **kwargs)

Stochastic Bayesian model fitting

Variables:
  • model – Model instance to be fitted to some data
  • prior – svb.prior.Prior instance defining the prior parameter distribution
  • post – svb.posterior.Posterior instance defining the posterior parameter distribution
  • params – Sequence of Parameter instances for the parameters to be inferred, including the model parameters and the noise parameter(s)
evaluate(*tensors)

Evaluate tensor values

Parameters: tensors – Sequence of tensors or names of tensors
Returns: If a single tensor is requested, its value as a Numpy array; otherwise a tuple of Numpy arrays
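Hypothetical usage (svb_fit is an existing SvbFit instance; the tensor names passed here are illustrative only, not guaranteed to exist in the graph):

    cost = svb_fit.evaluate("cost")                        # single tensor -> Numpy array
    mean, cov = svb_fit.evaluate("post_mean", "post_cov")  # multiple -> tuple of arrays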
fit_batch()

Train the model on a mini-batch of the input data.

Returns: Tuple of (total cost, latent cost, reconstruction cost) for the mini-batch
set_state(state)

Set the state of the optimization

Parameters: state – State as returned by the state() method
state()

Get the current state of the optimization.

This can be used to restart from a previous state if a numerical error occurs.
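A sketch of the restart pattern described above, where svb_fit is an existing SvbFit instance (the exception type caught here is illustrative):

    import tensorflow as tf

    saved_state = svb_fit.state()
    try:
        total_cost, latent_cost, reconst_cost = svb_fit.fit_batch()
    except tf.errors.InvalidArgumentError:
        svb_fit.set_state(saved_state)   # roll back to the last good state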

train(tpts, data, batch_size=None, sequential_batches=False, epochs=100, fit_only_epochs=0, display_step=1, learning_rate=0.1, lr_decay_rate=1.0, sample_size=None, ss_increase_factor=1.0, revert_post_trials=50, revert_post_final=True, **kwargs)

Train the graph to infer the posterior distribution given timeseries data

Parameters:
  • tpts – Time series values. Should have shape [T] or [V, T] depending on whether the timeseries is constant across voxels or varies voxelwise
  • data – Full timeseries data, shape [V, T]

Optional arguments:
  • batch_size – Batch size to use when training the model. Need not be a factor of T; however, if it is not, the batches will not all be the same size. If not specified, the full data size is used (i.e. no mini-batch optimization)
  • sequential_batches – If True, form batches from consecutive time points rather than by striding across the timeseries
  • epochs – Number of training epochs
  • fit_only_epochs – If specified, this number of initial epochs will be restricted to fitting only, ignoring prior information. In practice this means only the reconstruction loss is considered, not the latent cost
  • display_step – How many steps to execute for each display line
  • learning_rate – Initial learning rate
  • lr_decay_rate – When adjusting the learning rate, the factor to reduce it by
  • sample_size – Number of samples to use when estimating expectations over the posterior
  • ss_increase_factor – Factor by which to increase the sample size over the course of the epochs
  • revert_post_trials – How many epochs to continue for without an improvement in the mean cost before reverting the posterior to the previous best parameters
  • revert_post_final – If True, revert to the state giving the best cost achieved after the final epoch
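An illustrative training call, assuming volumetric data with V voxels and T time points; data_model and fwd_model are assumed to have been constructed elsewhere, and the keyword values are arbitrary examples:

    import numpy as np
    from svb.svb import SvbFit

    V, T = 1000, 50
    tpts = np.linspace(0, 5.0, T)    # shape [T]: same time points for all voxels
    data = np.random.rand(V, T)      # placeholder for the real timeseries data

    svb_fit = SvbFit(data_model, fwd_model)   # data_model/fwd_model: application-specific
    svb_fit.train(tpts, data, batch_size=10, epochs=500, learning_rate=0.05)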