vegasflow package
Module contents
Monte Carlo integration with Tensorflow
Submodules
vegasflow.vflow module
This module contains the VegasFlow class and all its auxiliary functions
The main interfaces of this module are the VegasFlow class and the vegas_wrapper function
- vegasflow.vflow.importance_sampling_digest(xn, divisions)
Importance sampling algorithm: receives a random array of shape (n_dim, n_events) containing information about which bins of the grid (n_dims, BINS_MAX+1) the random points have to be sampled from
This algorithm is shared between the simplest form of Vegas (VegasFlow: only importance sampling) and Vegas+ (VegasFlowPlus: importance and stratified sampling) and so it has been lifted to its own function
- xn: float tensor (n_dim, n_events)
which bins to sample from
- divisions: float tensor (n_dims, BINS_MAX+1)
grid of divisions for the importance sampling algorithm
- Returns
ind_i (integer tensor (n_events, n_dim)) – index in the divisions grid from which the points should be sampled
x (float tensor (n_events, n_dim)) – random values sampled in the divisions grid
xdelta (float tensor (n_events,)) – weight of the random points
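The shapes above can be illustrated with a minimal pure-Python sketch. It assumes the standard Vegas convention, in which the integer part of each entry of xn selects the bin and the fractional part selects the position inside it; the name `importance_sampling_sketch` and the weight normalisation (bin width times number of bins) are assumptions made for illustration, not the library's implementation.

```python
def importance_sampling_sketch(xn, divisions):
    """Pure-Python illustration (hypothetical helper, not the vegasflow code).

    xn[d][e]     : number in [0, n_bins) for dimension d and event e
    divisions[d] : n_bins + 1 increasing bin boundaries for dimension d
    Returns ind_i (n_events, n_dim), x (n_events, n_dim), xdelta (n_events,).
    """
    n_dim = len(xn)
    n_events = len(xn[0])
    ind_i, x, xdelta = [], [], []
    for e in range(n_events):
        indices, point, weight = [], [], 1.0
        for d in range(n_dim):
            n_bins = len(divisions[d]) - 1
            idx = int(xn[d][e])        # integer part: which bin to sample from
            frac = xn[d][e] - idx      # fractional part: position inside the bin
            low = divisions[d][idx]
            width = divisions[d][idx + 1] - low
            indices.append(idx)
            point.append(low + frac * width)
            # Assumed normalisation: a uniform grid gives a weight of 1
            weight *= width * n_bins
        ind_i.append(indices)
        x.append(point)
        xdelta.append(weight)
    return ind_i, x, xdelta


# Two dimensions, two bins each; the second dimension has a narrow first bin
divisions = [[0.0, 0.5, 1.0], [0.0, 0.1, 1.0]]
xn = [[0.5, 1.5], [0.5, 1.0]]
ind_i, x, xdelta = importance_sampling_sketch(xn, divisions)
```

In this sketch, points that land in a narrow bin receive a small weight, which compensates for the fact that narrow bins are sampled more densely.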
- vegasflow.vflow.refine_grid_per_dimension(t_res_sq, subdivisions)
Modifies the boundaries for the vegas grid for a given dimension
- Parameters
t_res_sq (tensor) – array of results squared per bin
subdivisions (tensor) – current boundaries for the grid
- Returns
`new_divisions` – array with the new boundaries of the grid
- Return type
tensor
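The idea behind the refinement can be sketched in plain Python as follows. This is a simplified illustration, assuming strictly positive per-bin values and omitting the smoothing and damping steps that Vegas implementations normally apply; `rebin_equal_weight` is a hypothetical name, not a function of this package.

```python
def rebin_equal_weight(weights, boundaries):
    """Move the bin boundaries so that every new bin holds the same weight.

    weights    : positive value per old bin (e.g. results squared per bin)
    boundaries : len(weights) + 1 increasing boundaries of the old bins
    The weight is assumed to be spread uniformly inside each old bin.
    """
    n_bins = len(weights)
    target = sum(weights) / n_bins   # weight each new bin should contain
    new_boundaries = [boundaries[0]]
    old = 0      # index of the old bin currently being consumed
    acc = 0.0    # weight of the old bins that were fully consumed
    for k in range(1, n_bins):
        goal = k * target
        # Skip old bins that lie entirely below the k-th new boundary
        while acc + weights[old] < goal:
            acc += weights[old]
            old += 1
        # Interpolate linearly inside the old bin that contains the boundary
        frac = (goal - acc) / weights[old]
        width = boundaries[old + 1] - boundaries[old]
        new_boundaries.append(boundaries[old] + frac * width)
    new_boundaries.append(boundaries[-1])
    return new_boundaries


# The first bin carries three times the weight of the second one,
# so the inner boundary moves from 0.5 towards the first bin.
new_divisions = rebin_equal_weight([3.0, 1.0], [0.0, 0.5, 1.0])
```

A flat set of weights leaves the boundaries unchanged, while a peaked one concentrates the bins around the peak.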
- class vegasflow.vflow.VegasFlow(n_dim, n_events, train=True, main_dimension=0, **kwargs)
Bases:
MonteCarloFlow
Implementation of the importance sampling algorithm from Vegas.
- Parameters
- make_differentiable()
Freeze the grid if the function is to be called within a graph
- freeze_grid()
Stops the grid from refining any more
- unfreeze_grid()
Enable the refining of the grid
- save_grid(file_name)
Save the divisions array in a json file
- Parameters
file_name (str) – Filename in which to save the checkpoint
- load_grid(file_name=None, numpy_grid=None)
Load the divisions array from a json file or from a numpy_array
- Parameters
file_name (str) – Filename in which the grid json is stored
numpy_grid (np.array) – Numpy array to substitute divisions with
- refine_grid(arr_res2)
Receives an array with the values of the integral squared per bin per dimension (arr_res2.shape = (n_dim, self.grid_bins)) and reshapes the divisions attribute accordingly
- Parameters
arr_res2 – result of the integrand squared per dimension and grid bin
Function not compiled
- vegasflow.vflow.vegas_wrapper(integrand, n_dim, n_iter, total_n_events, **kwargs)
Convenience wrapper
- Parameters
integrand (tf.function) –
n_dim (number of dimensions) –
n_iter (number of iterations) –
total_n_events (number of events per iteration) –
- Returns
`final_result` (integral value)
`sigma` (monte carlo error)
- vegasflow.vflow.vegas_sampler(*args, **kwargs)
Convenience wrapper for sampling random numbers
- Parameters
integrand (tf.function) –
n_dim (number of dimensions) –
n_events (number of events per iteration) –
training_steps (number of training iterations) –
- Returns
`sampler`
- Return type
a reference to the generate_random_array method of the integrator class
vegasflow.plain module
Plain implementation of the plainest possible MonteCarlo
- class vegasflow.plain.PlainFlow(n_dim, n_events, events_limit=1000000, list_devices=['GPU'], verbose=True, xmin=None, xmax=None, **kwargs)
Bases:
MonteCarloFlow
Simple Monte Carlo integrator.
- vegasflow.plain.plain_wrapper(*args, **kwargs)
Wrapper around PlainFlow
- vegasflow.plain.plain_sampler(*args, **kwargs)
Wrapper sampler around PlainFlow
vegasflow.monte_carlo module
Abstract class for Monte Carlo integrators. It implements the distribution of events across multiple devices and the use of tensorflow graph technology
- Usage:
In order to implement a new MonteCarloFlow integrator it is necessary to implement (at least) two methods:
- _run_event: integrand
This function defines what to do in order to run one event of the Monte Carlo. It is used only for compilation, as the actual integration is done by the run_event method. In order to use the full capabilities of this library, _run_event can take a number of events as its input so it can run more than one event at the same time. All results from _run_event will be accumulated before being passed to _run_iteration.
- _run_iteration:
This function defines what to do in a full iteration of the MonteCarlo (i.e., what to do in order to run for n_events)
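The division of labour between the two methods can be sketched with a toy base class written in plain Python. Everything here (`ToyMonteCarlo`, `ToyPlain`, their constructors and the simplified `compile`) is a hypothetical stand-in meant to show the pattern; it does not reproduce the MonteCarloFlow API or its TensorFlow compilation.

```python
import math
import random
from abc import ABC, abstractmethod


class ToyMonteCarlo(ABC):
    """Hypothetical stand-in for the abstract integrator (not the vegasflow API)."""

    def __init__(self, n_dim, n_events, events_limit=1000):
        self.n_dim = n_dim
        self.n_events = n_events
        self.events_limit = events_limit
        self.integrand = None

    def compile(self, integrand):
        # The toy version only stores the integrand
        self.integrand = integrand

    @abstractmethod
    def _run_event(self, integrand, ncalls):
        """Run a batch of ncalls events and return a tuple of partial results."""

    @abstractmethod
    def _run_iteration(self):
        """Run a full iteration of n_events and return (result, error)."""

    def run_event(self):
        """Accumulate the output of _run_event over all steps of an iteration."""
        acc = None
        remaining = self.n_events
        while remaining > 0:
            ncalls = min(self.events_limit, remaining)
            partial = self._run_event(self.integrand, ncalls)
            if acc is None:
                acc = partial
            else:
                acc = tuple(a + p for a, p in zip(acc, partial))
            remaining -= ncalls
        return acc


class ToyPlain(ToyMonteCarlo):
    """Plain Monte Carlo on the unit hypercube built on the toy base class."""

    def __init__(self, *args, seed=0, **kwargs):
        super().__init__(*args, **kwargs)
        self.rng = random.Random(seed)

    def _run_event(self, integrand, ncalls):
        total = total_sq = 0.0
        for _ in range(ncalls):
            point = [self.rng.random() for _ in range(self.n_dim)]
            value = integrand(point)
            total += value
            total_sq += value * value
        return total, total_sq

    def _run_iteration(self):
        total, total_sq = self.run_event()
        mean = total / self.n_events
        variance = max(total_sq / self.n_events - mean * mean, 0.0)
        return mean, math.sqrt(variance / (self.n_events - 1))


integrator = ToyPlain(n_dim=2, n_events=20000, events_limit=5000)
integrator.compile(lambda point: point[0] + point[1])
result, error = integrator._run_iteration()
```

In this sketch _run_event only knows how to process one batch, while _run_iteration decides what to do with the accumulated sums, which mirrors the split described above.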
- Device distribution:
The default behaviour is defined in the configflow.py file.
This class will go through the devices given in the list_devices argument and consider them all active and enabled. The integration will then be broken into batches of events_limit, each of which will be given to the first idle device found. This means that if device A is two times faster than device B, it is expected to receive two times as many events. Likewise, if events_limit is greater than n_events, all events will be given to device A, as it is the first one found idle.
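The consequences of this scheduling rule can be checked with a small simulation. The sketch below is not the library's scheduler: `simulate_distribution` and the notion of a device "speed" in events per unit of time are assumptions introduced only to reproduce the behaviour described above.

```python
import heapq


def simulate_distribution(n_events, events_limit, device_speeds):
    """Give each batch of events to the first device that becomes idle.

    device_speeds maps a device name to its (hypothetical) speed in events
    per unit of time; the order of the dictionary is the search order.
    Returns a dictionary with the number of events each device received.
    """
    # Heap entries: (time at which the device is idle, position in list, name)
    idle_at = [(0.0, pos, name) for pos, name in enumerate(device_speeds)]
    heapq.heapify(idle_at)
    processed = {name: 0 for name in device_speeds}
    remaining = n_events
    while remaining > 0:
        batch = min(events_limit, remaining)
        time_idle, pos, name = heapq.heappop(idle_at)
        processed[name] += batch
        busy_for = batch / device_speeds[name]
        heapq.heappush(idle_at, (time_idle + busy_for, pos, name))
        remaining -= batch
    return processed


# Device A is twice as fast as device B
shares = simulate_distribution(600, 100, {"A": 2.0, "B": 1.0})
# events_limit above n_events: a single batch goes to the first device
single = simulate_distribution(50, 100, {"A": 2.0, "B": 1.0})
```

With these inputs device A ends up with twice as many events as device B in the first case and with all of them in the second.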
- vegasflow.monte_carlo.print_iteration(it, res, error, extra='', threshold=0.1)
Checks the size of the result to select between scientific notation and floating point notation
- class vegasflow.monte_carlo.MonteCarloFlow(n_dim, n_events, events_limit=1000000, list_devices=['GPU'], verbose=True, xmin=None, xmax=None, **kwargs)
Bases:
ABC
Parent class of all Monte Carlo integrators using tensorflow
- Parameters
n_dim (number of dimensions of the integrand) –
n_events (number of events per iteration) –
events_limit (maximum number of events per step) – if events_limit is below n_events, each iteration of the MC will be broken down into several steps in order to limit memory usage. Note: for better performance, when n_events is greater than events_limit, n_events should be exactly divisible by events_limit
list_devices (list of device type to use (use None to do the tensorflow default)) –
- property n_events
Number of events to run in a single iteration
- property events_per_run
Number of events to run in a single step. Use this variable to control how much the memory will be loaded
- property history
Returns a list with a tuple of results per iteration. This tuple contains:
result: result of each iteration
error: error of the corresponding iteration
histograms: list of histograms for the corresponding iteration
- property xjac
The default jacobian is 1 / total number of events
- generate_random_array(n_events, *args)
External interface for the generation of random points as a 2D array of (n_events, n_dim). It calls the internal version, _generate_random_array
- Parameters
n_events (number of events to generate) –
- Returns
`rnds` (array of (n_events, n_dim) random points)
`p(x)` (p(x) associated to the random points)
- set_seed(seed)
Sets the random seed
- get_device()
Looks in the list of devices until it finds an available device; once found, it marks the device as unavailable and returns it
- release_device(device)
Makes device available again
- device_run(ncalls, sent_pc=100.0, **kwargs)
Wrapper function to select a specific device when running the event. If the devices were not set, the tensorflow default will be used
- Parameters
ncalls (number of calls to pass to the integrand) –
- Returns
`result`
- Return type
raw result from the integrator
- set_distribute(queue_object)
Uses dask to distribute the vegasflow run onto a cluster. Takes as input a queue_object defining the jobs to be sent
- Parameters
queue_object (dask_jobqueue object) –
- make_differentiable()
Modifies the attributes of the integration so that it can be compiled inside Tensorflow functions (and, therefore, gradients calculated). Returns a reference to run_event, a method that, when called with no arguments, produces results and uncertainties for an integration iteration of ncalls events
- run_event(tensorize_events=False, **kwargs)
Runs the Monte Carlo event. This corresponds to a number of calls decided by the events_per_run variable. The variable acc is exposed in order to pass the tensor output back to the integrator in case it needs to accumulate.
The main driver of this function is the event attribute, which corresponds to the tensorflow compilation of the _run_event method together with the integrand.
- Return type
The accumulated result of running all steps
- trace(n_events=50)
Trace part of the integration (only integrand and random number generator). Note that this is not able to trace post-integration steps as the base class is blind to them
- compile(integrand, compilable=True, signature=None, trace=False, check=True)
Receives an integrand, prepares it for integration and tries to compile unless told otherwise.
The input integrand must receive, as an input, an array of random numbers. There is also one optional argument that will be passed to the function:
weight: weight of each event,
so that the most general signature for the integrand is:
integrand(array_random, weight = None),
the minimal working signature for the integrand will be
integrand(array_random).
In other words, the integrand must take at least one argument and the integrator will always pass the array of random numbers. For legacy compatibility, the keyword argument n_dim will be accepted, but it will be fixed to be equal to self.n_dim
This function will try to understand the signature of the function and compile it accordingly. This means:
<1> array_random: DTYPE of shape [None, n_dim]
<2> weight: DTYPE of shape [None]
If the function possesses any keyword arguments not included in this list, it will be compiled with a generic tf.function call. This will work most of the time but could trigger retracing on shape-shifting calculations.
If the signature is not to be used, it can be set to False
- Parameters
integrand (the function to integrate) –
compilable ((default True) if False, the integration) – is not passed through tf.function
signature ((default: True)) – whether to autodiscover the signature of the integrand
check ((default: True)) – check whether the integrand produces expected results and whether it is vectorial. Note: with check=False, vectorial output will not work
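The signature rules described above can be illustrated with the standard inspect module. This is only a sketch of the decision logic as stated in the text; `describe_integrand` and the dictionary it returns are hypothetical, and the actual compilation with tf.function is left out.

```python
import inspect

# Arguments the text above says the integrator knows how to handle
KNOWN_OPTIONAL = ("weight", "n_dim")


def describe_integrand(integrand):
    """Classify an integrand by its signature (illustrative helper only)."""
    names = list(inspect.signature(integrand).parameters)
    if not names:
        raise ValueError("the integrand must take at least one argument")
    optional = names[1:]  # the first argument is always the random array
    unknown = [name for name in optional if name not in KNOWN_OPTIONAL]
    return {
        "accepts_weight": "weight" in optional,
        # Unknown keyword arguments mean a generic compilation would be used
        "generic_compilation": bool(unknown),
    }


def minimal(array_random):
    """Minimal working signature."""
    return array_random


def general(array_random, weight=None):
    """Most general signature described in the text."""
    return array_random


def custom(array_random, scale=2.0):
    """Integrand with a keyword argument outside the known list."""
    return array_random


summary = {f.__name__: describe_integrand(f) for f in (minimal, general, custom)}
```

Only the third integrand would fall back to the generic compilation path in this sketch, because scale is not one of the arguments listed above.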
- run_integration(n_iter, log_time=True, histograms=None)
Runs the integrator for the chosen number of iterations.
histograms must be a tuple of tf.Variables. At the end of all iterations the histograms per iteration will be output. The variable histograms instead will contain the weighted accumulation of all histograms
- Parameters
- Returns
`final_result` (float) – integral value
`sigma` (float) – monte carlo error
Note: it is possible not to pass any histogram variable and still fill some histogram variable at integration time, but then it is the responsibility of the integrand to empty the histograms each iteration and accumulate them.
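How the per-iteration results might be merged into final_result and sigma can be sketched as an inverse-variance weighted average, which is the usual choice in Vegas-style integrators. Whether run_integration combines iterations exactly this way is an assumption, and `combine_iterations` is a hypothetical helper.

```python
import math


def combine_iterations(results):
    """Inverse-variance weighted average of (value, error) pairs.

    Assumed combination rule; each iteration is weighted by 1 / error**2.
    """
    weight_sum = 0.0
    weighted_value = 0.0
    for value, error in results:
        weight = 1.0 / error ** 2
        weighted_value += value * weight
        weight_sum += weight
    final_result = weighted_value / weight_sum
    sigma = math.sqrt(1.0 / weight_sum)
    return final_result, sigma


# Two iterations with the same error: the mean is the plain average and
# the combined error shrinks by a factor of sqrt(2)
final_result, sigma = combine_iterations([(1.0, 0.1), (1.2, 0.1)])
```

Under this rule, iterations with a smaller error dominate the final result.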
- vegasflow.monte_carlo.wrapper(integrator_class, integrand, n_dim, n_iter, total_n_events, compilable=True)
Convenience wrapper for integration
- Parameters
integrator_class (MonteCarloFlow inherited class) –
integrand (tf.function) –
n_dim (number of dimensions) –
n_iter (number of iterations) –
total_n_events (number of events per iteration) –
- Returns
`final_result` (integral value)
`sigma` (monte carlo error)
- vegasflow.monte_carlo.sampler(integrator_class, integrand, n_dim, total_n_events, training_steps=5, compilable=True, return_class=False)
Convenience wrapper for sampling random numbers
- Parameters
integrator_class (MonteCarloFlow inherited class) –
integrand (tf.function) –
n_dim (number of dimensions) –
total_n_events (number of events per iteration) –
training_steps (number of training iterations) –
return_class (whether to return the full instance of the class or only the random method) –
- Returns
`sampler`
- Return type
a reference to the generate_random_array method of the integrator class
vegasflow.utils module
This module contains tensorflow-compiled utilities
- vegasflow.utils.consume_array_into_indices(input_arr, indices, result_size)
Accumulate the input tensor input_arr into an output tensor of size result_size. The accumulation occurs according to the array of indices.
For instance, input_arr = [a,b,c,d] and vector column indices = [[0,1,0,0]].T (with result_size = 2) will result in final_result = (a+c+d, b)
- Parameters
input_arr – Array of results to be consumed
indices – Indices of the bins in which to accumulate the input array
result_size – size of the output array
- Returns
Array of size result_size
- Return type
final_result
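The accumulation rule can be written out in a few lines of plain Python. The sketch below works on lists instead of tensors and takes a flat list of indices instead of a column vector; `consume_sketch` is a hypothetical name used only for illustration.

```python
def consume_sketch(input_arr, indices, result_size):
    """Add every entry of input_arr to the output bin given by its index."""
    final_result = [0.0] * result_size
    for value, index in zip(input_arr, indices):
        final_result[index] += value
    return final_result


# [a, b, c, d] with indices [0, 1, 0, 0] gives (a + c + d, b)
a, b, c, d = 1.0, 2.0, 3.0, 4.0
accumulated = consume_sketch([a, b, c, d], [0, 1, 0, 0], 2)
```

Bins that receive no entry stay at zero, so result_size can be larger than the number of distinct indices.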
- vegasflow.utils.py_consume_array_into_indices(input_arr, indices, result_size)
Python interface wrapper for consume_array_into_indices. It casts the possible python-object input into the correct tensorflow types.
- vegasflow.utils.generate_condition_function(n_mask, condition='and')
Generates a function that takes a number of masks and returns a combination of all n_mask masks for the given condition.
It is possible to pass a list of allowed conditions; in that case the length of the list should be n_mask - 1 and the conditions will be applied sequentially.
Note that for 2 masks you can directly use & and |
>>> from vegasflow.utils import generate_condition_function
>>> import tensorflow as tf
>>> f_cond = generate_condition_function(2, condition='or')
>>> t_1 = tf.constant([True, False, True])
>>> t_2 = tf.constant([False, False, True])
>>> full_mask, indices = f_cond(t_1, t_2)
>>> print(f"{full_mask=}\n{indices=}")
full_mask=<tf.Tensor: shape=(3,), dtype=bool, numpy=array([ True, False,  True])>
indices=<tf.Tensor: shape=(2, 1), dtype=int32, numpy=
array([[0],
       [2]], dtype=int32)>
>>> f_cond = generate_condition_function(3, condition=['or', 'and'])
>>> t_1 = tf.constant([True, False, True])
>>> t_2 = tf.constant([False, False, True])
>>> t_3 = tf.constant([True, False, False])
>>> full_mask, indices = f_cond(t_1, t_2, t_3)
>>> print(f"{full_mask=}\n{indices=}")
full_mask=<tf.Tensor: shape=(3,), dtype=bool, numpy=array([ True, False, False])>
indices=<tf.Tensor: shape=(1, 1), dtype=int32, numpy=array([[0]], dtype=int32)>
- n_mask: int
Number of masks the function should accept
- condition: str (default='and')
Condition to apply to all masks. Accepted values are: and, or
- condition_to_idx: function
function(*masks) -> full_mask, true indices
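The sequential application of the conditions can be mimicked with Python lists of booleans. The following sketch reproduces the logic of the two examples above without TensorFlow; `generate_condition_sketch` is a hypothetical name, and the real function operates on tensors.

```python
def generate_condition_sketch(n_mask, condition="and"):
    """Build a function that combines n_mask boolean lists sequentially."""
    operators = {
        "and": lambda left, right: left and right,
        "or": lambda left, right: left or right,
    }
    if isinstance(condition, str):
        conditions = [condition] * (n_mask - 1)
    else:
        conditions = list(condition)
    if len(conditions) != n_mask - 1:
        raise ValueError("a list of conditions must have n_mask - 1 entries")

    def condition_to_idx(*masks):
        if len(masks) != n_mask:
            raise ValueError(f"expected {n_mask} masks, got {len(masks)}")
        full_mask = list(masks[0])
        # Combine the running mask with each further mask, left to right
        for cond, mask in zip(conditions, masks[1:]):
            combine = operators[cond]
            full_mask = [combine(a, b) for a, b in zip(full_mask, mask)]
        # Indices of the True entries, one per row as in the examples above
        indices = [[i] for i, flag in enumerate(full_mask) if flag]
        return full_mask, indices

    return condition_to_idx


f_cond = generate_condition_sketch(3, condition=["or", "and"])
full_mask, indices = f_cond(
    [True, False, True], [False, False, True], [True, False, False]
)
```

With three masks and the list ['or', 'and'], the result is (t_1 or t_2) and t_3, which matches the second example above.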