bnelearn.util.metrics module

This module implements metrics for evaluating learned strategies, e.g. estimated utility loss and distances between strategies or action vectors.

bnelearn.util.metrics.ex_interim_util_loss(env: AuctionEnvironment, player_position: int, agent_observations: Tensor, grid_size: int, opponent_batch_size: Optional[int] = None, grid_best_response: bool = False)[source]

Estimates a bidder’s utility loss in the current state of the environment, i.e. the potential benefit of deviating from the current strategy, evaluated at each point of agent_observations. To that end, we calculate

\[\max_{b_i^* \in A_i} E_{v_{-i}|v_i} \left[u(v_i, b_i^*, b_{-i}(v_{-i})) - u(v_i, b_i, b_{-i}(v_{-i}))\right]\]

We condition on the agent’s observation at player_position. That means the types and observations of the other players, as well as the agent’s own type, have to be drawn conditionally on this observation. Since the action is a function of the observation, the agent’s own action stays the same.

Args:

    env: bnelearn.Environment.
    player_position: int, position of the player in the environment.
    agent_observations: torch.Tensor, the observations at which the utility loss is evaluated.
    grid_size: int, the number of alternative actions sampled via
        env.agents[player_position].get_valuation_grid(grid_size, True).
    opponent_batch_size: int, specifying the sample size for opponents.
    grid_best_response: bool, whether the best responses are restricted to the grid or
        may also be the actual actions (in case no better response was found on the grid).

Returns:

    utility_loss (torch.Tensor, shape: [batch_size]): the computed approximate
        utility loss for each input observation.
    best_response (torch.Tensor, shape: [batch_size, action_size]): the best
        response found for each input observation (either a grid point or the
        actual action according to the player’s strategy).

Remarks:

Relies on availability of draw_conditional_profiles and generate_valuation_grid in the env’s ValuationObservationSampler.
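A minimal usage sketch, assuming an already configured AuctionEnvironment env and a batch of observations for the evaluated player; the variable names and numeric settings are illustrative only, not part of the API:

    from bnelearn.util.metrics import ex_interim_util_loss

    # Assumed to exist (hypothetical placeholders): `env` (a fully set-up
    # AuctionEnvironment) and `observations` (torch.Tensor of shape
    # [batch_size, observation_size]) for the player at position 0.
    utility_loss, best_response = ex_interim_util_loss(
        env=env,
        player_position=0,
        agent_observations=observations,
        grid_size=2 ** 6,              # number of alternative actions on the grid
        opponent_batch_size=2 ** 10,   # Monte-Carlo sample size for opponents
        grid_best_response=False,      # allow the actual action as best response
    )

    # Typical summary statistics over the observation batch.
    print(utility_loss.mean().item(), utility_loss.max().item())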

bnelearn.util.metrics.ex_interim_utility(env: AuctionEnvironment, player_position: int, agent_observations: Tensor, agent_actions: Tensor, opponent_batch_size: int) Tensor[source]

Wrapper around the ex interim utility computation that splits the work into sequential chunks if the device runs out of memory.
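The fallback can be pictured as follows; this is a generic sketch of the chunking pattern under a CUDA out-of-memory error, not bnelearn’s actual implementation:

    import torch

    def evaluate_in_chunks(fn, observations: torch.Tensor, n_chunks: int = 1) -> torch.Tensor:
        """Retry a batched computation on successively smaller chunks whenever
        CUDA runs out of memory. `fn` maps a batch of observations to a batch
        of results. Illustrative only."""
        while True:
            try:
                return torch.cat([fn(chunk) for chunk in observations.chunk(n_chunks)])
            except RuntimeError as e:
                if 'out of memory' not in str(e):
                    raise
                torch.cuda.empty_cache()  # free cached blocks before retrying
                n_chunks *= 2             # halve the per-chunk batch size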

bnelearn.util.metrics.ex_post_util_loss(mechanism: Mechanism, bidder_valuations: Tensor, bid_profile: Tensor, bidder: Bidder, grid: Tensor, half_precision=False)[source]

Estimates a bidder’s ex-post utility loss in the current bid_profile compared to alternative bids from a grid, i.e. the potential benefit of having deviated from the current strategy, as:

\[\texttt{util\_loss} = \max\left(0, \text{BR}(v_i, b_{-i}) - u_i(b_i, b_{-i})\right)\]
Args:

    mechanism: the Mechanism used for evaluation.
    bidder_valuations: the valuations of the player that is to be evaluated.
    bid_profile: (batch_size x n_player x n_items)
    bidder: a Bidder (used to retrieve valuations and utilities).
    grid:
        Option 1: 1d tensor of length grid_size. TODO: for n_items > 1, all
            grid_size**n_items combinations will be used; should be replaced
            by e.g. torch.meshgrid.
        Option 2: tensor of shape (grid_size, n_items).
    player_position (optional): specific position in which the player will be
        evaluated (defaults to the player_position of bidder).
    half_precision (optional, bool): whether to use half precision tensors.
        Default: False.

Returns:

    util_loss (torch.Tensor, shape: [batch_size])

Useful: to get the memory used by a tensor (in MB): (tensor.element_size() * tensor.nelement()) / (1024 * 1024)
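To illustrate the two accepted grid formats (and the memory estimate mentioned above), a small sketch with made-up sizes:

    import torch

    grid_size, n_items = 2 ** 6, 2

    # Option 1: a 1d tensor of candidate bids per item; per the note above, all
    # grid_size ** n_items combinations are evaluated for n_items > 1.
    grid_1d = torch.linspace(0.0, 1.0, grid_size)

    # Option 2: an explicit tensor of candidate bid vectors, shape (grid_size, n_items).
    # The combinations from Option 1 can be built with torch.meshgrid:
    mesh = torch.meshgrid(grid_1d, grid_1d, indexing='ij')
    grid_2d = torch.stack(mesh, dim=-1).reshape(-1, n_items)  # (grid_size ** 2, 2)

    # Memory used by the grid tensor in MB, as in the hint above.
    mem_mb = grid_2d.element_size() * grid_2d.nelement() / (1024 * 1024)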

bnelearn.util.metrics.get_best_responses_among_alternatives(env: AuctionEnvironment, player_position: int, agent_observations: Tensor, action_alternatives: Tensor, opponent_batch_size: int) Tuple[Tensor, IntTensor][source]

Wrapper for _get_best_responses_among_alternatives that splits the computation into sequential chunks if the device runs out of memory.
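A hedged usage sketch, again assuming env and a batch of observations already exist; interpreting the returned pair as per-observation best value and index of the chosen alternative is an assumption based on the signature, not confirmed by the docstring:

    import torch
    from bnelearn.util.metrics import get_best_responses_among_alternatives

    # Hypothetical set of candidate actions, shape (n_alternatives, action_size).
    alternatives = torch.linspace(0.0, 1.0, 2 ** 6).unsqueeze(-1)

    best_value, best_index = get_best_responses_among_alternatives(
        env=env,                          # assumed AuctionEnvironment
        player_position=0,
        agent_observations=observations,  # assumed observation batch
        action_alternatives=alternatives,
        opponent_batch_size=2 ** 10,
    )
    best_actions = alternatives[best_index]  # the selected alternative per observation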

bnelearn.util.metrics.norm_actions(b1: Tensor, b2: Tensor, p: float = 2) float[source]

Calculates the approximate “mean” Lp-norm between two action vectors.

\[\left( \frac{1}{n} \sum_{i=1}^n |b_{1,i} - b_{2,i}|^p \right)^{1/p}\]

If \(p = \infty\), this evaluates to the supremum norm.
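As a sanity check, the quantity can be computed directly in PyTorch; this hypothetical helper mirrors the formula above and is not part of the module:

    import torch

    def mean_lp_norm(b1: torch.Tensor, b2: torch.Tensor, p: float = 2) -> float:
        """'Mean' Lp-norm between two action vectors, as defined above.
        For p = inf the supremum of the absolute differences is returned.
        Illustrative helper only."""
        diff = (b1 - b2).abs()
        if p == float('inf'):
            return diff.max().item()
        return (diff ** p).mean().pow(1.0 / p).item()

    # Example: two bid vectors of length 4.
    b1 = torch.tensor([0.1, 0.4, 0.7, 1.0])
    b2 = torch.tensor([0.0, 0.5, 0.7, 0.9])
    print(mean_lp_norm(b1, b2, p=2))              # RMSE of the two vectors
    print(mean_lp_norm(b1, b2, p=float('inf')))   # supremum: 0.1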

bnelearn.util.metrics.norm_strategies(strategy1: Strategy, strategy2: Strategy, valuations: Tensor, p: float = 2) float[source]

Calculates the approximate “mean” \(L_p\)-norm between two strategies approximated via Monte-Carlo integration on a sample of valuations that have been drawn according to the prior.

The \(L_p\) norm between the two strategy functions is given by

\[\left( \int_V |s_1(v) - s_2(v)|^p dv \right)^{1/p}.\]

With Monte-Carlo integration this is approximated by

\[\left( \frac{|V|}{n} \sum_{i=1}^n |s_1(v_i) - s_2(v_i)|^p \right)^{1/p}\]

where \(|V|\) is the volume of the set \(V\). Here, we ignore the volume, which yields the RMSE for \(L_2\), the supremum for \(L_\infty\), etc.
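A Monte-Carlo sketch of this estimate, assuming each strategy exposes a play(valuations) -> actions method (as bnelearn strategies do); this is an illustration, not the module’s implementation:

    import torch

    def mc_lp_distance(strategy1, strategy2, valuations: torch.Tensor, p: float = 2) -> float:
        """Monte-Carlo estimate of the L_p distance above, ignoring the volume |V|.
        Assumes play(valuations) returns actions of shape [batch_size, action_size].
        Illustrative sketch only."""
        diff = (strategy1.play(valuations) - strategy2.play(valuations)).abs()
        if p == float('inf'):
            return diff.max().item()
        n = valuations.shape[0]
        return ((diff ** p).sum() / n).pow(1.0 / p).item()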

bnelearn.util.metrics.norm_strategy_and_actions(strategy, actions, valuations: Tensor, p: float = 2, componentwise=False) Tensor[source]

Calculates the norm as above, but given one action vector and one strategy. The valuations must match the given actions.

This helper function is useful when recalculating an action vector is prohibitive and it should be reused.

Args:

    strategy: Strategy
    actions: torch.Tensor
    valuations: torch.Tensor
    p: float = 2
    componentwise: bool = False, if True, only returns the smallest norm over
        all output dimensions

Returns:

norm: (scalar Tensor)
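A usage sketch, assuming a known strategy (e.g. an analytical BNE strategy) and precomputed actions for the same valuations; all names are hypothetical placeholders:

    from bnelearn.util.metrics import norm_strategy_and_actions

    # Assumed to exist (hypothetical): `bne_strategy` (a Strategy), `actions`
    # (the learner's precomputed bids), and the matching `valuations`.
    l2_distance = norm_strategy_and_actions(
        strategy=bne_strategy,
        actions=actions,
        valuations=valuations,
        p=2,
    )
    print(l2_distance.item())  # scalar tensor, cf. Returns above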