Source code for prefsampling.ordinal.urn

"""
Urn processes are random processes based on the idea that rankings are drawn from an urn. The
initial composition of the urn, the rules for drawing elements from it, and the evolution of the
elements of the urn ar the characteristic of each specific urn process.
"""

from __future__ import annotations

import math
from collections import Counter

import numpy as np

from prefsampling.combinatorics import all_profiles, generalised_ascending_factorial
from prefsampling.inputvalidators import validate_num_voters_candidates

from prefsampling.core.urn import urn_scheme


[docs] @validate_num_voters_candidates def urn( num_voters: int, num_candidates: int, alpha: float, seed: int = None ) -> list[list[int]]: """ Generates votes following the Pólya-Eggenberger urn culture. The process is as follows. The urn is initially empty and votes are generated one after the other, in turns. When generating a vote, the following happens. With a probability of 1/(urn_size + 1), the vote is selected uniformly at random (following an impartial culture). With probability `1/urn_size` a vote from the urn is selected uniformly at random. In both cases, the vote is put back in the urn together with `alpha * m!` copies of the vote (where `m` is the number of candidates). Note that for a given number of voters, votes are not sampled independently. Parameters ---------- num_voters: int Number of voters num_candidates: int Number of candidates alpha: float The dispersion coefficient (`alpha * m!` copies of a vote are put back in the urn after a draw). Must be non-negative. seed: int, default: :code:`None` The seed for the random number generator. Returns ------- np.ndarray The votes Examples -------- .. testcode:: from prefsampling.ordinal import urn # Sample from an urn model with 2 voters and 3 candidates, alpha parameter is 0.5. urn(2, 3, 0.5) # For reproducibility, you can set the seed. urn(2, 3, 4, seed=1002) # Passing a negative alpha will fail try: urn(2, 3, -0.5) except ValueError: pass Validation ---------- The probability distribution governing an urn model is well documented. Specifically, given :math:`n` agents and :math:`m` candidates, the probability of observing a profile in which a given ranking :math:`j` appears :math:`c_j` times is equal to: .. math:: \\frac{n!}{\\text{asc\\_fact}(m!, n, \\alpha \\times m!)} \\times \\prod_{j = 1}^{m!} \\frac{\\text{asc\\_fact}(1, c_j, \\alpha \\times m!)}{c_j!} where :math:`\\text{asc\\_fact}` is the generalised ascending factorial, defined as: .. math:: \\text{asc\\_fact}(x, \\ell, \\sigma) = x \\times (x + \\sigma) \\times \\cdots \\times (x + (\\ell - 1) \\times \\sigma). Since the probability only depends on the number of times each ranking appears in the profile, the space of outcome consists of all anonymous profiles, i.e., all representations of any profile as a multiset (in which the order of the voters do not matter). We test that the observed frequencies of anonymous profile is in line with the theoretical probability distribution. .. image:: ../validation_plots/ordinal/urn_0_0.png :width: 800 :alt: Observed versus theoretical frequencies for an urn model with alpha=0 .. image:: ../validation_plots/ordinal/urn_0_5.png :width: 800 :alt: Observed versus theoretical frequencies for an urn model with alpha=0.5 .. image:: ../validation_plots/ordinal/urn_1_0.png :width: 800 :alt: Observed versus theoretical frequencies for an urn model with alpha=1 When :math:`\\alpha = \\frac{1}{m!}`, we are supposed to obtain a uniform distribution over all anonymous profiles. .. image:: ../validation_plots/ordinal/urn_0_0416666666666666.png :width: 800 :alt: Observed versus theoretical frequencies for an urn model with alpha=1/m! References ---------- `Über die statistik verketteter vorgänge <https://onlinelibrary.wiley.com/doi/abs/10.1002/zamm.19230030407>`_, *Florian Eggenberger and György Pólya*, ZAMM-Journal of Applied Mathematics and Mechanics/Zeitschrift für Angewandte Mathematik und Mechanik, 3(4):279–289, 1923. `Paradox of Voting under an Urn Model: The Effect of Homogeneity <https://www.jstor.org/stable/30024551>`_, *Sven Berg*, Public Choice, Vol. 47, No. 2, 1985. """ rng = np.random.default_rng(seed) votes = urn_scheme( num_voters, alpha, lambda x: list(x.permutation(num_candidates)), rng ) return votes
def theoretical_distribution(num_voters, num_candidates, alpha, profiles=None) -> dict: if profiles is None: profiles = all_profiles(num_voters, num_candidates) factorial_num_candidates = math.factorial(num_candidates) distribution = {} for profile in profiles: counts = Counter(profile) probability = math.factorial(num_voters) / generalised_ascending_factorial( factorial_num_candidates, num_voters, alpha * factorial_num_candidates, ) for c in counts.values(): probability *= generalised_ascending_factorial( 1, c, alpha * factorial_num_candidates ) / math.factorial(c) distribution[profile] = probability normaliser = sum(distribution.values()) for r in distribution: distribution[r] /= normaliser return distribution