Dirichlet Distribution Models
The Dirichlet distribution model is parameterised by a vector of candidate qualities, with one quality score per candidate. To sample a ranking, the quality scores are used as the parameters of a Dirichlet distribution, from which a number of points is drawn for each candidate. The ranking then lists the candidates by decreasing number of points.
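The sampling step described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's implementation: the helper name `sample_didi_vote` is hypothetical, and the Dirichlet sample is obtained by normalising independent Gamma draws, which is a standard construction.

```python
import random


def sample_didi_vote(alphas, rng):
    """Sketch of one DiDi draw (hypothetical helper, not part of prefsampling)."""
    if any(a <= 0 for a in alphas):
        raise ValueError("All quality scores must be strictly positive.")
    # A Dirichlet(alphas) sample is a vector of independent Gamma(alpha_i, 1)
    # draws divided by their sum.
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    points = [g / total for g in gammas]
    # The vote lists the candidates by decreasing number of points.
    return sorted(range(len(alphas)), key=lambda c: points[c], reverse=True)


rng = random.Random(1002)
votes = [sample_didi_vote((0.5, 0.2, 0.1), rng) for _ in range(2)]
print(votes)  # two rankings of the candidates 0, 1, 2
```

Since dividing by the common sum does not change the order, the normalisation is kept only to make the link with the Dirichlet sample explicit.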
- didi(num_voters: int, num_candidates: int, alphas: list[float], seed: int = None) -> list[list[int]]
- Generates ordinal votes from the DiDi (Dirichlet Distribution) model.
- This model is parameterised by a vector alphas that intuitively indicates a quality for each candidate. Moreover, the higher the sum of the alphas, the more concentrated the Dirichlet distribution is, and thus the more correlated the votes are. To sample a vote, we sample a set of points, one per candidate, from a Dirichlet distribution parameterised by alphas. The vote then corresponds to the candidates ordered by decreasing number of points.
- A collection of num_voters votes is generated independently and identically following the process described above.
- This model is very similar in spirit to the plackett_luce() model.
- Parameters:
- num_voters (int) – Number of voters.
- num_candidates (int) – Number of candidates.
- alphas (list[float]) – Quality parameters of the model, one strictly positive value per candidate.
- seed (int, default: None) – Seed for the numpy random number generator.
 
- Returns:
- Ordinal votes. 
- Return type:
- list[list[int]] 
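The remark that a larger sum of alphas yields more correlated votes can be checked empirically. The sketch below is an illustration rather than library code: it re-implements the sampling step with `random.gammavariate`, the helpers `draw_vote` and `top_ranking_share` and the scaling factors are made up for the example, and the share of the most frequent ranking is only one possible proxy for correlation.

```python
import random
from collections import Counter


def draw_vote(alphas, rng):
    # Hypothetical helper: Gamma draws are ordered like the Dirichlet points,
    # since dividing by their common sum does not change the order.
    points = [rng.gammavariate(a, 1.0) for a in alphas]
    return tuple(sorted(range(len(alphas)), key=lambda c: -points[c]))


def top_ranking_share(alphas, num_voters, seed):
    """Fraction of voters casting the most frequent ranking."""
    rng = random.Random(seed)
    counts = Counter(draw_vote(alphas, rng) for _ in range(num_voters))
    return counts.most_common(1)[0][1] / num_voters


base = (0.5, 0.2, 0.1)
for scale in (1, 10, 100):
    # Same quality proportions, larger sum of alphas.
    alphas = tuple(scale * a for a in base)
    print(scale, top_ranking_share(alphas, 2000, seed=0))
```

With the same quality proportions, the share of the most frequent ranking should grow as the scale increases, reflecting votes that agree more often.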
 - Examples
    from prefsampling.ordinal import didi

    # Sample from a DiDi model with 2 voters and 3 candidates, the qualities of
    # candidates are 0.5, 0.2, and 0.1.
    didi(2, 3, (0.5, 0.2, 0.1))

    # For reproducibility, you can set the seed.
    didi(2, 3, (5, 2, 0.1), seed=1002)

    # Don't forget to provide a quality for all candidates
    try:
        didi(2, 3, (0.5, 0.2))
    except ValueError:
        pass

    # And all quality scores need to be strictly positive
    try:
        didi(2, 3, (0.5, 0.2, -0.4))
    except ValueError:
        pass
    try:
        didi(2, 3, (0.5, 0.2, 0))
    except ValueError:
        pass
- Validation
- The probability distribution induced by the DiDi model is not known in closed form in general: it depends on the order of the values in a Dirichlet sample, which makes the general computation involved. Still, we can check some special cases.
- First, when all qualities are the same, we should obtain a uniform distribution over all rankings.
![Observed versus theoretical frequencies for a DiDi model with alpha=[0.1, 0.1, 0.1, 0.1, 0.1]](../../_images/didi__0_1_0_1_0_1_0_1_0_1_.png)
- Second, in the special case of 2 candidates, we can easily compute an expression for the probability distribution of the model. Assume we have two candidates with qualities \alpha_0 and \alpha_1. Then, the probability of observing the ranking 0 \succ 1 is the probability of sampling two values x_0, x_1 from a Dirichlet distribution with parameters \alpha_0 and \alpha_1 such that x_0 > x_1. Since x_1 = 1 - x_0, this is the event x_0 > 0.5, and x_0 follows a Beta distribution with parameters \alpha_0 and \alpha_1. We thus have:
- \mathbb{P}(x_0 > x_1) = \mathbb{P}(x_0 > 0.5) = \frac{1}{B(\alpha_0, \alpha_1)} \int_{0.5}^1 x_0^{\alpha_0 - 1} \times (1 - x_0)^{\alpha_1 - 1} dx_0,
- where B(\alpha_0, \alpha_1) is the Beta function, the normalising constant of the density. We can compute an approximate value for this integral using scipy.
![Observed versus theoretical frequencies for a DiDi model with alpha=[0.1, 0.1]](../../_images/didi__1_0_0_3_.png)
![Observed versus theoretical frequencies for a DiDi model with alpha=[1, 0.3]](../../_images/didi__0_1_0_1_.png)
- In the general case, we obtain the following frequencies.
![Observed versus theoretical frequencies for a DiDi model with alpha=[0.2, 0.5, 0.3, 0.7, 0.2]](../../_images/didi__0_2_0_5_0_3_0_7_0_2_.png)
![Observed versus theoretical frequencies for a DiDi model with alpha=[1, 0.3, 0.3, 0.3, 0.3]](../../_images/didi__1_0_0_3_0_3_0_3_0_3_.png)
- References
- The DiDi model has not been referenced in any publication. Stanisław Szufa introduced it out of curiosity.
- See the Wikipedia page of the Dirichlet distribution for more details.
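For the two-candidate case discussed under Validation, x_0 follows a Beta distribution with parameters \alpha_0 and \alpha_1, so the probability of the ranking 0 \succ 1 is the Beta tail probability above 0.5. The sketch below evaluates it with the standard library only, which is a swapped-in technique and not the scipy computation behind the plots: the substitution u = (1 - x_0)^{\alpha_1} removes the endpoint singularity at x_0 = 1, and a midpoint rule does the rest. The function name and the step count are assumptions made for the example. With SciPy installed, `scipy.stats.beta.sf(0.5, alpha_0, alpha_1)` should return the same quantity.

```python
import math


def prob_first_beats_second(alpha_0, alpha_1, steps=20000):
    """Approximate P(x_0 > 0.5) for x_0 ~ Beta(alpha_0, alpha_1).

    Hypothetical helper for illustration; not part of prefsampling.
    """
    # Beta function B(alpha_0, alpha_1), the normalising constant.
    beta_fn = (
        math.gamma(alpha_0) * math.gamma(alpha_1) / math.gamma(alpha_0 + alpha_1)
    )
    # With u = (1 - x)^alpha_1, the integral over x in [0.5, 1] becomes
    # (1 / alpha_1) times the integral over u in [0, 0.5^alpha_1]
    # of (1 - u^(1 / alpha_1))^(alpha_0 - 1), which is bounded.
    upper = 0.5 ** alpha_1
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h  # midpoint of the i-th sub-interval
        total += (1.0 - u ** (1.0 / alpha_1)) ** (alpha_0 - 1.0)
    return total * h / (alpha_1 * beta_fn)


print(prob_first_beats_second(0.1, 0.1))  # equal qualities: close to 0.5
print(prob_first_beats_second(1.0, 0.3))  # equals 0.5 ** 0.3 in closed form
```

The second call has a closed form because a Beta distribution with first parameter 1 has cumulative distribution function 1 - (1 - x)^{\alpha_1}, which gives a convenient sanity check for the numerical routine.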