Coursera - Computer Science: Programming With A Purpose

Week 3: Arrays - Discrete Distribution

Jul 1, 2020

Write a program DiscreteDistribution.java that takes an integer command-line argument m, followed by a sequence of positive integer command-line arguments a₁, a₂, …, a_n, and prints m random indices (separated by whitespace), choosing each index i with probability proportional to a_i.

~/Desktop/arrays> java DiscreteDistribution 25 1 1 1 1 1 1
5 2 4 4 5 5 4 3 4 3 1 5 2 4 2 6 1 3 6 2 3 2 4 1 4

~/Desktop/arrays> java DiscreteDistribution 25 10 10 10 10 10 50
3 6 6 1 6 6 2 4 6 6 3 6 6 6 6 4 5 6 2 2 6 6 2 6 2

~/Desktop/arrays> java DiscreteDistribution 25 80 20
1 2 1 2 1 1 2 1 1 1 1 1 1 1 1 2 2 2 1 1 1 1 1 1 1 

~/Desktop/arrays> java DiscreteDistribution 100 301 176 125 97 79 67 58 51 46
6 2 4 3 2 3 3 1 7 1 1 3 4 7 1 4 2 2 1 1 3 1 8 6 2 
1 3 6 1 8 5 1 3 6 1 1 2 3 8 7 4 6 4 3 1 5 3 3 7 3 
1 3 1 7 7 2 2 3 6 5 4 1 1 1 7 2 3 5 2 2 1 4 1 2 1 
2 1 2 2 3 2 8 4 3 2 1 8 3 5 3 3 8 1 2 3 3 1 2 3 1

To generate a random index i with probability proportional to a_i:

Define the cumulative sums S_i = a₁ + a₂ + … + a_i and S₀ = 0.
Pick a random integer r uniformly between 0 and S_n − 1.
Find the unique index i between 1 and n such that S_i-1 ≤ r < S_i.

Geometrically, this subdivides the interval [0,S_n) into n subintervals [S_i−1,S_i), with the length of subinterval i proportional to a_i. For example, if the discrete distribution is defined by

(a₁, a₂, a₃, a₄, a₅, a₆) = (10, 10, 10, 10, 10, 50),

then the cumulative sums are

(S₁, S₂, S₃, S₄, S₅, S₆) = (10, 20, 30, 40, 50, 100)

and the following diagram illustrates the 6 subintervals:

In probability theory, this is known as sampling from a discrete distribution.

Note: the above description is copied from Coursera and converted to markdown for convenience

Solution:

public class DiscreteDistribution {

    public static void main(String[] args) {
        final int m = Integer.parseInt(args[0]);

        // read sequence of positive integer command-line arguments
        final int n = args.length - 1;
        final int[] a = new int[n];
        for (int i = 0; i < a.length; i++) {
            a[i] = Integer.parseInt(args[i + 1]);
        }

        // define the cumulative sums
        final int[] sum = new int[a.length + 1];
        sum[0] = 0;
        for (int i = 0; i < a.length; i++) {
            sum[i + 1] = sum[i] + a[i];
        }

        int count = 0;
        for (int i = 0; i < m; i++) {
            // pick a random integer r uniformly between 0 and Sn - 1
            // double r = Math.random() * (sum[n] - 1);
            final double r = Math.random() * (sum[n]);
            // find the unique index i between 1 and n such that Si-1<=r<Si
            for (int j = 0; j < sum.length; j++) {
                if (r < sum[j]) {
                    System.out.print(j + " ");
                    count++;
                    if (count == 25) {
                        System.out.println();
                        count = 0;
                    }
                    break;
                }
            }
        }
    }
}

Link To: Java Source Code