The Zero-Truncated Poisson distribution is a variant of the Poisson distribution whose support excludes zero. A simple example is the distribution of the number of items in a customer's shopping cart at the register, where it is reasonable to presume that a customer will not approach the register without any items to purchase. Efficient sampling algorithms for the Poisson distribution exist and are common; however, few sampling algorithms exist for the Zero-Truncated Poisson distribution. We derive such an algorithm here using the inverse transform method.

We begin by introducing the sampling problem and the inverse transform method. Then, we review the Poisson distribution and derive a sampling algorithm for it using the inverse transform method. Finally, we introduce the Zero-Truncated Poisson distribution and derive a sampling algorithm for it in the same way. Throughout this post, we use Julia as the programming language to explore these algorithms and verify their properties empirically.

Sampling Problem

The problem of sampling is straightforward to state given an elementary background in statistics. Given a probability distribution $f(x)$ and a sample size $n$, return a list of $n$ values drawn according to the distribution. We call this list a sample.

For example, suppose we had a Bernoulli distribution where the probability of success is $\frac{1}{4}$. We represent a successful case as 1 and an unsuccessful case as 0.

As a random variable, we denote this as $X \sim \operatorname{Bernoulli}(\frac{1}{4})$.

If we sampled 100 values from this distribution, we should expect approximately 25 ones and 75 zeros. In Julia, we may simulate this using the Distributions package.

using Distributions

# Define the Bernoulli distribution with parameters
bernoulli = Bernoulli(0.25)

# Draw samples from the distribution
samples = rand(bernoulli, 100)

# Count the number of successes and failures
num_successes = count(x -> x == 1, samples)
num_failures = count(x -> x == 0, samples)

There are several methods for deriving a sampling algorithm for a specified distribution. For this post, we will use the inverse transform method.

Inverse Transform Method

The inverse transform method is a general technique for designing sampling algorithms when the cumulative distribution function $F(x)$ is known and its inverse $F^{-1}$ exists.

To sample a random number from the distribution,

  1. Sample a random number $u$ from a Uniform distribution on $[0, 1]$
  2. Compute the random sample as $F^{-1}(u)$

For more details about this algorithm and its proof of correctness, see Wikipedia.
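As a quick illustration of the method on a distribution other than the ones below, consider the Exponential distribution with rate $\lambda$: its CDF is $F(x) = 1 - e^{-\lambda x}$, which inverts to $F^{-1}(u) = -\frac{\ln(1 - u)}{\lambda}$. The helper name `sample_exponential` is one we introduce here for illustration.

```julia
using Statistics

# Inverse transform sampling for the Exponential(λ) distribution:
# F(x) = 1 - exp(-λx), so F⁻¹(u) = -log(1 - u) / λ.
sample_exponential(lambda) = -log(1 - rand()) / lambda

# The sample mean should be close to 1/λ for a large sample.
sample = [sample_exponential(2.0) for _ in 1:100_000]
mean(sample)
```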

Poisson Distribution

We briefly review the Poisson distribution. This distribution describes the number of events that occur within a fixed time interval $(0, t]$. We denote its probability mass function as

$$ f(k; \lambda) = \frac{\lambda^k e^{-\lambda}}{k!} $$

and its cumulative distribution function as

$$ F(k; \lambda) = e^{-\lambda} \sum_{i=0}^k \frac{\lambda^i}{i!} $$

Two simple properties of the distribution are its expected value and variance, both of which equal the parameter $\lambda$.

$$ E[X] = Var(X) = \lambda $$
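As a quick sanity check, the Distributions package (loaded earlier) agrees with both the mass function and these moment identities; the instance `d` below is just illustrative.

```julia
using Distributions

d = Poisson(3.5)

# The pmf at k = 2 should match λ^k e^{-λ} / k!.
pdf(d, 2), 3.5^2 * exp(-3.5) / factorial(2)

# Both moments equal the rate parameter λ.
mean(d), var(d)
```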

Sampling from the Poisson Distribution

Although the cumulative distribution function is known in closed form, its inverse must be computed iteratively. To be precise, we formulate the inverse problem as follows. Given some number $u \in [0, 1]$, we want to find the smallest integer $k \geq 0$ such that

$$ \begin{array}{rl} u &\leq F(k; \lambda) \\ &= e^{-\lambda} \sum_{i=0}^k \frac{\lambda^i}{i!} \\ &= \sum_{i=0}^k e^{-\lambda} \frac{\lambda^i}{i!} \end{array} $$
Although it is not good style to move constants inside of the summation, we write the expression this way to develop some intuition for how the algorithm should proceed: we can accumulate the sum one term at a time until it reaches $u$.

Here is an algorithm to compute such $k$ in Julia.

function sample_poisson(lambda::Float64)
  k = 0
  t = exp(-lambda)  # current term e^{-λ} λ^k / k!
  s = t             # partial sum F(k; λ)
  u = rand()
  while s < u
    k += 1
    t *= lambda / k
    s += t
  end
  return k
end
The correctness of this algorithm may be proven by induction. We will not give a full proof; however, we will develop some intuition. This is a simple, iterative algorithm that decomposes into three components: before iteration, during iteration, and after iteration. Before iteration, we assign the variables their base-case values for $k = 0$: t holds the current term $e^{-\lambda} \frac{\lambda^k}{k!}$, s holds the partial sum up to k, and u is the target that s must reach. During iteration, we update the variables to reflect the next term in the summation. After iteration, we return the value k that meets the exit criterion of the while loop.
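One way to gain further confidence in the loop is to factor out the randomness: the same loop, given $u$ directly, should reproduce the quantile function of the Poisson distribution, since both return the smallest $k$ with $F(k; \lambda) \geq u$. The helper name `poisson_quantile` is one we introduce for this check, compared against `quantile` from the Distributions package.

```julia
using Distributions

# The same loop as sample_poisson, but with u supplied as an argument:
# it returns the smallest k with F(k; λ) ≥ u.
function poisson_quantile(u::Float64, lambda::Float64)
  k = 0
  t = exp(-lambda)
  s = t
  while s < u
    k += 1
    t *= lambda / k
    s += t
  end
  return k
end

# Compare against Distributions' quantile function for several values of u.
[poisson_quantile(u, 7.0) for u in (0.01, 0.25, 0.5, 0.75, 0.99)]
```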

Verification by Experiment

To empirically verify that this sampling algorithm is working as intended, we create a simple simulation that generates a sample of 10,000 random numbers and verifies that the sample mean and variance are close to $\lambda$.

using Statistics

sample = [sample_poisson(7.0) for _ in 1:10000]
mean(sample), var(sample)
# (6.9621, 7.208384428442861)

Observe that the mean and variance are close to $\lambda$ for a sample generated by this algorithm.

Zero-Truncated Poisson Distribution

The Zero-Truncated Poisson distribution can be derived from a Poisson distribution conditioned on the fact that $k$ cannot be zero. Consequently, the probability mass function is defined for $k > 0$ such that

$$ g(k; \lambda) = \frac{f(k; \lambda)}{1 - f(0; \lambda)} $$

Since $f(0; \lambda) = e^{-\lambda}$, we simplify this function to

$$ g(k; \lambda) = \frac{f(k; \lambda)}{1 - e^{-\lambda}} $$
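To check that $g$ is a proper probability mass function, we can confirm that it sums to (approximately) one over $k = 1, 2, \dots$; the helper `ztp_pmf` below is a throwaway name defined only for this check.

```julia
using Distributions

# g(k; λ) = f(k; λ) / (1 - e^{-λ}), defined for k > 0.
ztp_pmf(k, lambda) = pdf(Poisson(lambda), k) / (1 - exp(-lambda))

# The total mass over k = 1, 2, … should be close to one
# (the tail beyond k = 200 is negligible for λ = 7).
sum(ztp_pmf(k, 7.0) for k in 1:200)
```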

and we may derive the cumulative distribution function by summing the mass function over $i = 1, \dots, k$

$$ \begin{array}{ll} G(k; \lambda) &= \sum_{i=1}^k g(i; \lambda) \\ &= \frac{1}{1 - e^{-\lambda}} \sum_{i=1}^k f(i; \lambda) \\ &= \frac{1}{1 - e^{-\lambda}} \left( F(k; \lambda) - f(0; \lambda) \right) \\ &= \frac{e^{-\lambda}}{1 - e^{-\lambda}} \sum_{i=1}^k \frac{\lambda^i}{i!} \\ &= \sum_{i=1}^k \frac{e^{-\lambda}}{1 - e^{-\lambda}} \frac{\lambda^i}{i!} \end{array} $$

The final cumulative distribution function is structurally similar to that of the Poisson distribution. However, it is important to observe two differences. First, the constant in front of the summation is different. Second, the summation starts at index 1 for this distribution, in contrast to 0 for the Poisson distribution. To summarize, we represent our cumulative distribution function as

$$ G(k; \lambda) = \sum_{i=1}^k \frac{e^{-\lambda}}{1 - e^{-\lambda}} \frac{\lambda^i}{i!} $$
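We can cross-check this summation numerically: since $\sum_{i=1}^k f(i; \lambda) = F(k; \lambda) - f(0; \lambda)$, the expression above should equal $\frac{F(k; \lambda) - f(0; \lambda)}{1 - f(0; \lambda)}$ for every $k$. The helper names `ztp_cdf` and `ztp_cdf_via_poisson` are introduced only for this check.

```julia
using Distributions

# Direct evaluation of the summation form of G(k; λ).
ztp_cdf(k, lambda) =
  sum(exp(-lambda) / (1 - exp(-lambda)) * lambda^i / factorial(i) for i in 1:k)

# Equivalent expression via the Poisson CDF and pmf.
ztp_cdf_via_poisson(k, lambda) =
  (cdf(Poisson(lambda), k) - pdf(Poisson(lambda), 0)) / (1 - pdf(Poisson(lambda), 0))

# The largest discrepancy over k = 1, …, 15 should be near machine precision.
maximum(abs(ztp_cdf(k, 7.0) - ztp_cdf_via_poisson(k, 7.0)) for k in 1:15)
```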

Sampling from the Zero-Truncated Poisson Distribution

Sampling from this distribution turns out to be similar to sampling from the regular Poisson distribution when the cumulative distribution function is formulated this way. To derive the algorithm, we only need to change the component before iteration, because the components during and after iteration are exactly the same.

function sample_ztp(lambda::Float64)
  k = 1
  t = exp(-lambda) / (1 - exp(-lambda)) * lambda  # first term, i = 1
  s = t
  u = rand()
  while s < u
    k += 1
    t *= lambda / k
    s += t
  end
  return k
end
The sketch for the proof of this algorithm is similar to the sketch of the proof of correctness for the Poisson distribution sampling algorithm.

Verification by Experiment

Similar to the Poisson distribution sampling algorithm, we verify the correctness of our implementation with a simple experiment: we sample 10,000 random numbers and verify three properties. The first two properties are that the mean and variance are close to the $\lambda$ parameter (the truncation shifts the moments only slightly when $\lambda = 7$, since the missing mass $e^{-7}$ is tiny). The final property is that no zeros exist in the sample; this property distinguishes the Zero-Truncated Poisson distribution from the standard Poisson distribution.

using Statistics

sample = [sample_ztp(7.0) for _ in 1:10000]
mean(sample), var(sample)
# (6.9798, 6.9812900890088905)
count(x -> x == 0, sample)
# 0

Observe that the mean and variance are close to $\lambda$ for a sample generated by this algorithm. Additionally, observe that no zeros exist in the sample.


Conclusion

We have shown how to derive an efficient sampling algorithm for the Zero-Truncated Poisson distribution using ideas and algorithms developed for the standard Poisson distribution.

Additionally, a few key algorithm design concepts are illustrated here. First, we solve for the upper bound of a summation that satisfies an inequality. Second, we analyze the algorithm using loop invariants. These two concepts are high-level ideas for approaching algorithmic problems involving iteration.

For more information about loop invariants, see Chapter 2 of Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein.

It is also important to note that the expected running time of this algorithm is linear with respect to $\lambda$; precisely, the expected running time is $\Theta(\lambda)$, since the loop advances one term at a time and the mean of the distribution is approximately $\lambda$. Efficient sampling algorithms for large values of $\lambda$ have been devised, as shown by Ahrens and Dieter in Computer methods for sampling from gamma, beta, poisson and binomial distributions.