Counting semiprimes

The prime-counting function \pi(n) returns the number of prime numbers less than or equal to n. The prime number theorem gives the asymptotic behaviour \pi(n)\sim n/\log n.

We shall call a number semiprime if it is the product of two primes. Let \omega(n) denote the semiprime-counting function. What is the asymptotic behaviour of \omega(n)? I claim that

\omega(n) \sim \frac{n}{\log n} \log\left( \frac{\log n}{\log 2} \right)

Edit (17 September 2016) Based on experiments in Mathematica, the above seems to be a lower bound, while an upper bound is given by

\omega(n) \sim \frac{n \log \log n}{\log n}

I obtain these by the following heuristic, probabilistic argument. The probability that a random integer chosen between 1 and n is \pi(n)/n \sim 1/\log n; loosely, this means that the probability of a randomly chosen large number n is approximately 1/\log n. Similarly, \omega(n) / n is the probability that a randomly-chosen m < n is semiprime.

An exact formula for \omega(n) is

\omega(n) = \sum_{x=2}^{n/2} \sum_{y=2}^{n/x} 1(xy \text{ is prime}) = \frac{1}{2} \sum_{x=2}^{n/2} \sum_{y=2}^{n/x} 1(x \text{ is prime}) 1(y \text{ is prime}).

The sum over y is trivially \pi(n/x) \sim (n/x)/(\log n - \log x). Hence

\omega(n) \sim \sum_{x=2}^{n/2} 1(x \text{ is prime}) \frac{n}{x (\log n - \log x)}.

We split the sum at x = \sqrt n into two parts, and assume that the first part is dominated by the second. In the second part, x is large and we invoke 1(x \text{ is prime}) \sim 1/\log x and approximate the sum as an integral, to get

\omega(n) \sim \int_{\sqrt n}^{n/2} \frac{n}{x \log x (\log n - \log x)} dx.

Evaluating the integral and keeping only the highest-order terms, we get the desired formula.

The semiprime-counting function can be implemented in Mathematica by

SemiprimeQ[n_] := SemiprimeQ[n] = Module[{f},
    f = FactorInteger[n];
     ((Length[f] == 2) && (Total[Last /@ f] == 2))
        || ((Length[f] == 1) && f[[1]][[2]] == 2) )

w[n_] := w[n] = Length[Select[Range[n], SemiprimeQ]]

(where we memoise the results to speed things up). The following code produces a graph showing the validity of these approximation:

W1[n_] := n/Log[n] Log[Log[n]];
W2[n_] := n/Log[n] Log[Log[n]/Log[2]];
nn = 150000; nj = 500;
 ListPlot[Table[{n, w[n]/W1[n]}, {n, 1, nn, nj}],
   PlotStyle -> Red],
 ListPlot[Table[{n, w[n]/W2[n]}, {n, 1, nn, nj}],
   PlotStyle -> Green]

Leave a Reply

Your email address will not be published. Required fields are marked *