## Counting semiprimes

The prime-counting function $\pi(n)$ returns the number of prime numbers less than or equal to $n$. The prime number theorem gives the asymptotic behaviour $\pi(n)\sim n/\log n$.

We shall call a number semiprime if it is the product of two primes. Let $\omega(n)$ denote the semiprime-counting function. What is the asymptotic behaviour of $\omega(n)$? I claim that

$\omega(n) \sim \frac{n}{\log n} \log\left( \frac{\log n}{\log 2} \right)$

Edit (17 September 2016) Based on experiments in Mathematica, the above seems to be a lower bound, while an upper bound is given by

$\omega(n) \sim \frac{n \log \log n}{\log n}$

I obtain these by the following heuristic, probabilistic argument. The probability that a random integer chosen between 1 and $n$ is $\pi(n)/n \sim 1/\log n$; loosely, this means that the probability of a randomly chosen large number $n$ is approximately $1/\log n$. Similarly, $\omega(n) / n$ is the probability that a randomly-chosen $m < n$ is semiprime.

An exact formula for $\omega(n)$ is

$\omega(n) = \sum_{x=2}^{n/2} \sum_{y=2}^{n/x} 1(xy \text{ is prime}) = \frac{1}{2} \sum_{x=2}^{n/2} \sum_{y=2}^{n/x} 1(x \text{ is prime}) 1(y \text{ is prime}).$

The sum over $y$ is trivially $\pi(n/x) \sim (n/x)/(\log n - \log x)$. Hence

$\omega(n) \sim \sum_{x=2}^{n/2} 1(x \text{ is prime}) \frac{n}{x (\log n - \log x)}.$

We split the sum at $x = \sqrt n$ into two parts, and assume that the first part is dominated by the second. In the second part, $x$ is large and we invoke $1(x \text{ is prime}) \sim 1/\log x$ and approximate the sum as an integral, to get

$\omega(n) \sim \int_{\sqrt n}^{n/2} \frac{n}{x \log x (\log n - \log x)} dx.$

Evaluating the integral and keeping only the highest-order terms, we get the desired formula.

The semiprime-counting function can be implemented in Mathematica by

SemiprimeQ[n_] := SemiprimeQ[n] = Module[{f},
f = FactorInteger[n];
(
((Length[f] == 2) && (Total[Last /@ f] == 2))
|| ((Length[f] == 1) && f[[1]][[2]] == 2) )
];

w[n_] := w[n] = Length[Select[Range[n], SemiprimeQ]]


(where we memoise the results to speed things up). The following code produces a graph showing the validity of these approximation:

W1[n_] := n/Log[n] Log[Log[n]];
W2[n_] := n/Log[n] Log[Log[n]/Log[2]];
nn = 150000; nj = 500;
Show[
ListPlot[Table[{n, w[n]/W1[n]}, {n, 1, nn, nj}],
PlotStyle -> Red],
ListPlot[Table[{n, w[n]/W2[n]}, {n, 1, nn, nj}],
PlotStyle -> Green]
]


I wrote this simple parallel task-scheduling system in C++/MPI, following discussions with Juha Jäyykä. It seems to work most of the time, but fails if jobs finish so quickly or so close to each other that a node does not have the change to update nexttask_ind.

I can’t work out the exact failure criterion, and would welcome any other advice with the code.

/* A simple task scheduler in C++ and MPI.
* Usage:
*   $mpic++ taskscheduler.cpp -o taskscheduler *$ mpirun -np 4 ./taskscheduler Ntasks
*/
// and
#include<iostream>
#include<mpi.h>
#include<unistd.h>
#include<assert.h>
#include<time.h>
#include<stdlib.h>

int main (int argc, char** argv) {
MPI_Init (&argc, &argv);      /* starts MPI */
MPI_Request mpireq;
MPI_Status* mpistatus;

assert(argc > 1);

/* get number of nodes */
int nnodes;
MPI_Comm_size (MPI_COMM_WORLD, &nnodes);
/* get my node rank */
int rank;
MPI_Comm_rank (MPI_COMM_WORLD, &rank);
/* get my node hostname */
char hostname[MPI_MAX_PROCESSOR_NAME];
int name_len;
MPI_Get_processor_name(hostname, &name_len);

/* This is the task that we are currently working on. */
/* This buffer stores the next task that needs to be done. */

/* We start off by giving task i to node i. */

srandom(rank);

printf("Node %d signing on.\n", rank);
while (true)
{
/* Do our task, if there is any more work to do. */
break;

printf("Node rank %d on %s is now doing task %d.\n",
/* Whatever we're doing takes a while... */
int usec = 0;
while(true)
{
if (random() % 100000 == 0)
break;

usec++;
usleep(1);

/* Every now and then, check for updates to nexttask_ind from other
* nodes. */
if (usec % 100 == 0)
for (int node = 0; node < nnodes; node++)
if (node != rank)
{
int flag = 0;
MPI_Iprobe(node, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, mpistatus);
if (flag)
{
MPI_Irecv( &nexttask_ind, 1, MPI_INT, node, MPI_ANY_TAG, MPI_COMM_WORLD, &mpireq );
}
}
}

/* Advertise the fact that we have started work on this task, by