### cyclic distribution

Let $v$ be a 1D vector, $\dim v = N$. $P$ is the number of processors and $r$ denotes the rank of a processor, $0 \le r < P$. Suppose the global index of the vector is $I, (I = 0, 1, \cdots N-1)$ and the local index is $i, (i = 0, 1, \cdots n_r-1)$. Then in the 1D cyclic data distribution, \[I = i \cdot P + r \;; \quad r = I \bmod P, \; i = \lfloor I/P \rfloor.

\] The size of the local array on processor $r$ is $n_r = \left\lfloor \frac{N-1-r}{P} \right\rfloor$ + 1. The cyclic distribution is a permutation. We rarely use the global id after permutation, it still may be useful to know: $ I' = (I\bmod P) * \lfloor N/P \rfloor + \lfloor I/P \rfloor + \min(I\bmod P, N\bmod P)$.

#### example:

$v = (0, 1,2,3,4,5,6,7,8)$, $N = 9, P = 2$. The global ids are in bold font.

r \ i | 0 1 2 3 4 | nr

- - - - - - - - - - - - - - - -

0 |

**0 2 4 6 8**| 5
1 |

I = \left( \left\lfloor i / b \right\rfloor P + r \right) \cdot b + ( i \bmod b) \;; \quad

r = \left\lfloor (I \bmod bP ) / b \right\rfloor, \;

i = \left\lfloor I / b P \right\rfloor \cdot b + (I \bmod b). \] The size of the local array on processor $r$ is $n_r = \left\lfloor \frac{N-1-r\cdot b}{b P}\right\rfloor b + b + \delta \left(r, \left\lfloor (N-1) / b \right\rfloor \bmod P \right) \cdot \left( (N-1) \bmod b + 1 - b \right) $.

**1 3 5 7**| 4### block cyclic distribution

The block cyclic distribution is another scheme that can reconcile the need of load balance and the need of block distribution of data. In addition to the quantities we have introduced before, we need to introduce the block number $b, (b\gt 0)$. In the 1D block cyclic distribution scheme, \[I = \left( \left\lfloor i / b \right\rfloor P + r \right) \cdot b + ( i \bmod b) \;; \quad

r = \left\lfloor (I \bmod bP ) / b \right\rfloor, \;

i = \left\lfloor I / b P \right\rfloor \cdot b + (I \bmod b). \] The size of the local array on processor $r$ is $n_r = \left\lfloor \frac{N-1-r\cdot b}{b P}\right\rfloor b + b + \delta \left(r, \left\lfloor (N-1) / b \right\rfloor \bmod P \right) \cdot \left( (N-1) \bmod b + 1 - b \right) $.

#### example:

#### $v = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)$, $N = 10, P = 2, b = 3$. The global ids are in bold font.

r \ i | 0 1 2 3 4 5 | nr

- - - - - - - - - - - - - - - -

0 |

**0 1 2****6 7****8**| 6
1 |

**3 4 5****9**| 4