The mmgroup guide for developers

Introduction

This guide for developers explains some selected topics of the mmgroup project which are relevant for developers who want to make changes or bugfixes. It also explains some mathematical aspects of the implementation which are not covered by [Sey20].

The guide is far from complete and we hope that we can provide more information in future versions of it.

Directory structure

The directory is organized as follows:

  • [Root directory]

    Contains setup.py and configuration files

    • docs
      • sources

        Source files for documentation with Sphinx

    • src
      • mmgroup The directory containing the package, and also extensions and shared libraries

• dev Contains stuff needed by developers, including scripts and source files for generating C code automatically.

        • generate_c The code generator (for generating C code automatically)

        • structures Some basic structures used by the modules in mmgroup

        • tests Test scripts and some algebraic structures used for testing with pytest

Directory src/mmgroup/dev has the following subdirectories:

  • c_files Contains automatically generated C code

  • clifford12 Scripts for generating C code related to the subgroup \(2^{1+24}.\mbox{Co}_1\) of the monster and also to a Clifford group resembling that subgroup.

  • generators Scripts for generating tables related to the monomial operation of the element \(\xi\) of \(\mathbb{M}\), and also for generating C code related to certain subgroups of \(\mathbb{M}\).

  • hadamard Scripts for generating C code related to Hadamard matrices

  • mat24 Scripts for generating C code related to the Mathieu group \(\mbox{Mat}_{24}\)

  • mm_basics Scripts for generating C code related to the representation \(\rho_p\) of \(\mathbb{M}\)

  • mm_op Scripts for generating C code related to the operation of \(\mathbb{M}\) on \(\rho_p\) for specific values \(p\)

  • mm_reduce Scripts for generating C code related to the word shortening algorithm and the computation of the order in \(\mathbb{M}\).

  • pxd_files Contains files with extension .pyx, .pxd, .pxi used by Cython, which are automatically generated or copied from a different source location

Directory src/mmgroup/tests has the following subdirectories:

  • groups Slow python implementations of certain groups, including a preimage of \(\mathbb{M}\), for testing.

  • spaces Slow python implementations of certain vector spaces, including the space \(\rho_p\), for testing.

  • test_xxx Test scripts (to be executed with pytest) for testing module xxx.

Some mathematical aspects of the implementation

Implementing Automorphisms of the Parker loop

A standard automorphism \(\pi\) of the Parker loop \(\mathcal{P}\) is implemented as an array of 12 integers of type uint32_t. The lower 13 bits of the \(i\)-th entry of that array contain the image of the \(i\)-th basis element \((b_i,0)\) of the Parker loop, where \(b_i\) is the \(i\)-th basis vector of the Golay code \(\mathcal{C}\). These images describe the automorphism \(\pi\) uniquely. Each element of \(\mathcal{P}\) has a unique representation as a pair \((d, \lambda)\), \(d \in \mathcal{C}\), \(\lambda \in \mathbb{F}_2\).

A key task is to compute the image \((d, \lambda) \cdot \pi\) of an element \((d, \lambda)\) of the Parker loop under the automorphism \(\pi\). By Lemma 4.1 in [Sey20] there is a quadratic form \(q_\pi\) with associated bilinear form \(\theta_\pi\) on the Golay code satisfying

\[\begin{split}\theta_\pi \, = \, \theta^\pi + \theta \;, \quad \mbox{where} \quad \theta^\pi(d,e) = \theta(d^\pi,e^\pi) \;, \\ (d, \lambda)^\pi = (d^{\pi}, \lambda + q_\pi(d)) \; ,\end{split}\]

for any \((d, \lambda) \in \mathcal{P}\). So the image \((d, \lambda) \cdot \pi\) can be computed if \(q_\pi(d)\) can be computed. A value \(q_\pi(d)\) can easily be computed from the associated bilinear form \(\theta_\pi\) if the values \(q_\pi(b_i)\) are known for all basis vectors \(b_i\). The values \(q_\pi(b_i)\) can be computed from the sign bits of the images of the basis vectors.

So it suffices to compute the bilinear form \(\theta_\pi\) as a \(12 \times 12\) bit matrix. The first term \(\theta\) of \(\theta_\pi\) does not depend on \(\pi\) and can be stored as a constant. The second term \(\theta^\pi\) of that matrix is computed as follows:

Let \(c_i\) = \((b_i)^\pi\), and let \(C\) be the matrix with entries \(c_{i,j}\) such that \(c_i\) = \(\sum_j c_{i,j} b_j\). So \(C\) is just the matrix of the images of the basis vectors, ignoring the signs of the Parker loop.

Row \(i\), column \(j\) of matrix \(\theta^\pi\) is equal to \(\theta(c_i,c_j)\). Since \(\theta(c_i,c_j)\) is linear in its second argument, we have

\[\begin{split}\theta^\pi(c_i, c_j) \, = \, \sum_k c_{j,k} \theta(c_i,b_k) \; ,\\ \theta^\pi(c_i) \, = \, \theta(c_i) \cdot C^\top \; .\end{split}\]

Here \(\theta^\pi(c_i)\) is just the \(i\)-th row of the bit matrix \(\theta^\pi\). We also store the values \(\theta(d)\) for all \(d \in \mathcal{C}\) in a table. Thus the bilinear form \(\theta_\pi\) can essentially be computed as a product of two \(12 \times 12\) bit matrices.

We store the lower triangular submatrix of the bit matrix \(\theta_\pi\) in bits 13, ..., 24 of the array representing the automorphism \(\pi\) of the Parker loop.
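The computation sketched above can be illustrated in Python. This is a sketch only, not the actual mmgroup implementation: the function names, the representation of \(12 \times 12\) bit matrices as lists of 12-bit integers (bit \(j\) of entry \(i\) is the matrix entry in row \(i\), column \(j\)), and the tables are our own assumptions.

```python
def parity(x):
    # parity of the number of set bits of a nonnegative integer
    return bin(x).count("1") & 1

def bitmatmul_bt(a, b):
    """Product a * b^T of bit matrices over GF(2); rows are stored as
    integers, and entry (i, j) of the result is parity(a[i] & b[j])."""
    return [sum(parity(ai & bj) << j for j, bj in enumerate(b)) for ai in a]

def quadratic_value(d, q_basis, theta_pi):
    """Evaluate the quadratic form q_pi on a code word d (12-bit int),
    given q_basis[i] = q_pi(b_i) and the rows of the bilinear form
    theta_pi, using q(x + b_i) = q(x) + q(b_i) + theta_pi(x, b_i)."""
    val, acc = 0, 0          # acc = sum of the basis vectors used so far
    for i in range(12):
        if (d >> i) & 1:
            val ^= q_basis[i] ^ parity(theta_pi[i] & acc)
            acc |= 1 << i
    return val
```

Here theta_pi would be obtained as the sum of the constant matrix for \(\theta\) and the product bitmatmul_bt(T, C), where row \(i\) of the hypothetical table T is the stored value \(\theta(c_i)\).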

Implementing generators of the Monster group

The operation \(y_{f} \cdot x_{e} \cdot x_\epsilon\)

The operation \(g = y_{f} \cdot x_{e} \cdot x_\epsilon\) on \(\rho_p\) is coded in function mm_op<p>_xy in the automatically generated C file mm_op<p>_xy.c for the modulus p. The C function performs that operation for arbitrary \(e, f \in \mathcal{P}\) and \(\epsilon \in \mathcal{C}^*\) in a single pass. That operation is monomial and the formula for it can easily be deduced from the information in [Sey20].

For \(v \in \rho_p\) the C program calculates the components in \(V_A, V_B, V_C \ldots\) of the result \(v \cdot g\) in their natural order. For implementing such a program it is useful to have a formula for the operation of \(g^{-1} = x_\epsilon \cdot x_{\bar{e}} \cdot y_{\bar{f}}\). In the sequel we state the operation of \(g^{-1}\) on \(\rho_p\).

In this subsection we use the operator ‘\(\oplus\)’ for the programmer’s multiplication in the Parker loop \(\mathcal{P}\), which we define by:

\[(\tilde{d_1}, \lambda_1 ) \oplus (\tilde{d_2}, \lambda_2) = (\tilde{d_1} + \tilde{d_2} , \, \lambda_1 + \lambda_2) \; , \qquad \tilde{d_1}, \tilde{d_2} \in \mathcal{C}, \, \lambda_1,\lambda_2 \in \mathbb{F}_2 \; .\]

Then ‘\(\oplus\)’ is a simple XOR operation on a computer; and the standard product \(d_1 d_2\) in the Parker loop is given by \(d_1 d_2 = (-1)^{\theta(d_1, d_2)} d_1 \oplus d_2\) for \(d_1, d_2 \in \mathcal{P}\). Note that ‘\(\oplus\)’ depends on the selected cocycle \(\theta\). We may store the cocycles \(\theta(d)\) in a table, with one 12-bit entry representing the element \(\theta(d Z(\mathcal{P}))\) of the Golay cocode for each of the 2048 cosets \(d Z(\mathcal{P})\) of \(Z(\mathcal{P}) = \{\pm 1, \pm \Omega\}\) in \(\mathcal{P}\). So multiplication in \(\mathcal{P}\) is easy.
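As a sketch of this multiplication (with an assumed 13-bit encoding, bit 12 the sign \(\lambda\) and bits 0–11 the Golay code word, and a hypothetical cocycle table and pairing; this is not the actual mmgroup encoding):

```python
def mul_parker(x1, x2, theta_table, scalar):
    """Multiply two Parker loop elements encoded as 13-bit integers:
    bit 12 is the sign lambda, bits 0..11 the Golay code word d.
    theta_table[d1] encodes the cocode element theta(d1); scalar(c, d)
    is the GF(2) pairing of a cocode element c with a code word d."""
    d1, d2 = x1 & 0xFFF, x2 & 0xFFF
    sign = scalar(theta_table[d1], d2)   # cocycle value theta(d1, d2)
    return x1 ^ x2 ^ (sign << 12)        # XOR plus the cocycle sign
```

With the trivial cocycle the product reduces to a plain XOR; the real cocycle table imposes the signs dictated by \(\theta\).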

Put \(d^{[0]} = d^+, d^{[1]} = d^-\), and \(X^+_{d,\delta} = X^{\vphantom{+}}_{\Omega d,\delta}\) if \(x^{\vphantom{+}}_{\Omega d,\delta}\) is short, \(X^+_{d,\delta} = X^{\vphantom{+}}_{d,\delta}\) otherwise. Then

\[\begin{split}X_{d,i}^+ & \stackrel{g^{-1}}{\longrightarrow} (-1)^{s_{{X}\vphantom{X^x}}} \cdot X_{d\oplus f,i}^+ \, , \quad \mbox{with} \; \\ {s_X} & = P(f) + P(ef) + (|\epsilon| + 1) P(d) + P(def) + \left< e, i\right> + \big< d, i^{|\epsilon|} \epsilon A(e,f) \theta(f) \big> \; ; \\ (d^{[\tau]} \otimes_1 i) & \stackrel{g^{-1}}{\longrightarrow} (-1)^{s_{{Y\!Z}\vphantom{X^x}}} \cdot d^{[\sigma]}_{d \oplus e \oplus f^{\sigma + 1}} \otimes_1 i \, , \quad \mbox{with} \; \; \sigma = \tau + |\epsilon| \pmod{2} \; , \\ {s_{Y\!Z}} & = (\sigma+1)\theta(f,e) + \sigma P(f) + P(de) + P(def) + \left<f, i\right> + \left< d, \epsilon \theta(e) \theta(f)^{\sigma+1} \right> \; ; \\ X^+_{d \cdot \delta} & \stackrel{g^{-1}}{\longrightarrow} (-1)^{s_{{T}\vphantom{X^x}}} \cdot X^+_{d \cdot \delta \delta'} \, , \quad \mbox{for} \; |d| = 8, \; \; \delta \mbox{ even} \, , \quad \mbox{with} \; \delta' = A(d,f) \, , \; \; \mbox{and} \\ s_T &= P(e) + P(de) + \left< d, \epsilon \right> + \left<ef,\delta\right> + |\delta||\epsilon|/2 \; ; \\ X_{\Omega^m \cdot ij} & \stackrel{g^{-1}}{\longrightarrow} (-1)^{ m |\epsilon| + \left< ef, ij \right> } \cdot X_{\Omega^n \cdot ij} \; , \quad \mbox{with} \; \; n = m + \left<f, ij \right> \, ; \\ (ij)_1 & \stackrel{g^{-1}}{\longrightarrow} (-1)^{\left< f, ij \right>} (ij)_1 \; .\end{split}\]

Conjugation of \(\tilde{x}_d x_\delta\) with \(y_e\)

Sometimes we have to conjugate an element \(\tilde{x}_d x_\delta\), where \(\tilde{x}_d = x_d x_{\theta(d)}\), with \(y_e\). We have

\[\begin{split}y_e^{-1} \tilde{x}_d x_\delta y_e & = x_{-1}^\alpha x_\Omega^\beta \tilde{x}_d \tilde{x}_e^{|\delta|} x_\delta x_\epsilon \, , \\ \alpha & = \theta(d,e) + \langle e,\delta\rangle^{1 + |\delta|} + \mbox{sign}(e) \, ,\\ \beta &= \theta(e,d) + \langle e,\delta\rangle + {P(e)}^{|\delta|} \, , \\ \epsilon &= A(d,e) + {\theta(e)}^{|\delta|} \, .\end{split}\]

Computations in the Leech lattice modulo 2

In this section we describe the operation of the Conway group \(\mbox{Co}_1\) on \(\Lambda / 2 \Lambda\), where \(\Lambda\) is the Leech lattice. The description of the orbits of that operation is taken from [Sey22], Appendix A.

This operation is important for computing in the subgroup \(G_{x0}\) of structure \(2^{1+24}.\mbox{Co}_1\) of the monster, as described in [Con85] and [Sey20]. Let \(Q_{x0}\) be the normal subgroup of \(G_{x0}\) of structure \(2^{1+24}\). The group \(\mbox{Co}_1 = G_{x0}/Q_{x0}\) is the automorphism group of \(\Lambda / 2 \Lambda\). We assume that \(\mbox{Co}_1\) and also \(G_{x0}\) operate on \(\Lambda / 2 \Lambda\) by right multiplication. Let \(\Omega\) in \(\Lambda / 2 \Lambda\) be the type-4 vector corresponding to the standard coordinate frame in \(\Lambda\). The stabilizer of \(\Omega\) is a subgroup \(N_{x0}\) of \(G_{x0}\) of structure \(2^{1+24}.2^{11}.M_{24}\). Thus the set of type-4 vectors in \(\Lambda / 2 \Lambda\) corresponds to the set of right cosets \(2^{11}.M_{24} \backslash \mbox{Co}_1\). We have \(N_{x0} \backslash G_{x0} \cong 2^{11}.M_{24} \backslash \mbox{Co}_1\). So identifying the right coset \(N_{x0} h\) for an \(h \in G_{x0}\) reduces to the computation of the frame \(\Omega h\).

In our construction of the monster the element \(h\) may be given as an arbitrary word in the generators of \(G_{x0}\) (modulo \(Q_{x0}\)). So it is important to find a short word \(g\) in the generators of \(G_{x0}\) with \(g \in N_{x0} h\). This can be achieved by applying a sequence \(g' = g'_1,\ldots,g'_k\) of automorphisms of \(\Lambda / 2 \Lambda\) to \(\Omega h\) such that \(\Omega h g' = \Omega\), where each \(g'_i\) corresponds to a generator of \(G_{x0}\) (modulo \(Q_{x0}\)).

We assume that the reader is familiar with the Conway group \(\mbox{Co}_1\) and the geometry of the Leech lattice as described in [Iva99], sections 4.1 to 4.7.

Orbits of the group \(N_{x0}\) in the Leech lattice mod 2

In this subsection we describe the orbits of \(N_{x0}\) on \(\Lambda / 2 \Lambda\). The type of a vector \(v \in \Lambda\) is the halved scalar product \(\frac{1}{2} \langle v, v \rangle\). The type of a vector in \(\Lambda / 2 \Lambda\) is the type of its shortest representative in \(\Lambda\). Each vector in \(\Lambda / 2 \Lambda\) has type 0, 2, 3 or 4; and the group \(\mbox{Co}_1\) is transitive on the vectors of any of these types.

The orbits of the group \(N_{x0}\) on the vectors of type 2, 3, and 4 in \(\Lambda\) have been described in [Iva99], Lemma 4.4.1. \(N_{x0}\) acts monomially on the lattice \(\sqrt{8} \Lambda\), which has integer coordinates in the standard Euclidean basis. Thus an orbit of \(N_{x0}\) on \(\Lambda / 2 \Lambda\) can be described by the shapes of the shortest vectors in the corresponding orbit of \(\sqrt{8} \Lambda\). Here the shape of a vector is the multiset of the absolute values of the coordinates of the vector. E.g. a vector of shape \((3^5 1^{19})\) has 5 coordinates with absolute value 3 and 19 coordinates with absolute value 1.

A vector of type 2 or 3 in \(\Lambda / 2 \Lambda\) has two opposite representatives of the same type in \(\Lambda\); so its shape is uniquely defined. A vector of type 4 in \(\Lambda / 2 \Lambda\) has \(2 \cdot 24\) representatives of type 4 in \(\Lambda\) which are orthogonal except when equal or opposite. It is well known that a type-4 vector in \(\Lambda / 2 \Lambda\) corresponds to a coordinate frame in the Leech lattice in standard coordinates, see e.g. [CS99], [Iva99].
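For illustration, the type and the shape of a vector given in coordinates of \(\sqrt{8} \Lambda\) can be computed as follows (a sketch; the function name is ours, not part of mmgroup):

```python
from collections import Counter

def type_and_shape(w):
    """w: the 24 integer coordinates of a vector in sqrt(8) * Lambda.
    The type is the halved scalar product in Lambda, i.e. |w|**2 / 16;
    the shape is the multiset of absolute coordinate values."""
    norm = sum(x * x for x in w)
    assert norm % 16 == 0, "not in sqrt(8) * Lambda"
    return norm // 16, Counter(abs(x) for x in w)
```

E.g. a vector of shape \((3 \, 1^{23})\) has norm \(9 + 23 = 32\) and hence type 2.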

The table at Lemma 4.4.1. in [Iva99] assigns a name and a shape to each orbit of \(N_{x0}\) on the vectors of type 2, 3, and 4 in \(\Lambda\). The table at Lemma 4.6.1. in [Iva99] assigns a name and one or more shapes to each orbit of \(N_{x0}\) on the vectors of type 4 in \(\Lambda / 2 \Lambda\). We reproduce this information for the orbits of \(N_{x0}\) on \(\Lambda / 2 \Lambda\) in the following table. Here we also assign a subtype (which is a 2-digit number) to each orbit. The first digit of the subtype specifies the type of the orbit and the second digit is used to distinguish between orbits of the same type.

Let \(\mathcal{C}\) be the Golay code and \(\mathcal{C}^*\) the Golay cocode. For \(d \in \mathcal{C}, \delta \in \mathcal{C}^*\) let \(\lambda_d, \lambda_\delta \in \Lambda / 2 \Lambda\) be defined as in [Sey20], Theorem 6.1. Conway’s definition of \(\lambda_d, \lambda_\delta\) in [Con85] differs slightly from that definition. Each \(v \in \Lambda / 2 \Lambda\) has a unique decomposition \(v = \lambda_d + \lambda_\delta, d \in \mathcal{C}, \delta \in \mathcal{C}^*\). The subtype of a vector \(\lambda_d + \lambda_\delta \in \Lambda / 2\Lambda\) can be computed from \(d\) and \(\delta\), as indicated in the following table:

\[\begin{split}\begin{array}{|c|c|c|c|c|c|c|} \hline \mbox{Subtype} & \mbox{Name} & \mbox{Shape} & |d| & |\delta| & \langle d , \delta\rangle & \mbox{Remark} \\ \hline 00 & & (0^{24}) & 0 & 0 & 0 \\ \hline 20 & \Lambda_2^4 & (4^2 0^{22}) & 0, 24 & 2 & 0 & \\ \hline 21 & \Lambda_2^3 & (3 \, 1^{23}) & \mbox{any} & 1 & |d| / 4 & \\ \hline 22 & \Lambda_2^2 & (2^8 0^{16}) & 8, 16 & \mbox{even} & 0 & 1. \\ \hline 31 & \Lambda_3^5 & (5 \, 1^{23}) & \mbox{any} & 1 & |d| / 4 + 1 & \\ \hline 33 & \Lambda_3^3 & (3^3 1^{21}) & \mbox{any} & 3 & |d| / 4 & \\ \hline 34 & \Lambda_3^4 & (4 \, 2^{8} 0^{15}) & 8, 16 & \mbox{even} & 1 & \\ \hline 36 & \Lambda_3^2 & (2^{12} 0^{12}) & 12 & \mbox{even} & 0 & \\ \hline 40 & \bar{\Lambda}_4^{4a} & (4^4 0^{20}) & 0, 24 & 4 & 0 & \\ \hline 42 & \bar{\Lambda}_4^{6} & (6 \, 2^7 0^{16}), (2^{16} 0^8) & 8, 16 & \mbox{even} & 0 & 2.\\ \hline 43 & \bar{\Lambda}_4^{5} & (5 \, 3^2 1^{21}), (3^{5} 1^{19}) & \mbox{any} & 3 & |d| / 4 + 1 & \\ \hline 44 & \bar{\Lambda}_4^{4b} & (4^2 2^8 0^{14}), (2^{16} 0^8) & 8, 16 & \mbox{even} & 0 & 3.\\ \hline 46 & \bar{\Lambda}_4^{4c} & (4 \, 2^{12} 0^{11}) & 12 & \mbox{even} & 1 & \\ \hline 48 & \bar{\Lambda}_4^{8} & (8 \, 0^{23}) & 24 & 0 & 0 & \\ \hline \end{array}\end{split}\]

Remarks

  1. \(|\delta|/2 = 1 + |d|/8 \pmod{2}\), \(\delta \subset d \Omega^{1 + |d|/8}\) for a suitable representative \(\delta\) of the cocode element.

  2. \(|\delta|/2 = |d|/8 \pmod{2}\), \(\delta \subset d \Omega^{1 + |d|/8}\) for a suitable representative \(\delta\) of the cocode element.

  3. None of the conditions stated in Remarks 1 and 2 hold.

Here column Subtype lists the two-digit number that we use for describing the orbit. Columns Name and Shape list the names and the shapes of the orbits as given in [Iva99], Lemma 4.4.1 and 4.6.1. Columns \(|d|\) and \(|\delta|\) list conditions on the weight of a Golay code word \(d\) and of (a shortest representative of) the Golay cocode element \(\delta\), respectively. Column \(\langle d, \delta \rangle\) lists conditions on the scalar product of \(d\) and \(\delta\). All this information can easily be derived from [Iva99] and [Con85] (or [Sey20]).

The table provides enough information for effectively computing the subtype of an element \(\lambda_d + \lambda_\delta\) from \(d\) and \(\delta\). Function gen_leech2_type in file gen_leech.c computes that subtype. It returns e.g. the subtype 46 as the hexadecimal number 0x46.
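For the odd-cocode rows of the table the classification is particularly simple. The following sketch (our own helper, not the C function) distinguishes subtypes 21/31 and 33/43 from \(|d|\), \(|\delta|\) and \(\langle d, \delta \rangle\):

```python
def subtype_odd(wt_d, wt_delta, prod):
    """wt_d = |d| (always divisible by 4), wt_delta = |delta| odd,
    prod = <d, delta> in GF(2). Returns the subtype per the table:
    21/31 for |delta| = 1 and 33/43 for |delta| = 3, depending on
    whether prod equals |d|/4 mod 2."""
    base = (wt_d // 4) & 1
    if wt_delta == 1:
        return 0x21 if prod == base else 0x31
    if wt_delta == 3:
        return 0x33 if prod == base else 0x43
    raise ValueError("only |delta| = 1, 3 handled in this sketch")
```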

The following table may be helpful for memorizing the second digit of a subtype of a vector in \(\Lambda / 2 \Lambda\).

\[\begin{split}\begin{array}{|c|l|} \hline \mbox{Subtype} & \mbox{Description} \\ \hline x0 & \mbox{Contains a vector $\lambda_d+\lambda_\delta$ with $d=0$; $\delta$ even} \\ \hline x1 & \mbox{Vectors $\lambda_d+\lambda_\delta$ with $|\delta|=1$} \\ \hline x2 & \mbox{Vectors $\lambda_d+\lambda_\delta$ with $|d| = 8, 16$; and $\delta \subset d \Omega^{1 + |d|/8}$, $|\delta|$ even}\\ \hline x3 & \mbox{Vector $\lambda_d+\lambda_\delta$ with $|\delta|=3$} \\ \hline x4 & \mbox{Vectors $\lambda_d+\lambda_\delta$ with $|d| = 8, 16$; not of subtype $x2$, $|\delta|$ even} \\ \hline x6 & \mbox{Vectors $\lambda_d+\lambda_\delta$ with $|d| = 12$; $|\delta|$ even} \\ \hline x8 & \mbox{The vector $\lambda_d+\lambda_\delta$ with $|d| = 24$; $\delta = 0$} \\ \hline \end{array}\end{split}\]

Operation of the group \(G_{x0}\) on the Leech lattice

The generators \(x_d, x_\delta, y_d, x_\pi, \xi\), with \(d \in \mathcal{C}, \delta \in \mathcal{C}^*, \pi \in \mbox{Aut}_{St} \mathcal{P}\), of the subgroup \(G_{x0}\) of the monster operate on \(\Lambda / 2 \Lambda\), as described in [Sey20]. Here \(x_d, x_\delta\) operate trivially on \(\Lambda / 2 \Lambda\); \(y_d\) and \(x_\pi\) act as sign changes and permutations on the coordinates of \(\Lambda\), respectively, and do not change the subtype.

From the Leech graph in [Iva99], section 4.7, we see how the generators \(\xi\) and \(\xi^2\) may change the subtype of a type-4 vector in \(\Lambda / 2 \Lambda\). The following figure shows a subgraph of that graph. It shows the transitions of subtypes of type-4 vectors (due to transformations by \(\xi\) or \(\xi^2\)) that we actually use.

[Figure (TikZ): Transitions of subtypes of type-4 vectors in the Leech lattice mod 2]

We will provide some more information about the operation of \(\xi^k\) on \(\Lambda\).

We say that a vector \(w \in \mathbb{Z}^n\) is of shape \((m^\alpha 0^{n-\alpha} \bmod 2m)\) if \(w\) has \(\alpha\) coordinates equal to \(m\) \(\pmod{2m}\) and \(n-\alpha\) coordinates equal to \(0\) \(\pmod{2m}\). For a vector \(v = (v_0,\ldots, v_{23}) \in \mathbb{R}^{24}\) define \((v_{4i}, v_{4i+1}, v_{4i+2}, v_{4i+3}) \in \mathbb{R}^4\) to be the \(i\)-th column of \(v\). This definition is related to the Miracle Octad Generator (MOG) used for the description of the Golay code and the Leech lattice, see [CS99], chapter 11. The following lemma has been shown in [Sey22].
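These notions translate directly into code. The following sketch (the names are ours) splits a vector into its MOG columns and tests the shape condition \((m^\alpha 0^{n-\alpha} \bmod 2m)\):

```python
def mog_columns(v):
    """Split a vector in R^24 into its six MOG columns of four entries."""
    return [v[4 * i : 4 * i + 4] for i in range(6)]

def shape_mod(w, m):
    """Return alpha if w has shape (m^alpha 0^(n-alpha) mod 2m),
    i.e. every coordinate is congruent to m or 0 modulo 2m;
    return None otherwise. Works for negative coordinates too,
    since Python's % always yields a nonnegative residue."""
    alpha = 0
    for x in w:
        r = x % (2 * m)
        if r == m:
            alpha += 1
        elif r != 0:
            return None
    return alpha
```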

Lemma

Let \(v \in \sqrt{8} \Lambda\), and let \(v^{(k)} = v \cdot {\xi^k}\). Let \(w\) be a column of \(v\), and let \(w^{(k)}\) be the corresponding column of \(v^{(k)}\). Then \(|w^{(k)}| = |w|\). If \(w\) has shape \((m^4 \bmod{2m})\) then there is a unique \(k \in \{1,2\}\) such that \(w^{(k)}\) has shape \((0^4 \bmod{2m})\). If \(w\) has shape \((m^2 0^2 \bmod{2m})\) then \(w^{(k)}\) has shape \((m^2 0^2 \bmod{2m})\) for all \(k \in \mathbb{Z}\).

Computing a transversal of \(N_{x0} \backslash G_{x0}\)

The cosets in \(N_{x0} \backslash G_{x0}\) correspond to the type-4 vectors in the Leech lattice modulo 2. Thus for computing a transversal of \(N_{x0} \backslash G_{x0}\) we have to compute, for a given vector \(v\) of type 4 in \(\Lambda/ 2 \Lambda\), an element of \(G_{x0}\) that maps the standard frame \(\Omega\) in \(\Lambda/ 2 \Lambda\) to \(v\).

In the next few subsections we show how to map a vector of subtype 4x to a vector of subtype 48 by a sequence of generators \(x_\pi, \xi^k\) of length at most 6. The proofs are a bit involved, but the C function gen_leech2_reduce_type4_vector in file gen_leech.c constructs that sequence very fast. Then the inverse of the element computed by function gen_leech2_reduce_type4_vector is the requested element of the transversal.
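The chain of reductions described in the following subsections can be summarized as a transition map on subtypes. This is a simplified sketch of our own: subtype 43 may go to 42 or 44 depending on the case, and we record the first case only.

```python
# Transitions of type-4 subtypes used by the reduction; each step applies
# one power of xi, possibly preceded by a permutation. Sketch only.
NEXT_SUBTYPE = {0x46: 0x44, 0x43: 0x42, 0x44: 0x40, 0x42: 0x40, 0x40: 0x48}

def reduction_path(subtype):
    """Subtypes visited on the way from the given subtype to 48."""
    path = [subtype]
    while path[-1] != 0x48:
        path.append(NEXT_SUBTYPE[path[-1]])
    return path
```

Every path has at most 3 steps, and each step contributes at most one permutation and one power of \(\xi\); this matches the bound of 6 generators stated above.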

Whenever we select a permutation in the Mathieu group \(M_{24}\), we actually have to select an element of the group \({{\rm Aut}_{{\rm St}} \mathcal{P}}\) of standard automorphisms of the Parker loop described in Automorphisms of the Parker loop. Whenever we have to select a permutation mapping a certain set to another set, we want to keep the inverse of the selected element of \({{\rm Aut}_{{\rm St}} \mathcal{P}}\) as simple as possible. More specifically, for that inverse we select the standard representative in \({{\rm Aut}_{{\rm St}} \mathcal{P}}\) of the least feasible permutation in lexicographic order. We use the function mat24_perm_from_map in file mat24_functions.c for finding a suitable permutation in \(M_{24}\).

Incorporating the baby monster into the computation of a transversal

The capability of computing in a subgroup \(H^+\) of structure \(2.B\) of the monster \(\mathbb{M}\) (where \(B\) is the baby monster) is crucial for the success of our project. Although not strictly necessary, we want to select elements of \(H^+\) in our transversal of \(N_{x0} \backslash G_{x0}\) whenever possible. In this subsection we explain how to do so. This is a bit technical and may be skipped at first reading.

Let \(\beta = {\small \frac{1}{\sqrt{8}}}(e_2 - e_3) \in \Lambda\), where \(e_i\) is the \(i\)-th unit vector in \(\mathbb{R}^{24}\). Then \(\beta\) is of type 2 in the Leech lattice \(\Lambda\), the centralizer \(H^+\) of \(x_\beta\) in \(\mathbb{M}\) has structure \(2.B\), and the centralizer \(H\) of \(x_\beta\) in \(G_{x0}\) has structure \(2^{1+22}.\mbox{Co}_2\). Our goal is to find a representative of a coset in \(N_{x0} \backslash G_{x0}\) in \(H\) (centralizing the element \(x_\beta\) of \(\mathbb{M}\)) whenever possible.

The coset in \(N_{x0} \backslash G_{x0}\) corresponding to a vector \(v \in \Lambda / 2 \Lambda\) of type 4 has a nonempty intersection with \(H\) if and only if \(v + \beta\) is of type 2 and orthogonal to \(\beta\) in the real Leech lattice \(\Lambda\), see [Sey22] for details.

If this is the case then we compute an element of \(H \subset G_{x0}\) that maps \(v + \beta\) to the type-2 vector \(\Omega + \beta\) and fixes \(\beta\).

Note that the generator \(\xi\) fixes \(\beta\). A permutation \(\pi \in M_{24}\) fixes \(\beta\) if and only if it fixes the set \(\{2,3\}\) of points.

From subtype 46 to subtype 44

Let \(\lambda_r = \lambda_d + \lambda_\delta\) be of subtype 46. Then \(d\) is a dodecad. Assume that \(d\) contains one column of the MOG. Then up to signs and permutation of the columns of the MOG, and up to permutation of the entries of a column, there is a vector \(v \in \sqrt{8}\Lambda\) with \(v/\sqrt{8} = \lambda_r \pmod{2\Lambda}\) that has MOG coordinates

\[\begin{split}\begin{array}{|c|c|c|c|c|c|} \hline 2 & 0 & 2 & 0 & 0 & 0 \\ \hline 2 & 0 & 2 & 0 & 0 & 0 \\ \hline 2 & 0 & 4 & 2 & 2 & 2 \\ \hline 2 & 0 & 0 & 2 & 2 & 2 \\ \hline \end{array} \quad .\end{split}\]

By the Lemma given above there is a \(k \in \{1,2\}\) such that column 0 of the vector \(w = v \cdot {\xi^k}\) has shape \((4 \, 0^3)\). The other columns of \(w\) have the same shape as the corresponding columns of \(v\). Thus \(w\) has shape \((4 \, 2^8 0^{15})\). So using the table given above we see that \(w\) has subtype 44.

Note that both the dodecad \(d\) and its complement contain exactly one column of the MOG. The union of these two columns is a grey even octad \(o\).

With a little extra effort we can show \(k = 2 - \langle o, \delta \rangle\), where \(\langle ., . \rangle: \mathcal{C} \times \mathcal{C}^* \rightarrow \{0,1\}\) is the scalar product.
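The scalar product used here is just a bit parity if code words and cocode representatives are stored as 24-bit masks (an assumed encoding for this sketch, with the helper names being ours):

```python
def scalar(code_word, cocode_rep):
    """<d, delta>: parity of the intersection of a Golay code word with
    any representative of the cocode element (both as 24-bit masks)."""
    return bin(code_word & cocode_rep).count("1") & 1

def xi_exponent(octad, delta_rep):
    """The exponent k = 2 - <o, delta> from the text."""
    return 2 - scalar(octad, delta_rep)
```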

If dodecad \(d\) does not contain a column of the MOG then we select a permutation that maps the first four entries of dodecad \(d\) to the points \(0,1,2,3\), i.e. to the first column of the MOG. Then we proceed as above.

From subtype 43 to subtype 42 or 44

Let \(\lambda_r = \lambda_d + \lambda_\delta\) be of subtype 43. Then \(|\delta| = 3\). The three points of \(\delta\) can lie in one, two, or three different columns of the MOG.

If all entries of \(\delta\) lie in the same column of the MOG then up to signs and permutation of the columns of the MOG, and up to permutation of the entries of a column, there is a vector \(v \in \sqrt{8}\Lambda\) with \(v/\sqrt{8} = \lambda_r \pmod{2\Lambda}\) that has MOG coordinates

\[\begin{split}\begin{array}{|c|c|c|c|c|c|} \hline 5 & 1 & 1 & 1 & 1 & 1 \\ \hline 3 & 1 & 1 & 1 & 1 & 1 \\ \hline 3 & 1 & 1 & 1 & 1 & 1 \\ \hline 1 & 1 & 1 & 1 & 1 & 1 \\ \hline \end{array} \quad .\end{split}\]

By the Lemma given above there is a \(k \in \{1,2\}\) such that one column of \(w = v \cdot {\xi^k}\) (and hence all columns) have even entries. Column 0 of \(w\) has squared norm \(44 = 6^2 + 2^2 + 2^2 + 0^2\). That decomposition of \(44\) into a sum of four even squares is unique, so column 0 has shape \((6 \, 2^2 0)\). The other columns have shape \((2 \, 0^3)\). So \(w\) is of shape \((6 \, 2^7 0^{15})\) and hence of subtype 42.
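The uniqueness of that decomposition is easy to verify by brute force (a quick check of our own, not part of the implementation):

```python
from itertools import combinations_with_replacement

# All multisets of four even values in {0, 2, 4, 6} whose squares sum to 44.
decomps = [q for q in combinations_with_replacement((0, 2, 4, 6), 4)
           if sum(x * x for x in q) == 44]
# exactly one solution: 44 = 0^2 + 2^2 + 2^2 + 6^2
```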

With a little extra effort we can show \(k = 2 - \langle d, \omega \rangle\), where \(\omega\) is the standard tetrad represented by \(\{0,1,2,3\}\), and \(\langle ., . \rangle\) as above.

If the entries of \(\delta\) lie in two different columns of the MOG then up to signs and permutations as above, the corresponding vector \(v \in \sqrt{8}\Lambda\) has MOG coordinates

\[\begin{split}\begin{array}{|c|c|c|c|c|c|} \hline 3 & 5 & 1 & 1 & 1 & 1 \\ \hline 3 & 1 & 1 & 1 & 1 & 1 \\ \hline 1 & 1 & 1 & 1 & 1 & 1 \\ \hline 1 & 1 & 1 & 1 & 1 & 1 \\ \hline \end{array} \quad .\end{split}\]

By a similar argument as in the first case there is a \(k \in \{1,2\}\) such that the first two columns of \(w = v \cdot {\xi^k}\) have shape \((4 \, 2\, 0^2)\) and \((4 \, 2^3)\), and that the other columns of \(w\) have shape \((2 \, 0^3)\). Thus \(w\) is of subtype 44.

If the entries of \(\delta\) lie in three different columns of the MOG then we apply a permutation in \(M_{24}\) that maps \(\delta\) to \(\{1,2,3\}\), and proceed as in the first case.

From subtype 44 to subtype 40

Let \(\lambda_r = \lambda_d + \lambda_\delta\) be of subtype 44. Then \(d\) is an octad or a complement of an octad. We call that octad \(o\). If the cocode word \(\delta\) is a duad, let \(c_0, c_1\) be the two points in that duad. Otherwise, let \(c\) be any tetrad intersecting the octad \(o\) in two points, so that \(c\) represents the cocode element \(\delta\). Let \(c_0, c_1\) be the two points in \(c\) which are not in the octad \(o\).

Assume first that \(o\) is a grey even octad, i.e. a union of two columns of the MOG, and that the points \(c_0, c_1\) are in the same column of the MOG.

Then up to signs and permutation of the columns of the MOG, and up to permutation of the entries of a column, there is a vector \(v \in \sqrt{8}\Lambda\) with \(v/\sqrt{8} = \lambda_r \pmod{2\Lambda}\) that has MOG coordinates

\[\begin{split}\begin{array}{|c|c|c|c|c|c|} \hline 2 & 2 & 4 & 0 & 0 & 0 \\ \hline 2 & 2 & 4 & 0 & 0 & 0 \\ \hline 2 & 2 & 0 & 0 & 0 & 0 \\ \hline 2 & 2 & 0 & 0 & 0 & 0 \\ \hline \end{array}\end{split}\]

Similar to the previous cases, we can show that there is a unique \(k \in \{1,2\}\) such that \(w = v \cdot {\xi^k}\) has shape \((4^4 0^{20})\). So \(w\) is of subtype 40.

One can show \(k = 2 - \sigma\), where \(\sigma = |c \cap w | \pmod{2}\), \(\sigma \in \{0,1\}\), and \(w\) is either of the two columns of the MOG contained in the octad \(o\).

If the octad \(o\) and the two points \(c_0, c_1\) do not satisfy the condition given above then we first construct a permutation that maps \(c_0, c_1\) to \(2, 3\), and four points of \(o\) to \(4, 5, 6, 7\) as follows.

We compute the syndrome of the set \(c_0, c_1, o_0, o_1, o_2\). That syndrome intersects with \(o\) in exactly one point \(o_j\). Then we construct a permutation in \(M_{24}\) that maps the hexad \((c_0, c_1, o_0, o_1, o_2, o_j)\) to the hexad \((2, 3, 4, 5, 6, 7)\). Such a permutation exists, since both hexads are subsets of an octad.

From subtype 42 to subtype 40

Let \(\lambda_r = \lambda_d + \lambda_\delta\) be of subtype 42. Then \(d\) is an octad or a complement of an octad. We call that octad \(o\), as in the case of subtype 44. If that octad is grey, we map the vector \(\lambda_r\) to a vector of subtype 40 in the same way as in that case.

Otherwise we first map the lowest four points of octad \(o\) to the set \(0, 1, 2, 3\), using a permutation in \(M_{24}\).

From subtype 40 to subtype 48

Let \(\lambda_r\) be of subtype 40. Then \(\lambda_r = \alpha \lambda_\Omega + \lambda_\delta\), \(\alpha = 0, 1\), for some \(\delta \in \mathcal{C}^*\), \(|\delta|=4\). If \(\delta\) is equal to the standard tetrad \(\omega\) represented by \(\{0,1,2,3\}\) then there is a unique power of \(\xi\) that maps \(\lambda_r\) to \(\lambda_\Omega\), which is of type 48.

Otherwise let \(c\) be the tetrad containing the point \(0\) that corresponds to \(\delta\). Then we first apply a permutation in \(M_{24}\) that maps \(c\) to \(\{0,1,2,3\}\).

From subtype 21 to subtype 22, fixing \(\beta\)

Let \(\lambda_r = \lambda_d + \lambda_\delta\) be of subtype 21, and orthogonal to \(\beta\) in the real Leech lattice. Then \(|\delta| = 1\).

We proceed as in the case of subtype 43. Then there is a \(k \in \{1,2\}\) such that one column of \(w = v \cdot {\xi^k}\) (and hence all columns) have even entries. It is easy to see that \(w\) is of subtype 22.

From subtype 22 to subtype 20, fixing \(\beta\)

Let \(\lambda_r = \lambda_d + \lambda_\delta\) be of subtype 22, and orthogonal to \(\beta\) in the real Leech lattice. Then \(d\) is an octad or a complement of an octad. We call that octad \(o\) as in the cases of subtype 42 or 44. For subtype 22 the cocode word \(\delta\) can be considered as a subset of octad \(o\).

The set \(\{2,3\}\) is either contained in or disjoint from the octad \(o\).

If octad \(o\) is grey and even then we may directly apply a power of \(\xi\) that transforms \(\lambda_r\) to a vector of subtype 20, similar to the operation in the case of subtype 42. Otherwise we proceed as follows.

If the set \(\{2,3\}\) is disjoint from \(o\) we proceed exactly as in the case of subtype 44, replacing \(c_0\) and \(c_1\) by \(2\) and \(3\). This operation fixes the points \(2\) and \(3\).

Otherwise we first fix the points \(2\) and \(3\) and map the lowest two points of \(o \setminus \{2,3\}\) to \(0\) and \(1\), using a permutation in \(M_{24}\).

From subtype 20 to the vector \(\beta + \Omega\), fixing \(\beta\)

Let \(\lambda_r\) be of subtype 20, and orthogonal to \(\beta\) in the real Leech lattice. Then \(\lambda_r = \alpha \lambda_\Omega + \lambda_\delta\), \(\alpha = 0, 1\), for some \(\delta \in \mathcal{C}^*\), \(|\delta|=2\). In case \(\delta = \{2,3\}\) we must have \(\alpha = 1\), i.e. \(\lambda_r = \beta + \Omega\), and we are done. Otherwise both points in \(\delta\) are different from \(2\) and \(3\).

In case \(\delta = \{0, 1\}\) the transformation \(\xi^k\) with \(k = 2- \alpha\) maps \(\lambda_r\) to \(\beta + \Omega\).

Otherwise we first apply a permutation in \(M_{24}\) that maps \(\delta\) to \(\{0, 1\}\), and fixes \(2\) and \(3\).

Orbits of the group \(N_{x0}\) in the group \(Q_{x0}\)

Function gen_leech2_reduce_n in file gen_leech_reduce_n.c computes a standard representative of the orbit of a nonzero vector in the extraspecial subgroup \(Q_{x0}\) of \(G_{x0}\) under the action of the group \(N_{x0}\). Here \(N_{x0}\) acts on the group \(Q_{x0}\) of structure \(2_+^{1+24}\) by conjugation.

The quotient of \(Q_{x0}\) by its center \(\{x_{\pm 1}\}\) is isomorphic to \(\Lambda / 2 \Lambda\), i.e. to the Leech lattice mod 2. So we may define the subtype of any \(v \in Q_{x0}\) as the subtype of the element \(v x_{\pm 1}\) of \(\Lambda / 2 \Lambda\).

With one exception the orbit of a nonzero vector \(v \in Q_{x0}\) under \(N_{x0}\) depends on the subtype of \(v\) only. There are exactly two elements \(x_{\pm 1}\) of \(Q_{x0}\) of subtype 00, which are of course in different \(N_{x0}\) orbits.

For each nonzero orbit of \(N_{x0}\) we choose a representative \(x_d x_\delta\), \(d \in \mathcal{P}, \delta \in \mathcal{C}^*\) (depending on the subtype of the orbit) according to the following table:

\[\begin{split}\begin{array}{|c|c|c|} \hline \mbox{Subtype} & x_d & \delta \\ \hline 00 & x_{-1} & \{\} \\ 20 & x_1 & \{2,3\} \\ 21 & x_1 & \{0\} \\ 22 & x_o & \{\} \\ 31 & x_\Omega & \{0\} \\ 33 & x_\Omega & \{1,2,3\} \\ 34 & x_o & \{0,8\} \\ 36 & x_D & \{\} \\ 40 & x_1 & \{0,1,2,3\} \\ 42 & x_\Omega x_o & \{\} \\ 43 & x_1 & \{1,2,3\} \\ 44 & x_o & \{8, 9\} \\ 46 & x_D & \{0, 12\} \\ 48 & x_\Omega & \{\} \\ \hline \end{array}\end{split}\]

\(x_o\) is the (positive) element of the Parker loop corresponding to the standard octad \(\{0,1,2,3,4,5,6,7\}\).

\(x_D\) is the (positive) element of the Parker loop corresponding to the standard dodecad \(\{0,4,8, 13,14,15, 17,18,19, 21,22,23\}\).

This table can be computed by function gen_leech2_reduce_n_rep in file gen_leech_reduce_n.c.

Checking equality of monster elements and membership in \(2^{1+24}.\mbox{Co}_1\)

Checking equality of monster elements

For checking equality of two elements of the monster we may use two nonzero vectors \(w_{71}\) and \(w_{94}\) in the 196883-dimensional representation of the monster that are fixed by an element of order 71 and negated by an element of order 94, respectively. In [LPWW98] it is shown that only the neutral element of the monster fixes both vectors, \(w_{71}\) and \(w_{94}\). So we may check if an element is the identity in the monster. Here the corresponding calculations can be done modulo any odd prime; and we may even use different primes for the operation on \(w_{71}\) and on \(w_{94}\).
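The principle behind this test can be illustrated with a toy stand-in: in a faithful representation, \(g = h\) if and only if \(g h^{-1}\) fixes a test vector that no other group element fixes. The sketch below uses permutations acting on a vector with pairwise distinct entries instead of the monster and \(w_{71}\), \(w_{94}\); it illustrates the idea only, not the actual implementation.

```python
# Toy stand-in for the equality test: group elements are permutations
# of {0,...,3}, acting on a vector with pairwise distinct entries.
# Such a vector is fixed only by the identity permutation.
def act(p, v):
    """Apply permutation p to vector v: entry i moves to position p[i]."""
    w = [None] * len(v)
    for i in range(len(v)):
        w[p[i]] = v[i]
    return w

def mul(p, q):
    """Composition: apply p first, then q."""
    return [q[p[i]] for i in range(len(p))]

def inv(p):
    q = [None] * len(p)
    for i, j in enumerate(p):
        q[j] = i
    return q

w = [1, 2, 3, 4]                      # pairwise distinct entries

def equal_in_group(g, h):
    # g == h  iff  g * h^(-1) fixes the test vector w
    return act(mul(g, inv(h)), w) == w

g = [1, 0, 2, 3]
h = [1, 0, 2, 3]                      # the same element, written again
k = [0, 2, 1, 3]                      # a different element
assert equal_in_group(g, h)
assert not equal_in_group(g, k)
```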

The python script mmgroup.structures.find_order_vectors.py generates suitable vectors \(w_{71}\) and \(w_{94}\) at random and stores data for the fast recomputation of these vectors in file order_vector_data.py. Since we actually work in the representation \(\rho\) of dimension 1 + 196883, we have to make sure that the generated vectors are not fixed by the monster.

More precisely, we generate \(w_{71}\) as a vector in the representation \(\rho_3\) of the monster in characteristic 3, and \(w_{94}\) as a vector in the representation \(\rho_5\) of the monster in characteristic 5. We combine these two vectors to a vector in the representation \(\rho_{15}\) (modulo 15) of the monster via Chinese remaindering. Note that a computation in \(\rho_{15}\) is faster than the combination of the corresponding computations in \(\rho_{3}\) and \(\rho_{5}\). We write \(w\) for the vector combined from \(w_{71}\) and \(w_{94}\).
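The Chinese remainder step works coordinate-wise: a residue \(a\) modulo 3 and a residue \(b\) modulo 5 determine a unique residue modulo 15. A minimal sketch of that combination (not the actual data layout used in \(\rho_{15}\)):

```python
# Combine a coordinate a (mod 3) and a coordinate b (mod 5) into one
# coordinate mod 15. The coefficients are the CRT idempotents:
# 10 = 1 (mod 3), 0 (mod 5);  6 = 0 (mod 3), 1 (mod 5).
def crt_3_5(a, b):
    return (10 * a + 6 * b) % 15

# the combined value reduces back to the original residues
for a in range(3):
    for b in range(5):
        x = crt_3_5(a, b)
        assert x % 3 == a and x % 5 == b
```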

Testing membership in \(2^{1+24}.\mbox{Co}_1\)

For testing membership in \(2^{1+24}.\mbox{Co}_1\) we impose additional conditions on the vector \(w_{71}\) in the representation \(\rho_3\) defined in the last section.

For an odd number \(p\) let \(\rho_{A,p}\) be the subspace of \(\rho_p\) spanned by the basis vectors of \(\rho_p\) with tag A. Then \(\rho_{A,p}\) is fixed by the subgroup \(G_{x0}\) of structure \(2^{1+24}.\mbox{Co}_1\) of the monster. A vector in \(\rho_{A,p}\) has a natural interpretation as a symmetric bilinear form on the Leech lattice (modulo \(p\)), see [Con85]. Let \(Q_{p}\) be the symmetric bilinear form on the Leech lattice (modulo \(p\)) given by the scalar product; then \(Q_{p}\) is given by the \(24 \times 24\) unit matrix, up to a scalar factor.

Let \(w\) be a vector in \(\rho_{15}\) as in the last section. For any \(w\) in \(\rho_{p}\), let \(w_{(A,p)}\) be the bilinear form on the Leech lattice (modulo \(p\)) corresponding to the projection of \(w\) onto \(\rho_{A,p}\). We shall search for a vector \(w\) satisfying the condition stated in the last section. In addition we require that for some integer \(d\) (modulo 3) the bilinear form \(w_{(A,3)} - d \cdot Q_3\) has a kernel of dimension one, and that this kernel contains a type-4 vector \(x_4\) of the Leech lattice modulo 3. The chance of finding such a vector \(w\) is discussed in [Sey22], Appendix B. On average, finding a suitable vector \(w\) takes less than a minute on the author's computer, when running with 12 processes in parallel.

Let \(\Omega\) be the standard frame of the Leech lattice (modulo 3). Then \(\Omega\) is of type 4 and consists of the vectors \((0,\ldots,0,\pm 1,0,\ldots,0)\). Once we have found a suitable vector \(w\), we apply a transformation in \(G_{x0}\) to \(w\) that maps the frame in the Leech lattice given by \(x_4\) to \(\Omega\). To this end we work in the Leech lattice modulo 2, as described in section Computations in the Leech lattice modulo 2. Conversion from a type-4 vector in the Leech lattice mod 3 to a type-4 vector in the Leech lattice mod 2 is easy; the C function gen_leech3to2_type4 in file gen_leech3.c does that job. In the sequel we also write \(w\) for the transformed vector. So we may assume that the kernel of \(w_{(A,3)} - d \cdot Q_3\) is one dimensional and spanned by a vector in the standard frame \(\Omega\) of the Leech lattice modulo 3.

For doing linear algebra in \(\mathbb{F}_3\) we store a vector of 16 integers modulo 3 in a 64-bit integer; and we use arithmetic, Boolean and shift operations.
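A sketch of this packing technique in Python (the exact field layout in the C code may differ; here we assume 4 bits per entry, so that entry-wise sums cannot carry across fields):

```python
# 16 integers mod 3 packed into a 64-bit word, 4 bits per entry.
# Adding two packed vectors entry-wise: the raw field sums lie in
# 0..4, so no carry crosses a 4-bit field; then subtract 3 from
# every field whose value is >= 3.
MASK1 = 0x1111111111111111
MASK4 = 0x4444444444444444

def add_mod3(a, b):
    s = a + b
    m = (s + MASK1) & MASK4         # bit 2 set in fields with value >= 3
    return s - (m >> 2) - (m >> 1)  # subtract 3 = 1 + 2 in those fields

def pack(entries):
    assert len(entries) == 16 and all(0 <= e <= 2 for e in entries)
    x = 0
    for i, e in enumerate(entries):
        x |= e << (4 * i)
    return x

def unpack(x):
    return [(x >> (4 * i)) & 15 for i in range(16)]

u = pack([i % 3 for i in range(16)])
v = pack([(2 * i) % 3 for i in range(16)])
assert unpack(add_mod3(u, v)) == [0] * 16   # i + 2i = 0 (mod 3)
```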

Assume that \(g\) is in \(G_{x0}\). Then the kernel of \(w_{(A,3)} \cdot g - d \cdot Q_3\) is spanned by a vector in the frame \(\Omega \cdot g\). So we may also compute \(\Omega \cdot g\) in the Leech lattice modulo 2. Using the methods in section Computations in the Leech lattice modulo 2 we can compute an element \(h_1\) of \(G_{x0}\) that maps \(\Omega \cdot g\) to \(\Omega\). So \(gh_1\) fixes \(\Omega\). Hence \(gh_1\) is in the subgroup \(N_{x0}\) of structure \(2^{1+24+11}.M_{24}\) of \(G_{x0}\).

The factor group \(M_{24}\) of \(N_{x0}\) acts on the \(24 \times 24\) matrix representing the bilinear form \(w_{(A,p)}\) by permuting the rows and the columns, up to changes of signs that we will ignore at the moment. So we store the multiset of the absolute values of the entries of each row of the matrix \(w_{(A,p)}\). Then \(M_{24}\) permutes these 24 multisets by its natural action. So we may recover the element of \(M_{24}\) performing that action and compute a preimage \(h_2\) of the inverse of that element in \(N_{x0}\). Thus \(gh_1 h_2\) is in the subgroup \(2^{1+24+11}\) of \(N_{x0}\). Note that, originally, the vector \(w\) has been defined modulo 15, so that almost certainly we have enough information in the matrix \(w_{(A,15)}\) to compute the permutation in \(M_{24}\). In the implementation we compute a hash function on the 24 multisets obtained from the rows of \(w_{(A,15)}\), and we generate a new vector \(w\) if these 24 hash values are not mutually different.
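The row-matching idea can be sketched as follows; this is a toy stand-in in plain Python, not the actual implementation. Signs are ignored by taking absolute values, and the permutation is recoverable exactly when the row multisets are mutually different, which is the condition checked via hash values above.

```python
# Recover a permutation from a symmetric matrix A and the matrix B
# obtained from A by permuting rows and columns (with signs ignored),
# by matching the multisets of absolute row entries.
def row_multiset(matrix, i):
    return tuple(sorted(abs(x) for x in matrix[i]))

def recover_permutation(A, B):
    index = {row_multiset(A, i): i for i in range(len(A))}
    assert len(index) == len(A)   # multisets must be mutually different
    return [index[row_multiset(B, j)] for j in range(len(B))]

A = [[9, 1, 2, 3],
     [1, 8, 4, 5],
     [2, 4, 7, 6],
     [3, 5, 6, 0]]
p = [2, 0, 3, 1]                  # row i of B is row p[i] of A
B = [[A[p[i]][p[j]] for j in range(4)] for i in range(4)]
assert recover_permutation(A, B) == p
```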

Now the task of reducing an element of \(2^{1+24+11}\) is easy. We first eliminate the factor \(2^{11}\) by checking signs in \(w_{(A,15)}\), or signs of suitable entries of vector \(w\) with tag A, which is the same thing. Then we eliminate the factor \(2^{24}\) by checking signs of suitable entries of \(w\) with tags B, C, T, X, and, finally, the factor \(2^{1}\) by checking a sign of an entry with tag Y or Z.

So we obtain a decomposition \(g h_1 h_2 h_3 = 1\), with \(h_i\) in a transversal of \(G_i / G_{i+1}\) for \(G_1 = G_{x0}\), \(G_2 = N_{x0}\), \(G_3 = 2^{1+24+11}\), \(G_4 = \{1\}\). If \(g \notin G_{x0}\) then this procedure fails in one of the steps mentioned above. We abort as early as possible in case of failure, in order to accelerate functions searching for an element of \(G_{x0}\) that have a low probability of success.

Module mmgroup.dev.mm_reduce.find_order_vector.py contains a function for finding a suitable vector \(w\). Once a vector \(w\) has been found, it creates a module mmgroup.dev.mm_reduce.order_vector_data.py containing information for recomputing the same vector \(w\) more quickly. The code generation process creates a C file mm_order_vector.c containing the vector \(w\) in its source code together with some data required for using \(w\) and for checking its correctness. We believe that this method simplifies the integration of our project into computer algebra systems like GAP or Magma, since these systems may now simply call C functions for computing in the monster group, without bothering how to generate a suitable vector \(w\).

Subgroups of the Mathieu group \(M_{24}\)

An amalgam of many (mostly 2-local) large subgroups of the Monster \(\mathbb{M}\) has been constructed in [Iva09]. For each of these subgroups (and, hopefully, also for their intersections) we want an algorithm that constructs an (almost) uniform random element of that group. This is a complicated task, which might not even be possible in all cases. As a first step towards this goal we will describe the intersections of some subgroups of the Mathieu group \(M_{24}\). These subgroups of \(M_{24}\) are involved in the subgroups of \(\mathbb{M}\) in the amalgam. We will also give some information on how to construct random elements of these subgroups of \(M_{24}\) and of their intersections.

TODO: This section is yet a fragment and will be completed in a future version.

Some subgroups of the Mathieu group \(M_{24}\)

We define some subgroups of the Mathieu group \(M_{24}\) as the stabilizers of certain subsets (or sets of subsets) of the set \(\tilde{\Omega} = \{0,\ldots,23\}\), as shown in the following table:

\[\begin{split}\begin{array}{|c|c|c|c|} \hline \mbox{Name} & \mbox{Mnemonic} & \mbox{Stabilizes the set} & \mbox{Structure} \\ \hline M_2 & \mbox{2-set} & \{2,3\} & M_{22}:2 \\ M_o & \mbox{octad} & \{0,\ldots,7\} & 2^4:A_8 \\ M_t & \mbox{trio} & {\{ \{8i,\ldots,8i+7\} \mid i < 3 \}} & 2^6:(S_3 \times L_3(2)) \\ M_s & \mbox{sextet} & {\{ \{4i,\ldots,4i+3\} \mid i < 6 \}} & 2^6:3.S_6 \\ M_l & \mbox{(line)} & \{ \{2i, 2i+1\} \mid 4 \leq i < 12 \} & 2^{1+6}:L_3(2) \\ M_3 & \mbox{3-set} & \{1, 2,3\} & L_3(4):S_3 \\ \hline \end{array}\end{split}\]

All these groups, except for \(M_l\), are maximal subgroups of \(M_{24}\) and discussed in [CS99], Ch. 11. The structure of \(M_l\) (as a subgroup of \(M_o\)) is described in [Iva09] in the text after Lemma 4.1.3. The central element of \(M_l\) of order 2 is obtained by exchanging all entries \(2i\) with \(2i+1\) for \(i \geq 4\). We have the following inclusions:

\[\begin{split}\begin{array}{lll} M_l \subset M_o \, , \; & M_l \cap M_2 \subset M_t \, , \; & M_t \cap M_2 \subset M_o \cap M_l \, , \\ M_3 \cap M_l \subset M_s \, , \; & M_3 \cap M_t \subset M_o \cap M_s \, , \; & M_3 \cap M_t \cap M_l \subset M_2 \; . \\ \end{array}\end{split}\]

Membership of an element of \(M_{24}\) in any of these subgroups can easily be checked computationally. Thus an inclusion of two intersections of these subgroups can be disproved computationally by finding an element of \(M_{24}\) contradicting that inclusion. This computation is done in function test_mat24_rand in module mmgroup.tests.test_mat24 during the standard test.

In the remainder of this subsection we will prove the inclusions given above. The first inclusion is obvious. Next we justify the following two inclusions.

By [CS99], Ch. 11, the group \(M_o\) acts on the octad \(o = \{0,\ldots,7\}\) as the alternating group \(A_8\), and on the complement \(\bar{o}\) of \(o\) as the affine group on \(\mathbb{F}_2^4\). Here the affine structure of \(\bar{o}\) is given by the last 4 digits of the binary representations of its entries. Fixing e.g. the element 8 of \(\bar{o}\), the set \(\bar{o}\) acquires the structure of the linear space \(\mathbb{F}_2^4\); and we obtain the well-known isomorphism \(L_4(2) \cong A_8\). Subspaces of dimension 1, 2, and 3 of the linear space \(\mathbb{F}_2^4\) will be called lines, planes, and hyperplanes.

We have the following correspondences between \(o\) and its complement \(\bar{o}\):

\[\begin{split}\begin{array}{|l|l|l|} \hline \mbox{Linear space } \bar{o} = \mathbb{F}_2^4 & \mbox{Octad } o & \mbox{Remarks} \\ \hline \mbox{line} & \mbox{Steiner system } S(3,4,8) & \mbox{defines an affine structure on } o \\ \hline \mbox{plane} & \mbox{sextet refining } o & \mbox{tetrads of the sextet in } \bar{o} \mbox{ are} \\ & & \mbox{planes parallel to that plane} \\ \hline \mbox{hyperplane} & \mbox{Steiner system } S(3,4,8) & \mbox{defines an affine structure on } o \mbox{} \\ \hline \mbox{symplectic form} & \mbox{set of 2 elements} & \\ \hline \mbox{line incident} & \mbox{partition into 4} & \\ \mbox{with hyperplane} & \mbox{sets of 2 elements} \\ \hline \end{array}\end{split}\]

These correspondences (except for the last) are stated in [CCN+85]. For the last correspondence we refer to [CS99], Ch. 10.1.4. The tetrads of a Steiner system corresponding to a line or a hyperplane are the tetrads in the sextets corresponding to the planes incident with that line or hyperplane; here we take tetrads that are subsets of octad \(o\) only. Methods for computing with these objects are presented in [CS99], Ch. 11, and implemented in file mat24_functions.c.

Let \(f\) be the symplectic form on \(\mathbb{F}_2^4\) corresponding to a pair \(y\) of elements of \(o\). For two different nonzero elements \(x_1, x_2\) of \(\mathbb{F}_2^4\) let \(s\) be the sextet corresponding to the plane generated by \(x_1\) and \(x_2\), and let \(\tau\) be a tetrad in \(s\) which is a subset of \(o\). Then the scalar product of \(x_1\) and \(x_2\) with respect to \(f\) is the size of the set \(y \cap \tau\) (modulo 2).

Let \(l' = \{8,9\}, s'= \{8,9,10,11\}\) and \(t' = \{8,\ldots,15\}\). Then the line \(l'\), the plane \(s'\), and the hyperplane \(t'\) are mutually incident. A straightforward calculation using the facts stated above shows that \(t'\) is the orthogonal complement of \(l'\) with respect to the symplectic form \(f'\) corresponding to the subset \(\{2,3\}\) of \(o\). Obviously, \(M_t \cap M_2 \subset M_o\). So this proves \(M_l \cap M_2 \subset M_t\) and \(M_t \cap M_2 \subset M_o \cap M_l\).

Next we will show the last three inclusions listed above. The group \(M_3\) fixes the set \(y= \{1,2,3\}\). We have \(M_3 \cap M_l \subset M_o\), so \(M_3 \cap M_l\) fixes a Steiner system in \(o\), and it turns out that the set \(\{0,1,2,3\} \supset y\) is in that Steiner system. This proves \(M_3 \cap M_l \subset M_s\). We obviously have \(M_3 \cap M_t \subset M_o\), and hence \(M_3 \cap M_t\) fixes another Steiner system in \(o\). Here the set \(\{0,1,2,3\}\) is also in this Steiner system, implying \(M_3 \cap M_t \subset M_o \cap M_s\).

The group \(M_t \cap M_l\) fixes a line incident with a hyperplane in \(\mathbb{F}_2^4\). Hence it fixes a partition of \(o\) into four unordered pairs, and it turns out that the set \(u\) of these pairs is \(\{ \{2i, 2i+1\} \mid 0 \leq i < 4 \}\). Thus \(M_3 \cap M_t \cap M_l\) fixes \(u\) and \(y\), and hence also the set \(\{2,3\}\). This proves \(M_3 \cap M_t \cap M_l \subset M_2\).

Some 2-local subgroups of the Monster

The subgroups of \(M_{24}\) constructed above are involved in some large subgroups of the Monster as described in the following table.

\[\begin{split}\begin{array}{|c|c|} \hline \mbox{Name} & \mbox{Involved in subgroup of} \, \mathbb{M} \\ \hline M_2 & H^+ = 2.\mbox{B} \\ M_o & G_{10} = 2^{10+16}.O_{10}^+(2) \\ M_t & G_5^{(t)} = 2^{5+10+20}.(S_3 \times L_5(2)) \\ M_s & G_3 = 2^{3+6+12+18}.(L_3(2) \times L_3(2)) \\ M_3 & 2^2.\!\hphantom{|}^2\!E_6(2):S_3 \\ \hline \end{array}\end{split}\]

TODO: More details will be given in a future version.

Installation from a source distribution

The current version of the mmgroup package is a source distribution that has been tested on 64-bit Windows, Linux and macOS with a 64-bit x86 CPU. It runs with python 3.8 or higher. The sources of the project can be downloaded from

https://github.com/Martin-Seysen/mmgroup .

We use the python package cibuildwheel to build several wheels on these operating systems. The GitHub repository of the project contains actions to build the corresponding python wheels. The program code for the GitHub actions is stored in subdirectory .github/workflows. More details about the build process are given in section The build process.

The mmgroup package contains a number of extensions written in C which have to be built before use. This will be discussed in section Code generation.

Building the mmgroup package with cibuildwheel

The easiest way to build the mmgroup package is to clone the sources from the github repository with git, and to build a python wheel with cibuildwheel. For this, the cibuildwheel tool must support your target platform. In the sequel we assume that python and the git revision control system are installed on your platform. We describe the build process on a unix-like platform.

Depending on your platform, you might have to install additional tools. For details we refer to the cibuildwheel documentation https://cibuildwheel.readthedocs.io/en/stable/

Any previously installed version of the mmgroup package must be uninstalled before building a new version of that package!

The current build system supports native compilation only. Cross compilation is not supported.

E.g. for building the mmgroup package for python 3.12 on Linux with a 64-bit x86 CPU, open a shell and type:

pip3 install cibuildwheel
git clone https://github.com/Martin-Seysen/mmgroup
cd mmgroup
git pull origin master
git checkout .
python3 -m cibuildwheel --output-dir wheelhouse --only cp312-manylinux_x86_64

The last command builds the python wheel. For building the wheel on a different platform, the argument ‘cp312-manylinux_x86_64’ in the last command should be replaced by a build identifier that describes the requested python version and target platform. A list of valid build identifiers is given in Section ‘Options/build selection’ of the cibuildwheel documentation, see

https://cibuildwheel.readthedocs.io/en/stable/options/#build-skip

After building the python wheel, switch to subdirectory wheelhouse, and check the file name of the wheel that has been built:

cd wheelhouse
ls

Then you may install the wheel with pip, e.g.:

pip3 install mmgroup-0.0.13-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

Here the file name after the pip3 install command must be changed to the file name of the wheel just generated.

You should test your installation of the mmgroup package before using it:

pip3 install pytest
python3 -m pytest --pyargs mmgroup -m "not slow"

Dependencies

Before you can use this source distribution or build its extensions you should install the following python packages:

External Python packages required

Package                Purpose
--------------------   ------------------------------------------------------------
cython                 Development: integrating C programs into the mmgroup package
numpy                  Runtime: most arrays used by the mmgroup package are numpy arrays
wheel                  Distribution: package for generating a Python wheel
pytest                 Testing: basic package used for testing
regex                  Development: parsing balanced curly braces
setuptools             Development: basic package used for setup and building extensions
cibuildwheel           Development: building wheels in a continuous integration process
auditwheel             Development (Linux and macOS only): patching shared libraries
sphinx                 Documentation: basic package used for documentation
sphinx-rtd-theme       Documentation: ‘theme’ to be used by sphinx
sphinxcontrib-bibtex   Documentation: bibliography in BibTeX style
sphinxcontrib-tikz     Documentation: using TikZ figures in sphinx
breathe                Documentation: link between doxygen and sphinx

Packages used for the purpose of documentation are required only if you want to rebuild the documentation. If you want to rebuild the documentation you should also install the following programs:

External programs required

Program       Purpose                               Location
-----------   -----------------------------------   --------------------------------------------
miktex        Documentation                         https://miktex.org/
Perl          Documentation                         https://www.perl.org/get.html
doxygen       Documentation of C files              https://www.doxygen.nl/download.html
Ghostscript   Documentation: using TikZ in Sphinx   https://ghostscript.com/releases/gsdnld.html

Installing the package

To build the required package on your local computer, go to the root directory of the distribution. This is the directory containing the files setup.py and README.rst. From there run the following commands in a shell:

python -m pip install -r requirements.txt
python setup.py bdist_wheel

In Linux or macOS you’ll have to change the python command to python3.

For testing the installation, run the following command:

python -m pytest ./src/mmgroup/ -Wignore -v -s -m "not slow"

Distributing a wheel is the standard way to distribute a python package, see e.g.

https://packaging.python.org/guides/distributing-packages-using-setuptools/#wheels

Remarks

If you have installed any version of the mmgroup package (e.g. with the pip tool) then you must uninstall that package before you can build a new version of the mmgroup package from the source files.

Installing a C compiler for cython in Windows

The bad news for Windows developers is that there is no pre-installed C compiler on a standard Windows system. However, the cython package requires a C compiler, which on Windows is usually MSVC.

The user has to install a C compiler so that it cooperates with cython. That installation process is out of the scope of this document.

For installing MSVC, one might start looking at https://wiki.python.org/moin/WindowsCompilers

The author has installed the MSVC compiler with the Microsoft Build Tools for Visual Studio from:

https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=16 ,

following the instructions in

https://www.scivision.dev/python-windows-visual-c-14-required/ .

Before typing python setup.py bdist_wheel in a Windows command line the author had to type:

"C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvars64.bat"

Here the path may be different on the user’s Windows system.

The build process

Warning! At present the build process is under construction.

We plan to switch to the Meson build system. The reason for this is that invoking setup.py from the command line is deprecated. Meson requires a much more declarative style for describing the build operations. Thus the description here is pretty much outdated!

The standard build process for python packages is based on the setuptools package. We use setuptools to build the mmgroup package. We have added the following functionality to the standard build process:

  • The user may call arbitrary functions or processes, e.g. for the automatic generation of C code.

  • The user may add a shared library that may be used by several extensions.

Module build_ext_steps in the root directory contains the python classes required for that extension. The following table lists some important files in the root directory.

Files in the root directory used by the build process

File name            Purpose
------------------   -----------------------------------------------------
build_ext_steps.py   Extension for the setuptools package
build_shared.py      Tool for building a shared library
cleanup.py           Clean up intermediate files
config.py            A configuration file of the project
generate_code.py     Tool for generating C code
MANIFEST.in          List of files to be added to the source distribution
pyproject.toml       Main configuration file for the project
README.rst           Main documentation file for GitHub
setup.py             Main file for the build process with setuptools

The tool build_shared.py

The tool build_shared.py is a simple stand-alone tool for generating a shared library with a standard C compiler. It supports Windows, Linux and macOS. This job could also be done with a tool like CMake or Meson; but for the time being it has been easier to extract this functionality from an older tool with a similar functionality.

Here are the options of the tool build_shared.py:

-h, --help            show this help message and exit
--name NAME           Set NAME of library to be generated. NAME should
                      have no extension and no prefix such as 'lib'.
--sources [SOURCES ...]
                      Add the list SOURCES to the list of source files. Each
                      source file should be a .c file with extension '.c'.
--source-dir DIR      Set root directory DIR for all source files.
--include-path [PATHS ...]
                      Set list of PATHS for finding .h files to be used.
--library-path [PATHS ...]
                      Set list of PATHS for finding libraries to be used.
--define [VARS ...]   Define variables given by the list VARS for the
                      compiler. Each variable must be a string VAR or
                      VAR=VALUE.
--undef [VARS ...]    Undefine variables given by the list VARS for the
                      compiler.
--libraries [LIBS ...]
                      Search libraries with names in the list given by
                      LIBS. This corresponds to the gcc option '-l'.
--library-dir DIR     Set directory DIR for storing object files and
                      static libraries.
--shared-dir DIR      Set directory DIR for storing shared libraries.
--static [STATIC]     Create static instead of shared library. Optional
                      argument STATIC may be 0 (=shared) or 1 (=static).
--n N                 Use N parallel processes for compiling.
--compiler C          Specify name of default compiler. C must be 'unix',
                      'msvc', or 'mingw32'.
--cflags FLAGS        Add extra arguments FLAGS for the compiler. E.g.
                      '--cflags=-c,foo=bar' adds arguments '-c' and
                      'foo=bar'.
--display MODE        display (full) name of generated library and exit.
                      MODE = 'load' displays the full file name of the
                      library. MODE = 'build' displays the name of the
                      library to be linked to a program using that
                      library.

Creating a new version

The ultimate goal of building a new version is to upload a new version of the mmgroup package to the pypi server.

At present we will upload a source distribution and python 3 wheels for 64-bit Windows, Linux and macOS, say, for the latest two or three python versions.

Before creating a new version, (at least) the following test should be executed in a shell on 64-bit Windows, and also on some Linux version:

pytest src/mmgroup/ -v -s -m "very_slow"

Version numbering

We assume that we want to create Version 0.0.8 at date 2022-07-12, with the short version description ‘Performance improved’.

You should update the version number in file setup.py by writing e.g.:

VERSION = '0.0.8' # 2022-07-12. Performance improved

into that file. You should also comment out older version descriptions in that file. In section Version history of file docs/source/api.rst you should add the following line to the version history:

| Version 0.0.8, 2022-07-12. Performance improved

Then you should upload the new version with these changes to the master branch in the github repository:

https://github.com/Martin-Seysen/mmgroup

After uploading, you should create a new release in the github repository. To do so, click Create a new release on the main page of the github repository. Here you should write the tag v0.0.8 into the field Choose a tag. The Target of the release should be master, referring to the master branch in git. You should enter the title mmgroup v0.0.8 into the field Release title. We recommend to enter (at least) the version description (which is ‘Performance improved’ in our case) into the field Describe this release. Finally, you should click the button Publish release in the github window.

Generating the wheels

This subsection describes how to create wheels manually. This process has now been automated to some extent by using GitHub actions that trigger the python tool cibuildwheel. So the reader may skip this section.

Here you must generate a wheel for each python version, and also for each operating system that you want to support. We assume that Anaconda is used for creating wheels for Windows 64 for various python versions. An Anaconda environment, e.g. for python 3.9, is created by typing the following command:

conda create --name python39 python=3.9

Then we may switch to that python version by typing:

conda activate python39

Environments for other python versions are created similarly. One has to install all required python packages for each version. For uploading a version to pypi we also have to install twine with

pip install twine

In each version to be supported we have to type:

python setup.py build_ext bdist_wheel
python setup.py sdist

Here the first line creates the wheel for the selected python version. Before doing so in Windows, you must install a C compiler for cython as described in section Installation from a source distribution.

The second line creates a source distribution; this must be done only once. The wheels and source distributions are stored in subdirectory dist. The wheel for mmgroup version 0.0.8 for python 3.9 has a name similar to mmgroup-0.0.8-cp39-cp39-win_amd64.whl; and the source distribution has a name like mmgroup-0.0.8.zip.

Uploading the version to pypi

You may upload the new version with the following command:

twine upload dist/*

This uploads all files from subdirectory dist to pypi. So you’d better cleanup that directory before uploading.

Warning

Uploading with twine is irreversible. If your uploaded version is buggy, you will have to create a new version!

Description of module build_ext_steps.py

Module build_ext_steps provides a customized version of the ‘build_ext’ command for setup.py.

It should be placed in the same directory as the module setup.py.

Distributing python packages

The standard toolkit for distributing python packages is the setuptools package. Here the user types:

python setup.py build_ext

at the console for building the extensions to the python package, which are typically written in a language like C or C++ for the sake of speed. We may use e.g. the Cython package to write python wrappers for the functions written in C or C++. The setuptools package supports the integration of Cython extensions.

The setup.py script describes a list of extensions, where each extension is an instance of class Extension which is provided by setuptools. Then it calls the setup function which builds all these extensions:

from setuptools import setup
from setuptools.extension import Extension   

ext_modules = [
    Extension(
        ...  # description of first extension
    ),
    Extension(
        ...  # description of second extension
    ),
    ...
]

setup(
    ..., # Standard arguments for the setup function
    ext_modules = ext_modules,   # List of extensions to be built
)

We assume that the reader is familiar with the standard python setup process. For background, we refer to

https://setuptools.readthedocs.io/en/latest/

Added functionality for building python packages

This build_ext_steps module supports a new paradigm for building a python extension:

  • A python program make_stage1.py creates a C program stage1.c.

  • We create a python extension stage1.so (or stage1.pyd in Windows) that makes the functionality of stage1.c available in python.

  • A python program make_stage2.py creates a C program stage2.c. Here make_stage2.py may import stage1.so (or stage1.pyd).

  • We create a python extension that makes the functionality of stage2.c available in python.

  • etc.

This paradigm is not supported by the setuptools package.

Using class BuildExtCmd for the new paradigm

For using the new building paradigm we have to replace the standard class build_ext by the class build_ext_steps.BuildExtCmd.

from setuptools import setup
from build_ext_steps import Extension   
from build_ext_steps import BuildExtCmd

ext_modules = [
    # description of extension as above
]

setup(
    ..., # Standard arguments for the setup function
    ext_modules = ext_modules,   # List of extensions to be built
    cmdclass={
       'build_ext': BuildExtCmd, # replace class for build_ext
    },
)

This change has a few consequences:

  • It is guaranteed that the extensions are built in the given order.

  • Extensions are always built in place (option build_ext --inplace). (The current version does not support building the extensions in a separate build directory.)

  • Building all extensions is always forced (option build_ext -f), regardless of any time stamps.

Apart from these changes, an extension is created in the same way as with setuptools.

For a documentation of the Extension class in the setuptools package, see

https://docs.python.org/3/distutils/apiref.html?highlight=extension#distutils.core.Extension

Inserting user-defined functions into the build process

Module build_ext_steps provides a class CustomBuildStep for adding user-defined functions to the build process.

In the list ext_modules of extensions, instances of class CustomBuildStep may be mixed with instances of class Extension. Class CustomBuildStep models an arbitrary sequence of functions to be executed.

The constructor for that class takes a string ‘name’ describing the action of these functions, followed by an arbitrary number of lists, where each list describes a function to be executed.

The first entry of each list should be a string; then a subprocess with that name is started. Subsequent entries in the list are passed as arguments to the subprocess.

If such an argument contains the string ‘${build_lib}’ then that string is replaced by the name of the root directory of the path where the extension is to be built. If the ‘build_ext’ command of setup.py is invoked with the ‘--inplace’ option then the string ‘${build_lib}’ is replaced by ‘null’.

Such a subprocess may be e.g. a step that generates C code to be used for building a subsequent python extension.
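A minimal sketch of this placeholder substitution may look as follows. Here expand_args is a hypothetical helper (not part of build_ext_steps), and scripts/make_stage1.py is a made-up script name used for illustration only.

```python
def expand_args(args, build_lib, inplace=False):
    # Hypothetical sketch of the '${build_lib}' substitution described
    # above; with the '--inplace' option the placeholder becomes 'null'.
    value = "null" if inplace else build_lib
    return [arg.replace("${build_lib}", value) for arg in args]

# arguments of a subprocess as they might appear in a CustomBuildStep
args = ["scripts/make_stage1.py", "--out", "${build_lib}/mmgroup"]
```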

It is recommended to use the string sys.executable (provided by the sys module) instead of the string 'python' for starting a python subprocess.

The following functionality is deprecated:

If the first entry of a list as described above is a python function then that function is called. Subsequent entries in the list are passed as arguments to the function.

Using a shared library in an extension

The user may have to perform some OS-specific steps to make a shared library available to a python extension. Details are beyond the scope of this documentation.

Using an extension in a subsequent build step

Once a python extension has been built, it can also be used in a subsequent step of the build process, e.g. for calculating large arrays of constants for C programs. Details are beyond the scope of this documentation.

Code generation

Warning! At present the code generation process is under construction.

We plan to switch to the Meson build system. Therefore a much more declarative style is required for describing the code generation. Also, a strict separation between input and output directories is required. Thus the description here is pretty much outdated!

The main code generation tool generate_code.py

An invocation of the main code generation script generate_code.py generates a set of .c files and also a single .h file from a set of source files. Here each .c file is generated from a single source file which has extension .ske. Prototypes for the functions in the .c files may also be generated from these source files; they are copied into the generated .h file. The user may also specify a source file with an extension .h. Such a .h file usually contains global declarations; and it is also copied into the generated .h file.

The user may also specify a set of python modules to be used for code generation with the --tables option. Any such module must have a class Tables containing information on how to enter tables and code snippets into the generated C code. The structure of such a code-generating module is described in the next section.

We use the Cython language to integrate the generated .c file into our python project. In a Cython project, a .pxd file should be generated from a header file. Here the .pxd file contains essentially the same information as the header file. The --pxd option generates a .pxd file from the generated header file automatically. Here the user may specify a source file (with extension .pxd), which will also be copied into the generated .pxd file.

The user may also create a .pxi file from the .pxd file with the --pxi option. Such a .pxi file contains wrappers for the C functions declared in the .pxd file. These wrappers can be used directly from python by including the .pxi file into a .pyx file that defines a Cython extension. Note that this automatic wrapping mechanism works for rather simple prototypes only.

With the --pyx option, the user may simply copy a .pyx file from the source path to a destination directory.

The Meson build system requires strict separation between the input and the output directory. The user may specify a path where to look for the input files (with extensions .ske, .h, .pxd, and .pyx). The user may specify a directory where to store the generated .c and .h files; and also a directory where to store the generated .pxd, .pxi, and .pyx files.

In the following table we list the options available in the generate_code.py tool.

-h, --help            show this help message and exit
--sources [SOURCE ...]
                      List of SOURCE files to be generated. For each
                      SOURCE with extension '.c' a '.c' file is
                      generated from a file with the same name and
                      extension '.ske'. A SOURCE with extension '.h' is
                      copied into the common header file. Each SOURCE
                      is searched in the path set by parameter '--
                      source-path'. Output is written to the directory
                      set by parameter '--out-dir'.
--out-header HEADER   Set name of output header file to HEADER. By
                      default we take the first file with extension .h
                      in the list given in the argument '--sources'.
--pxd PXD             Set input '.pxd' file PXD for generating '.pxd'
                      file with same name from that input file and from
                      the generated header.
--pxi                 Generate '.pxi' file from '.pxd' file if this
                      option is set.
--pyx PYX             Copy input '.pyx' file PYX from source path to
                      output directory.
--tables [TABLES ...]
                      Add list TABLES of table classes to the tables to
                      be used for code generation.
--set VAR=VALUE [VAR=VALUE ...]
                      Set variable VAR to value VALUE. When generating
                      code with subsequent '--sources' options then the
                      table classes will set VAR=VALUE.
--subst [PATTERN SUBSTITUTION ...]
                      Map the name of a '.c' or '.h' file to be
                      generated to the name of a file used as a source
                      for the generation process. Example: "--subst
                      mm(?P<p>[0-9]+)_op mm_op" maps e.g.'mm3_op' to
                      'mm_op'. We substitute PATTERN in the name of a
                      generated file by SUBSTITUTION for obtaining the
                      name of that source. The part "(?P<p>[0-9]+)"
                      creates a variable 'p' that takes the decimal
                      string following the initial letters 'mm' in the
                      file name. Then variable 'p' will be passed to
                      the table classes used for code generation.
                      PATTERN must be given in python regular
                      expression syntax.
--source-path [PATHS ...]
                      Set list of PATHS for finding source files to be
                      used for generation process.
--py-path [PATHS ...]
                      Set list of PATHS for finding python scripts
--out-dir DIR         Store output files with extensions .c and .h in
                      directory DIR.
--out-pxd-dir DIR     Store output files with extensions .pxd, .pxi,
                      and .pyx in directory DIR.
--dll DLL_NAME        Generate code for exporting C functions to a DLL
                      or to a shared library with name DLL_NAME.
                      Parameter DLL_NAME must be the same for all
                      generated C files to be placed into the same DLL.
                      This parameter is not used for any other
                      purposes.
--nogil               Optional, declare functions from .pxi file as
                      'nogil' when set.
--mockup              Use tables and directives for Sphinx mockup if
                      present
-v, --verbose         Verbose operation

Python modules used for code generation

In the main code generation tool generate_code.py the user may specify a list of python modules to be used for code generation. Any such module must have a class Tables that contains two dictionaries tables and directives to be used for code generation. Both of these dictionaries map identifiers to objects implementing tables or directives to be used in the code generation process. Here the identifiers should be valid python identifiers beginning with an alphabetic character. If several code generation modules are specified then the corresponding dictionaries in the Tables classes of these modules are merged into a single dictionary.

Dictionary tables maps identifiers to values. If we have e.g. tables['P'] = 3 then the expression %{P} in a source file with extension .ske evaluates to the string 3 in the generated .c file. More elaborate use cases for the tables dictionary are given in the following sections. A value in the tables dictionary may also be a list of integers; then there is a way to initialize an array of integers in the generated .c files with these integer values.
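The substitution of table values can be illustrated with a small sketch in plain python. The helper functions substitute and c_constants and the sample entries below are illustrative only; they are not the actual code generator.

```python
import re

tables = {"P": 3, "MY_TABLE": [1, 2, 3, 255]}  # sample entries

def substitute(line, tables):
    # replace each %{NAME} in a source line by str(tables[NAME])
    return re.sub(r"%\{(\w+)\}", lambda m: str(tables[m.group(1)]), line)

def c_constants(values):
    # render a list of integers as C array constants
    return ", ".join(str(v) for v in values)
```

Then substitute("uint32_t modulus = %{P};", tables) yields "uint32_t modulus = 3;", and c_constants(tables["MY_TABLE"]) yields the initializer "1, 2, 3, 255".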

Dictionary directives maps identifiers to directives. Invoking such a directive in a source file causes the code generator to put a code snippet into the generated .c file. Here a typical use case is the multiplication of a part of a vector of the representation of the Monster group with a fixed matrix.

A class Tables as described above should accept arbitrary keyword arguments in the constructor. (A standard way to do this is to provide a parameter **kwds in the constructor.) The generate_code.py tool may pass such keyword arguments to the constructor of a class Tables either with the --set option or with the --subst option. Using the --set option is straightforward. Using --subst is a bit more involved. The following paragraph provides an example for using the --subst option.

The source file mm_op_pi.ske supports certain operations of the Monster group on vectors modulo several small odd numbers p. The module mm_op_pi.py provides a class Tables that supports the generation of code snippets to be used in these operations. From the source file mm_op_pi.ske we want to generate files mm3_op_pi.c, mm7_op_pi.c, … etc. for the operation modulo 3, 7, …, etc. In the corresponding code generation process there is an option

--subst mm(?P<p>[0-9]+)_op mm_op

that primarily describes a substitution of strings mapping both mm3_op_pi and mm7_op_pi to mm_op_pi. So this substitution maps the name of a generated .c file to the name of a .ske file (without extension) from which that .c file is to be generated. The first argument mm(?P<p>[0-9]+)_op of the --subst option is the pattern to be substituted, given in the standard python regular expression syntax. Note that the part (?P<p>[0-9]+) of that argument also creates a dictionary {'p' : '3'} mapping identifier ‘p’ to the string ‘3’, when we substitute the name mm3_op_pi using this pattern. When such a dictionary is created, the code generation tool passes the keyword argument p = '3' to the constructor of every class Tables used for the generation of file mm3_op_pi.c. This way the Tables classes can be instructed to generate code snippets for the operation modulo 3.
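The effect of this --subst option can be reproduced with python's standard re module:

```python
import re

pattern, substitution = r"mm(?P<p>[0-9]+)_op", "mm_op"

# map the name of a generated file to the name of its .ske source
source = re.sub(pattern, substitution, "mm3_op_pi")

# the named group captures the keyword argument for the Tables classes
kwds = re.match(pattern, "mm3_op_pi").groupdict()
```

Here source evaluates to 'mm_op_pi' and kwds to {'p': '3'}, as described above.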

Generation of code snippets and tables

In this section we describe the most important functions and classes used for the automatic generation of C code.

The C code generator generates code from a source file automatically.

Class TableGenerator can be used to generate a .c and a .h file from a source file. Here such a source file usually has the extension .ske.

Basic idea: We copy a source file to the .c file. The source contains directives to enter precomputed tables, code snippets or constants into the .c file, and prototypes into the .h file.

We copy the content of the source file directly to the .c file. In the source file we also parse certain directives and we replace them by automatically generated code snippets in the .c file and in the .h file.

The arguments given to the constructor of class TableGenerator specify the meaning of the directives.

Overview

In the source file we parse directives of the shape:

// %%<keyword>  <arg1>, <arg2>, ...

Each directive executes a certain function associated with the <keyword>. There are built-in and user-defined directives. Built-in directives are listed in section Built-in directives. User-defined directives are passed to the constructor in a dictionary directives with entries:

<keyword> : <directive_function>. 

If a directive is found in the source file we call:

<directive_function>(*<all evaluated arguments>), 

and the string returned by that function is merged into the generated .c file. The arguments of a directive must be given in python syntax and they are evaluated to python objects using some safe version of the python eval() function. Basically, python identifiers in an argument are evaluated as follows:

In the constructor, parameter tables is a dictionary of local variables, which will be passed to the python eval() function for evaluating an argument. See section Arguments of directives for a more detailed description of this process.

You may also consider class UserDirective for constructing specific directive functions.
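As an illustration only (this is not the actual TableGenerator code), dispatching such directives could be sketched as follows; the directive CONST is made up for this example:

```python
import re

# match lines of shape '// %%<keyword>  <arg1>, <arg2>, ...'
DIRECTIVE_RE = re.compile(r"//\s*%%(\w+)\s*(.*)$")

def process(lines, directives, names):
    out = []
    for line in lines:
        m = DIRECTIVE_RE.match(line.strip())
        if m and m.group(1) in directives:
            # evaluate the argument list, given in python syntax
            args = eval("(%s,)" % m.group(2), {}, names) if m.group(2) else ()
            # the string returned by the directive function is merged in
            out.append(directives[m.group(1)](*args))
        else:
            out.append(line)
    return out

# a made-up directive 'CONST' generating a #define statement
directives = {"CONST": lambda name, value: "#define %s %d" % (name, value)}
code = process(['int x;', '// %%CONST "N", 2 * 3'], directives, {})
```

Here code evaluates to ['int x;', '#define N 6'].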

Arguments of directives

The comma-separated list of arguments of a built-in or user-defined directive or function must be a valid python expression. This expression is evaluated by a safe version of the python eval() function, with the local variables given by the dictionary names. Here names is an updated version of the dictionary tables of local variables passed in the constructor of this class as described below. This means that each identifier found in the list of arguments is looked up in that dictionary, and that the identifier is evaluated to the corresponding value in that dictionary.

The dynamic dictionary names is initialized with dictionary tables. It may be updated with the names of the tables generated by the built-in directive TABLE, as indicated in the description of the built-in directive USE_TABLE.

There are a few more predefined entries in dictionary names:

  • TABLES: the original dictionary tables passed in the constructor

  • NAMES: the updated version of dictionary tables as described above

Evaluating a python expression with the standard function eval() is known to be unsafe. For evaluating arguments we use a modified version of the eval() function. That modified function does not allow assignment, and identifiers starting with an underscore _ are illegal. See function EvalNodeVisitor in module mmgroup.generate_c.generate_functions for details.

There is a limited number of built-in python functions (specified in dictionary safe_locals in the same module) that are legal in python expressions for arguments:

abs, int, len, max, min, str, pow, range, zip 
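A restricted evaluator of this kind might be sketched as follows. Here safe_eval is a simplified illustration, not the actual implementation in mmgroup.generate_c:

```python
import ast

# the built-in functions listed above
SAFE_BUILTINS = {f.__name__: f for f in
                 (abs, int, len, max, min, str, pow, range, zip)}

def safe_eval(expr, names):
    # parsing in mode 'eval' disallows assignment statements
    tree = ast.parse(expr, mode="eval")
    # reject identifiers starting with an underscore
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id.startswith("_"):
            raise ValueError("illegal identifier: " + node.id)
    # evaluate with the listed builtins and the local variables in names
    scope = dict(SAFE_BUILTINS, **names)
    return eval(compile(tree, "<arg>", "eval"), {"__builtins__": {}}, scope)
```

For example, safe_eval("max(2*a, 5)", {"a": 4}) returns 8, while safe_eval("_x", {"_x": 1}) raises a ValueError.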

An example with user-defined directives is given in class testcode.TestGen. We recommend following the conventions in that class for passing user-defined directives to an instance of class TableGenerator.

Built-in directives

  • COMMENT, deprecated!

    Indicates that a comment follows. This should be used only for comments relevant for the user, not for internal details. No action in this class, but other functions may use this directive for generating user documentation.

  • ELSE [IF]? <expression>

    else clause for IF directive, see IF

  • END <directive>

    Closes a block that has been started by one of the directives FOR or IF. A FOR block must be closed with END FOR, an IF block must be closed with END IF, etc.

  • ERROR <message>

    TODO: yet to be implemented and documented!!!!

  • EXPORT (arguments see below)

The following line, with a semicolon appended, is copied into the generated .h file. This is useful for creating prototypes. The EXPORT directive may precede a line with a USE_TABLE directive; then the table name captured by the next USE_TABLE statement is exported to the generated .h file. The EXPORT directive may also precede a line with a DEFINE function; then the #define statement generated by the next DEFINE function is written to the generated .h file.

    An optional argument p means that the exported line will also be included in a .pxd file, see method generate_pxd().

An optional argument px or xp means the same as argument p. Then in addition we also write a comment # PYX <wrap pxi> into the .pxd file generated by method generate_pxd(). If function mmgroup.generate_c.pxd_to_pyx reads such a comment in a .pxd file, it will write a simple wrapper for the exported function into a .pyx file.

  • EXPORT_KWD <keyword>

    The directive:

    // %% EXPORT_KWD  FOO_API
    

    places the keyword FOO_API before any function or variable which is exported with the EXPORT or EXPORT_TABLE directive. This is useful on a MS Windows system for creating a dll. Then the header should contain a definition similar to:

    #define FOO_API __declspec(dllexport)
    

    This declares these functions or variables to be exported by the dll. For background, see e.g. https://gcc.gnu.org/wiki/Visibility.

  • EXPORT_TABLE (no arguments)

Parses the next input line for the C-name of a table, and exports this name into the generated .h header file. This is equivalent to two subsequent directives:

    EXPORT
    USE_TABLE
    

    There are, however, subtle differences, see directive EXPORT_KWD.

  • FOR <variable> in <parameter_list>

    Generate a sequence of blocks of statements, one block for each parameter in the parameter_list. A sequence of arbitrary lines follows, terminated by an END FOR directive. The parameter list is evaluated to a python object which should be iterable.

    Nesting of FOR and similar directives is possible.

    Examples:

    // %%FOR x in range(1,3)
    i%{x} += 2*%{x};
    // %%END FOR
    

    evaluates to:

    i1 += 2*1;
    i2 += 2*2;
    
  • GEN <ch>

    Directs output to .c or to .h file or to both, depending on whether the letter c or h is present in the first argument.

  • IF <expression>

    Conditional code generation depending on the boolean value of <expression>. Syntax is:

    // %%IF <expression>
    <statements>
    // %%ELSE IF <expression>
    <statements>
    // %%ELSE 
    <statements>
    // %%END IF
    

    with one or more optional ELSE IF clauses and at most one final ELSE clause. Nested IF and FOR blocks are possible.

  • INCLUDE_HEADERS

    This directive is legal in a header file with extension .h only. A typical header file may look like this:

    #ifndef  _THIS_HEADER_HAS_ALREADY_BEEN_PROCESSED_
    #define  _THIS_HEADER_HAS_ALREADY_BEEN_PROCESSED_
    
    // Some prototypes for C functions will be inserted here
    
    #endif
    

    The code generator may insert automatically generated prototypes for C functions into that header. You can tell the code generator where to insert these prototypes as follows:

    #ifndef  _THIS_HEADER_HAS_ALREADY_BEEN_PROCESSED_
    #define  _THIS_HEADER_HAS_ALREADY_BEEN_PROCESSED_
    
    // %%INCLUDE_HEADERS
    
    #endif
    
  • JOIN <infix>, <suffix>

    A FOR directive must follow a JOIN directive. The expressions generated by the following FOR directive are joined with the <infix> (which is usually an operator), and the <suffix> is appended to the whole generated expression. E.g.:

    a += 
    // %%JOIN*  " +", ";"
    // %%FOR* i in range(1,4)
      %{int:2*i} * table[%{i}]
    // %%END FOR
    

    evaluates to:

    a += 
      2 * table[1] + 
      4 * table[2] + 
      6 * table[3];
    
  • PY_DOCSTR <string>, <format>

    This directive is no longer supported

  • PYX <string>

    This directive is no longer supported. It formerly wrote the string <string> into the .pxd file, when generating a .pxd file. This had no effect on the generated .c or .h file; it could be used for automatically generating a .pyx file from a .pxd file.

  • TABLE <table>, <format>

    Enters all entries of the python array <table> into the C code as an array of constants with an optional <format>. Only the data of the array are written. Feasible <format> strings are given in function format_number().

  • USE_TABLE (no arguments)

    Parses the next input line for the C-name of a table and makes that C name available for user-defined directives.

    If a table is coded with the TABLE directive and preceded by a USE_TABLE or EXPORT_TABLE directive then the name of that table will be added to the dictionary names. The key with the python name of that table will have the C name of that table as its value.

    Consider the following example:

    1:    // %%USE_TABLE
    2:    int  C_table[]  = {
    3:       // %%TABLE python_table, uint32
    4:    } 
    

    Here 'python_table' must be a key in the dictionary tables passed to the constructor of class TableGenerator. The value for that key should be a list of 32-bit integers. Then the directive in line 3 creates the list of integer constants given by the list tables['python_table'].

    The USE_TABLE directive in line 1 has the following effect. In the sequel the entry names['python_table'] has the value 'C_table', which is the C name of the table that has been generated. So we may use that entry in subsequent directives or via string formatting with curly braces.

    The entries of table python_table are still available in the form python_table[i] as before. python_table[:] evaluates to the whole table as before.

  • WITH <variable> = <value>

    Temporarily assign <value> to <variable>. A sequence of arbitrary lines follows, terminated by an END WITH directive. Both <variable> and <value> may be tuples of equal length; a nested tuple <variable> is illegal. The <variable>, or the variables contained in the tuple <variable>, are temporarily added to the dictionary names as keys, so that they can be used via string formatting.

    Nesting of WITH and other directives is possible.

    Example:

    // %%WITH square, cube = int((x+1)**2), int((x+1)**3)
    i += %{cube} + %{square} + %{x};
    // %%END WITH
    

    Assuming x has value 9, this evaluates to:

    i += 1000 + 100 + 9;
    

String formatting

A line in the source file may contain a string s with balanced curly braces, preceded by a ‘%’ character. Then the standard .format() method is applied to the string s[1:] obtained from s by dropping the initial ‘%’ character. The dictionary names is passed to the .format() method as its keyword arguments.

E.g. in case names['x'] = y, the expressions %{x.z} and %{x[z]} refer to y.z and y[z]. Of course, the objects y.z or y[z] must exist and evaluate to strings which are meaningful in the generated C file.

For formatting, the dictionary names is an updated version of the dictionary tables passed in the constructor, as described in section Arguments of directives.

We consider this to be a reasonable compromise between the standard C syntax and the very powerful pythonic .format method for strings.

Using functions via string formatting

User-defined functions can be invoked inside C code via the string formatting operator %{function_name:arg1,arg2,...}. Then the function with name function_name is called with the given arguments, and the result of the function is substituted in the C code.

Such a user-defined function may be created with class UserFormat. The simplest way to create such a user-defined function from a function f is to create an entry:

"function_name" : UserFormat(f)

in the dictionary tables passed to the constructor. Then %{function_name:arg1,arg2,...} evaluates to str(f(arg1,arg2,...)). Here arguments arg1, arg2 must be given in python syntax and they are processed in the same way as the arguments of a directive described in section Arguments of directives. Then you may also write function_name(arg1,..) inside an argument of a directive or a user-defined function; this expression evaluates to f(arg1,...).

Class UserFormat also contains options to control the conversion of the result of function f to a string.

There are a few built-in string formatting functions, as given by dictionary built_in_formats in module generate_functions.

There are a few predefined functions for string formatting:

  • %{int:x} returns x as a decimal integer

  • %{hex:x} returns x as a hexadecimal integer

  • %{len:l} returns length of list l as a decimal int

  • %{str:x} returns str(x)

  • %{join:s,l} returns s.join(l) for string s and list l

See dictionary built_in_formats in module mmgroup.generate_c.generate_functions for details.

So, assuming that keys 'a' and 'b' have values 1 and 10 in the dictionary tables passed to the code generator, we may write e.g.:

y%{a} += %{int:3*b+1};

in the source file. This evaluates to:

y1 += 31;
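The mechanism behind this example can be imitated in plain python. Here IntFmt is a hypothetical stand-in for the object UserFormat(int) discussed above:

```python
import re

class IntFmt:
    # stand-in for UserFormat(int): its __format__ method evaluates
    # the format spec as a python expression over the given names
    def __init__(self, names):
        self.names = names
    def __format__(self, spec):
        return str(int(eval(spec, {}, self.names)))

names = {"a": 1, "b": 10}
objects = dict(names, int=IntFmt(names))

# drop the '%' escape and delegate to the standard .format method
line = "y%{a} += %{int:3*b+1};"
out = re.sub(r"%\{", "{", line).format(**objects)
```

Here out evaluates to "y1 += 31;", matching the example above.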

Quiet form of a directive

For directives such as IF, FOR, … there are also quiet variants IF*, FOR*, …, which perform the same operations, but without writing any comments about the directive into the output file. E.g. in:

// %%FOR i in range(50)
printf("%d\n", i);
// %%IF* i % 7 == 0 or i % 10 == 7
printf("seven\n");
// %%END IF
// %%END FOR

you probably do not want any messy comments about the IF directive in your generated code.

Historical remark

In versions up to 0.4 the string formatting operator was {xxx} instead of %{xxx}. This had the unpleasant effect that valid C expressions have a different meaning in the input and the output of the code generation process. This situation became unbearable after introducing doxygen for documentation of C code. Then expressions like \f$ x_{1,1} \f$ could not be coded in a reasonable way in the old version.

Classes and functions provided by the code generator

class mmgroup.generate_c.TableGenerator(tables={}, directives={}, verbose=False)

Automatically generate a .c file and a .h file from a source

Basic idea: We copy a source file to the .c file. The source contains directives to enter precomputed tables, code snippets or constants into the .c file, and prototypes into the .h file.

We copy the content of the source file directly to the .c file. In the source file we also parse certain directives and we replace them by automatically generated code snippets in the .c file and in the .h file.

The arguments given to the constructor of this class specify the meaning of the directives.

For a more detailed description of these directives see module make_c_tables_doc.py

Parameters:
  • tables

    Here tables is a dictionary with entries of shape:

    ``name`` : ``table`` .
    

    name is a string that is a valid python identifier. table is the python object referred to by that name whenever an argument is evaluated as a python expression.

  • directives

    Here directives is a dictionary of shape:

    ``function_name`` : ``function``  .
    

    The function function is executed when a directive of shape:

    // %%function_name
    

    occurs as a directive in the source file. The arguments following the directive are evaluated and passed to the function function. That function should return a string. This string is merged into the output .c file.

    If directives is mmgroup.generate_c.NoDirectives then built-in directives are not executed either.

  • verbose – optional, may be set True for verbose console output

External modules importing this class should use methods set_tables, generate, and table_size only.

generate(source, c_stream, h_stream=None, gen='c')

Generate a .c and a .h file from a source.

Here source must be an iterator that yields the lines of the source file. This may be a readable object of class _io.TextIOWrapper as returned by the built-in function open().

Parameters c_stream and h_stream must be either None or an instance of a class with a write method, e.g. a writable object of class _io.TextIOWrapper. Then the output to the c file and to the h file is written to c_stream and to h_stream.

class mmgroup.generate_c.UserDirective(f, eval_mode='', prefix=False)

Convert a function for use in a code generator directive

Each code generator directive corresponds to a function. Evaluating that function with the arguments given to the directive results in a string that makes up the code generated by the directive.

The list of arguments of a code generator directive is necessarily a string. Usually, this string is converted to a python expression which is evaluated using the dictionary ‘tables’ given to the code generator. Here dictionary tables contains the local variables used by the evaluation process.

The constructor of this class returns an object which is a callable function. Calling that function has the same effect as calling the function f, which is the first argument passed to the constructor.

Additional information for dealing with the arguments of the function may be given the constructor. This information is used by the code generator if an instance of this class is registered at the code generator as a directive. For registering a directive in that way you may simply enter an instance of this class as a value in the dictionary directives passed to the constructor of class TableGenerator.

Using this feature you may automatically convert arguments of the function e.g. to strings, integers or python objects. You may also prepend a local variable from dictionary tables to the list of user-specified arguments of the directive. The latter feature gives a directive the flavour of a member function of a class, similar to prepending self to the list of user-specified arguments.

Parameters:
  • f – A callable function which is executed when the code generator parses a user-defined directive. The arguments of such a directive are passed to function f as arguments.

  • eval_mode

    This is a string specifying the syntax of the arguments. The i-th character of the string eval_mode controls the evaluation of the i-th argument of function f as follows:

    • 'p' Evaluate the argument to a python object (default).

    • 'i' Evaluate the argument to an integer.

    • 's' Evaluate the argument to a string, ignoring the dictionary of local variables.

    If eval_mode is '.' then a single parameter containing just the given string is passed to function f as a single argument without any parsing with python.

  • prefix

    Describes an optional local variable of the code generator to be prepended to the list of arguments of the directive. It may be one of the following:

    • False: Nothing is prepended

    • True: The dictionary of all local variables is prepended

    • any string: The local variable with that name is prepended

class mmgroup.generate_c.UserFormat(f, eval_mode='', prefix=False, fmt=None, arg_fmt=None)

Convert a function for string formatting in the code generator

The code generator evaluates expressions in curly braces %{...} in the source code with the python .format() function for strings.

It is desirable that %{int:2*3} evaluates to a string representing the integer constant int(2*3) for some built-in and user-defined functions such as function int.

Therefore we need a python object with name int that behaves like the function int() when being called. That object should also have a __format__ method that returns str(int(<expression>)) if a string representing a valid python expression is passed to the __format__ method.

UserFormat(int) creates such an object. Thus:

tables = {"int": UserFormat(int)}
"{int:2*3}".format(**tables))

evaluates to str(int(2*3)). Of course, function int may be replaced by any user-defined function.
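The mechanism can be sketched with a minimal stand-in class that is callable like f and evaluates its format specification in its __format__ method. This is illustrative only; the real UserFormat additionally supports the argument parsing and prefixing described below.

```python
# Minimal stand-in illustrating the idea behind UserFormat; not the
# actual mmgroup implementation.
class FmtFunc:
    def __init__(self, f):
        self.f = f
    def __call__(self, *args):
        return self.f(*args)          # behaves like f when called
    def __format__(self, expression):
        # eval() of a format spec; acceptable here because a code
        # generator only processes its own trusted source files
        return str(self.f(eval(expression)))

tables = {"int": FmtFunc(int)}
print("{int:2*3}".format(**tables))   # prints 6
```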

Things get more interesting if we prepare a dictionary for the code generator:

from mmgroup.generate_c import TableGenerator, UserFormat
# Prepare tables for a code generator
tables = {"int": UserFormat(int),  "a": 3, "b":5}
# Create a code generator 'codegen'
codegen = TableGenerator(tables)
# Make a sample source line of C code
sample_source = "  x = %{int: a * b + int('1')};\n"
# generate file "test.c" containing the evaluated sample line
codegen.generate(sample_source, "test.c")

Here the string %{int: a * b + int('1')} in the sample source line is formatted with the standard python .format method. The .format method obtains the entries of the dictionary tables as keyword arguments.

So %{int: a * b + int('1')} evaluates to str(int(3 * 5 + 1)), and the C file "test.c" will contain the line:

x = 16;

So UserFormat(f) creates an object that behaves like f when it is called; and it returns str(f(arg1, arg2, ...)) when it is invoked in "{f:arg1,arg2,...}".format(f = f).

Here the argument list arg1, arg2, ... is processed according to the same rules as the argument list of a user-defined directive in class UserDirective.

Parameters:
  • f – A callable function which is executed when the code generator parses a user-defined formatting string in curly braces preceded by a ‘%’ character, such as %{...}. The arguments given in the formatting string are passed to function f as arguments.

  • eval_mode – This is a string specifying the syntax of the arguments in the same way as in class UserDirective.

  • prefix – Describes an optional local variable of the code generator prepended to the list of arguments of the function, in the same way as in class UserDirective.

  • fmt

    By default, function str() is applied to the result of function f in order to convert that result to a string. This behaviour can be modified by setting parameter fmt.

    if fmt is a callable function then str(f(*args)) is replaced by fmt(f(*args)).

  • arg_fmt – If this parameter is set then the corresponding .format method returns arg_fmt(*args) instead of str(f(*args)). This overrides parameter fmt.

mmgroup.generate_c.c_snippet(source, *args, **kwds)

Return a C code snippet as a string from a source string

Here source is a string that is interpreted in the same way as the text in a source file in method generate() of class TableGenerator. The function applies the code generator to the string source and returns the generated C code as a string.

All subsequent keyword arguments are treated as a dictionary and they are passed to the code generator in class TableGenerator in the same way as parameter tables in the constructor of that class.

One can also pass positional arguments to this function. In the source string they can be accessed as %{0}, %{1}, etc.

A line starting with // %% is interpreted as a directive as in class TableGenerator.

The keyword directives is reserved for passing a dictionary of directives as in class TableGenerator.

In order to achieve a similar effect as generating C code with:

tg = TableGenerator(tables, directives)
tg.generate(source_file, c_file)

you may code e.g.:

c_string = c_snippet(source, directives=directives, **tables) 

If this function cannot evaluate an expression of shape %{xxx} then that expression is left unchanged; so a subsequent code generation step may evaluate it. An unevaluated argument in a directive leads to an error.
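The substitution behaviour, including leaving unevaluable expressions unchanged for a later pass, can be sketched with a toy formatter. This is a simplification: the real c_snippet also supports the %{f:args} function syntax, positional arguments and directives.

```python
import re

# Toy sketch of the %{...} substitution performed by c_snippet;
# illustrative only, not the mmgroup implementation.
def mini_snippet(source, **tables):
    def repl(match):
        try:
            return str(eval(match.group(1), {}, tables))
        except Exception:
            return match.group(0)   # leave unchanged for a later pass
    return re.sub(r"%\{([^{}]+)\}", repl, source)

print(mini_snippet("x = %{a * b};", a=3, b=5))  # x = 15;
print(mini_snippet("y = %{undefined};"))        # y = %{undefined};
```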

mmgroup.generate_c.format_item(n, format=None)

Format a python object n to a C string for use in a C table

The default format is to apply function str() to the object.

The current version supports integer formats only. Parameter format must be one of the following strings:

'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64'

mmgroup.generate_c.make_doc()
mmgroup.generate_c.make_table(table, format=None)

Convert a python table to a C sequence separated by commas

Parameter table may be anything iterable. Function

format_item(x, format)

is applied to each item x in the table.

The output may contain several lines.
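The intended behaviour of format_item and make_table can be sketched in pure Python. This is only an approximation of the documented behaviour, not the mmgroup code; in particular, the line-wrapping width of 8 items is an arbitrary assumption.

```python
# Pure-Python sketch of format_item / make_table as documented above;
# not the mmgroup implementation.
INT_FORMATS = ("int8", "int16", "int32", "int64",
               "uint8", "uint16", "uint32", "uint64")

def format_item_sketch(n, format=None):
    if format is not None:
        assert format in INT_FORMATS    # only integer formats supported
    return str(n)                       # default: apply str()

def make_table_sketch(table, format=None):
    items = [format_item_sketch(x, format) for x in table]
    # wrap the comma-separated sequence; 8 items per line is an
    # arbitrary choice for this sketch
    lines = [", ".join(items[i:i + 8]) for i in range(0, len(items), 8)]
    return ",\n".join(lines)

print(make_table_sketch(range(5), "uint8"))   # 0, 1, 2, 3, 4
```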

mmgroup.generate_c.prepend_blanks(string, length)

Prepend length blank characters to each line of a string

Here parameter string is a string that may consist of several lines.

mmgroup.generate_c.pxd_to_pxi(pxd_file, module=None, translate=None, nogil=False)

Extract Cython wrappers from prototypes in a .pxd file

A .pxd file contains prototypes of external C functions and it may be included into a .pyx file in order to create a Cython module that uses these C functions.

This function returns a string containing python wrappers of the C functions in the pxd file. This string may be used as a part of an automatically generated .pxi file that can be included directly into a .pyx file. Here the C functions in the .pxd file to be included must be sufficiently simple as indicated below.

The returned string starts with two lines:

cimport cython 
cimport <module>

Here <module> is given by the parameter module. By default, module is constructed from the name pxd_file of the .pxd file.

The .pxd file is parsed for lines of shape:

<return_type> <function>(<type1> <arg1>,  <type2> <arg2> , ...)

Every such line is converted to a string coding a Cython wrapper of that function, of the following shape:

@cython.boundscheck(False)  # Deactivate bounds checking
@cython.wraparound(False)   # Deactivate negative indexing.
def <translated_function>(<arg1>, <arg2>, ...):
    cdef <type1> <arg1>_v_ = <arg1>
    cdef <type2> <arg2>_v_ = <arg2>
    cdef <return_type> ret_
    ret_ = <module>.<function>(<arg1>_v_, <arg2>_v_, ...)
    return ret_

Here <translated_function> is the string computed as translate(<function>) if the argument translate is given. Otherwise <translated_function> is equal to <function>.

<return_type> and <type1>, <type2>, ... must be valid types known to Cython and C. The types <type1>, <type2> used for arguments may also be pointers. In that case, pointers are converted to memory views, e.g.:

uint32_t <function>(uint8_t *a)

is converted to:

@cython.boundscheck(False)  # Deactivate bounds checking
@cython.wraparound(False)   # Deactivate negative indexing.
def <translated_function1>(a):
    cdef uint8_t[::1] a_v_ = a
    cdef uint32_t ret_
    ret_ = <function>(&a_v_[0])
    return ret_

The <return_type> may be void, but it must not be a pointer. Other types are not allowed.
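The conversion rules above, for the case of non-pointer arguments, can be sketched by a toy parser that turns one prototype line into a wrapper string. This is illustrative only; it ignores pointers, comments and the # PYX marker handled by the real function.

```python
import re

# Toy sketch of the prototype-to-wrapper conversion described above;
# handles non-pointer arguments only, not the mmgroup implementation.
def wrap_prototype(line, module="mod"):
    m = re.match(r"\s*(\w+)\s+(\w+)\s*\(([^)]*)\)", line)
    ret_type, func, arglist = m.groups()
    args = [a.strip().rsplit(" ", 1)
            for a in arglist.split(",") if a.strip()]
    out = ["def %s(%s):" % (func, ", ".join(n for _, n in args))]
    for typ, name in args:
        out.append("    cdef %s %s_v_ = %s" % (typ, name, name))
    out.append("    cdef %s ret_" % ret_type)
    out.append("    ret_ = %s.%s(%s)" %
               (module, func, ", ".join(n + "_v_" for _, n in args)))
    out.append("    return ret_")
    return "\n".join(out)

print(wrap_prototype("uint32_t f(uint32_t a, uint16_t b)"))
```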

We convert a function only if a line containing the string “# PYX” (possibly with leading blanks) precedes the declaration of that function. In the code generator you may use the EXPORT directive with option px to enter such a line into the source file.

If nogil is True, a C function is called as follows:

with nogil:
    ret_ = <function>(&a_v_[0])

Then the C function <function> must be declared, e.g., as follows:

cdef extern from "<file>.h" nogil:
    int <function> (int a, int *b)

This feature releases the GIL (Global Interpreter Lock) when calling the function.

mmgroup.generate_c.generate_pxd(pxd_out, h_in, pxd_in=None, h_name=None, nogil=False)

Create a .pxd file from a C header file.

Here parameter h_in is the name of a C header file that has usually been generated with method generate of class TableGenerator. In such a header file, some exported functions are preceded by a comment of shape:

// %%EXPORT <parameter>

Here <parameter> is a string of letters describing how a function is exported. The function is entered into the output .pxd file if <parameter> contains the letter 'p'.

Function generate_pxd creates a .pxd file with name pxd_out that contains information about these exported functions.

A .pxd file is used in Cython. Its content is:

cdef extern from <h_file_name>:
    <exported function 1>
    <exported function 2>
    ...

Here <h_file_name> is given by parameter h_name.

Parameters:
  • pxd_out – Name of the output .pxd file.

  • h_in – Name of the header file from which external functions are copied into the .pxd file as a cdef extern from statement.

  • pxd_in – An optional name of an input .pxd file to be copied into the output .pxd file in front of the cdef statement. If that parameter is a string containing a newline character \n then that string is copied directly into the output .pxd file.

  • h_name – Name of the .h file to be placed into the statement cdef extern from <h_file_name>.

  • nogil – Optional; if set, exported functions are declared as nogil.
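The scanning step of generate_pxd can be sketched as follows. This is a simplification that assumes each // %%EXPORT comment is immediately followed by a one-line prototype.

```python
# Toy sketch of the header scanning performed by generate_pxd; not
# the mmgroup implementation.
def scan_exports(header_text):
    protos, export = [], None
    for line in header_text.splitlines():
        s = line.strip()
        if s.startswith("// %%EXPORT"):
            export = s.split()[-1]      # the <parameter> letters
        elif export is not None:
            if "p" in export:           # letter 'p': enter into .pxd
                protos.append(s.rstrip(";"))
            export = None
    return protos

header = """\
// %%EXPORT px
uint32_t f(uint32_t a);
// %%EXPORT x
void g(void);
"""
print(scan_exports(header))             # ['uint32_t f(uint32_t a)']
```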

Support for Sphinx mockup feature when generating documentation

We use the Sphinx package on the Read the Docs server for generating documentation. On that server we do not want to compile .c files. However, we want the .c files to be present for extracting documentation from them, using the Doxygen tool. Here it suffices if the prototypes and comments relevant for documentation are present in the .c files.

This means that we have to perform the code generation process also on the documentation server. Note that some code generation procedures in the Tables classes in the python modules used for code generation rely on compiled Cython extensions. But Cython extensions will not be compiled on the documentation server. To avoid using compiled code, in each python module to be used for code generation we may add a MockupTables class to the Tables class already present. Such a MockupTables class may generate the parts of a .c file relevant for documentation only, without using any compiled Cython extensions.

We may set the --mockup option in the code generation tool generate_code.py. Then any Tables class in a python module to be used for code generation is replaced by the MockupTables class in that module, if present.
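The Tables / MockupTables convention can be sketched as follows. All names here except the attribute names tables and directives are hypothetical; the table entry and the mat24 attribute are invented for illustration only.

```python
# Illustrative sketch of the Tables / MockupTables convention; the
# class names, table names, and the mat24 attribute are hypothetical.
class Tables:
    directives = {}
    @property
    def tables(self):
        # relies on a compiled Cython extension, which is not
        # available on the documentation server
        from mmgroup import mat24
        return {"SOME_TABLE": mat24.MAT24_ORDER}  # hypothetical entry

class MockupTables:
    # good enough for generating the prototypes and comments needed
    # for documentation; uses no compiled code
    directives = {}
    tables = {"SOME_TABLE": 0}
```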

How the code generator is used

Warning!

At present the process of generating C code is under construction.

We plan to switch to the meson build system. Therefore a much more declarative style is required for describing build operations. Thus the description here is pretty much outdated!

For generating C code, an instance of class TableGenerator in module mmgroup.generate_c must be created. Its constructor takes two arguments tables and directives as described in section Code generation.

There are many tables and directives needed for different purposes. By convention we provide a variety of classes, each of which has two attributes tables and directives containing dictionaries of tables and directives, respectively. These dictionaries are suitable for creating instances of class TableGenerator. Such classes are called table-providing classes. Attributes tables and directives may also be implemented as properties.

Python scripts with names codegen_xxx.py in the root directory of the source distribution create instances of class TableGenerator that will generate C programs and also the corresponding header files. These python scripts are run as subprocesses of the build_ext command in the setup.py script.

Each instance of class TableGenerator may take the union of the dictionaries obtained from the attributes tables and directives of several table-providing classes. A simple example of a table-providing class is the class Lsbit24Function in module mmgroup.dev.mat24.mat24_aux.
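Combining several table-providing classes amounts to merging their dictionaries; the following sketch shows the idea. The provider classes here are toy examples, not actual mmgroup classes.

```python
# Sketch of merging table-providing classes into the two dictionaries
# expected by TableGenerator; ProviderA / ProviderB are toy examples.
def combine(*providers):
    tables, directives = {}, {}
    for p in providers:
        tables.update(p.tables)
        directives.update(p.directives)
    return tables, directives

class ProviderA:
    tables = {"TABLE_A": [1, 2, 3]}
    directives = {}

class ProviderB:
    tables = {"TABLE_B": [4, 5]}
    directives = {}

tables, directives = combine(ProviderA, ProviderB)
print(sorted(tables))                   # ['TABLE_A', 'TABLE_B']
```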

Description of some typical bugs

In this section we describe some typical bugs that may occur during the development process and that may be extremely hard to find.

1. Cleaning up a corrupted local git repository

After a long development session the local git repository might be corrupted for whatever reason. Here the best cure is to delete everything in the repository (except for subdirectory .git) and to check out everything with git. If you do not want to do so, you may clean up intermediate files as follows.

First open a command shell, switch to the root directory of your local repository, and make sure that your C compiler works in that shell. Then enter the following statements:

python cleanup.py -a
git checkout .
python setup.py  build_ext --inplace
python -m pytest ./src/mmgroup/ -v -s -m "not slow"

Then for creating the documentation you should enter something like the following (details depending on the operating system):

cd docs
sphinx-build -M html source build  -E -a
sphinx-build -M latexpdf source build  -E -a
cd ..

2. Documentation is created correctly on the local host but not on readthedocs

Whenever we update the master branch in git on the GitHub server, this automatically starts a process on the readthedocs server that updates the documentation. In that case the developer should check the documentation on the readthedocs server at

https://mmgroup.readthedocs.io/en/latest/index.html

If this documentation is incorrect then the developer should check the logfiles of the last generation of the documentation on readthedocs. For generating documentation, the readthedocs server starts a sequence of processes, and for each process the output is logged.

There is one type of bug that is difficult to find. If this bug occurs, then in the output of one of the processes that invokes python -m sphinx ... there may be an error message similar to the following message:

WARNING: autodoc: failed to import module ‘mm_group’ from module ‘mmgroup’; the following exception was raised:

No module named ‘mmgroup.mat24’

This message has to do with the mockup of python extensions when running the Sphinx documentation tool on the readthedocs server.

We do not use any C compiler on the readthedocs server, so python extensions written in Cython are not available there. When Sphinx runs, all python objects to be documented are imported so that their docstrings can be processed. Therefore all Cython extensions must be mocked up in Sphinx; this means that we tell Sphinx that these extensions are not available. For details see

https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#confval-autodoc_mock_imports

The Cython extension mat24 is not mocked up in Sphinx; instead, it is replaced by class mmgroup.dev.mat24.mat24_ref.Mat24.

So instead of simply importing the extension mat24 we have to do the following:

>>> try:
...     from mmgroup import mat24
... except (ImportError, ModuleNotFoundError):
...     from mmgroup.dev.mat24.mat24_ref import Mat24
...     mat24 = Mat24

More sophisticated examples of importing objects from module mat24 are given e.g. in modules mmgroup.mm_group and mmgroup.structures.autpl.

One way to debug such a problem on the local host (and not on the readthedocs server) is as follows. Use pip uninstall mmgroup (or similar) to uninstall the mmgroup package. Then execute the following commands in your shell:

python cleanup.py -a
git checkout .
export readthedocs=True  # in Windows use 'set readthedocs=True'
python setup.py  build_ext --inplace
cd docs
sphinx-build -M html source build  -E -a
sphinx-build -M latexpdf source build  -E -a
cd ..
# Here you should inspect the output of the sphinx-build
# commands for debugging
unset readthedocs  # in Windows use 'set readthedocs=' instead
# Reverse the effect of  'set readthedocs=True'

If you are lucky then the errors occurring on the readthedocs server will now also occur on your local host.

Warning

After typing that sequence into your shell, your environment is probably corrupted, and you should proceed as in subsection Cleaning up a corrupted local git repository.