EUROCRYPT 2024
Cryptanalysis of rank-2 module-LIP in totally real number fields
README
#
This is the code used in the article
"Cryptanalysis of rank-2 module-LIP
in totally real number fields"
(https://eprint.iacr.org/2024/441.pdf)
#
This repository contains two folders, corresponding to two different
codes, both used in the article "Cryptanalysis of rank-2 module-LIP in
totally real number fields".
The folder `experiments_for_assumption' contains the code used to verify
experimentally the Assumption 1 from the article (see Appendix B of
the article).
The folder `attack' contains the code of the attack against rank-2
module-LIP in totally real number fields. The implementation is limited
to the free module OK^2, and maximal totally real fields of cyclotomics
of conductor divisible by 4 (see Appendix C in the article).
Detailed READMEs for each of the two parts of the code are available
in each folder. For convenience, we also copy-paste the two detailed
readmes below.
All the code from this repository is provided under the the GNU Affero
Public License (see the 'LICENSE.md' file and the files 'COPYRIGHT.txt'
in each folder).
#
#
Repository `attack'
#
#
This is a proof-of-concept code for solving module-LIP for rank 2
modules over totally real field. More precisely, the code is restricted
to the free modules OK^2, over the maximal totally real subfield K of
a cyclotomic fields L with conductor m divisible by 4.  In particular,
no pseudo-objects are considered.
The structure of the attack follows the description of the submitted
article. Given a integer binary quadratic form of determinant 1 over the
maximal totally real subfield K of a cyclotomic field L=K(i), it computes
all the OK-congruence matrices from Q to the identity matrix. This is
done by reducing to a relative norm computation for prime, principal
OK-ideals Q, from which all the sums-of-two-squares Q = x^2+y^2 over OK
are deduced. The relative norm computation is handled through a version
of Gentry-Szydlo's algorithm. Recall that, given a Z-basis of an ideal
I in OL and the relative norm gg^* of a potential generator g for I,
it computes g or fail if I was not generated by g.
Our implementation does not aim for efficiency in larger dimension.
Softwares needed to run our code
We require Sagemath to run the code. It has been tested on sage's version
9.5 and more recent (but it may still work with older versions of sage).
Optionally, if you have a recent version of pari/gp installed on your
computer (version 2.16.x of after) and if your sage installation uses
an older version of pari/gp (2.15.x of before, this can be checked
by running 'pari.version()' in sage), you will be able to use your
more recent installation of pari/gp by changing some line in the file
'attack_Hawk_totally_real.sage'. More informations about how to do this
are provided in the later file. The point of using a more recent version
of pari/gp is to gain 20/30% on the running time in large dimensions.
How to run the attack
To run the attack on a randomly generated instance, simply run
Sagemath in a terminal in the current repository, and load the file
'attack_Hawk_totally_real.sage' by typing
sage: load("attack_Hawk_totally_real.sage")
By default, this will run the attack on a random instance in a number
field of conductor 64, and should take less than 10 seconds. If you want
to run the attack in number fields of larger dimension, you can change
the value of m inside the file 'attack_Hawk_totally_real.sage'. Note that
we can only guarantee the behaviour of our implementation for maximal
totally real subfield of cyclotomic fields with conductor m=4k.
By "random instances", we mean that the coefficients of the polynomials
in the first column of (the secret basis) B are drawn uniformly at random
in a small interval (-bo, bo). This interval can be increased by the
user: change '3' in FakeHawkKeyGen parameters to a larger value. Note
that even if 'bo=3' the second column is very unlikely to be small,
and that in turn Q is very unlikely to have small-ish entries.
The current code generates its own random instances. If you want to
supply your own generated instances, you can: 1) comment line 160: 'B =
FakeHawkKeyGen(K, B_OK, MC_, 3,Hawk=True)' 2) B should be a 2 x 2 matrix,
with entries that can be interpreted as elements of OK by Sage, and its
determinant should be 1. Replace the commented line by any instructions
that can provide this data in this format.
To generate B with not-too-large entries, we use a specific basis of
OK, the integer ring of K. We describe how to derive this basis in the
Appendices of the full version of the paper. If you want to work with a
different basis, you can: 1) comment line 152: 'B_OK, MC_ = NiceOKBasis(K,
L)' 2) Let d = euler_phi(m). Then B_OK is a d x d matrix whose rows are
the coordinates of the corresponding element of K, seen as a polynomial
(e.g., the power basis gives B_OK=Id), and MC_ is its inverse. Replace
the commented line by any instructions that can provide this data in
this format.
Memory usage
Running the attack may require some significant amount of
memory in large dimension. By default, the code allocates 5GB
of memory for the computation.  You can modify this value in the
file 'attack_Hawk_totally_real.sage', by modifying the variable
'pari_size_max'.  (If your computation gets killed, or if you get an
error with message "pari stack overflows", you may want to re-run the
computation with a larger amount of allocated memory).
The bound of 5GB (allocated by the 'pari_size_max' quantity) is an upper
bound on the memory that will be used by pari. By default, pari will use
less memory and increase its stack only when it is needed. When it does
so, if in verbose mode, it prints a warning of the form
   "Warning: not enough memory, new PARI stack XXX"
This is normal, and it will be printed (in verbose mode) whenever pari
needs to increase its stack.
Temporary files
Depending on your version of sage (and whether you use your own
installation of pari/gp), the code may create temporary files, stored
in a repository `GentrySzydlo_temporary_files'. You may safely erase
this repository once the computation is over.
Output repository
If you want to manually check that the attack was indeed successful, you
can have a look at the files 'secret_basis.sage' and 'result_attack.sage'
in the repository "output". The file 'secret_basis.sage' contains
the (random) secret basis that was used to generate the module-LIP
instance. The file 'result_attack.sage' contains the four possible
bases that were computed by the attack. One can check manually (at least
when the dimension is small) that the file 'result_attack.sage' indeed
contains the secret basis from 'secret_basis.sage'.  You can also safely
erase this repository if you want to.
Description of the various files
The archive contains 12 files:
- this README.txt 
- LICENSE.txt contains the description of the licence under which we 
 distribute this code (GNU AFFERO GENERAL PUBLIC LICENSE, version 3).
*** Functions related to Gentry-Szydlo's algorithm: These can be found
in the eponymous directory
- LLL_gram_pari.gp: this short pari script reads the matrix G produced 
 by the sage function implicit_reduction_pari, reduce it using pari's
 LLL algorithm, and writes the output in a file, to be read back by
 the function implicit_reduction_pari.
- implicit_reduction.sage: the function in this file computes the 
 implicit reduction part of the Gentry-Szydlo algorithm. Namely,
 given as input a principal ideal I = vOL (where OL is the ring of
 integers of a CM field L) and the relative norm x = v\bar(v) of the
 generator, the algorithm outputs an implicitely reduced basis (w1,
 ..., wd) of I (d = degree(L)), i.e., (w1, ..., wd) is a basis of I,
 and wi = v*ai with ai in OL such that (a1, ..., ad) is LLL-reduced
 (note: the ai are a priori unknown, since v is not supposed to be
 known). This file may use LLL_gram_pari.gp.
- nf_root.sage: This file contains a function to compute a r-th root 
 of an element in a number field. Depending on the version of pari
 available, use the fastest possible function (we gain a factor 2 to
 3 by using the more recent function of pari). This is the last step
 of Gentry-Szydlo.
- estimate_size_prime.sage: functions to estimate the size of the 
 primes P1, P2 and Q needed when running the Gentry-Szydlo
 algorithm. These are involved for a Fermat-like use, and are derived
 either conservatively or heuristically. If the selection is not done
 conservatively, the algorithm may fail sometimes.
- polynomial_chain.sage: the first step in the algorithm is to compute 
 the power of an ideal, I^e, for some possibly large e. This cannot be
 done naively, and this deferred to this file for clarity. At a high
 level, the exponentiation is done via a square-and-multiply approach,
 but as we are handling Z-bases, LLL reduction is performed at each
 step to maintain a reasonable size of the description. Besides,
 we use the Gram version of LLL (acting on the Gram matrices) as we
 perform a division by a totally real element at each step --- see
 also the description of [EFK CRYPTO 2020] --- to further control the
 sizes. Dependencies: implicit_reduction.sage
- Gentry_Szydlo.sage: the general description of the 
 algorithm. Dependencies: polynomial_chain.sage, nf_root.sage,
 estimate_size_prime.sage.
- test_Gentry_Szydlo.sage: a file to test in a standalone way the 
 behaviour of the implemented Gentry-Szydlo's algorithm. Dependencies:
 Gentry_Szydlo.sage.
*** Functions related to two-square decompositions These are a level
above Gentry_Szydlo.
- two_square.sage: a code file in Sagemath containing the necessary 
 functions for the TwoSquares algorithm. Dependencies: Gentry_Szydlo.sage- The main functions are: 
 --- TwoSquareFromRelativeNorm: computes the
 two-squares decompositions from a set of solutions of an equation q=
 xx^*. Additional functions for conversion are needed, because of a
 weird precision bug in pari when the conductors grows. This is done
 by a parity check of the relative trace.
--- RelativeNormEquation: computes the solutions of an equation
    q=xx^*. Currently, it works for the maximal totally real subfield
    of cyclotomic fields of conductor m=4k. It can be modified to
    handle more general cyclotomic fields, at the cost of not using
    Gentry-Szydlo's and class group computation instead. For this reason,
    in this case it should only be used over small conductors. A flag
    "prime" is added to the signature, defaulted to False. If prime=True,
    it avoids refactoring the absolute norm of the input ideals and
    directly factor a degree 2 polynomial over OK, which is the standard
    behaviour during the attack later.  It calls CandidateIdeals and if
    prime=False, it also calls PrimesAbove.
--- CandidateIdeals: In the general case, this is the bottleneck of
    the algorithm. Given the list of primes above qOL, compute all
    possible products of prime dividing qOK that could lead to a
    solution for the norm equation, then find the principal ones using
    Gentry-Szydlo (or class group computation is m!=4k).  A flag "prime"
    is added to the signature, defaulted to False. If prime=True, only
    one Gentry-Szydlo is necessary as in this case, the prime ideal is
    either inert or split in two conjugates.
--- PrimesAbove: is dedicated to factor qOK and to sort primes
    depending of them being split or inert. Mostly used for the general
    relative norm problem.
--- GenerateInstance: to test the code. Can generate relative norms
    equations as well as sums of squares. Two flags can be used: "with
    sol" creates a relative norm equation instance, that is usually
    easy to solve (when m<80); and "two squares" which is much longer
    for large m, because of the norm factorization.
*** Functions related to the attack itself
- congruence.sage: this contains the two last functions to run the 
 attack. FindPrimeNorm outputs a prime in OK of the form x^2+y^2 =
 (u,v)Q(u,v)^t for some small known u,v. This is then processed into
 CongruenceSolver, who uses two such vectors and primes to compute all
 the congruence matrices from Q to I_2. Currently limited to free module
 of determinant 1. Dependencies: two_square.sage
- attack_Hawk_totally_real.sage: generates test instances and run the 
 whole attack on them. Notably, it emulates Hawk's keygen for the
 maximal totally real subfield of a cyclotomic, without trying to
 be efficient nor guaranteeing a small output. This is mostly for
 proof-of-concept. Dependencies: congruence.sage
--- AdditionalStructures: the totally real subfield has a basis which
 has better geometry than the power basis. It is used to get smaller
 instances (see next function).
--- FakeHawkKeyGen: mimics Cohen's "idempotent algorithm" (sic), which
 is also known as Naive-NTRUSolve. This algorithm serves as completing
 a vector of OK^2 (or other free modules) into a basis via a HNF
 computation. It can be simplified using algebraic norm instead, but
 it will not reduce the intermediate outputs. The last step is the
 usual size-reduction, which does not reduce so much for totally real
 subfields. Using the better basis gives mildly better results.
#
#
Repository
`experiments_for_assumption'
#
#
This folder contains the code used in the article "Cryptanalysis
of rank-2 module-LIP in totally real number fields"
(https://eprint.iacr.org/2024/441.pdf) to verify experimentally
Assumption 1.  Description of the experiments and the results is provided
in Appendix B in the article.  The instructions below will help you run
the code and reproduce the experiments.
Running the experiments
To run these experiments, you need Sagemath version 9 or more (i.e.,
using python 3).
To reproduce the experiments and figures described in the article,
open a terminal and run
sage run_tests.sage
Warning: this will use 3 cores of your computer, and require up to 6
Go of RAM at its maximum. This will run for approximately 2 days if you
want to run the experiments for all the fields tested in the article. If
you want to use less processors or perform less tests, you can do so by
adding some options, as explained below.
Randomness: the random seed has been fixed to allow reproducibility
of the results. This can also be removed by using some option (see below).
Options
When running the sage script run_tests.sage, one can use the following
options:
-np (or --nbproc) N
where N is an integer > 0. This will run the tests with N parallel
processors (the default value if no option is passed to the script is
N = 3).
-rs (or --randomseed) random
if this option is given to the script, then the random seed will not be
fixed (the default value is "-rs fixed", in which case the random seed
is fixed to 42)
-i (or --input) filename.csv
where filename.csv is a csv file containing the list of tests to be run
by the scripts. The format of the csv file should be: - a first line
written: m,random,verbose - then each line should be a triple of the
form: m,random,verbose     where m is an integer and random and verbose
are either True or False
    (e.g., a line can be: 10,True,False) each line will generate a
    test in a cyclotomic field of conductor m, using random modules
    if random = True, or only the module OK^2 if random = False. The
    program running the test will print some data if verbose = True
    (otherwise no information will be printed).
An example of csv file that can be given as an argument is available in
`input_for_article.csv' (this contains all tests run in the article).
The default value of this option is None, in which case the script will
generate the list of tests as in the article.
Examples: you can run in a terminal
sage run_tests.sage -np 2 -rs random -i input_for_article.csv
This will run the tests provided in file input_for_article.csv (i.e., all
tests) using 2 parallel processors, and without fixing the random seed.
Results
The script will produce files with name output_modules_m.csv' oroutput_OK2_m.csv' (for various m's) in the folder results', and plots
in the folderresults/figures'.
- Each file in the folder "results" corresponds to a cyclotomic number
 field, whose conductor is m. The files with nameoutput_modules_n.csv' contain the data for randomly generated free modules, and the files with nameoutput_OK2_m.csv' contain the data for the rank-2 module O_K^2.
The files output_modulesm.csv' oroutput_OK2_m.csv' are in csv
format. Each row of those files (except the first one) corresponds to
a module M, and the data of the row is of the form
  m,d,nb_tests, rho_K, everything_went_fine, P_empirical', P_expected', s'
where: - m is the conductor of the cyclotomic field L (recall that K
is the totally real subfield of a cyclotomic field) - d is the degree
of K - nb_tests is the number of random vectors (z1, z2) generated in
each module - rho_K is the residue of the zeta function of K at 1 -
everything_went_fine is a boolean, which is True if no warnings occured
during the execution of the test - P_empirical' and P_expected' are
the quantities P_empirical and P_expected from the article multiplied by
nb_tests.  - s' = s/norm(I)^{1/d} where s is the standard deviation of the
Gaussian distribution used to sample (z1, z2) and I is the Gram ideal of M
- The figures poly_full.png' andratios_full.png' are the same as the
 one displayed in the article (Fig. 1 and 2). The folderfigures' also contains a filepoly_modules.png', which shows the results of Figure 2
 only in the case where we generated 10 (or 20) random modules per number
 field. The color indicates modules in a same number field: the same
 color means that the two modules are over the same number field (since
 there may be different number fields with the same degree). The file
 poly_OK.png' shows the same thing, but in the case where we considered only the module O_K^2. Finally, the filesratios_modules.png' and
 `ratios_OK.png' are similar, but correspond to Figure 1 in the article.
Organization of the code
All the functions used for the experiments and for plotting the
results are in the file functions.sage'. The filerun_tests.sage'
is the main script, which generates the data (in the folder "results")
for all the number fields, then load this data and plot it (in the
folder "results/figures"). The file run_tests.sage' loads the filefunctions.sage'.
There is also an example file input_for_article.csv', in csv format,
which contains the list of all tests performed in the article. This file
is not needed to run the scriptrun_tests.sage', but can serve as an
example to someone willing to generate their own tests.
Getting the data without running the experiments
The folder `data_for_the_article' contains the data that should be
obtained when running the experiments (as described above), and which
we used in the article. (It is not needed to run the experiments.)
