International Association for Cryptologic Research

International Association
for Cryptologic Research

EUROCRYPT 2024

Cryptanalysis of rank-2 module-LIP in totally real number fields


README

#

This is the code used in the article

"Cryptanalysis of rank-2 module-LIP

in totally real number fields"

(https://eprint.iacr.org/2024/441.pdf)

#

This repository contains two folders, corresponding to two different
codes, both used in the article "Cryptanalysis of rank-2 module-LIP in
totally real number fields".

The folder `experiments_for_assumption' contains the code used to verify
experimentally the Assumption 1 from the article (see Appendix B of
the article).

The folder `attack' contains the code of the attack against rank-2
module-LIP in totally real number fields. The implementation is limited
to the free module OK^2, and maximal totally real fields of cyclotomics
of conductor divisible by 4 (see Appendix C in the article).

Detailed READMEs for each of the two parts of the code are available
in each folder. For convenience, we also copy-paste the two detailed
readmes below.

All the code from this repository is provided under the the GNU Affero
Public License (see the 'LICENSE.md' file and the files 'COPYRIGHT.txt'
in each folder).

#

#

Repository `attack'

#

#

This is a proof-of-concept code for solving module-LIP for rank 2
modules over totally real field. More precisely, the code is restricted
to the free modules OK^2, over the maximal totally real subfield K of
a cyclotomic fields L with conductor m divisible by 4. In particular,
no pseudo-objects are considered.

The structure of the attack follows the description of the submitted
article. Given a integer binary quadratic form of determinant 1 over the
maximal totally real subfield K of a cyclotomic field L=K(i), it computes
all the OK-congruence matrices from Q to the identity matrix. This is
done by reducing to a relative norm computation for prime, principal
OK-ideals Q, from which all the sums-of-two-squares Q = x^2+y^2 over OK
are deduced. The relative norm computation is handled through a version
of Gentry-Szydlo's algorithm. Recall that, given a Z-basis of an ideal
I in OL and the relative norm gg^* of a potential generator g for I,
it computes g or fail if I was not generated by g.

Our implementation does not aim for efficiency in larger dimension.

Softwares needed to run our code

We require Sagemath to run the code. It has been tested on sage's version
9.5 and more recent (but it may still work with older versions of sage).

Optionally, if you have a recent version of pari/gp installed on your
computer (version 2.16.x of after) and if your sage installation uses
an older version of pari/gp (2.15.x of before, this can be checked
by running 'pari.version()' in sage), you will be able to use your
more recent installation of pari/gp by changing some line in the file
'attack_Hawk_totally_real.sage'. More informations about how to do this
are provided in the later file. The point of using a more recent version
of pari/gp is to gain 20/30% on the running time in large dimensions.

How to run the attack

To run the attack on a randomly generated instance, simply run
Sagemath in a terminal in the current repository, and load the file
'attack_Hawk_totally_real.sage' by typing

sage: load("attack_Hawk_totally_real.sage")

By default, this will run the attack on a random instance in a number
field of conductor 64, and should take less than 10 seconds. If you want
to run the attack in number fields of larger dimension, you can change
the value of m inside the file 'attack_Hawk_totally_real.sage'. Note that
we can only guarantee the behaviour of our implementation for maximal
totally real subfield of cyclotomic fields with conductor m=4k.

By "random instances", we mean that the coefficients of the polynomials
in the first column of (the secret basis) B are drawn uniformly at random
in a small interval (-bo, bo). This interval can be increased by the
user: change '3' in FakeHawkKeyGen parameters to a larger value. Note
that even if 'bo=3' the second column is very unlikely to be small,
and that in turn Q is very unlikely to have small-ish entries.

The current code generates its own random instances. If you want to
supply your own generated instances, you can: 1) comment line 160: 'B =
FakeHawkKeyGen(K, B_OK, MC_, 3,Hawk=True)' 2) B should be a 2 x 2 matrix,
with entries that can be interpreted as elements of OK by Sage, and its
determinant should be 1. Replace the commented line by any instructions
that can provide this data in this format.

To generate B with not-too-large entries, we use a specific basis of
OK, the integer ring of K. We describe how to derive this basis in the
Appendices of the full version of the paper. If you want to work with a
different basis, you can: 1) comment line 152: 'B_OK, MC_ = NiceOKBasis(K,
L)' 2) Let d = euler_phi(m). Then B_OK is a d x d matrix whose rows are
the coordinates of the corresponding element of K, seen as a polynomial
(e.g., the power basis gives B_OK=Id), and MC_ is its inverse. Replace
the commented line by any instructions that can provide this data in
this format.

Memory usage

Running the attack may require some significant amount of
memory in large dimension. By default, the code allocates 5GB
of memory for the computation. You can modify this value in the
file 'attack_Hawk_totally_real.sage', by modifying the variable
'pari_size_max'. (If your computation gets killed, or if you get an
error with message "pari stack overflows", you may want to re-run the
computation with a larger amount of allocated memory).

The bound of 5GB (allocated by the 'pari_size_max' quantity) is an upper
bound on the memory that will be used by pari. By default, pari will use
less memory and increase its stack only when it is needed. When it does
so, if in verbose mode, it prints a warning of the form
"Warning: not enough memory, new PARI stack XXX"
This is normal, and it will be printed (in verbose mode) whenever pari
needs to increase its stack.

Temporary files

Depending on your version of sage (and whether you use your own
installation of pari/gp), the code may create temporary files, stored
in a repository `GentrySzydlo_temporary_files'. You may safely erase
this repository once the computation is over.

Output repository

If you want to manually check that the attack was indeed successful, you
can have a look at the files 'secret_basis.sage' and 'result_attack.sage'
in the repository "output". The file 'secret_basis.sage' contains
the (random) secret basis that was used to generate the module-LIP
instance. The file 'result_attack.sage' contains the four possible
bases that were computed by the attack. One can check manually (at least
when the dimension is small) that the file 'result_attack.sage' indeed
contains the secret basis from 'secret_basis.sage'. You can also safely
erase this repository if you want to.

Description of the various files

The archive contains 12 files:

*** Functions related to Gentry-Szydlo's algorithm: These can be found
in the eponymous directory

*** Functions related to two-square decompositions These are a level
above Gentry_Szydlo.

--- RelativeNormEquation: computes the solutions of an equation
q=xx^*. Currently, it works for the maximal totally real subfield
of cyclotomic fields of conductor m=4k. It can be modified to
handle more general cyclotomic fields, at the cost of not using
Gentry-Szydlo's and class group computation instead. For this reason,
in this case it should only be used over small conductors. A flag
"prime" is added to the signature, defaulted to False. If prime=True,
it avoids refactoring the absolute norm of the input ideals and
directly factor a degree 2 polynomial over OK, which is the standard
behaviour during the attack later. It calls CandidateIdeals and if
prime=False, it also calls PrimesAbove.

--- CandidateIdeals: In the general case, this is the bottleneck of
the algorithm. Given the list of primes above qOL, compute all
possible products of prime dividing qOK that could lead to a
solution for the norm equation, then find the principal ones using
Gentry-Szydlo (or class group computation is m!=4k). A flag "prime"
is added to the signature, defaulted to False. If prime=True, only
one Gentry-Szydlo is necessary as in this case, the prime ideal is
either inert or split in two conjugates.

--- PrimesAbove: is dedicated to factor qOK and to sort primes
depending of them being split or inert. Mostly used for the general
relative norm problem.

--- GenerateInstance: to test the code. Can generate relative norms
equations as well as sums of squares. Two flags can be used: "with
sol" creates a relative norm equation instance, that is usually
easy to solve (when m<80); and "two squares" which is much longer
for large m, because of the norm factorization.

*** Functions related to the attack itself

--- AdditionalStructures: the totally real subfield has a basis which
has better geometry than the power basis. It is used to get smaller
instances (see next function).

--- FakeHawkKeyGen: mimics Cohen's "idempotent algorithm" (sic), which
is also known as Naive-NTRUSolve. This algorithm serves as completing
a vector of OK^2 (or other free modules) into a basis via a HNF
computation. It can be simplified using algebraic norm instead, but
it will not reduce the intermediate outputs. The last step is the
usual size-reduction, which does not reduce so much for totally real
subfields. Using the better basis gives mildly better results.

#

#

Repository

`experiments_for_assumption'

#

#

This folder contains the code used in the article "Cryptanalysis
of rank-2 module-LIP in totally real number fields"
(https://eprint.iacr.org/2024/441.pdf) to verify experimentally
Assumption 1. Description of the experiments and the results is provided
in Appendix B in the article. The instructions below will help you run
the code and reproduce the experiments.

Running the experiments

To run these experiments, you need Sagemath version 9 or more (i.e.,
using python 3).

To reproduce the experiments and figures described in the article,
open a terminal and run

sage run_tests.sage

Warning: this will use 3 cores of your computer, and require up to 6
Go of RAM at its maximum. This will run for approximately 2 days if you
want to run the experiments for all the fields tested in the article. If
you want to use less processors or perform less tests, you can do so by
adding some options, as explained below.

Randomness: the random seed has been fixed to allow reproducibility
of the results. This can also be removed by using some option (see below).

Options

When running the sage script run_tests.sage, one can use the following
options:

-np (or --nbproc) N

where N is an integer > 0. This will run the tests with N parallel
processors (the default value if no option is passed to the script is
N = 3).

-rs (or --randomseed) random

if this option is given to the script, then the random seed will not be
fixed (the default value is "-rs fixed", in which case the random seed
is fixed to 42)

-i (or --input) filename.csv

where filename.csv is a csv file containing the list of tests to be run
by the scripts. The format of the csv file should be: - a first line
written: m,random,verbose - then each line should be a triple of the
form: m,random,verbose where m is an integer and random and verbose
are either True or False
(e.g., a line can be: 10,True,False) each line will generate a
test in a cyclotomic field of conductor m, using random modules
if random = True, or only the module OK^2 if random = False. The
program running the test will print some data if verbose = True
(otherwise no information will be printed).
An example of csv file that can be given as an argument is available in
`input_for_article.csv' (this contains all tests run in the article).
The default value of this option is None, in which case the script will
generate the list of tests as in the article.

Examples: you can run in a terminal

sage run_tests.sage -np 2 -rs random -i input_for_article.csv

This will run the tests provided in file input_for_article.csv (i.e., all
tests) using 2 parallel processors, and without fixing the random seed.

Results

The script will produce files with name output_modules_m.csv' oroutput_OK2_m.csv' (for various m's) in the folder results', and plots in the folderresults/figures'.

  1. Each file in the folder "results" corresponds to a cyclotomic number
    field, whose conductor is m. The files with name output_modules_n.csv' contain the data for randomly generated free modules, and the files with nameoutput_OK2_m.csv' contain the data for the rank-2 module O_K^2.

The files output_modulesm.csv' oroutput_OK2_m.csv' are in csv
format. Each row of those files (except the first one) corresponds to
a module M, and the data of the row is of the form
m,d,nb_tests, rho_K, everything_went_fine, P_empirical', P_expected', s'
where: - m is the conductor of the cyclotomic field L (recall that K
is the totally real subfield of a cyclotomic field) - d is the degree
of K - nb_tests is the number of random vectors (z1, z2) generated in
each module - rho_K is the residue of the zeta function of K at 1 -
everything_went_fine is a boolean, which is True if no warnings occured
during the execution of the test - P_empirical' and P_expected' are
the quantities P_empirical and P_expected from the article multiplied by
nb_tests. - s' = s/norm(I)^{1/d} where s is the standard deviation of the
Gaussian distribution used to sample (z1, z2) and I is the Gram ideal of M

  1. The figures poly_full.png' andratios_full.png' are the same as the
    one displayed in the article (Fig. 1 and 2). The folder figures' also contains a filepoly_modules.png', which shows the results of Figure 2
    only in the case where we generated 10 (or 20) random modules per number
    field. The color indicates modules in a same number field: the same
    color means that the two modules are over the same number field (since
    there may be different number fields with the same degree). The file
    poly_OK.png' shows the same thing, but in the case where we considered only the module O_K^2. Finally, the filesratios_modules.png' and
    `ratios_OK.png' are similar, but correspond to Figure 1 in the article.

Organization of the code

All the functions used for the experiments and for plotting the
results are in the file functions.sage'. The filerun_tests.sage'
is the main script, which generates the data (in the folder "results")
for all the number fields, then load this data and plot it (in the
folder "results/figures"). The file run_tests.sage' loads the filefunctions.sage'.

There is also an example file input_for_article.csv', in csv format, which contains the list of all tests performed in the article. This file is not needed to run the scriptrun_tests.sage', but can serve as an
example to someone willing to generate their own tests.

Getting the data without running the experiments

The folder `data_for_the_article' contains the data that should be
obtained when running the experiments (as described above), and which
we used in the article. (It is not needed to run the experiments.)