EUROCRYPT 2024
Cryptanalysis of rank-2 module-LIP in totally real number fields
README
#
This is the code used in the article
"Cryptanalysis of rank-2 module-LIP
in totally real number fields"
(https://eprint.iacr.org/2024/441.pdf)
#
This repository contains two folders, corresponding to two different
codes, both used in the article "Cryptanalysis of rank-2 module-LIP in
totally real number fields".
The folder `experiments_for_assumption' contains the code used to verify
experimentally the Assumption 1 from the article (see Appendix B of
the article).
The folder `attack' contains the code of the attack against rank-2
module-LIP in totally real number fields. The implementation is limited
to the free module OK^2, and maximal totally real fields of cyclotomics
of conductor divisible by 4 (see Appendix C in the article).
Detailed READMEs for each of the two parts of the code are available
in each folder. For convenience, we also copy-paste the two detailed
readmes below.
All the code from this repository is provided under the the GNU Affero
Public License (see the 'LICENSE.md' file and the files 'COPYRIGHT.txt'
in each folder).
#
#
Repository `attack'
#
#
This is a proof-of-concept code for solving module-LIP for rank 2
modules over totally real field. More precisely, the code is restricted
to the free modules OK^2, over the maximal totally real subfield K of
a cyclotomic fields L with conductor m divisible by 4. In particular,
no pseudo-objects are considered.
The structure of the attack follows the description of the submitted
article. Given a integer binary quadratic form of determinant 1 over the
maximal totally real subfield K of a cyclotomic field L=K(i), it computes
all the OK-congruence matrices from Q to the identity matrix. This is
done by reducing to a relative norm computation for prime, principal
OK-ideals Q, from which all the sums-of-two-squares Q = x^2+y^2 over OK
are deduced. The relative norm computation is handled through a version
of Gentry-Szydlo's algorithm. Recall that, given a Z-basis of an ideal
I in OL and the relative norm gg^* of a potential generator g for I,
it computes g or fail if I was not generated by g.
Our implementation does not aim for efficiency in larger dimension.
Softwares needed to run our code
We require Sagemath to run the code. It has been tested on sage's version
9.5 and more recent (but it may still work with older versions of sage).
Optionally, if you have a recent version of pari/gp installed on your
computer (version 2.16.x of after) and if your sage installation uses
an older version of pari/gp (2.15.x of before, this can be checked
by running 'pari.version()' in sage), you will be able to use your
more recent installation of pari/gp by changing some line in the file
'attack_Hawk_totally_real.sage'. More informations about how to do this
are provided in the later file. The point of using a more recent version
of pari/gp is to gain 20/30% on the running time in large dimensions.
How to run the attack
To run the attack on a randomly generated instance, simply run
Sagemath in a terminal in the current repository, and load the file
'attack_Hawk_totally_real.sage' by typing
sage: load("attack_Hawk_totally_real.sage")
By default, this will run the attack on a random instance in a number
field of conductor 64, and should take less than 10 seconds. If you want
to run the attack in number fields of larger dimension, you can change
the value of m inside the file 'attack_Hawk_totally_real.sage'. Note that
we can only guarantee the behaviour of our implementation for maximal
totally real subfield of cyclotomic fields with conductor m=4k.
By "random instances", we mean that the coefficients of the polynomials
in the first column of (the secret basis) B are drawn uniformly at random
in a small interval (-bo, bo). This interval can be increased by the
user: change '3' in FakeHawkKeyGen parameters to a larger value. Note
that even if 'bo=3' the second column is very unlikely to be small,
and that in turn Q is very unlikely to have small-ish entries.
The current code generates its own random instances. If you want to
supply your own generated instances, you can: 1) comment line 160: 'B =
FakeHawkKeyGen(K, B_OK, MC_, 3,Hawk=True)' 2) B should be a 2 x 2 matrix,
with entries that can be interpreted as elements of OK by Sage, and its
determinant should be 1. Replace the commented line by any instructions
that can provide this data in this format.
To generate B with not-too-large entries, we use a specific basis of
OK, the integer ring of K. We describe how to derive this basis in the
Appendices of the full version of the paper. If you want to work with a
different basis, you can: 1) comment line 152: 'B_OK, MC_ = NiceOKBasis(K,
L)' 2) Let d = euler_phi(m). Then B_OK is a d x d matrix whose rows are
the coordinates of the corresponding element of K, seen as a polynomial
(e.g., the power basis gives B_OK=Id), and MC_ is its inverse. Replace
the commented line by any instructions that can provide this data in
this format.
Memory usage
Running the attack may require some significant amount of
memory in large dimension. By default, the code allocates 5GB
of memory for the computation. You can modify this value in the
file 'attack_Hawk_totally_real.sage', by modifying the variable
'pari_size_max'. (If your computation gets killed, or if you get an
error with message "pari stack overflows", you may want to re-run the
computation with a larger amount of allocated memory).
The bound of 5GB (allocated by the 'pari_size_max' quantity) is an upper
bound on the memory that will be used by pari. By default, pari will use
less memory and increase its stack only when it is needed. When it does
so, if in verbose mode, it prints a warning of the form
"Warning: not enough memory, new PARI stack XXX"
This is normal, and it will be printed (in verbose mode) whenever pari
needs to increase its stack.
Temporary files
Depending on your version of sage (and whether you use your own
installation of pari/gp), the code may create temporary files, stored
in a repository `GentrySzydlo_temporary_files'. You may safely erase
this repository once the computation is over.
Output repository
If you want to manually check that the attack was indeed successful, you
can have a look at the files 'secret_basis.sage' and 'result_attack.sage'
in the repository "output". The file 'secret_basis.sage' contains
the (random) secret basis that was used to generate the module-LIP
instance. The file 'result_attack.sage' contains the four possible
bases that were computed by the attack. One can check manually (at least
when the dimension is small) that the file 'result_attack.sage' indeed
contains the secret basis from 'secret_basis.sage'. You can also safely
erase this repository if you want to.
Description of the various files
The archive contains 12 files:
this README.txt
LICENSE.txt contains the description of the licence under which we
distribute this code (GNU AFFERO GENERAL PUBLIC LICENSE, version 3).
*** Functions related to Gentry-Szydlo's algorithm: These can be found
in the eponymous directory
LLL_gram_pari.gp: this short pari script reads the matrix G produced
by the sage function implicit_reduction_pari, reduce it using pari's
LLL algorithm, and writes the output in a file, to be read back by
the function implicit_reduction_pari.implicit_reduction.sage: the function in this file computes the
implicit reduction part of the Gentry-Szydlo algorithm. Namely,
given as input a principal ideal I = vOL (where OL is the ring of
integers of a CM field L) and the relative norm x = v\bar(v) of the
generator, the algorithm outputs an implicitely reduced basis (w1,
..., wd) of I (d = degree(L)), i.e., (w1, ..., wd) is a basis of I,
and wi = v*ai with ai in OL such that (a1, ..., ad) is LLL-reduced
(note: the ai are a priori unknown, since v is not supposed to be
known). This file may use LLL_gram_pari.gp.nf_root.sage: This file contains a function to compute a r-th root
of an element in a number field. Depending on the version of pari
available, use the fastest possible function (we gain a factor 2 to
3 by using the more recent function of pari). This is the last step
of Gentry-Szydlo.estimate_size_prime.sage: functions to estimate the size of the
primes P1, P2 and Q needed when running the Gentry-Szydlo
algorithm. These are involved for a Fermat-like use, and are derived
either conservatively or heuristically. If the selection is not done
conservatively, the algorithm may fail sometimes.polynomial_chain.sage: the first step in the algorithm is to compute
the power of an ideal, I^e, for some possibly large e. This cannot be
done naively, and this deferred to this file for clarity. At a high
level, the exponentiation is done via a square-and-multiply approach,
but as we are handling Z-bases, LLL reduction is performed at each
step to maintain a reasonable size of the description. Besides,
we use the Gram version of LLL (acting on the Gram matrices) as we
perform a division by a totally real element at each step --- see
also the description of [EFK CRYPTO 2020] --- to further control the
sizes. Dependencies: implicit_reduction.sageGentry_Szydlo.sage: the general description of the
algorithm. Dependencies: polynomial_chain.sage, nf_root.sage,
estimate_size_prime.sage.test_Gentry_Szydlo.sage: a file to test in a standalone way the
behaviour of the implemented Gentry-Szydlo's algorithm. Dependencies:
Gentry_Szydlo.sage.
*** Functions related to two-square decompositions These are a level
above Gentry_Szydlo.
two_square.sage: a code file in Sagemath containing the necessary
functions for the TwoSquares algorithm. Dependencies: Gentry_Szydlo.sageThe main functions are:
--- TwoSquareFromRelativeNorm: computes the
two-squares decompositions from a set of solutions of an equation q=
xx^*. Additional functions for conversion are needed, because of a
weird precision bug in pari when the conductors grows. This is done
by a parity check of the relative trace.
--- RelativeNormEquation: computes the solutions of an equation
q=xx^*. Currently, it works for the maximal totally real subfield
of cyclotomic fields of conductor m=4k. It can be modified to
handle more general cyclotomic fields, at the cost of not using
Gentry-Szydlo's and class group computation instead. For this reason,
in this case it should only be used over small conductors. A flag
"prime" is added to the signature, defaulted to False. If prime=True,
it avoids refactoring the absolute norm of the input ideals and
directly factor a degree 2 polynomial over OK, which is the standard
behaviour during the attack later. It calls CandidateIdeals and if
prime=False, it also calls PrimesAbove.
--- CandidateIdeals: In the general case, this is the bottleneck of
the algorithm. Given the list of primes above qOL, compute all
possible products of prime dividing qOK that could lead to a
solution for the norm equation, then find the principal ones using
Gentry-Szydlo (or class group computation is m!=4k). A flag "prime"
is added to the signature, defaulted to False. If prime=True, only
one Gentry-Szydlo is necessary as in this case, the prime ideal is
either inert or split in two conjugates.
--- PrimesAbove: is dedicated to factor qOK and to sort primes
depending of them being split or inert. Mostly used for the general
relative norm problem.
--- GenerateInstance: to test the code. Can generate relative norms
equations as well as sums of squares. Two flags can be used: "with
sol" creates a relative norm equation instance, that is usually
easy to solve (when m<80); and "two squares" which is much longer
for large m, because of the norm factorization.
*** Functions related to the attack itself
congruence.sage: this contains the two last functions to run the
attack. FindPrimeNorm outputs a prime in OK of the form x^2+y^2 =
(u,v)Q(u,v)^t for some small known u,v. This is then processed into
CongruenceSolver, who uses two such vectors and primes to compute all
the congruence matrices from Q to I_2. Currently limited to free module
of determinant 1. Dependencies: two_square.sageattack_Hawk_totally_real.sage: generates test instances and run the
whole attack on them. Notably, it emulates Hawk's keygen for the
maximal totally real subfield of a cyclotomic, without trying to
be efficient nor guaranteeing a small output. This is mostly for
proof-of-concept. Dependencies: congruence.sage
--- AdditionalStructures: the totally real subfield has a basis which
has better geometry than the power basis. It is used to get smaller
instances (see next function).
--- FakeHawkKeyGen: mimics Cohen's "idempotent algorithm" (sic), which
is also known as Naive-NTRUSolve. This algorithm serves as completing
a vector of OK^2 (or other free modules) into a basis via a HNF
computation. It can be simplified using algebraic norm instead, but
it will not reduce the intermediate outputs. The last step is the
usual size-reduction, which does not reduce so much for totally real
subfields. Using the better basis gives mildly better results.
#
#
Repository
`experiments_for_assumption'
#
#
This folder contains the code used in the article "Cryptanalysis
of rank-2 module-LIP in totally real number fields"
(https://eprint.iacr.org/2024/441.pdf) to verify experimentally
Assumption 1. Description of the experiments and the results is provided
in Appendix B in the article. The instructions below will help you run
the code and reproduce the experiments.
Running the experiments
To run these experiments, you need Sagemath version 9 or more (i.e.,
using python 3).
To reproduce the experiments and figures described in the article,
open a terminal and run
sage run_tests.sage
Warning: this will use 3 cores of your computer, and require up to 6
Go of RAM at its maximum. This will run for approximately 2 days if you
want to run the experiments for all the fields tested in the article. If
you want to use less processors or perform less tests, you can do so by
adding some options, as explained below.
Randomness: the random seed has been fixed to allow reproducibility
of the results. This can also be removed by using some option (see below).
Options
When running the sage script run_tests.sage, one can use the following
options:
-np (or --nbproc) N
where N is an integer > 0. This will run the tests with N parallel
processors (the default value if no option is passed to the script is
N = 3).
-rs (or --randomseed) random
if this option is given to the script, then the random seed will not be
fixed (the default value is "-rs fixed", in which case the random seed
is fixed to 42)
-i (or --input) filename.csv
where filename.csv is a csv file containing the list of tests to be run
by the scripts. The format of the csv file should be: - a first line
written: m,random,verbose - then each line should be a triple of the
form: m,random,verbose where m is an integer and random and verbose
are either True or False
(e.g., a line can be: 10,True,False) each line will generate a
test in a cyclotomic field of conductor m, using random modules
if random = True, or only the module OK^2 if random = False. The
program running the test will print some data if verbose = True
(otherwise no information will be printed).
An example of csv file that can be given as an argument is available in
`input_for_article.csv' (this contains all tests run in the article).
The default value of this option is None, in which case the script will
generate the list of tests as in the article.
Examples: you can run in a terminal
sage run_tests.sage -np 2 -rs random -i input_for_article.csv
This will run the tests provided in file input_for_article.csv (i.e., all
tests) using 2 parallel processors, and without fixing the random seed.
Results
The script will produce files with name output_modules_m.csv' or
output_OK2_m.csv' (for various m's) in the folder results', and plots
in the folder
results/figures'.
- Each file in the folder "results" corresponds to a cyclotomic number
field, whose conductor is m. The files with nameoutput_modules_n.csv' contain the data for randomly generated free modules, and the files with name
output_OK2_m.csv' contain the data for the rank-2 module O_K^2.
The files output_modulesm.csv' or
output_OK2_m.csv' are in csv
format. Each row of those files (except the first one) corresponds to
a module M, and the data of the row is of the form
m,d,nb_tests, rho_K, everything_went_fine, P_empirical', P_expected', s'
where: - m is the conductor of the cyclotomic field L (recall that K
is the totally real subfield of a cyclotomic field) - d is the degree
of K - nb_tests is the number of random vectors (z1, z2) generated in
each module - rho_K is the residue of the zeta function of K at 1 -
everything_went_fine is a boolean, which is True if no warnings occured
during the execution of the test - P_empirical' and P_expected' are
the quantities P_empirical and P_expected from the article multiplied by
nb_tests. - s' = s/norm(I)^{1/d} where s is the standard deviation of the
Gaussian distribution used to sample (z1, z2) and I is the Gram ideal of M
- The figures
poly_full.png' and
ratios_full.png' are the same as the
one displayed in the article (Fig. 1 and 2). The folderfigures' also contains a file
poly_modules.png', which shows the results of Figure 2
only in the case where we generated 10 (or 20) random modules per number
field. The color indicates modules in a same number field: the same
color means that the two modules are over the same number field (since
there may be different number fields with the same degree). The file
poly_OK.png' shows the same thing, but in the case where we considered only the module O_K^2. Finally, the files
ratios_modules.png' and
`ratios_OK.png' are similar, but correspond to Figure 1 in the article.
Organization of the code
All the functions used for the experiments and for plotting the
results are in the file functions.sage'. The file
run_tests.sage'
is the main script, which generates the data (in the folder "results")
for all the number fields, then load this data and plot it (in the
folder "results/figures"). The file run_tests.sage' loads the file
functions.sage'.
There is also an example file input_for_article.csv', in csv format,
which contains the list of all tests performed in the article. This file
is not needed to run the script
run_tests.sage', but can serve as an
example to someone willing to generate their own tests.
Getting the data without running the experiments
The folder `data_for_the_article' contains the data that should be
obtained when running the experiments (as described above), and which
we used in the article. (It is not needed to run the experiments.)