International Association for Cryptologic Research

Transactions on Cryptographic Hardware and Embedded Systems, Volume 2024

Low-Latency Masked Gadgets Robust against Physical Defaults with Application to Ascon


README

Artifact for "Low-Latency Masked Gadgets Robust against Physical Defaults with Application to Ascon"

Getting started

We used the following software with the corresponding versions:

Source files

The source files for the Verilog implementation are all located in the src folder, organized as follows:

Testbenches of the implementations are included in the respective source directories.

Behavioural simulation

We use iverilog to simulate the designs in the beh_sim directory. All
designs can be simulated by running the make all command in that directory;
the results (VCD files) are generated under beh_simu/work/.

Synthesis

We use Cadence Genus with freepdk45.
The synthesis scripts for all designs are located under synthesis/. In each
directory, the synth_d.sh script drives the synthesis flow.

Since the paper reports results with and without randomness generation, the Verilog top module takes a parameter (set_prng) that activates or deactivates the randomness generation. The parameter defaults to '0'; it can be changed either in the instantiation in the testbench or in synth.tcl to obtain synthesis results that include the randomness generation.

To replicate the paper results, please check the corresponding folders:

Structural simulation

Before simulating a design, synthesize it as described in the previous section. To simulate a group of designs (e.g. all gadgets, primitives, or modes), all of them must have been synthesized first.

Structural simulation is run in the same way as the behavioral simulation, from the struct_sim directory.

TVLA

This section discusses the reproduction of the results of Appendix B, for which the source code lies in fpga/.

The Sakura-G board embeds a control FPGA and a target FPGA whose power consumption is measured.
Our design sets up the control FPGA to provide masked data and randomness to
the target FPGA, which is programmed with our iterative AND design.

The target FPGA drives GPIO 2 high after receiving all the data required for an execution. Its FSM then switches to a constant-delay state and then performs the 'iterative AND'.
Afterwards, it moves to another delay state, and finally GPIO 2 is set low.
The target FPGA also communicates with the control FPGA, which reads the result of the execution and starts a new one.
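
The trigger behaviour described above can be pictured with a small Python model. This is only an illustrative sketch, not the Verilog FSM from the artifact: the state names and the cycle counts passed to trigger_timeline are hypothetical, chosen to show that GPIO 2 stays high across both delay states and the computation.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical state names; the real FSM lives in the Verilog sources.
    RECEIVE = auto()
    PRE_DELAY = auto()
    COMPUTE = auto()
    POST_DELAY = auto()
    DONE = auto()


def trigger_timeline(pre_delay, compute, post_delay):
    """Yield (state, gpio2) once per modelled clock cycle of one execution.

    The three cycle counts are assumptions made for illustration only.
    """
    # GPIO 2 is still low while the input data is being received.
    yield State.RECEIVE, 0
    # GPIO 2 is high during the first delay, the iterative AND, and the
    # second delay, so a captured trace covers all three phases.
    phases = (
        (State.PRE_DELAY, pre_delay),
        (State.COMPUTE, compute),
        (State.POST_DELAY, post_delay),
    )
    for state, cycles in phases:
        for _ in range(cycles):
            yield state, 1
    # GPIO 2 goes low again; the control FPGA can then read the result.
    yield State.DONE, 0


if __name__ == "__main__":
    for state, gpio2 in trigger_timeline(pre_delay=2, compute=3, post_delay=2):
        print(f"{state.name:<10} GPIO2={gpio2}")
```

Since the scope captures while the trigger is high, the trace length in this model is the sum of the three cycle counts.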

The Xilinx ISE project for the target FPGA is located under SRC_sakura. One can change the selected target gadget by defining either the HPC3 or HPC4 macro in ucl_cg_sakura_dut_sr.v.
The ISE project for the control FPGA is located in CTRL_sakura.

Hardware setup

On the hardware side, we measured the current over the IDD jumper through a
CT1 probe, without amplifier, using a Picoscope 5000D at 500 MS/s with
10-bit resolution. The trigger was taken from GPIO 2 (each trace lasts as
long as the trigger signal is high).

We connect the trigger to the External port of the Picoscope. The CT1 probe over JP2 is connected to a 50 ohm resistance matching the impedance, and is fed to Channel A.

Other settings are left at their factory defaults.

Running the scripts

One can interact with the FPGA using the Python scripts (tested with Python 3.10) in the API_sakura folder; dependencies are listed in requirements.txt.

The Python script script_ttest.py controls the Sakura-G board and the Picoscope, and computes the T-test on the traces. By default, it acquires the traces, runs the T-test on the fly, produces the figures as a PDF file, and saves the t-test results and the last few traces as a .npz file.

The number of captured traces is configured by the variable total_data (default: 10M). We automatically discard the first 1000 traces to hide initialization effects.
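
To make the on-the-fly T-test and the discarding of initial traces concrete, here is a minimal pure-Python sketch. It is not the code of script_ttest.py: it assumes a fixed-vs-random TVLA setting, uses Welford's running mean/variance so that no trace needs to be stored, and all names (RunningStats, welch_t, ttest_on_the_fly, n_discard) are hypothetical.

```python
import math


class RunningStats:
    """Per-sample running mean and variance (Welford's algorithm)."""

    def __init__(self, n_samples):
        self.n = 0
        self.mean = [0.0] * n_samples
        self.m2 = [0.0] * n_samples  # sum of squared deviations from the mean

    def update(self, trace):
        self.n += 1
        for i, x in enumerate(trace):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def variance(self):
        # Unbiased sample variance; needs at least two traces.
        return [m2 / (self.n - 1) for m2 in self.m2]


def welch_t(a, b):
    """Welch's t statistic at every sample index."""
    if a.n < 2 or b.n < 2:
        raise ValueError("need at least two traces in each class")
    t = []
    for m0, v0, m1, v1 in zip(a.mean, a.variance(), b.mean, b.variance()):
        denom = math.sqrt(v0 / a.n + v1 / b.n)
        # A sample with zero variance in both classes carries no evidence.
        t.append((m0 - m1) / denom if denom > 0 else 0.0)
    return t


def ttest_on_the_fly(labelled_traces, n_samples, n_discard=1000):
    """Consume (is_fixed, trace) pairs, skipping the first n_discard."""
    fixed, rand = RunningStats(n_samples), RunningStats(n_samples)
    for idx, (is_fixed, trace) in enumerate(labelled_traces):
        if idx < n_discard:
            continue  # drop warm-up traces
        (fixed if is_fixed else rand).update(trace)
    return welch_t(fixed, rand)
```

Because only the count, mean and sum of squared deviations are kept per sample point, memory use in this sketch is independent of the number of traces.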

We observed an acquisition throughput of around 20k traces/s. A 10-million-trace acquisition (as in the paper results) therefore takes roughly 10 minutes.
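
The duration estimate follows directly from the trace count and the throughput. The short sketch below reproduces that arithmetic; the helper name acquisition_time_s is hypothetical, and the throughput value is the approximate figure quoted above, so the result should be read as an order of magnitude.

```python
def acquisition_time_s(total_data, traces_per_s):
    """Estimated acquisition time in seconds at a constant throughput."""
    return total_data / traces_per_s


if __name__ == "__main__":
    total_data = 10_000_000  # default number of traces
    throughput = 20_000      # observed throughput, traces per second (approximate)
    seconds = acquisition_time_s(total_data, throughput)
    print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")  # 500 s (~8.3 min)
```

The same helper can be used to budget a run after changing total_data.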

Other configuration:

Gadget verification with SILVER

SILVER (version 57fd89b713f627a8b6855e02d10abe02474777e5) can be used to
verify the Verilog implementation of the HPC4 gadget (PINI for the first-
and second-order gadgets) by running ./check.sh in the silver/ directory.
This requires the environment variable SILVER_ROOT to be set to the root
SILVER directory (typically a path ending in /SILVER and containing a bin/
directory).

E.g.: SILVER_ROOT=~/SILVER ./check.sh

Gadget simulation with PROLEAD

PROLEAD (version c2ba46df94c8b0f7db3c1be42c90bd8ebf24e7e1) can be used to
simulate a toy Ascon (only 1 S-box) by running ./run.sh in the prolead/
directory, with the environment variable PROLEAD_ROOT set to the root
PROLEAD directory (path ending in PROLEAD/).