Transactions on Cryptographic Hardware and Embedded Systems, Volume 2022
Side Channel Attack On Stream Ciphers: A Three-Step Approach To State/Key Recovery
README
Trivium-3Step-SCA
This repository contains the accompanying side-channel data and source code for the paper "Side Channel Attack On Stream Ciphers: A Three-Step Approach To State/Key Recovery" (published in CHES). Our attack consists of three steps: ML, MILP and SMT. The code for each step is provided as a separate module.
Description of the Attack
Our side-channel attack recovers the internal state bits at any chosen round of the Trivium cipher from erroneous Hamming weight information and, optionally, keystream bits. It proceeds in three steps (ML + MILP + SMT). The ML model predicts the Hamming weight (with some noise) from the given traces. The MILP model modifies the predicted Hamming weight sequence to reduce the error tolerance (in our case, to 3 with a very high success rate). Lastly, the SMT model solves a set of equations (or constraints) formed from the erroneous Hamming weight sequence and keystream bits to recover the internal state bits.
Before running the attack, tune the tolerance of the ML and SMT models. First, execute the ML code and note the accuracy for varying tolerance. Then run the SMT model on a simulated cipher with varying tolerance and note the tolerance (say tl) beyond which the SMT solving time is infeasible even with more information (HW or keystream bits). If the ML accuracy at tolerance tl is not 100%, use the MILP model (between ML and SMT) to achieve 100% accuracy at tolerance tl with a very high probability (in our case tl = 3).
In our manuscript, we analysed the performance of each part of the framework separately: generic traces of a 32-bit microcontroller for the ML model (script in the `ML/` folder), and simulated cipher information for MILP (script at `MILP+SMT/Trivium_MILP_PredictedHW_Correction_CHES-submission.py`) and SMT (script at `MILP+SMT/Trivium_SMT-HW-code-CHES-submission.py`). We tested each part (ML, MILP and SMT) for varying parameters; the results are stated in the paper. We also tested our SMT model for the Hamming distance model (script at `MILP+SMT/Trivium_SMT-HD-code-CHES-submission.py`) with varying parameters. Lastly, we computed the success probability of MILP for varying SNR by adding Gaussian noise with varying standard deviations to the existing traces.
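The tuning procedure described above can be illustrated with a minimal Python sketch. It assumes that "accuracy at tolerance t" means the fraction of predictions whose absolute Hamming weight error is at most t; the function names and the toy data are hypothetical and are not taken from the repository scripts.

```python
def accuracy_within_tolerance(true_hw, predicted_hw, tolerance):
    """Fraction of predictions with |predicted - true| <= tolerance.

    ASSUMPTION: this is our reading of "accuracy for varying tolerance";
    the ML scripts in the repository may compute it differently.
    """
    if len(true_hw) != len(predicted_hw) or not true_hw:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for t, p in zip(true_hw, predicted_hw) if abs(t - p) <= tolerance)
    return hits / len(true_hw)


def needs_milp(accuracy_by_tolerance, tl):
    """True when the ML accuracy at the SMT tolerance limit tl is below 100%."""
    return accuracy_by_tolerance[tl] < 1.0


# Toy data (hypothetical): absolute errors are 0, 1, 2, 0, 4.
true_hw = [16, 12, 20, 15, 17]
predicted_hw = [16, 13, 18, 15, 21]
accuracies = [accuracy_within_tolerance(true_hw, predicted_hw, t) for t in range(5)]
print(accuracies)
print(needs_milp(accuracies, 3))
```

With the MLP-II accuracies quoted in this README, the accuracy at tl = 3 is 0.99784, which is below 100%, so the MILP step would be inserted between ML and SMT.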
Setup
Refer to the readme files for ML (`ML/README.md`) and for MILP and SMT (`MILP+SMT/README.md`).
Repository Structure
- `ML/`: This folder contains the scripts for the ML experiments. See the readme file inside the folder for an explanation.
- `MILP+SMT/`: This folder contains the scripts for the MILP and SMT models. See the readme file inside the folder for an explanation.
How to reproduce the results stated in the paper
- Table 3: Refer to the `ML/README.md` file to produce the results of Table 3.
- Table 4: This table is produced by the script `MILP+SMT/Trivium_MILP_PredictedHW_Correction_CHES-submission.py`. Collect the `Testing Accuracy` for all tolerances from Table 3 and feed it to the array `distribution` under the section `MILP` of the `MILP+SMT/config.ini` file. Update the `trials` variable with the number of experiments to perform and `N_round` with the number of rounds for which information has to be passed. For example, in Table 3 the testing accuracies for the MLP-II parameters with varying tolerances (0-7) are `[0.39266, 0.86615, 0.98234, 0.99784, 0.99965, 0.99992, 0.99999, 1.0]`. Feed this to the array `distribution` in the `MILP` section of the `MILP+SMT/config.ini` file, then set `trials = 1000`, `N_round = 110` and `tolerance = 3` to produce the result corresponding to the first row of Table 4.
- Table 5: In this table, we analysed the performance of our SMT model for the Hamming weight model in the pseudo-random phase. This table is produced by the script `MILP+SMT/Trivium_SMT-HW-code-CHES-submission.py`. Update the parameters in the section `SMT-HW` of the `MILP+SMT/config.ini` file as follows:
  - `mc_len`: size of the microcontroller used (8/16/32);
  - `tolerance`: error tolerance for which the SMT model is analysed;
  - `N_round`: number of rounds for which information has to be passed;
  - `trials`: number of different and independent trials to perform;
  - `init = 0`, as this table targets the pseudo-random phase;
  - `z_act`: 0 if keystream bit information is not used while forming the SMT instances and 1 if it is used (`z_act = 0` for Table 5(a) and `z_act = 1` for Table 5(b));
  - `G_pos`: positions of the internal state bits that need to be guessed (any numbers in the range `[0,state_size-1]`).
  For example, set `mc_len = 8`, `tolerance = 1`, `N_round = 110`, `trials = 20`, `init = 0`, `z_act = 0` and `G_pos = []` to produce the result in the first row of Table 5(a).
- Table 6: In this table, we analysed the performance of our SMT model for the Hamming weight model in the initialisation phase. The results in this table are produced by the script `MILP+SMT/Trivium_SMT-HW-code-CHES-submission.py`. Update the parameters in the section `SMT-HW` of the `MILP+SMT/config.ini` file as follows:
  - `mc_len`: size of the microcontroller used (8/16/32);
  - `tolerance`: error tolerance for which the SMT model is analysed;
  - `N_round`: number of rounds for which information has to be passed (denoted by `#Rounds` in the 3rd column of Table 6);
  - `trials`: number of different and independent trials to perform;
  - `init = 1`, as this table targets the initialisation phase;
  - `z_act = 0`, as the keystream bits are not accessible in the initialisation phase;
  - `G_pos`: positions of the internal state bits that need to be guessed (any numbers in the range `[0,state_size-1]`).
  For example, set `mc_len = 8`, `tolerance = 4`, `N_round = 150`, `trials = 20`, `init = 1`, `z_act = 0` and `G_pos = []` for the first row of Table 6.
- Table 7: In this table, we analysed the performance of our SMT model for the Hamming distance model. The results in this table are produced by the script `MILP+SMT/Trivium_SMT-HD-code-CHES-submission.py`. Update the parameters in the section `SMT-HD` of the `MILP+SMT/config.ini` file as follows:
  - `tolerance`: error tolerance for which the SMT model is analysed;
  - `N_round`: number of rounds for which information has to be passed;
  - `trials`: number of different and independent trials to perform;
  - `init = 1`, as this table targets the initialisation phase;
  - `z_act = 0`, as keystream bits are not available in the initialisation phase;
  - `G_pos`: positions of the internal state bits that need to be guessed (any numbers in the range `[0,state_size-1]`).
  For example, set `tolerance = 1`, `N_round = 200`, `trials = 5`, `init = 1`, `z_act = 0` and `G_pos = [65,66,67,68,69]` (state positions `s65,s66,...,s69`) for the 5th entry of Table 7.
- Table 8: In this table, we analysed the success probability of the MILP model with respect to different SNRs. Each new SNR is obtained by adding Gaussian noise with a different standard deviation to the existing traces (instructions for adding Gaussian noise to the existing traces can be found in `ML/README.md`). Then feed the obtained ML accuracies for the different tolerances to the array `distribution` in the section `MILP` of the `MILP+SMT/config.ini` file and run the script `MILP+SMT/Trivium_MILP_PredictedHW_Correction_CHES-submission.py` to compute the success probability.
Note that in the above description of the tables, the `N_round` variable in the scripts refers to the column named `# Rounds` in the tables of the manuscript. For all entries of Table 4, `trials = 1000`. For all entries of Table 5(a), `z_act = 0` and `init = 0`. For Table 5(b), `z_act = 1` and `init = 0`. For Tables 6 and 7, `init = 1` and `z_act = 0`. For Table 8, `trials = 1000`.
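As an illustration of the Table 4 settings described above, the following Python sketch reads a `MILP` section with the standard-library `configparser` module. The exact syntax of `MILP+SMT/config.ini` is an assumption here (in particular, that `distribution` is stored as a bracketed list), and both helper functions are hypothetical. The second helper additionally assumes that the accuracies are cumulative over tolerance and derives the probability of each error magnitude from them.

```python
import configparser
import json

# ASSUMPTION: hypothetical layout of the MILP section; check the real config.ini.
EXAMPLE_CONFIG = """
[MILP]
distribution = [0.39266, 0.86615, 0.98234, 0.99784, 0.99965, 0.99992, 0.99999, 1.0]
trials = 1000
N_round = 110
tolerance = 3
"""


def load_milp_settings(text):
    """Parse the MILP section of an INI-formatted string into a dict."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the case of keys such as N_round
    parser.read_string(text)
    section = parser["MILP"]
    return {
        "distribution": json.loads(section["distribution"]),
        "trials": section.getint("trials"),
        "N_round": section.getint("N_round"),
        "tolerance": section.getint("tolerance"),
    }


def error_magnitude_probabilities(cumulative):
    """Turn cumulative accuracies (tolerance 0, 1, 2, ...) into P(|error| = k).

    ASSUMPTION: entry k of `cumulative` is the accuracy within tolerance k.
    """
    probabilities = [cumulative[0]]
    for previous, current in zip(cumulative, cumulative[1:]):
        probabilities.append(current - previous)
    return probabilities


settings = load_milp_settings(EXAMPLE_CONFIG)
probabilities = error_magnitude_probabilities(settings["distribution"])
print(settings["trials"], settings["N_round"], settings["tolerance"])
print(probabilities)
```

Under this reading, a predicted Hamming weight is exact with probability 0.39266, and the remaining probability mass is spread over error magnitudes 1 to 7.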
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
ML to Predict Classes from Traces
The `ML/` folder contains the code to run the machine learning step of the key recovery attack on TRIVIUM. The ML models and data are for the Hamming weight (HW) leakage model.
Setup
- (Optional but recommended) Create a virtual environment with conda:
conda create -n trivium_sca python=3.8
- (Optional but recommended) Activate the virtual environment:
conda activate trivium_sca
- Install the required packages using pip:
pip install -r requirements.txt
Alternatively, use the `conda_install.sh` script to create a conda environment and install the required packages.
Repository Structure
- `run_traces.py`: Script to train the MLP model on 32-bit microcontroller traces.
- `run_traces_optuna.py`: Script to identify optimal hyperparameters using Optuna.
- `run_traces_noisy.py`: Script to train the MLP on 32-bit microcontroller traces with additive white Gaussian noise to simulate a lower SNR.
- `run_traces_pretrained.py`: Script to load a pretrained model and run prediction on 32-bit microcontroller traces.
- `datagen.py`: Utility to process raw data for training.
- `get_snr.py`: Utility to compute the SNR.
- `config.ini`: Config file to set hyperparameters and other variables for model training.
- `data/`: Contains the raw 32-bit microcontroller trace data.
- `results/manual/`: Contains the log files, pretrained models, and config files for the results in the paper achieved using manual experimentation.
- `results/optuna/`: Contains the log files, pretrained models, and config files for the results in the paper achieved using Optuna.
- `results/noisy/`: Contains the log files, pretrained models, and config files for the results in the paper for simulating a lower SNR.
Usage
- Unzip all zip files in `data/`, i.e. `cd data && unzip \*.zip`.
Train ML Model
- Update `config.ini` with the required hyperparameters and variables.
- Run `python run_traces.py` for manual training, `python run_traces_optuna.py` to identify optimal hyperparameters using Optuna under given constraints, or `python run_traces_noisy.py` for manual training with additive white Gaussian noise.
The trained models and log files can be found in `traces_models`/`traces_model_opt`/`traces_models_noisy` and `trace_logs.log`/`trace_logs_opt.log`/`trace_logs_noisy.log` for the manual training, Optuna, and low-SNR (noisy) experiments respectively.
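To make the low-SNR experiment above more concrete, here is a minimal, standard-library-only Python sketch of adding white Gaussian noise to traces and estimating the SNR at a single sample point. It is not the code of `run_traces_noisy.py` or `get_snr.py`; the function names and the synthetic data are hypothetical, and the SNR definition used (variance of the per-class means divided by the mean per-class variance) is a common side-channel convention assumed here.

```python
import random
import statistics


def add_gaussian_noise(trace, sigma, rng):
    """Return a copy of `trace` with N(0, sigma^2) noise added to each sample."""
    return [x + rng.gauss(0.0, sigma) for x in trace]


def snr_at_sample(leakage_by_class):
    """Estimate the SNR at one sample point.

    `leakage_by_class` maps a class label (e.g. a Hamming weight) to the list
    of leakage values observed for that class. ASSUMED definition:
    Var(class means) / mean(class variances); classes must not be noise-free.
    """
    means = [statistics.fmean(values) for values in leakage_by_class.values()]
    variances = [statistics.pvariance(values) for values in leakage_by_class.values()]
    return statistics.pvariance(means) / statistics.fmean(variances)


# Synthetic leakage (hypothetical): value = Hamming weight + small noise.
rng = random.Random(0)
clean = {hw: [hw + rng.gauss(0.0, 0.5) for _ in range(200)] for hw in range(9)}
noisy = {hw: add_gaussian_noise(values, 2.0, rng) for hw, values in clean.items()}

snr_clean = snr_at_sample(clean)
snr_noisy = snr_at_sample(noisy)
print(snr_clean, snr_noisy)
```

A larger standard deviation of the added noise yields a lower SNR, which is the knob varied to obtain the different SNR settings.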
Load Pretrained Model
- Run `python run_traces_pretrained.py`, passing the pretrained model as an argument, e.g. `python run_traces_pretrained.py -f results/manual/traces_models/traces_0.3933_ReLU.pth`
$ python run_traces_pretrained.py -h
usage: run_traces_pretrained.py [-h] -f FILE
optional arguments:
-h, --help show this help message and exit
-f FILE, --file FILE Path of saved model (.pth)
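For readers who want to wrap or script this step, the command-line interface shown in the help output above could be reproduced with `argparse` roughly as follows. This is a reconstruction from the help text only, not the actual contents of `run_traces_pretrained.py`, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser matching the help text: a single required -f/--file option."""
    parser = argparse.ArgumentParser(prog="run_traces_pretrained.py")
    parser.add_argument(
        "-f", "--file", required=True, help="Path of saved model (.pth)"
    )
    return parser


# Parse the example invocation from this README instead of sys.argv.
args = build_parser().parse_args(
    ["-f", "results/manual/traces_models/traces_0.3933_ReLU.pth"]
)
print(args.file)
```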
Note
In case of any errors while installing PyTorch, you may have to change the PyTorch version or install the CPU-only variant. Please refer to the PyTorch documentation for a detailed description of the various installation options.
The `MILP+SMT/` folder contains the scripts for the SMT and MILP models on the TRIVIUM cipher. For the SMT model, we consider both the Hamming weight and the Hamming distance leakage models.
Software Used
Setup
- Create a virtual environment with conda:
conda create -n trivium_sca python=3.8
- Activate the new environment:
conda activate trivium_sca
- Install the required python packages using pip:
pip install -r requirements.txt
- Add Gurobi channels as follows:
conda config --add channels https://conda.anaconda.org/gurobi
- Install Gurobi using conda:
conda install gurobi
- For Gurobi software, obtain the Academic license from the website
- Activate license using the command:
grbgetkey <license-number>
File Structure
- `Trivium_MILP_PredictedHW_Correction_CHES-submission.py`: Script for the MILP model on the TRIVIUM cipher, which corrects the ML-predicted Hamming weight sequence. In this experiment, we simulate a Trivium cipher with a random key/IV, generate the original HW sequence and inject noise according to the distribution of the ML output. We then form MILP instances to reduce the error tolerance of the erroneous HW sequence. It relates to Section 4.3.2 of the manuscript and reproduces the results of Tables 4 and 8. The input parameters are defined in the section `MILP` of the `config.ini` file.
- `Trivium_SMT-HW-code-CHES-submission.py`: Script to recover internal state bits from erroneous Hamming weight and (optionally) keystream bit information. It simulates a Trivium cipher with a random key/IV and injects noise randomly into the original Hamming weight (HW) sequence within the given tolerance class. It then forms SMT instances from the erroneous HW sequence and solves them to recover the internal state bits. It relates to Section 5.1 of the manuscript and reproduces the results of Tables 5 and 6. The input parameters are defined in the section `SMT-HW` of the `config.ini` file.
- `Trivium_SMT-HD-code-CHES-submission.py`: Script to recover internal state bits from erroneous Hamming distance and (optionally) keystream bit information. It simulates a Trivium cipher with a random key/IV and injects noise randomly into the original Hamming distance (HD) sequence within the given tolerance class. It then forms SMT instances from the erroneous HD sequence and solves them to recover the internal state bits. It relates to Section 5.2 of the manuscript and reproduces the results of Table 7. The input parameters are defined in the section `SMT-HD` of the `config.ini` file.
- `requirements.txt`: Contains the Python packages required to run the above three scripts.
- `config.ini`: Contains the input parameters for all three Python scripts above, under different sections.
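The simulation step shared by the scripts above (simulate Trivium with a random key/IV, record a Hamming weight sequence, inject bounded noise) can be sketched in Python as follows. The round function follows the public Trivium specification at bit level. Everything else is an assumption made for illustration: in particular, which state bits form the leaking `mc_len`-bit word, and the uniform noise within the tolerance, are placeholders and may differ from what the repository scripts do.

```python
import random

INIT_ROUNDS = 4 * 288  # 1152 initialisation rounds in the Trivium specification


def trivium_load(key_bits, iv_bits):
    """Load an 80-bit key and an 80-bit IV into the 288-bit Trivium state."""
    if len(key_bits) != 80 or len(iv_bits) != 80:
        raise ValueError("key and IV must each be 80 bits")
    return list(key_bits) + [0] * 13 + list(iv_bits) + [0] * 4 + [0] * 108 + [1, 1, 1]


def trivium_round(s):
    """Apply one Trivium round; return (new_state, output_bit)."""
    t1 = s[65] ^ s[92]
    t2 = s[161] ^ s[176]
    t3 = s[242] ^ s[287]
    z = t1 ^ t2 ^ t3
    t1 ^= (s[90] & s[91]) ^ s[170]
    t2 ^= (s[174] & s[175]) ^ s[263]
    t3 ^= (s[285] & s[286]) ^ s[68]
    return [t3] + s[0:92] + [t1] + s[93:176] + [t2] + s[177:287], z


def noisy_hw_sequence(state, n_rounds, mc_len, tolerance, rng):
    """Return (true_hw, noisy_hw) over n_rounds, with |noise| <= tolerance."""
    true_hw, noisy_hw = [], []
    for _ in range(n_rounds):
        state, _z = trivium_round(state)
        # ASSUMPTION: the leaking word is the first mc_len state bits.
        hw = sum(state[:mc_len])
        # ASSUMPTION: uniform integer noise; clipping keeps the value valid
        # and never increases the distance to the true Hamming weight.
        noisy = min(max(hw + rng.randint(-tolerance, tolerance), 0), mc_len)
        true_hw.append(hw)
        noisy_hw.append(noisy)
    return true_hw, noisy_hw


rng = random.Random(2022)
key = [rng.randint(0, 1) for _ in range(80)]
iv = [rng.randint(0, 1) for _ in range(80)]
state = trivium_load(key, iv)
for _ in range(INIT_ROUNDS):  # skip initialisation to reach the pseudo-random phase
    state, _ = trivium_round(state)
true_hw, noisy_hw = noisy_hw_sequence(state, n_rounds=110, mc_len=8, tolerance=1, rng=rng)
print(list(zip(true_hw, noisy_hw))[:5])
```

Skipping the warm-up loop would correspond to collecting leakage in the initialisation phase (`init = 1`) rather than in the pseudo-random phase (`init = 0`).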
Note that for some parameters, the solution time for the above scripts might exceed 24-48 hours. Additionally, for a fixed parameter set, the solution time could be higher than the reported mean time because the standard deviation is quite high, especially for `Trivium_SMT-HD-code-CHES-submission.py`. For the HW model, the solution time becomes higher after tolerance 3 in the pseudo-random phase, whereas for the HD model, the solution time becomes higher after tolerance 3 in the initialisation phase. Refer to the tables in the manuscript for an estimate of the solution time.
Usage
Update `config.ini` with the input parameters in the corresponding section (`MILP` for the MILP model; `SMT-HW` for the SMT model on Hamming weight; `SMT-HD` for the SMT model on Hamming distance) and run the script using `python3 filename.py`.