International Association for Cryptologic Research


Transactions on Cryptographic Hardware and Embedded Systems, Volume 2021

Polynomial Multiplication in NTRU Prime:

Comparison of Optimization Strategies on Cortex-M4


This repository contains ARM Cortex-M4 implementations of three polynomial multiplication strategies for NTRU Prime, an alternate candidate in the NIST post-quantum cryptography standardization project.

To perform polynomial multiplication in Z_{4591}[X]/(X^{761}-X-1), the implementation can use one of the following ring operations:

These polynomial multiplication implementations can be selected at compile time with the preprocessor macros GOODS, MIXED1, and MIXED, respectively.
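For orientation, the following is a minimal, unoptimized sketch of the ring operation that every strategy has to compute: a schoolbook product followed by reduction modulo X^761 - X - 1 with coefficients modulo 4591. It is not the code in this repository. The names NTRU_P, NTRU_Q, poly_mul_ref, and polymul_variant_name are hypothetical and chosen only for illustration, and the way the macros GOODS, MIXED1, and MIXED are tested here is an assumption about how such a compile-time switch might look.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NTRU_P 761   /* modulus polynomial is X^761 - X - 1 */
#define NTRU_Q 4591  /* coefficients live in Z_4591 */

/* Hypothetical helper: reports which strategy macro was set at compile
 * time (e.g. with -DGOODS). The real sources may use the macros differently. */
const char *polymul_variant_name(void)
{
#if defined(GOODS)
    return "GOODS";
#elif defined(MIXED1)
    return "MIXED1";
#elif defined(MIXED)
    return "MIXED";
#else
    return "reference";
#endif
}

/* Reference multiplication h = f * g in Z_4591[X]/(X^761 - X - 1).
 * Coefficient i belongs to X^i and must lie in [0, NTRU_Q).
 * h may alias f or g because it is written only at the very end. */
void poly_mul_ref(uint16_t h[NTRU_P],
                  const uint16_t f[NTRU_P],
                  const uint16_t g[NTRU_P])
{
    uint32_t t[2 * NTRU_P - 1];  /* unreduced product, degree <= 2p - 2 */
    int i, j, k;

    for (i = 0; i < NTRU_P; i++) {
        assert(f[i] < NTRU_Q);
        assert(g[i] < NTRU_Q);
    }

    memset(t, 0, sizeof t);

    /* Schoolbook product with coefficients reduced modulo q on the fly. */
    for (i = 0; i < NTRU_P; i++) {
        for (j = 0; j < NTRU_P; j++) {
            t[i + j] = (t[i + j] + (uint32_t)f[i] * g[j]) % NTRU_Q;
        }
    }

    /* Since X^p = X + 1 in this ring, X^k = X^(k-p+1) + X^(k-p) for k >= p.
     * Both target indices are below p, so one pass is enough. */
    for (k = 2 * NTRU_P - 2; k >= NTRU_P; k--) {
        t[k - NTRU_P + 1] = (t[k - NTRU_P + 1] + t[k]) % NTRU_Q;
        t[k - NTRU_P]     = (t[k - NTRU_P] + t[k]) % NTRU_Q;
    }

    for (i = 0; i < NTRU_P; i++) {
        h[i] = (uint16_t)t[i];
    }
}
```

A routine like this is useful mainly as a correctness oracle on a host machine: the output of an optimized multiplication can be compared coefficient by coefficient against it. It uses roughly 6 KB of stack for the intermediate product and is not constant-time tuned, so it is not meant for the target board.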


The implementation targets the STM32F4 Discovery board, and it uses the following tools:

For convenience, libopencm3 has already been compiled, and the relevant header files and the shared object are placed in the lib directory.

To compile the code, run the make command in the main directory. This generates all binaries for testing the software and for benchmarking the different implementations.

Benchmarking binaries:

Stack usage measurements:

Testing binaries:

To load the binaries onto the board and read the output, run make runXXX, where XXX is Speed, Stack, or Test, or All to run all three. After any .bin file is run, its output is also recorded in a .bin.log file in the obj directory. A shell script is provided to generate Table 5 and Table 6 of our paper on the command line.

Implementation details

To make it easier to reuse parts of the code, we provide a brief description of the files here: