Today the privacy of user data is a critical concern. Fully homomorphic encryption (FHE), such as TFHE [Chillotti20], offers a solution for privacy-preserving computation in a cloud environment by allowing computation directly on encrypted data. However, FHE slows execution down by a factor on the order of 1,000,000x. Accelerating FHE computation is therefore required, and that is the goal of this project. The project focuses on the FHEW/TFHE schemes and aims both to develop a processor that executes them faster and to improve software-only execution.

Our first result concerns the software TFHE implementation of ciphertext-ciphertext multiplication. This operation is required, for instance, when AI/ML computations are outsourced to the cloud and both the user and the service provider use encryption, i.e. both the user's input data and the provider's network weights are encrypted. Existing support for this operation is either lacking or extremely slow. We developed an approach that improves the performance of this multiplication by applying carry-save addition. The theoretical speedup of the approach is proportional to the bit width of the plaintext integer operands. The approach also speeds up multi-operand summation.
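To illustrate the idea, the following is a minimal plaintext sketch of textbook carry-save multiplication, not the project's implementation. It assumes that each `^`, `&` and `|` on a bit stands in for one homomorphic TFHE gate; all function names (`carry_save_add`, `ripple_add`, `multi_operand_sum`, `multiply`) are hypothetical. A carry-save step has no carry chain, so every bit position can be evaluated independently, and the single bit-serial carry-propagating addition is deferred to the very end.

```python
from typing import List, Tuple

Bits = List[int]  # little-endian list of 0/1 values, fixed width


def to_bits(x: int, width: int) -> Bits:
    """Encode a non-negative integer as `width` bits, least significant first."""
    return [(x >> i) & 1 for i in range(width)]


def from_bits(bits: Bits) -> int:
    """Decode a little-endian bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def carry_save_add(a: Bits, b: Bits, c: Bits) -> Tuple[Bits, Bits]:
    """3:2 compressor: reduce three operands to a sum word and a carry word.

    Every bit position is independent of the others (no carry chain).
    The carry word is shifted left by one; the top carry is dropped,
    so the result is exact modulo 2**width.
    """
    s = [x ^ y ^ z for x, y, z in zip(a, b, c)]
    carries = [(x & y) | (z & (x ^ y)) for x, y, z in zip(a, b, c)]
    return s, [0] + carries[:-1]


def ripple_add(a: Bits, b: Bits) -> Bits:
    """Carry-propagating addition modulo 2**width (sequential over bits)."""
    out, carry = [], 0
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x ^ y))
    return out


def multi_operand_sum(operands: List[Bits]) -> Bits:
    """Sum one or more equal-width operands with layers of 3:2 compressors."""
    ops = list(operands)
    while len(ops) > 2:
        full = len(ops) - len(ops) % 3  # operands that form complete triples
        nxt: List[Bits] = []
        for i in range(0, full, 3):
            nxt.extend(carry_save_add(ops[i], ops[i + 1], ops[i + 2]))
        nxt.extend(ops[full:])  # leftovers pass through to the next layer
        ops = nxt
    if len(ops) == 1:
        return ops[0]
    return ripple_add(ops[0], ops[1])  # the only carry-propagating addition


def multiply(a: Bits, b: Bits) -> Bits:
    """Multiply two w-bit operands into a 2w-bit product."""
    w = len(a)
    # Row i is `a` AND-ed with bit i of `b`, shifted left by i positions.
    rows = [[0] * i + [x & b[i] for x in a] + [0] * (w - i) for i in range(w)]
    return multi_operand_sum(rows)
```

For example, `from_bits(multiply(to_bits(1234, 16), to_bits(5678, 16)))` evaluates to `1234 * 5678`. In this model a schoolbook multiplier that accumulates the rows one by one would perform a carry-propagating addition per row, whereas the carry-save version performs only one.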

This approach introduces easily exploitable parallelism at the level above TFHE gates. Compared to previous results, a speedup of 15x was obtained for 16-bit multiplication on a 64-core processor. This leads to much faster dot-product and convolution computations, which combine multiplications with a multi-operand sum. A 45x speedup is achieved for a 16-bit, 32-element dot product, and a 30x speedup for a convolution with a 32x32 filter.
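One way to see why a dot product benefits is the plaintext sketch below, which is an illustration under stated assumptions rather than the project's code. It models bit vectors as Python integers, feeds the partial-product rows of all element pairs into a single carry-save reduction tree, and counts the tree layers. Within a layer all compressors, and all bit positions inside each compressor, are independent of each other, which is the kind of work that could be spread over many cores. The names `csa`, `reduce_tree` and `dot_product` are hypothetical.

```python
from typing import List, Tuple


def csa(a: int, b: int, c: int, mask: int) -> Tuple[int, int]:
    """3:2 compressor on integers used as bit vectors (bitwise, no carry chain)."""
    s = a ^ b ^ c
    carry = (((a & b) | (c & (a ^ b))) << 1) & mask
    return s, carry


def reduce_tree(rows: List[int], mask: int) -> Tuple[int, int]:
    """Reduce all rows with layers of 3:2 compressors.

    Returns the sum modulo (mask + 1) and the number of layers used.
    """
    layers = 0
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3  # rows that form complete triples
        nxt: List[int] = []
        for i in range(0, full, 3):
            nxt.extend(csa(rows[i], rows[i + 1], rows[i + 2], mask))
        nxt.extend(rows[full:])  # leftovers pass through unchanged
        rows = nxt
        layers += 1
    # Single final carry-propagating addition of the remaining rows.
    return sum(rows) & mask, layers


def dot_product(xs: List[int], ys: List[int], width: int) -> Tuple[int, int]:
    """Dot product of unsigned `width`-bit vectors via one shared reduction tree."""
    # Enough output bits to hold len(xs) products of 2 * width bits each.
    out_width = 2 * width + (len(xs) - 1).bit_length()
    mask = (1 << out_width) - 1
    # Multiplying by a single bit models AND-ing a row with that bit, so the
    # number of rows never depends on the (encrypted) data.
    rows = [
        (x << i) * ((y >> i) & 1)
        for x, y in zip(xs, ys)
        for i in range(width)
    ]
    return reduce_tree(rows, mask)
```

Calling `dot_product(xs, ys, 16)` returns the dot product together with the number of compressor layers. Because each layer shrinks the row count by roughly a third, the layer count grows only logarithmically with the number of rows.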

The multiplication is also more than twice as fast on a GPU when our approach is used.