# PapersCut

A shortcut to recent security papers

### Arxiv

#### Accuracy, Interpretability, and Differential Privacy via Explainable Boosting

Authors: Harsha Nori, Rich Caruana, Zhiqi Bu, Judy Hanwen Shen, Janardhan Kulkarni

Abstract: We show that adding differential privacy to Explainable Boosting Machines (EBMs), a recent method for training interpretable ML models, yields state-of-the-art accuracy while protecting privacy. Our experiments on multiple classification and regression datasets show that DP-EBM models suffer surprisingly little accuracy loss even with strong differential privacy guarantees. In addition to high accuracy, two other benefits of applying DP to EBMs are: a) trained models provide exact global and local interpretability, which is often important in settings where differential privacy is needed; and b) the models can be edited after training without loss of privacy to correct errors which DP noise may have introduced.

Comment: To be published in ICML 2021. 12 pages, 6 figures

Date: 17 Jun 2021

#### On PQC Migration and Crypto-Agility

Authors: Alexander Wiesmaier, Nouri Alnahawi, Tobias Grasmeyer, Julian Geißler, Alexander Zeier, Pia Bauspieß, Andreas Heinemann

Abstract: Besides the development of PQC algorithms, the actual migration of IT systems to such new schemes has to be considered, best by utilizing or establishing crypto-agility. Much work in this respect is currently conducted all over the world, making it hard to keep track of the many individual challenges and respective solutions that have been identified. In consequence, it is difficult to judge for both individual application scenarios and on a global scale, whether all (known) challenges have been addressed respectively or what their current state is. We provide a literature survey and a snapshot of the discovered challenges and solutions categorized in different areas. We use this as starting point for a community project to keep track of the ongoing efforts and the state of the art in this field. Thereby we offer a single entry-point into the subject reflecting the current state in a timely manner.

Comment: 12 pages, 2 tables

Date: 17 Jun 2021

#### Interval Privacy: A Framework for Data Collection

Authors: Jie Ding, Bangjun Ding

Abstract: The emerging public awareness and government regulations of data privacy motivate new paradigms of collecting and analyzing data transparent and acceptable to data owners. We present a new concept of privacy and corresponding data formats, mechanisms, and tradeoffs for privatizing data during data collection. The privacy, named Interval Privacy, enforces the raw data conditional distribution on the privatized data to be the same as its unconditional distribution over a nontrivial support set. Correspondingly, the proposed privacy mechanism will record each data value as a random interval containing it. The proposed interval privacy mechanisms can be easily deployed through most existing survey-based data collection paradigms, e.g., by asking a respondent whether its data value is within a randomly generated range. Another unique feature of interval mechanisms is that they obfuscate the truth but not distort it. The way of using narrowed range to convey information is complementary to the popular paradigm of perturbing data. Also, the interval mechanisms can generate progressively refined information at the discretion of individual respondents. We study different theoretical aspects of the proposed privacy. In the context of supervised learning, we also offer a method such that existing supervised learning algorithms designed for point-valued data could be directly applied to learning from interval-valued data.

Date: 17 Jun 2021
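
The survey-style mechanism mentioned in the abstract above (asking whether a value falls within a randomly generated range) might be sketched as follows. This is only an illustration, not the authors' code: the function name `interval_response`, the uniform threshold, and the bounded support `[lo, hi]` are all assumptions made for the example.

```python
import random


def interval_response(value, lo, hi, rng=random):
    """Report a random interval containing `value` instead of the value itself.

    Hypothetical one-question mechanism (an assumption, not the paper's exact
    construction): draw a threshold uniformly from [lo, hi], independently of
    `value`, and ask whether value <= threshold.
    """
    if not lo <= value <= hi:
        raise ValueError("value must lie in [lo, hi]")
    threshold = rng.uniform(lo, hi)  # generated without looking at the value
    if value <= threshold:
        return (lo, threshold)       # respondent answered "yes"
    return (threshold, hi)           # respondent answered "no"


if __name__ == "__main__":
    rng = random.Random(0)
    print(interval_response(42.0, 0.0, 100.0, rng))
```

The reported interval always contains the true value, so the answer is coarse but never false, which is the "obfuscate but not distort" property the abstract describes. Asking a further question inside the returned interval would narrow it progressively.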

#### Secure Multi-Function Computation with Private Remote Sources

Authors: Onur Günlü, Matthieu Bloch, Rafael F. Schaefer

Abstract: We consider a distributed function computation problem in which parties observing noisy versions of a remote source facilitate the computation of a function of their observations at a fusion center through public communication. The distributed function computation is subject to constraints, including not only reliability and storage but also privacy and secrecy. Specifically, 1) the remote source should remain private from an eavesdropper and the fusion center, measured in terms of the information leaked about the remote source; 2) the function computed should remain secret from the eavesdropper, measured in terms of the information leaked about the arguments of the function, to ensure secrecy regardless of the exact function used. We derive the exact rate regions for lossless and lossy single-function computation and illustrate the lossy single-function computation rate region for an information bottleneck example, in which the optimal auxiliary random variables are characterized for binary-input symmetric-output channels. We extend the approach to lossless and lossy asynchronous multiple-function computations with joint secrecy and privacy constraints, in which case inner and outer bounds for the rate regions differing only in the Markov chain conditions imposed are characterized.

Comment: Shorter version to appear in the IEEE International Symposium on Information Theory 2021

Date: 17 Jun 2021

#### Modeling Realistic Adversarial Attacks against Network Intrusion Detection Systems

Authors: Giovanni Apruzzese, Mauro Andreolini, Luca Ferretti, Mirco Marchetti, Michele Colajanni

Abstract: The incremental diffusion of machine learning algorithms in supporting cybersecurity is creating novel defensive opportunities but also new types of risks. Multiple studies have shown that machine learning methods are vulnerable to adversarial attacks that create tiny perturbations aimed at decreasing the effectiveness of detecting threats. We observe that existing literature assumes threat models that are inappropriate for realistic cybersecurity scenarios because they consider opponents with complete knowledge about the cyber detector or that can freely interact with the target systems. By focusing on Network Intrusion Detection Systems based on machine learning, we identify and model the real capabilities and circumstances required by attackers to carry out feasible and successful adversarial attacks. We then apply our model to several adversarial attacks proposed in the literature and highlight the limits and merits that can result in actual adversarial attacks. The contributions of this paper can help harden defensive systems by letting cyber defenders address the most critical and real issues, and can benefit researchers by allowing them to devise novel forms of adversarial attacks based on realistic threat models.

Date: 17 Jun 2021

#### Differentially Private Hamiltonian Monte Carlo

Authors: Ossi Räisä, Antti Koskela, Antti Honkela

Abstract: Markov chain Monte Carlo (MCMC) algorithms have long been the main workhorses of Bayesian inference. Among them, Hamiltonian Monte Carlo (HMC) has recently become very popular due to its efficiency resulting from effective use of the gradients of the target distribution. In privacy-preserving machine learning, differential privacy (DP) has become the gold standard in ensuring that the privacy of data subjects is not violated. Existing DP MCMC algorithms either use random-walk proposals, or do not use the Metropolis-Hastings (MH) acceptance test to ensure convergence without decreasing their step size to zero. We present a DP variant of HMC using the MH acceptance test that builds on a recently proposed DP MCMC algorithm called the penalty algorithm, and adds noise to the gradient evaluations of HMC. We prove that the resulting algorithm converges to the correct distribution, and is ergodic. We compare DP-HMC with the existing penalty, DP-SGLD and DP-SGNHT algorithms, and find that DP-HMC has better or equal performance than the penalty algorithm, and performs more consistently than DP-SGLD or DP-SGNHT.

Comment: 18 pages, 3 figures

Date: 17 Jun 2021
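
The idea of adding noise to HMC's gradient evaluations, mentioned in the abstract above, might look roughly like the sketch below. It is not the authors' algorithm: the toy model (unit-variance Gaussian likelihood with unknown mean), the names `noisy_grad` and `noisy_leapfrog`, and the parameters `clip` and `sigma` are assumptions for illustration. The MH acceptance test based on the penalty algorithm and any privacy accounting are deliberately left out.

```python
import random


def noisy_grad(theta, data, clip, sigma, rng):
    """Sum of clipped per-example log-likelihood gradients plus Gaussian noise.

    Assumed toy model: each x ~ N(theta, 1), so d/dtheta log p(x | theta) = x - theta.
    """
    total = 0.0
    for x in data:
        g = x - theta
        total += max(-clip, min(clip, g))  # bound each example's contribution
    return total + rng.gauss(0.0, sigma * clip)


def noisy_leapfrog(theta, momentum, data, step, n_steps, clip, sigma, rng):
    """Leapfrog integration of the HMC dynamics using noisy gradients."""
    momentum += 0.5 * step * noisy_grad(theta, data, clip, sigma, rng)
    for i in range(n_steps):
        theta += step * momentum
        if i < n_steps - 1:  # full momentum steps between position updates
            momentum += step * noisy_grad(theta, data, clip, sigma, rng)
    momentum += 0.5 * step * noisy_grad(theta, data, clip, sigma, rng)
    return theta, momentum


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(1.0, 1.0) for _ in range(100)]
    print(noisy_leapfrog(0.0, rng.gauss(0.0, 1.0), data, 0.01, 20, 5.0, 1.0, rng))
```

Clipping bounds how much any single record can change the gradient sum, which is what lets Gaussian noise of a fixed scale mask an individual's contribution. A complete sampler would still need an acceptance step and a privacy budget calculation.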

#### Large Scale Private Learning via Low-rank Reparametrization

Authors: Da Yu, Huishuai Zhang, Wei Chen, Jian Yin, Tie-Yan Liu

Abstract: We propose a reparametrization scheme to address the challenges of applying differentially private SGD on large neural networks, which are 1) the huge memory cost of storing individual gradients, 2) the added noise suffering notorious dimensional dependence. Specifically, we reparametrize each weight matrix with two *gradient-carrier* matrices of small dimension and a *residual weight* matrix. We argue that such reparametrization keeps the forward/backward process unchanged while enabling us to compute the projected gradient without computing the gradient itself. To learn with differential privacy, we design *reparametrized gradient perturbation (RGP)* that perturbs the gradients on gradient-carrier matrices and reconstructs an update for the original weight from the noisy gradients. Importantly, we use historical updates to find the gradient-carrier matrices, whose optimality is rigorously justified under linear regression and empirically verified with deep learning tasks. RGP significantly reduces the memory cost and improves the utility. For example, we are the first able to apply differential privacy on the BERT model and achieve an average accuracy of $83.9\%$ on four downstream tasks with $\epsilon=8$, which is within $5\%$ loss compared to the non-private baseline but enjoys much lower privacy leakage risk.

Comment: Published as a conference paper in International Conference on Machine Learning (ICML 2021). Source code available at https://github.com/MSRA-COLT-Group/Differentially-Private-Deep-Learning

Date: 17 Jun 2021
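
The claim in the abstract above, that the reparametrization leaves the forward pass unchanged and yields projected gradients without forming the full gradient, can be checked on a single linear layer. The sketch below is an illustration under assumptions, not the released implementation: it uses a squared-error loss, random carrier matrices (the paper derives them from historical updates), and a generic clip-and-noise step. The names `carrier_gradients` and `clip_and_noise` are hypothetical, and the reconstruction of the weight update is omitted.

```python
import numpy as np


def carrier_gradients(L, R, W_res, x, y):
    """Forward pass and carrier gradients for loss 0.5 * ||(L R + W_res) x - y||^2.

    W_res is the residual weight, treated as a constant. The full gradient
    G = err x^T is never built: grad_L equals G R^T and grad_R equals L^T G.
    """
    out = L @ (R @ x) + W_res @ x
    err = out - y
    grad_L = np.outer(err, R @ x)    # shape (d_out, r)
    grad_R = np.outer(L.T @ err, x)  # shape (r, d_in)
    return out, grad_L, grad_R


def clip_and_noise(grad, clip_norm, sigma, rng):
    """Generic Gaussian perturbation (an assumption, not the paper's exact step)."""
    scale = min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
    return grad * scale + rng.normal(scale=sigma * clip_norm, size=grad.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(6, 5))
    L, R = rng.normal(size=(6, 2)), rng.normal(size=(2, 5))
    x, y = rng.normal(size=5), rng.normal(size=6)
    out, grad_L, grad_R = carrier_gradients(L, R, W - L @ R, x, y)
    print(np.allclose(out, W @ x))  # forward pass matches the original layer
    print(clip_and_noise(grad_L, 1.0, 1.0, rng).shape)
```

Because the carriers have rank `r` much smaller than the weight dimensions, the per-example gradients that need to be stored and perturbed are far smaller than the full weight gradient, which is the memory and noise-dimension saving the abstract refers to.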

#### Blockchain Oracle Design Patterns

Authors: Amirmohammad Pasdar, Zhongli Dong, Young Choon Lee

Abstract: Blockchain is a form of distributed ledger technology (DLT) where data is shared among users connected over the internet. Transactions are data state changes on the blockchain that are permanently recorded in a secure and transparent way without the need of a third party. Besides, the introduction of smart contracts to the blockchain has added programmability to the blockchain and revolutionized the software ecosystem leading toward decentralized applications (DApps) attracting businesses and organizations to employ this technology. Although promising, blockchains and smart contracts have no access to the external systems (i.e., off-chain) where real-world data and events reside; consequently, the usability of smart contracts in terms of performance and programmability would be limited to the on-chain data. Hence, *blockchain oracles* are introduced to mitigate the issue and are defined as trusted third-party services that send and verify the external information (i.e., feedback) and submit it to smart contracts for triggering state changes in the blockchain. In this paper, we will study and analyze blockchain oracles with regard to how they provide feedback to the blockchain and smart contracts. We classify the blockchain oracle techniques into two major groups: voting-based strategies and reputation-based ones. The former mainly relies on participants' stakes for outcome finalization while the latter considers reputation in conjunction with authenticity proof mechanisms for data correctness and integrity. We then provide a structured description of patterns in detail for each classification and discuss research directions in the end.

Date: 17 Jun 2021

#### Invisible for both Camera and LiDAR: Security of Multi-Sensor Fusion based Perception in Autonomous Driving Under Physical-World Attacks

Authors: Yulong Cao*, Ningfei Wang*, Chaowei Xiao*, Dawei Yang*, Jin Fang, Ruigang Yang, Qi Alfred Chen, Mingyan Liu, Bo Li

Comment: Accepted by IEEE S&P 2021

Date: 17 Jun 2021

#### Localized Uncertainty Attacks

Authors: Ousmane Amadou Dia, Theofanis Karaletsos, Caner Hazirbas, Cristian Canton Ferrer, Ilknur Kaynar Kabul, Erik Meijer

Abstract: The susceptibility of deep learning models to adversarial perturbations has stirred renewed attention in adversarial examples resulting in a number of attacks. However, most of these attacks fail to encompass a large spectrum of adversarial perturbations that are imperceptible to humans. In this paper, we present localized uncertainty attacks, a novel class of threat models against deterministic and stochastic classifiers. Under this threat model, we create adversarial examples by perturbing only regions in the inputs where a classifier is uncertain. To find such regions, we utilize the predictive uncertainty of the classifier when the classifier is stochastic or, we learn a surrogate model to amortize the uncertainty when it is deterministic. Unlike $\ell_p$ ball or functional attacks which perturb inputs indiscriminately, our targeted changes can be less perceptible. When considered under our threat model, these attacks still produce strong adversarial examples; with the examples retaining a greater degree of similarity with the inputs.

Comment: CVPR 2021 Workshop on Adversarial Machine Learning in Computer Vision

Date: 17 Jun 2021