PapersCutA shortcut to recent security papers

Arxiv

DPCOVID: Privacy-Preserving Federated Covid-19 Detection

Authors: Trang-Thi Ho, Yennun-Huang

Abstract: Coronavirus (COVID-19) has shown an unprecedented global crisis by the detrimental effect on the global economy and health. The number of COVID-19 cases has been rapidly increasing, and there is no sign of stopping. It leads to a severe shortage of test kits and accurate detection models. A recent study demonstrated that the chest X-ray radiography outperformed laboratory testing in COVID-19 detection. Therefore, using chest X-ray radiography analysis can help to screen suspected COVID-19 cases at an early stage. Moreover, the patient data is sensitive, and it must be protected to avoid revealing through model updates and reconstruction from the malicious attacker. In this paper, we present a privacy-preserving Federated Learning system for COVID-19 detection based on chest X-ray images. First, a Federated Learning system is constructed from chest X-ray images. The main idea is to build a decentralized model across multiple hospitals without sharing data among hospitals. Second, we first show that the accuracy of Federated Learning for COVID-19 identification reduces significantly for Non-IID data. We then propose a strategy to improve model's accuracy on Non-IID COVID-19 data by increasing the total number of clients, parallelism (client fraction), and computation per client. Finally, we apply a Differential Privacy Stochastic Gradient Descent (DP-SGD) to enhance the preserving of patient data privacy for our Federated Learning model. A strategy is also proposed to keep the robustness of Federated Learning to ensure the security and accuracy of the model.

Comment: 7 pages, 8 Figures, 4 Tables

Date: 26 Oct 2021

Measuring the Effectiveness of Digital Hygiene using Historical DNS Data

Authors: Oliver Farnan, Gregory Walton, Joss Wright

Abstract: This paper describes an ongoing experiment evaluating the efficacy of a digital safety intervention in six high-risk, low capacity Civil Society Organisations (CSOs) in Central Asia. The evaluation takes the form of statistical analysis of DNS traffic in each organisation, obtained via security tools installed by researchers. The hypothesis is that the digital safety intervention strengthens the overall digital security posture of the CSOs, as measured by number of malware attacks intercepted by a cloud-based DNS firewall installed on the CSOs networks. The research collects DNS traffic from CSOs that are participating in the digital safety intervention, and compares a treatment group consisting of four CSOs against DNS traffic from a second group of two CSOs in which the intervention has not yet taken place. This project is ongoing, with data collection underway at a number of Central Asian CSOs. In this paper we outline the experimental design of the project, and look at the early data coming out of the DNS firewall. This is done to support the ultimate question of whether DNS data such as this can be used to accurately assess the efficacy of digital hygiene efforts.

Date: 26 Oct 2021

Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes

Authors: Sanghyun Hong, Michael-Andrei Panaitescu-Liess, Yiğitcan Kaya, Tudor Dumitraş

Abstract: Quantization is a popular technique that $transforms$ the parameter representation of a neural network from floating-point numbers into lower-precision ones ($e.g.$, 8-bit integers). It reduces the memory footprint and the computational cost at inference, facilitating the deployment of resource-hungry models. However, the parameter perturbations caused by this transformation result in $behavioral$ $disparities$ between the model before and after quantization. For example, a quantized model can misclassify some test-time samples that are otherwise classified correctly. It is not known whether such differences lead to a new security vulnerability. We hypothesize that an adversary may control this disparity to introduce specific behaviors that activate upon quantization. To study this hypothesis, we weaponize quantization-aware training and propose a new training framework to implement adversarial quantization outcomes. Following this framework, we present three attacks we carry out with quantization: (i) an indiscriminate attack for significant accuracy loss; (ii) a targeted attack against specific samples; and (iii) a backdoor attack for controlling the model with an input trigger. We further show that a single compromised model defeats multiple quantization schemes, including robust quantization techniques. Moreover, in a federated learning scenario, we demonstrate that a set of malicious participants who conspire can inject our quantization-activated backdoor. Lastly, we discuss potential counter-measures and show that only re-training consistently removes the attack artifacts. Our code is available at https://github.com/Secure-AI-Systems-Group/Qu-ANTI-zation

Comment: Accepted to NeurIPS 2021 [Poster]

Date: 26 Oct 2021

SEDML: Securely and Efficiently Harnessing Distributed Knowledge in Machine Learning

Authors: Yansong Gao, Qun Li, Yifeng Zheng, Guohong Wang, Jiannan Wei, Mang Su

Abstract: Training high-performing deep learning models require a rich amount of data which is usually distributed among multiple data sources in practice. Simply centralizing these multi-sourced data for training would raise critical security and privacy concerns, and might be prohibited given the increasingly strict data regulations. To resolve the tension between privacy and data utilization in distributed learning, a machine learning framework called private aggregation of teacher ensembles(PATE) has been recently proposed. PATE harnesses the knowledge (label predictions for an unlabeled dataset) from distributed teacher models to train a student model, obviating access to distributed datasets. Despite being enticing, PATE does not offer protection for the individual label predictions from teacher models, which still entails privacy risks. In this paper, we propose SEDML, a new protocol which allows to securely and efficiently harness the distributed knowledge in machine learning. SEDML builds on lightweight cryptography and provides strong protection for the individual label predictions, as well as differential privacy guarantees on the aggregation results. Extensive evaluations show that while providing privacy protection, SEDML preserves the accuracy as in the plaintext baseline. Meanwhile, SEDML's performance in computing and communication is 43 times and 1.23 times higher than the latest technology, respectively.

Date: 26 Oct 2021

Wavelet: Code-based postquantum signatures with fast verification on microcontrollers

Authors: Gustavo Banegas, Thomas Debris-Alazard, Milena Nedeljković, Benjamin Smith

Abstract: This work presents the first full implementation of Wave, a postquantum code-based signature scheme. We define Wavelet, a concrete Wave scheme at the 128-bit classical security level (or NIST postquantum security Level 1) equipped with a fast verification algorithm targeting embedded devices. Wavelet offers 930-byte signatures, with a public key of 3161 kB. We include implementation details using AVX instructions, and on ARM Cortex-M4, including a solution to deal with Wavelet's large public keys, which do not fit in the SRAM of a typical embedded device. Our verification algorithm is $\approx 4.65 \times$ faster then the original, and verifies in 1 087 538 cycles using AVX instructions, or 13 172 ticks in an ARM Cortex-M4.

Date: 26 Oct 2021

Precise URL Phishing Detection Using Neural Networks

Authors: Aman Rangapur, Dr Ajith Jubilson

Abstract: With the development of the Internet, ways of obtaining important data such as passwords and logins or sensitive personal data have increased. One of the ways to extract such information is page impersonation, also called phishing. Such websites do not provide service but collect sensitive details from the user. Here, we present you with ways to detect such malicious URLs with state of art accuracy with neural networks. Different from previous works, where web content, URL or traffic statistics are examined, we analyse only the URL text, making it faster and which detects zero-day attacks. The network is optimised and can be used even on small devices such as Ras-Pi without a change in performance.

Comment: 10 pages, 9 figures

Date: 26 Oct 2021

Semantic Host-free Trojan Attack

Authors: Haripriya Harikumar, Kien Do, Santu Rana, Sunil Gupta, Svetha Venkatesh

Abstract: In this paper, we propose a novel host-free Trojan attack with triggers that are fixed in the semantic space but not necessarily in the pixel space. In contrast to existing Trojan attacks which use clean input images as hosts to carry small, meaningless trigger patterns, our attack considers triggers as full-sized images belonging to a semantically meaningful object class. Since in our attack, the backdoored classifier is encouraged to memorize the abstract semantics of the trigger images than any specific fixed pattern, it can be later triggered by semantically similar but different looking images. This makes our attack more practical to be applied in the real-world and harder to defend against. Extensive experimental results demonstrate that with only a small number of Trojan patterns for training, our attack can generalize well to new patterns of the same Trojan class and can bypass state-of-the-art defense methods.

Date: 26 Oct 2021

Task-Aware Meta Learning-based Siamese Neural Network for Classifying Obfuscated Malware

Authors: Jinting Zhu, Julian Jang-Jaccard, Amardeep Singh, Paul A. Watters, Seyit Camtepe

Abstract: Malware authors apply different obfuscation techniques on the generic feature of malware (i.e., unique malware signature) to create new variants to avoid detection. Existing Siamese Neural Network (SNN) based malware detection methods fail to correctly classify different malware families when similar generic features are shared across multiple malware variants resulting in high false-positive rates. To address this issue, we propose a novel Task-Aware Meta Learning-based Siamese Neural Network resilient against obfuscated malware while able to detect malware trained with one or a few training samples. Using entropy features of each malware signature alongside image features as task inputs, our task-aware meta leaner generates the parameters for the feature layers to more accurately adjust the feature embedding for different malware families. In addition, our model utilizes meta-learning with the extracted features of a pre-trained network (e.g., VGG-16) to avoid the bias typically associated with a model trained with a limited number of training samples. Our proposed approach is highly effective in recognizing unique malware signatures, thus correctly classifying malware samples that belong to the same malware family even in the presence of obfuscation technique applied to malware. Our experimental results, validated with N-way on N-shot learning, show that our model is highly effective in classification accuracy exceeding the rate>91% compared to other similar methods.

Date: 26 Oct 2021

VLSI Implementation of Cryptographic Algorithms & Techniques: A Literature Review

Abstract: Through the years, the flow of Data and its transmission have increased tremendously and so has the security issues to it. Cryptography in recent years with the advancement of VLSI has led to its implementation of Encryption and Decryption techniques, where the process of translating and converting plaintext into cypher text and vice versa is made possible. In this paper, the review of various aspects of VLSI's implementation of encryption and decryption are covered. To systemize the material, the information about topics such as Private Key Encryption, Index Technique, Blowfish Algorithm, DNA cryptography, and many more are reviewed. Ultimately, with this review, the basic understanding of different VLSI techniques of Encryption and Decryption can be studied and implemented.

Date: 26 Oct 2021

Exploring eFPGA-based Redaction for IP Protection

Authors: Jitendra Bhandari, Abdul Khader Thalakkattu Moosa, Benjamin Tan, Christian Pilato, Ganesh Gore, Xifan Tang, Scott Temple, Pierre-Emmanuel Gaillardon, Ramesh Karri

Abstract: Recently, eFPGA-based redaction has been proposed as a promising solution for hiding parts of a digital design from untrusted entities, where legitimate end-users can restore functionality by loading the withheld bitstream after fabrication. However, when deciding which parts of a design to redact, there are a number of practical issues that designers need to consider, including area and timing overheads, as well as security factors. Adapting an open-source FPGA fabric generation flow, we perform a case study to explore the trade-offs when redacting different modules of open-source intellectual property blocks (IPs) and explore how different parts of an eFPGA contribute to the security. We provide new insights into the feasibility and challenges of using eFPGA-based redaction as a security solution.