# PapersCut

A shortcut to recent security papers

### arXiv

#### Towards Simplifying PKI Implementation: Client-Server based Validation of Public Key Certificates

Authors: Diana Berbecaru, Antonio Lioy

Abstract: With real-time certificate validation checking, a public-key-using system that needs to validate a certificate executes a transaction with a specialized validation party. At the end of the transaction the validation party returns an indication of the validity status of the certificate. This paper analyses the public key (PbK) certificate validation service from a practical point of view by describing the implementation of a system that makes use of the Data Validation and Certification Server (DVCS) protocols to provide a certificate validation service to Relying Parties (RPs). However, the system is not restricted to the specified protocol and allows the integration of other validation protocols or mechanisms. Our implementation efforts emphasize the possibility of pursuing a specific RP tradeoff between timeliness, security and computational resource usage via dynamic selection of several configurable options.

Date: 15 Oct 2019
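The validation transaction described in the abstract can be illustrated with a minimal sketch: the relying party submits a certificate's attributes and the validation party returns a status. The function and status strings below are hypothetical illustrations, not part of the DVCS specification.

```python
from datetime import datetime, timezone

def validate_certificate(not_before: datetime, not_after: datetime,
                         revoked: bool, now: datetime) -> str:
    """Hypothetical validation-party check: return a validity status for a
    certificate given its validity window and revocation state."""
    if revoked:
        return "revoked"
    if now < not_before:
        return "not-yet-valid"
    if now > not_after:
        return "expired"
    return "valid"

# Relying-party side: execute one "transaction" with the validation party.
now = datetime(2019, 10, 15, tzinfo=timezone.utc)
status = validate_certificate(
    not_before=datetime(2019, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2020, 1, 1, tzinfo=timezone.utc),
    revoked=False,
    now=now,
)
print(status)  # → valid
```

A real deployment would of course also verify the signature chain and query revocation data (CRL/OCSP); the point here is only the request/response shape of the service.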

#### Cascading Machine Learning to Attack Bitcoin Anonymity

Authors: Francesco Zola, Maria Eguimendia, Jan Lukas Bruse, Raul Orduna Urrutia

Abstract: Bitcoin is a decentralized, pseudonymous cryptocurrency that is one of the most used digital assets to date. Its unregulated nature and inherent anonymity of users have led to a dramatic increase in its use for illicit activities. This calls for the development of novel methods capable of characterizing different entities in the Bitcoin network. In this paper, a method to attack Bitcoin anonymity is presented, leveraging a novel cascading machine learning approach that requires only a few features directly extracted from Bitcoin blockchain data. Cascading, used to enrich entity information with data from previous classifications, led to considerably improved multi-class classification performance with excellent values of Precision close to 1.0 for each considered class. Final models were implemented and compared using different machine learning models and showed significantly higher accuracy compared to their baseline implementation. Our approach can contribute to the development of effective tools for Bitcoin entity characterization, which may assist in uncovering illegal activities.

Comment: 15 pages, 7 figures, 4 tables; presented at the 2019 IEEE International Conference on Blockchain (Blockchain)

Date: 15 Oct 2019
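The cascading step, in which one classifier's predictions are appended as extra features for the next, can be sketched with scikit-learn. The synthetic features, labels, and choice of random forests below are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))             # stand-in for blockchain-derived features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # stand-in entity labels

# Stage 1: a base classifier on the raw features.
stage1 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Enrich: append stage-1 class probabilities to the original features,
# then train stage 2 on the enriched representation.
X_enriched = np.hstack([X, stage1.predict_proba(X)])
stage2 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_enriched, y)
print(stage2.score(X_enriched, y))
```

In practice the stage-1 predictions fed to stage 2 should come from out-of-fold cross-validation rather than from the training data itself, to avoid leakage; this sketch omits that for brevity.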

#### Automated Ransomware Behavior Analysis: Pattern Extraction and Early Detection

Authors: Qian Chen, Sheikh Rabiul Islam, Henry Haswell, Robert A. Bridges

Abstract: Security operation centers (SOCs) typically use a variety of tools to collect large volumes of host logs for detection and forensics of intrusions. Our experience, supported by recent user studies on SOC operators, indicates that operators spend ample time (e.g., hundreds of man-hours) on investigations into logs seeking adversarial actions. Similarly, reconfiguration of tools to adapt detectors for future similar attacks is commonplace upon gaining novel insights (e.g., through internal investigation or shared indicators). This paper presents an automated malware pattern-extraction and early detection tool, testing three machine learning approaches: TF-IDF (term frequency-inverse document frequency), Fisher's LDA (linear discriminant analysis) and ET (extra trees/extremely randomized trees) that can (1) analyze freshly discovered malware samples in sandboxes and generate dynamic analysis reports (host logs); (2) automatically extract the sequence of events induced by malware given a large volume of ambient (un-attacked) host logs, and the relatively few logs from hosts that are infected with potentially polymorphic malware; (3) rank the most discriminating features (unique patterns) of malware and from the learned behavior detect malicious activity; and (4) allow operators to visualize the discriminating features and their correlations to facilitate malware forensic efforts. To validate the accuracy and efficiency of our tool, we design three experiments and test seven ransomware attacks (i.e., WannaCry, DBGer, Cerber, Defray, GandCrab, Locky, and nRansom). The experimental results show that TF-IDF is the best of the three methods to identify discriminating features, and ET is the most time-efficient and robust approach.

Comment: The 2nd International Conference on Science of Cyber Security - SciSec 2019; Springer's Lecture Notes in Computer Science (LNCS) series

Date: 15 Oct 2019

#### It is high time we let go of the Mersenne Twister

Authors: Sebastiano Vigna

Abstract: When the Mersenne Twister made its first appearance in 1997 it was a powerful example of how linear maps on $\mathbf F_2$ could be used to generate pseudorandom numbers. In particular, the ease with which generators with long periods could be defined gave the Mersenne Twister a large following, in spite of the fact that such long periods are not a measure of quality, and they require a large amount of memory. Even at the time of its publication, several defects of the Mersenne Twister were predictable, but they were somewhat obscured by other interesting properties. Today the Mersenne Twister is the default generator in C compilers, the Python language, the Maple mathematical computation system, and in many other environments. Nonetheless, knowledge accumulated in the last $20$ years suggests that the Mersenne Twister has, in fact, severe defects, and should never be used as a general-purpose pseudorandom number generator. Many of these results are folklore, or are scattered through very specialized literature. This paper surveys these results for the non-specialist, providing new, simple, understandable examples, and it is intended as a guide for the final user, or for language implementors, so that they can make an informed decision about whether to use the Mersenne Twister or not.

Date: 14 Oct 2019
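One well-known consequence of the Mersenne Twister's linearity (separate from the statistical defects the paper surveys, but illustrative of why it is unsuitable wherever unpredictability matters): its tempering transform is invertible, so observing 624 consecutive 32-bit outputs lets anyone clone the generator's full state. Python's `random` module uses MT19937, so this can be demonstrated directly:

```python
import random

def untemper(y: int) -> int:
    """Invert MT19937's output tempering to recover a raw state word."""
    y ^= y >> 18                            # these two steps are self-inverting
    y ^= (y << 15) & 0xEFC60000
    x = y
    for _ in range(5):                      # fixed-point inversion, 7 bits per pass
        x = y ^ ((x << 7) & 0x9D2C5680)
    y = x
    x = y
    for _ in range(3):                      # fixed-point inversion, 11 bits per pass
        x = y ^ (x >> 11)
    return x & 0xFFFFFFFF

rng = random.Random(12345)
observed = [rng.getrandbits(32) for _ in range(624)]

# Rebuild the internal state from the observed outputs and clone the generator.
clone = random.Random()
clone.setstate((3, tuple(untemper(v) for v in observed) + (624,), None))
print(clone.getrandbits(32) == rng.getrandbits(32))  # → True
```

From this point on, `clone` predicts every future output of `rng` exactly, which is why MT should never be used for security-sensitive randomness.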

#### Bridging Information Security and Environmental Criminology Research to Better Mitigate Cybercrime

Authors: Colin C. Ife, Toby Davies, Steven J. Murdoch, Gianluca Stringhini

Abstract: Cybercrime is a complex phenomenon that spans both technical and human aspects. As such, two disjoint areas have been studying the problem from separate angles: the information security community and the environmental criminology one. Despite the large body of work produced by these communities in the past years, the two research efforts have largely remained disjoint, with researchers on one side not benefiting from the advancements proposed by the other. In this paper, we argue that it would be beneficial for the information security community to look at the theories and systematic frameworks developed in environmental criminology to develop better mitigations against cybercrime. To this end, we provide an overview of the research from environmental criminology and how it has been applied to cybercrime. We then survey some of the research proposed in the information security domain, drawing explicit parallels between the proposed mitigations and environmental criminology theories, and presenting some examples of new mitigations against cybercrime. Finally, we discuss the concept of cyberplaces and propose a framework in order to define them. We discuss this as a potential research direction, taking into account both fields of research, in the hope of broadening interdisciplinary efforts in cybercrime research.

Date: 14 Oct 2019

#### Using Lexical Features for Malicious URL Detection -- A Machine Learning Approach

Authors: Apoorva Joshi, Levi Lloyd, Paul Westin, Srini Seethapathy

Abstract: Malicious websites are responsible for a majority of the cyber-attacks and scams today. Malicious URLs are delivered to unsuspecting users via email, text messages, pop-ups or advertisements. Clicking on or crawling such URLs can result in compromised email accounts, launching of phishing campaigns, download of malware, spyware and ransomware, as well as severe monetary losses. A machine learning based ensemble classification approach is proposed to detect malicious URLs in emails, which can be extended to other methods of delivery of malicious URLs. The approach uses static lexical features extracted from the URL string, with the assumption that these features are notably different for malicious and benign URLs. The use of such static features is safer and faster since it does not involve crawling the URLs or blacklist lookups which tend to introduce a significant amount of latency in producing verdicts. The goal of the classification was to achieve high sensitivity, i.e., to detect as many malicious URLs as possible. URL strings tend to be very unstructured and noisy. Hence, bagging algorithms were found to be a good fit for the task since they average out multiple learners trained on different parts of the training data, thus reducing variance. The classification model was tested on five different testing sets and produced an average False Negative Rate (FNR) of 0.1%, average accuracy of 92% and average AUC of 0.98. The model is presently being used in the FireEye Advanced URL Detection Engine (used to detect malicious URLs in emails), to generate fast real-time verdicts on URLs. The malicious URL detections from the engine have gone up by 22% since the deployment of the model into the engine workflow. The results obtained show noteworthy evidence that a purely lexical approach can be used to detect malicious URLs.

Date: 14 Oct 2019
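The lexical-features-plus-bagging pipeline can be sketched as follows. The exact feature set below (length, digit count, dots, hyphens, character entropy) and the toy URLs are assumptions chosen for illustration; the paper's feature set is not reproduced here.

```python
import math
from collections import Counter
import numpy as np
from sklearn.ensemble import BaggingClassifier

def lexical_features(url: str) -> list:
    """Static lexical features computed from the URL string alone,
    with no crawling or blacklist lookups."""
    counts = Counter(url)
    entropy = -sum(c / len(url) * math.log2(c / len(url)) for c in counts.values())
    return [
        len(url),
        sum(ch.isdigit() for ch in url),
        url.count("."),
        url.count("-"),
        entropy,
    ]

urls = ["http://example.com/login",
        "http://paypa1-secure-update.xyz/a8f3k9/verify",
        "https://news.bbc.co.uk/story",
        "http://192.168.2.14/free-prize-click-now"]
labels = [0, 0, 1, 1] if False else [0, 1, 0, 1]  # toy ground truth: 1 = malicious

X = np.array([lexical_features(u) for u in urls])
# Bagging averages many tree learners trained on bootstrap samples,
# reducing variance on noisy, unstructured inputs like URL strings.
clf = BaggingClassifier(n_estimators=10, random_state=0).fit(X, labels)
print(clf.predict(X))
```

With four samples this is only a shape demonstration; the variance-reduction argument for bagging only pays off on a real corpus of labeled URLs.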

#### Homomorphic Encryption based on Hidden Subspace Membership

Authors: Uddipana Dowerah, Srinivasan Krishnaswamy

Abstract: In this paper, we propose a leveled fully homomorphic encryption scheme based on multivariate polynomial evaluation. We first identify a decision problem called the Hidden Subspace Membership (HSM) problem and show that it is related to the well-known Learning with Errors (LWE) problem. We show that an adversary against the LWE problem can be translated into an adversary against the HSM problem and, conversely, that solving the HSM problem is equivalent to solving the LWE problem with multiple secrets. We then show that the security of the proposed scheme relies on the hardness of the Hidden Subspace Membership problem. Further, we propose a batch variant of the scheme where multiple plaintext bits can be packed into a single ciphertext.

Date: 14 Oct 2019
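For readers unfamiliar with LWE, the problem the HSM reduction targets, here is what an LWE instance looks like with toy parameters. These dimensions are far too small to be secure and are chosen only to make the structure visible.

```python
import numpy as np

rng = np.random.default_rng(1)
q, n, m = 97, 8, 16                  # toy modulus and dimensions, insecure by design
s = rng.integers(0, q, size=n)       # secret vector
A = rng.integers(0, q, size=(m, n))  # public uniformly random matrix
e = rng.integers(-2, 3, size=m)      # small error vector
b = (A @ s + e) % q                  # LWE samples are the pairs (A, b)

# Without e, s would be recoverable from (A, b) by linear algebra over Z_q.
# The small noise is what makes distinguishing (A, b) from a uniformly
# random pair believed to be hard.
print(b.shape)
```

The multiple-secrets variant mentioned in the abstract replaces the vector `s` with a matrix of secrets, giving one `b`-column per secret under the same `A`.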

#### Using AI/ML to gain situational understanding from passive network observations

Authors: D. Verma, S. Calo

Abstract: The data available in the network traffic from any Government building contains a significant amount of information. An analysis of the traffic can yield insights and situational understanding about what is happening in the building. However, the use of traditional network packet inspection, either deep or shallow, is useful for only a limited understanding of the environment, with applicability limited to some aspects of network and security management. If we use AI/ML based techniques to understand the network traffic, we can gain significant insights which increase our situational awareness of what is happening in the environment. At IBM, we have created a system which uses a combination of network domain knowledge and machine learning techniques to convert network traffic into actionable insights about the on-premise environment. These insights include characterization of the communicating devices, discovering unauthorized devices that may violate policy requirements, identifying hidden components and vulnerability points, detecting leakage of sensitive information, and identifying the presence of people and devices. In this paper, we will describe the overall design of this system, the major use-cases that have been identified for it, and the lessons learnt when deploying this system for some of those use-cases.

Comment: Presented at AAAI FSS-19: Artificial Intelligence in Government and Public Sector, Arlington, Virginia, USA

Date: 14 Oct 2019

#### Real-world attack on MTCNN face detection system

Authors: Edgar Kaziakhmedov, Klim Kireev, Grigorii Melnikov, Mikhail Pautov, Aleksandr Petiushko

Abstract: Recent studies have shown that deep learning approaches achieve remarkable results on the face detection task. On the other hand, these advances gave rise to a new problem associated with the security of deep convolutional neural network models, unveiling potential risks of DCNN-based applications. Even minor input changes in the digital domain can result in the network being fooled. It was then shown that some deep learning-based face detectors are prone to adversarial attacks not only in the digital domain but also in the real world. In this paper, we investigate the security of the well-known cascaded CNN face detection system, MTCNN, and introduce an easily reproducible and robust way to attack it. We propose different face attributes printed on an ordinary black-and-white printer and attached either to a medical face mask or to the face directly. Our approach is capable of breaking the MTCNN detector in a real-world scenario.

Date: 14 Oct 2019

#### Confidence-Calibrated Adversarial Training: Towards Robust Models Generalizing Beyond the Attack Used During Training

Authors: David Stutz, Matthias Hein, Bernt Schiele

Abstract: Adversarial training is the standard approach for training models robust to adversarial examples. However, especially for complex datasets, adversarial training incurs a significant loss in accuracy and is known to generalize poorly to stronger attacks, e.g., larger perturbations or other threat models. In this paper, we introduce confidence-calibrated adversarial training (CCAT) where the key idea is to enforce that the confidence on adversarial examples decays with their distance to the attacked examples. We show that CCAT better preserves the accuracy of normal training while robustness against adversarial examples is achieved via confidence thresholding. Most importantly, in strong contrast to adversarial training, the robustness of CCAT generalizes to larger perturbations and other threat models not encountered during training. We also discuss our extensive work to design strong adaptive attacks against CCAT and standard adversarial training, which is of independent interest. We present experimental results on MNIST, SVHN and CIFAR10.

Date: 14 Oct 2019
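The confidence-thresholding step at the heart of CCAT's inference can be sketched in a few lines: inputs whose top-class confidence falls below a threshold are rejected rather than classified. This is a sketch of the general idea only, not the authors' implementation; the threshold value and probabilities below are invented.

```python
import numpy as np

def predict_with_rejection(probs: np.ndarray, tau: float = 0.7) -> np.ndarray:
    """Classify each row of softmax probabilities, but reject (return -1)
    any input whose maximum confidence is below the threshold tau."""
    conf = probs.max(axis=1)
    preds = probs.argmax(axis=1)
    return np.where(conf >= tau, preds, -1)

probs = np.array([[0.95, 0.05],    # a confident prediction on a clean input
                  [0.55, 0.45]])   # near-uniform, e.g. a far-off perturbation
print(predict_with_rejection(probs))
```

Because CCAT trains the model so that confidence decays with distance from the data, adversarial examples tend to land in the near-uniform regime and are filtered out by exactly this kind of threshold.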