Python Crypto API Misuses in the Wild: Analyzing Threats to Validity

cover
6 May 2024

Authors:

(1) Anna-Katharina Wickert, Technische Universität Darmstadt, Darmstadt, Germany (wickert@cs.tu-darmstadt.de);

(2) Lars Baumgärtner, Technische Universität Darmstadt, Darmstadt, Germany (baumgaertner@cs.tu-darmstadt.de);

(3) Florian Breitfelder, Technische Universität Darmstadt, Darmstadt, Germany (florian.breitfelder@tu-darmstadt.de);

(4) Mira Mezini, Technische Universität Darmstadt, Darmstadt, Germany (mezini@cs.tu-darmstadt.de).

Abstract and 1 Introduction

2 Background

3 Design and Implementation of Licma and 3.1 Design

3.2 Implementation

4 Methodology and 4.1 Searching and Downloading Python Apps

4.2 Comparison with Previous Studies

5 Evaluation and 5.1 GitHub Python Projects

5.2 MicroPython

6 Comparison with previous studies

7 Threats to Validity

8 Related Work

9 Conclusion, Acknowledgments, and References

7 THREATS TO VALIDITY

We evaluated top GitHub Python projects and it may be that our results fail to generalize on specialized Python applications. For our data set on MicroPython applications, we also concentrated on popular projects. Thus, our insights may not generalize to less popular or closed-source projects. However, we believe that our results provide first interesting insights on crypto misuses in Python.

Currently, our analysis is limited to capabilities of Babelfish, especially the recursive maximum depth of its filter function. Furthermore, currently Babelfish only creates an AST for a single file. Thus, our analysis fails to resolve misuse over multiple files. We hope that these limitations can be lifted through further development of Babelfish. These improvements will hopefully help to reduce the number of false-positives in the potential misuses. Furthermore, it may be that our static analysis missed some misuses as Python is a dynamic typed language.

We compare different application types of studies conducted in different years. Thus, it may be that the results might change when conducted on the same kind of applications now. Further, the results may differ due to the effect of different application domains and different analysis frameworks. Moreover, the percentages of applications with at least one misuse per rule that we used from Zhang et al. [13] might be too positive for C, as the number of firmware images with crypto usages is not explicitly reported.

This paper is available on arxiv under CC BY 4.0 DEED license.