Google's TensorFlow Drops YAML Support Due to Code Execution Flaw

06 September, 2021

TensorFlow, a popular Python-based machine learning and artificial intelligence project developed by Google, has dropped support for YAML to patch a critical code execution vulnerability.

YAML (originally “Yet Another Markup Language,” now the recursive acronym “YAML Ain’t Markup Language”) is a convenient choice among developers looking for a human-readable data serialization language for handling configuration files and data in transit.

Untrusted deserialization vulnerability in TensorFlow

Maintainers behind both TensorFlow and Keras, a wrapper project for TensorFlow, have patched an untrusted deserialization vulnerability that stemmed from unsafe parsing of YAML.

Tracked as CVE-2021-37678, the critical flaw enables attackers to execute arbitrary code when an application deserializes a Keras model provided in the YAML format.

Deserialization vulnerabilities typically occur when an application reads malformed or malicious data originating from untrusted sources.

After an application reads and deserializes the data, it may crash, resulting in a Denial of Service (DoS) condition, or worse, execute the attacker’s arbitrary code.

This YAML deserialization vulnerability, rated a 9.3 in severity, was responsibly reported to TensorFlow maintainers by security researcher Arjun Shibu.


And the source of the flaw, you ask? The notorious “yaml.unsafe_load()” function in TensorFlow code:

**Vulnerable yaml.unsafe_load function call in TensorFlow** (GitHub)

The “unsafe_load” function is known to deserialize YAML data rather liberally: it resolves all tags, “even those known to be unsafe on untrusted input.”

This means that “unsafe_load” should ideally be called only on input that comes from a trusted source and is known to be free of any malicious content.

Should that not be the case, attackers can exploit the deserialization mechanism to execute code of their choice by injecting a malicious payload into the YAML data that is yet to be deserialized.

An example Proof-of-Concept (PoC) exploit shared in the vulnerability advisory demonstrates just this:

from tensorflow.keras import models

payload = '''
!!python/object/new:type
args: ['z', !!python/tuple [], {'extend': !!python/name:exec }]
listitems: "__import__('os').system('cat /etc/passwd')"
'''

models.model_from_yaml(payload)
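
For comparison, the following minimal sketch shows how PyYAML's `yaml.safe_load()` treats a document carrying a Python-specific tag. It is not taken from the advisory: the `load_untrusted` helper and both sample documents are hypothetical, the tagged document is a harmless stand-in for a real payload, and the example assumes the third-party PyYAML package is installed.

```python
import yaml  # PyYAML (third-party); assumed to be installed

# Harmless stand-in for a malicious document: the python/object/apply tag
# asks the loader to call a Python function while parsing.
TAGGED_DOC = "!!python/object/apply:os.getcwd []"

# Ordinary configuration data with no language-specific tags.
PLAIN_DOC = "name: dense_1\nunits: 32\n"


def load_untrusted(text):
    """Parse YAML with the safe loader; return None if it is rejected."""
    try:
        return yaml.safe_load(text)
    except yaml.YAMLError:
        # The safe loader has no constructor for python/* tags, so it
        # refuses the document instead of executing anything.
        return None


print(load_untrusted(PLAIN_DOC))   # plain mappings parse normally
print(load_untrusted(TAGGED_DOC))  # the tagged document is rejected
```

Returning `None` on rejection is only one possible design; an application may prefer to let the exception propagate so that a rejected document is never mistaken for an empty one.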

TensorFlow drops YAML altogether in favor of JSON

After the vulnerability was reported, TensorFlow decided to drop YAML support altogether and use JSON deserialization instead.

“Given that YAML format support requires a significant amount of work, we have removed it for now,” say the project maintainers in the same advisory.

“The methods `Model.to_yaml()` and `keras.models.model_from_yaml` have been replaced to raise a `RuntimeError` as they can be abused to cause arbitrary code execution,” the release notes associated with the fix explain.

“It is recommended to use JSON serialization instead of YAML, or, a better alternative, serialize to H5.”
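
One reason JSON is the safer parsing format is that it has no tag mechanism for naming language-specific objects, so Python's standard `json` module can only produce basic data types. The sketch below illustrates this for the parsing step alone; the config dictionary and the `only_plain_types` helper are made up for illustration and do not reflect the actual Keras model format, and what an application later does with the parsed values is a separate concern.

```python
import json

# A model-like config whose string field merely looks like Python code.
# The structure and field names are hypothetical.
config = {
    "class_name": "Sequential",
    "config": {"name": "__import__('os').getcwd()", "layers": []},
}

PLAIN_TYPES = (dict, list, str, int, float, bool, type(None))


def only_plain_types(value):
    """Return True if value holds nothing but JSON's basic data types."""
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and only_plain_types(item)
            for key, item in value.items()
        )
    if isinstance(value, list):
        return all(only_plain_types(item) for item in value)
    return isinstance(value, PLAIN_TYPES)


text = json.dumps(config)    # serialize to a JSON string
restored = json.loads(text)  # parse it back

print(restored == config)
print(only_plain_types(restored))
print(type(restored["config"]["name"]).__name__)
```

The code-like string comes back as an ordinary `str`; nothing in the parser evaluates it.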

It is worth noting that TensorFlow is not the first or only project found to be using PyYAML’s unsafe_load. The function’s use is rather prevalent in Python projects.


GitHub shows thousands of search results referencing the function, with some developers proposing improvements:

**Many repos on GitHub have used and use YAML’s unsafe load function** (GitHub)

A fix for CVE-2021-37678 is expected to arrive in TensorFlow version 2.6.0 and will also be backported to prior versions 2.5.1, 2.4.3, and 2.3.4, the maintainers state.
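
As a rough illustration, the version list above can be turned into a small check. The `has_yaml_fix` helper is hypothetical and relies only on the release numbers named in this article; it assumes every release from 2.6.0 onward carries the fix and ignores any pre-release suffix such as `rc1`, so the project's advisory remains the authoritative source.

```python
import re

# Fixed patch level per release branch, as listed in the article.
FIXED_PATCH = {(2, 3): 4, (2, 4): 3, (2, 5): 1}
# Assumption: every release from this branch onward includes the fix.
FIRST_FIXED_BRANCH = (2, 6)


def has_yaml_fix(version):
    """Return True if a TensorFlow version string is at a fixed release."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) >= FIRST_FIXED_BRANCH:
        return True
    required = FIXED_PATCH.get((major, minor))
    # Branches not listed in the article are treated as unfixed.
    return required is not None and patch >= required


for candidate in ("2.3.3", "2.4.3", "2.5.0", "2.6.0"):
    print(candidate, has_yaml_fix(candidate))
```

In practice the string passed in would typically come from `tensorflow.__version__`.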
