Apple didn't engage with the infosec world on CSAM scanning – so get used to a slow drip feed of revelations

Starting with: Neural net already found in iOS, hash collisions, and more

Apple's system to scan iCloud-bound photos on iOS devices to find illegal child sexual abuse material (CSAM) is supposed to ship in iOS 15 later this year.

However, the NeuralHash machine-learning model involved in that process appears to have been present on iOS devices at least since the December 14, 2020 release of iOS 14.3. It has been adapted to run on macOS 11.3 or later using the API in Apple's Vision framework. And thus exposed to the world, it has been probed by the curious.

In the wake of Apple's initial child safety announcement two weeks ago, several developers have explored Apple's private NeuralHash API and provided a Python script to convert the model into a convenient format – the Open Neural Network Exchange (ONNX) – for experimentation.
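For those following along at home, driving the converted model from Python looks roughly like this – a minimal sketch that assumes the network has already been extracted from a device and exported to ONNX. The file names, the 360x360 input size, the [-1, 1] normalization, and the 96x128 projection matrix below are assumptions on our part, not an interface Apple documents:

import numpy as np
import onnxruntime
from PIL import Image

def neural_hash(image_path, model_path="NeuralHash.onnx", seed_path="neuralhash_seed.dat"):
    # Preprocess: RGB, resize to 360x360, scale pixels to [-1, 1], NCHW layout
    img = Image.open(image_path).convert("RGB").resize((360, 360))
    arr = (np.asarray(img, dtype=np.float32) / 255.0 * 2.0 - 1.0).astype(np.float32)
    arr = arr.transpose(2, 0, 1)[np.newaxis, ...]
    # Run the converted network to get an embedding vector
    session = onnxruntime.InferenceSession(model_path)
    embedding = session.run(None, {session.get_inputs()[0].name: arr})[0].flatten()
    # Project the embedding through an assumed 96x128 seed matrix and threshold
    # at zero, yielding a 96-bit hash rendered as 24 hex characters
    seed = np.fromfile(seed_path, dtype=np.float32).reshape(96, 128)
    bits = "".join("1" if v >= 0 else "0" for v in seed @ embedding)
    return f"{int(bits, 2):024x}"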

On Wednesday, Intel Labs research scientist Cory Cornelius used these resources to create a hash collision – two different images that, when processed by the algorithm, produce the same NeuralHash identifier.

To a degree, that's expected behavior from perceptual hashing, which is designed to compute the same identifier for similar images – the idea being that one shouldn't be able to, say, convert a CSAM image from color to grayscale to evade hash-based detection.

This raised the possibility of 'poisoned' images that looked harmless, but triggered as child sexual abuse media

As Apple explains in its technical summary [PDF], "Only another image that appears nearly identical can produce the same number; for example, images that differ in size or transcoded quality will still have the same NeuralHash value."

But in this instance, the matching hashes come from completely dissimilar images – a beagle and a variegated gray square. That finding amplifies ongoing concern that Apple's child safety technology could be abused to harm innocent users – for instance, by sending someone an innocuous-looking image that is wrongly flagged as CSAM.
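Reproducing such a collision claim is straightforward once the model runs outside Apple's stack: hash both images with the hypothetical neural_hash() helper sketched earlier and compare the output. The file names here are stand-ins.

h_one = neural_hash("beagle.png")        # stand-in file names
h_two = neural_hash("gray_square.png")
print(h_one, h_two, h_one == h_two)      # a collision means identical hash strings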

Apple has said its threshold is set to ensure "an extremely low (1 in 1 trillion) probability of incorrectly flagging a given account," but as Matthew Green, associate professor of computer science at Johns Hopkins, observed via Twitter, Apple's statistics don't cover the possibility of "deliberately-constructed false positives."

"It was always fairly obvious that in a perceptual hash function like Apple’s, there were going to be 'collisions' — very different images that produced the same hash," said Green in reference to the collision demo. "This raised the possibility of 'poisoned' images that looked harmless, but triggered as child sexual abuse media."

Jonathan Mayer, assistant professor of computer science and public affairs at Princeton University, told The Register that this does not mean that Apple's NeuralHash image matching scheme is broken.

"That would be a reasonable response if NeuralHash were a cryptographic hash function," explained Mayer. "But it's a perceptual hash function, with very different properties."

With cryptographic hash functions, he said, given one input you're not supposed to be able to find a second input that produces the same output. The formal term for that property is "second-preimage resistance."

"With a perceptual hash function, by comparison, a small change to the input is supposed to produce the same output," said Mayer. "These functions are designed specifically not to have second-preimage resistance."

Mayer said that while he worries the collision proof-of-concept will provoke an overreaction, he is nonetheless concerned. "There is a real security risk here," he said.

Of greatest concern, he said, is an adversarial machine-learning attack that generates images that match CSAM hashes and appear to be possible CSAM during Apple's review process. Apple, he said, can defend against these attacks and, in fact, describes some planned mitigations in its documentation.

Apple, said Mayer, "has both a technical mitigation (running a separate, undisclosed server-side perceptual hash function to check for a match) and a process mitigation (human review). Those mitigations have limits, and they still expose some content, but Apple has clearly thought about this issue."

"I’m less concerned about the attack than some observers, because it presupposes access to known CSAM hashes," said Mayer. "And the most direct way to get those hashes is from source images. So it presupposes an attacker committing a very serious federal felony."

Mayer's objections have more to do with the way Apple handled its child safety announcement, which even the company itself was forced to concede has led to misunderstandings.

"I find it mind boggling that Apple wasn't prepared to discuss this risk, like so many other risks surrounding its new system," said Mayer. "Apple hasn't seriously engaged with the information security community, so we're going to have a slow drip of concerning developments like this, with little context for understanding."

The Register asked Apple to comment, but we expect to hear nothing.

Cupertino comeback

Apple, aware of these developments, reportedly held a call with the press in which the company downplayed the hash collision and cited safeguards such as the operating system's code signing (meant to guarantee the integrity of the NeuralHash model on devices), human review, and a redundant algorithmic check that runs server-side.

Nonetheless, AsuharietYgvar – the pseudonymous individual who made the NeuralHash model available in ONNX format and asked to be identified as "an average concerned citizen" – said Apple was misinforming the public and expressed skepticism about the supposed server-side check.

This is highly questionable because it adds a black box in the detection process, which no one can perform security audits on

"If their claim was true, the collision would appear to no longer be a problem since it's impossible to retrieve the algorithm they are using on the servers," said AsuharietYgvar in a message to The Register. "However, this is highly questionable because it adds a black box in the detection process, which no one can perform security audits on.

"We already know that NeuralHash is not as robust as Apple claimed. Who can believe their secret, non-audited secondary check will be better? Considering that Apple already described their NeuralHash and Private Set Intersection algorithms in detail, it's ironic that eventually they decided to keep the integral parts in secret to combat security researchers. And if I did not make their NeuralHash public, we will never know that the algorithm is that easy to defeat."

"Another real problem is that this system can be easily worked around to store CSAM materials without being detected," AsuharietYgvar continued. "Since the NeuralHash model is public now it's trivial to implement an algorithm which completely changes the hash without introducing visible difference. This will make those materials easily pass the initial on-device check.

"I believe what I did was a firm step against mass surveillance, but certainly this will not be enough. We cannot let Apple's famous 1984 ad become a reality. At least not without a fight." ®
