The GAZEploit attack consists of two parts, says Zhan, one of the lead researchers. First, the researchers created a way to identify when someone wearing the Vision Pro is typing by analyzing the 3D avatar they share. To do this, they trained a recurrent neural network, a type of deep learning model, on recordings of 30 avatars as their owners completed a variety of typing tasks.
According to the researchers, when someone types using the Vision Pro, their gaze fixates on the key they are about to press, then quickly moves on to the next key. “When we’re typing, there are some regular patterns in our gaze,” says Zhan.
Wang says these patterns are more common during typing than when someone wearing the headset is browsing a website or watching a video. “During tasks such as gaze typing, you are more focused, so you blink less frequently,” Wang says. In short, looking at a QWERTY keyboard and moving between the letters is a distinctive behavior.
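The fixate-then-jump pattern described above can be illustrated with a simple dispersion-threshold fixation detector. This is only a sketch under stated assumptions: the researchers trained a recurrent neural network to recognize typing sessions, and the code below swaps in a much simpler classical technique to show what a fixation looks like in gaze data. The function names, the normalized gaze coordinates, and the thresholds are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _dispersion(points: List[Point]) -> float:
    """Width plus height of the bounding box around the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(
    gaze: List[Point],
    max_dispersion: float = 0.05,  # assumed threshold, in normalized units
    min_samples: int = 5,          # assumed minimum fixation length
) -> List[Tuple[int, int, Point]]:
    """Return (start_index, end_index, centroid) for each fixation.

    A fixation is a run of at least min_samples consecutive gaze samples
    whose bounding box stays within max_dispersion.
    """
    fixations = []
    start = 0
    n = len(gaze)
    while start + min_samples <= n:
        end = start + min_samples  # exclusive end of the candidate window
        if _dispersion(gaze[start:end]) > max_dispersion:
            start += 1  # gaze is still moving; slide the window forward
            continue
        # Grow the window while the gaze stays within the threshold.
        while end < n and _dispersion(gaze[start:end + 1]) <= max_dispersion:
            end += 1
        window = gaze[start:end]
        cx = sum(p[0] for p in window) / len(window)
        cy = sum(p[1] for p in window) / len(window)
        fixations.append((start, end - 1, (cx, cy)))
        start = end
    return fixations


# Hypothetical gaze trace: six samples on one spot, then six on another.
trace = [(0.0, 0.0)] * 6 + [(1.0, 1.0)] * 6
fixations = detect_fixations(trace)
```

On this toy trace the detector reports two fixations, one per dwell point. Real gaze data is noisy, which is one reason a learned model is a more plausible choice for the actual attack.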
The second part of the research uses geometric calculations to work out where someone has placed the keyboard and how large they have made it, Zhan explains. “The only requirement is that we capture enough gaze information to accurately reconstruct the keyboard; then every subsequent keystroke can be detected.”
By combining these two elements, the researchers were able to predict the keys someone was likely typing. In a series of lab tests, they had no knowledge of the victim’s typing habits or speed, or of where the keyboard was placed. Even so, within a maximum of five guesses, the researchers could predict the correct letters with 92.1 percent accuracy for messages, 77 percent for passwords, 73 percent for PINs, and 86.1 percent for emails, URLs, and web pages. (On the first guess, the letters were correct 35 to 59 percent of the time, depending on the kind of information being recovered.) Duplicate letters and typos add further challenges.
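The idea of ranking several candidate keys per keystroke can be sketched as follows. This is an illustration, not the researchers’ method: it assumes the keyboard’s position and key size have already been recovered and are passed in as hypothetical `origin` and `key_size` values, it models only the three letter rows of a QWERTY layout, and the row stagger in `ROW_OFFSETS` is an assumption.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
ROW_OFFSETS = [0.0, 0.25, 0.75]  # assumed horizontal stagger, in key widths


def key_centers(origin: Point, key_size: float) -> Dict[str, Point]:
    """Place each letter key on a plane, given the center of the 'q' key
    (origin) and the key pitch (key_size). Both values are hypothetical
    stand-ins for a keyboard recovered from gaze data."""
    ox, oy = origin
    centers = {}
    for row, (letters, offset) in enumerate(zip(QWERTY_ROWS, ROW_OFFSETS)):
        for col, ch in enumerate(letters):
            centers[ch] = (ox + (col + offset) * key_size, oy + row * key_size)
    return centers


def top_k_keys(fixation: Point, centers: Dict[str, Point], k: int = 5) -> List[str]:
    """Rank keys by distance from the fixation point and keep the k nearest."""
    return sorted(centers, key=lambda ch: math.dist(fixation, centers[ch]))[:k]


centers = key_centers(origin=(0.0, 0.0), key_size=1.0)
# A fixation that lands slightly to the right of the 'g' key.
guesses = top_k_keys((4.6, 1.1), centers)
```

Here the nearest key, `g`, is the first guess, and its neighbors fill the remaining slots. Keeping several candidates per keystroke is what makes a five-guess accuracy figure higher than a first-guess one.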
“Knowing where someone is looking is very powerful,” says Alexandra Papoutsaki, an associate professor of computer science at Pomona College who has studied eye tracking for years and reviewed the GAZEploit research for WIRED.
Papoutsaki says what makes this work stand out is that it relies solely on the video feed of someone’s persona, rather than requiring a hacker to get hold of someone’s headset and try to access its eye-tracking data. This, she says, makes it a more “realistic” space for an attack to occur. “The fact that someone could potentially expose their actions just by streaming a persona is where the vulnerability becomes even more important,” Papoutsaki says.
Although the attack was created in a laboratory setting and has not been used against anyone using a persona in the real world, the researchers say there are ways hackers could have exploited the data leak. At least in theory, they say, a criminal could share a file with a victim during a Zoom call, prompting the victim to log in to, say, a Google or Microsoft account. The attacker could then record the persona while the target logs in and use the attack method to recover the password and access the account.
Quick Fix
The GAZEploit researchers reported their findings to Apple in April and subsequently sent the company their proof-of-concept code so the attack could be reproduced. Apple fixed the flaw in a Vision Pro software update in late July, which stops the sharing of a persona while someone is using the virtual keyboard.
An Apple spokesperson confirmed that the company has fixed the vulnerability and said it was addressed in VisionOS 1.3. The company’s software update notes do not mention the fix, but its security-specific notes provide more details. The researchers say Apple has assigned CVE-2024-40865 to the vulnerability, and they recommend that people download the latest software update.