AI Draws What the Eye Sees

A new technology that can read our minds and draw the visuals we are thinking of!

After creating images from text instructions, generative AI is now diving deep into the intricacies of the human brain. Researchers from Osaka University, Japan, have just developed a new AI-powered algorithm that can interpret signals from our brain to recreate whatever our eyes are seeing. In other words, the technology can read our minds and draw the visuals we are thinking of. As reported in the research paper, the solution has reconstructed around 1,000 images from brain scans, with 80% accuracy. An impressive result indeed!

Stable Diffusion made the difference

Professors Yu Takagi and Shinji Nishimoto of Osaka University used the algorithm to recreate high-resolution images from scans of brain activity taken while the research subjects looked at pictures of several objects. The algorithm processed information gathered from the regions of the brain that are involved in perceiving images, namely the occipital and temporal lobes. The AI system obtained this information from functional magnetic resonance imaging (fMRI) scans of the subjects' brains. fMRI technology can detect variations in blood flow in those regions of the brain that are actively engaged in processing at any moment. It uses sensors to identify high concentrations of oxygen molecules, because neurons that are working hard when the brain is thinking or perceiving consume the most oxygen.

According to the researchers, when we view an image, our temporal lobe registers information about its contents, while the occipital lobe records the layout and perspective of the visual. The information recorded via fMRI scans is then converted into an imitation of the image with the help of artificial intelligence. For this, the researchers used an existing AI solution, Stable Diffusion, developed by the British company Stability AI. Like OpenAI’s DALL-E 2, Stable Diffusion is a text-to-image model that can create imagery based on text inputs.

There have been previous attempts to decode brain scans via AI algorithms, but all earlier efforts relied on large data sets. In contrast, Stable Diffusion was able to achieve the feat with less training – essentially by incorporating captions of images into its algorithm.

High accuracy with less training

Additional training was imparted to the Stable Diffusion algorithm based on an online dataset from the University of Minnesota, which consisted of brain scans from four participants – each of whom viewed 10,000 pictures.

The training involved connecting additional text descriptions of thousands of photos to brain patterns that were recorded when the same images were observed by the brain scan participants. The research team stated that “…unlike previous studies of image reconstruction, our method does not require training or fine-tuning of complex deep-learning models.”
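To make the idea concrete, here is a minimal sketch of how brain patterns might be linked to image features without training a deep network. It assumes a simple linear (ridge) regression, which the article itself does not name, and it uses randomly generated numbers in place of real fMRI data. All names (`fit_ridge`, `n_voxels`, `n_latent`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def predict(X, W):
    """Map brain-activity vectors to predicted image features."""
    return X @ W


# Synthetic stand-ins (assumption): rows are viewed images,
# columns of X are voxel responses, columns of Y are image features.
rng = np.random.default_rng(0)
n_train, n_test, n_voxels, n_latent = 200, 50, 30, 8
true_W = rng.normal(size=(n_voxels, n_latent))

X_train = rng.normal(size=(n_train, n_voxels))
Y_train = X_train @ true_W + 0.1 * rng.normal(size=(n_train, n_latent))
X_test = rng.normal(size=(n_test, n_voxels))
Y_test = X_test @ true_W

W = fit_ridge(X_train, Y_train, alpha=1.0)
Y_pred = predict(X_test, W)

# Correlation between predicted and true features on held-out data.
corr = np.corrcoef(Y_pred.ravel(), Y_test.ravel())[0, 1]
print(f"held-out correlation: {corr:.3f}")
```

In a real pipeline the predicted features would be handed to the image generator; here they are simply compared with the known answer.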

As presented in the paper, the solution generated 1,000 images based on cues and prompts from the brain scans. The AI starts each image as noise, similar to television static, and then progressively replaces that noise with distinct features that it recognises in the brain activity, by comparing them with the images it was trained on and finding a match.
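The step from static to picture can be illustrated with a toy loop. This is only a sketch under strong assumptions: it uses a deterministic DDIM-style update, and the "denoiser" is a stand-in that cheats by already knowing the target image, whereas a real system would use a trained network guided by signals derived from the brain scan. The function names and the noise schedule are hypothetical.

```python
import numpy as np


def make_schedule(num_steps, beta_min=1e-4, beta_max=0.2):
    """Cumulative signal level per step; index 0 means a fully clean image."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    return np.concatenate(([1.0], alpha_bar))


def oracle_noise_estimate(x_t, target, a_t):
    """Stand-in for a trained denoiser (assumption): it knows the target."""
    return (x_t - np.sqrt(a_t) * target) / np.sqrt(1.0 - a_t)


def denoise_from_static(target, num_steps=50, seed=0):
    rng = np.random.default_rng(seed)
    alpha_bar = make_schedule(num_steps)
    x = rng.normal(size=target.shape)  # start from pure "television static"
    errors = [np.abs(x - target).mean()]
    for t in range(num_steps, 0, -1):
        a_t, a_prev = alpha_bar[t], alpha_bar[t - 1]
        eps_hat = oracle_noise_estimate(x, target, a_t)
        # Estimate the clean image, then step to the next (lower) noise level.
        x0_hat = (x - np.sqrt(1.0 - a_t) * eps_hat) / np.sqrt(a_t)
        x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps_hat
        errors.append(np.abs(x - target).mean())
    return x, errors


# A tiny 8x8 gradient stands in for a photograph.
target = np.linspace(0.0, 1.0, 64).reshape(8, 8)
final, errors = denoise_from_static(target)
print(f"mean error at start: {errors[0]:.3f}, at end: {errors[-1]:.6f}")
```

Because the stand-in denoiser is perfect, the loop lands on the target; a learned denoiser would only approximate it.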

The research paper, about to be presented at a computer vision conference in Vancouver this summer, states:

“From a neuroscience standpoint, we numerically analyse each component of an LDM by mapping particular components to discrete brain regions. We offer an objective explanation of how an LDM [a latent diffusion model] text-to-image translation process integrates the semantic information conveyed by the conditional text while keeping the look of the original picture.”

The images generated were about 80% accurate – in terms of their similarity to the original images that the test subject was viewing during the scan.
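The article does not say how this similarity was scored. One measure often used in this kind of work is two-way identification: a reconstruction counts as a success if it resembles its own original more than it resembles some other image. The sketch below assumes that measure, uses plain pixel correlation as the similarity, and runs on made-up data; the function name `two_way_identification` is hypothetical.

```python
import numpy as np


def two_way_identification(recon, originals):
    """Fraction of (reconstruction, distractor) pairs in which the
    reconstruction correlates more with its own original than with
    the distractor. Rows are flattened images."""
    n = recon.shape[0]
    # Correlation of every reconstruction (rows) with every original (columns).
    r = np.corrcoef(recon, originals)[:n, n:]
    own = np.diag(r)[:, None]           # similarity to the matching original
    wins = own > r                      # beats each other original?
    off_diag = ~np.eye(n, dtype=bool)   # ignore the self-comparison
    return wins[off_diag].mean()


# Synthetic stand-ins (assumption): 40 "images" of 64 pixels each.
rng = np.random.default_rng(0)
originals = rng.normal(size=(40, 64))
noisy_recon = originals + 1.0 * rng.normal(size=originals.shape)

perfect_score = two_way_identification(originals, originals)
noisy_score = two_way_identification(noisy_recon, originals)
print(f"perfect: {perfect_score:.2f}, noisy: {noisy_score:.2f}")
```

On this scale, 50% corresponds to guessing and 100% to every reconstruction being closest to its own original.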

Image: The top row shows original pictures viewed by test subjects during the fMRI scan. The bottom row presents AI-recreated versions of them based on the relevant brain scan; Image credit: Shinji Nishimoto and Yu Takagi, Osaka University, via Creative Commons

Immense possibilities: From dream capture to assisted speech

Their paper is yet to be peer-reviewed but experts believe that the technology could go a long way in revealing how animals perceive the world, enable communication with paralysed people, and record whatever we see in our dreams.

It can be used to capture and analyse nonverbal brainwaves from paralysed patients and translate them into sentences on a computer screen in real time.

It can also decode brain activity while a person is mentally attempting to construct full phrases by spelling words phonetically. This use case reminds us of the legendary physicist Stephen Hawking, who used machine-assisted speech technology to spell out his sentences letter by letter on a screen. But he still had to do that through physical, mechanical effort and not via mental communication. If fully developed as expected, this new technology could create text directly from thought!
