AGI could enable robots to understand, learn, and perform intellectual tasks in much the same way humans do
The race for Artificial General Intelligence (AGI) is intensifying. OpenAI, backed by Microsoft, has released the newest version of an algorithm that creates lifelike images from a simple English description of a scene, while Google’s DeepMind has launched a generalist AI algorithm that can play video games, caption images, and move robotic arms with a high degree of human-like precision. Both algorithms were trained on billions of images and text samples collected from the Internet. AGI refers to systems that would enable intelligent machines to understand, learn, and perform intellectual tasks in the same way humans do.
Accurate images from textual description
In January 2021, OpenAI, an artificial intelligence (AI) research laboratory, introduced DALL-E. A year later, the company’s newest system, DALL-E 2, generates more realistic and accurate images at 4x greater resolution. DALL-E is an AI program that creates images from textual descriptions. The original DALL-E uses a 12-billion-parameter version of the GPT-3 Transformer model to interpret natural-language inputs and generate corresponding images. It can create images of realistic objects as well as objects that do not exist in reality. Concerned about its potential for harm, the company has so far withheld the program from commercial release.
OpenAI’s stated mission is to ensure that artificial general intelligence (AGI) – highly autonomous systems that outperform humans at most economically valuable work – benefits all of humanity. Microsoft has invested $1 billion in OpenAI to support its AGI initiative.
DeepMind algorithm performs 600 tasks
Close on the heels of DALL-E 2, Google’s DeepMind released Gato, a generalist AI algorithm that can perform many different tasks that humans can do, without specializing in any one of them. Gato handles more than 600 distinct tasks, such as playing video games, captioning images, and moving real-world robotic arms. DeepMind describes Gato as a multi-modal, multi-task, multi-embodiment generalist policy.
According to a document authored by OpenAI’s ethics and policy researchers, DALL-E 2 was trained on a combination of photos gathered from the internet and images acquired from licensed sources. OpenAI made efforts to mitigate toxicity and the spread of disinformation, applying text filters to the image generator and removing some sexually explicit or gory images. The training dataset included 650 million images and text captions, according to developers familiar with the algorithm.
Learning from billions of datasets
Meanwhile, DeepMind said that Gato was trained on a large number of datasets comprising agent experience in both simulated and real-world environments, alongside a variety of natural-language and image datasets. Like all AI systems, Gato learns by example, ingesting billions of words, images from real-world and simulated environments, button presses, joint torques, and more, all serialized into tokens. These tokens represent data in a form Gato can process, enabling a single system to perform many different tasks.
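The idea of serializing different modalities into one shared token stream can be illustrated with a short sketch. The vocabulary size, bin count, and token ids below are illustrative assumptions, not Gato’s actual implementation:

```python
# Hypothetical sketch of multi-modal tokenization in the spirit of Gato.
# The vocabulary layout and bin counts are illustrative assumptions only.

TEXT_VOCAB = 32_000      # ids reserved for (sub)word tokens
DISCRETE_BINS = 1024     # bins for continuous values (e.g. joint torques)

def tokenize_text(word_ids):
    """Text is already discrete: pass sub-word ids through unchanged."""
    return list(word_ids)

def tokenize_continuous(values, low=-1.0, high=1.0):
    """Map each continuous value (e.g. a joint torque) to one of
    DISCRETE_BINS integer bins, offset past the text vocabulary so the
    two modalities never collide in the shared token space."""
    tokens = []
    for v in values:
        clipped = min(max(v, low), high)
        bin_id = int((clipped - low) / (high - low) * (DISCRETE_BINS - 1))
        tokens.append(TEXT_VOCAB + bin_id)
    return tokens

# One training example interleaving two modalities into a single sequence:
sequence = tokenize_text([17, 402, 9]) + tokenize_continuous([0.25, -0.8])
print(sequence)  # [17, 402, 9, 32639, 32102]
```

Because every modality ends up as a flat sequence of integers, the same Transformer can be trained on text, images, and robot actions with one objective: predict the next token.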
Gato’s architecture isn’t that different from many of the AI systems in use today. Like OpenAI’s GPT-3, it is a Transformer. The Transformer has become the architecture of choice for complex reasoning tasks, demonstrating ability in summarizing text, producing music, categorizing objects in photos, and analyzing protein sequences.
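The operation shared by GPT-3, Gato, and other Transformers is scaled dot-product attention. The following is a minimal sketch with illustrative shapes and random data, not any model’s actual code:

```python
# Minimal sketch of scaled dot-product attention, the core Transformer
# operation. Dimensions and inputs are illustrative only.
import numpy as np

def attention(Q, K, V):
    """Each query attends to every key; softmax weights decide how much
    of each value vector flows into the output."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 tokens, 8-dimensional queries
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8): one contextualized vector per token
```

Because attention is agnostic to what the tokens represent, the same mechanism works whether the sequence encodes words, image patches, or robot joint torques.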
Gato’s parameter count is orders of magnitude lower than that of single-task systems such as GPT-3. Parameters are the values a system learns from its training data; broadly, they determine its capacity to solve a problem such as text generation. GPT-3 has 175 billion parameters, while Gato has only 1.2 billion. Both GPT-3 and Gato require strong filters to keep weaknesses and shortcomings like bias, racism, and harsh language out of their outputs.
An AGI with genuine cognitive capabilities could, in principle, reason about complex problems much as the human mind does. Both of these tech companies are grappling with major AGI challenges, including learning human-centric capabilities such as sensory perception, motor skills, problem-solving, and human-level creativity, as well as the lack of a working protocol and limited universality, business alignment, and clear direction for AGI.
Not yet free from dangerous biases
Nevertheless, these algorithms have the potential to cause tremendous harm by creating deepfakes that can be nearly impossible to detect. DALL-E’s creators call the model experimental and not yet fit for commercial use, but say it could influence industries such as art, education, and marketing, and help advance OpenAI’s stated goal of creating artificial general intelligence.
While AGI systems could lead to hugely positive innovations, they also have the potential to surpass human intelligence and become “super-intelligent”. If a super-intelligent system were unaligned, it could be difficult or even impossible to control or predict its behavior, leaving humans vulnerable. Gato, like many other AI models, can produce biased or harmful output (though it is not currently deployed to any users). This is partly due to biases present in the vision and language datasets used for training, which include “racist, sexist, and otherwise harmful content.”
Similar biases have been found in DALL-E 2: AI researchers found that its depictions of people can be too biased for public consumption. Early tests by red-team members and OpenAI showed that DALL-E 2 leans toward generating images of white men by default, overly sexualizes images of women, and reinforces racial stereotypes.