The immense potential and challenges of multimodal AI

December 30, 2020 | April 12, 2022 | hitesh nikam | AI, Artificial Intelligence, theinfotech

Unlike most AI systems, humans understand the meaning of text, video, audio, and images together, in context. For example, given text and an image that each seem innocuous on their own (e.g., “Look how many people love you” and a picture of a barren desert), people recognize that the two take on potentially hurtful connotations when they are paired.
