Apple Unveils New AI Image Tool
The tool enables users to seamlessly remove unwanted elements from backgrounds, allowing the primary subject to stand out.
Tech giant Apple has introduced a new AI tool called Multimodal Large Language Model-Guided Image Editing (MGIE).[1]
The tool departs from traditional image editing methods by letting users manipulate images through natural language commands.
Developed in collaboration with researchers from the University of California, Santa Barbara, MGIE leverages Multimodal Large Language Models (MLLMs) to process both textual and visual inputs. By interpreting natural language instructions, the model makes precise pixel-level alterations according to user directives.
“Instead of brief but ambiguous guidance, MGIE derives explicit visual-aware intention and leads to reasonable image editing. We conduct extensive studies on various editing aspects and demonstrate that our MGIE effectively improves performance while maintaining competitive efficiency. We also believe the MLLM-guided framework can contribute to future vision-and-language research,” the researchers said in the paper.[2]
Available to all via GitHub and a web demo on Hugging Face Spaces, MGIE offers a wide array of editing capabilities, from making a pizza look healthier by adding vegetable toppings to adjusting a photo's contrast for a brighter effect.
Furthermore, the tool enables users to seamlessly remove unwanted elements from backgrounds, allowing the primary subject to stand out.