Apple shares ‘Ferret’ machine learning model for image-based queries

Key Points:

  • Researchers from Apple and Cornell University quietly released “Ferret,” an open-source multimodal LLM that lets queries reference specific regions of an image; it has since drawn attention from AI researchers.
  • Ferret’s non-commercial license has prompted speculation about potential integration into future Apple products or services, and its release signals Apple’s commitment to impactful AI research.
  • Twitter posts by an Apple AI/ML research scientist detail Ferret’s capabilities, emphasizing its precision in identifying and responding to specific elements within an image.


In a surprise move, researchers from Apple and Cornell University introduced an open-source multimodal LLM, dubbed “Ferret,” which can use image regions as part of a query. Released quietly on GitHub in October with no accompanying fanfare, Ferret has since gained attention from AI researchers, with industry experts applauding Apple’s commitment to impactful AI research. Despite its non-commercial license, there is speculation about potential integration of Ferret into future Apple products or services. Notably, Twitter posts by an Apple AI/ML research scientist offer insight into Ferret’s capabilities, explaining how it can identify and respond to specific elements within an image, providing precise understanding of even small image regions.
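To make the idea of region-based querying concrete: systems like this typically attach spatial coordinates to a natural-language prompt so the model knows which part of the image a question refers to. The sketch below is purely illustrative and does not use Ferret's actual API; the prompt format and helper names are assumptions for demonstration only.

```python
# Illustrative sketch of region-referenced prompting (NOT Ferret's real API).
# A pixel-space bounding box is normalized to the image size and embedded in
# the prompt text, tying the question to a specific image region.

def normalize_box(box, width, height):
    """Scale pixel coordinates (x1, y1, x2, y2) into the 0-1 range."""
    x1, y1, x2, y2 = box
    return (x1 / width, y1 / height, x2 / width, y2 / height)

def build_region_query(question, box, width, height):
    """Embed a normalized region reference into a text prompt (hypothetical format)."""
    nx1, ny1, nx2, ny2 = normalize_box(box, width, height)
    region = f"[{nx1:.3f}, {ny1:.3f}, {nx2:.3f}, {ny2:.3f}]"
    return f"{question} (region: {region})"

# Example: ask about a patch of a 640x480 image.
prompt = build_region_query("What is the object here?", (100, 100, 200, 150), 640, 480)
print(prompt)
```

Normalizing coordinates keeps the prompt independent of image resolution, which is one common way region references are encoded in multimodal models.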

©2024 The Horizon