Automattic, the parent company of WordPress.com and Tumblr, is in talks to sell user-generated content from its platforms to AI companies such as Midjourney and OpenAI for model training, according to a new report from 404 Media. Details are still emerging, but Automattic says users will be able to opt out of the data-sharing arrangement.
The report also describes friction inside Automattic over concerns that some of the scraped content included private data the company was not meant to store. Complicating matters further, advertising material, such as old Apple Music campaign ads, was inadvertently included in the training dataset.
The plans have proved controversial within the company. According to the report, one Automattic product manager removed his own photos from Tumblr to keep them from being used in AI training. The rise of generative AI, particularly since OpenAI released ChatGPT, has driven a surge in AI-generated content across media formats.
Several major publishers have challenged the legality of using their data to train AI models, leading to legal disputes over copyright infringement and fair use. Automattic is set to unveil a new feature that lets users opt out of contributing their data to AI training sets, mirroring an earlier move by competitor Squarespace.
In response to queries, Automattic pointed to a blog post reaffirming the company’s commitment to user choice amid the evolving landscape of AI technologies. The post emphasizes giving users control over how their content is used, while acknowledging that no law requires web crawlers to honor opt-out preferences.
Despite the defensive tone of its statement, Automattic says it intends to honor user preferences regarding AI training data and vows to notify partners about newly opted-out users so that their data can be removed. The company presents the initiative as a way to align with industry standards and give users robust controls over how their content is used for AI training.