Wednesday, April 2, 2025

Meta is collecting information from your publicly available Facebook and Instagram posts.

Key Takeaways

  • Meta is leveraging Facebook and Instagram content to train AI models.
  • Meta acknowledges scraping public Facebook posts containing images that may feature minors.
  • Currently, only EU users can opt out.



Have you ever looked at an AI-generated image and wondered whether a familiar face was lurking in it? It may have looked somewhat like someone you know, or even like yourself. In all likelihood, the resemblance wasn’t entirely made up.

Facebook’s parent company, Meta, has officially acknowledged that it uses users’ photographs, videos, and conversations to train its artificial intelligence models. The company is scraping publicly available content dating back to 2007 to refine its AI products, and the vast majority of people can do little about it. Currently, only users in the European Union can opt out of this indiscriminate collection of personal content; for the rest of the world, the only way to prevent it is to make posts private.

The stark reality is that only the EU can currently put a stop to this privacy infringement, because it is the only region with laws robust enough to compel Meta to offer that option. Without explicit oversight and regulatory frameworks, it has become painfully evident that major AI companies cannot be relied on to police themselves.


Meta is secretly scraping millions of Facebook and Instagram posts dating back to 2007 for AI training.

Only the EU and the UK have any say in the matter.

A photo of Facebook settings on an iPhone

During a public inquiry into AI usage in Australia, Meta’s global privacy director, Melinda Claybaugh, conceded that the company is scraping publicly available posts from Facebook and Instagram users to train its AI products. Senator Shoebridge put it bluntly: “The stark truth is that unless you deliberately set these posts to private since 2007, Meta has simply chosen to scrape all public pictures and text from every public post on Instagram or Facebook since then, unless a conscious decision was made to set them private. That’s the truth, isn’t it?” Claybaugh replied curtly, “Right.”



This applies in Australia and in many other countries, but not everywhere. In the European Union, users gained the ability in June this year to opt out of having their content scraped by Meta, thanks to the region’s robust privacy regulations. Even so, public posts by EU citizens will continue to be scraped unless users actively opt out in their privacy settings. Many Europeans do not know they have a say in the matter at all.


Meta AI on phone against colored background


Meta has clarified that it scrapes content only from adult accounts and does not scrape Facebook or Instagram accounts belonging to users under 18. However, another Australian senator asked whether pictures of their children posted on their own adult account would be scraped. Claybaugh confirmed that they would.

It is also impossible to rule out that, when scraping the profiles of people who are now over 18, Meta collected posts those same people shared while they were still minors. Because the collection is retroactive, people now in their 30s may have had photos scraped that show them under the age of 18, even if those photos were uploaded more than a decade ago.


Meta scraping content that includes images of children under 18 to train AI models raises serious ethical concerns and warrants scrutiny and reform. What is particularly concerning is that Meta seems indifferent to the issue, with no apparent plan to address or prevent it short of halting data scraping altogether. Users outside the EU cannot stop the processing of content from their personal accounts, except by setting all posts to private.

Other tech giants may also start collecting and analyzing your online activities, including search histories, location data, and other personal information.

It appears that anything you post publicly is considered fair game.

Twitter's Grok AI

While Meta has acknowledged its practice of scraping personal content, it is unlikely to be the only company doing so. AI models need vast amounts of training data; the more data they are given, the more capable they can become. As demand for AI models grows, concerns have emerged about the limits of relying on existing data alone. The worry is that we will eventually run out of real-world data and be forced to turn to synthetic data instead.


Because of this, AI companies are likely to aggressively pursue any source of data that gives them a strategic advantage. In July of last year, Elon Musk said during a Twitter Spaces conversation that the company would use public tweets to train its AI models. That effectively made all public tweets fair game: unless users had opted out, their posts could have been harvested for model training.


It’s not the only chatbot doing so, however. During the same conversation, Musk raised his concerns about other companies relying on X’s publicly available data, stating that he had set rate limits on access because so many AI developers were using Twitter data for training. His animosity towards OpenAI stems from his past as a co-founder, and Musk suspects that OpenAI’s models have also been trained on public posts from Twitter/X. Even if you withdraw permission for Grok to use your posts as training material, it is effectively too late; your entire public posting history has likely been harvested by now.

The lack of transparency among AI companies leaves users in the dark.

It took sustained questioning before Meta admitted what it was doing.

Instagram app on phone on colored background

One of the most pressing concerns to emerge from the inquiry in Australia was the difficulty of getting AI companies to disclose what they are actually doing. When Senator Sheldon asked Melinda Claybaugh whether Meta was gathering Australians’ data to develop its generative AI tools, she denied it outright. That denial was technically accurate only in the sense that Meta’s data collection is not comprehensive: many Australians do not use Facebook or Instagram.



When Senator Shoebridge pressed her on the issue, narrowing the question to Facebook and Instagram users, Claybaugh reluctantly acknowledged that the practice was occurring. Meta CEO Mark Zuckerberg has previously alluded to the practice without explicitly stating it. He said that the next crucial element of Meta’s playbook involves leveraging the unique insights and expertise gained from product feedback loops, before referencing the hundreds of billions of publicly shared images on Facebook and Instagram.


Although this statement implies that Meta is scraping content from as far back as 2007 without permission, it does not say so explicitly, leaving room for interpretation. If Elon Musk’s claims hold water, numerous AI companies may be quietly harvesting personal content from social media platforms without consequence, an unsettling prospect that warrants closer examination.

Not every company takes a cavalier approach to safeguarding your privacy.

The exceptions are rare, however.

Apple Intelligence

AI models require vast amounts of data, and the internet is a vast and valuable resource. Web scraping has long been a familiar practice, and its usefulness depends on doing it well. There is, however, a significant difference between extracting keywords from a website and using personal photos to train AI models.


Not every AI company is harvesting personal data without explicit user consent. Some appear to be making an effort to do things differently. Apple uses its own web crawler, Applebot, to scour the web and gather information that can be integrated into Siri’s functionality. Applebot-Extended is a separate agent that gives website owners control over how Apple uses their content. Websites can add a snippet of code to block Applebot-Extended, thereby denying permission to use data from that website for training Apple’s AI models. In this way, Apple lets website owners decide whether their content contributes to training its AI, allowing them to opt out without incurring penalties.
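For readers who run their own site, the opt-out described above can be sketched roughly as follows. This is a minimal illustration, assuming the block is expressed as a robots.txt rule naming the Applebot-Extended user agent. The example rules, the example.com URL, and the is_allowed helper are hypothetical; Python's standard urllib.robotparser is used here only to show how such rules would be read, not to reproduce how Apple evaluates them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site: it opts out of
# Applebot-Extended while leaving every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some-article"  # placeholder URL
    for agent in ("Applebot-Extended", "Applebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

Under these assumed rules, the ordinary Applebot crawler is still permitted, so a site would remain reachable for search features while declining AI training use.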

Several prominent online platforms have opted to block Apple from scraping their websites for training purposes. Facebook and Instagram, both owned by Meta, are among them, so it’s reassuring that Apple’s AI models won’t use any of your personal posts for training, unlike Meta’s own.


While commendable, this approach ultimately sidesteps the core problem, doesn’t it? Siri will soon integrate OpenAI’s models, and Apple has no control over the data used to train them.

The European Union has shown that companies will ultimately back down when compelled.

We need comprehensive rules to give us real control over our own privacy choices.

Framework Convention on Artificial Intelligence being held by signatories

Council of Europe

In a landscape beset by uncertainty, there is a glimmer of optimism. The European Union is renowned for its stringent online privacy regulations. Some of those regulations backfire despite good intentions, as with the GDPR rules that led to pesky cookie-consent pop-ups. The sentiment is commendable, but the execution falls short, leaving users to click through a cumbersome process simply to use a website.


While major companies may view the EU with skepticism, its sheer size, almost 500 million people and a significant portion of the market for tech firms, cannot be ignored. The EU has already succeeded in compelling Apple to change course, and under mounting pressure from the EU, Meta was likewise compelled to give European users the option to opt out of having their data scraped for AI training.

Facebook has given ground as well: the company has agreed to refrain from using account data from Europe to train its AI models, although it cannot undo the use of data it has already gathered.


Don’t pack up and move to Barcelona just yet, though. Tech companies may initially respond to new legal requirements by removing AI features from their EU offerings entirely. Meta has paused the launch of certain features in Europe, and Apple has held back Apple Intelligence for EU iPhone customers. They will likely bring these features to the EU eventually, though; the sheer number of potential customers makes the market too big to ignore.


Ultimately, it is crucial to establish rules that apply across the world. Since EU Facebook and Instagram users have the option to opt out because of existing laws, it is only fair that Australian users are offered the same choice. Unless laws apply everywhere, companies can simply continue operating in any country that does not explicitly prohibit their activities. However, we are still a long way from worldwide regulation of AI.


That is the real issue. Artificial intelligence has emerged rapidly, leaving governments struggling to keep pace. The European Union’s experience demonstrates that well-defined legal frameworks can compel major corporations to respect privacy. The flip side has been demonstrated too: AI companies will exploit loopholes for as long as laws remain ambiguous. Explicit regulation is increasingly urgent if individual privacy is to be protected from exploitation.
