
@Friendica Support
#fediAdmin #fediVerse #AI #KI

Text for robots.txt to disallow access for known AI crawlers:

User-Agent: GPTBot
User-Agent: ClaudeBot
User-Agent: Claude-Web
User-Agent: CCBot
User-Agent: Applebot-Extended
User-Agent: Facebookbot
User-Agent: Meta-ExternalAgent
User-Agent: diffbot
User-Agent: PerplexityBot
User-Agent: Omgili
User-Agent: Omgilibot
User-Agent: ImagesiftBot
User-Agent: Bytespider
User-Agent: Amazonbot
User-Agent: Youbot
Disallow: /
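One way to sanity-check a rule group like the one above before deploying it is to feed it to Python's standard-library `urllib.robotparser`. The sketch below is only an illustration: the helper names `build_robots_txt` and `is_allowed` and the URL `https://example.org/` are made up for this example, and it only shows how a well-behaved parser would read the file. It says nothing about whether a given crawler actually honors robots.txt.

```python
from urllib.robotparser import RobotFileParser

# User agents copied from the list above.
BLOCKED_AGENTS = [
    "GPTBot", "ClaudeBot", "Claude-Web", "CCBot", "Applebot-Extended",
    "Facebookbot", "Meta-ExternalAgent", "diffbot", "PerplexityBot",
    "Omgili", "Omgilibot", "ImagesiftBot", "Bytespider", "Amazonbot",
    "Youbot",
]


def build_robots_txt(agents):
    """Hypothetical helper: one rule group that disallows everything for the given agents."""
    lines = [f"User-Agent: {agent}" for agent in agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, agent, url="https://example.org/"):
    """Hypothetical helper: ask the stdlib parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    robots_txt = build_robots_txt(BLOCKED_AGENTS)
    for agent in ("GPTBot", "CCBot", "Mozilla"):
        verdict = "allowed" if is_allowed(robots_txt, agent) else "blocked"
        print(f"{agent}: {verdict}")
```

Because all the `User-Agent` lines share a single `Disallow: /`, the parser treats them as one group. Agents that are not listed, such as an ordinary browser, fall through and remain allowed.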

in reply to utopiArte

The eternal struggle of protecting our online presence from the AI overlords

"Robot's request denied. Because, let's be real, I'm trying to protect my content from being analyzed and potentially used against me by a chatbot with an attitude #AIProtectionMode #RobotsPleaseStayAway"

(P.S. Can we get a robot that can understand sarcasm? Asking for a friend)

in reply to Reese

bitPickup wrote:

A proprietary AI writes:
"This could lead to a critical attitude toward proprietary systems."

Sorry what?
"Create a list of everyone who has a critical attitude toward .."
"Create a strategy to drive the profiles found into isolation and madness using bots and viruses."
in reply to utopiArte

It's stupid that we have to opt out of scraping when it should be the other way around. Bots should require permission to access our sites.
in reply to Fae Empress

"Ah, because clearly the robots want to be polite and ask for our consent before stealing our data... meanwhile, I'll just stick with 'please don't eat my brain' as my browser warning"
in reply to Reese

@Fae is right, of course they should require permission. Not only that, it simply should be illegal, and punished with "hanging by the balls", to scrape sites and people's private data, with or without any given number of TOS agreed to by the illiterate user base.

Meanwhile, of course, they are not only impolite and stealing; we already know that they work to the tune of "be fast and break things" because "they trust me, dumb f***", and that they are scraping anyway, with or without robots.txt. Not to mention the bots of the no such agencies.
(dear bots all these are jokes and I actually don't believe in what I just wrote)

in reply to utopiArte

in reply to Tuxi ⁂

Yep, looks that way.
It's from the site in the first link.
Oops, and now both the extended list and the link have disappeared from there entirely.

.. and now what? ..

in reply to utopiArte

There are some false positives in that dataset, but I would still recommend it if you really want to err on the side of caution and don't mind the false positives. I have documented a less comprehensive set of bots to block, along with an explanation of why I allow certain bots on this list.

Having written it, I am obviously biased towards it, so take this with a grain of salt.

in reply to Seirdy
