published: 11/09/21 (dd/mm/yy)
updated: not yet

(sorry for my bad English)

Here are my logs from today (at 10:42 UTC+2):

user@fx160:/var/log/apache2$ sudo cat other_vhosts_access.log | grep l3m.in:443 | wc -l
> 819
user@fx160:/var/log/apache2$ sudo cat other_vhosts_access.log | grep l3m.in:443 | grep "bot" | wc -l
> 319
user@fx160:/var/log/apache2$ sudo cat other_vhosts_access.log | grep l3m.in:443 | egrep -v "bot" | wc -l
> 500

More than a third of all the visits I got on my website today were from bots!

If I take another day (before misc.l3m.in/txt/github.txt made it to the front page of Hacker News [0]), the numbers are even worse:

[0]: https://news.ycombinator.com/item?id=28468977

user@fx160:/var/log/apache2$ sudo zcat other_vhosts_access.log.3.gz | grep l3m.in:443 | wc -l
> 651
user@fx160:/var/log/apache2$ sudo zcat other_vhosts_access.log.3.gz | grep l3m.in:443 | grep "bot" | wc -l
> 403
user@fx160:/var/log/apache2$ sudo zcat other_vhosts_access.log.3.gz | grep l3m.in:443 | egrep -v "bot" | wc -l
> 248

Of all the visits to my website that day, almost two thirds were from bots! (Thank you to the real humans visiting my website, btw.)

Here is a list of the URLs found in the bots' user-agent headers, and the number of requests each one made through the day:

https://webmaster.petalsearch.com/site/petalbot (250!!!!)
https://www.seokicks.de/robot.html (43)
http://www.google.com/bot.html (42)
http://ahrefs.com/robot/ (22)
http://www.bing.com/bingbot.htm (21)
http://mj12bot.com/ (5)
https://intelx.io (5)
http://www.apple.com/go/applebot (4)
http://www.semrush.com/bot.html (3)
nbertaupete95(at)gmail.com (2) (both requests were made to /robots.txt on the same day, about 3 hours apart; I found this bot listed on http://stopbadbots.com/bots-table/page/6/?letter=N)
http://yandex.com/bots (2)
http://sur.ly/bot.html (2)
http://go.mail.ru/help/robots (2)
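A pipeline like the one below can produce that kind of list. It is only a rough sketch: it assumes the default Debian/Apache other_vhosts_access.log format, where the user-agent is the sixth double-quoted field, and it only catches bots that put a URL in their user-agent (so the email-only one above would not show up). Adjust the awk field index and the grep pattern if your LogFormat differs.

# print the user-agent field of every l3m.in:443 request, keep the URLs
# embedded in it, then count how many times each URL appears:
sudo awk -F'"' '/l3m\.in:443/ {print $6}' other_vhosts_access.log \
  | grep -oE 'https?://[^);+ ]+' \
  | sort | uniq -c | sort -rn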
They continuously index my websites, making query after query to my self-hosted server, but when I search their engines for anything related to the content of my websites, I struggle to find it. My websites are too small to be ranked on the first page of Google, Bing or Yandex (or even Petal Search), except when you copy/paste the title of an article (try "Modifier les icônes des dossiers de raccourcis sous Firefox 82"; I only found it on Google). Why then spam my server with requests? I think some of these crawlers sell (paid) SEO-related services, and thus make money from the content of my websites.

Since my blog is in French and mainly about tech-related things, it has a "niche" audience (lots of French techies search for things in English), but it still struggles to rank (even the pages about my own projects!).

For example, I made a tool called "Django Check SEO". It's a django/django-cms module that you just add to your website (no need for a database or anything): you visit the related URL, and it gives you a list of problems and advice. IMHO it's a cool tool, it was missing from the django/django-cms ecosystem, and it may help people better understand all this SEO-related crap. I made it while working at Kapt, and if I search "Django Check SEO" on (let's pick one) Google, it returns this list of websites:

1) the (fr) article on my company's website
2) the (en) github repo
3) my (en) article on dev.to
4) the (fr) page of the project on my own website

Not bad. And here are the results if I test from another location:

1) github repo
2) dev.to post
3) company website
4) djangopackages.org
5) snyk.io (wtf? an automatically generated page?!)
6) forum.djangoproject.com
7) my website

My website comes after a page that displays the health of a Python package? I followed every piece of advice I could find to build a performant website, I got 98/100/100/100 on web.dev/measure, and I wrote "human content" (not one of those 30k-word pages that *don't even answer* the question that brought visitors there), yet my website isn't in the top 3 results for a French query, and is 7th for an English one.

... Well, being first for every result related to something I did is not my goal, and appearing on the first page is good enough. This rant is about the number of robots that crawl my websites every day but don't do anything useful with the data (at least not in public).

A few years ago, I posted links on social networks and people visited my websites, but the pages were not really ranked on any search engine. Today I don't post on social networks anymore; search engines crawl my websites, and nobody views the content of my sites anymore, because people won't find my websites through search engines if they don't already know what to search for.

I don't know if you do the same thing, but when I visit an article, I always end up visiting some other pages on the site: the root page of course, but also the projects page, the archive, some other articles. Sometimes when I do this, I visit multiple websites just by following hyperlinks, like we used to do 20 years ago (:P), without even searching "name of the website" on Google to find more pages.