
Version January 2008
  
~ Bots lore ~


BOT WRITING, BOT TRAPPING & BOT WARS
Part of searchlore

This is a 'living' workshop on bot trapping and reversing; elsewhere on my site you will find other, "broader" web searching and data mining lore.
As ~S~ deep wrote in his bot-essay: "There are many Perl bots available on the net, but I'm fairly certain that you will not find one that does exactly what you want. There's also a "convention" among bot writers not to give bots source code to people who do not understand them - it's considered irresponsible. Of course, once you've learned how to build bots, you can be as irresponsible as you like". This is exactly right. Anyway, knowledge runs downhill on the web: we will find more knowledge only if we create it ourselves at the same time.
Your own contributions and work are necessary. The material presented here should be more than enough to "get you started" on the bot path. Write your own bots, publish the code so that others may improve them. Reverse the code of the existing bots.
Awaiting your own contributions...


["Our" essays!]
recent and old


A good robots.txt
If you do not know what a robots.txt is, just leave now and come back later :)
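
For those who came back: a minimal sketch of how a bot might read a robots.txt before fetching anything, using Python's standard urllib.robotparser. The robots.txt content, the bot names and the example.org URLs below are all invented for illustration; this is not the "good robots.txt" linked above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, made up for this sketch.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/

User-agent: BadBot
Disallow: /
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS_TXT)
    checks = [
        ("MyBot", "http://example.org/index.htm"),
        ("MyBot", "http://example.org/private/notes.htm"),
        ("BadBot", "http://example.org/index.htm"),
    ]
    for agent, url in checks:
        # can_fetch() answers: may this user agent retrieve this URL?
        print(agent, url, rules.can_fetch(agent, url))
```

A polite bot would run such a check before every request; a bot trap, of course, relies on the impolite ones ignoring it.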

More (older) stuff
[Iliad fetchbot]   [Juno autoresponder]   [our friends' essays!]


An older introduction - explanation

The term "bot" is, according to DeadelviS, short for "robot", which sounds much cooler than "program".

As Andrew Leonard explains, like mechanical robots, bots are guided by algorithmic rules of behavior - if this happens, do that; if that happens, do this. But instead of clanking around a laboratory bumping into walls, software robots are executable programs that maneuver through cyberspace bouncing off communications protocols. Strings of code written by everyone from teenage chat-room lurkers to top-flight computer scientists, bots are variously designed to carry on conversations, act as human surrogates, or achieve specific tasks - such as seeking out and retrieving information. And... bots can also be used as weapons: great fun assured.


These pages of mine may cover all sorts of web bots: spiders, wanderers, and worms. Cancelbots, Lazarus, Automoose. Chatterbots, softbots, userbots, taskbots, knowbots, mailbots, searchbots. MrBot and MrsBot. Warbots, clonebots, floodbots, annoybots, hackbots, and Vladbots. Gaybots, gossipbots, gamebots. Skeleton bots, spybots, and sloth bots. Xbots, meta-bots. Eggdrop bots... As you can see, the terminology is far from simple. Basically, though, the idea is to allow you to learn enough to WRITE your own search bots. Information searching through commercial engines (Google included) is less and less efficient, alas: it is of paramount importance, for a ~S~, to learn how to build his OWN "home made and tailored" specific bots. You'll be amazed at the wealth of really strong signals you'll be able to find among the noise as soon as you use your own bots, even tiny, simple, imperfect ones that are surely not 100% state of the art...

It's up to you to help us with your own work or, alternatively, to keep what you develop guarded for yourself: it is my intention to offer enough material in this section to allow anyone to start. Choose the knowledge path or choose the dark path, it's up to you.
I'll NEVER charge money for accessing my site, hence I ask you the only "money" that's worth something on this web of ours: knowledge for all.

Contribute with YOUR knowledge: if you build on other people's shoulders, you should imho offer your own shoulders for others to build upon.

The old Iliad fetchbot


Hey! I wanna see a real bot in action before joining this section!

Yessir! Here is a good and powerful 'fetchbot', very useful for seekers and searchers alike. And if you knew nothing of this stuff you'll be fascinated (and even if you already knew... :-)
You'll now (at once, if I were you) approach the "iliad" Searchbot (a very useful one, btw; it was at iliad@algol.jsc.nasa.gov and is now at iliad@prime.jsc.nasa.gov):
Send an email to:
iliad@prime.jsc.nasa.gov
write into the SUBJECT part of your email (into the subject field, duh!):
iliad query
write into the TEXT part of your email (that's your letter, duh!):
?Q: internet bots automated retrieval
(for instance... and you'll most probably get quite a lot of interesting material about bots from this mighty useful Searchbot. What d'ya say?)
If you're stuck, email the same address with the word help both in the Subject and in the text (pretty poor help you will get, though :-(
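
If you would rather have a script write that email for you, here is a minimal sketch in Python. The address, the subject and the "?Q:" line are taken from the steps above; the sender address and the SMTP host are hypothetical placeholders you would have to supply, and nothing here guarantees that the fetchbot is still listening.

```python
import smtplib
from email.message import EmailMessage

ILIAD_ADDRESS = "iliad@prime.jsc.nasa.gov"  # address given in the text above


def build_iliad_query(sender: str, keywords: str) -> EmailMessage:
    """Compose the query email: fixed subject, '?Q:' line in the body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ILIAD_ADDRESS
    msg["Subject"] = "iliad query"
    msg.set_content(f"?Q: {keywords}\n")
    return msg


def send(msg: EmailMessage, smtp_host: str) -> None:
    """Hand the message to an SMTP server (smtp_host is an assumption)."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)


if __name__ == "__main__":
    # Only builds and prints the message; call send() yourself if you dare.
    query = build_iliad_query("you@example.org",
                              "internet bots automated retrieval")
    print(query)
```

The point of splitting build and send is that you can inspect (or loop over) many queries before a single byte leaves your machine.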



The old Juno autoresponder


Hey, this is great! I wanna taste another email-bot, just for fun!

Yessir! Please go ahead: have a look at our friend autobot. Send an email to:
autobot@junoaccmail.org
write into the SUBJECT part of your email (into the subject field, duh!):
send index
or if you want to have a laugh at some 'scarecrow' copyright propaganda, write - always in the SUBJECT field - the following:
send Copyrights


Hey, this is gorgeous! Now, before I start working on my own, let me please see and touch the code of a "real" bot!
Yessir! Please go ahead: enjoy the following essays!
You'll find here all the code you may need to start working on your own!

Recent Essays

  • [mhyst_w3s.htm]: W3S: Web Personal Spyder.
    by Mhyst, January 2008
    "The aim of this document is to put forward the structure and functionality of W3S and, at the same time, to describe a basic searching web spider. I hope this essay will bring somebody the possibility of making his own web spider."
    Part of the bots section.


  • [termisearch.htm]: A proof of concept: a pre-search filter/bot.
    by fravia+, May 2007
    Just an example of a possible application of a simple, but effective, Google pre-filtering approach
    Part of the bots section (even if it is just a pre-filter and not a bot stricto sensu).
    This is done here simply by automatically adding, subtracting or ORing some ad hoc search terms to whatever query you may have.
    This is the whole point of this example. You don't need to go linguistic. You modify or create your own forms at leisure.
    You may want special effective forms in order to search for books or images and just eliminate all those idiot sites that try to 'trap' searchers into advertisement hells or crippled items for zombies and guinea pigs.
    Or maybe you want mp3s without having to wade knee-deep into morons trying "to sell" you those very mp3s (how vulgar!). Or whatever... I'm sure you get the infinite possibilities now in your own hands :-)
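
The adding, subtracting and ORing described above could be sketched roughly as follows. This is not the code of the termisearch essay, just a minimal Python illustration: the function names, the sample terms and the engine URL with its "q" parameter are assumptions made for the example.

```python
from urllib.parse import urlencode


def prefilter(query, add=(), subtract=(), any_of=()):
    """Return the query with ad hoc terms added, ORed and subtracted."""
    parts = [query.strip()]
    # Terms to add; multi-word terms are quoted as exact phrases.
    parts += [f'"{term}"' if " " in term else term for term in add]
    # Alternatives: at least one of these should appear.
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")
    # Terms to exclude, using the usual minus prefix.
    parts += [f"-{term}" for term in subtract]
    return " ".join(part for part in parts if part)


def search_url(base, query):
    """Build a GET url for an engine taking the query in a 'q' parameter."""
    return base + "?" + urlencode({"q": query})


if __name__ == "__main__":
    q = prefilter("tolkien hobbit",
                  add=["pdf"],
                  any_of=["ebook", "etext"],
                  subtract=["buy", "price"])
    print(q)
    print(search_url("http://www.google.com/search", q))
```

Each "form" in the sense of the essay is then little more than a fixed set of add, any_of and subtract lists wrapped around whatever the seeker types.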



  • [winky_stripper.htm]: Winky strips for Yahoo (A Yahoo results stripper)
    by Winky ;-), April 2004
    A further introduction to the power of python, by Winky.
    Part of the bots section.
    ...on the board they were talking about "stripping" and searching html.
    Anyway I decided to make one for yahoo; it is primitive, a learning tool, but could be expanded to handle next queries etc.
    yahoo_stripper.py -> is the actual script itself
    clean.html -> is the "output from the script"
    To create the "docs" which reside in the html directory just run the script thru epydoc.
    This sort of tactic of stripping webpages is much more effective than just blindly using regular expressions.
    Not for beginners!
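
To give an idea of what "stripping" with a real parser (rather than blind regular expressions) can look like, here is a minimal sketch built on Python's standard html.parser. It is not Winky's yahoo_stripper.py: the class name, the helper function and the sample page are invented for illustration, and a real results page would need more care.

```python
from html.parser import HTMLParser


class LinkStripper(HTMLParser):
    """Keep only (href, link text) pairs, dropping all other markup."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> we are currently inside
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def strip_links(html):
    """Return the list of (href, text) pairs found in an html string."""
    parser = LinkStripper()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical results page, made up for this sketch.
SAMPLE_PAGE = (
    '<html><body><div class="ad"><img src="banner.gif"></div><ol>'
    '<li><a href="http://example.org/bots.htm">Bot <b>lore</b></a> blah</li>'
    '<li><a href="http://example.org/perl.htm">Perl bots</a></li>'
    '</ol></body></html>'
)

if __name__ == "__main__":
    for href, text in strip_links(SAMPLE_PAGE):
        print(href, "|", text)
```

Because the parser follows the tag structure, nested markup inside a link (the bold tag above) does not confuse it, which is exactly where a hasty regular expression tends to break.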




  • Older Essays: how to build your own bots

    PHASE ONE (16 July 1999)

    this essay (perl_es1.htm): Perl@usa.net ~ How to reverse a "free" service has been written by [blue] in July 1999 for the removing banners section. Read and enjoy! Let's hope you'll write afterwards your own perl-bots and send them here so that others can ameliorate and give feedback...

    PHASE TWO (22 July 1999)

    this essay (rt_bot1.htm): The HCUbot: a simple Web Retrieval Bot in Perl has been written by deep in July 1999, read and enjoy! Let's hope you'll write afterwards your own perl-bots and send them here so that others can ameliorate and give feedback...

    PHASE THREE (14 September 1999)

    this essay (botcgi.htm): Mirbot 1.0: a very special kind of a Robot has been written by The Mystical Friend in September 1999, read and enjoy! Let's hope you'll write afterwards your own perl-bots and send them here so that others can ameliorate and give feedback...

    PHASE FOUR (14 September 1999)

    this essay (rt_bot2.htm): The HCUbot (Version 2.0): a simple Web Retrieval Bot in Perl has been written by deep in July 1999 and updated and ameliorated in September 1999, read and enjoy! Let's hope you'll write afterwards your own perl-bots and send them here so that others can ameliorate and give feedback...

    PHASE FIVE (21 September 1999)

    this essay (sono_bot.htm): spider.r: a handy search tool and intro to REBOL has been written by sonofsamiam in September 1999, read and enjoy! Let's hope you'll write afterwards your own rebol-bots and send them over here so that others can ameliorate and give feedback...

    PHASE SIX (March 2000)

    [ftpbot1.htm]: A small ftp fetcher bot
    by DarkWyrm
    This bot searches an FTP site for a particular file (in Perl)

    PHASE SEVEN (May 2000)

    [plbtgrab.htm]: Source code for a spam bot (Kevin's spider) (in Perl)
    by Kevin Jobson
    Automatic link searching

    PHASE EIGHT (September 2000)

    [scan_reb.htm]: A simple REBOL scanner: ways to retrieve hidden files, pages, zips, images
    by -Sp!ke
    Automatic link sniffing

    PHASE NINE (October 2000)

    Check [mysearch.zip]: ~ 20233 bytes A search bot in Visual Basic by Shoki (see [shokiwcd.htm])

    PHASE TEN (February 2001)

    Check [wf_add.htm]: Adding engines to WebFerret by Laurent (The guts of a search engine parser) Advanced

    PHASE ELEVEN (April 2001)

    [perlbot.htm]: HOW TO FOOL SSL DOWNLOAD OBSTACLES (spelunking into https "secure" servers)
    by DigJim, Very Advanced essay

    PHASE TWELVE (April 2001)