This is part three of the 're-ranking' trilogy.
one: [yoyo1.htm]: The yo-yo technique
two: [synecdoc.htm]: The synecdochical searching method
The epanaleptical approach (and other fuzzy searching tricks)
by fravia+
first published @ searchlores in December 2001
Updated in December 2003
[The epanaleptical approach] [Search engines' tides] [Epanawhat? Wuzz dat?] [Other fuzzy searching tricks]
The 'geminatio' or epanalepsis is in rhetoric nothing more than an 'echo sound' or, to be more precise, the textual repetition of the same word or part of a phrase. From a rhetorical point of view we could distinguish the repetition of a single query term, palilogia, from the repetition of a group of terms (or a complex query string), which is a "proper" epanalepsis. But -as a searcher- I'm going to use this term for ALL kinds of approaches where you just refine your search using repetitions. This approach is -afaik- not widely used in searching, yet when dealing with the algos of the main search engines (which are inherently stupid) it allows you to get at different 'clusters' of your signal than you would have got WITHOUT repetition. So you had better be epanaleptical and use redundancy every time you suspect phony ranking algos and, oh boy, the commercial search engines' ranking algos are indeed most of the time fishy... to say the least... :-)
The 'epanaleptical' touch
Please always keep in mind that we are trying here to apply OUR OWN parameters to the algos of the main search engines, and that these algos may vary wildly. Yet our epanaleptical results will shed some light on those very algos as well... In fact this is one of a series of approaches that can be used in order to reverse search engines' algos (a part of the web where much money is involved, and where many of the lurkers that roam and leech this site without ever contributing to it are active). Should we for instance repeat the same epanaleptical search on FAST (Alltheweb), we would get similar results for 'haiku' but, oddly enough, OPPOSITE results for 'tolkien':
How is this possible? Fast has INCREASED results when using epanalepsis on 'tolkien'! Mystery of the algos. It could have to do with special post-indexing filters and re-ranking algos these engines use when dealing with human names ("aristotle" will fetch 249840 results while "aristotle aristotle" will fetch 253395, a smaller but still inexplicable increment of 1.4%). This will teach us to accept all these approaches cum grano salis. They do work indeed (at times they work small searching wonders) but the web is so slippery that you can always make mistakes or blunders... That's the fun of the whole thing. Readers that are going to be wizard searchers should never take themselves too seriously, else they will go ka-boom hitting all sorts of obstacles when searching in the dark. And since we are using haiku as quarries...
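The size of such a shift is easier to judge as a relative change. Here is a minimal sketch in Python that uses only the two 'aristotle' counts quoted above; the function name `relative_change` is an illustrative choice, nothing an engine exposes.

```python
def relative_change(plain_hits: int, repeated_hits: int) -> float:
    """Return the change from the plain to the repeated query, in percent."""
    if plain_hits <= 0:
        raise ValueError("plain_hits must be positive")
    return (repeated_hits - plain_hits) / plain_hits * 100.0


# Counts quoted in the text for FAST (Alltheweb).
aristotle = 249840          # query: aristotle
aristotle_twice = 253395    # query: aristotle aristotle

# A positive value means the repetition WIDENED the result set,
# a negative one means it narrowed it.
print(f"{relative_change(aristotle, aristotle_twice):+.1f}%")
```

Given the tides, a single percentage proves little: compare several runs of the same pair of queries before reading anything into it.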
Let's now increase the epanalepsis level... (keep in mind that the 'results' data may vary at EVERY REQUEST, because of search engines' tides, depending on which group of Google servers answers)
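If you want to climb the levels systematically, the query strings can be generated instead of typed. This is a small Python sketch under one assumption of mine: level 0 is the plain term and each further level adds one more repetition. The names `palilogia` and `palilogia_ladder` are purely illustrative.

```python
def palilogia(term: str, level: int) -> str:
    """Build a one-word repetition query.

    Assumption: level 0 is the plain term, level n repeats it n more times.
    """
    if level < 0:
        raise ValueError("level must be >= 0")
    return " ".join([term] * (level + 1))


def palilogia_ladder(term: str, max_level: int) -> list[str]:
    """Queries for levels 0..max_level, ready to paste into a search box."""
    return [palilogia(term, level) for level in range(max_level + 1)]


for level, query in enumerate(palilogia_ladder("haiku", 3)):
    print(level, query)
```

Run the whole ladder in one sitting, so that all levels are measured on (roughly) the same tide.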
Clearly with this kind of simple "one word" repetition (palilogia, from Greek palin & logia: "speaking over again") the 'main' reduction appears at level 1. Once more the results above must be interpreted with care. First of all, the results from any given search engine do vary every time you repeat a query. This depends FIRST on the 'depth' (or width) of their specific databases. These vary considerably, the broadest ("omnipotent") ones being - at the time of writing this snippet - those at Google, see the graphic below.
Now the problem is that the bigger the database, the less inclined the search engine will be to search the whole of it when there are server overloads (or when the search query is evidently much too vague). There are various ways to limit access: use of different indexes, timeouts, redirection towards slower servers. Once more keep ALWAYS in mind that search engines are NOT there in order to provide you some kind of 'free' service for the glory of knowledge and for the sake of the web of old... They are just trying to scrape as much money as possible out of your USE of their concoctions and biased ranking algos. As usual in our doomed society, short-term profit is the ONLY reason search engines exist and someone is paying money for their bandwidth, duh. Yet, once we understand this, we may be able to reverse part of it :-)
As pointed out above, depending on the vastness of their indexes, the 'biggest' search engines, like Google, will tend to give you inconsistent results either every few hours or even every time you query them, depending on server connection speed, overload, moon phases, how many Americans watch television and so on. In other words most (big) search engines have TIDES, and accomplished searchers would be well advised to take account of this problem as well. A 'quick and dirty' check of the database depth 'tides', for instance in Google, can be made through the following 'rimbaudian' query, based on the weight of the five vowels (aeiou... note also how the letter 'a' will give you an approximate idea of Google's global depth at the moment of the query)
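The vowel check can be written down as two snapshots and a comparison. What follows is only a Python sketch: `count_hits` is a hypothetical callable (you hand it a query, it hands back the reported number of results), and how you obtain that number, by hand or otherwise, is deliberately left open. The numbers in the demo are made-up placeholders, not measurements.

```python
from typing import Callable, Dict

VOWELS = "aeiou"


def vowel_snapshot(count_hits: Callable[[str], int]) -> Dict[str, int]:
    """Record the reported hit count for each single-vowel query.

    count_hits is a hypothetical stand-in for whatever gives you the
    'results' figure of an engine for a given query string.
    """
    return {vowel: count_hits(vowel) for vowel in VOWELS}


def tide_drift(earlier: Dict[str, int], later: Dict[str, int]) -> Dict[str, float]:
    """Percent change per vowel between two snapshots of the same engine."""
    return {
        vowel: (later[vowel] - earlier[vowel]) / earlier[vowel] * 100.0
        for vowel in VOWELS
    }


if __name__ == "__main__":
    # Placeholder counters with invented numbers, only to show the plumbing.
    high_tide = vowel_snapshot(lambda query: 1000)
    low_tide = vowel_snapshot(lambda query: 900)
    print(tide_drift(high_tide, low_tide))
```

If all five vowels drift in the same direction between two snapshots, you are probably looking at a tide and not at a quirk of one query.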
Now you could also try to determine the colors of the vowels... :-)
Back to our approach, let's see what happens with FAST's weird personal name 'increment' when we speed up the epanalepsis...
Well, once upon a time there was a peak at level two, while level 3 was smaller than level 1 (but still an increment). Nowadays there is a small difference at level 3, and that's all. New algos. Let's check it once again, repeating the epanaleptical query for our 'aristotle':
Different. Once upon a time (two years ago) there was a peak at level two, while level 3 was always smaller than level 1. Hence, as a (provisional) conclusion for Fast/Alltheweb (and Wisenut): with people's names you could WIDEN your search using the epanalepsis approach. Nowadays there is a difference at levels 2 and 4, no difference at level 3. New algos.
Other search engines' particularities
Hotbot is 'invariable': no matter how much you repeat your search term, it will still give you the same amount of results. Lycos shows the same 'epanaleptical incrementation' phenomenon that we have seen above for Alltheweb/Fast when using 'Tolkien' as search term.
You may -legitimately until now- wonder why we should have to use funny terms like 'epanalepsis' instead of just saying something simpler like 'repeating the search term', or 'redundancy searching'. Well, there are good reasons for that in my humble opinion. First of all the concept of epanalepsis is quite complex, encompassing simple "palilogical" one-word searches, as we have seen above, AND more complex real epanalepsis searches where you'll use MORE THAN ONE TERM and/or whole (long) strings. Here the repetition can concern only one or more word(s) at the beginning of the string ('Go, go good countrymen'), the center ('daraus kann nimmer, nimmer Gutes kommen') or the end ('Come away, away!') of the string. Moreover, more subtle approaches, when searching, could also be a 'diacopical' epanalepsis, if you INSERT one or more words in between, or an 'elliptical' epanalepsis, if you OMIT some terms from your search strings. This of course brings us back to the [synecdochical searching method] that we have examined in part two of this trilogy. Since all these techniques give DIFFERENT results on the main search engines, I do believe seekers should indeed strive to be terminologically precise.
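The variants named above can be kept apart mechanically as well. Here is a Python sketch with illustrative helper names (`repeat_term`, `diacope`, `ellipsis`). The 'diacopical' helper follows my reading of the term, a repetition with filler words wedged in between, and the sample terms are just examples.

```python
from typing import Iterable, List, Sequence


def _check(terms: Sequence[str], index: int) -> None:
    if not 0 <= index < len(terms):
        raise IndexError("index outside the query terms")


def repeat_term(terms: Sequence[str], index: int, times: int = 2) -> List[str]:
    """Plain epanalepsis: the term at `index` appears `times` times in a row."""
    _check(terms, index)
    if times < 1:
        raise ValueError("times must be >= 1")
    terms = list(terms)
    return terms[:index] + [terms[index]] * times + terms[index + 1:]


def diacope(terms: Sequence[str], index: int, fillers: Iterable[str]) -> List[str]:
    """'Diacopical' variant: repeat the term at `index`, fillers in between."""
    _check(terms, index)
    terms = list(terms)
    return terms[:index + 1] + list(fillers) + [terms[index]] + terms[index + 1:]


def ellipsis(terms: Sequence[str], omit: Iterable[int]) -> List[str]:
    """'Elliptical' variant: drop the terms at the given positions."""
    dropped = set(omit)
    return [term for position, term in enumerate(terms) if position not in dropped]


base = ["advanced", "searching"]
print(" ".join(repeat_term(base, 0)))               # repetition at the beginning
print(" ".join(repeat_term(base, len(base) - 1)))   # repetition at the end
print(" ".join(diacope(base, 0, ["tips"])))         # repetition with a word in between
print(" ".join(ellipsis(["advanced", "tips", "searching"], [1])))  # one term omitted
```

Keeping one helper per figure makes it harder to mix up which kind of repetition produced which result count.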
Here are some examples (mid-December 2001):
advanced searching: 1,010,000
advanced advanced searching: 982,000
advanced advanced advanced advanced advanced advanced searching: 967,000
advanced searching searching: 1,030,000
advanced tips searching: 186,000
advanced search searching: 881,000
To be continued
These have something to do with the famous 'art of guessing', but relate to our re-ranking efforts as well.
Hey, this is part three of my re-ranking trilogy. Note that there are more concoctions for your reading pleasure:
Part one: [yoyo1.htm]: The yo-yo technique by fravia+ (Tackling the 'down yonder' problem: a discussion about search engines' "depth")
Part two: [synecdoc.htm]: The synecdochical searching method by fravia+ (substituting a part for the whole when searching)