Let's meet Pollux, our new Search Index

Jan Burse, created Jul 01. 2017

Dear All,

This is just to let you know that our website www.jekejeke.ch features a new search index. The search index is code-named pollux, and it is an n-gram index.

- Search without Index: The old search without an index is still available under "found2.jsp" as a fallback. This old search had the disadvantage that it simply scanned all available documents. For a query such as "Artificial Intelligence" we got the result:

    Search results - Artificial Intelligence
    Results 1 - 6 of 6 in 13744 ms.

- Search with Index, Debug mode: The new search uses an index. We currently allow a debug mode that shows the n-grams that are looked up for a particular query. This mode might go away in the future, and the index handling might also change. It can be invoked via "found.jsp?debug=true".

    Search results - Artificial Intelligence
    pregram=art, union=739
    pregram=ifi, union=433
    pregram=cia, union=596
    pregram=l, union=1786
    specimen res=209
    union res=209
    pregram=int, union=1083
    pregram=ell, union=592
    pregram=ige, union=69
    pregram=nce, union=812
    specimen res=21
    union res=21
    inter res=14

- Search with Index, Normal mode: As seen above, the pollux index works with pregrams, that is, n-grams that are matched either exactly or by prefix. For example, a pregram "l" matches all n-grams that start with "l". The normal mode omits the debugging information and is what the search button currently directs to. Document retrieval is much faster, here 546 ms instead of 13744 ms:

    Search results - Artificial Intelligence
    Results 1 - 6 of 6 in 546 ms.

We would like to thank Guy Castagnoli for helpful discussions and also for repeatedly showing us KIWIX, a pocket version of Wikipedia which also features a search index.

Best Regards
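P.S. For readers curious about the pregrams in the debug output above, here is a minimal Python sketch of how such a lookup could work. It is not the pollux implementation. The gram length of 3, the way documents are indexed (all overlapping grams), the intersection of the hits, and the final verification scan are all assumptions made for illustration, and the function names are hypothetical. Only the way a query word is cut into pregrams (art, ifi, cia, l) and the exact versus prefix matching are taken from the post.

```python
from collections import defaultdict

N = 3  # gram length; an assumption based on the 3-letter pregrams in the debug output


def grams(text, n=N):
    """All overlapping grams of a text; the last few are shorter than n."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text))}


def pregrams(word, n=N):
    """Cut a query word into consecutive chunks; the last chunk may be shorter."""
    word = word.lower()
    return [word[i:i + n] for i in range(0, len(word), n)]


def build_index(docs):
    """Map each gram to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for g in grams(text):
            index[g].add(doc_id)
    return index


def lookup(index, pregram, n=N):
    """Exact match for full-length pregrams, prefix match for shorter ones."""
    if len(pregram) == n:
        return set(index.get(pregram, ()))
    hits = set()
    for g, ids in index.items():
        if g.startswith(pregram):
            hits |= ids
    return hits


def search(index, docs, query):
    """Intersect the pregram hits (assumed), then verify the survivors by scanning."""
    words = query.lower().split()
    candidates = set(docs)
    for word in words:
        for p in pregrams(word):
            candidates &= lookup(index, p)
    # The index only narrows the candidates; a scan removes false positives.
    return sorted(d for d in candidates
                  if all(w in docs[d].lower() for w in words))


# pregrams("Artificial") gives ['art', 'ifi', 'cia', 'l'], as in the debug output.
```

With such a scheme the expensive scan only runs over the few candidates that survive the index lookups, which would be consistent with the difference in response time reported above.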