Technical Issues

    Text is normalized: letter case, spacing, stop-word removal, etc.

    Text can be compressed to 30% of its original size while still allowing random-access search

    Approximate search can be performed sequentially over the vocabulary 

    Structure should also be indexed

    Using logical blocks reduces the index size

      Pointers are smaller, and the index takes advantage of the word distribution
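
A minimal sketch of how these points could fit together, not a definitive implementation. It assumes blocks of a fixed number of words and a tiny illustrative stop-word list. The names `normalize`, `build_block_index`, `edit_distance` and `approximate_search` are hypothetical, introduced only for this example. Each vocabulary word points to block numbers rather than exact positions, and an approximate query is resolved by scanning the vocabulary sequentially with an edit-distance test.

```python
import re

# Illustrative stop-word list (an assumption, not taken from the slides).
STOP_WORDS = {"the", "a", "of", "is", "to", "and"}


def normalize(text):
    """Lower-case the text, keep alphanumeric tokens, drop stop-words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in tokens if w not in STOP_WORDS]


def build_block_index(text, words_per_block=4):
    """Split the text into logical blocks and map each word to block numbers.

    Pointers are block numbers, so they are small, and a word that occurs
    several times inside one block is stored only once for that block.
    """
    raw = text.split()
    blocks = [" ".join(raw[i:i + words_per_block])
              for i in range(0, len(raw), words_per_block)]
    index = {}
    for block_id, block in enumerate(blocks):
        for word in normalize(block):
            ids = index.setdefault(word, [])
            if not ids or ids[-1] != block_id:
                ids.append(block_id)
    return blocks, index


def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def approximate_search(query, blocks, index, max_errors=1):
    """Scan the vocabulary sequentially, then return the candidate blocks.

    The blocks still have to be scanned to locate exact occurrences,
    because the index only records which blocks contain each word.
    """
    q = query.lower()
    matches = [w for w in index if edit_distance(q, w) <= max_errors]
    block_ids = sorted({b for w in matches for b in index[w]})
    return matches, [(b, blocks[b]) for b in block_ids]


if __name__ == "__main__":
    sample = ("The index stores words. Each word points to blocks "
              "of the text, not to exact positions in the text.")
    blocks, index = build_block_index(sample)
    print(approximate_search("texr", blocks, index))
```

Larger blocks give fewer and smaller pointers, at the cost of more sequential scanning inside each candidate block at query time.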
  