query-autocomplete User Guide

query-autocomplete builds fast local autocomplete suggestions from your own text, PDFs, and DOCX files. Use it for search boxes, command palettes, document title completion, and typo-tolerant prefix search without running Elasticsearch, Meilisearch, Algolia, Typesense, or another search server.

Install

pip install query-autocomplete

PDF and DOCX readers are included in the base install. Optional sentence chunking support is available with:

pip install "query-autocomplete[chunking]"

Choose a Path

There are three common ways to use the library:

  • Autocomplete: build an in-memory autocomplete from documents
  • Saved artifacts: save a compiled serving index and load it later
  • AdaptiveStore: keep source documents in SQLite and update them over time

Start with Autocomplete. Move to saved artifacts when you want to build once and serve many times. Move to AdaptiveStore when your documents need to be added, removed, listed, or persisted as source data.

Use Autocomplete when:

  • your source text can be loaded at startup
  • the document set rarely changes
  • you want the simplest possible integration

Use saved artifacts when:

  • you build an index once
  • you serve it many times
  • you do not need to mutate the source documents

Use AdaptiveStore when:

  • documents are added or removed over time
  • you need durable source documents
  • you want SQLite-backed persistence

Cold Starts

Build or load the autocomplete once when your app starts. Do not rebuild it inside every request handler.

Cold starts happen per process: a new process has to load or rebuild the serving index once. After that, suggestions are served from the in-process engine.
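The build-once-per-process rule can be sketched without the library. In this minimal illustration, build_index is a hypothetical stand-in for an expensive load or rebuild step, and functools.lru_cache makes sure it runs only once per process no matter how many requests arrive:

```python
from functools import lru_cache

BUILD_CALLS = 0  # counts how often the expensive build step runs


def build_index() -> list[str]:
    """Hypothetical stand-in for an expensive load or rebuild step."""
    global BUILD_CALLS
    BUILD_CALLS += 1
    return sorted(["how to build a deck", "how to build a desk"])


@lru_cache(maxsize=1)
def get_index() -> list[str]:
    """Build the index on first use, then reuse it for the life of the process."""
    return build_index()


def handle_request(prefix: str) -> list[str]:
    """Request handler: reuses the cached index instead of rebuilding it."""
    return [s for s in get_index() if s.startswith(prefix)]


first = handle_request("how to build a dec")
second = handle_request("how to")
```

Both calls share one build. Calling get_index() during startup plays the same role as warm(): the first real request no longer pays the build cost.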

For app startup, call warm() after loading the index or opening the store:

from query_autocomplete import Autocomplete

index = Autocomplete.load("my-index")
index.warm()

For mutable SQLite stores:

from query_autocomplete import AdaptiveStore

store = AdaptiveStore.open("sqlite:///adaptive.sqlite3")
store.warm()

AdaptiveStore.warm() loads the current compiled serving index if one exists, or rebuilds it from stored documents if needed. If the store has no documents yet, it is a no-op.

Quick Start

from query_autocomplete import Autocomplete, Document

index = Autocomplete.create([
    Document(text="how to build a deck"),
    Document(text="how to build a desk"),
    Document(text="how to build with python"),
])

print(index.suggest("how to bui", topk=5))
print(index.suggest("how to biuld", topk=5))

suggest(...) returns a plain list[str].
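The second query above contains a typo ("biuld"). One common way to tolerate such typos is to compare the query against every prefix of a candidate and keep candidates within a small edit distance. The sketch below uses plain Levenshtein distance and is an illustration of the idea only; the helper names are hypothetical and the library's actual matching may differ (for example, it may count a transposition as one edit rather than two).

```python
def prefix_edit_distance(query: str, candidate: str) -> int:
    """Smallest Levenshtein distance between `query` and any prefix of `candidate`."""
    # previous[j] = distance between the query consumed so far and candidate[:j]
    previous = list(range(len(candidate) + 1))
    for i, q_char in enumerate(query, start=1):
        current = [i]
        for j, c_char in enumerate(candidate, start=1):
            cost = 0 if q_char == c_char else 1
            current.append(min(
                previous[j] + 1,         # delete q_char from the query
                current[j - 1] + 1,      # insert c_char into the query
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return min(previous)


def fuzzy_prefix_matches(query: str, candidates: list[str], max_edit_distance: int = 2) -> list[str]:
    """Keep candidates whose best prefix is within the distance budget, closest first."""
    scored = [(prefix_edit_distance(query, c), c) for c in candidates]
    return [c for distance, c in sorted(scored) if distance <= max_edit_distance]


corpus = ["how to build a deck", "how to build a desk", "how to build with python"]
exact = prefix_edit_distance("how to bui", corpus[0])
typo = prefix_edit_distance("how to biuld", corpus[0])
matches = fuzzy_prefix_matches("how to biuld", corpus)
```

Here the swapped letters cost two substitutions, so all three documents still match with a budget of 2 and none match with a budget of 1.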

You can also pass file paths directly. .txt, .pdf, and .docx inputs are supported in the base package:

from pathlib import Path
from query_autocomplete import Autocomplete

index = Autocomplete.create([
    Path("docs/handbook.pdf"),
    Path("docs/release-notes.docx"),
    Path("docs/faq.txt"),
])

print(index.suggest("install", topk=5))

Documents

The main input type is Document.

from query_autocomplete import Document

doc = Document(
    text="how to build with python",
    doc_id="doc-123",
    metadata={"source": "docs"},
)

Fields:

  • text: source text used to learn suggestions
  • doc_id: optional stable identifier
  • metadata: optional JSON-like metadata on in-memory documents

Document.text can be a phrase, paragraph, full article, transcript, or larger body of text. Use doc_id when you need stable document identity, especially with AdaptiveStore.

In-Memory Autocomplete

Use Autocomplete when the source collection can be loaded in memory.

from query_autocomplete import Autocomplete, Document

documents = [
    Document(text="how to build a deck"),
    Document(text="how to build a desk"),
    Document(text="how to build with python"),
]

index = Autocomplete.create(documents)

print(index.suggest("how to build ", topk=5))

Useful methods:

index = Autocomplete.create(documents)
index.suggest("how to bui", topk=5)
index.inspect("how to bui", topk=5)
index.warm()
index.save("my-index")
loaded = Autocomplete.load("my-index")
docs = index.export_documents()

Saved Artifacts

Saved artifacts are compiled serving indexes. They are useful when you build an autocomplete once and load it later.

from query_autocomplete import Autocomplete, Document

index = Autocomplete.create([
    Document(text="how to build a deck"),
    Document(text="how to build a desk"),
])

index.save("my-index")

loaded = Autocomplete.load("my-index")
print(loaded.suggest("how to bui", topk=5))

Artifact path behavior:

  • index.save() writes to a managed folder under .query_autocomplete_artifacts/
  • index.save("docs-v1") writes to .query_autocomplete_artifacts/docs-v1/
  • index.save("artifacts/docs-v1") writes to that explicit relative path
  • Autocomplete.load("docs-v1") loads from the managed artifact folder

Artifacts are for serving. They do not act like a mutable document database.
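The named cases in the list above can be read as a simple resolution rule. The sketch below is an assumption inferred from those examples, not the library's code: a bare name goes under the managed folder, while anything with a directory component is used as given. resolve_artifact_path is a hypothetical helper.

```python
from pathlib import Path

MANAGED_ROOT = Path(".query_autocomplete_artifacts")


def resolve_artifact_path(name: str) -> Path:
    """Assumed rule: bare names go under the managed root, paths are used as given."""
    path = Path(name)
    if len(path.parts) == 1 and not path.is_absolute():
        return MANAGED_ROOT / path
    return path


managed = resolve_artifact_path("docs-v1")
explicit = resolve_artifact_path("artifacts/docs-v1")
```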

SQLite Adaptive Stores

Use AdaptiveStore when documents change over time.

from query_autocomplete import AdaptiveStore, Document

store = AdaptiveStore.open("sqlite:///adaptive.sqlite3")

store.add_documents([
    Document(text="how to build a deck", doc_id="deck"),
    Document(text="how to build with python", doc_id="python"),
])

print(store.suggest("how to bui", topk=5))

Each adaptive SQLite database owns one document collection. Adding documents invalidates the serving cache, which is rebuilt when needed.
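The invalidate-then-rebuild-when-needed behavior is a lazy caching pattern. The toy class below illustrates it with the standard sqlite3 module; ToyStore and its methods are hypothetical and far simpler than the real store, which compiles a full serving index rather than a sorted list.

```python
import sqlite3
from bisect import bisect_left
from typing import Optional


class ToyStore:
    """Toy illustration: documents live in SQLite, the serving cache is rebuilt lazily."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY, text TEXT)")
        self._cache: Optional[list[str]] = None  # None means stale: rebuild on next use
        self.rebuilds = 0

    def add_documents(self, docs: dict[str, str]) -> None:
        self.db.executemany("INSERT OR REPLACE INTO docs VALUES (?, ?)", docs.items())
        self.db.commit()
        self._cache = None  # invalidate only; do not rebuild yet

    def _serving_cache(self) -> list[str]:
        if self._cache is None:
            rows = self.db.execute("SELECT text FROM docs").fetchall()
            self._cache = sorted(text for (text,) in rows)
            self.rebuilds += 1
        return self._cache

    def suggest(self, prefix: str, topk: int = 5) -> list[str]:
        cache = self._serving_cache()
        matches: list[str] = []
        # Binary search to the first possible match, then scan forward.
        for text in cache[bisect_left(cache, prefix):]:
            if not text.startswith(prefix) or len(matches) == topk:
                break
            matches.append(text)
        return matches


store = ToyStore()
store.add_documents({"deck": "how to build a deck"})
store.add_documents({"python": "how to build with python"})
broad = store.suggest("how to bui")
narrow = store.suggest("how to build w")
```

Two writes followed by two reads trigger a single rebuild, which is why batching writes before serving is cheap under this pattern.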

Supported store paths:

AdaptiveStore.open("sqlite:///adaptive.sqlite3")
AdaptiveStore.open("sqlite:////absolute/path/adaptive.sqlite3")
AdaptiveStore.open("./adaptive.sqlite3")
AdaptiveStore.open(":memory:")
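The URL forms above look like the SQLAlchemy-style SQLite convention, where three slashes introduce a relative path and a fourth slash starts an absolute one. Assuming that convention applies here, the mapping to a plain SQLite filename can be sketched as follows; to_sqlite_filename is a hypothetical helper, not part of the library.

```python
import sqlite3


def to_sqlite_filename(store_path: str) -> str:
    """Map a store path or sqlite:/// URL to a filename for sqlite3.connect()."""
    prefix = "sqlite:///"
    if store_path.startswith(prefix):
        # Three slashes -> relative path; a fourth slash starts an absolute path.
        return store_path[len(prefix):]
    return store_path  # plain filesystem path or ":memory:"


relative = to_sqlite_filename("sqlite:///adaptive.sqlite3")
absolute = to_sqlite_filename("sqlite:////absolute/path/adaptive.sqlite3")
plain = to_sqlite_filename("./adaptive.sqlite3")

# ":memory:" passes through unchanged and opens a throwaway in-memory database.
connection = sqlite3.connect(to_sqlite_filename(":memory:"))
probe = connection.execute("SELECT 1").fetchone()
connection.close()
```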

Serving a SQLite-backed autocomplete from FastAPI:

from contextlib import asynccontextmanager

from fastapi import FastAPI
from query_autocomplete import AdaptiveStore

store = AdaptiveStore.open("sqlite:///adaptive.sqlite3")

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load or rebuild the serving index once, before the first request.
    store.warm()
    yield

app = FastAPI(lifespan=lifespan)

@app.get("/autocomplete")
def autocomplete(q: str):
    return {"suggestions": store.suggest(q, topk=5)}

Useful methods:

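from query_autocomplete import AdaptiveStore, Document

# Two ways to get a store handle; use one of them.
store = AdaptiveStore.open("sqlite:///adaptive.sqlite3")
store = AdaptiveStore.open_or_create("sqlite:///adaptive.sqlite3")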

result = store.add_documents([
    Document(text="how to build a deck", doc_id="deck"),
])

store.suggest("how to bui", topk=5)
store.inspect("how to bui", topk=5)
store.warm()
store.list_documents()
store.remove_document("deck")
store.clear()
store.migrate("sqlite:///adaptive-copy.sqlite3")

store.delete() is available as a backwards-compatible alias for store.clear().

Use AdaptiveStore.import_autocomplete(...) to promote an in-memory engine into a SQLite-backed store:

store = AdaptiveStore.import_autocomplete(
    "sqlite:///adaptive.sqlite3",
    engine=index,
)

Reusable Serving Config

Use store.with_suggest_config(...) when you want to reuse runtime serving settings.

from query_autocomplete import SuggestConfig

autocomplete = store.with_suggest_config(SuggestConfig(default_top_k=3))
autocomplete.suggest("how to bui")
autocomplete.inspect("how to bui")

This returns an AdaptiveAutocomplete handle backed by the same store.

Quality Profiles

Most projects should start with a quality profile before touching individual config fields.

from query_autocomplete import Autocomplete, Document

index = Autocomplete.create(
    [
        Document(text="how to build a deck"),
        Document(text="how to build a desk"),
        Document(text="how to build with python"),
    ],
    quality_profile="precision",
    max_generated_words=4,
    phrase_min_count=3,
)

Profiles:

  • balanced: default behavior for clean suggestions
  • precision: stricter ranking and phrase mining
  • recall: keeps more candidates
  • code_or_logs: better for structured tokens, code, and logs
  • natural_language: better for prose-like collections

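Explicit BuildConfig and SuggestConfig values override profile defaults. In the example above, max_generated_words=4 and phrase_min_count=3 take precedence over whatever the precision profile would otherwise set for those fields.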
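The override rule amounts to layering explicit values on top of a profile's defaults. The sketch below shows that layering with a toy dataclass; ToyBuildConfig, TOY_PROFILES, resolve_config, and the profile values are all hypothetical and chosen only for illustration, so they do not reflect the library's real defaults.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyBuildConfig:
    max_generated_words: int = 4
    phrase_min_count: int = 2


# Hypothetical profile values, chosen only for illustration.
TOY_PROFILES = {
    "balanced": ToyBuildConfig(),
    "precision": ToyBuildConfig(max_generated_words=3, phrase_min_count=4),
}


def resolve_config(profile: str, **overrides: int) -> ToyBuildConfig:
    """Start from the profile's defaults, then apply explicit values on top."""
    return replace(TOY_PROFILES[profile], **overrides)


config = resolve_config("precision", phrase_min_count=3)
```

Only phrase_min_count changes; max_generated_words keeps the value the toy profile supplied.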

Inspect Rankings

Use inspect(...) when you want to understand why suggestions ranked the way they did.

diagnostics = index.inspect("how to bui", topk=3)

for item in diagnostics:
    print(item.text, item.score)
    print(item.breakdown)
    print(item.expansion_trace)

Diagnostics include score details, prefix matching information, and expansion traces. suggest(...) still returns plain strings.

Custom Reranking

You can pass a reranker to suggest(...) or inspect(...).

from query_autocomplete import BaseReranker

class ReverseReranker(BaseReranker):
    def rerank(self, prefix: str, candidates: list[str]) -> list[str]:
        return list(reversed(candidates))

results = index.suggest("how to build ", reranker=ReverseReranker())
diagnostics = index.inspect("how to build ", reranker=ReverseReranker())
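A reranker only needs to return the candidates in a new order. As a further illustration, the standalone class below follows the same rerank(prefix, candidates) shape as ReverseReranker but puts shorter completions first. It deliberately does not inherit from BaseReranker so that it runs without the package installed; with the library you would subclass BaseReranker as shown above.

```python
class ShortestFirstReranker:
    """Same rerank(prefix, candidates) shape as the example above, without the base class."""

    def rerank(self, prefix: str, candidates: list[str]) -> list[str]:
        # sorted() is stable, so equally long candidates keep their original order.
        return sorted(candidates, key=len)


candidates = ["how to build with python", "how to build a deck", "how to build a desk"]
reranked = ShortestFirstReranker().rerank("how to build ", candidates)
```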

Configuration Reference

There are three config layers:

  • BuildConfig: build-time indexing, phrase mining, and pruning
  • SuggestConfig: runtime ranking and generation
  • NormalizationConfig: text normalization before indexing

BuildConfig

from query_autocomplete.config import BuildConfig, NormalizationConfig

build_config = BuildConfig(
    max_generated_words=4,
    max_indexed_prefix_chars=24,
    max_context_tokens=3,
    top_tokens_per_prefix=64,
    top_next_tokens=32,
    top_next_phrases=16,
    phrase_min_count=2,
    phrase_min_doc_freq=1,
    phrase_min_pmi=0.0,
    phrase_max_dominant_extension_ratio=0.95,
    phrase_boundary_generic_min_count=8,
    phrase_max_len=4,
    vocab_prune_min_total_tokens=100_000,
    vocab_prune_min_unigram_count=2,
    vocab_prune_min_segment_freq=2,
    vocab_prune_rescue_unigram=12,
    vocab_prune_line_count_to_apply_df=5_000,
    normalization=NormalizationConfig(),
)

Fields:

  • max_generated_words: maximum generated continuation length stored in the index
  • max_indexed_prefix_chars: maximum prefix length indexed for lookup
  • max_context_tokens: number of previous tokens used for context, up to 6
  • top_tokens_per_prefix: number of token candidates retained per prefix
  • top_next_tokens: number of next-token transitions retained
  • top_next_phrases: number of phrase transitions retained
  • phrase_min_count: minimum phrase count for phrase mining
  • phrase_min_doc_freq: minimum document frequency for phrases
  • phrase_min_pmi: minimum PMI score for phrases
  • phrase_max_dominant_extension_ratio: filters phrases dominated by one extension
  • phrase_boundary_generic_min_count: filters generic phrase boundaries
  • phrase_max_len: maximum mined phrase length
  • vocab_prune_min_total_tokens: corpus size threshold before vocabulary pruning activates
  • vocab_prune_min_unigram_count: minimum unigram count when pruning
  • vocab_prune_min_segment_freq: minimum segment frequency when pruning
  • vocab_prune_rescue_unigram: keep words that are frequent enough even if segment frequency is low
  • vocab_prune_line_count_to_apply_df: segment-count threshold before segment-frequency pruning applies
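To give phrase_min_count and phrase_min_pmi some intuition: pointwise mutual information (PMI) compares how often two tokens occur together with how often they would co-occur by chance, PMI(a, b) = log2(p(a, b) / (p(a) p(b))). The sketch below computes it for bigrams over a tiny corpus. It is a textbook formulation with hypothetical helper names; the library's exact counting and smoothing are not documented here and may differ.

```python
import math
from collections import Counter


def bigram_pmi(sentences: list[str]) -> dict[tuple[str, str], float]:
    """PMI for every adjacent token pair, using simple relative frequencies."""
    unigrams: Counter = Counter()
    bigrams: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / total_bigrams
        p_a = unigrams[a] / total_unigrams
        p_b = unigrams[b] / total_unigrams
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


def mine_phrases(sentences: list[str], min_count: int = 2, min_pmi: float = 0.0) -> set[tuple[str, str]]:
    """Keep bigrams that are both frequent enough and associated enough."""
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        counts.update(zip(tokens, tokens[1:]))
    scores = bigram_pmi(sentences)
    return {pair for pair, pmi in scores.items() if counts[pair] >= min_count and pmi >= min_pmi}


corpus = ["how to build a deck", "how to build a desk", "how to build with python"]
scores = bigram_pmi(corpus)
phrases = mine_phrases(corpus, min_count=2, min_pmi=0.0)
```

Rare pairs such as ("with", "python") get a high PMI from a single occurrence, which is why a count threshold is usually applied alongside the PMI threshold.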

SuggestConfig

from query_autocomplete import SuggestConfig

suggest_config = SuggestConfig(
    default_top_k=10,
    default_length_bias=0.5,
    max_suggestion_words=4,
    beam_width=24,
    token_branch_limit=8,
    phrase_branch_limit=8,
    prior_weight=0.35,
    noise_penalty_weight=0.35,
    suppress_redundant_continuations=True,
    min_context_support_ratio=0.0,
    context_support_penalty_weight=0.25,
    collapse_prefix_ladders=True,
    collapse_prefix_ladder_strategy="best",
    unknown_context_strategy="skip",
    normalize_phrase_scores_by_length=False,
    fuzzy_prefix="auto",
    max_edit_distance=2,
)

Fields:

  • default_top_k: default number of suggestions
  • default_length_bias: preference for shorter or longer completions
  • max_suggestion_words: maximum words returned at serving time
  • beam_width: search width during generation
  • token_branch_limit: token candidates explored per beam step
  • phrase_branch_limit: phrase candidates explored per beam step
  • prior_weight: weight for prefix and context evidence
  • noise_penalty_weight: weight for structural noise penalties
  • suppress_redundant_continuations: suppress near-duplicate continuations
  • min_context_support_ratio: minimum context support before penalties apply
  • context_support_penalty_weight: strength of context-support penalties
  • collapse_prefix_ladders: collapse suggestions that are just longer versions of each other
  • collapse_prefix_ladder_strategy: best, prefer_longest, or prefer_shortest
  • unknown_context_strategy: skip or strict
  • normalize_phrase_scores_by_length: normalize phrase scores by phrase length
  • fuzzy_prefix: the string "auto", or the booleans True or False
  • max_edit_distance: maximum fuzzy prefix edit distance
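As an illustration of collapse_prefix_ladders and its three strategies, the sketch below groups scored suggestions into "ladders" where each entry is a word-level prefix of the next, then keeps one entry per ladder. This is a simplified reading of the field descriptions above, with a hypothetical function and made-up scores; the library's grouping and tie-breaking may differ. In this sketch a suggestion that diverges from the current chain starts a new ladder.

```python
def collapse_prefix_ladders(scored: list[tuple[str, float]], strategy: str = "best") -> list[str]:
    """Collapse chains where each suggestion is a word-level prefix of the next."""
    pickers = {
        "best": lambda ladder: max(ladder, key=lambda item: item[1]),
        "prefer_longest": lambda ladder: ladder[-1],
        "prefer_shortest": lambda ladder: ladder[0],
    }
    pick = pickers[strategy]
    ladders: list[list[tuple[str, float]]] = []
    # Sorting by text puts each suggestion right after the prefix it extends.
    for text, score in sorted(scored):
        if ladders and text.startswith(ladders[-1][-1][0] + " "):
            ladders[-1].append((text, score))
        else:
            ladders.append([(text, score)])
    survivors = [pick(ladder) for ladder in ladders]
    # Return the surviving texts, highest score first.
    return [text for text, _ in sorted(survivors, key=lambda item: -item[1])]


scored = [
    ("how to build", 0.9),
    ("how to build a", 0.5),
    ("how to build a deck", 0.7),
    ("how to build with python", 0.6),
]
best = collapse_prefix_ladders(scored, "best")
longest = collapse_prefix_ladders(scored, "prefer_longest")
shortest = collapse_prefix_ladders(scored, "prefer_shortest")
```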

NormalizationConfig

from query_autocomplete.config import NormalizationConfig

normalization = NormalizationConfig(
    lowercase=True,
    unicode_nfkc=True,
    strip_accents=False,
    strip_punctuation=True,
    split_sentences=True,
    pysbd_language=None,
)

Set pysbd_language to a language code such as "en" only if you have installed the optional sentence chunking support:

pip install "query-autocomplete[chunking]"
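To make the first four NormalizationConfig flags concrete, here is a rough standard-library sketch of what such a pipeline could look like. The normalize function is hypothetical and the order of steps is an assumption; the library's own normalization may differ, and sentence splitting (split_sentences, pysbd_language) is left out.

```python
import unicodedata


def normalize(
    text: str,
    *,
    lowercase: bool = True,
    unicode_nfkc: bool = True,
    strip_accents: bool = False,
    strip_punctuation: bool = True,
) -> str:
    """Apply the enabled steps in a fixed order, then collapse whitespace."""
    if unicode_nfkc:
        # NFKC folds compatibility characters, e.g. the "fi" ligature becomes "fi".
        text = unicodedata.normalize("NFKC", text)
    if lowercase:
        text = text.lower()
    if strip_accents:
        # Decompose, then drop combining marks so accented letters lose their accents.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if strip_punctuation:
        # Replace any Unicode punctuation character with a space.
        text = "".join(" " if unicodedata.category(ch).startswith("P") else ch for ch in text)
    return " ".join(text.split())


default = normalize("How to  Build, a Caf\u00e9!")
no_accents = normalize("Caf\u00e9", strip_accents=True)
ligature = normalize("\ufb01le")
```

With the defaults shown above, the accent survives and punctuation and extra spaces are removed; enabling strip_accents turns "café" into "cafe".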

Public API Reference

Most users should import from the top-level package:

from query_autocomplete import (
    AdaptiveAutocomplete,
    AdaptiveStore,
    Autocomplete,
    BaseReranker,
    BuildConfig,
    DeleteResult,
    Document,
    ExpansionStep,
    HeuristicReranker,
    IngestResult,
    QualityProfile,
    ScoreBreakdown,
    SuggestConfig,
    SuggestionDiagnostic,
    apply_quality_profile,
)

Main objects:

  • Autocomplete: in-memory autocomplete engine
  • AdaptiveStore: SQLite-backed mutable document store
  • AdaptiveAutocomplete: serving handle returned by store.with_suggest_config(...)
  • Document: source text plus optional doc_id and metadata
  • BuildConfig: build-time indexing and phrase-mining settings
  • SuggestConfig: runtime suggestion and ranking settings
  • QualityProfile: one of balanced, precision, recall, code_or_logs, or natural_language
  • IngestResult: returned by store.add_documents(...)
  • DeleteResult: returned by store.remove_document(...)
  • SuggestionDiagnostic: returned by inspect(...)
  • ScoreBreakdown: diagnostic score details
  • ExpansionStep: diagnostic expansion trace item
  • BaseReranker: base class for custom rerankers
  • HeuristicReranker: built-in heuristic reranker
  • apply_quality_profile: helper for applying profile defaults to configs

More Information

See the project README for the full API walkthrough and release notes:

https://github.com/MarcellM01/query-autocomplete