query-autocomplete User Guide
query-autocomplete builds fast local autocomplete suggestions from your own text, PDFs, and DOCX files. Use it for search boxes, command palettes, document title completion, and typo-tolerant prefix search without running Elasticsearch, Meilisearch, Algolia, Typesense, or another search server.
Install
pip install query-autocomplete
PDF and DOCX readers are included in the base install. Optional sentence chunking support is available with:
pip install "query-autocomplete[chunking]"
Choose a Path
There are three common ways to use the library:
- Autocomplete: build an in-memory autocomplete from documents
- Saved artifacts: save a compiled serving index and load it later
- AdaptiveStore: keep source documents in SQLite and update them over time
Start with Autocomplete. Move to saved artifacts when you want to build once and serve many times. Move to AdaptiveStore when your documents need to be added, removed, listed, or persisted as source data.
Use Autocomplete when:
- your source text can be loaded at startup
- the document set rarely changes
- you want the simplest possible integration
Use saved artifacts when:
- you build an index once
- you serve it many times
- you do not need to mutate the source documents
Use AdaptiveStore when:
- documents are added or removed over time
- you need durable source documents
- you want SQLite-backed persistence
Cold Starts
Build or load the autocomplete once when your app starts. Do not rebuild it inside every request handler.
Cold starts happen per process: a new process has to load or rebuild the serving index once. After that, suggestions are served from the in-process engine.
For app startup, call warm() after loading the index or opening the store:
from query_autocomplete import Autocomplete
index = Autocomplete.load("my-index")
index.warm()
For mutable SQLite stores:
from query_autocomplete import AdaptiveStore
store = AdaptiveStore.open("sqlite:///adaptive.sqlite3")
store.warm()
AdaptiveStore.warm() loads the current compiled serving index if one exists, or rebuilds it from stored documents if needed. If the store has no documents yet, it is a no-op.
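One way to follow the build-once advice is to put the load behind a cached accessor, so every request handler shares a single instance per process. The sketch below uses only the standard library. `FakeIndex` and `get_index` are hypothetical stand-ins for the engine and for a call such as `Autocomplete.load("my-index")`; they are not part of query-autocomplete.

```python
from functools import lru_cache

LOAD_CALLS = 0  # counts how many times the expensive load actually runs


class FakeIndex:
    """Hypothetical stand-in for a loaded autocomplete engine."""

    def __init__(self, phrases):
        self.phrases = list(phrases)

    def suggest(self, prefix, topk=5):
        # Plain prefix filter; a real engine does far more than this.
        return [p for p in self.phrases if p.startswith(prefix)][:topk]


@lru_cache(maxsize=1)
def get_index():
    """Load the index on first use, then return the same object every time."""
    global LOAD_CALLS
    LOAD_CALLS += 1
    # Assumption: a real app would call Autocomplete.load("my-index") here.
    return FakeIndex(["how to build a deck", "how to build a desk"])


def handle_request(q):
    # Request handlers call the accessor; they never rebuild the index.
    return get_index().suggest(q)


first = handle_request("how to bui")
second = handle_request("how to build a de")
```

Both calls are answered by the same object, so the load cost is paid once per process rather than once per request.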
Quick Start
from query_autocomplete import Autocomplete, Document
index = Autocomplete.create([
    Document(text="how to build a deck"),
    Document(text="how to build a desk"),
    Document(text="how to build with python"),
])
print(index.suggest("how to bui", topk=5))
print(index.suggest("how to biuld", topk=5))
suggest(...) returns a plain list[str].
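The second call above passes a misspelled prefix ("biuld"). The guide does not describe how the library matches such input, so the following is only a generic sketch of typo-tolerant prefix matching: it computes the smallest Levenshtein distance between the query and any prefix of a candidate. The function names are hypothetical, and the real engine may, for example, count a transposition as a single edit.

```python
def prefix_edit_distance(query, candidate):
    """Smallest Levenshtein distance between `query` and any prefix of `candidate`."""
    # prev[j] = distance between the query characters seen so far and candidate[:j]
    prev = list(range(len(candidate) + 1))
    for i, qc in enumerate(query, start=1):
        curr = [i]
        for j, cc in enumerate(candidate, start=1):
            cost = 0 if qc == cc else 1
            curr.append(min(prev[j] + 1,          # delete a query character
                            curr[j - 1] + 1,      # insert a candidate character
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    # The query may stop anywhere inside the candidate, so take the best column.
    return min(prev)


def fuzzy_prefix_matches(query, candidates, max_edit_distance=2):
    """Return candidates whose best-matching prefix is within the edit budget."""
    return [c for c in candidates
            if prefix_edit_distance(query, c) <= max_edit_distance]


phrases = ["how to build a deck", "how to build with python", "where to buy paint"]
matches = fuzzy_prefix_matches("how to biuld", phrases)
```

With plain Levenshtein distance, swapping "iu" for "ui" costs two edits, so this query only matches with an edit budget of at least 2.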
You can also pass file paths directly. .txt, .pdf, and .docx inputs are supported in the base package:
from pathlib import Path
from query_autocomplete import Autocomplete
index = Autocomplete.create([
    Path("docs/handbook.pdf"),
    Path("docs/release-notes.docx"),
    Path("docs/faq.txt"),
])
print(index.suggest("install", topk=5))
Documents
The main input type is Document.
from query_autocomplete import Document
doc = Document(
    text="how to build with python",
    doc_id="doc-123",
    metadata={"source": "docs"},
)
Fields:
- text: source text used to learn suggestions
- doc_id: optional stable identifier
- metadata: optional JSON-like metadata on in-memory documents
Document.text can be a phrase, paragraph, full article, transcript, or larger body of text. Use doc_id when you need stable document identity, especially with AdaptiveStore.
In-Memory Autocomplete
Use Autocomplete when the source collection can be loaded in memory.
from query_autocomplete import Autocomplete, Document
documents = [
    Document(text="how to build a deck"),
    Document(text="how to build a desk"),
    Document(text="how to build with python"),
]
index = Autocomplete.create(documents)
print(index.suggest("how to build ", topk=5))
Useful methods:
index = Autocomplete.create(documents)
index.suggest("how to bui", topk=5)
index.inspect("how to bui", topk=5)
index.warm()
index.save("my-index")
loaded = Autocomplete.load("my-index")
docs = index.export_documents()
Saved Artifacts
Saved artifacts are compiled serving indexes. They are useful when you build an autocomplete once and load it later.
from query_autocomplete import Autocomplete, Document
index = Autocomplete.create([
    Document(text="how to build a deck"),
    Document(text="how to build a desk"),
])
index.save("my-index")
loaded = Autocomplete.load("my-index")
print(loaded.suggest("how to bui", topk=5))
Artifact path behavior:
- index.save() writes to a managed folder under .query_autocomplete_artifacts/
- index.save("docs-v1") writes to .query_autocomplete_artifacts/docs-v1/
- index.save("artifacts/docs-v1") writes to that explicit relative path
- Autocomplete.load("docs-v1") loads from the managed artifact folder
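The named cases above can be read as one rule: a bare name goes into the managed folder, and anything with a directory component is used as given. The sketch below encodes that reading with `pathlib`. `resolve_artifact_path` is a hypothetical helper, and the library's real resolution logic may differ in edge cases the guide does not cover.

```python
from pathlib import Path

# Folder name taken from the path rules above.
MANAGED_ROOT = Path(".query_autocomplete_artifacts")


def resolve_artifact_path(name):
    """Map a save/load argument to a directory, mirroring the documented cases."""
    path = Path(name)
    # Assumption: anything with a directory component counts as an explicit path.
    if path.is_absolute() or len(path.parts) > 1:
        return path
    # A bare name is placed inside the managed artifact folder.
    return MANAGED_ROOT / path


managed = resolve_artifact_path("docs-v1")
explicit = resolve_artifact_path("artifacts/docs-v1")
```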
Artifacts are for serving. They do not act like a mutable document database.
SQLite Adaptive Stores
Use AdaptiveStore when documents change over time.
from query_autocomplete import AdaptiveStore, Document
store = AdaptiveStore.open("sqlite:///adaptive.sqlite3")
store.add_documents([
    Document(text="how to build a deck", doc_id="deck"),
    Document(text="how to build with python", doc_id="python"),
])
print(store.suggest("how to bui", topk=5))
Each adaptive SQLite database owns one document collection. Adding documents invalidates the serving cache, which is rebuilt when needed.
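The invalidate-then-rebuild behavior can be pictured with a small toy model. This is not how AdaptiveStore is implemented; `LazyServingCache` is a hypothetical class that only shows the pattern: writes mark the derived serving data as stale, and the next read rebuilds it once.

```python
class LazyServingCache:
    """Toy document store that rebuilds its serving data only when it is stale."""

    def __init__(self):
        self._docs = {}        # doc_id -> text (the durable source of truth)
        self._serving = None   # derived data; None means "needs rebuild"
        self.rebuilds = 0

    def add_document(self, doc_id, text):
        self._docs[doc_id] = text
        self._serving = None   # a write only invalidates; it does not rebuild

    def remove_document(self, doc_id):
        self._docs.pop(doc_id, None)
        self._serving = None

    def suggest(self, prefix, topk=5):
        if self._serving is None:          # rebuild lazily, on the first read
            self._serving = sorted(self._docs.values())
            self.rebuilds += 1
        return [t for t in self._serving if t.startswith(prefix)][:topk]


store = LazyServingCache()
store.add_document("deck", "how to build a deck")
store.add_document("python", "how to build with python")
first = store.suggest("how to bui")
again = store.suggest("how to bui")   # served from the cache, no rebuild
store.remove_document("deck")
after_removal = store.suggest("how to bui")
```

Two writes followed by two reads cost one rebuild, which is why batching document changes before serving traffic is cheaper than interleaving them.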
Supported store paths:
AdaptiveStore.open("sqlite:///adaptive.sqlite3")
AdaptiveStore.open("sqlite:////absolute/path/adaptive.sqlite3")
AdaptiveStore.open("./adaptive.sqlite3")
AdaptiveStore.open(":memory:")
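The URL forms above resemble the SQLAlchemy-style convention, where three slashes introduce a relative path and a fourth slash makes it absolute. Assuming the library follows that convention (the guide does not say so explicitly), the mapping from location string to database file could be sketched as below. `sqlite_target` is a hypothetical helper.

```python
def sqlite_target(location):
    """Translate a store location string into the filename SQLite would open."""
    prefix = "sqlite:///"
    if location == ":memory:":
        return location                 # in-memory database, nothing on disk
    if location.startswith(prefix):
        # Three slashes leave a relative path; a fourth slash makes it absolute.
        return location[len(prefix):]
    return location                     # already a plain filesystem path


targets = [sqlite_target(s) for s in (
    "sqlite:///adaptive.sqlite3",
    "sqlite:////absolute/path/adaptive.sqlite3",
    "./adaptive.sqlite3",
    ":memory:",
)]
```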
Serving a SQLite-backed autocomplete from FastAPI:
from fastapi import FastAPI
from query_autocomplete import AdaptiveStore
app = FastAPI()
store = AdaptiveStore.open("sqlite:///adaptive.sqlite3")
@app.on_event("startup")
def startup():
    store.warm()
@app.get("/autocomplete")
def autocomplete(q: str):
    return {"suggestions": store.suggest(q, topk=5)}
Useful methods:
store = AdaptiveStore.open("sqlite:///adaptive.sqlite3")
store = AdaptiveStore.open_or_create("sqlite:///adaptive.sqlite3")
result = store.add_documents([
    Document(text="how to build a deck", doc_id="deck"),
])
store.suggest("how to bui", topk=5)
store.inspect("how to bui", topk=5)
store.warm()
store.list_documents()
store.remove_document("deck")
store.clear()
store.migrate("sqlite:///adaptive-copy.sqlite3")
store.delete() is available as a backwards-compatible alias for store.clear().
Use AdaptiveStore.import_autocomplete(...) to promote an in-memory engine into a SQLite-backed store:
store = AdaptiveStore.import_autocomplete(
    "sqlite:///adaptive.sqlite3",
    engine=index,
)
Reusable Serving Config
Use store.with_suggest_config(...) when you want to reuse runtime serving settings.
from query_autocomplete import SuggestConfig
autocomplete = store.with_suggest_config(SuggestConfig(default_top_k=3))
autocomplete.suggest("how to bui")
autocomplete.inspect("how to bui")
This returns an AdaptiveAutocomplete handle backed by the same store.
Quality Profiles
Most projects should start with a quality profile before touching individual config fields.
from query_autocomplete import Autocomplete, Document
index = Autocomplete.create(
    [
        Document(text="how to build a deck"),
        Document(text="how to build a desk"),
        Document(text="how to build with python"),
    ],
    quality_profile="precision",
    max_generated_words=4,
    phrase_min_count=3,
)
Profiles:
- balanced: default behavior for clean suggestions
- precision: stricter ranking and phrase mining
- recall: keeps more candidates
- code_or_logs: better for structured tokens, code, and logs
- natural_language: better for prose-like collections
Explicit BuildConfig and SuggestConfig values override profile defaults.
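The override rule amounts to layering: base defaults first, then the profile, then anything you pass explicitly. The sketch below shows that precedence with plain dictionaries. The numbers in `BASE_DEFAULTS` and `PROFILE_OVERRIDES` are made up for illustration and are not the library's actual profile values; `resolve_settings` is a hypothetical helper.

```python
# Illustrative values only; the real defaults and profile contents may differ.
BASE_DEFAULTS = {"phrase_min_count": 2, "max_generated_words": 4}

PROFILE_OVERRIDES = {
    "balanced": {},
    "precision": {"phrase_min_count": 4},
}


def resolve_settings(profile, **explicit):
    """Layer settings: base defaults, then the profile, then explicit values."""
    settings = dict(BASE_DEFAULTS)
    settings.update(PROFILE_OVERRIDES[profile])
    settings.update(explicit)          # explicit values always win
    return settings


profile_only = resolve_settings("precision")
with_override = resolve_settings("precision", phrase_min_count=3)
```

Here the explicit `phrase_min_count=3` replaces the profile's value, while settings the caller did not mention keep whatever the profile or the base defaults supplied.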
Inspect Rankings
Use inspect(...) when you want to understand why suggestions ranked the way they did.
diagnostics = index.inspect("how to bui", topk=3)
for item in diagnostics:
    print(item.text, item.score)
    print(item.breakdown)
    print(item.expansion_trace)
Diagnostics include score details, prefix matching information, and expansion traces. suggest(...) still returns plain strings.
Custom Reranking
You can pass a reranker to suggest(...) or inspect(...).
from query_autocomplete import BaseReranker
class ReverseReranker(BaseReranker):
    def rerank(self, prefix: str, candidates: list[str]) -> list[str]:
        return list(reversed(candidates))
results = index.suggest("how to build ", reranker=ReverseReranker())
diagnostics = index.inspect("how to build ", reranker=ReverseReranker())
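A reranker that does something more useful than reversing might prefer shorter completions. The sketch below assumes the `rerank(prefix, candidates)` signature shown above. It is written as a plain class so it runs without the library installed; to plug it into `suggest(...)` you would subclass `BaseReranker` instead. `ShortestFirstReranker` is a hypothetical name.

```python
class ShortestFirstReranker:
    """Orders candidates by word count, keeping the incoming order for ties."""

    def rerank(self, prefix: str, candidates: list[str]) -> list[str]:
        # sorted() is stable, so equally long candidates keep their engine order.
        return sorted(candidates, key=lambda text: len(text.split()))


candidates = [
    "how to build with python quickly",
    "how to build a deck",
    "how to build a desk",
]
reranked = ShortestFirstReranker().rerank("how to build ", candidates)
```

Relying on a stable sort means the reranker only changes the order where it has an opinion and otherwise defers to the engine's ranking.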
Configuration Reference
There are three config layers:
- BuildConfig: build-time indexing, phrase mining, and pruning
- SuggestConfig: runtime ranking and generation
- NormalizationConfig: text normalization before indexing
BuildConfig
from query_autocomplete.config import BuildConfig, NormalizationConfig
build_config = BuildConfig(
    max_generated_words=4,
    max_indexed_prefix_chars=24,
    max_context_tokens=3,
    top_tokens_per_prefix=64,
    top_next_tokens=32,
    top_next_phrases=16,
    phrase_min_count=2,
    phrase_min_doc_freq=1,
    phrase_min_pmi=0.0,
    phrase_max_dominant_extension_ratio=0.95,
    phrase_boundary_generic_min_count=8,
    phrase_max_len=4,
    vocab_prune_min_total_tokens=100_000,
    vocab_prune_min_unigram_count=2,
    vocab_prune_min_segment_freq=2,
    vocab_prune_rescue_unigram=12,
    vocab_prune_line_count_to_apply_df=5_000,
    normalization=NormalizationConfig(),
)
Fields:
- max_generated_words: maximum generated continuation length stored in the index
- max_indexed_prefix_chars: maximum prefix length indexed for lookup
- max_context_tokens: number of previous tokens used for context, up to 6
- top_tokens_per_prefix: number of token candidates retained per prefix
- top_next_tokens: number of next-token transitions retained
- top_next_phrases: number of phrase transitions retained
- phrase_min_count: minimum phrase count for phrase mining
- phrase_min_doc_freq: minimum document frequency for phrases
- phrase_min_pmi: minimum PMI score for phrases
- phrase_max_dominant_extension_ratio: filters phrases dominated by one extension
- phrase_boundary_generic_min_count: filters generic phrase boundaries
- phrase_max_len: maximum mined phrase length
- vocab_prune_min_total_tokens: corpus size threshold before vocabulary pruning activates
- vocab_prune_min_unigram_count: minimum unigram count when pruning
- vocab_prune_min_segment_freq: minimum segment frequency when pruning
- vocab_prune_rescue_unigram: keep words that are frequent enough even if segment frequency is low
- vocab_prune_line_count_to_apply_df: segment-count threshold before segment-frequency pruning applies
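To give a feel for what `phrase_min_count` and `phrase_min_pmi` filter on, the sketch below mines two-word phrases from a tiny corpus using the textbook definition of pointwise mutual information, PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The guide does not state the exact formula or estimator the library uses, so treat this as a generic illustration. `bigram_pmi` and `mine_phrases` are hypothetical names.

```python
import math
from collections import Counter


def bigram_pmi(sentences):
    """Return {(w1, w2): (count, pmi)} for adjacent word pairs in `sentences`."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    n_uni, n_bi = sum(unigrams.values()), sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_independent = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        scores[(w1, w2)] = (count, math.log2(p_pair / p_independent))
    return scores


def mine_phrases(sentences, min_count=2, min_pmi=0.0):
    """Keep pairs that are both frequent enough and positively associated."""
    return sorted(pair for pair, (count, pmi) in bigram_pmi(sentences).items()
                  if count >= min_count and pmi >= min_pmi)


corpus = ["how to build a deck", "how to build a desk", "how to paint a deck"]
phrases = mine_phrases(corpus, min_count=2, min_pmi=0.0)
```

In this corpus the count threshold does the filtering: pairs such as ("a", "desk") occur only once and are dropped, while the surviving pairs all have positive PMI.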
SuggestConfig
from query_autocomplete import SuggestConfig
suggest_config = SuggestConfig(
    default_top_k=10,
    default_length_bias=0.5,
    max_suggestion_words=4,
    beam_width=24,
    token_branch_limit=8,
    phrase_branch_limit=8,
    prior_weight=0.35,
    noise_penalty_weight=0.35,
    suppress_redundant_continuations=True,
    min_context_support_ratio=0.0,
    context_support_penalty_weight=0.25,
    collapse_prefix_ladders=True,
    collapse_prefix_ladder_strategy="best",
    unknown_context_strategy="skip",
    normalize_phrase_scores_by_length=False,
    fuzzy_prefix="auto",
    max_edit_distance=2,
)
Fields:
- default_top_k: default number of suggestions
- default_length_bias: preference for shorter or longer completions
- max_suggestion_words: maximum words returned at serving time
- beam_width: search width during generation
- token_branch_limit: token candidates explored per beam step
- phrase_branch_limit: phrase candidates explored per beam step
- prior_weight: weight for prefix and context evidence
- noise_penalty_weight: weight for structural noise penalties
- suppress_redundant_continuations: suppress near-duplicate continuations
- min_context_support_ratio: minimum context support before penalties apply
- context_support_penalty_weight: strength of context-support penalties
- collapse_prefix_ladders: collapse suggestions that are just longer versions of each other
- collapse_prefix_ladder_strategy: best, prefer_longest, or prefer_shortest
- unknown_context_strategy: skip or strict
- normalize_phrase_scores_by_length: normalize phrase scores by phrase length
- fuzzy_prefix: auto, True, or False
- max_edit_distance: maximum fuzzy prefix edit distance
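A "prefix ladder" is a set of suggestions where one is just a longer version of another, such as "how to build" and "how to build a deck". The sketch below shows one plausible reading of the three `collapse_prefix_ladder_strategy` values. How the library actually defines a ladder and breaks ties is not documented here, so `collapse_prefix_ladders` and `is_word_prefix` are hypothetical helpers under that assumption.

```python
def is_word_prefix(shorter, longer):
    """True when `longer` starts with all the words of `shorter` and adds more."""
    a, b = shorter.split(), longer.split()
    return len(a) < len(b) and b[:len(a)] == a


def collapse_prefix_ladders(scored, strategy="best"):
    """Drop suggestions that lose to a ladder neighbour under `strategy`.

    `scored` is a list of (text, score) pairs; the input order is preserved.
    """
    def beats(other, item):
        # Does `other` win over `item`, given that they sit on the same ladder?
        if strategy == "prefer_longest":
            return len(other[0].split()) > len(item[0].split())
        if strategy == "prefer_shortest":
            return len(other[0].split()) < len(item[0].split())
        return other[1] > item[1]          # "best": the higher score wins

    kept = []
    for item in scored:
        rivals = [o for o in scored
                  if is_word_prefix(o[0], item[0]) or is_word_prefix(item[0], o[0])]
        if not any(beats(o, item) for o in rivals):
            kept.append(item[0])
    return kept


scored = [
    ("how to", 0.3),
    ("how to build", 0.9),
    ("how to build a deck", 0.6),
    ("where to buy paint", 0.4),
]
best = collapse_prefix_ladders(scored, "best")
longest = collapse_prefix_ladders(scored, "prefer_longest")
shortest = collapse_prefix_ladders(scored, "prefer_shortest")
```

Each strategy keeps a different rung of the "how to ..." ladder, while the unrelated suggestion survives in all three cases.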
NormalizationConfig
from query_autocomplete.config import NormalizationConfig
normalization = NormalizationConfig(
    lowercase=True,
    unicode_nfkc=True,
    strip_accents=False,
    strip_punctuation=True,
    split_sentences=True,
    pysbd_language=None,
)
Set pysbd_language to a language code such as "en" only if you installed sentence chunking support:
pip install "query-autocomplete[chunking]"
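The first four toggles correspond to familiar text-cleaning steps. The sketch below approximates them with the standard library so you can see their effect on a sample string. The order of the steps and the exact punctuation set are assumptions, `split_sentences` and `pysbd_language` are not modeled, and `normalize` is a hypothetical function rather than the library's own normalizer.

```python
import string
import unicodedata


def normalize(text, lowercase=True, unicode_nfkc=True,
              strip_accents=False, strip_punctuation=True):
    """Apply the toggles in a fixed order and collapse leftover whitespace."""
    if unicode_nfkc:
        text = unicodedata.normalize("NFKC", text)   # e.g. full-width -> ASCII
    if lowercase:
        text = text.lower()
    if strip_accents:
        # Decompose characters, then drop the combining accent marks.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if strip_punctuation:
        # Assumption: only ASCII punctuation is removed, replaced by a space.
        table = str.maketrans(string.punctuation, " " * len(string.punctuation))
        text = text.translate(table)
    return " ".join(text.split())


plain = normalize("How to BUILD, a Caf\u00e9-deck!")
no_accents = normalize("How to BUILD, a Caf\u00e9-deck!", strip_accents=True)
```

With the settings shown in the config above, accents are kept, so "café" and "cafe" would be indexed as different words unless `strip_accents` is turned on.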
Public API Reference
Most users should import from the top-level package:
from query_autocomplete import (
    AdaptiveAutocomplete,
    AdaptiveStore,
    Autocomplete,
    BaseReranker,
    BuildConfig,
    DeleteResult,
    Document,
    ExpansionStep,
    HeuristicReranker,
    IngestResult,
    QualityProfile,
    ScoreBreakdown,
    SuggestConfig,
    SuggestionDiagnostic,
    apply_quality_profile,
)
Main objects:
- Autocomplete: in-memory autocomplete engine
- AdaptiveStore: SQLite-backed mutable document store
- AdaptiveAutocomplete: serving handle returned by store.with_suggest_config(...)
- Document: source text plus optional doc_id and metadata
- BuildConfig: build-time indexing and phrase-mining settings
- SuggestConfig: runtime suggestion and ranking settings
- QualityProfile: one of balanced, precision, recall, code_or_logs, or natural_language
- IngestResult: returned by store.add_documents(...)
- DeleteResult: returned by store.remove_document(...)
- SuggestionDiagnostic: returned by inspect(...)
- ScoreBreakdown: diagnostic score details
- ExpansionStep: diagnostic expansion trace item
- BaseReranker: base class for custom rerankers
- HeuristicReranker: built-in heuristic reranker
- apply_quality_profile: helper for applying profile defaults to configs
More Information
See the project README for the full API walkthrough and release notes:
https://github.com/MarcellM01/query-autocomplete