Cross-Language with gRPC

Start the Vectra gRPC server and query it from Python.

Table of contents

What you’ll build

A cross-language setup where:

  1. A Vectra gRPC server runs in Node.js, managing indexes and computing embeddings
  2. A Python client connects to the server, ingests documents, and runs queries
  3. The server handles all the heavy lifting — the Python client just sends text

Prerequisites

  • Node.js 22.x or newer
  • Python 3.8 or newer
  • An OpenAI API key (for server-side embeddings)
# Node.js — install Vectra globally for the CLI
npm install -g vectra

# Python — install gRPC
pip install grpcio grpcio-tools

Step 1: Create an index and keys file

# Create the index
vectra create ./my-index

# Create keys.json for the embeddings provider
cat > keys.json << 'EOF'
{
  "apiKey": "sk-...",
  "model": "text-embedding-3-small",
  "maxTokens": 8000
}
EOF

Step 2: Start the gRPC server

Single-index mode

Serve one index:

vectra serve ./my-index --keys ./keys.json --port 50051

Multi-index mode

Serve all indexes under a directory — the server auto-detects index subdirectories:

vectra serve --root ./indexes --keys ./keys.json --port 50051

Daemon mode

Run in the background:

vectra serve ./my-index --keys ./keys.json --daemon --pid-file ./vectra.pid

# Stop later
vectra stop --pid-file ./vectra.pid

The server binds to 127.0.0.1 (localhost only). There’s no authentication in v1 — it’s designed for local development and single-machine deployments.

Step 3: Generate Python bindings

vectra generate --language python --output ./python-client

This creates a directory with:

  • vectra_service.proto — the gRPC service definition
  • vectra_client.py — an idiomatic Python wrapper
  • README.md — setup instructions

Follow the generated README to compile the proto stubs:

cd python-client
python -m grpc_tools.protoc \
  -I. \
  --python_out=. \
  --grpc_python_out=. \
  vectra_service.proto

Step 4: Write the Python client

from vectra_client import VectraClient

client = VectraClient('localhost:50051')

# --- Ingest documents ---
# The server computes embeddings — you just send text
client.upsert_document(
    index_name='my-index',
    uri='doc://python-guide',
    text='''
    Python is a high-level, general-purpose programming language.
    Its design philosophy emphasizes code readability with the use
    of significant indentation.
    ''',
    doc_type='txt'
)

client.upsert_document(
    index_name='my-index',
    uri='doc://rust-guide',
    text='''
    Rust is a systems programming language focused on safety,
    speed, and concurrency. It achieves memory safety without
    garbage collection.
    ''',
    doc_type='txt'
)

print('Documents ingested.')

# --- Query ---
results = client.query_documents(
    index_name='my-index',
    query='Which language focuses on memory safety?',
    max_documents=3
)

for result in results:
    print(f'URI: {result.uri}  Score: {result.score:.4f}')
    for section in result.sections:
        print(f'  {section.text[:100]}...')

Step 5: Run it

Terminal 1 — start the server:

vectra serve ./my-index --keys ./keys.json --port 50051

Terminal 2 — run the Python client:

cd python-client
python my_client.py

Expected output:

Documents ingested.
URI: doc://rust-guide  Score: 0.8932
  Rust is a systems programming language focused on safety, speed, and concurrency...

Multi-index operations

In multi-index mode, you can create and manage indexes from the client:

client = VectraClient('localhost:50051')

# List available indexes
indexes = client.list_indexes()
print('Available indexes:', indexes)

# Create a new index
client.create_index('new-index')

# All operations specify which index to target
client.upsert_document(
    index_name='new-index',
    uri='doc://test',
    text='Test document for the new index.',
    doc_type='txt'
)

# Get index stats
stats = client.get_index_stats('new-index')
print(f'Items: {stats.item_count}')

Service API reference

The gRPC server exposes 19 RPCs across 6 categories:

Category RPCs
Index Management CreateIndex, DeleteIndex, ListIndexes
Item Operations InsertItem, UpsertItem, BatchInsertItems, GetItem, DeleteItem, ListItems
Query QueryItems, QueryDocuments
Document Operations UpsertDocument, DeleteDocument, ListDocuments
Stats GetIndexStats, GetCatalogStats
Lifecycle Healthcheck, Shutdown

See the gRPC Server guide for the full API and examples in C# and Rust.

Other languages

The vectra generate command supports 6 languages:

vectra generate --language csharp --output ./csharp-client
vectra generate --language rust --output ./rust-client
vectra generate --language go --output ./go-client
vectra generate --language java --output ./java-client
vectra generate --language typescript --output ./ts-client

Each generated package includes the proto file, an idiomatic client wrapper, and language-specific setup instructions. See the gRPC Server — Language Bindings for client examples in C# and Rust.

Next steps

  • Daemon mode — use --daemon for production deployments. See gRPC Server.
  • Protocol Buffers — create the index with vectra create ./my-index --format protobuf for smaller on-disk files.
  • Monitor — the Healthcheck RPC can be used for liveness probes.
  • Graceful shutdown — the Shutdown RPC and vectra stop allow a 5-second draining window for in-flight requests.