gRPC Server

Cross-language access to Vectra indexes via gRPC.

Table of contents

Overview
Starting the server
Service API
- Key patterns
Language bindings

Overview

Vectra includes a built-in gRPC server that exposes all index operations over the network. This lets any language use Vectra as a vector database — clients send text, the server computes embeddings and executes operations. The server binds to 127.0.0.1 (localhost only).

Starting the server

Single-index mode

Serve one index:

npx vectra serve ./my-index --keys ./keys.json --port 50051

Multi-index mode

Serve all indexes under a directory. The server auto-detects index subdirectories, and new indexes can be created via the CreateIndex RPC.

npx vectra serve --root ./indexes --keys ./keys.json

Daemon mode

Run in the background:

npx vectra serve ./my-index --keys ./keys.json --daemon --pid-file ./vectra.pid
npx vectra stop --pid-file ./vectra.pid   # stop later

Server options

Flag	Default	Description
`--root <dir>`	–	Directory containing multiple index subdirectories
`--port`	50051	Port to bind the gRPC server on
`--keys`	–	Path to keys.json for server-side embeddings
`--daemon`	false	Fork to background as a daemon process
`--pid-file`	auto	Path to PID file (daemon mode)

Service API

The VectraService gRPC service provides 19 RPCs:

Category	RPCs
Index Management	`CreateIndex`, `DeleteIndex`, `ListIndexes`
Item Operations	`InsertItem`, `UpsertItem`, `BatchInsertItems`, `GetItem`, `DeleteItem`, `ListItems`
Query	`QueryItems`, `QueryDocuments`
Document Operations	`UpsertDocument`, `DeleteDocument`, `ListDocuments`
Stats	`GetIndexStats`, `GetCatalogStats`
Lifecycle	`Healthcheck`, `Shutdown`

The full service definition is in proto/vectra_service.proto.

Key patterns

Embeddings are server-side — clients send text, the server generates vectors using the configured embeddings provider.
Multi-index addressing — in multi-index mode, RPCs that target a specific index accept an index_name field.
Graceful shutdown — the Shutdown RPC (and vectra stop) allow a 5-second draining window for in-flight requests.

Language bindings

Generate idiomatic client scaffolding for your language:

npx vectra generate --language <lang> --output <dir>

Each generated package includes the .proto file, an idiomatic client wrapper, and a README with setup instructions.

Supported languages

Language	Client Class	Package Ecosystem	Notes
Python	`VectraClient`	`grpcio`	Run `protoc` to generate stubs
C#	`VectraClient`	`Grpc.Net.Client`	Add Protobuf item to `.csproj`
Rust	`vectra-client` crate	`tonic`	`build.rs` generates stubs
Go	`VectraClient`	`google.golang.org/grpc`	Run `protoc` for stubs
Java	`VectraClient`	gRPC Java	Place proto in `src/main/proto/`
TypeScript	`VectraClient`	`@grpc/grpc-js`	Dynamic proto loading, no codegen

Example: Python client

from vectra_client import VectraClient

client = VectraClient('localhost:50051')

# Insert a document (server handles embedding)
client.upsert_document(
    index_name='my-index',
    uri='doc://readme',
    text='Your document text...',
    doc_type='md'
)

# Query by text
results = client.query_documents(
    index_name='my-index',
    query='What is Vectra?',
    max_documents=5
)

Example: C# client

using Vectra.Client;

var client = new VectraClient("localhost:50051");

await client.UpsertDocumentAsync(
    indexName: "my-index",
    uri: "doc://readme",
    text: "Your document text...",
    docType: "md"
);

var results = await client.QueryDocumentsAsync(
    indexName: "my-index",
    query: "What is Vectra?",
    maxDocuments: 5
);

Example: Rust client

use vectra_client::VectraClient;

let client = VectraClient::connect("http://localhost:50051").await?;

client.upsert_document(
    "my-index",
    "doc://readme",
    "Your document text...",
    Some("md"),
).await?;

let results = client.query_documents(
    "my-index",
    "What is Vectra?",
    5,
).await?;