Custom Storage Provider (SQLite)
Implement the FileStorage interface with SQLite as the backing store.
Table of contents
What you’ll build
A custom FileStorage implementation that stores all Vectra index data in a single SQLite database file. This gives you:
- Single-file portability — one
.dbfile instead of a folder of JSON/protobuf files - Concurrent readers — SQLite supports multiple simultaneous readers
- SQL queryability — inspect index data with standard SQL tools
Prerequisites
- Node.js 22.x or newer
- An embeddings provider configured (any — OpenAI, local, etc.)
npm install vectra better-sqlite3
npm install -D @types/better-sqlite3
The FileStorage interface
Every Vectra storage backend implements these 9 methods:
interface FileStorage {
createFile(filePath: string, content: Buffer | string): Promise<void>;
createFolder(folderPath: string): Promise<void>;
deleteFile(filePath: string): Promise<void>;
deleteFolder(folderPath: string): Promise<void>;
getDetails(fileOrFolderPath: string): Promise<FileDetails>;
listFiles(folderPath: string, filter?: 'files' | 'folders' | 'all'): Promise<FileDetails[]>;
pathExists(fileOrFolderPath: string): Promise<boolean>;
readFile(filePath: string): Promise<Buffer>;
upsertFile(filePath: string, content: Buffer | string): Promise<void>;
}
Key behavioral contracts:
createFilemust throw if the file already existscreateFoldermust create parent directories recursivelydeleteFoldermust remove the folder and all its contentsupsertFilecreates or overwrites
Step 1: Design the schema
We need two tables — one for files, one for folders:
import Database from 'better-sqlite3';
function initDatabase(dbPath: string): Database.Database {
const db = new Database(dbPath);
db.pragma('journal_mode = WAL'); // better concurrent read performance
db.exec(`
CREATE TABLE IF NOT EXISTS files (
path TEXT PRIMARY KEY,
content BLOB NOT NULL,
is_folder INTEGER NOT NULL DEFAULT 0
);
`);
return db;
}
We store folders as rows with is_folder = 1 and empty content. Files have is_folder = 0 and their content as a BLOB (supports both JSON text and protobuf binary).
Step 2: Implement the class
import { FileStorage, FileDetails } from 'vectra';
import Database from 'better-sqlite3';
import { pathUtils } from 'vectra';
export class SQLiteStorage implements FileStorage {
private db: Database.Database;
constructor(dbPath: string) {
this.db = initDatabase(dbPath);
}
async createFile(filePath: string, content: Buffer | string): Promise<void> {
const normalized = pathUtils.normalize(filePath);
const existing = this.db.prepare('SELECT 1 FROM files WHERE path = ? AND is_folder = 0').get(normalized);
if (existing) {
throw new Error(`File already exists: ${filePath}`);
}
const buf = typeof content === 'string' ? Buffer.from(content, 'utf-8') : content;
this.db.prepare('INSERT INTO files (path, content, is_folder) VALUES (?, ?, 0)').run(normalized, buf);
}
async createFolder(folderPath: string): Promise<void> {
const normalized = pathUtils.normalize(folderPath);
// Create parent directories recursively
const parts = normalized.split('/');
let current = '';
for (const part of parts) {
current = current ? `${current}/${part}` : part;
this.db.prepare(
'INSERT OR IGNORE INTO files (path, content, is_folder) VALUES (?, ?, 1)'
).run(current, Buffer.alloc(0));
}
}
async deleteFile(filePath: string): Promise<void> {
const normalized = pathUtils.normalize(filePath);
this.db.prepare('DELETE FROM files WHERE path = ? AND is_folder = 0').run(normalized);
}
async deleteFolder(folderPath: string): Promise<void> {
const normalized = pathUtils.normalize(folderPath);
// Delete the folder and everything under it
this.db.prepare("DELETE FROM files WHERE path = ? OR path LIKE ?").run(
normalized,
`${normalized}/%`
);
}
async getDetails(fileOrFolderPath: string): Promise<FileDetails> {
const normalized = pathUtils.normalize(fileOrFolderPath);
const row = this.db.prepare('SELECT path, is_folder FROM files WHERE path = ?').get(normalized) as
| { path: string; is_folder: number }
| undefined;
if (!row) {
throw new Error(`Not found: ${fileOrFolderPath}`);
}
return {
name: pathUtils.basename(row.path),
path: row.path,
isFolder: row.is_folder === 1,
fileType: row.is_folder === 0 ? pathUtils.basename(row.path).split('.').pop() : undefined,
};
}
async listFiles(folderPath: string, filter?: 'files' | 'folders' | 'all'): Promise<FileDetails[]> {
const normalized = pathUtils.normalize(folderPath);
const prefix = normalized ? `${normalized}/` : '';
// List direct children only (no nested paths)
const rows = this.db
.prepare("SELECT path, is_folder FROM files WHERE path LIKE ? AND path NOT LIKE ?")
.all(`${prefix}%`, `${prefix}%/%`) as Array<{ path: string; is_folder: number }>;
return rows
.filter((row) => {
if (filter === 'files') return row.is_folder === 0;
if (filter === 'folders') return row.is_folder === 1;
return true;
})
.map((row) => ({
name: pathUtils.basename(row.path),
path: row.path,
isFolder: row.is_folder === 1,
fileType: row.is_folder === 0 ? pathUtils.basename(row.path).split('.').pop() : undefined,
}));
}
async pathExists(fileOrFolderPath: string): Promise<boolean> {
const normalized = pathUtils.normalize(fileOrFolderPath);
const row = this.db.prepare('SELECT 1 FROM files WHERE path = ?').get(normalized);
return row !== undefined;
}
async readFile(filePath: string): Promise<Buffer> {
const normalized = pathUtils.normalize(filePath);
const row = this.db.prepare('SELECT content FROM files WHERE path = ? AND is_folder = 0').get(normalized) as
| { content: Buffer }
| undefined;
if (!row) {
throw new Error(`File not found: ${filePath}`);
}
return Buffer.from(row.content);
}
async upsertFile(filePath: string, content: Buffer | string): Promise<void> {
const normalized = pathUtils.normalize(filePath);
const buf = typeof content === 'string' ? Buffer.from(content, 'utf-8') : content;
this.db.prepare(
'INSERT INTO files (path, content, is_folder) VALUES (?, ?, 0) ON CONFLICT(path) DO UPDATE SET content = excluded.content'
).run(normalized, buf);
}
/** Close the database connection when done. */
close(): void {
this.db.close();
}
}
Step 3: Use it with Vectra
Plug the custom storage into any Vectra index — no other code changes needed:
import { LocalDocumentIndex, OpenAIEmbeddings } from 'vectra';
import { SQLiteStorage } from './sqlite-storage';
const storage = new SQLiteStorage('./my-index.db');
const docs = new LocalDocumentIndex({
folderPath: 'my-index', // logical path inside the database
embeddings: new OpenAIEmbeddings({
apiKey: process.env.OPENAI_API_KEY!,
model: 'text-embedding-3-small',
maxTokens: 8000,
}),
storage,
});
if (!(await docs.isIndexCreated())) {
await docs.createIndex({ version: 1 });
}
await docs.upsertDocument('doc://hello', 'Hello from SQLite storage!', 'txt');
const results = await docs.queryDocuments('hello', { maxDocuments: 5 });
console.log('Results:', results.length);
// Inspect with any SQLite tool:
// sqlite3 my-index.db "SELECT path, length(content), is_folder FROM files;"
The folderPath becomes a logical prefix inside the database, not an actual filesystem path. Multiple indexes can coexist in one database by using different folderPath values.
Step 4: Test your implementation
A quick smoke test that exercises all the key operations:
import { SQLiteStorage } from './sqlite-storage';
const storage = new SQLiteStorage(':memory:'); // in-memory for tests
// Folders
await storage.createFolder('test/sub');
console.assert(await storage.pathExists('test'), 'folder should exist');
console.assert(await storage.pathExists('test/sub'), 'subfolder should exist');
// Files
await storage.createFile('test/a.json', '{"hello": "world"}');
console.assert(await storage.pathExists('test/a.json'), 'file should exist');
// Read back
const content = await storage.readFile('test/a.json');
console.assert(content.toString() === '{"hello": "world"}', 'content should match');
// Upsert (overwrite)
await storage.upsertFile('test/a.json', '{"hello": "updated"}');
const updated = await storage.readFile('test/a.json');
console.assert(updated.toString() === '{"hello": "updated"}', 'upsert should overwrite');
// createFile should throw on existing file
try {
await storage.createFile('test/a.json', 'duplicate');
console.assert(false, 'should have thrown');
} catch {
// expected
}
// List files
const files = await storage.listFiles('test', 'files');
console.assert(files.length === 1, 'should list one file');
console.assert(files[0].name === 'a.json', 'file name should match');
// Delete
await storage.deleteFile('test/a.json');
console.assert(!(await storage.pathExists('test/a.json')), 'file should be deleted');
// Delete folder recursively
await storage.createFile('test/sub/b.json', 'data');
await storage.deleteFolder('test');
console.assert(!(await storage.pathExists('test')), 'folder should be deleted');
console.assert(!(await storage.pathExists('test/sub/b.json')), 'nested file should be deleted');
console.log('All tests passed.');
When to choose SQLite
| Scenario | SQLite | LocalFileStorage |
|---|---|---|
| Single-file backup/transfer | Best — one .db file | Requires zipping a folder |
| Concurrent readers | Good — WAL mode | Good — OS-level file locks |
| Inspecting data | SQL queries | Manual JSON/protobuf parsing |
| Maximum I/O performance | Good | Best (direct fs) |
| Browser/Electron | Not available | Use IndexedDBStorage instead |
Next steps
- Protocol Buffers —
SQLiteStoragehandles binary content natively, soProtobufCodecworks out of the box. Passcodec: new ProtobufCodec()to the index constructor. - Connection pooling — for server scenarios, consider pooling
better-sqlite3connections or using a single shared instance. - See the Storage guide for the full
FileStoragecontract and the built-in implementations.