Browser & Electron App
Build a semantic search app that runs entirely in the browser — no server, no API key.
What you’ll build
A browser-based semantic search application that:
- Runs entirely client-side — no backend server
- Uses TransformersEmbeddings for local model inference (no API key)
- Persists data in IndexedDB across page reloads
- Works in Electron with the same code
Prerequisites
- Node.js 22.x or newer (for bundling)
- A bundler: Vite (recommended) or webpack
- No API key needed — embeddings run locally
npm install vectra @huggingface/transformers
Step 1: The vectra/browser entry point
Import from vectra/browser instead of vectra. This entry point excludes Node-specific modules (LocalFileStorage, FileFetcher, WebFetcher, FolderWatcher) and includes browser alternatives.
import {
  LocalDocumentIndex,
  TransformersEmbeddings,
  IndexedDBStorage,
  BrowserWebFetcher,
} from 'vectra/browser';
Bundlers that support conditional exports (Vite, webpack 5+) resolve this automatically. If your bundler doesn’t, import directly from vectra/browser.
Step 2: Initialize storage and embeddings
// Persistent storage via IndexedDB — survives page reloads
const storage = new IndexedDBStorage('my-search-app');
// Local embeddings — downloads model on first use, caches it in the browser
const embeddings = await TransformersEmbeddings.create({
  device: 'auto', // WebGPU if available, falls back to WASM
  dtype: 'q8', // quantized for faster inference + smaller download
  progressCallback: (progress) => {
    console.log(`Model: ${progress.status} ${Math.round(progress.progress || 0)}%`);
  },
});
The first run downloads the model (~30 MB for the default all-MiniLM-L6-v2 with q8 quantization). Subsequent runs load from the browser cache.
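Because the first download can take a while, it may help to route progress events through one small formatter that both the console and the UI can share. The sketch below is illustrative only: the `ModelProgress` shape is an assumption based on the two fields the callback above reads (`status` and `progress`), and `formatProgress` is a hypothetical helper, not part of Vectra.

```typescript
// Assumed event shape: only the two fields used by the callback above.
interface ModelProgress {
  status: string;
  progress?: number; // percentage; may be missing for some events
}

// Turns a progress event into a short status line, clamping the
// percentage to 0-100 so odd values never reach the UI.
function formatProgress(event: ModelProgress): string {
  const raw = Math.round(event.progress ?? 0);
  const pct = Math.max(0, Math.min(100, raw));
  return `Model: ${event.status} ${pct}%`;
}
```

It can then be passed along as `progressCallback: (p) => console.log(formatProgress(p))`.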
Device selection
| Device | When to use |
|---|---|
| 'auto' | Let Vectra pick the best option (default) |
| 'gpu' | Force WebGPU: fastest, but not all browsers support it |
| 'wasm' | Force WASM: works everywhere, good fallback |
| 'cpu' | Most compatible, slowest |
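`'auto'` already handles the fallback, but an app that wants to log or display the chosen device can make the choice itself. The following is a minimal sketch, assuming that `'gpu'` maps to WebGPU as the table states. `pickDevice` and `GpuCapableNavigator` are hypothetical names. The check relies on the standard `navigator.gpu.requestAdapter()` call, which resolves to `null` when no adapter is available.

```typescript
// Minimal structural type for the part of `navigator` this check uses.
interface GpuCapableNavigator {
  gpu?: { requestAdapter(): Promise<unknown> };
}

// Returns 'gpu' only when WebGPU is exposed and an adapter can actually
// be obtained; otherwise falls back to 'wasm'.
async function pickDevice(nav: GpuCapableNavigator): Promise<'gpu' | 'wasm'> {
  if (!nav.gpu) return 'wasm';
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? 'gpu' : 'wasm';
  } catch {
    return 'wasm'; // adapter request failed; treat as unsupported
  }
}

// Usage in a browser (cast because older DOM typings lack `gpu`):
//   const device = await pickDevice(navigator as GpuCapableNavigator);
```

The result can be passed as the `device` option in place of `'auto'`.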
Quantization
| Precision | Model size | Speed | Quality |
|---|---|---|---|
| 'fp32' | ~90 MB | Baseline | Best |
| 'fp16' | ~45 MB | Faster with GPU | Very good |
| 'q8' | ~23 MB | Good balance | Good |
| 'q4' | ~12 MB | Fastest | Acceptable |
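One way to apply this table is to pick the highest-quality precision that fits a download budget. The sketch below hard-codes the approximate sizes from the table; `pickDtype` and the budget parameter are hypothetical and not part of Vectra.

```typescript
type Dtype = 'fp32' | 'fp16' | 'q8' | 'q4';

// Approximate sizes copied from the table above, best quality first.
const DTYPE_SIZES_MB: ReadonlyArray<[Dtype, number]> = [
  ['fp32', 90],
  ['fp16', 45],
  ['q8', 23],
  ['q4', 12],
];

// Returns the highest-quality precision whose download fits the budget,
// falling back to the smallest option when nothing fits.
function pickDtype(maxDownloadMB: number): Dtype {
  for (const [dtype, sizeMB] of DTYPE_SIZES_MB) {
    if (sizeMB <= maxDownloadMB) return dtype;
  }
  return 'q4';
}
```

For example, a 30 MB budget selects `'q8'`, and anything under 12 MB still returns `'q4'` as the smallest available option.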
Step 3: Create the index
const index = new LocalDocumentIndex({
  folderPath: 'search-index', // logical path inside IndexedDB
  embeddings,
  storage,
});
if (!(await index.isIndexCreated())) {
  await index.createIndex({ version: 1 });
}
The folderPath is a logical name inside IndexedDB, not a filesystem path.
Step 4: Add documents
From text
await index.upsertDocument('doc://intro', 'Vectra is a local vector database...', 'txt');
await index.upsertDocument('doc://features', 'Vectra supports metadata filtering...', 'txt');
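When seeding more than a couple of documents, a small loop keeps one failure from aborting the whole batch. This is a sketch under stated assumptions: `DocumentSink` is a structural stand-in that covers only the `upsertDocument(uri, text, docType)` call shown above, `seedDocuments` is a hypothetical helper, and the upserts run sequentially on the assumption that concurrent writes to the same index are not safe.

```typescript
// Structural type: only the one method this helper calls.
interface DocumentSink {
  upsertDocument(uri: string, text: string, docType?: string): Promise<unknown>;
}

interface SeedDoc {
  uri: string;
  text: string;
  docType?: string; // defaults to 'txt' below
}

// Upserts documents one at a time and returns the URIs that failed,
// instead of throwing on the first error.
async function seedDocuments(index: DocumentSink, docs: SeedDoc[]): Promise<string[]> {
  const failed: string[] = [];
  for (const doc of docs) {
    try {
      await index.upsertDocument(doc.uri, doc.text, doc.docType ?? 'txt');
    } catch (err) {
      console.warn(`Failed to index ${doc.uri}:`, err);
      failed.push(doc.uri);
    }
  }
  return failed;
}
```

Calling `await seedDocuments(index, docs)` with the index from Step 3 returns an empty array when every document was stored.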
From web pages
Use BrowserWebFetcher to fetch and convert web pages to text:
const fetcher = new BrowserWebFetcher();
await fetcher.fetch('https://example.com/docs', async (uri, text, docType) => {
  await index.upsertDocument(uri, text, docType);
  return true;
});
BrowserWebFetcher uses fetch() + DOMParser, so it’s subject to CORS restrictions. You can only fetch pages that allow cross-origin requests, or pages from the same origin.
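A quick same-origin check can tell you up front which URLs are certain to be fetchable. The helper below is a hypothetical sketch built on the standard `URL` class. A cross-origin URL may still work if the server sends permissive CORS headers, so a `false` result only means the fetch might be blocked.

```typescript
// True when `target` shares scheme, host, and port with the page origin.
// Relative URLs are resolved against the page origin first.
function isSameOrigin(target: string, pageOrigin: string): boolean {
  try {
    return new URL(target, pageOrigin).origin === new URL(pageOrigin).origin;
  } catch {
    return false; // unparseable URL
  }
}
```

In a page, the second argument would typically be `window.location.origin`.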
Step 5: Query and display results
async function search(query: string): Promise<void> {
  const results = await index.queryDocuments(query, {
    maxDocuments: 5,
    maxChunks: 20,
  });
  for (const result of results) {
    const sections = await result.renderSections(500, 1, true);
    console.log(`[${result.score.toFixed(4)}] ${result.uri}`);
    for (const section of sections) {
      console.log(section.text);
    }
  }
}
await search('What is Vectra?');
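For display, it is often useful to drop weak matches and shorten each hit to a one-line preview. The sketch below assumes only the result members used above (`uri`, `score`, `renderSections`). The `minScore` cutoff, the 80-character preview, the parameter names in the interface, and the `summarize` helper are illustrative choices, not Vectra options.

```typescript
interface Section {
  text: string;
}

// Structural stand-in for the query results used in the example above.
interface SearchResult {
  uri: string;
  score: number;
  renderSections(maxTokens: number, maxSections: number, overlap?: boolean): Promise<Section[]>;
}

// Returns one formatted line per result whose score is at least `minScore`.
async function summarize(results: SearchResult[], minScore: number): Promise<string[]> {
  const lines: string[] = [];
  for (const result of results) {
    if (result.score < minScore) continue;
    const [top] = await result.renderSections(500, 1, true);
    // Collapse whitespace so the preview stays on one line.
    const preview = top ? top.text.replace(/\s+/g, ' ').trim().slice(0, 80) : '';
    lines.push(`[${result.score.toFixed(4)}] ${result.uri}: ${preview}`);
  }
  return lines;
}
```

A suitable cutoff depends on the model and the data, so treat any specific value as something to tune.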
Step 6: Bundler configuration
Vite
Vite handles WASM and worker files automatically. Minimal config:
// vite.config.ts
import { defineConfig } from 'vite';
export default defineConfig({
  optimizeDeps: {
    exclude: ['@huggingface/transformers'], // let Vite handle WASM files
  },
});
Webpack 5
Enable WASM and async module support:
// webpack.config.js
module.exports = {
  experiments: {
    asyncWebAssembly: true,
  },
  resolve: {
    fallback: {
      fs: false,
      path: false,
    },
  },
};
Complete example
<!DOCTYPE html>
<html>
<head><title>Vectra Browser Search</title></head>
<body>
  <input id="query" placeholder="Search..." />
  <button id="search">Search</button>
  <div id="results"></div>
  <script type="module">
    import {
      LocalDocumentIndex,
      TransformersEmbeddings,
      IndexedDBStorage,
    } from 'vectra/browser';

    const storage = new IndexedDBStorage('search-demo');
    const embeddings = await TransformersEmbeddings.create({
      device: 'auto',
      dtype: 'q8',
      progressCallback: (p) => {
        document.getElementById('results').textContent =
          `Loading model: ${p.status} ${Math.round(p.progress || 0)}%`;
      },
    });
    const index = new LocalDocumentIndex({
      folderPath: 'demo-index',
      embeddings,
      storage,
    });
    if (!(await index.isIndexCreated())) {
      await index.createIndex({ version: 1 });
      // Seed with sample data
      await index.upsertDocument('doc://vectra', 'Vectra is a local vector database with sub-millisecond query latency.', 'txt');
      await index.upsertDocument('doc://features', 'Vectra supports metadata filtering, hybrid BM25 retrieval, and multiple storage backends.', 'txt');
      await index.upsertDocument('doc://browser', 'Vectra runs in browsers using IndexedDB for storage and TransformersEmbeddings for local inference.', 'txt');
    }
    document.getElementById('results').textContent = 'Ready. Try a search.';

    document.getElementById('search').addEventListener('click', async () => {
      const query = document.getElementById('query').value;
      if (!query) return;
      const results = await index.queryDocuments(query, { maxDocuments: 5 });
      const el = document.getElementById('results');
      el.innerHTML = '';
      for (const result of results) {
        const sections = await result.renderSections(500, 1, true);
        const div = document.createElement('div');
        // Build nodes with textContent/append so indexed text is never parsed as HTML
        const title = document.createElement('strong');
        title.textContent = result.uri;
        div.append(title, ` (${result.score.toFixed(4)})`);
        for (const section of sections) {
          div.append(document.createElement('br'), section.text);
        }
        el.appendChild(div);
      }
    });
  </script>
</body>
</html>
Electron considerations
The same vectra/browser entry point works in Electron renderer processes:
- Use IndexedDBStorage for persistence (renderer has IndexedDB access)
- TransformersEmbeddings works with both WebGPU and WASM in Electron
- If you need filesystem access, use contextBridge to expose LocalFileStorage from the main process
// In Electron renderer — same code as browser
import { LocalDocumentIndex, TransformersEmbeddings, IndexedDBStorage } from 'vectra/browser';
If nodeIntegration is enabled in your Electron config, you can import from vectra directly and use LocalFileStorage. But the vectra/browser path is recommended for security (renderer shouldn’t have full Node access).
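If the renderer does need files on disk, one common Electron pattern is to expose a narrow API from a preload script and keep the actual file access in the main process. The sketch below is only an outline of that pattern. The `indexFiles` API name and the channel names are hypothetical. `BridgeLike` and `IpcLike` are structural stand-ins for Electron's `contextBridge` and `ipcRenderer`, so the snippet type-checks without the `electron` package.

```typescript
// Stand-ins for the two Electron objects a preload script would use.
interface BridgeLike {
  exposeInMainWorld(apiKey: string, api: unknown): void;
}
interface IpcLike {
  invoke(channel: string, ...args: unknown[]): Promise<unknown>;
}

// Exposes a minimal, explicit file API to the renderer. Every call is
// forwarded over IPC; the renderer never touches Node modules directly.
function exposeIndexFiles(bridge: BridgeLike, ipc: IpcLike): void {
  bridge.exposeInMainWorld('indexFiles', {
    read: (relativePath: string) => ipc.invoke('index-files:read', relativePath),
    write: (relativePath: string, contents: string) =>
      ipc.invoke('index-files:write', relativePath, contents),
  });
}

// In a real preload script:
//   import { contextBridge, ipcRenderer } from 'electron';
//   exposeIndexFiles(contextBridge, ipcRenderer);
```

The main process would register matching `ipcMain.handle(...)` handlers for the two channels and perform the file access there, for example through LocalFileStorage. It should also validate the paths it receives.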
Cleanup
To delete all stored data:
const storage = new IndexedDBStorage('my-search-app');
await storage.destroy(); // deletes the entire IndexedDB database
Next steps
- Aligned tokenizer: use embeddings.getTokenizer() to ensure chunk boundaries match the model’s tokens. See Embeddings Guide.
- GPU acceleration: pass device: 'gpu' for WebGPU inference if your users’ browsers support it.
- Offline-first: since embeddings run locally and data lives in IndexedDB, the app works completely offline after the initial model download.
- See the Storage guide for the full browser compatibility matrix.
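For offline-first use, note that browsers may evict IndexedDB data under storage pressure unless the origin has been granted persistent storage. The sketch below uses the standard `StorageManager` methods `persisted()` and `persist()`. `ensurePersistentStorage` and the `StorageManagerLike` type are hypothetical names, and whether the browser grants the request is up to the browser.

```typescript
// Structural subset of the browser's StorageManager (navigator.storage).
interface StorageManagerLike {
  persisted?(): Promise<boolean>;
  persist?(): Promise<boolean>;
}

// Resolves to true when storage is (or becomes) persistent, and to false
// when the API is unavailable or the browser declines the request.
async function ensurePersistentStorage(storage: StorageManagerLike | undefined): Promise<boolean> {
  if (!storage?.persist) return false;
  if (storage.persisted && (await storage.persisted())) return true;
  return storage.persist();
}

// Usage in a browser:
//   const persistent = await ensurePersistentStorage(navigator.storage);
```

A `false` result does not stop the app from working; it only means the stored index and cached model could be cleared by the browser later.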