Browser & Electron App

Build a semantic search app that runs entirely in the browser — no server, no API key.

Table of contents

What you’ll build
Prerequisites
Step 1: The vectra/browser entry point
Step 2: Initialize storage and embeddings
- Device selection
- Quantization
Step 3: Create the index
Step 4: Add documents
- From text
- From web pages
Step 5: Query and display results
Step 6: Bundler configuration
- Vite
- Webpack 5
Complete example
Electron considerations
Cleanup
Next steps

What you’ll build

A browser-based semantic search application that:

Runs entirely client-side — no backend server
Uses TransformersEmbeddings for local model inference (no API key)
Persists data in IndexedDB across page reloads
Works in Electron with the same code

Prerequisites

Node.js 22.x or newer (for bundling)
A bundler: Vite (recommended) or webpack
No API key needed — embeddings run locally

npm install vectra @huggingface/transformers

Step 1: The vectra/browser entry point

Import from vectra/browser instead of vectra. This entry point excludes Node-specific modules (LocalFileStorage, FileFetcher, WebFetcher, FolderWatcher) and includes browser alternatives.

import {
  LocalDocumentIndex,
  TransformersEmbeddings,
  IndexedDBStorage,
  BrowserWebFetcher,
} from 'vectra/browser';

Bundlers that support conditional exports (Vite, webpack 5+) resolve this automatically. If your bundler doesn’t, import directly from vectra/browser.

Step 2: Initialize storage and embeddings

// Persistent storage via IndexedDB — survives page reloads
const storage = new IndexedDBStorage('my-search-app');

// Local embeddings — downloads model on first use, caches it in the browser
const embeddings = await TransformersEmbeddings.create({
  device: 'auto', // WebGPU if available, falls back to WASM
  dtype: 'q8',    // quantized for faster inference + smaller download
  progressCallback: (progress) => {
    console.log(`Model: ${progress.status} ${Math.round(progress.progress || 0)}%`);
  },
});

The first run downloads the model (~30 MB for the default all-MiniLM-L6-v2 with q8 quantization). Subsequent runs load from the browser cache.

Device selection

Device	When to use
`'auto'`	Let Vectra pick the best option (default)
`'gpu'`	Force WebGPU — fastest, but not all browsers support it
`'wasm'`	Force WASM — works everywhere, good fallback
`'cpu'`	Most compatible, slowest

Quantization

Precision	Model size	Speed	Quality
`'fp32'`	~90 MB	Baseline	Best
`'fp16'`	~45 MB	Faster with GPU	Very good
`'q8'`	~23 MB	Good balance	Good
`'q4'`	~12 MB	Fastest	Acceptable

Step 3: Create the index

const index = new LocalDocumentIndex({
  folderPath: 'search-index', // logical path inside IndexedDB
  embeddings,
  storage,
});

if (!(await index.isIndexCreated())) {
  await index.createIndex({ version: 1 });
}

The folderPath is a logical name inside IndexedDB, not a filesystem path.

Step 4: Add documents

From text

await index.upsertDocument('doc://intro', 'Vectra is a local vector database...', 'txt');
await index.upsertDocument('doc://features', 'Vectra supports metadata filtering...', 'txt');

From web pages

Use BrowserWebFetcher to fetch and convert web pages to text:

const fetcher = new BrowserWebFetcher();
await fetcher.fetch('https://example.com/docs', async (uri, text, docType) => {
  await index.upsertDocument(uri, text, docType);
  return true;
});

BrowserWebFetcher uses fetch() + DOMParser, so it’s subject to CORS restrictions. You can only fetch pages that allow cross-origin requests, or pages from the same origin.

Step 5: Query and display results

async function search(query: string): Promise<void> {
  const results = await index.queryDocuments(query, {
    maxDocuments: 5,
    maxChunks: 20,
  });

  for (const result of results) {
    const sections = await result.renderSections(500, 1, true);
    console.log(`[${result.score.toFixed(4)}] ${result.uri}`);
    for (const section of sections) {
      console.log(section.text);
    }
  }
}

await search('What is Vectra?');

Step 6: Bundler configuration

Vite

Vite handles WASM and worker files automatically. Minimal config:

// vite.config.ts
import { defineConfig } from 'vite';

export default defineConfig({
  optimizeDeps: {
    exclude: ['@huggingface/transformers'], // let Vite handle WASM files
  },
});

Webpack 5

Enable WASM and async module support:

// webpack.config.js
module.exports = {
  experiments: {
    asyncWebAssembly: true,
  },
  resolve: {
    fallback: {
      fs: false,
      path: false,
    },
  },
};

Complete example

<!DOCTYPE html>
<html>
<head><title>Vectra Browser Search</title></head>
<body>
  <input id="query" placeholder="Search..." />
  <button id="search">Search</button>
  <div id="results"></div>

  <script type="module">
    import {
      LocalDocumentIndex,
      TransformersEmbeddings,
      IndexedDBStorage,
    } from 'vectra/browser';

    const storage = new IndexedDBStorage('search-demo');
    const embeddings = await TransformersEmbeddings.create({
      device: 'auto',
      dtype: 'q8',
      progressCallback: (p) => {
        document.getElementById('results').textContent =
          `Loading model: ${p.status} ${Math.round(p.progress || 0)}%`;
      },
    });

    const index = new LocalDocumentIndex({
      folderPath: 'demo-index',
      embeddings,
      storage,
    });

    if (!(await index.isIndexCreated())) {
      await index.createIndex({ version: 1 });

      // Seed with sample data
      await index.upsertDocument('doc://vectra', 'Vectra is a local vector database with sub-millisecond query latency.', 'txt');
      await index.upsertDocument('doc://features', 'Vectra supports metadata filtering, hybrid BM25 retrieval, and multiple storage backends.', 'txt');
      await index.upsertDocument('doc://browser', 'Vectra runs in browsers using IndexedDB for storage and TransformersEmbeddings for local inference.', 'txt');
    }

    document.getElementById('results').textContent = 'Ready — try a search.';

    document.getElementById('search').addEventListener('click', async () => {
      const query = document.getElementById('query').value;
      if (!query) return;

      const results = await index.queryDocuments(query, { maxDocuments: 5 });
      const el = document.getElementById('results');
      el.innerHTML = '';

      for (const result of results) {
        const sections = await result.renderSections(500, 1, true);
        const div = document.createElement('div');
        div.innerHTML = `<strong>${result.uri}</strong> (${result.score.toFixed(4)})<br>${sections.map(s => s.text).join('<br>')}`;
        el.appendChild(div);
      }
    });
  </script>
</body>
</html>

Electron considerations

The same vectra/browser entry point works in Electron renderer processes:

Use IndexedDBStorage for persistence (renderer has IndexedDB access)
TransformersEmbeddings works with both WebGPU and WASM in Electron
If you need filesystem access, use contextBridge to expose LocalFileStorage from the main process

// In Electron renderer — same code as browser
import { LocalDocumentIndex, TransformersEmbeddings, IndexedDBStorage } from 'vectra/browser';

If nodeIntegration is enabled in your Electron config, you can import from vectra directly and use LocalFileStorage. But the vectra/browser path is recommended for security (renderer shouldn’t have full Node access).

Cleanup

To delete all stored data:

const storage = new IndexedDBStorage('my-search-app');
await storage.destroy(); // deletes the entire IndexedDB database

Next steps

Aligned tokenizer — use embeddings.getTokenizer() to ensure chunk boundaries match the model’s tokens. See Embeddings Guide.
GPU acceleration — pass device: 'gpu' for WebGPU inference if your users’ browsers support it.
Offline-first — since embeddings run locally and data lives in IndexedDB, the app works completely offline after the initial model download.
See the Storage guide for the full browser compatibility matrix.