
Search integration

In this series (15 parts)
  1. Backend system design scope
  2. Designing RESTful APIs
  3. Authentication and session management
  4. Database design for backend systems
  5. Caching in backend systems
  6. Background jobs and task queues
  7. File upload and storage
  8. Search integration
  9. Email and notification delivery
  10. Webhooks: design and security
  11. Payments integration
  12. Multi-tenancy patterns
  13. Backend for Frontend (BFF) pattern
  14. GraphQL server design
  15. gRPC and internal service APIs

A database query finds rows that match exact conditions. Search finds documents that are relevant to a user’s intent. The difference is subtle but fundamental: search ranks results by relevance, handles typos, understands synonyms, and returns results in milliseconds even across millions of documents. This article covers how to integrate a search engine (Elasticsearch, OpenSearch, Typesense, Meilisearch) into your backend without turning your data pipeline into a mess.

The indexing problem

Your source of truth lives in your database. Your search engine is a secondary index. The core challenge is keeping them in sync. Every create, update, and delete in the database must eventually be reflected in the search index. Get this wrong and users search for products that no longer exist, or cannot find ones that do.

Synchronous vs event-driven indexing

There are two approaches to keeping the search index in sync with the database.

Synchronous indexing

When the application writes to the database, it also writes to the search index in the same request.

async function createProduct(data) {
  const product = await db.products.create(data);
  // Dual write: if this call fails after the database commit,
  // the two stores diverge until something reconciles them.
  await searchClient.index('products', product.id, {
    name: product.name,
    description: product.description,
    price: product.price,
    categories: product.categories,
  });
  return product;
}

Advantages: simple, immediately consistent.

Disadvantages: if the search engine is slow or down, the API request fails or slows down. Your API’s availability is now coupled to your search engine’s availability.

Event-driven indexing

The application writes to the database and emits an event. A separate consumer reads the event and updates the search index.

graph LR
  API["API Server"] -->|Write| DB["Database"]
  DB -->|CDC / Events| Queue["Event Queue"]
  Queue -->|Consume| Indexer["Search Indexer"]
  Indexer -->|Index| ES["Elasticsearch"]

  style DB fill:#3498db,color:#fff
  style Queue fill:#f39c12,color:#fff
  style ES fill:#2ecc71,color:#fff

Event-driven indexing pipeline. Database changes flow through an event queue to the search indexer.

Advantages: the API is decoupled from search engine availability. If the search engine is down, events queue up and are processed when it recovers. The indexer can batch updates for efficiency.

Disadvantages: the search index is eventually consistent. There is a delay (typically milliseconds to seconds) between a database write and the corresponding search index update.

For most search architectures, event-driven indexing is the right choice. The slight delay is acceptable, and the reliability improvement is significant.

Change data capture (CDC)

CDC tools (Debezium, Maxwell, AWS DMS) read the database’s transaction log and emit events for every row change. This is more reliable than application-level events because:

  • It captures changes made by any process, not just your API (migrations, admin scripts, other services).
  • It does not require modifying application code.
  • It is ordered and complete; you cannot accidentally skip a change.

The trade-off is operational complexity. CDC requires access to the database’s replication stream, which needs careful configuration and monitoring.

Handling deletes

Deletes in search are trickier than creates and updates. Three patterns:

Hard delete

When you delete from the database, also delete from the search index. Simple, but if the search delete fails, you have a ghost result that points to a deleted record.
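A minimal sketch of the pattern with a retry fallback for the failure case; `db`, `searchClient`, and `retryQueue` are hypothetical stand-ins for your real clients, injected as parameters so the failure path is easy to exercise:

```javascript
// Hard delete with a retry fallback. If the search delete fails we
// would otherwise leave a ghost result, so the delete is queued for
// retry instead of being silently dropped.
async function hardDelete(id, { db, searchClient, retryQueue }) {
  await db.deleteProduct(id);
  try {
    await searchClient.delete('products', id);
  } catch (err) {
    await retryQueue.add({ action: 'delete', index: 'products', id });
  }
}
```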

Soft delete with filtering

Mark the document as deleted in the search index instead of removing it. Apply a filter to all search queries that excludes deleted documents:

{
  "query": {
    "bool": {
      "must": { "match": { "name": "laptop" } },
      "must_not": { "term": { "deleted": true } }
    }
  }
}

Then periodically purge deleted documents in a background job. This is the safest pattern because the filter guarantees deleted items never appear in results, even if the purge job is delayed.
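The purge job can use Elasticsearch's delete-by-query API. A sketch of the query it might send, assuming a `deleted_at` timestamp is set alongside the `deleted` flag:

```javascript
// Build the purge job's query: documents flagged deleted more than
// graceDays ago. deleted_at is an assumed field, not a standard one.
function purgeQuery(graceDays) {
  return {
    bool: {
      must: [
        { term: { deleted: true } },
        { range: { deleted_at: { lt: `now-${graceDays}d` } } },
      ],
    },
  };
}

// Run periodically, e.g.:
// await esClient.deleteByQuery({ index: 'products', query: purgeQuery(7) });
```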

Tombstone events

When using CDC, a delete emits a tombstone event (a record with the key but null value). The indexer processes this by removing the document from the search index. This is the standard CDC approach and works well.
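A sketch of how the indexer might translate incoming messages into index actions. The envelope fields (`op`, `before`, `after`) follow Debezium's default event format; the returned action object is a local convention, not a real API:

```javascript
// Translate a Kafka message carrying a Debezium-style change event
// into a search index action.
function toIndexAction(key, value) {
  if (value === null) {
    // Tombstone: the message value is null; the key identifies the row.
    return { type: 'delete', id: key.id };
  }
  const { op, before, after } = value.payload;
  switch (op) {
    case 'c': // create
    case 'u': // update
    case 'r': // snapshot read during initial load
      return { type: 'index', id: after.id, doc: after };
    case 'd': // delete events carry only the before-image
      return { type: 'delete', id: before.id };
    default:
      throw new Error(`unhandled op: ${op}`);
  }
}
```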

Building the indexing pipeline

A robust indexing pipeline has these components:

graph TD
  DB["Database"] -->|CDC| Debezium["Debezium"]
  Debezium -->|Events| Kafka["Kafka / SQS"]
  Kafka -->|Consume| Transform["Transformer<br/>Map DB rows to search docs"]
  Transform -->|Batch| Bulk["Bulk Indexer<br/>Elasticsearch _bulk API"]
  Bulk -->|Index| ES["Elasticsearch"]
  Bulk -->|On failure| DLQ["Dead Letter Queue"]
  DLQ -->|Manual replay| Bulk

  style DB fill:#3498db,color:#fff
  style Kafka fill:#f39c12,color:#fff
  style ES fill:#2ecc71,color:#fff
  style DLQ fill:#e74c3c,color:#fff

Complete indexing pipeline from database to search engine with error handling.

Transformer

The transformer maps database rows to search documents. This is where you:

  • Flatten nested data (join product and category tables into a single search document).
  • Compute derived fields (full name from first + last, lowercase for case-insensitive search).
  • Strip HTML from rich text fields.
  • Add boost fields (popular products get a higher score).
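A sketch of such a transformer, assuming hypothetical product and category row shapes (the naive `stripHtml` stands in for a real HTML sanitizer):

```javascript
// Naive tag stripper for illustration only; use a proper sanitizer
// in production.
function stripHtml(html) {
  return html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
}

// Flatten a product row plus its joined category rows into one
// search document with derived and boost fields.
function toSearchDoc(product, categories) {
  return {
    id: product.id,
    name: product.name,
    name_lower: product.name.toLowerCase(),            // derived: case-insensitive matching
    description: stripHtml(product.description),       // index plain text only
    price: product.price,
    categories: categories.map(c => c.name),           // flattened join
    popularity_boost: Math.log1p(product.sales_count), // boost field for ranking
  };
}
```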

Bulk indexing

Never index one document at a time in production. Use the bulk API to batch hundreds or thousands of documents per request; amortizing the per-request overhead typically improves indexing throughput by an order of magnitude or more.

async function bulkIndex(documents) {
  // Interleave action metadata and document source, as _bulk expects.
  const body = documents.flatMap(doc => [
    { index: { _index: 'products', _id: doc.id } },
    doc,
  ]);
  const response = await esClient.bulk({ body });
  if (response.errors) {
    // Response items come back in request order, so map failures back
    // to the original documents and re-queue those, not the raw
    // response entries.
    const failedDocs = response.items
      .map((item, i) => (item.index && item.index.error ? documents[i] : null))
      .filter(Boolean);
    await deadLetterQueue.addBatch(failedDocs);
  }
}

Full reindex

Sometimes you need to rebuild the entire search index: after changing the mapping, fixing a transformer bug, or migrating to a new search engine. Use an alias strategy:

  1. Create a new index (products_v2) with the new mapping.
  2. Run a full reindex from the database into products_v2.
  3. While reindexing, continue indexing live changes into both products_v1 and products_v2.
  4. Once the reindex completes, swap the alias (products -> products_v2).
  5. Delete products_v1.

This is zero-downtime reindexing. Users see results from the old index until the swap, then immediately see results from the new one.
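Step 4 is atomic when both alias actions are sent in one `_aliases` request. A sketch that builds the request body (executing it would be a single client call such as `esClient.indices.updateAliases({ actions })`):

```javascript
// Build the atomic alias-swap actions. Because both actions apply in
// one request, no query ever sees zero or two indices behind the alias.
function aliasSwapActions(alias, oldIndex, newIndex) {
  return [
    { remove: { index: oldIndex, alias } },
    { add: { index: newIndex, alias } },
  ];
}

// e.g. await esClient.indices.updateAliases({
//   actions: aliasSwapActions('products', 'products_v1', 'products_v2'),
// });
```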

Search relevance tuning basics

Out of the box, Elasticsearch ranks results using BM25 (a term frequency/inverse document frequency algorithm). This works reasonably well, but you will need to tune it.

Field boosting

Not all fields are equally important. A match in the product name should rank higher than a match in the description:

{
  "query": {
    "multi_match": {
      "query": "wireless keyboard",
      "fields": ["name^3", "description^1", "categories^2"]
    }
  }
}

The ^3 means a match in name is weighted 3x more than a match in description.

Function score for business logic

Combine text relevance with business signals:

{
  "query": {
    "function_score": {
      "query": { "match": { "name": "keyboard" } },
      "functions": [
        {
          "field_value_factor": {
            "field": "sales_count",
            "modifier": "log1p",
            "factor": 0.5
          }
        },
        {
          "gauss": {
            "created_at": {
              "origin": "now",
              "scale": "30d",
              "decay": 0.5
            }
          }
        }
      ],
      "boost_mode": "multiply"
    }
  }
}

This query boosts results by sales count (popular products rank higher) and recency (newer products get a slight boost).

Each tuning stage improves relevance. Even simple field boosting can yield a substantial improvement in ranking quality (as measured by metrics such as NDCG) over raw BM25.

Type-ahead and autocomplete

Autocomplete is a distinct search problem. Users expect results after typing 2 to 3 characters, within 50 to 100 milliseconds. This requires specialized data structures.

Completion suggester

Elasticsearch’s completion suggester uses an in-memory FST (finite state transducer) for prefix lookups:

{
  "mappings": {
    "properties": {
      "name_suggest": {
        "type": "completion",
        "contexts": [
          { "name": "category", "type": "category" }
        ]
      }
    }
  }
}

Query:

{
  "suggest": {
    "product-suggest": {
      "prefix": "wire",
      "completion": {
        "field": "name_suggest",
        "size": 5,
        "contexts": {
          "category": "electronics"
        }
      }
    }
  }
}

Edge n-gram tokenizer

For more flexible matching (not just prefix), use an edge n-gram tokenizer that generates tokens like wi, wir, wire, wirel at index time (with min_gram set to 2, single-character tokens are not produced). At query time, use a plain analyzer (via search_analyzer) so the user’s term matches against these stored tokens without being n-grammed itself:

{
  "settings": {
    "analysis": {
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 15
        }
      },
      "analyzer": {
        "autocomplete_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": ["lowercase"]
        }
      }
    }
  }
}

Client-side debouncing

Do not send a search request on every keystroke. Debounce by 200 to 300 milliseconds. This cuts request volume substantially, often by more than half, without noticeably affecting the user experience.
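A minimal debounce sketch (the event-listener wiring in the trailing comment is illustrative):

```javascript
// Delay fn until waitMs of silence; intermediate calls are dropped,
// so only the final keystroke in a burst triggers a request.
function debounce(fn, waitMs) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Hypothetical search box wiring:
// input.addEventListener('input', debounce(e => runSearch(e.target.value), 250));
```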

Monitoring search health

Track these metrics:

  • Index lag: time between a database write and the corresponding search index update. Target under 5 seconds.
  • Search latency: p50 and p99 query time. Target p99 under 200ms.
  • Index size: document count and disk usage. Unexpected growth may indicate duplicate indexing.
  • Query error rate: failed searches per minute. Should be near zero.
  • Zero-result rate: percentage of queries that return no results. High rates suggest missing data or poor relevance tuning.
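Index lag, the first metric above, can be measured by stamping each search document with its source row’s `updated_at` and comparing against the clock at index time. A sketch, with assumed field names:

```javascript
// Seconds between the source row's last update and the moment it was
// indexed. Emit this as a gauge/histogram metric per indexed document.
function indexLagSeconds(sourceUpdatedAt, indexedAtMs = Date.now()) {
  return (indexedAtMs - new Date(sourceUpdatedAt).getTime()) / 1000;
}
```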

What comes next

This article rounds out the core infrastructure topics of the Backend System Design series: APIs, authentication, databases, caching, background jobs, file storage, and search. Each topic connects to the others: APIs authenticate with tokens stored in databases, background jobs warm caches and sync search indexes, file uploads flow through scanning pipelines into CDN-served storage. Next in the series: email and notification delivery.

The natural next step is to apply these patterns in a real system design exercise. Pick a product you use daily, sketch the backend architecture, and identify which patterns from this series apply to each component. That exercise will consolidate everything you have learned.
