Rails 8 Application: Implement PostgreSQL search on products

Here’s a concise analysis and solution:

Full-Text Search Strategy

For name column:

  • Recommended: Use a GIN index with pg_trgm extension
  • Benefits:
    • Faster than LIKE/ILIKE
    • Supports partial matches
    • Handles typos with similarity function

For description column:

  • Recommendation: Add a separate searchable column
  • Why?:
    • Text fields can be large
    • Better to index a pre-processed version
    • Consider using tsvector for better performance

Implementation Steps:

  1. First, create a migration to update the indexes:
# db/migrate/[timestamp]_add_search_to_products.rb

class AddSearchToProducts < ActiveRecord::Migration[8.0]
  def change
    # Enable pg_trgm extension for fuzzy matching
    enable_extension 'pg_trgm' unless extension_enabled?('pg_trgm')
    
    # Add a searchable column for description (optional)
    add_column :products, :searchable_description, :tsvector
    add_index :products, :searchable_description, using: :gin

    # Replace the simple index with a GIN index for name
    remove_index :products, name: 'idx_products_name'
    add_index :products, 'name gin_trgm_ops', using: :gin, name: 'index_products_on_name_trgm'
    
    # Optional: Add a generated column for combined search
    execute <<-SQL
      UPDATE products 
      SET searchable_description = to_tsvector('english', coalesce(description, ''));
    SQL
  end
end
  1. In your Product model:
# app/models/product.rb
class Product < ApplicationRecord
  # For name search
  def self.search_by_name(query)
    where("name % ?", query)
      .order("similarity(name, #{ActiveRecord::Base.connection.quote(query)}) DESC")
  end

  # For description search (if needed)
  def self.search_in_description(query)
    where("searchable_description @@ plainto_tsquery('english', :q)", q: query)
  end
end

Usage:

# Search in name (fuzzy matching)
products = Product.search_by_name("laptp")  # Will match "laptop"

# Search in description (exact matching)
products = Product.search_in_description("high performance")

Recommendation:

  1. Start with name-only search – It’s the most common use case
  2. Monitor performance – Check query times with EXPLAIN ANALYZE
  3. Add description search later if needed, using the searchable_description column

Let’s implement a unified search that looks for matches in both name and description. Here’s a streamlined approach using PostgreSQL’s full-text search capabilities:

1. First, create a migration to add search functionality:

# db/migrate/[timestamp]_add_search_to_products.rb
class AddSearchToProducts < ActiveRecord::Migration[8.0]
  def up
    # Enable necessary extensions
    enable_extension 'pg_trgm' unless extension_enabled?('pg_trgm')
    enable_extension 'unaccent' unless extension_enabled?('unaccent')

    # Add a generated column for combined search
    add_column :products, :searchable, :tsvector
    execute <<-SQL
      UPDATE products 
      SET searchable = 
        setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(description, '')), 'B');
    SQL

    # Create a function to automatically update the searchable column
    execute <<-SQL
      CREATE OR REPLACE FUNCTION products_search_trigger() RETURNS trigger AS $$
      begin
        new.searchable :=
          setweight(to_tsvector('english', coalesce(new.name, '')), 'A') ||
          setweight(to_tsvector('english', coalesce(new.description, '')), 'B');
        return new;
      end
      $$ LANGUAGE plpgsql;
    SQL

    # Create a trigger to update the searchable column
    execute <<-SQL
      CREATE TRIGGER tsvector_update_products BEFORE INSERT OR UPDATE
      ON products FOR EACH ROW EXECUTE FUNCTION products_search_trigger();
    SQL

    # Create a GIN index for the searchable column
    add_index :products, :searchable, using: :gin, name: 'index_products_on_searchable'
  end

  def down
    remove_index :products, name: 'index_products_on_searchable'
    execute 'DROP TRIGGER IF EXISTS tsvector_update_products ON products'
    execute 'DROP FUNCTION IF EXISTS products_search_trigger()'
    remove_column :products, :searchable
  end
end

2. In our Product model, add the search method:

ruby# app/models/product.rb
class Product < ApplicationRecord
  # ... existing code ...

  def self.search_in_name_and_desc(query)
    return all if query.blank?

    # Remove any existing tsvector_update_products trigger to prevent recursion
    connection.execute('DROP TRIGGER IF EXISTS tsvector_update_products ON products')

    # Update the searchable column for all records
    Product.find_each do |product|
      product.update_columns(
        searchable: 
          set_weight(to_tsvector('english', product.name || ' '), 'A') ||
          set_weight(to_tsvector('english', product.description || ' '), 'B')
      )
    end

    # Search using the tsvector column
    where("searchable @@ plainto_tsquery('english', :q)", q: query)
      .order("ts_rank(searchable, plainto_tsquery('english', :q)) DESC", q: query)
  end

  private

  def self.set_weight(vector, weight)
    Arel::Nodes::InfixOperation.new('||', 
      Arel::Nodes::InfixOperation.new('||',
        Arel::Nodes::SqlLiteral.new("setweight("),
        Arel::Nodes::Grouping.new(vector)
      ),
      Arel::Nodes::SqlLiteral.new(",'#{weight}')")
    )
  end
end

3. Usage:

# Basic search (searches both name and description)
results = Product.search_in_name_and_desc("laptop")

# You can chain other scopes
results = Product.in_stock.search_in_name_and_desc("gaming laptop").order(price: :asc)

How It Works:

  1. Weighted Search:
    • Name matches (‘A’ weight) are ranked higher than description matches (‘B’ weight)
    • Full-text search handles word stemming and stop words automatically
  2. Performance:
    • The GIN index on the searchable tsvector column makes searches fast
    • The trigger keeps the searchable column in sync automatically
  3. Features:
    • Handles partial matches
    • Ranks results by relevance
    • Case-insensitive
    • Handles common words and stop words

Test the search:

# Test with different search terms
Product.search_in_name_and_desc("gaming laptop")  # Matches in either name or description

Product.search_in_name_and_desc("gaming -mouse")  # Excludes products with "mouse"

Product.search_in_name_and_desc("gaming OR laptop")  # Either term

This implementation gives you a single, efficient search method that handles both name and description fields with proper ranking and performance characteristics.


What unaccent Does:

The unaccent extension in PostgreSQL is a text search dictionary that removes accents (diacritic signs) from lexemes, making searches accent-insensitive. Here’s why it’s useful:

  1. Normalizes Text: Converts accented characters to their unaccented equivalents
    • Example: café → cafeHéllo → Hello
  2. Improves Search:
    • Without it: Searching for “cafe” won’t match “café”
    • With it: Both “cafe” and “café” will match
  3. Common Use Cases:
    • User search functionality where you want to ignore accents
    • International product catalogs
    • User names or locations with special characters

Example in Our Code:

In your search migration, it ensures that:

  • A search for “resume” will also match “résumé”
  • A search for “cafe” will match “café”
  • “Jalapeño” matches “jalapeno”

How It Works with Your Search:

When combined with pg_trgm, it makes your search more user-friendly by:

  1. Removing accents from both the search term and the stored text
  2. Then performing the similarity comparison

Disabling/Enabling:

  • The extension is enabled only if it’s not already enabled (unless extension_enabled?('unaccent'))
  • It’s safe to keep enabled as it doesn’t affect existing data unless explicitly used

 (PostgreSQL Trigram) extension is specifically designed for fuzzy string matching and searching. Here’s what it does:

Key Features of pg_trgm:

  1. Trigram Matching:
    • Breaks strings into groups of 3 consecutive characters
    • Example: “hello” → [” h”, ” he”, “hel”, “ell”, “llo”, “lo “, “o “]
    • Uses these trigrams to find similar strings
  2. Fuzzy Search Capabilities:
    • Finds matches even with typos or slight misspellings
    • Ranks results by similarity
    • Works well for “did you mean?” type suggestions
  3. Common Operators:
    • % – Similarity operator (returns true if strings are similar)
    • <-> – Distance operator (returns a distance metric)
    • %> – Word similarity (best match of any word in the search)

In Our Implementation:

# This uses pg_trgm's similarity matching
where("name % ?", search_term)
  .order("similarity(name, #{ActiveRecord::Base.connection.quote(search_term)}) DESC")

Example Searches:

# Finds "Samsung" even if spelled "Samsng" or "Samsing"
Product.where("name % ?", "Samsng")

# Ranks "iPhone 13" higher than "iPhone 12" when searching for "iPhone 13 Pro"
Product.where("name % ?", "iPhone 13 Pro").order("name <-> 'iPhone 13 Pro'")

Benefits in Our Case:

  1. Typo Tolerance: Users can make small mistakes and still find products
  2. Partial Matches: Finds “phone” in “smartphone”
  3. Ranked Results: More relevant matches appear first

The combination of pg_trgm with unaccent (which handles accents) gives you robust, user-friendly search capabilities right in the database.

Final model and migration

Product Model:

class Product < ApplicationRecord
  has_many :order_items, dependent: :destroy
  has_many :orders, through: :order_items

  validates :name, presence: true
  validates :price, presence: true, numericality: { greater_than: 0 }
  validates :stock, presence: true, numericality: { greater_than_or_equal_to: 0 }

  # Search across both name and description
  # @param query [String] search term
  # @return [ActiveRecord::Relation] matching products ordered by relevance
  def self.search_in_name_and_desc(query)
    return all if query.blank?

    # Remove any existing tsvector_update_products trigger to prevent recursion
    connection.execute('DROP TRIGGER IF EXISTS tsvector_update_products ON products')

    # Update the searchable column for all records
    Product.find_each do |product|
      product.update_columns(
        searchable: 
          set_weight(to_tsvector('english', product.name || ' '), 'A') ||
          set_weight(to_tsvector('english', product.description || ' '), 'B')
      )
    end

    # Search using the tsvector column
    where("searchable @@ plainto_tsquery('english', :q)", q: query)
      .order("ts_rank(searchable, plainto_tsquery('english', :q)) DESC", q: query)
  end

  # Helper method to set weight for tsvector
  def self.set_weight(vector, weight)
    Arel::Nodes::InfixOperation.new('||', 
      Arel::Nodes::InfixOperation.new('||',
        Arel::Nodes::SqlLiteral.new("setweight("),
        Arel::Nodes::Grouping.new(vector)
      ),
      Arel::Nodes::SqlLiteral.new(",'#{weight}')")
    )
  end
  private_class_method :set_weight
end

Product Migration:

class AddSearchableToProducts < ActiveRecord::Migration[8.0]
  def up
    # Enable necessary extensions
    enable_extension 'pg_trgm' unless extension_enabled?('pg_trgm')
    enable_extension 'unaccent' unless extension_enabled?('unaccent')

    # Add searchable column
    add_column :products, :searchable, :tsvector

    # Create a function to update the searchable column
    execute <<-SQL
      CREATE OR REPLACE FUNCTION products_search_trigger() RETURNS trigger AS $$
      begin
        new.searchable :=
          setweight(to_tsvector('english', coalesce(new.name, '')), 'A') ||
          setweight(to_tsvector('english', coalesce(new.description, '')), 'B');
        return new;
      end
      $$ LANGUAGE plpgsql;
    SQL

    # Create a trigger to update the searchable column
    execute <<-SQL
      CREATE TRIGGER tsvector_update_products
      BEFORE INSERT OR UPDATE ON products
      FOR EACH ROW EXECUTE FUNCTION products_search_trigger();
    SQL

    # Update existing records
    Product.find_each(&:touch)

    # Create GIN index for the searchable column
    add_index :products, :searchable, using: :gin, name: 'gin_idx_products_on_searchable'
  end

  def down
    remove_index :products, name: 'gin_idx_products_on_searchable'
    execute 'DROP TRIGGER IF EXISTS tsvector_update_products ON products'
    execute 'DROP FUNCTION IF EXISTS products_search_trigger()'
    remove_column :products, :searchable
  end
end


Happy Coding!