Let’s try to add some sample data first to our database.
Step 1: Install pgxnclient
On macOS (with Homebrew):
brew install pgxnclient
On Ubuntu/Debian:
sudo apt install pgxnclient
Step 2: Install the faker extension via PGXN
pgxn install faker
I get issue with installing faker via pgxn:
~ pgxn install faker
INFO: best version: faker 0.5.3
ERROR: resource not found: 'https://api.pgxn.org/dist/PostgreSQL_Faker/0.5.3/META.json'
⚠️ Note: faker extension we’re trying to install via pgxn is not available or improperly published on the PGXN network. Unfortunately, the faker extension is somewhat unofficial and not actively maintained or reliably hosted.
🚨 You can SKIP STEP 3,4,5 and opt Option 2
Step 3: Build and install the extension into PostgreSQL
cd /path/to/pg_faker # PGXN will print this after install
make
sudo make install
Step 4: Enable it in your database
Inside psql :
CREATE EXTENSION faker;
Step 5: Insert 10,000 fake users
INSERT INTO users (user_id, username, email, phone_number)
SELECT
gs AS user_id,
faker_username(),
faker_email(),
faker_phone_number()
FROM generate_series(1, 10000) AS gs;
Option 2: Use Ruby + Faker gem (if you’re using Rails or Ruby)
If you’re building your app in Rails, use the faker gem directly:
In Ruby:
require 'faker'
require 'pg'
conn = PG.connect(dbname: 'test_db')
(1..10_000).each do |i|
conn.exec_params(
"INSERT INTO users (user_id, username, email, phone_number) VALUES ($1, $2, $3, $4)",
[i, Faker::Internet.username, Faker::Internet.email, Faker::PhoneNumber.phone_number]
)
end
In Rails (for test_db), Create the Rake Task:
Create a file at:
lib/tasks/seed_fake_users.rake
# lib/tasks/seed_fake_users.rake
namespace :db do
desc "Seed 10,000 fake users into the users table"
task seed_fake_users: :environment do
require "faker"
require "pg"
conn = PG.connect(dbname: "test_db")
# If user_id is a serial and you want to reset the sequence after deletion, run:
# conn.exec_params("TRUNCATE TABLE users RESTART IDENTITY")
# delete existing users to load fake users
conn.exec_params("DELETE FROM users")
puts "Seeding 10,000 fake users ...."
(1..10_000).each do |i|
conn.exec_params(
"INSERT INTO users (user_id, username, email, phone_number) VALUES ($1, $2, $3, $4)",
[ i, Faker::Internet.username, Faker::Internet.email, Faker::PhoneNumber.phone_number ]
)
end
puts "Seeded 10,000 fake users into the users table"
conn.close
end
end
# run the task
bin/rails db:seed_fake_users
For Normal Rails Rake Task:
# lib/tasks/seed_fake_users.rake
namespace :db do
desc "Seed 10,000 fake users into the users table"
task seed_fake_users: :environment do
require 'faker'
puts "🌱 Seeding 10,000 fake users..."
users = []
# delete existing users
User.destroy_all
10_000.times do |i|
users << {
user_id: i + 1,
username: Faker::Internet.unique.username,
email: Faker::Internet.unique.email,
phone_number: Faker::PhoneNumber.phone_number
}
end
# Use insert_all for performance
User.insert_all(users)
puts "✅ Done. Inserted 10,000 users."
end
end
# run the task
bin/rails db:seed_fake_users
Now we will discuss about PostgreSQL Extensions and it’s usage.
PostgreSQL extensions are add-ons or plug-ins that extend the core functionality of PostgreSQL. They provide additional capabilities such as new data types, functions, operators, index types, or full features like full-text search, spatial data handling, or fake data generation.
🔧 What Extensions Can Do
Extensions can:
- Add functions (e.g.
gen_random_bytes()frompgcrypto) - Provide data types (e.g.
hstore,uuid,jsonb) - Enable indexing techniques (e.g.
btree_gin,pg_trgm) - Provide tools for testing and development (e.g.
faker,pg_stat_statements) - Enhance performance monitoring, security, or language support
📦 Common PostgreSQL Extensions
| Extension | Purpose |
|---|---|
pgcrypto | Cryptographic functions (e.g., hashing, random byte generation) |
uuid-ossp | Functions to generate UUIDs |
postgis | Spatial and geographic data support |
hstore | Key-value store in a single PostgreSQL column |
pg_trgm | Trigram-based text search and indexing |
citext | Case-insensitive text type |
pg_stat_statements | SQL query statistics collection |
faker | Generates fake but realistic data (for testing) |
📥 Installing and Enabling Extensions
1. Install (if not built-in)
Via package manager or PGXN (PostgreSQL Extension Network), or compile from source.
2. Enable in a database
CREATE EXTENSION extension_name;
Example:
CREATE EXTENSION pgcrypto;
Enabling an extension makes its functionality available to the current database only.
🤔 Why Use Extensions?
- Productivity: Quickly add capabilities without writing custom code.
- Performance: Access to advanced indexing, statistics, and optimization tools.
- Development: Generate test data (
faker), test encryption (pgcrypto), etc. - Modularity: PostgreSQL stays lightweight while letting you add only what you need.
Here’s a categorized list (with a simple visual-style layout) of PostgreSQL extensions that are safe and useful for Rails apps in both development and production environments.
🔌 PostgreSQL Extensions for Rails Apps
# connect psql
psql -U username -d database_name
# list all available extensions
SELECT * FROM pg_available_extensions;
# eg. to install the hstore extension run
CREATE EXTENSION hstore;
# verify the installation
SELECT * FROM pg_extension;
SELECT * FROM pg_extension WHERE extname = 'hstore';
🔐 Security & UUIDs
| Extension | Use Case | Safe for Prod |
|---|---|---|
pgcrypto | Secure random bytes, hashes, UUIDs | ✅ |
uuid-ossp | UUID generation (v1, v4, etc.) | ✅ |
💡 Tip: Use uuid-ossp or pgcrypto to generate UUID primary keys (id: :uuid) in Rails.
📘 PostgreSQL Procedures and Triggers — Explained with Importance and Examples
PostgreSQL is a powerful, open-source relational database that supports advanced features like stored procedures and triggers, which are essential for encapsulating business logic inside the database.
🔹 What are Stored Procedures in PostgreSQL?
A stored procedure is a pre-compiled set of SQL and control-flow statements stored in the database and executed by calling it explicitly.
Purpose: Encapsulate business logic, reuse complex operations, improve performance, and reduce network overhead.
✅ Benefits of Stored Procedures:
- Faster execution (compiled and stored in DB)
- Centralized logic
- Reduced client-server round trips
- Language support: SQL, PL/pgSQL, Python, etc.
🧪 Example: Create a Procedure to Add a New User
CREATE OR REPLACE PROCEDURE add_user(name TEXT, email TEXT)
LANGUAGE plpgsql
AS $$
BEGIN
INSERT INTO users (name, email) VALUES (name, email);
END;
$$;
▶️ Call the procedure:
CALL add_user('John Doe', 'john@example.com');
🔹 What are Triggers in PostgreSQL?
A trigger is a special function that is automatically executed in response to certain events on a table (like INSERT, UPDATE, DELETE).
Purpose: Enforce rules, maintain audit logs, auto-update columns, enforce integrity, etc.
✅ Benefits of Triggers:
- Automate tasks on data changes
- Enforce business rules and constraints
- Keep logs or audit trails
- Maintain derived data or counters
🧪 Example: Trigger to Log Inserted Users
1. Create the audit table:
CREATE TABLE user_audit (
id SERIAL PRIMARY KEY,
user_id INTEGER,
name TEXT,
email TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
2. Create the trigger function:
CREATE OR REPLACE FUNCTION log_user_insert()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO user_audit (user_id, name, email)
VALUES (NEW.id, NEW.name, NEW.email);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
3. Create the trigger on users table:
CREATE TRIGGER after_user_insert
AFTER INSERT ON users
FOR EACH ROW
EXECUTE FUNCTION log_user_insert();
Now, every time a user is inserted, the trigger logs it in the user_audit table automatically.
📌 Difference: Procedures vs. Triggers
| Feature | Stored Procedures | Triggers |
|---|---|---|
| When executed | Called explicitly with CALL | Automatically executed on events |
| Purpose | Batch processing, encapsulate logic | React to data changes automatically |
| Control | Full control by developer | Fire based on database event (Insert, Update, Delete) |
| Returns | No return or OUT parameters | Must return NEW or OLD row in most cases |
🎯 Why Are Procedures and Triggers Important?
✅ Use Cases for Stored Procedures:
- Bulk processing (e.g. daily billing)
- Data import/export
- Account setup workflows
- Multi-step business logic
✅ Use Cases for Triggers:
- Auto update
updated_atcolumn - Enforce soft-deletes
- Maintain counters or summaries (e.g., post comment count)
- Audit logs / change history
- Cascading updates or cleanups
🚀 Real-World Example: Soft Delete Trigger
Instead of deleting records, mark them as deleted = true.
CREATE OR REPLACE FUNCTION soft_delete_user()
RETURNS TRIGGER AS $$
BEGIN
UPDATE users SET deleted = TRUE WHERE id = OLD.id;
RETURN NULL; -- cancel the delete
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER before_user_delete
BEFORE DELETE ON users
FOR EACH ROW
EXECUTE FUNCTION soft_delete_user();
Now any DELETE FROM users WHERE id = 1; will just update the deleted column.
🛠️ Tools to Manage Procedures & Triggers
- pgAdmin (GUI)
- psql (CLI)
- Code-based migrations (via tools like ActiveRecord or
pggem)
🧠 Summary
| Feature | Stored Procedure | Trigger |
|---|---|---|
| Manual/Auto | Manual (CALL) | Auto (event-based) |
| Flexibility | Complex logic, loops, variables | Quick logic, row-based or statement-based |
| Languages | PL/pgSQL, SQL, Python, etc. | PL/pgSQL, SQL |
| Best for | Multi-step workflows | Audit, logging, validation |
Use Postgres RANDOM()
By using RANDOM() in PostgreSQL. If the application uses PostgreSQL’s built-in RANDOM() function to efficiently retrieve a random user from the database. Here’s why this is important:
- Efficiency: PostgreSQL’s RANDOM() is more efficient than loading all records into memory and selecting one randomly in Ruby. This is especially important when dealing with large datasets (like if we have 10000 users).
- Database-level Operation: The randomization happens at the database level rather than the application level, which:
- Reduces memory usage (we don’t need to load unnecessary records)
- Reduces network traffic (only one record is transferred)
- Takes advantage of PostgreSQL’s optimized random number generation
- Single Query: Using RANDOM() allows us to fetch a random record in a single SQL query, typically something like:sqlApply to
SELECT * FROM users ORDER BY RANDOM() LIMIT 1
This is in contrast to less efficient methods like:
- Loading all users and using Ruby’s sample method (
User.all.sample) - Getting a random ID and then querying for it (which would require two queries)
- Using offset with count (which can be slow on large tables)
🔍 Full Text Search & Similarity
| Extension | Use Case | Safe for Prod |
|---|---|---|
pg_trgm | Trigram-based fuzzy search (great with ILIKE & similarity) | ✅ |
unaccent | Remove accents for better search results | ✅ |
fuzzystrmatch | Soundex, Levenshtein distance | ✅ (heavy use = test!) |
💡 Combine pg_trgm + unaccent for powerful search in Rails models using ILIKE.
📊 Performance Monitoring & Dev Insights
| Extension | Use Case | Safe for Prod |
|---|---|---|
pg_stat_statements | Monitor slow queries, frequency | ✅ |
auto_explain | Log plans for slow queries | ✅ |
hypopg | Simulate hypothetical indexes | ✅ (dev only) |
🧪 Dev Tools & Data Generation
| Extension | Use Case | Safe for Prod |
|---|---|---|
faker | Fake data generation for testing | ❌ Dev only |
pgfaker | Community alternative to faker | ❌ Dev only |
📦 Storage & Structure
| Extension | Use Case | Safe for Prod |
|---|---|---|
hstore | Key-value storage in a column | ✅ |
citext | Case-insensitive text | ✅ |
💡 citext is very handy for case-insensitive email columns in Rails.
🗺️ Geospatial (Advanced)
| Extension | Use Case | Safe for Prod |
|---|---|---|
postgis | GIS/spatial data support | ✅ (big apps) |
🎨 Visual Summary
+-------------------+-----------------------------+-----------------+
| Category | Extension | Safe for Prod? |
+-------------------+-----------------------------+-----------------+
| Security/UUIDs | pgcrypto, uuid-ossp | ✅ |
| Search/Fuzziness | pg_trgm, unaccent, fuzzystr | ✅ |
| Monitoring | pg_stat_statements | ✅ |
| Dev Tools | faker, pgfaker | ❌ (Dev only) |
| Text/Storage | citext, hstore | ✅ |
| Geo | postgis | ✅ |
+-------------------+-----------------------------+-----------------+
PostgreSQL Extension for VSCode
# 1. open the Command Palette (Cmd + Shift + P)
# 2. Type 'PostgreSQL: Add Connection'
# 3. Enter the hostname of the database authentication details
# 4. Open Command Palette, type: 'PostgreSQL: New Query'
Enjoy PostgreSQL 🚀
3 thoughts on “Rails 8 App: Setup Test DB in PostgreSQL | Faker | Extensions for Rails app, VSCode”