Design: Our Students Table -> course -> school
We need a UNIQUE constraint on user_id because:
- ✅ One student per user (user_id should be unique)
- ✅ Multiple students per course (course_id can be repeated)
Check Migration Files:
- Students: Github: 20250709072153_create_students.rb
- Courses: Github: 20250709155145_create_courses.rb
- Schools: Github: 20250709155152_create_schools.rb
Key Changes:
- ✅ Added UNIQUE constraint: CONSTRAINT uk_students_user_id UNIQUE (user_id)
- 🔧 Fixed typos: TIMSTAMP → TIMESTAMP, stidents → students
- 📈 Optimized indexes: No need for a separate user_id index, since UNIQUE creates one automatically
Business Logic Validation:
- user_id: One student per user ✅
- course_id: Multiple students per course ✅
- school_id: Multiple students per school ✅
This ensures referential integrity and business rules are enforced at the database level!
📁 Schema Storage Options:
Rails allows you to store the schema in SQL format instead of the default Ruby format. Let me explain the options and why you’d choose each:
1. Ruby Format (Default)
# db/schema.rb
ActiveRecord::Schema[8.0].define(version: 2025_07_09_074552) do
enable_extension "pg_catalog.plpgsql"
create_table "users", force: :cascade do |t|
t.string "first_name", limit: 100, null: false
t.string "email", limit: 150, null: false
t.datetime "created_at", null: false
t.index ["email"], name: "idx_users_email"
end
end
2. SQL Format
-- db/structure.sql
CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;
CREATE TYPE gender_enum AS ENUM ('male', 'female', 'not-specified');
CREATE TABLE users (
id bigserial PRIMARY KEY,
first_name varchar(100) NOT NULL,
email varchar(150) NOT NULL,
created_at timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE UNIQUE INDEX idx_users_email ON users(email);
Check structure.sql File: Github: db/structure.sql
🔧 How to Switch to SQL Format:
Add this to config/application.rb:
module AcademicSqlSoftware
class Application < Rails::Application
# ... existing config ...
# Use SQL format for schema
config.active_record.schema_format = :sql
end
end
Then regenerate the schema:
# Generate SQL schema file
rails db:schema:dump
This will create db/structure.sql instead of db/schema.rb.
Comparison Table:
| Feature | Ruby Format (schema.rb) | SQL Format (structure.sql) |
|---|---|---|
| Default | ✅ Yes | ❌ No |
| Database Agnostic | ✅ Yes | ❌ No (DB-specific) |
| Custom SQL Features | ❌ Limited | ✅ Full support |
| Views | ❌ No | ✅ Yes |
| Stored Procedures | ❌ No | ✅ Yes |
| Triggers | ❌ No | ✅ Yes |
| Custom Types/ENUMs | ❌ Limited | ✅ Full support |
| Partial Indexes | ❌ Limited | ✅ Yes |
| Check Constraints | ❌ Limited | ✅ Yes |
| Version Control | ✅ Easy to diff | ⚠️ Harder to diff |
| Readability | ✅ Ruby-like | ⚠️ Raw SQL |
When to Use Each:
Choose SQL Format When:
- ✅ Using PostgreSQL-specific features (ENUMs, arrays, JSONB operators)
- ✅ Need database views, triggers, or stored procedures
- ✅ Using complex constraints (CHECK, partial indexes)
- ✅ Working with custom data types
- ✅ Team is SQL-proficient
Choose Ruby Format When:
- ✅ Want database portability (switching between PostgreSQL/MySQL/SQLite)
- ✅ Team prefers Ruby syntax
- ✅ Using standard Rails features only
- ✅ Need easier version control diffs
For Our Project:
I recommend SQL format because you’re using:
- ✅ PostgreSQL ENUMs (gender_enum, category_enum) – Rails 8 supports this
- ✅ Custom constraints (CHECK (price > 0)) – Rails 8 supports this
- ✅ Complex foreign key relationships
- ✅ Raw SQL migrations
Our config/application.rb should include:
config.active_record.schema_format = :sql
Then run:
rails db:schema:dump
This will give us full control over PostgreSQL features while maintaining schema consistency! 🎯
🌱 Seed Data
1. Let’s refactor the seed file into modular components. I’ll create separate seeder classes for each model type and organize them properly.
2. Let me create individual seeder classes for each model, then the seeders for the dependent models (orders and students), refactor the main seeds.rb file to use the new modular structure, and create a documentation file describing it.
3. I’ve successfully refactored our seed file into a modular, maintainable architecture. Here’s what we’ve created:
🎯 Modular Seed System Overview
📁 File Structure:
db/
├── seeds.rb # Main orchestrator (now 20 lines vs 246!)
└── seeds/
├── README.md # Complete documentation
├── base_seeder.rb # Shared functionality & config
├── users_seeder.rb # User creation logic
├── products_seeder.rb # Product creation logic
├── courses_seeder.rb # Course creation logic
├── schools_seeder.rb # School creation logic
├── orders_seeder.rb # Order creation (with dependencies)
└── students_seeder.rb # Student creation (with constraints)
Key Benefits:
✅ Separation of Concerns:
- Each model has its own dedicated seeder class
- Clear responsibility boundaries
- Easy to modify individual model generation
✅ Maintainability:
- Single file changes vs monolithic seed file
- Inheritance from BaseSeeder for shared functionality
- Comprehensive documentation
✅ Flexibility:
- Easy to add new models
- Environment-specific configuration
- Configurable record counts
✅ Robustness:
- Proper dependency handling
- Constraint compliance (stock, unique keys)
- Error handling and validation
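To make "constraint compliance" concrete, here is a minimal plain-Ruby sketch of how a seeder could respect the unique user_id rule and the stock limit. The method names (pick_student_user_ids, order_quantity) and the max_per_order default are assumptions for illustration, not the actual seeder code; in the real app the ids would come from the database.

```ruby
# Hypothetical helpers, not taken from the real seeders.

# Pick `count` distinct user ids so that no two students share a user_id,
# which is what the UNIQUE (user_id) constraint requires.
def pick_student_user_ids(user_ids, count)
  raise ArgumentError, "not enough users for #{count} students" if count > user_ids.size

  user_ids.sample(count) # Array#sample(n) never picks the same index twice
end

# Cap an order quantity at the available stock; nil means "skip this product".
def order_quantity(stock, max_per_order = 5)
  return nil if stock <= 0

  rand(1..[stock, max_per_order].min)
end

user_ids = (1..10_000).to_a
student_user_ids = pick_student_user_ids(user_ids, 5_000)
puts student_user_ids.uniq.size == student_user_ids.size # => true
```

Sampling without replacement avoids the retry loop that would be needed if each student picked a random user independently.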
📋 Usage Examples:
# Basic seeding
rails db:seed
# Clear data first
CLEAR_DATA=true rails db:seed
# Scale up for testing
# Edit USERS_COUNT in base_seeder.rb then:
rails db:seed
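How the CLEAR_DATA flag is read is not shown here, so the following is only a sketch of one way it might be interpreted. The clear_data? helper is hypothetical, and it does not model the auto-clear behavior in development.

```ruby
# Hypothetical helper: treats CLEAR_DATA=true (case-insensitive) as "clear first".
# `env` defaults to the process environment but accepts a Hash for testing.
def clear_data?(env = ENV)
  env.fetch("CLEAR_DATA", "").to_s.strip.downcase == "true"
end

puts clear_data?({ "CLEAR_DATA" => "true" }) # => true
puts clear_data?({})                         # => false
```

Passing the environment in as an argument keeps the check easy to exercise without touching the real ENV.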
Check my Github Seed File: Seed File – Academic SQL App
🛠️ Commit the Refactored System:
# Stage all the new modular files
git add db/seeds.rb \
db/seeds/
# Commit with comprehensive message
git commit -m "Refactor seed system into modular architecture" -m "
Modular Structure:
- Split monolithic seeds.rb into focused seeder classes
- Create BaseSeeder with shared functionality and configuration
- Separate concerns: each model has dedicated seeder
- Add comprehensive documentation in seeds/README.md
Benefits:
- Maintainability: Easy to modify individual model generation
- Flexibility: Simple to add new models or adjust data
- Robustness: Proper dependency handling and constraint compliance
- Testability: Individual seeders can be tested in isolation
Features:
- Environment-specific behavior (auto-clear in development)
- Smart constraint handling (stock validation, unique keys)
- Progress indicators and comprehensive reporting
- Configurable record counts via constants
Files:
- db/seeds.rb: Main orchestrator (20 lines vs 246)
- db/seeds/base_seeder.rb: Shared functionality & config
- db/seeds/*_seeder.rb: Individual model seeders
- db/seeds/README.md: Complete documentation
The system is now ready for production use and easy to extend."
🎉 What We Can Do Now:
Scale Easily:
# In base_seeder.rb - change these and run rails db:seed
USERS_COUNT = 10_000
PRODUCTS_COUNT = 1_000
ORDERS_COUNT = 50_000
Add New Models: Create new_model_seeder.rb, inherit from BaseSeeder, add to orchestration.
Customize Data: Edit individual seeder constants and attribute methods.
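As a rough sketch of the "add a new model" step, assuming a hypothetical Teacher model: the BaseSeeder below is a stand-in reduced to one constant, and TeachersSeeder, TEACHERS_COUNT and create_teachers are illustrative names that do not exist in the project.

```ruby
# Stand-in for the real BaseSeeder, reduced to what this sketch needs.
class BaseSeeder
  TEACHERS_COUNT = 3 # hypothetical constant, next to USERS_COUNT etc.
end

# db/seeds/teachers_seeder.rb (hypothetical new model)
class TeachersSeeder < BaseSeeder
  # Takes the ids of already-seeded schools, so schools must be seeded first.
  def self.create_teachers(school_ids)
    Array.new(TEACHERS_COUNT) do |i|
      # The real seeder would persist a record here, e.g. via Teacher.create!
      { name: "Teacher #{i + 1}", school_id: school_ids.sample }
    end
  end
end

teachers = TeachersSeeder.create_teachers([1, 2, 3])
puts teachers.size # => 3
```

In the orchestration, the new seeder would be called after the seeders it depends on (here, schools).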
Test Components:
users = UsersSeeder.create_users
products = ProductsSeeder.create_products
Our seed system now has a clean architecture, comprehensive documentation, and record counts that scale by changing a few constants! 🚀
Possible Refactoring
Now let’s move the helper methods to a dedicated helper folder for better organization. I’ll create a helpers directory and move the shared utility methods there.
Here’s how to refactor your seed helper methods into a new helpers folder under db/seeds:
- Create a directory: db/seeds/helpers/
- Move all shared/random data methods (like random_gender, random_category, etc.) into a module, e.g. db/seeds/helpers/data_generators.rb:
# db/seeds/helpers/data_generators.rb
module DataGenerators
def random_gender
['male', 'female', 'not-specified'].sample
end
def random_category
['men', 'women', 'kids', 'infants'].sample
end
def random_date_of_birth
Date.today - rand(18..80).years - rand(365).days
end
def random_phone
"+1#{rand(100..999)}-#{rand(100..999)}-#{rand(1000..9999)}"
end
def random_price
[9.99, 19.99, 29.99, 49.99, 99.99, 199.99, 299.99, 499.99].sample
end
def random_stock
[0, 5, 10, 25, 50, 100, 500].sample
end
def random_past_date(days_ago_max = 365)
Date.today - rand(1..days_ago_max).days
end
end
- In your seeders and base seeder, require and include the helper:
# At the top of each seeder or in base_seeder.rb
require_relative 'helpers/data_generators'
class UsersSeeder < BaseSeeder
extend DataGenerators
# ... now you can call random_gender, random_category, etc.
end
- Remove the helper methods from BaseSeeder and use the module instead.
Summary:
- Move all shared/random data methods to db/seeds/helpers/data_generators.rb as a module.
- require_relative and extend or include the module in your seeders.
- This keeps your seed logic clean and DRY!
Our BaseSeeder still contains all the random data and progress helper methods, which are now in db/seeds/helpers/data_generators.rb and db/seeds/helpers/ui_helpers.rb.
Here’s what we should do:
- Remove all the methods at the bottom of BaseSeeder that are now in the helpers.
- At the top of base_seeder.rb, require the helper modules.
- If you want to use the helpers as class methods, use extend for each module.
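The difference between extend and include mentioned above can be seen in a small standalone sketch. ClassLevelSeeder and InstanceLevelSeeder are made-up names used only for this comparison, and the module is trimmed to a single method.

```ruby
module DataGenerators
  def random_gender
    ["male", "female", "not-specified"].sample
  end
end

class ClassLevelSeeder
  extend DataGenerators # module methods become class methods

  def self.build_user
    { gender: random_gender }
  end
end

class InstanceLevelSeeder
  include DataGenerators # module methods become instance methods

  def build_user
    { gender: random_gender }
  end
end

puts ClassLevelSeeder.respond_to?(:random_gender)        # => true
puts InstanceLevelSeeder.respond_to?(:random_gender)     # => false
puts InstanceLevelSeeder.new.respond_to?(:random_gender) # => true
```

Since the seeders here expose class methods such as UsersSeeder.create_users, extend is the variant that matches how they are called.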
Here’s the refactor plan:
- Remove:
  - random_gender
  - random_category
  - random_date_of_birth
  - random_phone
  - random_price
  - random_stock
  - random_past_date
  - progress_indicator
- Add at the top:
require_relative 'helpers/data_generators'
require_relative 'helpers/ui_helpers'
extend DataGenerators
extend UiHelpers
Changes Made:
- In UiHelpers:
  - Removed the hardcoded print_summary and print_verification methods.
  - Ensured print_summary_box and print_verification_stats are ready to accept data.
- In BaseSeeder:
  - Now dynamically creates a summary_data hash with the final record counts.
  - Creates a verification_stats hash with key metrics.
  - Calls UiHelpers.print_summary_box and UiHelpers.print_verification_stats, passing in the data.
Benefits of this Refactor:
- More Flexible: You can now easily add or remove items from the summary and verification sections just by changing the hashes in BaseSeeder.
- Decoupled: The UiHelpers module is now purely for presentation and has no knowledge of your specific models.
- Cleaner Code: The logic is now clearly separated: BaseSeeder gathers the data, and UiHelpers displays it.
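To illustrate the "data in, formatting out" split, here is a minimal sketch of a presentation-only helper. The real print_summary_box signature is not shown in this post, so the title: keyword and the summary_box method are assumptions made for the example.

```ruby
module UiHelpers
  # Builds the box as a string so it can be inspected before printing.
  # Knows nothing about models: it only formats label/count pairs.
  def self.summary_box(title, data)
    label_width = data.keys.map { |key| key.to_s.length }.max || 0
    rows = data.map { |label, count| "#{label.to_s.ljust(label_width)} : #{count}" }
    width = ([title.length] + rows.map(&:length)).max
    border = "+" + "-" * (width + 2) + "+"

    lines = [border, "| #{title.ljust(width)} |", border]
    lines += rows.map { |row| "| #{row.ljust(width)} |" }
    (lines << border).join("\n")
  end

  def self.print_summary_box(data, title: "Seeding summary")
    puts summary_box(title, data)
  end
end

# The caller (BaseSeeder in this post) owns the data; the helper only displays it.
summary_data = { users: 10_000, products: 1_000, orders: 5_000 }
UiHelpers.print_summary_box(summary_data)
```

Adding a row to the summary then only means adding a key to the hash.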
Now the BaseSeeder is clean and only contains orchestration logic and configuration. All random data and UI/progress helpers are now properly delegated to the new helper modules.
We can now use these helpers in any seeder by calling random_gender, progress_indicator, etc., thanks to the extend statements. This makes our seed system much more modular, maintainable, and DRY!
A further refactoring step makes sense here: keep all data-related logic, including aggregation for summaries, within the DataGenerators helper.
Let’s proceed. I’ll move the hash construction into new methods inside db/seeds/helpers/data_generators.rb and then update BaseSeeder to call them.
Now that the data generation has been moved, I’ll update the BaseSeeder to call the new methods.
Issues Fixed:
- UiHelpers method calls: All seeder files were calling progress_indicator directly, but since the methods in UiHelpers are defined as class methods, they needed the UiHelpers. prefix. Fixed in:
  - UsersSeeder
  - ProductsSeeder
  - CoursesSeeder
  - SchoolsSeeder
  - OrdersSeeder
  - StudentsSeeder
- DataGenerators references: The BaseSeeder was trying to call DataGenerators.summary_data and DataGenerators.verification_stats, but that file was deleted. I updated it to use the local summary_data and verification_stats methods that were already defined in the BaseSeeder class.
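The first issue comes from how Ruby treats methods defined with def self. inside a module: they belong to the module object itself, and neither extend nor include copies them. A small sketch, assuming a hypothetical progress_indicator(current, total) signature and a made-up report method:

```ruby
module UiHelpers
  # Defined on the module itself, so `extend`/`include` do not copy it.
  def self.progress_indicator(current, total)
    "#{current}/#{total} (#{(current * 100.0 / total).round}%)"
  end
end

class UsersSeeder
  extend UiHelpers

  def self.report(current, total)
    # A bare `progress_indicator(...)` here would raise NoMethodError,
    # so the call is qualified with the module name.
    UiHelpers.progress_indicator(current, total)
  end
end

puts UsersSeeder.respond_to?(:progress_indicator) # => false
puts UsersSeeder.report(50, 200)                  # => 50/200 (25%)
```

The alternative would be to define the helpers as plain instance methods in the module, in which case extend alone is enough and no prefix is needed.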
Seeding Results:
- ✅ 10,000 users created
- ✅ 1,000 products created
- ✅ 50 courses created
- ✅ 25 schools created
- ✅ 5,000 orders created
- ✅ 5,000 students created
The verification stats show proper data relationships:
- 4,004 users have orders
- 647 products have been ordered
- 756 products are active
- 200 students per school (average)
- 100 students per course (average)
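The averages above follow directly from the record counts. As a plain-Ruby stand-in for the queries the real seeder would run through ActiveRecord, a verification_stats hash could be assembled roughly like this; the keyword arguments and key names are assumptions, and the averages are taken over all schools and courses, including any without students.

```ruby
# Hypothetical sketch: works on plain hashes instead of database records.
def verification_stats(orders:, student_count:, school_count:, course_count:)
  {
    users_with_orders: orders.map { |order| order[:user_id] }.uniq.size,
    products_ordered: orders.map { |order| order[:product_id] }.uniq.size,
    avg_students_per_school: student_count / school_count.to_f,
    avg_students_per_course: student_count / course_count.to_f
  }
end

# Checking the reported averages against the seeded counts:
puts 5_000 / 25.0 # => 200.0 students per school
puts 5_000 / 50.0 # => 100.0 students per course
```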
✅ Final Architecture:
- DataGenerators: Is now responsible for all data-related tasks, including generating random primitive data (random_phone) and creating aggregated summary data (summary_data, verification_stats).
- UiHelpers: Is responsible for all presentation logic, taking data as input and printing it to the console in a formatted way.
- Individual Seeders (UsersSeeder, etc.): Responsible for the business logic of creating a specific type of record, using helpers for data and UI.
- BaseSeeder: The main orchestrator. It knows the correct order to call the individual seeders and delegates all data and UI tasks to the appropriate helpers.
- seeds.rb: The single entry point that kicks off the entire process.
to be continued … 🚀