String 𓍯 Operations Using Ruby 💎 Methods

Let’s work through some Ruby coding problems that show how to manipulate strings.

Read up on the following topic before tackling the problems below:

Ruby String scan: https://railsdrop.com/2012/07/07/ruby-string-method-scan/


🧪 Q1: Ruby String Manipulation

Prompt:

Write a method reverse_words that takes a string and returns a new string where the order of words is reversed, but the characters within each word stay in the same order.

Words are separated by spaces. Preserve exact spacing between words (multiple spaces too).

Examples:

reverse_words("hello world")             #=> "world hello"
reverse_words("  good   morning  ruby ") #=> " ruby  morning   good  "

✏️ Answer:

def reverse_words(str)
  str.scan(/\s+|\S+/).reverse.join
end

Explanation:

  • str.scan(/\s+|\S+/) splits the string into tokens that are either a word or a space block (preserves exact spacing).
  • .reverse reverses their order.
  • .join merges them back into a single string.
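
To make the three steps concrete, the sketch below exposes each intermediate value for the second example. The helper name reverse_words_steps is hypothetical and exists only for illustration. The last line shows why String#split is not enough here: it discards the whitespace runs.

```ruby
# Hypothetical helper (not part of the original answer): exposes each step.
def reverse_words_steps(str)
  tokens   = str.scan(/\s+|\S+/) # alternating whitespace runs and words
  reversed = tokens.reverse      # same tokens, opposite order
  { tokens: tokens, reversed: reversed, result: reversed.join }
end

steps = reverse_words_steps("  good   morning  ruby ")
p steps[:tokens] # ["  ", "good", "   ", "morning", "  ", "ruby", " "]
p steps[:result] # " ruby  morning   good  "

# String#split with no arguments drops the whitespace, so spacing is lost.
p "  good   morning  ruby ".split.reverse.join(" ") # "ruby morning good"
```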

Sample Test Cases:

puts reverse_words("hello world")             # => "world hello"
puts reverse_words("  good   morning  ruby ") # => " ruby  morning   good  "
puts reverse_words("one")                     # => "one"
puts reverse_words("")                        # => ""


🧪 Q2: Normalize Email Addresses

Prompt:

Write a method normalize_email that normalizes email addresses using the following rules (similar to Gmail):

  1. Ignore dots (.) in the username part.
  2. Remove everything after a plus (+) in the username.
  3. Keep the domain part unchanged.

The method should return the normalized email string.

Examples:

normalize_email("john.doe+work@gmail.com")     # => "johndoe@gmail.com"
normalize_email("alice+spam@company.org")      # => "alice@company.org"
normalize_email("bob.smith@domain.co.in")      # => "bobsmith@domain.co.in"

✏️ Answer:

def normalize_email(email)
  local, domain = email.split("@")
  local = local.split("+").first.delete(".")
  "#{local}@#{domain}"
end

Explanation:

  • split("@") separates username from domain.
  • split("+").first keeps only the part before +.
  • .delete(".") removes all dots from the username.
  • Concatenate with the domain again.
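
The answer above assumes well-formed input. As a rough sketch of a more defensive variant, the hypothetical normalize_email_safe below splits on the last @ with rpartition and returns nil for input it cannot normalize. Returning nil (rather than raising) and stripping surrounding whitespace are assumptions made for this example, not requirements from the prompt.

```ruby
# Hypothetical defensive variant; the nil-on-invalid behaviour is an assumption.
def normalize_email_safe(email)
  # rpartition splits on the LAST "@": [before, "@", after]
  local, at, domain = email.strip.rpartition("@")
  return nil if at.empty? || local.empty? || domain.empty?

  # partition keeps everything before the first "+", then dots are removed
  local = local.partition("+").first.delete(".")
  return nil if local.empty?

  "#{local}@#{domain}"
end

p normalize_email_safe("john.doe+work@gmail.com") # "johndoe@gmail.com"
p normalize_email_safe("no-at-sign")              # nil
p normalize_email_safe("+tag@domain.com")         # nil
```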

Test Cases:

puts normalize_email("john.doe+work@gmail.com")     # => "johndoe@gmail.com"
puts normalize_email("alice+spam@company.org")      # => "alice@company.org"
puts normalize_email("bob.smith@domain.co.in")      # => "bobsmith@domain.co.in"
puts normalize_email("simple@domain.com")           # => "simple@domain.com"


To be continued... 🚀

Useful Ruby 💎 Methods: A Short Guide – Scan, Inject With Performance Analysis

#scan Method

The scan method is a String method that finds every occurrence of a pattern in a string and returns the matches as an array, which makes it useful for text processing and data extraction tasks.

Basic Syntax

string.scan(pattern) → array
string.scan(pattern) { |match| block } → string
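
The two forms differ in what they return, which is easy to miss. The sketch below (the method name scan_forms is made up for this illustration) runs both forms on the same string and reports what each call gives back.

```ruby
# Hypothetical helper: compares the return values of both scan forms.
def scan_forms(text)
  # Without a block: an array of matches is returned.
  array_result = text.scan(/\d/)

  # With a block: each match is yielded, and the receiver itself is returned.
  collected = []
  block_result = text.scan(/\d/) { |digit| collected << digit.to_i }

  { array_result: array_result, collected: collected, block_result: block_result }
end

text = "a1 b2 c3"
forms = scan_forms(text)
p forms[:array_result]              # ["1", "2", "3"]
p forms[:collected]                 # [1, 2, 3]
p forms[:block_result].equal?(text) # true
```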

Examples

Simple Word Matching

text = "hello world hello ruby"
matches = text.scan(/hello/)
puts matches.inspect
# Output: ["hello", "hello"]

Matching Words by Length

text = "The quick brown fox jumps over the lazy dog"
matches = text.scan(/\b\w{3}\b/)  # Find all 3-letter words
puts matches.inspect
# Output: ["The", "fox", "the", "dog"]

1. Find All Matches

"hello world".scan(/\w+/) # => ["hello", "world"]  

2. Extract Numbers

"Age: 25, Price: $50".scan(/\d+/) # => ["25", "50"]  

3. Matching All Characters

"hello".scan(/./) { |c| puts c }
# Output:
# h
# e
# l
# l
# o

4. Capture Groups (Returns Arrays)

"Name: Alice, Age: 30".scan(/(\w+): (\w+)/)
# => [["Name", "Alice"], ["Age", "30"]]

When the regex contains capture groups (parentheses), scan returns an array of arrays, with one inner array of captures per match:

text = "John: 30, Jane: 25, Alex: 40"
matches = text.scan(/(\w+): (\d+)/)
puts matches.inspect
# Output: [["John", "30"], ["Jane", "25"], ["Alex", "40"]]

5. Iterate with a Block

"a1 b2 c3".scan(/(\w)(\d)/) { |letter, num| puts "#{letter} -> #{num}" }
# Output:
# a -> 1
# b -> 2
# c -> 3

Even with a single capture group, the block receives an array of captures, so read the value with match[0]:

text = "Prices: $10, $20, $30"
total = 0
text.scan(/\$(\d+)/) { |match| total += match[0].to_i }
puts total
# Output: 60

6. Case-Insensitive Search

"Ruby is COOL!".scan(/cool/i) # => ["COOL"]

7. Extract Email Addresses

"Email me at test@mail.com".scan(/\S+@\S+/) # => ["test@mail.com"]

A stricter pattern avoids picking up surrounding punctuation:

text = "Contact us at support@example.com or sales@company.org"
emails = text.scan(/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/)
puts emails.inspect
# Output: ["support@example.com", "sales@company.org"]
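
Building on the capture-group examples earlier in this section, the pairs that scan returns convert naturally into a hash. This is a small sketch; the method name ages_by_name is hypothetical, and it assumes every entry follows the "Name: number" shape.

```ruby
# Hypothetical helper: turns "Name: age" pairs into a name => age hash.
def ages_by_name(text)
  text.scan(/(\w+): (\d+)/)                 # [["John", "30"], ...]
      .map { |name, age| [name, age.to_i] } # convert the captured digits
      .to_h                                 # array of pairs -> hash
end

ages = ages_by_name("John: 30, Jane: 25, Alex: 40")
p ages["Jane"] # 25
p ages.keys    # ["John", "Jane", "Alex"]
```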

Performance Characteristics: Ruby’s #scan Method

The #scan method is generally efficient for most common string processing tasks, but its performance depends on several factors:

  1. String length – Larger strings take longer to process
  2. Pattern complexity – Simple patterns are faster than complex regex
  3. Number of matches – More matches mean more memory allocation

Performance Considerations

1. Time Complexity 🧮

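  • Best case: O(n), where n is the string length
  • Worst case: much worse than linear; patterns that backtrack heavily (for example, nested quantifiers) can take polynomial or even exponential time on some inputs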

2. Memory Usage 🧠

  • Creates an array with all matches
  • Each match is a new string object (memory intensive for large results)

Benchmark 📈 Examples

require 'benchmark'

large_text = "Lorem ipsum " * 10_000

# Compare a simple pattern with one that uses a group and a lookahead
Benchmark.bm(13) do |x|
  x.report("simple scan:") { large_text.scan(/\w+/) }
  x.report("complex scan:") { large_text.scan(/(?:^|\s)(\w+)(?=\s|$)/) }
end

Sample results (timings vary with hardware and Ruby version):

                    user     system      total        real
simple scan:    0.020000   0.000000   0.020000 (  0.018123)
complex scan:   0.050000   0.010000   0.060000 (  0.054678)

Optimization Tips 💡

  1. Use simpler patterns when possible:
   # Slower
   text.scan(/(?:^|\s)(\w+)(?=\s|$)/)

   # Faster, and close enough for plain space-separated words
   # (note: returns a flat array of strings, not arrays of captures)
   text.scan(/\b\w+\b/)
  2. Avoid capture groups if you don’t need them:
   # Slower (creates match groups)
   text.scan(/(\w+)/)

   # Faster
   text.scan(/\w+/)
  3. Use blocks to avoid large arrays:
   # Stores all matches in memory
   matches = text.scan(pattern)

   # Processes matches without storing them (process stands in for your own method)
   text.scan(pattern) { |m| process(m) }
  4. Consider alternatives for very large strings:
   # For simple splits, String#split might be faster
   words = text.split

   # For streaming processing, use StringIO
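
As a rough illustration of the streaming idea, the sketch below scans an IO-like object one line at a time instead of scanning one huge string. The method name count_words_streaming is hypothetical, and the approach assumes no match needs to span a line break.

```ruby
require "stringio"

# Hypothetical helper: counts words line by line, so only one line's
# matches are handled at any moment. Assumes matches never cross lines.
def count_words_streaming(io)
  count = 0
  io.each_line do |line|
    line.scan(/\w+/) { count += 1 } # block form: no match array is built
  end
  count
end

io = StringIO.new("Lorem ipsum\ndolor sit amet\n")
puts count_words_streaming(io) # prints 5
```

The same method works with a File object, since it only relies on each_line.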

When to Be Cautious ⚠️

  • Processing multi-megabyte strings
  • Using highly complex regular expressions
  • When you only need the first few matches (consider #match instead)
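
For the last point, here is a minimal sketch of two ways to stop early. String#[] with a regex returns only the first match, and break inside a scan block ends the scan as soon as enough matches are collected. The helper name first_matches is made up for this example.

```ruby
# Hypothetical helper: collects at most `limit` matches, then stops scanning.
def first_matches(text, pattern, limit)
  return [] if limit <= 0

  found = []
  text.scan(pattern) do |match|
    found << match
    break if found.size >= limit # leaves scan without visiting the rest
  end
  found
end

text = "a1 b2 c3 d4"
p text[/\w\d/]                    # "a1" (first match only)
p first_matches(text, /\w\d/, 2)  # ["a1", "b2"]
```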

The #scan method is optimized for most common cases, but for performance-critical applications with large inputs, consider benchmarking alternatives.


#inject Method (aka #reduce)

Enumerable#inject takes an optional initial value (the base case) and a block, or a symbol naming a method.

Each element of the Enumerable is passed to the block together with the result accumulated so far, and the block's return value becomes the accumulator for the next element.

In a way, inject "injects" the operation between the elements of the enumerable: [2, 3, 4].inject(:*) computes 2 * 3 * 4. inject is aliased as reduce. Use it when you want to reduce a collection to a single value.

For example:

    product = [ 2, 3, 4 ].inject(1) do |result, next_value|
      result * next_value
    end
    product #=> 24
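
To see how the accumulator is threaded through the block in the product example above, the sketch below records each (accumulator, element) pair as inject runs. The method name traced_product is hypothetical.

```ruby
# Hypothetical helper: multiplies the values and records every block call.
def traced_product(values, initial)
  steps = []
  result = values.inject(initial) do |memo, value|
    steps << [memo, value] # what the block received on this call
    memo * value           # becomes memo on the next call
  end
  [result, steps]
end

result, steps = traced_product([2, 3, 4], 1)
p steps  # [[1, 2], [2, 3], [6, 4]]
p result # 24
```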

Purpose

  • Accumulates values by applying an operation to each element in a collection
  • Can produce a single aggregated result or a compound value

Basic Syntax

collection.inject(initial) { |memo, element| block } → object
collection.inject { |memo, element| block } → object
collection.inject(symbol) → object
collection.inject(initial, symbol) → object

Key Features

  1. Takes an optional initial value
  2. The block receives the memo (accumulator) and current element
  3. Returns the final value of the memo
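
The optional initial value matters most for empty collections. In this small sketch (both method names are made up for the comparison), omitting the initial value makes the first element the starting accumulator, so an empty array yields nil rather than 0.

```ruby
# Hypothetical helpers: the same sum with and without an initial value.
def sum_without_initial(values)
  values.inject { |sum, n| sum + n } # first element seeds the accumulator
end

def sum_with_initial(values)
  values.inject(0) { |sum, n| sum + n } # 0 seeds the accumulator
end

p sum_without_initial([1, 2, 3]) # 6
p sum_with_initial([1, 2, 3])    # 6
p sum_without_initial([])        # nil
p sum_with_initial([])           # 0
```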

👉 Examples

1. Summing Numbers

[1, 2, 3].inject(0) { |sum, n| sum + n } # => 6

2. Finding Maximum Value

[4, 2, 7, 1].inject { |max, n| max > n ? max : n } # => 7

3. Building a Hash

[:a, :b, :c].inject({}) { |h, k| h[k] = k.to_s; h }
# => {:a=>"a", :b=>"b", :c=>"c"}

4. Symbol Shorthand (Ruby 1.9+)

[1, 2, 3].inject(:+) # => 6 (same as sum)
[1, 2, 3].inject(2, :*) # => 12 (2 * 1 * 2 * 3)

5. String Concatenation

%w[r u b y].inject("") { |s, c| s + c.upcase } # => "RUBY"

6. Counting Occurrences

words = %w[apple banana apple cherry apple]
words.inject(Hash.new(0)) { |h, w| h[w] += 1; h }
# => {"apple"=>3, "banana"=>1, "cherry"=>1}
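
For this particular counting task, Enumerable#tally (available in Ruby 2.7 and later) produces the same counts without a hand-written block. A quick sketch comparing the two, with hypothetical method names:

```ruby
# Hypothetical helpers: two ways to count occurrences.
def count_with_inject(words)
  words.inject(Hash.new(0)) { |counts, word| counts[word] += 1; counts }
end

def count_with_tally(words)
  words.tally # requires Ruby 2.7+
end

words = %w[apple banana apple cherry apple]
puts count_with_inject(words) == count_with_tally(words) # true
puts count_with_tally(words)["apple"]                    # 3
```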

Performance Characteristics: Ruby’s #inject Method

  1. Time Complexity: O(n) block calls, since each element is processed exactly once; the total is O(n) only when the block itself runs in constant time
  2. Memory Usage:
  • Generally creates only one accumulator object
  • Avoids intermediate arrays (unlike chained map + reduce)
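
The point about intermediate arrays can be sketched with a sum of squares. The chained version builds a temporary array of squares before reducing it, while the single-pass version folds the squaring into the inject block. Both method names are hypothetical.

```ruby
# Hypothetical helpers: same result, different allocations.
def sum_of_squares_chained(numbers)
  numbers.map { |n| n * n }.inject(0, :+) # map allocates a temporary array
end

def sum_of_squares_single_pass(numbers)
  numbers.inject(0) { |sum, n| sum + n * n } # no temporary array
end

p sum_of_squares_chained([1, 2, 3, 4])     # 30
p sum_of_squares_single_pass([1, 2, 3, 4]) # 30
```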

Benchmark 📈 Examples

require 'benchmark'

large_array = (1..1_000_000).to_a

Benchmark.bm do |x|
  x.report("inject:") { large_array.inject(0, :+) }
  x.report("each + var:") do
    sum = 0
    large_array.each { |n| sum += n }
    sum
  end
end

Sample results (timings vary with hardware and Ruby version); in this run inject is slightly slower than explicit iteration but more concise:

                 user     system      total        real
inject:      0.040000   0.000000   0.040000 (  0.042317)
each + var:  0.030000   0.000000   0.030000 (  0.037894)

Optimization Tips 💡

  1. Use symbol shorthand when possible (faster than blocks):
   # Faster
   array.inject(:+)

   # Slower
   array.inject { |sum, n| sum + n }
  2. Mutate and return the accumulator when building structures:
   # Hashes: assign into the accumulator, then return it
   items.inject({}) { |h, (k,v)| h[k] = v; h }

   # Arrays: append to the accumulator, then return it (transform is a placeholder method)
   items.inject([]) { |a, e| a << e.transform; a }
  3. Avoid unnecessary object creation in blocks:
   # Bad - creates a new string on every iteration
   strings.inject("") { |s, x| s + x.upcase }

   # Good - appends to the accumulator in place
   # (+"" is an unfrozen string, so this also works under frozen_string_literal)
   strings.inject(+"") { |s, x| s << x.upcase }
  4. Consider alternatives for simple cases:
   # For simple sums (Ruby 2.4+), faster than inject(:+)
   array.sum

   # For concatenating strings, faster than inject(:+)
   array.join

When to Be Cautious ⚠️

  • With extremely large collections where memory matters
  • When the block operations are very simple (explicit loop may be faster)
  • When building complex nested structures (consider each_with_object)
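
For the last point, here is a minimal sketch of each_with_object building a nested structure. Unlike inject, the block does not need to return the accumulator, and the block arguments come in the opposite order (element first, accumulator second). The method name group_lengths_by_initial is hypothetical.

```ruby
# Hypothetical helper: groups word lengths under each word's first letter.
def group_lengths_by_initial(words)
  # The default block creates an empty array the first time a key is read.
  groups = Hash.new { |hash, key| hash[key] = [] }
  words.each_with_object(groups) do |word, acc|
    acc[word[0]] << word.length # no need to return acc
  end
end

result = group_lengths_by_initial(%w[apple avocado banana cherry])
p result["a"] # [5, 7]
p result["b"] # [6]
```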

The inject method provides excellent readability with generally good performance for most use cases.