Question

How can I extract commands from a Rails migration file using regex?

I'm trying to get the commands from a Rails migration file as an array, based on a specific migration command, using regex. My code works well in most cases, but it breaks when a command spans multiple lines, and I couldn't fix it.

Example

class AddMissingUniqueIndices < ActiveRecord::Migration
  def self.up
    add_index :tags, :name, unique: true

    remove_index :taggings, :tag_id
    remove_index :taggings, [:taggable_id, :taggable_type, :context]
    add_index :taggings,
              [:tag_id, :taggable_id, :taggable_type, :context, :tagger_id, :tagger_type],
              unique: true, name: 'taggings_idx'
  end

  def self.down
    remove_index :tags, :name

    remove_index :taggings, name: 'taggings_idx'
    add_index :taggings, :tag_id
    add_index :taggings, [:taggable_id, :taggable_type, :context]
  end
end

My objective is to return an array with each command as a separate string. What I expect:

[
  "add_index :tags, :name, unique: true", 
  "remove_index :taggings, :tag_id", 
  "remove_index :taggings, [:taggable_id, :taggable_type, :context]", 
  "add_index :taggings, [:tag_id, :taggable_id, :taggable_type, :context, :tagger_id, :tagger_type], unique: true, name: 'taggings_idx'"
]

First, I separate out the change or self.up block (for old migrations), and then try to use the regex below to collect each add/remove index command into an array:

migration_content = 'migration file in txt'
@table_name = 'taggings'
regex_pattern = /(add|remove)_index\s+:#{@table_name}.*\w+:\s+?\w+/m
columns_to_process = migration_content.to_enum(:scan, regex_pattern).map { Regexp.last_match.to_s.squish }
puts columns_to_process
=> ["remove_index :taggings, :tag_id remove_index :taggings, [:taggable_id, :taggable_type, :context] add_index :taggings, [:tag_id, :taggable_id, :taggable_type, :context, :tagger_id, :tagger_type], unique: true"]

As you can see, it didn't work: the commands come back merged into a single string, and the last one is cut off before name: 'taggings_idx'. The regex works fine for single-line commands. My problem starts when a command is split across several lines, like the last one in self.up, especially when it has this many elements. I couldn't adapt the regex to cover all cases. I also tried to capture everything between one add_index/remove_index and the next one (or end), but that didn't work either. Can anyone help me?

1 Jan 1970

Solution

I think that, before scanning the file content, you could replace every line break that comes after a comma with a space:

migration_content = migration_content.gsub(/,\s*\R/, ', ')

You could also use gsub(/\(\s*\R/, '(') to join multiline method calls where a line ends with (.
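To make the idea concrete, here is a minimal sketch of the two substitutions followed by a scan. It assumes every continuation line follows a trailing comma or an opening parenthesis. The method name extract_index_commands and the sample input are hypothetical, chosen only for illustration, and the whitespace cleanup uses plain Ruby instead of squish so it runs without ActiveSupport.

```ruby
# Hypothetical helper (illustrative name, not a Rails API).
def extract_index_commands(migration_content, table_name)
  # Join continuation lines: a line break right after a comma or an
  # opening parenthesis is treated as part of the same command.
  joined = migration_content
             .gsub(/,\s*\R\s*/, ', ')
             .gsub(/\(\s*\R\s*/, '(')

  # No /m flag here, so "." stops at a line break and each match
  # covers exactly one (now single-line) command.
  pattern = /^\s*((?:add|remove)_index\s+:#{Regexp.escape(table_name)}\b.*)$/
  joined.scan(pattern).flatten.map { |cmd| cmd.strip.gsub(/\s+/, ' ') }
end

# Made-up sample input, shortened from the migration in the question.
sample = <<~RUBY
  add_index :tags, :name, unique: true
  remove_index :taggings, :tag_id
  add_index :taggings,
            [:tag_id, :taggable_id],
            unique: true, name: 'taggings_idx'
RUBY

commands = extract_index_commands(sample, 'taggings')
p commands
```

Dropping the /m flag is the main difference from the regex in the question: once the line breaks inside a command are gone, there is no need for . to match across lines, so the greedy .* can no longer swallow several commands at once.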

2024-07-16
GProst

Solution

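This reads the file line by line instead of relying on a single regex, and collects every add_index/remove_index command for the table, including multiline ones: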

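# migration_content holds the text of the migration file.
@table_name = 'taggings'
commands = []
current_command = ""
inside_method = false

migration_content.each_line do |line|
  stripped_line = line.strip

  if stripped_line =~ /^def\s+(self\.)?(up|down|change)\b/
    # Entering a migration method (self.up/self.down, or up/down/change).
    inside_method = true
  elsif stripped_line =~ /^end$/
    # Leaving a block: flush the pending command if it targets the table.
    inside_method = false
    if current_command.include?(":#{@table_name}")
      commands << current_command.strip unless current_command.empty?
    end
    current_command = ""
  elsif inside_method && stripped_line =~ /^(add|remove)_index/
    # A new command starts: flush the previous one first.
    if current_command.include?(":#{@table_name}")
      commands << current_command.strip unless current_command.empty?
    end
    current_command = stripped_line
  elsif inside_method && !current_command.empty?
    # Continuation line of a multiline command.
    current_command += " #{stripped_line}"
  end
end

# Flush anything still pending after the last line.
if current_command.include?(":#{@table_name}")
  commands << current_command.strip unless current_command.empty?
end

# p prints the array in inspect form, as shown in the output below.
p commands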

Output:

["remove_index :taggings, :tag_id", "remove_index :taggings, [:taggable_id, :taggable_type, :context]", "add_index :taggings, [:tag_id, :taggable_id, :taggable_type, :context, :tagger_id, :tagger_type], unique: true, name: 'taggings_idx'", "remove_index :taggings, name: 'taggings_idx'", "add_index :taggings, :tag_id", "add_index :taggings, [:taggable_id, :taggable_type, :context]"]
2024-07-20
Asbah Ishaq