31

Is there any gem which adds # encoding: UTF-8 to each Ruby file automatically?

Or is there any other way to prevent the invalid multibyte char (US-ASCII) error across an entire Ruby on Rails project (not just in a single class)?

1
  • 3
    This isn't what you asked for, but for what it's worth some text editors (e.g. emacs) automatically insert "#encoding: UTF-8" at the top when you save a ruby file containing UTF-8. Commented Jan 26, 2011 at 13:02

6 Answers

31

Upgrade to Ruby 2.0 or later: it makes UTF-8 the default source encoding, which removes the need for magic comments.
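For projects that cannot upgrade yet, the rule above can be sketched as a small check. This is only an illustration: `magic_comment_needed?` is a hypothetical helper name, and it uses the simplifying assumption that any non-ASCII byte in a file calls for the comment on Ruby versions before 2.0.

```ruby
# Hypothetical helper: decide whether a source string would need a
# "# encoding: UTF-8" magic comment under the given Ruby version.
# Assumption: before 2.0 any non-ASCII byte calls for the comment.
def magic_comment_needed?(source, ruby_version = RUBY_VERSION)
  # Ruby 2.0+ already treats source files as UTF-8 by default.
  return false if Gem::Version.new(ruby_version) >= Gem::Version.new('2.0')
  !source.ascii_only?
end
```

Passing the version explicitly makes it possible to ask what an older interpreter would require while running on a newer one.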

1 Comment

@Subimage I believe most legacy code should work in the newer versions of Ruby; going the other way is harder.
25

Try the magic_encoding gem; it can insert a UTF-8 magic comment into all Ruby files in your app.

[EDIT] Having switched to Sublime Text, I now use the auto-encoding-for-ruby plugin.

4 Comments

Note that this won't be a dependency; it's a tool that edits all the files for you. You can uninstall it afterwards if you want.
This is the Atom version of auto-encoding-for-ruby
@Mirko Did you get the auto-encoding-for-ruby plugin to work with Sublime Text 3 ?
@Mirko Hm, I'm also on ruby 2.x, but I was having problems with running certain scripts in the irb console for some reason.
7

Vim:

:args **/*.ruby
:set hidden
:argdo norm! O# encoding: UTF-8
:wqa

2 Comments

Thanks! Great suggestion. For me, vim crashed (segfault) on the :wqa command during the write, resulting in some written files and a bunch of .swp. So I ended up using :wa, then :q which worked fine. Of course, change .ruby to .rb if the latter is your ruby extension.
@William Denniss: Another one: argdo 0put ='#encoding: UTF-8'
2

If you're using Sublime Text 2, you can use a plugin that automatically includes encoding declaration when needed: https://github.com/elomarns/auto-encoding-for-ruby.

2

How about just running a script?

#!/usr/bin/env ruby1.9.1
require 'find'

# Collect every .rb file with no encoding comment in its first 10 lines.
fixfile = []

Find.find('.') do |path|
  next unless File.file?(path) && path =~ /\.rb\z/
  type = :lib
  found = false
  # Read as raw bytes so non-ASCII content cannot break the regex match.
  File.open(path, 'rb') do |file|
    file.each_with_index do |line, index|
      break if index >= 10
      # A shebang on the first line marks an executable script.
      type = :script if index == 0 && line.start_with?('#!')
      if line =~ /utf/i
        found = true
        break
      end
    end
  end
  fixfile.push(path: path, type: type) unless found
end

fixfile.each do |info|
  path = info[:path]
  backuppath = path + '~'
  type = info[:type]
  begin
    # Keep the original content reachable under the backup name (hard link).
    File.delete(backuppath) if File.exist?(backuppath)
    File.link(path, backuppath)
  rescue SystemCallError => e
    puts "could not make backup file '#{backuppath}' for '#{path}': #{e}"
    raise
  end
  begin
    File.unlink(path)
    File.open(backuppath, 'rb') do |inputfile|
      File.open(path, 'wb') do |outputfile|
        # The shebang must stay on line 1, so copy it before the comment.
        outputfile.write inputfile.readline if type == :script
        outputfile.write "# encoding: utf-8\n"
        inputfile.each { |line| outputfile.write line }
      end
    end
    # The rewritten file is new, so restore the original permissions.
    File.chmod(File.stat(backuppath).mode & 0777, path)
  rescue => e
    puts "error: #{e}"
    exit 1
  end
end

To make it automatic, add this to your Rakefile.

If you only want to update files that actually contain UTF-8 characters, you could run file -bi #{path} and look for charset=utf-8 in the output.

0

Adding # encoding: UTF-8 to each Ruby file automatically only makes sense when your files are really stored in UTF-8.

If your files are encoded in CP850 (AFAIK the default in Windows) and you use non-ASCII characters, you merely replace invalid multibyte char (US-ASCII) with invalid multibyte char (UTF-8).

I would prefer to modify the files manually and check whether each one really is UTF-8.
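To narrow down which files need that manual check, something like the following sketch might help. It sorts raw bytes into pure ASCII, valid UTF-8, or something else. The names `classify_bytes` and `classify_ruby_files` are hypothetical, and a file in another encoding can occasionally pass as valid UTF-8 by coincidence, so treat the result as a hint rather than proof.

```ruby
# Classify raw file bytes as :ascii, :utf8 or :other.
def classify_bytes(bytes)
  # Reinterpret a copy of the bytes as UTF-8 without converting them.
  text = bytes.dup.force_encoding(Encoding::UTF_8)
  return :ascii if text.ascii_only?
  text.valid_encoding? ? :utf8 : :other
end

# Hypothetical helper: group the .rb files under root by classification.
def classify_ruby_files(root = '.')
  Dir.glob(File.join(root, '**', '*.rb')).group_by do |path|
    classify_bytes(File.binread(path))
  end
end
```

Files reported as :other are the ones where adding the magic comment would only swap one error for another; files reported as :ascii do not need the comment at all.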
