
I know there are lots of similar questions, but I haven't found a solution yet. I'm trying to use the CSV parsing library with Ruby 1.9.1, but I keep getting:

/usr/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift': Illegal quoting in line 1. (CSV::MalformedCSVError)

My CSV files were created in Windows 7 but it's Ubuntu 12.04 that I'm using to run the Ruby script, which looks like this:

require 'csv'

CSV.foreach('out.csv', :col_sep => ';') do |row|
   puts row
end

Nothing complicated, just a test, so I assumed the Windows control characters must be causing the problem. Vim shows this:

"Part 1";;;;^M
;;;;^M
;;;;^M
Failure to Lodge Income Tax Return(s);;;;^M
NAME;ADDRESS;OCCUPATION;"NO OF CHARGES";"FINE/PENALTY £"^M
some name;"some,address";Bookkeeper;3;1,250.00^M
some name;"some,address";Haulier;1;600.00^M
some name;"some,address";Scaffolding Hire;1;250.00^M
some name;"some,address";Farmer;2;500.00^M
some name;"some,address";Builder;2;3000.00

I've tried removing the carriage-return control characters that Windows added (^M), but %s/^V^M//g and %s/^M//g both result in "pattern not found". If I run %s/\r//g, the ^M characters are removed, but the same error persists when I run the Ruby script. I've also tried running set ffs=unix,dos, but it has no effect. Thanks.

Update:
If I remove the double quotes around the Part 1 on the first line, then the script prints out what it should and then throws a new error: Unquoted fields do not allow \r or \n (line 10). If I then remove the \r characters, the script runs fine.

I understand that I would have to remove the \r characters, but why will it only work if I unquote the first value?

  • Just for debugging, do File.readlines('out.csv') and see what are the characters present at the end of each line. Commented Apr 11, 2014 at 11:55
  • I was just running some more tests there, and if I remove the quotes around the 'Part 1' on the first line, then there's no error, and it prints out the csv values just fine ?? Commented Apr 11, 2014 at 11:58
  • See this - stackoverflow.com/questions/19350213/… Commented Apr 11, 2014 at 12:31
  • I was just looking at a similar question/answer; it seems to be a popular issue. Anyway, I ran it with the encoding set to 'bom|utf-8' and it runs fine (provided the \r have been removed). Thanks for all your help @ArupRakshit Commented Apr 11, 2014 at 12:33
  • Glad to help. You don't need to remove ^M manually; use :row_sep => "\r" Commented Apr 11, 2014 at 12:37

1 Answer


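The Illegal quoting error was caused by a byte-order mark (BOM) at the very beginning of the file. It didn't show up in editors, but the Ruby CSV library choked on it unless :encoding => 'bom|utf-8' was set. This also explains why unquoting the first value made the error go away: with the BOM in front of it, the opening quote was no longer the first character of the field.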
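
As a minimal sketch of that fix, the snippet below writes a small sample file that starts with a BOM and reads it back with the encoding option set. The sample contents and the temporary file are assumptions made for illustration; only the col_sep and 'bom|utf-8' settings come from the question and the fix above.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample: a UTF-8 BOM followed by a quoted first field,
# mimicking the start of the file shown in the question.
sample = "\uFEFF\"Part 1\";;;;\nNAME;ADDRESS;OCCUPATION;\"NO OF CHARGES\";FINE\n"

rows = []
Tempfile.create(['bom_sample', '.csv']) do |file|
  file.write(sample)
  file.flush

  # 'bom|utf-8' tells Ruby to skip a leading BOM (if present) and read as UTF-8,
  # so the CSV parser sees the opening quote as the first character of the field.
  CSV.foreach(file.path, col_sep: ';', encoding: 'bom|utf-8') do |row|
    rows << row
  end
end

puts rows.first.inspect
```

With the BOM stripped, the first row should come back as "Part 1" followed by four empty (nil) fields.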

Once that was fixed, I still needed to remove all the '^M' characters by running %s/\r//g in vim. Everything worked fine after that.
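
The line endings could probably also be normalized in Ruby rather than in vim. The sketch below rests on an assumption: the vim listing in the question shows ^M on every line except the last, which suggests mixed line endings (CRLF on most lines, bare LF on the final one) and would fit the "line 10" error. The sample string is hypothetical; in practice the text would come from the file, for example read with the 'bom|utf-8' encoding.

```ruby
require 'csv'

# Hypothetical sample with mixed endings: CRLF on the first lines, bare LF on the last.
raw = "NAME;OCCUPATION\r\nsome name;Bookkeeper\r\nsome name;Builder\n"

# Turn every CRLF (or lone CR) into LF so the parser sees one consistent row separator.
normalized = raw.gsub(/\r\n?/, "\n")

rows = CSV.parse(normalized, col_sep: ';')
rows.each { |row| puts row.inspect }
```

This keeps the original file untouched, at the cost of reading the whole file into memory before parsing.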
