I'm working on a Kotlin project where I need to take some incoming data and output a CSV-formatted string. I decided to use the Apache Commons CSV library to build the output. It mostly works, but it handles double quotes in the incoming data in a way I did not expect. If I understand the CSV formatting rules correctly, a double quote inside a value should be escaped with an additional double quote. So if my input string is "foo, the output string should be ""foo. However, when I build my output string using CSVFormat, I instead get the output """foo.
Why am I getting an extra double quote in the output? Is this a bug or expected behavior? Is there any way to work around this without completely disabling quoting?
For reference, my function responsible for creating the CSV string looks like:
fun Rosters.toCsvString(includeHeader: Boolean): String {
    val writer = StringWriter()
    val builder = CSVFormat.DEFAULT.builder().setQuoteMode(QuoteMode.MINIMAL)
    if (includeHeader) {
        builder.setHeader(Rosters.Header::class.java)
    }
    val printer = CSVPrinter(writer, builder.get())
    return writer.use {
        printer.printRecords(this.rosters.map { it.values() })
        writer.toString()
    }
}
If I run a test against that function:
@Test
fun `Test convertRosterToCsvWithSpecialChars`() {
    val expected = """
EPPN,FirstName,LastName,Email,ClassStanding,StudentID,Degree,College,Department,SuitableRole
""foo014"",Foo,Bar,[email protected],,9200#8210,,,,ROLE_TENANT_ADVISOR,
""".trimMargin()
    val worker1 = Roster("\"foo014\"", "Foo", "Bar", "[email protected]",
        studentId = "9200#8210", suitableRole = "ROLE_TENANT_ADVISOR")
    val incomingRosters = Rosters(listOf(worker1))
    val actual = incomingRosters.toCsvString(true)
    assertThat(actual).isEqualToNormalizingNewlines(expected)
}
I get the error:
Expecting actual:
"EPPN,FirstName,LastName,Email,ClassStanding,StudentID,Degree,College,Department,SuitableRole
"""foo014""",Foo,Bar,[email protected],,9200#8210,,,,ROLE_TENANT_ADVISOR
"
to be equal to:
"EPPN,FirstName,LastName,Email,ClassStanding,StudentID,Degree,College,Department,SuitableRole
""foo014"",Foo,Bar,[email protected],,9200#8210,,,,ROLE_TENANT_ADVISOR,
"
when ignoring newline differences ('\r\n' == '\n')
Should "foo really be written as """foo"?
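For comparison, RFC 4180 says that a field containing a double quote must itself be enclosed in double quotes, with each embedded double quote doubled. Below is a minimal sketch of that rule in plain Kotlin. Note that `quoteCsvField` is a hypothetical helper written for illustration, not a function from Apache Commons CSV, and it only approximates what QuoteMode.MINIMAL does (the library quotes in a few additional cases):

```kotlin
// Hypothetical helper, NOT part of Apache Commons CSV.
// Applies the RFC 4180 quoting rule to a single field.
fun quoteCsvField(field: String): String {
    // A field needs quoting if it contains a quote, a comma, or a line break.
    val needsQuoting = field.any { it == '"' || it == ',' || it == '\n' || it == '\r' }
    if (!needsQuoting) return field
    // Double every embedded quote, then wrap the whole field in quotes.
    return "\"" + field.replace("\"", "\"\"") + "\""
}

fun main() {
    println(quoteCsvField("foo"))         // foo
    println(quoteCsvField("\"foo"))       // """foo"
    println(quoteCsvField("\"foo014\""))  // """foo014"""
}
```

Under this reading, the three leading quotes are one opening quote for the field followed by the doubled (escaped) quote from the data, which matches the """foo014""" seen in the test failure above.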