Unicode Clojure unit test output
When unit testing some code that translates ASCII sequences into Unicode characters, I have found a problem with the output of Clojure tests.
I have tested that my terminal can output Unicode characters (by cat-ing the test files) and that works fine, so the problem seems related to Leiningen, Clojure or clojure.test somehow.
Here's an example test (using the Greek section of unicode - I will also be using Greek extended but I assume the same problems will apply):
(deftest bc-string-w-comma
(is (= "αβγ, ΑΒΓ" (parse "abg,*a*b*g"))))
It is meant to fail due to the missing space in the input. The output from lein test is the following:
Testing parse_perseus.test.betacode
FAIL in (bc-string-w-comma) (betacode.clj:15)
expected: (= "???, ???" (parse "abg,*a*b*g"))
actual: (not (= "???, ???" "???,???"))
Testing parse_perseus.test.core
Testing parse_perseus.test.pluralise
Ran 10 tests containing 59 assertions.
1 failures, 0 errors.
What am I doing wrong here? Is this a terminal emulation problem or something clojure-related? I have the same problem running code in the REPL with Slime/swank/emacs. The REPL in emacs only outputs question marks for unicode output (although emacs is quite capable of understanding unicode).
I have tried running this in Terminal and iTerm (OS X) with the same results.
It turns out that you can pass an option to java that forces the output encoding of *out* so that Unicode works, like this:
java -Dfile.encoding=utf-8 -cp lib/clojure-1.2.0.jar:lib/clojure-contrib-1.2.0.jar clojure.main -i src/whatever.clj
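To see why the encoding matters, here is a minimal sketch (round-trip is a hypothetical helper, not part of Clojure or clojure.test). Encoding Greek text to a charset that cannot represent it replaces each letter with a question mark, which matches the symptom in the test output above. I am assuming the JVM default charset was something ASCII-like in the failing setup; the first println shows what your JVM actually picked.

```clojure
;; Hypothetical helper: encode a string to bytes in the named charset,
;; then decode those bytes again with the same charset.
(defn round-trip [^String s ^String charset-name]
  (String. (.getBytes s charset-name) charset-name))

;; The charset this JVM uses by default (influenced by -Dfile.encoding).
(println (.name (java.nio.charset.Charset/defaultCharset)))

;; US-ASCII has no Greek letters, so each one is replaced by "?".
(println (round-trip "αβγ, ΑΒΓ" "US-ASCII")) ; prints ???, ???

;; UTF-8 can represent every character, so the string survives unchanged.
(println (= "αβγ, ΑΒΓ" (round-trip "αβγ, ΑΒΓ" "UTF-8"))) ; prints true
```

If the first line prints something other than UTF-8, that is a hint that the flag above is needed.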
As I'm using Leiningen, I added this property to my project.clj file:
(defproject project_name "1.0.0-SNAPSHOT"
:description "A Clojure Project"
:dependencies [[org.clojure/clojure "1.2.0"]
[org.clojure/clojure-contrib "1.2.0"]]
:dev-dependencies [[swank-clojure "1.2.0"]]
:jvm-opts ["-Dfile.encoding=utf-8"])
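If changing the JVM options is not possible, an alternative that might work is to send the test report through a writer with an explicit encoding. This is only a sketch: utf8-writer is a hypothetical helper, and I have not checked it against the Leiningen setup above. It relies on clojure.test printing its report to *test-out*, and it only affects output, not how input typed at a REPL is decoded.

```clojure
(use 'clojure.test)
(import '(java.io OutputStreamWriter PrintWriter))

;; Hypothetical helper: wrap any OutputStream in an auto-flushing writer
;; that always encodes as UTF-8, whatever the JVM default charset is.
(defn utf8-writer [out-stream]
  (PrintWriter. (OutputStreamWriter. out-stream "UTF-8") true))

;; Fails on purpose (missing space), like the test in the question.
(deftest greek
  (is (= "αβγ, ΑΒΓ" "αβγ,ΑΒΓ")))

;; Rebind *test-out* for the duration of the run so that only the
;; test report goes through the UTF-8 writer.
(binding [*test-out* (utf8-writer System/out)]
  (run-tests))
```

The :jvm-opts setting is still the simpler fix, since it covers everything the JVM prints rather than just the test report.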
Clojure itself seems in the clear (this is Ubuntu 10.10, gnome-terminal, OpenJDK):
john@woc-desktop$ java -cp /home/john/.m2/repository/org/clojure/clojure/1.2.0/clojure-1.2.0.jar:/home/john/.m2/repository/org/clojure/clojure-contrib/1.2.0/clojure-contrib-1.2.0.jar clojure.main
Clojure 1.2.0
user=> (use 'clojure.test)
nil
user=> (defn parse [s] "αβγ,ΑΒΓ")
#'user/parse
user=> (deftest greek (is (= "αβγ, ΑΒΓ" (parse ""))))
#'user/greek
user=> (run-tests)
Testing user
FAIL in (greek) (NO_SOURCE_FILE:3)
expected: (= "αβγ, ΑΒΓ" (parse ""))
actual: (not (= "αβγ, ΑΒΓ" "αβγ,ΑΒΓ"))
Ran 1 tests containing 1 assertions.
1 failures, 0 errors.
{:type :summary, :test 1, :pass 0, :fail 1, :error 0}
user=>
But it does break under emacs/swank/clojure-maven-plugin/maven. At the REPL in Emacs:
> (is "αβγ""αβγ")
slime-net-send: Coding system iso-latin-1-unix not suitable for "000052(:emacs-rex (swank:listener-eval \"(is \\\"αβγ\\\"\\\"αβγ\\\")
\") \"user\" :repl-thread 33)
"
If I use Maven with the simple pom file below and run mvn clojure:repl, then it's OK:
[INFO] [clojure:repl {execution: default-cli}]
Clojure 1.2.0
user=> (use 'clojure.test) (is "αβγ""αβγ")
nil
"αβγ"
user=> (defn parse [s] "αβγ,ΑΒΓ")
#'user/parse
user=> (deftest greek (is (= "αβγ, ΑΒΓ" (parse ""))))
#'user/greek
user=> (run-tests)
Testing user
FAIL in (greek) (NO_SOURCE_FILE:3)
expected: (= "αβγ, ΑΒΓ" (parse ""))
actual: (not (= "αβγ, ΑΒΓ" "αβγ,ΑΒΓ"))
Ran 1 tests containing 1 assertions.
1 failures, 0 errors.
{:type :summary, :test 1, :pass 0, :fail 1, :error 0}
user=>
but if I add the jline library using this snippet:
<dependency>
<groupId>jline</groupId>
<artifactId>jline</artifactId>
<version>0.9.94</version>
</dependency>
then I get:
[INFO] [clojure:repl {execution: default-cli}]
[INFO] Enabling JLine support
Clojure 1.2.0
user=> (use 'clojure.test) (is "αβγ""αβγ")
nil
"���"
user=> (defn parse [s] "αβγ,ΑΒΓ")
#'user/parse
user=> (deftest greek (is (= "αβγ, ΑΒΓ" (parse ""))))
#'user/greek
user=> (run-tests)
Testing user
FAIL in (greek) (NO_SOURCE_FILE:3)
expected: (= "���, ���" (parse ""))
actual: (not (= "���, ���" "���,���"))
Ran 1 tests containing 1 assertions.
1 failures, 0 errors.
{:type :summary, :test 1, :pass 0, :fail 1, :error 0}
user=>
This looks awfully like your error. So it may be that the problem is in JLine, or in some other piece that Leiningen and Maven have in common and that is associated with JLine.
Or, of course, there may be two independent Unicode-related failures.
Here is my maven pom.xml file in case anyone is trying to debug this.
<project>
<modelVersion>4.0.0</modelVersion>
<groupId>com.aspden</groupId>
<artifactId>maven-clojure-simple</artifactId>
<version>1.0-SNAPSHOT</version>
<name>maven-clojure-simple</name>
<description>maven, clojure: simple project</description>
<repositories>
<repository>
<id>clojure</id>
<url>http://build.clojure.org/releases</url>
</repository>
<repository>
<id>central</id>
<url>http://repo1.maven.org/maven2</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>org.clojure</groupId>
<artifactId>clojure</artifactId>
<version>1.2.0</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>com.theoryinpractise</groupId>
<artifactId>clojure-maven-plugin</artifactId>
<version>1.3.5-SNAPSHOT</version>
</plugin>
</plugins>
</build>
</project>
I appreciate this is not an answer, but I thought it might be helpful.