
Backslash read and write and F# interactive console

Edit: What's the difference between reading a backslash from a file and writing it to the interactive window, versus writing the string directly to the interactive window?

For example

let toto = "Adelaide Gu\u00e9nard" 

toto;;

the interactive window prints "Adelaide Guénard".

Now if I save a txt file with the single line Adelaide Gu\u00e9nard and read it in:

System.IO.File.ReadAllLines(@"test.txt")

The interactive window prints [|"Adelaide Gu\u00e9nard"|]

What is the difference between these two statements in terms of what the interactive window prints?


As far as I know, there is no library that decodes F#/C# string escapes for you, so you'll have to implement that functionality yourself. There was a similar question on how to do that in C#, with a solution using regular expressions.

You can rewrite that to F# like this:

open System
open System.Globalization
open System.Text.RegularExpressions

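// Matches \u or \U followed by four hex digits
let regex = Regex(@"\\[uU]([0-9A-F]{4})", RegexOptions.IgnoreCase)
// Double backslash: the string holds a literal \u00e9, as if read from the file
let line = "Adelaide Gu\\u00e9nard"
// Replace each escape with the character it encodes
let decoded = regex.Replace(line, fun (m:Match) -> 
  (char (Int32.Parse(m.Groups.[1].Value, NumberStyles.HexNumber))).ToString())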

(If you write "some\\u00e9etc", you create a string containing the same thing you'd read from the text file. With a single backslash, the F# compiler interprets the escape sequence itself.)
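
To apply the same idea to the lines read from the file, the regex replacement can be wrapped in a small function and mapped over the array. This is only a sketch: decodeUnicodeEscapes is a made-up helper name, the file is written to the temp directory just to make the example self-contained, and only four-digit \uXXXX escapes are handled.

```fsharp
open System
open System.Globalization
open System.IO
open System.Text.RegularExpressions

// Same pattern as above: \u or \U followed by exactly four hex digits.
let escapePattern = Regex(@"\\[uU]([0-9A-F]{4})", RegexOptions.IgnoreCase)

// Hypothetical helper (not a library function): replaces each escape
// sequence with the character it encodes.
let decodeUnicodeEscapes (text: string) : string =
    escapePattern.Replace(text, MatchEvaluator(fun m ->
        let codeUnit = Int32.Parse(m.Groups.[1].Value, NumberStyles.HexNumber)
        string (char codeUnit)))

// Recreate the file from the question. The verbatim string (@"...") keeps
// the backslash literal, so the file really contains \u00e9 as six characters.
let path = Path.Combine(Path.GetTempPath(), "test.txt")
File.WriteAllLines(path, [| @"Adelaide Gu\u00e9nard" |])

// Read the lines back and decode each one.
let decodedLines = File.ReadAllLines(path) |> Array.map decodeUnicodeEscapes
printfn "%s" decodedLines.[0]
```

The decoded line should now be equal to the string literal "Adelaide Gu\u00e9nard" as the compiler sees it, so comparing the two with = is a quick way to check the result.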


It uses the StructuredFormat code from the F# PowerPack. For your string, it's effectively doing printfn "%A" toto;;.

You can achieve the same behaviour in a text file as follows:

open System.IO;;
File.WriteAllText("toto.txt", toto);;

The default encoding used by File.WriteAllText is UTF-8. You should be able to open toto.txt in Notepad or Visual Studio and see the é correctly.

Edit: If I wanted to write the content of test.txt to another file, in the clean form that F# interactive prints, how would I proceed?

It looks like fsi is being too clever when printing the contents of test.txt. It's formatting it as a valid F# expression, complete with quotes, [| |] brackets, and a Unicode character escape. The string returned by File.ReadAllLines doesn't contain any of these things; it just contains the words Adelaide Guénard.

You should be able to take the array returned by File.ReadAllLines and pass it to File.WriteAllLines, without the contents being mangled.
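
A minimal sketch of that round trip, assuming the input file actually contains the é character. The file names and the temp directory are placeholders chosen for illustration; the sample file is created first only so the snippet runs on its own.

```fsharp
open System.IO

// Placeholder paths for illustration only.
let source = Path.Combine(Path.GetTempPath(), "test.txt")
let target = Path.Combine(Path.GetTempPath(), "copy.txt")

// Create a sample input file (written as UTF-8 by default).
File.WriteAllLines(source, [| "Adelaide Guénard" |])

// Read the lines and write them straight back out, unchanged.
let lines = File.ReadAllLines(source)
File.WriteAllLines(target, lines)

// printfn "%s" writes the raw string, without the quotes and [| |]
// brackets that fsi adds when it echoes a value.
lines |> Array.iter (printfn "%s")
```

The quotes and brackets belong to fsi's display of the value, not to the strings themselves, so nothing extra ends up in copy.txt.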
