开发者

Why does Google Wave Operational Transform need annotations?

The operational transform stuff used in Google Wave has a rather curious document format. A document is basically just an xml subset document - characters, start tags and end tags. In addition to that, the document has "annotations", which are meta-data associated with ranges, e.g. start position and end position. The white paper justifies their presence with:

Wave document operations also support annotations. An annotation is some meta-data associated with an item range, i.e., a start position and an end position. This is particularly useful for describing text formatting and spelling suggestions, as it does not unecessarily complicate the underlying structured document format.

I can certainly see how it would be somewhat difficult if an arbitrary range from a document would be selected and for example bolded - XML tag nesting is strict and that would cause a mess of open and close tag insertions.

However, is this really a problem in practise? I mean, does one necessarily have to support such operation, if not making an editor that basically mimics the years old word processing paradigm instead of being a structured editor? Would pure XML operational transform with the document structure as simply HTML5 be that terrible? Is it a performance issue that styles wo开发者_JS百科uld be in the document as tags? Or does the operational transform model somehow produce unsatisfactory results on text formatting if they are represented with tags?

Also, a side question - how good would the pure "insert character, remove character, retain" operational transform model be on plain text representations? For example, editing HTML5 as text - or editing Wikipedia articles?


There are fundamental problems with using a hierarchical markup language with OT. See below for a worked example:

Does operational transformation work on structured documents such as HTML if simply treated as plain text?


This choice makes sense to me as an optimization on several fronts:

  • The underlying document remains as human readable and parse-able as possible
  • The algorithms to parse the underlying XML remain as simple as possible (useful for compatibility with non-google attempts at parsing the resulting documents, and for maintenance)
  • The extra collected garbage, after multiple edits, can lead to large performance hits - due to the sheer number of tags and/or additional passes on the document to attempt to simplify it.
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜