How to generate a large (30MB+) XML file in Java?
The file itself is not that big and should fit in memory. But once you combine that with other overhead factors, it starts to become a problem. We are building a DOM in memory, and that is not scaling for us. Using raw output streams seems problematic in the sense that we have to be careful about escaping characters.
What are some good approaches for doing this?
Are there any good libraries for this?
StAX provides a convenient API for writing XML to an output stream. Easy tutorial here.
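As a rough illustration of the escaping concern from the question, here is a minimal sketch using the standard `javax.xml.stream` API that ships with the JDK. The class name `StaxEscapingDemo`, the helper `render`, and the `note`/`title` names are made up for this example. The writer is expected to escape markup characters in text and attribute values itself, so the caller passes raw strings.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class StaxEscapingDemo {

    // Hypothetical helper: writes a single <note> element whose attribute and
    // text both carry the given raw string, and returns the resulting XML.
    static String render(String text) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter xml = XMLOutputFactory.newFactory().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("note");
        xml.writeAttribute("title", text); // escaped by the writer
        xml.writeCharacters(text);         // escaped by the writer
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.flush();
        xml.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // '<' and '&' should appear as &lt; and &amp; in the printed XML.
        System.out.println(render("a < b & c"));
    }
}
```

Because nothing is held beyond the writer's own buffer, the same calls work unchanged when the target is a file or socket stream instead of a `StringWriter`.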
Try XStream
Features
- Ease of use. A high level facade is supplied that simplifies common use cases.
- No mappings required. Most objects can be serialized without need for specifying mappings.
- Performance. Speed and low memory footprint are a crucial part of the design, making it suitable for large object graphs or systems with high message throughput.
- Clean XML. No information is duplicated that can be obtained via reflection. This results in XML that is easier to read for humans and more compact than native Java serialization.
- Requires no modifications to objects. Serializes internal fields, including private and final. Supports non-public and inner classes. Classes are not required to have default constructor.
- Full object graph support. Duplicate references encountered in the object-model will be maintained. Supports circular references.
- Integrates with other XML APIs. By implementing an interface, XStream can serialize directly to/from any tree structure (not just XML).
- Customizable conversion strategies. Strategies can be registered allowing customization of how particular types are represented as XML.
- Error messages. When an exception occurs due to malformed XML, detailed diagnostics are provided to help isolate and fix the problem.
- Alternative output format. The modular design allows other output formats. XStream ships currently with JSON support and morphing.
With Saxon, you can use the StAX XMLStreamWriter API in conjunction with a Serializer that gives you full control of the serialization properties as defined in xsl:output, for example the ability to control indentation, use of CDATA sections, etc. See the s9api Serializer class.
It depends on how your data is structured, but a StAX implementation might be what you are looking for - like Woodstox for example.
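If the data can be produced one record at a time, a sketch along the following lines keeps memory use roughly constant regardless of file size. It uses only the standard `javax.xml.stream` API, so it should run on the JDK's bundled implementation; with another StAX implementation on the classpath, the factory lookup would normally pick that one up instead (an assumption about your setup, not something verified here). The class `LargeXmlWriter`, the `records`/`record` element names, and the generated values are hypothetical stand-ins for real data.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class LargeXmlWriter {

    // Writes `count` <record> elements to `target`, one at a time, without
    // building a document tree in memory.
    static void write(Path target, int count) throws IOException, XMLStreamException {
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            XMLStreamWriter xml =
                    XMLOutputFactory.newFactory().createXMLStreamWriter(out, "UTF-8");
            xml.writeStartDocument("UTF-8", "1.0");
            xml.writeStartElement("records");
            for (int i = 0; i < count; i++) {
                xml.writeStartElement("record");
                xml.writeAttribute("id", Integer.toString(i));
                // Placeholder payload; real code would pull the next item
                // from an iterator, cursor, or result set here.
                xml.writeCharacters("value & more <" + i + ">");
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.flush();
            // Closing the XMLStreamWriter does not close the underlying stream;
            // the try-with-resources block takes care of that.
            xml.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("records", ".xml");
        write(target, 500_000);
        System.out.println(Files.size(target) + " bytes written to " + target);
    }
}
```

Raising the record count in `main` makes the file larger without changing the writer's memory footprint, which is the property a DOM-based approach lacks.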