What is the best library for XML parsing in Java?

lottogame 2020. 6. 8. 07:52

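I'm looking for a Java library for parsing XML (complex configuration and data files). I googled a bit but couldn't find anything other than dom4j (it seems they are working on V2). I took a look at Commons Configuration but didn't like it, and the other Apache projects for XML seem to be in hibernation. I haven't evaluated dom4j myself, but I wanted to know: does Java have other (good) open source XML parsing libraries? And how has your experience with dom4j been?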

After @Voo's answer, let me ask another question: should I use Java's built-in classes or a third-party library like dom4j? What are the advantages?


Actually, Java supports four methods to parse XML out of the box:

DOM Parser/Builder: The whole XML structure is loaded into memory and you can use the well-known DOM methods to work with it. DOM also allows you to write to the document with XSLT transformations. Example:

public static void parse() throws ParserConfigurationException, IOException, SAXException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(true);
    factory.setIgnoringElementContentWhitespace(true);
    DocumentBuilder builder = factory.newDocumentBuilder();
    File file = new File("test.xml");
    Document doc = builder.parse(file);
    // Do something with the document here.
}
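
To make "work with it using the DOM methods" and "write with XSLT transformations" a bit more concrete, here is a minimal sketch. The class name DomSketch, the sample XML and the "version" attribute are made up for illustration; it parses from a string and skips DTD validation so that it runs without a test.xml on disk.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class DomSketch {

    // Parses an XML string into an in-memory DOM tree (no DTD validation here).
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Serializes the tree with an identity transform (a Transformer created without a stylesheet).
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<config><item name=\"a\">1</item><item name=\"b\">2</item></config>");

        // Walk the tree with the usual DOM methods.
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            System.out.println(item.getAttribute("name") + " = " + item.getTextContent());
        }

        // Modify the tree in memory, then write it back out.
        doc.getDocumentElement().setAttribute("version", "2");
        System.out.println(serialize(doc));
    }
}
```

Passing a real stylesheet to newTransformer(...) instead of using the identity transform is how you would apply an actual XSLT transformation while writing.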

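SAX Parser: Solely for reading an XML document. The SAX parser runs through the document and calls the user's callback methods. There are methods for the start/end of a document, of an element, and so on. They're defined in org.xml.sax.ContentHandler, and there's an empty helper class, DefaultHandler.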

public static void parse() throws ParserConfigurationException, SAXException {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setValidating(true);
    SAXParser saxParser = factory.newSAXParser();
    File file = new File("test.xml");
    saxParser.parse(file, new ElementHandler());    // specify handler
}
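
The ElementHandler used above isn't shown in the answer. A minimal sketch of what such a handler could look like follows; the behavior (collecting "element=text" pairs), the wrapper class SaxSketch and the sample XML are my assumptions, not the original author's code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical wrapper class for the sketch.
class SaxSketch {

    // Hypothetical handler: records "elementName=text" for every element with direct text.
    static class ElementHandler extends DefaultHandler {
        final List<String> seen = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            text.setLength(0); // a new element starts: discard text collected so far
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called several times per text node
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            if (!value.isEmpty()) {
                seen.add(qName + "=" + value);
            }
            text.setLength(0);
        }
    }

    // Parses from a string instead of test.xml so the sketch is self-contained.
    static List<String> parse(String xml) throws Exception {
        SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
        ElementHandler handler = new ElementHandler();
        saxParser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        // prints [name=Ann, name=Bob]
        System.out.println(parse("<people><name>Ann</name><name>Bob</name></people>"));
    }
}
```

Note that the handler has to keep its own state (here the StringBuilder) between callbacks; that bookkeeping is the price of not holding the whole document in memory.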

StAX Reader/Writer: This works with a data-stream-oriented interface. The program asks for the next element when it's ready, just like a cursor/iterator. You can also create documents with it. Read document:

public static void parse() throws XMLStreamException, IOException {
    try (FileInputStream fis = new FileInputStream("test.xml")) {
        XMLInputFactory xmlInFact = XMLInputFactory.newInstance();
        XMLStreamReader reader = xmlInFact.createXMLStreamReader(fis);
        while(reader.hasNext()) {
            reader.next(); // do something here
        }
    }
}
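
As a sketch of what "do something here" might look like in the read loop: the code below pulls events and collects the text of every name element. The element name, the class StaxReadSketch and the sample XML are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class for the sketch.
class StaxReadSketch {

    // Pulls events one at a time and collects the text of every <name> element.
    static List<String> readNames(String xml) throws XMLStreamException {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "name".equals(reader.getLocalName())) {
                    // getElementText() reads the text and moves the cursor to the matching end tag.
                    names.add(reader.getElementText());
                }
            }
        } finally {
            reader.close(); // frees the reader; it does not close the underlying source
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        // prints [Ann, Bob]
        System.out.println(readNames("<people><name>Ann</name><name>Bob</name></people>"));
    }
}
```

Unlike SAX, the loop is in your code, so you can stop reading as soon as you have what you need.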

Write document:

public static void write() throws XMLStreamException, IOException {
    try (FileOutputStream fos = new FileOutputStream("test.xml")) {
        XMLOutputFactory xmlOutFact = XMLOutputFactory.newInstance();
        XMLStreamWriter writer = xmlOutFact.createXMLStreamWriter(fos);
        writer.writeStartDocument();
        writer.writeStartElement("test");
        // write stuff
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.flush(); // push any buffered output to the stream
        writer.close(); // frees the writer; the stream is closed by try-with-resources
    }
}
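
For the "write stuff" part, here is a small sketch that writes one element with an attribute and text content into a string. The class StaxWriteSketch, the attribute name and the values are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class for the sketch.
class StaxWriteSketch {

    // Writes <test name="...">value</test> into a string instead of a file.
    static String write(String name, String value) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartDocument();
        writer.writeStartElement("test");
        writer.writeAttribute("name", name); // attributes must follow the start tag directly
        writer.writeCharacters(value);       // special characters such as < and & are escaped
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.flush();
        writer.close(); // frees the writer; it does not close the StringWriter
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(write("a", "1 < 2"));
    }
}
```

The writer only emits what you tell it to, in order, so well-formedness of the nesting is largely your responsibility.")) throw new AssertionError(xml);
    javax.xml.parsers.DocumentBuilder b = javax.xml.parsers.DocumentBuilderFactory.newInstance().newDocumentBuilder();
    org.w3c.dom.Document doc = b.parse(new org.xml.sax.InputSource(new java.io.StringReader(xml)));
    if (!"a".equals(doc.getDocumentElement().getAttribute("name"))) throw new AssertionError(xml);
    if (!"1 < 2".equals(doc.getDocumentElement().getTextContent())) throw new AssertionError(xml);
} catch (Exception e) {
    throw new AssertionError(e);
}
</test>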

JAXB: The newest of the four ways to read XML documents; version 2 is part of Java 6 (it was dropped from the JDK in Java 11 and is now a separate dependency). It lets you deserialize Java objects from a document. You read the document with a class that implements the javax.xml.bind.Unmarshaller interface (you get one from a JAXBContext, itself created with JAXBContext.newInstance). The context has to be initialized with the classes used, but you only have to specify the root classes and don't have to worry about statically referenced classes. You use annotations to specify which classes should be elements (@XmlRootElement) and which fields are elements (@XmlElement) or attributes (@XmlAttribute, what a surprise!).

public static void parse() throws JAXBException, IOException {
    try (FileInputStream adrFile = new FileInputStream("test")) {
        JAXBContext ctx = JAXBContext.newInstance(RootElementClass.class);
        Unmarshaller um = ctx.createUnmarshaller();
        RootElementClass rootElement = (RootElementClass) um.unmarshal(adrFile);
    }
}

Write document:

public static void parse(RootElementClass out) throws IOException, JAXBException {
    try (FileOutputStream adrFile = new FileOutputStream("test.xml")) {
        JAXBContext ctx = JAXBContext.newInstance(RootElementClass.class);
        Marshaller ma = ctx.createMarshaller();
        ma.marshal(out, adrFile);
    }
}

Examples shamelessly copied from some old lecture slides ;-)

Edit: About "which API should I use?". Well, it depends: not all APIs have the same capabilities, as you can see. If you have control over the classes you use to map the XML document, JAXB is my personal favorite, a really elegant and simple solution (though I haven't used it for really large documents, where it could get a bit complex). SAX is pretty easy to use too. Just stay away from DOM if you don't have a really good reason to use it; it's an old, clunky API in my opinion. I don't think there are any modern third-party libraries that offer anything especially useful that's missing from the JDK, and the standard libraries have the usual advantages of being extremely well tested, documented and stable.


Java supports two methods for XML parsing out of the box.

SAXParser

You can use this parser if you want to parse large XML files and/or don't want to use a lot of memory.

http://download.oracle.com/javase/6/docs/api/javax/xml/parsers/SAXParserFactory.html

Example: http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/

DOMParser

You can use this parser if you need to do XPath queries or need to have the complete DOM available.

http://download.oracle.com/javase/6/docs/api/javax/xml/parsers/DocumentBuilderFactory.html

Example: http://www.mkyong.com/java/how-to-read-xml-file-in-java-dom-parser/
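
Since XPath queries are the reason given for picking DOM, here is a minimal sketch of running one against a parsed document with the JDK's javax.xml.xpath package. The class XPathSketch, the sample XML and the expressions are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for the sketch.
class XPathSketch {

    // Loads the complete document into memory; XPath needs the whole tree.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates the expression and returns the string value of the first match.
    static String firstValue(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    // Evaluates the expression and returns how many nodes it selects.
    static int count(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<config><item name=\"a\">1</item><item name=\"b\">2</item></config>");
        System.out.println(firstValue(doc, "/config/item[@name='b']")); // prints 2
        System.out.println(count(doc, "//item"));                       // prints 2
    }
}
```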


Nikita's point is an excellent one: don't confuse mature with bad. XML hasn't changed much.

JDOM would be another alternative to DOM4J.


You don't need an external library for parsing XML in Java. Java has come with built-in implementations for SAX and DOM for ages.


If you want a DOM-like API - that is, one where the XML parser turns the document into a tree of Element and Attribute nodes - then there are at least four to choose from: DOM itself, JDOM, DOM4J, and XOM. The only possible reason to use DOM is because it's perceived as a standard and is supplied in the JDK: in all other respects, the others are all superior. My own preference, for its combination of simplicity, power, and performance, is XOM.

And of course, there are other styles of processing: low-level parser interfaces (SAX and StAX), data-object binding interfaces (JAXB), and high-level declarative languages (XSLT, XQuery, XPath). Which is best for you depends on your project requirements and your personal taste.


For folks interested in using JDOM but afraid that it hasn't been updated in a while (in particular, that it doesn't leverage Java generics), there is a fork called CoffeeDOM that addresses exactly these aspects and modernizes the JDOM API. Read more here:

http://cdmckay.org/blog/2011/05/20/introducing-coffeedom-a-jdom-fork-for-java-5/

and download it from the project page at:

https://github.com/cdmckay/coffeedom


VTD-XML is the heavy-duty XML parsing library. It is better than the others in virtually every way. Here is a 2013 paper that analyzes the XML processing frameworks available on the Java platform:

http://sdiwc.us/digitlib/journal_paper.php?paper=00000582.pdf

Reference URL: https://stackoverflow.com/questions/5059224/which-is-the-best-library-for-xml-parsing-in-java
