Open Space: Java: Processing XML Files with DOM

Tuesday, 27 November 2012

Java: Processing XML Files with DOM

In Java there are several ways to process an XML file, depending on your needs and goals. Here we use DOM, short for Document Object Model, to read a file, modify it, add a child node, add an attribute to a node, and write the result back out.
The content of the XML file "mobile phone.xml" to be read:

<?xml version="1.0" encoding="UTF-8"?>
<mobile_phones xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:noNamespaceSchemaLocation="mobile phone.xsd">
 <mobile_phone category="smartphone">
  <name>Samsung Galaxy II</name>
  <manufacturer>Samsung</manufacturer>
  <os>Android</os>
  <description>smartphone of Samsung</description>
 </mobile_phone>
 <mobile_phone category="smartphone">
  <name>iPhone 4</name>
  <manufacturer>Apple</manufacturer>
  <os>iOS</os>
  <description>smartphone of Apple</description>
 </mobile_phone>
</mobile_phones>

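We create a project in Eclipse with the test class DomTest.java in the package xml.test and the input file "mobile phone.xml" in a resources folder at the project root. The relative paths in the code are resolved against the working directory, which Eclipse sets to the project root by default.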

Source code of DomTest.java:

package xml.test;

import java.io.File;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.junit.Assert;
import org.junit.Test;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DomTest {
 private static Runtime runtime = Runtime.getRuntime();
 @Test
 public void readWriteXmlWithDom() {
  try {
   long startTime = System.currentTimeMillis();
   System.out.println("Before reading file : " + (runtime.totalMemory() - runtime.freeMemory()) + " bytes" );
   DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
   DocumentBuilder db = dbf.newDocumentBuilder();
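   // pass a File rather than a String, which parse() would treat as a URI (the name contains a space)
   Document doc = db.parse(new File("resources/mobile phone.xml"));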
   
   Node phone = doc.getElementsByTagName("mobile_phone").item(0);
   // clone the element and add it to root
   Node addedPhone = phone.cloneNode(true);
   // create a new attribute
   Attr attr = doc.createAttribute("newAttr");
   attr.setValue("new value");
   addedPhone.getAttributes().setNamedItem(attr);
   // add element
   phone.getParentNode().appendChild(addedPhone);

   // read attribute value of a node
   NamedNodeMap attrs = phone.getAttributes();
   Assert.assertEquals("smartphone", attrs.getNamedItem("category").getNodeValue());

   NodeList nl = phone.getChildNodes();
   Node node;
   // change values of child nodes
   for (int i = 0; i < nl.getLength(); i++) {
     node = nl.item(i);
     // getChildNodes() also returns whitespace text nodes, so match on the element name
     if ("name".equals(node.getNodeName())) {
      node.setTextContent("HTC Desire");
     }
     if ("manufacturer".equals(node.getNodeName())) {
      node.setTextContent("HTC");
     }
     if ("description".equals(node.getNodeName())) {
      node.setTextContent("smartphone of HTC");
     }
   }   
   System.out.println("After generating DOM objects : " + (runtime.totalMemory() - runtime.freeMemory()) + " bytes" );
   
   // write modified xml to file
   TransformerFactory tf = TransformerFactory.newInstance();
   Transformer t = tf.newTransformer();
   DOMSource ds = new DOMSource(doc);
   StreamResult sr = new StreamResult(new File("resources/dom output mobile.xml"));
   t.transform(ds, sr);
   
   long endTime = System.currentTimeMillis();
   System.out.println("After writing to XML file : " + (runtime.totalMemory() - runtime.freeMemory()) + " bytes" );
   System.out.println("Execution time : " + (endTime - startTime) + " milliseconds");
  } catch (ParserConfigurationException e) {
   e.printStackTrace();
  } catch (SAXException e) {
   e.printStackTrace();
  } catch (IOException e) {
   e.printStackTrace();
  } catch (TransformerConfigurationException e) {
   e.printStackTrace();
  } catch (TransformerException e) {
   e.printStackTrace();
  }
 }
}

Code walkthrough:
- Besides the DOM code that processes the XML file, there is extra code that measures the elapsed time and the memory used along the way. The memory figures are only rough, because the garbage collector can run at any point.
- First, DocumentBuilder reads and parses the file "mobile phone.xml" into a Document structure.
- We look up the first mobile_phone node, phone, and make a deep copy of it, addedPhone.
- We add a new attribute, newAttr, together with its value, to the newly created node.
- We append the new node addedPhone to the Document, under the same parent as phone.
- Next, we read the node phone: we check that the value of its category attribute is indeed "smartphone", then change the content of its child nodes.
- Finally, we write the whole modified Document to the file "dom output mobile.xml" using a Transformer.
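
The steps above can also be tried without any files. The following is a minimal sketch, not the code from the post: it parses an XML string instead of "mobile phone.xml" and serializes to a string instead of "dom output mobile.xml". The class name DomStepsSketch and the helper cloneAndEdit are hypothetical names chosen for this illustration, and only the name child is rewritten to keep the example short.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomStepsSketch {

    // Hypothetical helper: same steps as DomTest, but string in, string out.
    static String cloneAndEdit(String xml) throws Exception {
        // 1. Parse the XML text into an in-memory Document.
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));

        // 2. Find the first mobile_phone element and make a deep copy of it.
        Element phone = (Element) doc.getElementsByTagName("mobile_phone").item(0);
        Element addedPhone = (Element) phone.cloneNode(true);

        // 3. Add a new attribute to the copy, then append the copy to the parent.
        addedPhone.setAttribute("newAttr", "new value");
        phone.getParentNode().appendChild(addedPhone);

        // 4. Change a child of the original element (the copy is unaffected).
        NodeList children = phone.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if ("name".equals(child.getNodeName())) {
                child.setTextContent("HTC Desire");
            }
        }

        // 5. Serialize the modified Document with an identity Transformer.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<mobile_phones><mobile_phone category=\"smartphone\">"
                + "<name>Samsung Galaxy II</name></mobile_phone></mobile_phones>";
        System.out.println(cloneAndEdit(xml));
    }
}
```

Because the copy is made before the original is edited, the output should contain both phones: the original renamed to HTC Desire, and the clone still carrying the old name plus the new attribute.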

Remarks:
- DOM lets us read, modify, or add content while working with the parsed document, and then save the changes at the end.
- DOM reads the entire XML file and keeps it in memory as a tree structure. This is why DOM uses a lot of memory, especially when reading large files.
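
To make the second remark concrete, here is a small sketch that walks the tree a parser has built and counts the nodes it holds. It is only an illustration under stated assumptions: the class DomTreeSizeSketch and the helpers countNodes and parse are hypothetical names, the input is an inline string rather than the post's file, and attributes are not counted because they are not children in the DOM tree.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomTreeSizeSketch {

    // Hypothetical helper: counts a node plus all of its descendants.
    // Attribute nodes are not children, so they are not included.
    static int countNodes(Node node) {
        int count = 1;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countNodes(child);
        }
        return count;
    }

    // Hypothetical helper: parses XML text into a Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<mobile_phones>\n"
                + " <mobile_phone category=\"smartphone\">\n"
                + "  <name>iPhone 4</name>\n"
                + " </mobile_phone>\n"
                + "</mobile_phones>";
        Document doc = parse(xml);
        // Every element and every run of text, including indentation, is an object in memory.
        System.out.println("Nodes held in memory: " + countNodes(doc));
    }
}
```

Note that the whitespace used for indentation becomes text nodes as well, so the node count grows with the size of the file, not only with the number of elements.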
