An Efficient Compressor for XML

Abstract: XML is an increasingly popular standard for representing and storing documents and for transporting data over the Internet. The amount of data available in XML is growing rapidly, so efficient transport and storage techniques are necessary. One such technique is compression. Conventional compression methods, such as Lempel-Ziv or Huffman coding, achieve reasonable compression. However, they do not consider the specific syntax and semantics of XML and thus miss several opportunities for compression. In this talk, we describe a special-purpose compressor for XML, called XMLzip, that improves on general-purpose ones. The main component of XMLzip is a clustering technique that groups related data elements together before applying conventional data compression to each group. Depending on the type of XML data to be compressed, the user can choose among default clustering techniques or define their own clustering strategies using regular expressions. Furthermore, XMLzip allows the user to add domain-specific compressors for complex text structures, such as URLs or dates, which can further improve the compression ratio.
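To make the clustering idea concrete, here is a minimal Python sketch. It is not XMLzip's actual algorithm or interface: the rule format, the function names `cluster_text` and `compress_clusters`, and the sample document are all assumptions made for illustration. It groups text values by element path, lets hypothetical regular-expression rules merge paths into named clusters, and compresses each cluster separately with zlib as a stand-in conventional compressor. The sketch is not lossless, since it discards the document structure, attributes, and tail text that a real compressor would have to store as well.

```python
import re
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical user-defined strategy: (regex over the element path, cluster name).
DEFAULT_RULES = [
    (re.compile(r".*/(date|time)$"), "dates"),
    (re.compile(r".*/url$"), "urls"),
]

def cluster_text(xml_text, rules=DEFAULT_RULES):
    """Group element text by cluster; unmatched paths form their own cluster."""
    clusters = defaultdict(list)

    def walk(elem, path):
        path = f"{path}/{elem.tag}"
        text = (elem.text or "").strip()
        if text:
            # First matching rule wins; otherwise fall back to the path itself.
            name = next((n for rx, n in rules if rx.match(path)), path)
            clusters[name].append(text)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return dict(clusters)

def compress_clusters(clusters):
    """Compress each cluster on its own with zlib (Lempel-Ziv plus Huffman coding)."""
    return {name: zlib.compress("\n".join(values).encode("utf-8"))
            for name, values in clusters.items()}

doc = """<log>
  <entry><date>2001-03-01</date><url>http://example.org/a</url><msg>ok</msg></entry>
  <entry><date>2001-03-02</date><url>http://example.org/b</url><msg>fail</msg></entry>
</log>"""

clusters = cluster_text(doc)
packed = compress_clusters(clusters)
for name, blob in packed.items():
    print(name, len(clusters[name]), "values,", len(blob), "bytes compressed")
```

Placing similar values next to each other (all dates together, all URLs together) tends to give the back-end compressor longer repeated substrings to exploit; on a document as small as this sample, the per-cluster overhead will likely outweigh any gain. A domain-specific compressor for dates or URLs could, under the same assumptions, replace `zlib.compress` for the matching cluster.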