Murali Mani
Worcester Polytechnic Institute

Title: Efficient XML Stream Processing: The Raindrop Approach

Abstract: We will examine key techniques used by Raindrop for processing XQuery expressions over XML streams. An XQuery expression can be viewed as consisting of three parts: (a) pattern retrieval via XPath expressions, (b) filtering via predicates, and (c) output restructuring. There are at least two options for performing pattern retrieval: (a) retrieve the patterns directly from the streaming XML tokens using an automaton, or (b) collect several tokens into an appropriate object, such as a DOM object, and navigate over that object. Which approach to use for a given pattern depends on the characteristics of the query and the XML stream. In Raindrop, we model automaton-based pattern retrieval and DOM-based navigation uniformly within an algebraic paradigm, and use cost-based approaches to choose an efficient plan.

Once a query plan has been chosen, it must be executed efficiently. In stream processing, this typically means keeping memory requirements to a minimum. Raindrop achieves this by outputting and purging data at the earliest possible moment, so that no data is stored longer than required. We also study how schema knowledge can be used to reduce memory requirements further.

Acknowledgements: Raindrop is partially sponsored by NSF grant IIS 0414567. Several people have been involved in this work: Elke Rundensteiner, Hong Su, Jinhui Jian, Ming Li, Mingzhu Wei, Shoushen Wang, Drew Ditto, Bogomil Tselkov, and others.
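
Illustration: the idea of retrieving a pattern directly from streaming tokens and purging buffered data as early as possible can be sketched roughly as follows. This is only a minimal sketch and not Raindrop's implementation: the token format, the function name match_path, and the sample data are all hypothetical, and a simple stack of open element names stands in for a real automaton.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical token format: ("start", name), ("text", value), ("end", name).
Token = Tuple[str, str]


def match_path(tokens: Iterable[Token], path: List[str]) -> Iterator[str]:
    """Yield the direct text of each element whose root-to-element path equals `path`.

    The stack of open element names plays the role of the automaton state.
    Text is buffered only while the current element matches, and the buffer
    is emitted and purged as soon as that element's end token arrives.
    """
    stack: List[str] = []   # names of the currently open elements
    buffer: List[str] = []  # text collected for the element being matched
    for kind, value in tokens:
        if kind == "start":
            stack.append(value)
        elif kind == "text":
            if stack == path:
                buffer.append(value)
        elif kind == "end":
            if stack == path:
                yield "".join(buffer)  # output at the earliest possible moment
                buffer.clear()         # purge: nothing is kept longer than needed
            stack.pop()


# Hypothetical stream for:
# <lib><book><title>A</title></book><book><title>B</title></book></lib>
SAMPLE_TOKENS: List[Token] = [
    ("start", "lib"),
    ("start", "book"), ("start", "title"), ("text", "A"),
    ("end", "title"), ("end", "book"),
    ("start", "book"), ("start", "title"), ("text", "B"),
    ("end", "title"), ("end", "book"),
    ("end", "lib"),
]
```

Calling list(match_path(SAMPLE_TOKENS, ["lib", "book", "title"])) walks the stream once and holds at most one title's text in memory at a time. The DOM-based alternative would instead materialize each book subtree before navigating it. That costs more memory but makes complex navigation simpler, and this is the kind of trade-off a cost-based plan choice would weigh.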