Query by Annotation

Steven Bird, CSSE, University of Melbourne
Haejoong Lee, LDC, University of Pennsylvania

Abstract: Databases of hierarchically annotated text occupy a central place in linguistic research and language technology development. I will describe a new approach to tree query called "Query by Annotation". Users express a query by annotating a tree; the annotation is compiled into an expression in a path language, which is in turn compiled into SQL for execution on a relational database. Result trees are annotated with the original query to show why the query matches the tree. Since both queries and results are annotated trees, users can easily refine and resubmit their queries. The approach is motivated and exemplified using a popular kind of linguistic database known as a treebank. An open source implementation, distributed with the Natural Language Toolkit, will be demonstrated. This research forms part of a recently concluded NSF project, "Querying Linguistic Databases", conducted in collaboration with Susan Davidson and Mark Liberman.
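
To make the path-expression-to-SQL step concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration, not the path language, schema, or code of the implementation described above: the toy path syntax (only `/` for child and `//` for descendant), the `node` table with its interval columns `lft`, `rgt`, and `depth`, and the helper names `load`, `compile_path`, and `query` are all hypothetical. Leaf words are omitted from the table to keep the example short.

```python
import re
import sqlite3

# Toy treebank tree as nested tuples: (label, child, ...); leaves are plain strings.
TREE = ("S",
        ("NP", ("DT", "the"), ("NN", "dog")),
        ("VP", ("VBD", "saw"),
               ("NP", ("DT", "a"), ("NN", "cat"))))


def load(conn, tree):
    """Store every labelled node with an interval (lft, rgt) and its depth.

    Assumed schema: a node's descendants are exactly the nodes whose
    interval lies strictly inside its own interval.
    """
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, label TEXT, "
        "lft INTEGER, rgt INTEGER, depth INTEGER)"
    )
    counter = 0

    def walk(t, depth):
        nonlocal counter
        lft = counter
        counter += 1
        for child in t[1:]:
            if isinstance(child, tuple):  # skip leaf words
                walk(child, depth + 1)
        rgt = counter
        counter += 1
        conn.execute(
            "INSERT INTO node (label, lft, rgt, depth) VALUES (?, ?, ?, ?)",
            (t[0], lft, rgt, depth),
        )

    walk(tree, 0)


def compile_path(path):
    """Compile a hypothetical path such as '//VP/NP' into (sql, params)."""
    if not re.fullmatch(r"(?:(?://|/)\w+)+", path):
        raise ValueError(f"unsupported path: {path!r}")
    steps = re.findall(r"(//|/)(\w+)", path)
    froms, wheres, params = [], [], []
    for i, (axis, label) in enumerate(steps):
        froms.append(f"node AS n{i}")
        wheres.append(f"n{i}.label = ?")
        params.append(label)
        if i == 0:
            if axis == "/":  # a leading '/' anchors the first step at the root
                wheres.append("n0.depth = 0")
        else:
            # Each step must lie inside the node matched by the previous step.
            wheres.append(f"n{i}.lft > n{i - 1}.lft AND n{i}.rgt < n{i - 1}.rgt")
            if axis == "/":  # child axis: exactly one level deeper
                wheres.append(f"n{i}.depth = n{i - 1}.depth + 1")
    last = len(steps) - 1
    sql = (
        f"SELECT DISTINCT n{last}.label, n{last}.lft, n{last}.rgt "
        f"FROM {', '.join(froms)} WHERE {' AND '.join(wheres)} "
        f"ORDER BY n{last}.lft"
    )
    return sql, params


def query(conn, path):
    """Run a path query and return the matching nodes as (label, lft, rgt)."""
    sql, params = compile_path(path)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, TREE)
print(compile_path("//VP/NP")[0])  # the generated SQL
print(query(conn, "//VP/NP"))      # the object NP inside the VP
```

Each step of the path becomes one self-join on the `node` table, so a longer path yields a longer join chain. A real system would also need to return whole result trees and mark the matched nodes, which this sketch does not attempt.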