Robin Burke

FAQ Finder

Overview
 

FAQ Finder is a system that retrieves answers to natural language questions from USENET FAQ files. The system integrates symbolic knowledge and statistical data when matching questions. Part of the challenge was to pre-compile as much of the system's knowledge as possible so that answers could be found fast enough to satisfy the constraints of Web use.
One issue raised by this research is the need to have the system correctly identify that a question cannot be answered. This problem is not one that can be addressed under the standard definition of the information retrieval problem. Even the TREC question-answering track doesn't evaluate systems on this property.
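As a rough illustration of these two ideas (combining symbolic and statistical evidence, and declining to answer), here is a minimal Python sketch. It is not FAQ Finder's actual matching method: the `SYNONYMS` table is a hypothetical stand-in for symbolic knowledge, the cosine score over word counts is one simple statistical measure, and the `threshold` value of 0.5 is an arbitrary assumption.

```python
import math
from collections import Counter

# Hypothetical synonym table standing in for symbolic knowledge.
SYNONYMS = {
    "car": "automobile",
    "auto": "automobile",
    "fix": "repair",
    "mend": "repair",
}


def tokens(text):
    """Lowercase, strip punctuation, and map words to canonical concepts."""
    words = [w.strip(".,?!").lower() for w in text.split()]
    return [SYNONYMS.get(w, w) for w in words if w]


def cosine(a, b):
    """Cosine similarity between two token lists, using raw word counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def best_match(question, faq_questions, threshold=0.5):
    """Return the best-matching FAQ question, or None if no candidate
    scores at least `threshold` (the "cannot be answered" case)."""
    q = tokens(question)
    scored = [(cosine(q, tokens(f)), f) for f in faq_questions]
    score, match = max(scored, default=(0.0, None))
    return match if score >= threshold else None


faq = ["How do I repair my automobile?", "What is the best cat food?"]
print(best_match("How can I fix my car?", faq))
print(best_match("Who won the 1996 election?", faq))
```

The first query matches the repair question only because the synonym table maps "fix" and "car" to the same concepts as the FAQ wording. The second query shares a single common word with the FAQ questions, so it falls below the threshold and the function returns `None` instead of a poor match.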

The system
 

The current version of the system has been running since the summer of 1996, but the FAQs from which FAQ Finder retrieves haven't been updated since then. Its answers on some subjects have dated considerably. FAQ Finder is implemented in Allegro Common Lisp on top of the CL-HTTP web server.

Data
 

Approximately 170,000 questions asked of the FAQ Finder system over 4 years are available for research purposes. (Gzipped txt, 335K)

Selected Publications
  Burke, R., Hammond, K., Kulyukin, V., Lytinen, S., Tomuro, N. & Schoenberg, S. Question Answering from Frequently-Asked Question Files: Experiences with the FAQ Finder System. AI Magazine, 18(2), pages 57-66, 1997. (Postscript 1.5M)
  Burke, R., Hammond, K. & Cooper, E. Knowledge-based Information Retrieval from Semi-structured Text. In AAAI Workshop on Internet-based Information Systems, pages 9-15. AAAI, 1996. (Postscript, 3.4M)

Related Work
 

Many people are now working on question answering, and TREC is probably the best place to start. Steve Lytinen of DePaul University (one of the original project members) is still doing some research with the FAQ Finder system itself.