Project Details
Description
The search paradigm for finding information of interest in massive text corpora is well established, as exemplified by the success of Web search engines. The next frontier is supporting sense-making over medium- to large-scale text corpora by domain experts and analysts who are trying to tap the tacit knowledge hidden in the text. In contrast to Web search, where the user's information need is satisfied by a few high-quality results, the domain analyst typically needs to find all relevant documents related to a topic of interest. Furthermore, the domain analyst's information need is often too complex to express in a small number of query terms. These requirements are very different from those of a Web search engine. This project aims to develop algorithms and systems that address these requirements, in the context of use cases of significant practical interest, including reports on factory worker incidents and aviation incidents; court decisions in the common law legal system; medical research literature for the generation of systematic reviews; and prior art patent search. Key challenges include vocabulary mismatch, where the user's queries use a vocabulary different from that of the relevant documents, and the design of interactive mechanisms and visualizations that support the interactive nature of the retrieval process and incorporate human feedback into it. The impact of the proposed research will be improved safety in a variety of domains where incidents are recorded and studied to obtain insights into how to prevent future incidents, such as factories, aviation, hospitals and old-age homes.
Status | Active |
---|---|
Effective start/end date | 1/1/17 → … |
Funding
- Natural Sciences and Engineering Research Council of Canada: US$101,215.00
ASJC Scopus Subject Areas
- Artificial Intelligence
- Information Systems
- Information Systems and Management
- Management Information Systems