Algorithm and systems engineering for high-performance visual text analytics on big data

  • Zeh, Norbert (PI)
  • Rau-Chaplin, Andrew A (CoPI)
  • Keselj, Vlado (CoPI)

Project: Research project

Project Details

Description

Big Data is a popular term for the exponential growth and availability of data and the technical opportunities and challenges it presents. In many diverse applications, the Big Data challenge is how to infer important trends and causal relationships from extremely large data collections. Examples include the analysis of aviation incident and accident reports to link component failures to the operational conditions that cause them, the analysis of textual feedback in product reviews with the goal of improving the products, and the analysis of trending topics in Facebook and Twitter posts. The common theme of these examples is that at least a large part of the information is stored in unstructured textual form. Recent research has been successful in developing machine learning techniques that support the analysis of collections of text documents, for example by grouping documents by topic or by extracting the most important keywords from documents for easy review by the user. Most of these techniques have a high computational cost, which limits their applicability to very large text collections. Substantial performance improvements are possible through the development, implementation, performance evaluation, and tuning of improved algorithms for the basic analysis tasks underlying these machine learning techniques, and through the use of parallel computing and cloud technologies. This will enable the application of these techniques to significantly larger text collections and is the focus of this research.

Status: Active
Effective start/end date: 1/1/17 → …

Funding

  • Natural Sciences and Engineering Research Council of Canada: US$101,214.00

ASJC Scopus Subject Areas

  • Artificial Intelligence
  • Control and Systems Engineering
  • Mathematics (miscellaneous)