The goal of this project is to build a complete probabilistic data management
system, called PrDB, that can manage, store, and process large-scale
repositories of uncertain data. PrDB unifies ideas from "large-scale structured
graphical models" like probabilistic relational models (PRMs), developed in the
machine learning literature, and "probabilistic query processing", studied in
the database literature. The PrDB framework is based on the notion of "shared
factors", which not only lets us express and manipulate uncertainty at
various levels of abstraction, but also captures rich correlations
among the uncertain data. PrDB supports a declarative SQL-like language for
specifying uncertain data and the correlations among them. PrDB also supports
exact and approximate evaluation of a wide range of queries including inference
queries, SQL queries, and decision-support queries.
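
To make the idea of shared factors concrete, the following is a minimal sketch, not PrDB's actual interface or storage format. It assumes three hypothetical uncertain tuples t1, t2, and t3, each with a Boolean existence variable, and two hypothetical factor tables (PRIOR and AGREE) whose values are invented for illustration. Each table is defined once and reused across several variables, and an inference query is answered exactly by brute-force enumeration of all possible worlds.

```python
from itertools import product

# Hypothetical factor tables, each defined once and shared by several variables.
# PRIOR: unnormalized preference for a tuple being present (1) or absent (0).
PRIOR = {(0,): 0.4, (1,): 0.6}
# AGREE: rewards two tuples taking the same value (a positive correlation).
AGREE = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}

# Hypothetical existence variables, one per uncertain tuple.
VARIABLES = ["t1", "t2", "t3"]

# Each entry pairs a shared factor table with the variables it is applied to.
FACTORS = [
    (PRIOR, ("t1",)),
    (PRIOR, ("t2",)),
    (PRIOR, ("t3",)),
    (AGREE, ("t1", "t2")),
    (AGREE, ("t2", "t3")),
]


def weight(world):
    """Unnormalized weight of one possible world (a full assignment)."""
    w = 1.0
    for table, scope in FACTORS:
        w *= table[tuple(world[v] for v in scope)]
    return w


def probability(event):
    """Exact probability that `event` holds, by enumerating all worlds."""
    total = 0.0
    matching = 0.0
    for values in product((0, 1), repeat=len(VARIABLES)):
        world = dict(zip(VARIABLES, values))
        w = weight(world)
        total += w
        if event(world):
            matching += w
    return matching / total


if __name__ == "__main__":
    p_t1 = probability(lambda w: w["t1"] == 1)
    p_t2 = probability(lambda w: w["t2"] == 1)
    p_both = probability(lambda w: w["t1"] == 1 and w["t2"] == 1)
    print(f"P(t1) = {p_t1:.4f}")
    print(f"P(t2) = {p_t2:.4f}")
    print(f"P(t1 and t2) = {p_both:.4f}")
```

Because AGREE links t1 and t2, the joint probability that both tuples exist is higher than the product of their marginals, which is the kind of correlation an independent-tuple model cannot express. Enumeration is exponential in the number of variables, so this sketch only illustrates the semantics of exact evaluation on a tiny example.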