Scalable DB+IR technology: processing Probabilistic Datalog with HySpirit
Abstract
Probabilistic Datalog (PDatalog, proposed in 1995) is a probabilistic variant of Datalog and a nice conceptual idea to model Information Retrieval in a logical, rule-based programming paradigm. Making PDatalog work in real-world applications requires more than probabilistic facts and rules, and the semantics associated with the evaluation of the programs. We report in this paper some of the key features of the HySpirit system required to scale the execution of PDatalog programs. Firstly, there is the requirement to express probability estimation in PDatalog. Secondly, fuzzy-like predicates are required to model vague predicates (e.g. vague match of attributes such as age or price). Thirdly, to handle large data sets there are scalability issues to be addressed, and therefore, HySpirit provides probabilistic relational indexes and parallel and distributed processing. The main contribution of this paper is a consolidated view on the methods of the HySpirit system to make PDatalog applicable in real-scale applications that involve a wide range of requirements typical for data (information) management and analysis.
Citation
Frommholtz I, Roelleke T (2016) 'Scalable DB+IR technology: processing Probabilistic Datalog with HySpirit', Datenbank-Spektrum, 16 (1), pp.39-48.
Publisher
Springer Verlag
Journal
Datenbank-Spektrum
PubMed ID
29368760
PubMed Central ID
PMC5750817
Additional Links
https://link.springer.com/article/10.1007/s13222-015-0208-z
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5750817/
Type
Article
Language
en
ISSN
1618-2162
EISSN
1618-2162
DOI
10.1007/s13222-015-0208-z
The following license files are associated with this item:
- Creative Commons
Except where otherwise noted, this item's license is described as Green - can archive pre-print and post-print or publisher's version/PDF
Related articles
- Size reduction by interpolation in fuzzy rule bases.
- Authors: Koczy LT, Hirota K
- Issue date: 1997
- A Novel Group Decision-Making Method Based on Sensor Data and Fuzzy Information.
- Authors: Bai YT, Zhang BH, Wang XY, Jin XB, Xu JP, Su TL, Wang ZY
- Issue date: 2016 Oct 28
- A distributed query execution engine of big attributed graphs.
- Authors: Batarfi O, Elshawi R, Fayoumi A, Barnawi A, Sakr S
- Issue date: 2016
- Rule-based programming paradigm: a formal basis for biological, chemical and physical computation.
- Authors: Krishnamurthy V, Krishnamurthy EV
- Issue date: 1999 Mar
- Performance of a Computational Model of the Mammalian Olfactory System.
- Authors: Persaud KC, Marco S, Gutiérrez-Gálvez A, Benjaminsson S, Herman P, Lansner A
- Issue date: 2013