Demand for real world applications
Nowadays, to solve real-world problems in many areas such as cognitive science, biology, finance, physics, and the social sciences, scientists increasingly turn to data-driven solutions.
Challenge for domain experts
However, current technologies offer cumbersome solutions along multiple dimensions. These include: interaction with messy, naturally occurring data; the need for computer-science expertise and extensive programming; the need to exploit various learning paradigms and techniques; and extensive experimental exploration for model selection, feature selection, and parameter tuning, owing to the lack of theoretical evidence about the effectiveness of various models.
High-level goal and implied directions
The DeLBP workshop aims to highlight the issues and challenges that arise in establishing a declarative, data-driven problem-solving paradigm. Such a paradigm facilitates and simplifies the design and development of intelligent real-world applications that learn from data and reason based on knowledge. It also highlights the challenges in making machine learning accessible to a variety of users, including non-CS experts and application programmers.
Conventional programming languages were not designed to address the above-mentioned challenges. To achieve the DeLBP goals, we need to go beyond designing tools for classic machine learning and enrich existing solutions and frameworks with capabilities for:
- Specifying the requirements of the application at a high level of abstraction;
- Exploiting expert knowledge in learning;
- Dealing with uncertainty in data and knowledge at various layers of the application program;
- Using representations that support flexible relational feature engineering;
- Using representations that support flexible reasoning and structure learning;
- Reusing, combining, and chaining models, and performing flexible inference on complex models or pipelines of decision making;
- Integrating a range of learning and inference algorithms;
- Closing the loop of moving from data to knowledge and exploiting knowledge to generate data;
- and, finally, providing a unified programming environment in which to design application programs.
Related communities
Over the last few years the research community has tried to address these problems from multiple perspectives, most notably through approaches based on probabilistic programming (PP), logic programming (LP), constrained conditional models (CCM), and other integrated paradigms such as probabilistic logic programming (PLP) and statistical relational learning (SRL). These paradigms and their related languages aim at learning over probabilistic structures and exploiting knowledge in learning. We aim to motivate further research toward a unified framework in this area, building on the key paradigms mentioned above as well as other related research such as first-order query languages, database management systems (DBMS), deductive databases (DDB), and hybrid optimization for learning from data and knowledge. We are interested in connecting these ideas toward developing a Declarative Learning-Based Programming paradigm, and in investigating the types of languages, representations, and computational models required to support it.
Topics Summary
The main research questions and topics of interest include the following, in the context of an integrated learning-based paradigm:
- New abstractions and modularity levels toward a unified framework for learning and reasoning.
- Frameworks/Computational models to combine learning and reasoning paradigms and exploit accomplishments in AI from various perspectives.
- Flexible use of structured and relational data from heterogeneous resources in learning.
- Data modeling (relational/graph-based) issues in a new integrated framework for learning based on data and knowledge.
- Exploiting knowledge such as expert knowledge and common sense knowledge expressed via multiple formalisms, in learning.
- Closing the loop of acquiring knowledge from data and data from knowledge, toward lifelong learning and reasoning.
- Using declarative domain knowledge to guide the design of learning models, including feature extraction, model selection, dependency structure, and deep model architecture.
- Automation of hyper-parameter tuning.
- Design and representation of complex learning and inference models.
- The interface for learning-based programming, whether in the form of programming languages, declarations, frameworks, libraries, or graphical user interfaces.
- Storage and retrieval of trained learning models in a flexible way to facilitate incremental learning.
- Related applications in natural language processing, computer vision, bioinformatics, computational biology, etc.
Schedule
To come ...
- Guy Van den Broeck, University of California, Los Angeles.
Title: PSDDs for Tractable Learning in Structured and Unstructured Spaces
Abstract: In this talk, I will discuss two related settings for learning a probabilistic model from data. In the first setting, called tractable learning, we are given a standard dataset and the goal is to learn a distribution that allows for efficient inference. This setting received a lot of attention recently, and is often tackled by learning a circuit representation of the distribution. In the second setting, which has not been treated as systematically before, one has access to Boolean constraints that characterize examples known to be impossible (e.g., due to known domain physics). The task is then to learn a probabilistic model over this structured space that is guaranteed to assign a zero probability to each impossible example. I will describe a new class of Arithmetic Circuits, the PSDD, for addressing both classes of learning problems. The PSDD is based on advances from both machine learning and logical reasoning and can be learned under Boolean constraints. It achieves state-of-the-art results, both in tractable learning from standard datasets, and in structured probability spaces, such as rankings and combinatorial objects. I will additionally show how PSDDs can be learned from a new type of incomplete datasets, in which examples are partially specified using arbitrary Boolean expressions.
Bio: Guy Van den Broeck is an Assistant Professor and Samueli Fellow in the Computer Science Department at the University of California, Los Angeles (UCLA). Guy’s research interests are in artificial intelligence, machine learning, logical and probabilistic automated reasoning, and statistical relational learning. He also studies applications of reasoning in other fields, such as probabilistic databases and programming languages. Guy’s work received best paper awards from key artificial intelligence venues such as UAI, ILP, and KR, and an outstanding paper honorable mention at AAAI. His doctoral thesis was awarded the ECCAI Dissertation Award for the best European dissertation in AI. He directs the Statistical and Relational Artificial Intelligence (StarAI) Lab at UCLA.
- Nikolaos Vasiloglou, LogicBlox.
Title: Declarative Data Science
Abstract: Data science has become essential for the operations of every business. Unfortunately, data science suffers from the Excel syndrome when seen from the database perspective. Although the data is usually normalized in a relational database, when it is time to use it in a machine learning model, denormalization is required in order to feed it to the algorithms. This widely adopted practice has two problems: first, it can convert a small-data problem into a big-data problem, and second, it ignores the relational nature of the data, which might be important for the machine learning model. In this talk we will see how we can develop fast algorithms inside the database by leveraging the relational model. Based on some recent join algorithms, we show that even with the database overhead it is much faster to process the data inside the database than to export it and use high-performance computing. We also demonstrate how we can use Linear Programming as a framework for doing data science inside the database, and how this can supercharge the process of model development.
- Learning to Learn Programs from Examples: Going Beyond Program Structure. Kevin Ellis, Sumit Gulwani. [Download]
- Foundations of Declarative Data Analysis Using Limit Datalog Programs. Mark Kaminski, Bernardo Cuenca Grau, Egor Kostylev, Boris Motik, Ian Horrocks. [Download]
- Model Selection Scores for Multi-Relational Bayesian Networks. Sajjad Gholami, Oliver Schulte. [Download]
- Scaffolding the Generation of Machine Learning Models with SciRise. Aneesha Bakharia. [Download]
- Data Science with Linear Programming. Nantia Makrynioti, Nikolaos Vasiloglou, Emir Pasalic and Vasilis Vassalos. [Download]
Submissions
We encourage contributions in the form of either a technical paper (IJCAI style, 6 pages excluding references), a position statement (IJCAI style, 2 pages maximum), or an abstract of published work. IJCAI style files are available here. Please submit via EasyChair, here.
- Submission Deadline: May 20th, 2017 (extended from May 8th)
- Notification: June 9th, 2017 (extended from June 5th)
- Workshop Days: August 19th, 2017
- Tulane University, IHMC
- University of Illinois at Urbana-Champaign
- University of Texas at Dallas
- Technical University of Dortmund

- University of California, Los Angeles
- University of California, Irvine
- Charles River Analytics
- Vrije University of Brussels

- Amazon Cambridge, UK
- University of California, Santa Barbara
- Technical University of Dortmund
- University of Virginia
- University College London