# Introduction

cplint is a suite of programs for reasoning with LPADs/CP-logic programs [12], [15], [13], [14]. It contains modules for both inference and learning.

cplint is available in two versions, one for Yap Prolog and one for SWI-Prolog. They differ slightly in the features offered. This manual is about the SWI-Prolog version. You can find the manual for the Yap version at http://ds.ing.unife.it/~friguzzi/software/cplint/manual.html.

# Installation

cplint is distributed as a pack of SWI-Prolog. To install it, use

?- pack_install(cplint).

Moreover, in order to make sure you have a foreign library that matches your architecture, run

?- pack_rebuild(cplint). 

# Syntax

LPAD and CP-logic programs consist of a set of annotated disjunctive clauses. Disjunction in the head is represented with a semicolon and atoms in the head are separated from probabilities by a colon. For the rest, the usual syntax of Prolog is used. A general CP-logic clause has the form

h1:p1 ; ... ; hn:pn :- b1,...,bm,\+ c1,....,\+ cl

No parentheses are necessary. The pi are numeric expressions. It is up to the user to ensure that the numeric expressions are legal, i.e. that they sum up to less than one.

If the clause has an empty body, it can be represented like this

h1:p1 ; ... ; hn:pn.

If the clause has a single head with probability 1, the annotation can be omitted and the clause takes the form of a normal prolog clause, i.e.

h1 :- b1,...,bm,\+ c1,...,\+ cl.

stands for

h1:1 :- b1,...,bm,\+ c1,...,\+ cl.

The coin example of [15] is represented as (file coin.cpl)

heads(Coin):1/2 ; tails(Coin):1/2 :-
toss(Coin),\+biased(Coin).

toss(Coin),biased(Coin).

fair(Coin):0.9 ; biased(Coin):0.1.

toss(coin).

The first clause states that if we toss a coin that is not biased it has equal probability of landing heads and tails. The second states that if the coin is biased it has a slightly higher probability of landing heads. The third states that the coin is fair with probability 0.9 and biased with probability 0.1 and the last clause states that we toss a coin with certainty.

Moreover, the bodies of rules may contain the built-in predicates:

is/2, >/2, </2, >=/2 ,=</2,
=:=/2, =\=/2, true/0, false/0,
=/2, ==/2, \=/2 ,\==/2, !/0, length/2

The bodies may also contain the following library predicates:

member/2, max_list/2, min_list/2
nth0/3, nth/3, name/2, float/1,
integer/1, var/1, @>/2,
memberchk/2, select/3, dif/2,
between/3

plus the predicate

average/2

that, given a list of numbers, computes its arithmetic mean.

The body of rules may also contain the predicate prob/2 that computes the probability of an atom, thus allowing nested probability computations. For example (meta.pl)

a:0.2:-
prob(b,P),
P>0.2.

is a valid rule.

Moreover, the probabilistic annotations can be variables, as in (flexprob.pl))

red(Prob):Prob.

draw_red(R, G):-
Prob is R/(R + G),
red(Prob).

Variables in probabilistic annotations must be ground when resolution reaches the end of the body, otherwise an exception is raised.

# Semantics

The semantics of LPADs for the case of programs without functions symbols can be given as follows. An LPAD defines a probability distribution over normal logic programs called worlds. A world is obtained from an LPAD by first grounding it, by selecting a single head atom for each ground clause and by including in the world the clause with the selected head atom and the body. The probability of a world is the product of the probabilities associated to the heads selected. The probability of a ground atom (the query) is given by the sum of the probabilities of the worlds where the query is true.

If the LPAD contains function symbols, the definition is more complex, see [7,10,11].

# Inference

cplint answers queries using the module pita or mcintyre. The first performs the program transformation technique of [8]. Differently from that work, techniques alternative to tabling and answer subsumption are used. The latter performs approximate inference by sampling using a different program transformation technique and is described in [9].

For answering queries, you have to prepare a Prolog file where you first load the inference module (for example pita), initialize it with a directive (for example :- pita) and then enclose the LPAD clauses in :-begin_lpad. and :-end_lpad. For example, the coin program above can be stored in coin.pl for performing inference with pita as follows

:- use_module(library(pita)).
:- pita.
toss(Coin),\+biased(Coin).

toss(Coin),biased(Coin).

fair(Coin):0.9 ; biased(Coin):0.1.

toss(coin).
:- end_lpad.

The same program for mcintyre is

:- use_module(library(mcintyre)).
:- mc.
toss(Coin),\+biased(Coin).

toss(Coin),biased(Coin).

fair(Coin):0.9 ; biased(Coin):0.1.

toss(coin).
:- end_lpad.

You can have also (non-probabilistic) clauses outside :-begin/end_lpad. These are considered as database clauses. In pita subgoals in the body of probabilistic clauses can query them by enclosing the query in db/1. For example (testdb.pl)

:- use_module(library(pita)).
:- pita.
sampled_male(X):0.5:-
db(male(X)).
male(john).
male(david).

You can also use findall/3 on subgoals defined by database clauses (persons.pl)

:- use_module(library(pita)).
:- pita.
male:M/P; female:F/P:-
findall(Male,male(Male),LM),
findall(Female,female(Female),LF),
length(LM,M),
length(LF,F),
P is F+M.
male(john).
male(david).
female(anna).
female(elen).
female(cathy).

Aggregate predicates on probabilistic subgoals are not implemented due to their high computational cost (if the aggregation is over $$n$$ atoms, the values of the aggregation are potentially $$2^n$$). The Yap version of cplint includes reasoning algorithms that allows aggregate predicates on probabilistic subgoals, see http://ds.ing.unife.it/~friguzzi/software/cplint/manual.html.

In mcintyre you can query database clauses in the body of probabilistic clauses without any special syntax. You can also use findall/3.

Then you can simply load coin.pl as

?- [coin].

Note that supplying coin.pl as an argument to the swipl command currently returns errors due to bad interaction between pita and the top-level. The program is loaded correctly anyway but it is recommended to load it from the top-level to avoid these errors.

## Unconditional Queries

You can ask the unconditional probability of an atom using pita with the predicate

prob(:Query:atom,-Probability:float).

as in

?- prob(heads(coin),P).

If the query is non-ground, prob/2 returns in backtracking the succesful instantiations together with their probability.

When using mcintyre, the predicate for querying is

mc_prob(:Query:atom,-Probability:float).

as in

?- mc_prob(heads(coin),P).

With mcintyre, you can also take a given number of sample with

mc_sample(:Query:atom,+Samples:int,-Successes:int,-Failures:int,
-Probability:float).

as in (coinmc.pl)

?- mc_sample(heads(coin),1000,S,F,P).

that samples heads(coin) 1000 times and returns in S the number of successes, in F the number of failures and in P the estimated probability (S/1000).

Differently from exact inference, in approximate inference the query can be a conjunction of atoms.

If you are just interested in the probability, you can use

mc_sample(:Query:atom,+Samples:int,-Probability:float) 

as in (coinmc.pl)

?- mc_sample(heads(coin),1000,Prob).

that samples heads(coin) 1000 times and returns the estimated probability that a sample is true (i.e., that a sample succeeds).

Moreover, you can sample arguments of queries with

mc_sample_arg(:Query:atom,+Samples:int,?Arg:var,-Values:list).

The predicate samples Query a number of Samples times. Arg should be a variable in Query. The predicate returns in Values a list of couples L-N where L is the list of values of Arg for which Query succeeds in a world sampled at random and N is the number of samples returning that list of values. If L is the empty list, it means that for that sample the query failed. If L is a list with a single element, it means that for that sample the query is determinate. If, in all couples L-N, L is a list with a single element, it means that the clauses in the program are mutually exclusive, i.e., that in every sample, only one clause for each subgoal has the body true. This is one of the assumptions taken for programs of the PRISM system [11]. For example pfcglr.pl and plcg.pl satisfy this constraint while markov_chain.pl and var_obj.pl don’t.

An example of use of the above predicate is

?- mc_sample_arg(reach(s0,0,S),50,S,Values). 

of markov_chain.pl that takes 50 samples of L in findall(S,(reach(s0,0,S),L).

You can sample arguments of queries also with

mc_sample_arg_first(:Query:atom,+Samples:int,?Arg:var,-Values:list).

samples Query a number of Samples times and returns in Values a list of couples V-N where V is the value of Arg returned as the first answer by Query in a world sampled at random and N is the number of samples returning that value. V is failure if the query fails. mc_sample_arg_first/4 differs from mc_sample_arg/4 because the first just computes the first answer of the query for each sampled world.

Finally, you can compute expectations with

mc_expectation(:Query:atom,+N:int,?Arg:var,-Exp:float).

that computes the expected value of Arg in Query by sampling. It takes N samples of Query and sums up the value of Arg for each sample. The overall sum is divided by N to give Exp.

An example of use of the above predicate is

?- mc_expectation(eventually(elect,T),1000,T,E).

of pctl_slep.pl that returns in E the expected value of T by taking 1000 samples.

## Conditional Queries

You can ask the conditional probability of an atom given another atom using pita with the predicate

prob(:Query:atom,:Evidence:atom,-Probability:float).

as in

?- prob(heads(coin),biased(coin),P).

If the query/evidence are non-ground, prob/3 returns in backtracking ground instantiations together with their probability.

If the evidence is composed of more than one atom, add a clause of the form

evidence:- e1,...,en.

to the program, where e1,...,en are the evidence atoms, and use the query

?- prob(goal,evidence,P).

When using mcintyre, you can ask conditional queries with rejection sampling or with Metropolis-Hastings Markov Chain Monte Carlo. In rejection sampling, you first query the evidence and, if the query is successful, query the goal in the same sample, otherwise the sample is discarded. In Metropolis-Hastings MCMC, mcintyre follows the algorithm proposed in [6] (the non adaptive version): after a sample, a number of sampled probabilistic choices are deleted and the others are retained for the next sample. The sample is accepted with a probability of $$\min\{1,\frac{N_0}{N_1}\}$$ where $$N_0$$ is the number of choices sampled in the previous sample and $$N_1$$ is the number of choices sampled in the current sample.

You can take a given number of sample with rejection sampling using

mc_rejection_sample(:Query:atom,:Evidence:atom,+Samples:int,
-Successes:int,-Failures:int,-Probability:float).

as in (coinmc.pl)

?- mc_rejection_sample(heads(coin),biased(coin),1000,S,F,P).

that takes 1000 samples where biased(coin) is true and returns in S the number of samples where heads(coin) is true, in F the number of samples where heads(coin) is false and in P the estimated probability (S/1000).

Differently from exact inference, in approximate inference the evidence can be a conjunction of atoms.

You can take a given number of sample with Metropolis-Hastings MCMC using

mc_mh_sample(:Query:atom,:Evidence:atom,+Samples:int,+Lag:int,
-Successes:int,-Failures:int,-Probability:float).

where Lag is the number of sampled choices to forget before taking a new sample. For example (arithm.pl)

?- mc_mh_sample(eval(2,4),eval(1,3),10000,1,T,F,P).

takes 10000 accepted samples and returns in T the number of samples where eval(2,4) is true, in F the number of samples where eval(2,4) is false and in P the estimated probability (T/10000).

Moreover, you can sample arguments of queries with rejection sampling and Metropolis-Hastings MCMC using

mc_rejection_sample_arg(:Query:atom,:Evidence:atom,
+Samples:int,?Arg:var,-Values:list).
mc_mh_sample_arg(:Query:atom,:Evidence:atom,
+Samples:int,+Lag:int,?Arg:var,-Values:list).

that return the distribution of values for Arg in Query in Samples of Query given that Evidence is true. The predicate returns in Values a list of couples L-N where L is the list of values of Arg for which Query succeeds in a world sampled at random where Evidence is true and N is the number of samples returning that list of values.

An example of use of the above predicates is

?- mc_mh_sample_arg(eval(2,Y),eval(1,3),1000,1,Y,V).

Finally, you can compute expectations with

mc_expectation(:Query:atom,+N:int,?Arg:var,-Exp:float).

that computes the expected value of Arg in Query by sampling. It takes N samples of Query and sums up the value of Arg for each sample. The overall sum is divided by N to give Exp.

An example of use of the above predicate is

?- mc_expectation(eventually(elect,T),1000,T,E).

of pctl_slep.pl that returns in E the expected value of T by taking 1000 samples.

To compute conditional expectations, use

mc_mh_expectation(:Query:atom,:Evidence:atom,+N:int,
+Lag:int,?Arg:var,-Exp:float).

as in

?- mc_mh_expectation(eval(2,Y),eval(1,3),1000,1,Y,E).

of arithm.pl that computes the expectation of argument Y of eval(2,Y) given that eval(1,3) is true by taking 1000 samples using Metropolis-Hastings MCMC.

## Parameters

The inference modules have a number of parameters in order to control their behavior. They can be set with the directive

:- set_pita(<parameter>,<value>).

or

:- set_mc(<parameter>,<value>).

after initialization (:-pita. or :-mc.) but outside :-begin/end_lpad. The current value can be read with

?- setting_pita(<parameter>,Value).

or

?- setting_mc(<parameter>,Value).

from the top-level. The available parameters common to both pita and mcintyre are:

• epsilon_parsing: if (1 - the sum of the probabilities of all the head atoms) is larger than epsilon_parsing, then pita adds the null event to the head. Default value 0.00001.

• single_var: determines how non ground clauses are treated: if true, a single random variable is assigned to the whole non ground clause, if false, a different random variable is assigned to every grounding of the clause. Default value false.

Moreover, pita has the parameters

• depth_bound: if true, the depth of the derivation of the goal is limited to the value of the depth parameter. Default value false.

• depth: maximum depth of derivations when depth_bound is set to true. Default value 5.

If depth_bound is set to true, derivations are depth-bounded so you can query also programs containing infinite loops, for example programs where queries have an infinite number of explanations. However the probability that is returned is guaranteed only to be a lower bound, see for example markov_chaindb.pl

mcintyre has the parameters

• min_error: minimal width of the binomial proportion confidence interval for the probability of the query. When the confidence interval for the probability of the query is below min_error, the computation stops. Default value 0.01.

• k: the number of samples to take before checking whether the the binomial proportion confidence interval is below min_error. Default value 1000. max_samples: the maximum number of samples to take. This is used when the probability of the query is very close to 0 or 1. In fact mcintyre also checks for the validity of the the binomial proportion confidence interval: if less than 5 failures or successes are sampled, even if the width of the confidence interval is less than min_error, the computation continues. This would lead to non-termination in cases where the probability is 0 or 1. max_samples ensures termination. Default value 10e4.

The example markov_chain.pl shows that mcintyre can perform inference in presence of an infinite number of explanations for the goal. Differently from pita, no depth bound is necessary, as the probability of selecting the infinite computation branch is 0. However, also mcintyre may not terminate if loops not involving probabilistic predicates are present.

If you want to set the seed of the random number generator, you can use SWI-Prolog predicates setrand/1 and getrand/1, see SWI-Prolog manual.

# Learning

The following learning algorithms are available:

• EMBLEM (EM over Bdds for probabilistic Logic programs Efficient Mining): an implementation of EM for learning parameters that computes expectations directly on BDDs ???, [1], [2]

• SLIPCOVER (Structure LearnIng of Probabilistic logic programs by searChing OVER the clause space): an algorithm for learning the structure of programs by searching the clause space and the theory space separately [3]

## Input

To execute the learning algorithms, prepare a Prolog file divided in five parts

• preamble

• background knowledge, i.e., knowledge valid for all interpretations

• LPAD/CPL-program for you which you want to learn the parameters (optional)

• language bias information

• example interpretations

The preamble must come first, the order of the other parts can be changed.

For example, consider the Bongard problems of [5]. bongard.pl and bongardkeys.pl represent a Bongard problem. Let us consider bongard.pl.

### Preamble

In the preamble, the SLIPCOVER library is loaded with

:- use_module(library(slipcover)).

Now you can initialize SLIPCOVER with

:- sc.

At this point you can start setting parameters for SLIPCOVER such as for example

:- set_sc(megaex_bottom,20).
:- set_sc(max_iter,2).
:- set_sc(max_iter_structure,5).
:- set_sc(verbosity,1).

We will see later the list of available parameters. A particularly important parameter is verbosity: if set to 1, nothing is printed and learning is fastest, if set to 3 much information is printed and learning is slowest, 2 is in between. This ends the preamble.

Now you can specify the background knowledge with a fact of the form

bg(<list of terms representing clauses>).

where the clauses must currently be deterministic. Alternatively, you can specify a set of clauses by including them in a section between :- begin_bg. and :- end_bg. For example

:- begin_bg.
replaceable(gear).
replaceable(wheel).
replaceable(chain).
not_replaceable(engine).
not_replaceable(control_unit).
component(C):-
replaceable(C).
component(C):-
not_replaceable(C).
:- end_bg.

from the mach.pl example. If you specify both a bg/1 fact and a section, the clauses of the two will be combined.

Moreover, you can specify an initial program with a fact of the form

in(<list of terms representing clauses>).

The initial program is used in parameter learning for providing the structure. The indicated parameters do not matter as they are first randomized. Remember to enclose each clause in parentheses because :- has the highest precedence.

For example, bongard.pl has the initial program

in([(pos:0.197575 :-
circle(A),
in(B,A)),
(pos:0.000303421 :-
circle(A),
triangle(B)),
(pos:0.000448807 :-
triangle(A),
circle(B))]).

Alternatively, you can specify an input program in a section between :- begin_in. and :- end_in., as for example

:- begin_in.
pos:0.197575 :-
circle(A),
in(B,A).
pos:0.000303421 :-
circle(A),
triangle(B).
pos:0.000448807 :-
triangle(A),
circle(B).
:- end_in.

If you specify both a in/1 fact and a section, the clauses of the two will be combined.

### Language Bias

The language bias part contains the declarations of the input and output predicates. Output predicates are declared as

output(<predicate>/<arity>).

and indicate the predicate whose atoms you want to predict. Derivations for the atoms for this predicates in the input data are built by the system. These are the predicates for which new clauses are generated.

Input predicates are those whose atoms you are not interested in predicting. You can declare closed world input predicates with

input_cw(<predicate>/<arity>).

For these predicates, the only true atoms are those in the interpretations and those derivable from them using the background knowledge, the clauses in the input/hypothesized program are not used to derive atoms for these predicates. Moreover, clauses of the background knowledge that define closed world input predicates and that call an output predicate in the body will not be used for deriving examples.

Open world input predicates are declared with

input(<predicate>/<arity>).

In this case, if a subgoal for such a predicate is encountered when deriving a subgoal for the output predicates, both the facts in the interpretations, those derivable from them and the background knowledge, the background clauses and the clauses of the input program are used.

Then, you have to specify the language bias by means of mode declarations in the style of Progol.

modeh(<recall>,<predicate>(<arg1>,...)).

specifies the atoms that can appear in the head of clauses, while

modeb(<recall>,<predicate>(<arg1>,...)).

specifies the atoms that can appear in the body of clauses. <recall> can be an integer or *. <recall> indicates how many atoms for the predicate specification are retained in the bottom clause during a saturation step. * stands for all those that are found. Otherwise the indicated number is randomly chosen.

Two specialization modes are available: bottom and mode. In the first, a bottom clause is built and the literals to be added during refinement are taken from it. In the latter, no bottom clause is built and the literals to be added during refinement are generated directly from the mode declarations.

Arguments of the form

+<type>

specifies that the argument should be an input variable of type <type>, i.e., a variable replacing a +<type> argument in the head or a -<type> argument in a preceding literal in the current hypothesized clause.

Another argument form is

-<type>

for specifying that the argument should be a output variable of type <type>. Any variable can replace this argument, either input or output. The only constraint on output variables is that those in the head of the current hypothesized clause must appear as output variables in an atom of the body.

Other forms are

#<type>

for specifying an argument which should be replaced by a constant of type <type> in the bottom clause but should not be used for replacing input variables of the following literals when building the bottom clause or

-#<type>

for specifying an argument which should be replaced by a constant of type <type> in the bottom clause and that should be used for replacing input variables of the following literals when building the bottom clause.

<constant>

for specifying a constant.

Note that arguments of the form #<type> -#<type> are not available in specialization mode mode, if you want constants to appear in the literals you have to indicate them one by one in the mode declarations.

An example of language bias for the Bongard domain is

output(pos/0).

input_cw(triangle/1).
input_cw(square/1).
input_cw(circle/1).
input_cw(in/2).
input_cw(config/2).

modeh(*,pos).
modeb(*,triangle(-obj)).
modeb(*,square(-obj)).
modeb(*,circle(-obj)).
modeb(*,in(+obj,-obj)).
modeb(*,in(-obj,+obj)).
modeb(*,config(+obj,-#dir)).

SLIPCOVER also requires facts for the determination/2 predicate Aleph-style that indicate which predicates can appear in the body of clauses. For example

determination(pos/0,triangle/1).
determination(pos/0,square/1).
determination(pos/0,circle/1).
determination(pos/0,in/2).
determination(pos/0,config/2).

state that triangle/1 can appear in the body of clauses for pos/0.

SLIPCOVER also allows mode declarations of the form

modeh(<r>,[<s1>,...,<sn>],[<a1>,...,<an>],[<P1/Ar1>,...,<Pk/Ark>]). 

These mode declarations are used to generate clauses with more than two head atoms. In them, <s1>,...,<sn> are schemas, <a1>,...,<an> are atoms such that <ai> is obtained from $$\verb|<si>|$$ by replacing placemarkers with variables, <Pi/Ari> are the predicates admitted in the body. <a1>,...,<an> are used to indicate which variables should be shared by the atoms in the head. An example of such a mode declaration (from uwcselearn.pl) is

modeh(*,
[professor/1,student/1,hasposition/2,inphase/2,
publication/2,taughtby/3,ta/3,courselevel/2,yearsinprogram/2]).

If you want to specify negative literals for addition in the body of clauses, you should define a new predicate in the background as in

not_worn(C):-
component(C),
\+ worn(C).
one_worn:-
worn(_).
none_worn:-
\+ one_worn.

from mach.pl and add the new predicate in a modeb/2 fact

modeb(*,not_worn(-comp)).
modeb(*,none_worn).

Note that successful negative literals do not instantiate the variables, so if you want a variable appearing in a negative literal to be an output variable you must instantiate before calling the negative literals. The new predicates must also be declared as input

input_cw(not_worn/1).
input_cw(none_worn/0).

Lookahead can also be specified with facts of the form

lookahead(<literal>,<list of literals>).

In this case when a literal matching <literal> is added to the body of clause during refinement, then also the literals matching <list of literals> will be added. An example of such declaration (from muta.pl) is

lookahead(logp(_),[(_=_))]).

Note that <list of literals> is copied with copy_term/2 before matching, so variables in common between <literal> and <list of literals> may not be in common in the refined clause.

It is also possible to specify that a literal can only be added together with other literals with facts of the form

lookahead_cons(<literal>,<list of literals>).

In this case <literal> is added to the body of clause during refinement only together with literals matching <list of literals>. An example of such declaration is

lookahead_cons(logp(_),[(_=_))]).

Also here <list of literals> is copied with copy_term/2 before matching, so variables in common between <literal> and <list of literals> may not be in common in the refined clause.

Moreover, we can specify lookahead with

lookahead_cons_var(<literal>,<list of literals>).

In this case <literal> is added to the body of clause during refinement only together with literals matching <list of literals> and <list of literals> is not copied before matching, so variables in common between <literal> and <list of literals> are in common also in the refined clause. This is allowed only with specialization set to bottom. An example of such declaration is

lookahead_cons_var(logp(B),[(B=_))]).

### Example Interpretations

The last part of the file contains the data. You can specify data with two modalities: models and keys. In the models type, you specify an example model (or interpretation or megaexample) as a list of Prolog facts initiated by begin(model(<name>)). and terminated by end(model(<name>)). as in

begin(model(2)).
pos.
triangle(o5).
config(o5,up).
square(o4).
in(o4,o5).
circle(o3).
triangle(o2).
config(o2,up).
in(o2,o3).
triangle(o1).
config(o1,up).
end(model(2)).

The interpretations may contain a fact of the form

prob(0.3).

assigning a probability (0.3 in this case) to the interpretations. If this is omitted, the probability of each interpretation is considered equal to $$1/n$$ where $$n$$ is the total number of interpretations. prob/1 can be used to set a different multiplicity for the interpretations.

The facts in the interpretation are loaded in SWI-Prolog database by adding an extra initial argument equal to the name of the model. After each interpretation is loaded, a fact of the form int(<id>) is asserted, where id is the name of the interpretation. This can be used in order to retrieve the list of interpretations.

Alternatively, with the keys modality, you can directly write the facts and the first argument will be interpreted as a model identifier. The above interpretation in the keys modality is

pos(2).
triangle(2,o5).
config(2,o5,up).
square(2,o4).
in(2,o4,o5).
circle(2,o3).
triangle(2,o2).
config(2,o2,up).
in(2,o2,o3).
triangle(2,o1).
config(2,o1,up).

which is contained in the bongardkeys.pl This is also how model 2 above is stored in SWI-Prolog database. The two modalities, models and keys, can be mixed in the same file. Facts for int/1 are not asserted for interpretations in the key modality but can be added by the user explicitly.

Note that you can add background knowledge that is not probabilistic directly to the file writing the clauses taking into account the model argument. For example (carc.pl) contains

connected(_M,Ring1,Ring2):-
Ring1 \= Ring2,
member(A,Ring1),
member(A,Ring2), !.

symbond(Mod,A,B,T):- bond(Mod,A,B,T).
symbond(Mod,A,B,T):- bond(Mod,B,A,T).

where the first argument of all the atoms is the model.

Example registration.pl contains for example

party(M,P):-
participant(M,_, _, P, _).

that defines intensionally the target predicate party/1. Here M is the model and participant/4 is defined in the interpretations. You can also define intensionally the negative examples with

neg(party(M,yes)):- party(M,no).
neg(party(M,no)):- party(M,yes).

Then you must indicate how the examples are divided in folds with facts of the form: fold(<fold_name>,<list of model identifiers>), as for example

fold(train,[2,3,...]).
fold(test,[490,491,...]).

As the input file is a Prolog program, you can define intensionally the folds as in

fold(all,F):-
findall(I,int(I),F).

fold/2 is dynamic so you can also write (registration.pl)

:- fold(all,F),
sample(4,F,FTr,FTe),
assert(fold(rand_train,FTr)),
assert(fold(rand_test,FTe)).

which however must be inserted after the input interpretations otherwise the facts for int/1 will not be available and the fold all would be empty. This command uses sample(N,List,Sampled,Rest) exported from slipcover that samples N elements from List and returns the sampled elements in Sampled and the rest in Rest. If List has N elements or less, Sampled is equal to List and Rest is empty.

## Commands

### Parameter Learning

To execute EMBLEM, prepare an input file as indicated above, load it into SWI-Prolog and execute

?- induce_par(<list of folds>,P).

where <list of folds> is a list of the folds for training and P will contain the input program with updated parameters.

For example bongard.pl, you can load it into SWI-Prolog with

?- [bongard].

and perform parameter learning on the train fold with

?- induce_par([train],P).

A program can also be tested on a test set with

?- test(<program>,<list of folds>,LL,AUCROC,ROC,AUCPR,PR).

where <program> is a list of terms representing clauses and <list of folds> is a list of folds. This returns the log likelihood of the test examples in LL, the Area Under the ROC curve in AUCROC, a dictionary containing the list of points (in the form of Prolog pairs x-y) of the ROC curve in ROC, the Area Under the PR curve in AUCPR, a dictionary containing the list of points of the PR curve in PR.

For example, to test on fold test the program learned on fold train you can run the query

?- induce_par([train],P),
test(P,[test],LL,AUCROC,ROC,AUCPR,PR).

Or you can test the input program on the fold test with

?- in(P),
test(P,[test],LL,AUCROC,ROC,AUCPR,PR).

### Structure Learning

To execute SLIPCOVER, prepare an input file as indicated above, load it into SWI-Prolog and call

?- induce(<list of folds>,P).

where <list of folds> is a list of the folds for training and P will contain the learned program.

For example bongard.pl, you can load it into SWI-Prolog with

?- [bongard].

and perform structure learning on the train fold with

?- induce([train],P).

A program can also be tested on a test set with test/7 as described above.

## Parameters

Parameters are set with commands of the form

:- set_sc(<parameter>,<value>).

The available parameters are:

• specialization: (values: {bottom,mode}, default value: bottom) specialization mode.

• depth_bound: (values: {true,false}, default value: true) if true, the depth of the derivation of the goal is limited to the value of the depth parameter.

• depth (values: integer, default value: 2): depth of derivations if depth_bound is set to true

• single_var (values: {true,false}, default value: false): if set to true, there is a random variable for each clause, instead of a different random variable for each grounding of each clause

• epsilon_em (values: real, default value: 0.1): if the difference in the log likelihood in two successive parameter EM iteration is smaller than epsilon_em, then EM stops

• epsilon_em_fraction (values: real, default value: 0.01): if the difference in the log likelihood in two successive parameter EM iteration is smaller than epsilon_em_fraction*(-current log likelihood), then EM stops

• iter (values: integer, defualt value: 1): maximum number of iteration of EM parameter learning. If set to -1, no maximum number of iterations is imposed

• iterREF (values: integer, defualt value: 1, valid for SLIPCOVER): maximum number of iteration of EM parameter learning for refinements. If set to -1, no maximum number of iterations is imposed.

• random_restarts_number (values: integer, default value: 1, valid for EMBLEM and SLIPCOVER): number of random restarts of parameter EM learning

• random_restarts_REFnumber (values: integer, default value: 1, valid for SLIPCOVER): number of random restarts of parameter EM learning for refinements

• setrand (values: rand(integer,integer,integer)): seed for the random functions, see SWI-Prolog manual for the allowed values

• logzero (values: negative real, default value $$\log(0.000001)$$): value assigned to $$\log 0$$

• max_iter (values: integer, default value: 10, valid for SLIPCOVER): number of interations of beam search

• max_var (values: integer, default value: 4, valid for SLIPCOVER): maximum number of distinct variables in a clause

• beamsize (values: integer, default value: 100, valid for SLIPCOVER): size of the beam

• megaex_bottom (values: integer, default value: 1, valid for SLIPCOVER): number of mega-examples on which to build the bottom clauses

• initial_clauses_per_megaex (values: integer, default value: 1, valid for SLIPCOVER): number of bottom clauses to build for each mega-example (or model or interpretation)

• d (values: integer, default value: 1, valid for SLIPCOVER): number of saturation steps when building the bottom clause

• max_iter_structure (values: integer, default value: 10000, valid for SLIPCOVER): maximum number of theory search iterations

• background_clauses (values: integer, default value: 50, valid for SLIPCOVER): maximum numbers of background clauses

• maxdepth_var (values: integer, default value: 2, valid for SLIPCOVER): maximum depth of variables in clauses (as defined in [4]).

• neg_ex (values: given, cw, default value: cw): if set to given, the negative examples in testing are taken from the test folds interpretations, i.e., those examples ex stored as neg(ex); if set to cw, the negative examples are generated according to the closed world assumption, i.e., all atoms for target predicates that are not positive examples. The set of all atoms is obtained by collecting the set of constants for each type of the arguments of the target predicate.

• verbosity (values: integer in [1,3], default value: 1): level of verbosity of the algorithms

## Files

The pack/cplint/prolog/examples folder in SWI-Prolog home contains some example programs. The subfolder learning contains some learning examples. The pack/cplint/doc folder in SWI-Prolog home contains this manual in latex, html and pdf.

cplint follows the Artistic License 2.0 that you can find in cplint root folder. The copyright is by Fabrizio Riguzzi.

The library CUDD for manipulating BDDs has the following license:

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

• Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

• Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

• Neither the name of the University of Colorado nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAU-SED
AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

1. Elena Bellodi and Fabrizio Riguzzi. 2011. EM over binary decision diagrams for probabilistic logic programs. Proceedings of the 26th italian conference on computational logic (CILC2011), pescara, italy, 31 august 31-2 september, 2011. Retrieved from http://www.ing.unife.it/docenti/FabrizioRiguzzi/Papers/BelRig-CILC11.pdf

2. Elena Bellodi and Fabrizio Riguzzi. 2011. EM over binary decision diagrams for probabilistic logic programs. Dipartimento di Ingegneria, Università di Ferrara, Italy. Retrieved from http://www.unife.it/dipartimento/ingegneria/informazione/informatica/rapporti-tecnici-1/CS-2011-01.pdf/view

3. Elena Bellodi and Fabrizio Riguzzi. 2015. Structure learning of probabilistic logic programs by searching the clause space. Theory and Practice of Logic Programming 15, 2: 169–212. Retrieved from http://arxiv.org/abs/1309.2080

4. William W. Cohen. 1995. Pac-learning non-recursive prolog clauses. Artif. Intell. 79, 1: 1–38.

5. L. De Raedt and W. Van Laer. 1995. Inductive constraint logic. Proceedings of the 6th conference on algorithmic learning theory (aLT 1995), Springer, 80–94.

6. Arun Nampally and CR Ramakrishnan. 2014. Adaptive mCMC-based inference in probabilistic logic programs. arXiv preprint arXiv:1403.6036. Retrieved from http://arxiv.org/pdf/1403.6036.pdf

7. David Poole. 1997. The independent choice logic for modelling multiple agents under uncertainty. Artificial Intelligence 94, 1-2: 7–56.

8. Fabrizio Riguzzi and Terrance Swift. 2010. Tabling and Answer Subsumption for Reasoning on Logic Programs with Annotated Disjunctions. Technical communications of the international conference on logic programming, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 162–171. http://doi.org/10.4230/LIPIcs.ICLP.2010.162

9. Fabrizio Riguzzi. 2013. MCINTYRE: A Monte Carlo system for probabilistic logic programming. Fundamenta Informaticae 124, 4: 521–541. Retrieved from http://ds.ing.unife.it/~friguzzi/Papers/Rig13-FI-IJ.pdf

10. Fabrizio Riguzzi. 2015. The distribution semantics is well-defined for all normal programs. Proceedings of the 2nd international workshop on probabilistic logic programming (pLP), Sun SITE Central Europe, 69–84. Retrieved from http://ceur-ws.org/Vol-1413/#paper-06

11. Taisuke Sato and Yoshitaka Kameya. 2001. Parameter learning of logic programs for symbolic-statistical modeling. J. Artif. Intell. Res. 15: 391–454.

12. J. Vennekens and S. Verbaeten. 2003. Logic programs with annotated disjunctions. K. U. Leuven. Retrieved from http://www.cs.kuleuven.ac.be/\string~joost/techrep.ps

13. J. Vennekens, M. Denecker, and M. Bruynooghe. 2006. Representing causal information about a probabilistic process. Proceedings of the 10th european conference on logics in artificial intelligence, Springer.

14. J. Vennekens, Marc Denecker, and Maurice Bruynooghe. 2009. CP-logic: A language of causal probabilistic events and its relation to logic programming. Theory Pract. Log. Program. 9, 3: 245–308.

15. J. Vennekens, S. Verbaeten, and M. Bruynooghe. 2004. Logic programs with annotated disjunctions. International conference on logic programming, Springer, 195–209.