How Concurrent is Prolog?
I can't find any info on this online... I am al开发者_高级运维so new to Prolog...
It seems to me that Prolog could be highly concurrent, perhaps trying many possibilities at once when trying to match a rule. Are modern Prolog compilers/interpreters inherently* concurrent? Which ones? Is concurrency on by default? Do I need to enable it somehow?
* I am not interested in multi-threading, just inherent concurrency.
Are modern Prolog compilers/interpreters inherently* concurrent? Which ones? Is concurrency on by default?
No. Concurrent logic programming was the main aim of the 5th Generation Computer program in Japan in the 1980s; it was expected that Prolog variants would be "easily" parallelized on massively parallel hardware. The effort largely failed, because automatic concurrency just isn't easy. Today, Prolog compilers tend to offer threading libraries instead, where the program must control the amount of concurrency by hand.
To see why parallelizing Prolog is as hard as any other language, consider the two main control structures the language offers: conjunction (AND, serial execution) and disjunction (OR, choice with backtracking). Let's say you have an AND construct such as
p(X) :- q(X), r(X).
and you'd want to run q(X)
and r(X)
in parallel. Then, what happens if q
partially unifies X
, say by binding it to f(Y)
. r
must have knowledge of this binding, so either you've got to communicate it, or you have to wait for both conjuncts to complete; then you may have wasted time if one of them fails, unless you, again, have them communicate to synchronize. That gives overhead and is hard to get right. Now for OR:
p(X) :- q(X).
p(X) :- r(X).
There's a finite number of choices here (Prolog, of course, admits infinitely many choices) so you'd want to run both of them in parallel. But then, what if one succeeds? The other branch of the computation must be suspended and its state saved. How many of these states are you going to save at once? As many as there are processors seems reasonable, but then you have to take care to not have computations create states that don't fit in memory. That means you have to guess how large the state of a computation is, something that Prolog hides from you since it abstracts over such implementation details as processors and memory; it's not C.
In other words, automatic parallelization is hard. The 5th Gen. Computer project got around some of the issues by designing committed-choice languages, i.e. Prolog dialects without backtracking. In doing so, they drastically changed the language. It must be noted that the concurrent language Erlang is an offshoot of Prolog, and it too has traded in backtracking for something that is closer to functional programming. It still requires user guidance to know what parts of a program can safely be run concurrently.
In theory that seems attractive, but there are various problems that make such an implementation seem unwise.
for better or worse, people are used to thinking of their programs as executing left-to-right and top-down, even when programming in Prolog. Both the order of clauses for a predicate and of terms within a clause is semantically meaningful in standard Prolog. Parallelizing them would change the behaviour of far too much exising code to become popular.
non-relational language elements such as the cut operator can only be meaningfully be used when you can rely on such execution orders, i.e. they would become unusable in a parallel interpreter unless very complicated dependency tracking were invented.
all existing parallelization solutions incur at least some performance overhead for inter-thread communication.
Prolog is typically used for high-level, deeply recursive problems such as graph traversal, theorem proving etc. Parallelization on a modern machines can (ideally) achieve a speedup of
n
for some constantn
, but it cannot turn an unviable recursive solution method into a viable one, because that would require an exponential speedup. In contrast, the numerical problems that Fortran and C programmers usually solve typically have a high but quite finite cost of computation; it is well worth the effort of parallelization to turn a 10-hour job into a 1-hour job. In contrast, turning a program that can look about 6 moves ahead into one that can (on average) look 6.5 moves ahead just isn't as compelling.
There are two notions of concurrency in Prolog. One is tied to multithreading, the other to suspended goals. I am not sure what you want to know. So I will expand a little bit about multithreading first:
Today widely available Prolog system can be differentiated whether they are multithreaded or not. In a multithreaded Prolog system you can spawn multiple threads that run concurrently over the same knowledge base. This poses some problems for consult and dynamic predicates, which are solved by these Prolog systems.
You can find a list of the Prolog systems that are multithreaded here:
Operating system and Web-related features
Multithreading is a prerequesite for various parallelization paradigmas. Correspondingly the individudal Prolog systems provide constructs that serve certain paradigmas. Typical paradigmas are thread pooling, for example used in web servers, or spawning a thread for long running GUI tasks.
Currently there is no ISO standard for a thread library, although there has been a proposal and each Prolog system has typically rich libraries that provide thread synchronization, thread communication, thread debugging and foreign code threads. A certain progress in garbage collection in Prolog system was necessary to allow threaded applications that have potentially infinitely long running threads.
Some existing layers even allow high level parallelization paradigmas in a Prolog system independent fashion. For example Logtalk has some constructs that map to various target Prolog systems.
Now lets turn to suspended goals. From older Prolog systems (since Prolog II, 1982, in fact) we know the freeze/2 command or blocking directives. These constructs force a goal not to be expanded by existing clauses, but instead put on a sleeping list. The goal can the later be woken up. Since the execution of the goal is not immediate but only when it is woken up, suspended goals are sometimes seen as concurrent goals, but the better notion for this form of parallelism would be coroutines.
Suspended goals are useful to implement constraint solving systems. In the simplest case the sleeping list is some variable attribute. But a newer approach for constraint solving systems are constraint handling rules. In constraint handling rules the wake up conditions can be suspended goal pair patterns. The availability of constraint solving either via suspended goals or constraint handling rules can be seen here:
Overview of Prolog Systems
Best Regards
From a quick google search it appears that the concurrent logic programming paradigm has only been the basis for a few research languages and is no longer actively developed. I have seen claims that concurrent logic is easy to do in the Mozart/Oz system.
There was great hope in the 80s/90s to bake parallelism into the language (thus making it "inherently" parallel), in particular in the context of the Fifth Generation Project. Even special hardware constructs were studied to implement "Parallel Inference Machine" (PIM) (Similar to the special hardware for LISP machines in the "functional programming" camp). Hardware efforts were abandoned due to continual improvement of off-the-shelf CPUs and software effort were abandoned due to excessive compiler complexity, lack of demand for hard-to-implement high-level features and likely lack of payoff: parallelism that looks transparent and elegantly exploitable at the language level generally means costly inter-process communication and transactional locking "under the hood".
A good read about this is
"The Deevolution of Concurrent Logic Programming Languages"
by Evan Tick, March 1994. Appeared in "Journal of Logic Programming, Tenth Anniversary Special Issue, 1995". The Postscript file linked to is complete, unlike the PDF you get at Elsevier.
The author says:
There are two main views of concurrent logic programming and its development over the past several years [i.e. 1990-94]. Most logic programming literature views concurrent logic programming languages as a derivative or variant of logic programs, i.e., the main difference being the extensive use of "don't care" nondeterminism rather than "don't know" (backtracking) nondeterminism. Hence the name committed choice or CC languages. A second view is that concurrent logic programs are concurrent, reactive programs, not unlike other "traditional" concurrent languages such as 'C' with explicit message passing, in the sense that procedures are processes that communicate over data streams to incrementally produce answers. A cynic might say that the former view has more academic richness, whereas the latter view has more practical public relations value.
This article is a survey of implementation techniques of concurrent logic programming languages, and thus full disclosure of both of these views is not particularly relevant. Instead, a quick overview of basic language semantics, and how they relate to fundamental programming paradigms in a variety of languages within the family, will suffice. No attempt will be made to cover the many feasible programming paradigms; nor semantical nuances, nor the family history. (...).
The main point I wish to make in this article is that concurrent logic programming languages have been deevolving since their inception, about ten years ago, because of the following tatonnement:
- Systems designers and compiler writers could supply only certain limited features in robust; efficient implementations. This drove the market to accept these restricted languages as, in some informal sense, de facto standards.
- Programmers became aware that certain, more expressive language features were not critically important to getting applications written, and did not demand their inclusion.
Thus my stance in this article will be a third view: how the initially rich languages gradually lost their "teeth," and became weaker, but more practically implementable, and achieved faster performance.
The deevolutionary history begins with Concurrent Prolog (deep guards, atomic unification; read-only annotated variables for synchronization), and after a series of reductions (for example: GHC (input-matching synchronization), Parlog (safe), FCP (flat), Fleng (no guards), Janus (restricted communication), Strand (assignment rather than output unification)), and ends for now with PCN (flat guards, non-atomic assignments input-matching synchronization, and explicitly-defined mutable variables). This and other terminology will be defined as the article proceeds.
This view may displease some readers because it presupposes that performance is the main driving force of the language market; and furthermore that the main "added value" of concurrent logic programs over logic programs is the ability to naturally exploit parallelism to gain speed. Certainly the reactive nature of the languages also adds value; e.g., in building complex object-oriented applications. Thus one can argue that the deevolution witnessed is a bad thing when reactive capabilities are being traded for speed.
ECLiPSe-CLP, a language "largely backward-compatible with Prolog", supports OR-parallelism, even though "this functionality is currently not actively maintained because of other priorities".
[1,2] document OR- (and AND-)parallelism in ECLiPSe-CLP.
However, I tried to get it working some time using the code from ECLiPSe-CLP's repository, but I didn't get it though.
- [1] http://eclipseclp.org/reports/book.ps.gz
- [2] http://eclipseclp.org/doc/bips/kernel/compiler/parallel-1.html
精彩评论