Theory of Formal Languages, Automata, and Computation/Introduction
This textbook is fundamentally on finite representations of infinite languages, where the representations are amenable to computational analysis and characterization. That's it in a nutshell. The rest of this book fleshes out this nutshell in considerable detail, with pointers to still more detailed treatments.
Languages
editA formal language is a set of strings over some alphabet. A language, call it L, can be finite or infinite. The alphabet, which is often denoted as Σ, is a finite set of primitive symbols, such as Σ = {a,…,z, A,….,Z} or Σ = {0,….,9}. Each string is a sequence (i.e., ordered) of symbols from the alphabet. Just as sets generally can be represented intensionally or extensionally, languages can too. An extensional representation is an explicit listing of the strings in the language, and is only relevant to the representation of finite languages. For example, the finite language of binary representations of natural numbers less than 4 is L = {0, 1, 10, 11} or alternatively it could be {00, 01, 10, 11}, over the alphabet Σ = {0, 1}.
In contrast, intensional representations of languages are finite descriptions of the set of strings in the language. These finite-length descriptions are used to describe infinite, as well as finite languages. The natural language description “binary representations of natural numbers less than 4” is an intensional representation of a language, which can also be written as L = {w | w is a binary representation of an integer less than 4}. The phrase “binary representation” suggests the relevant alphabet of {0, 1}. There are other intensional representations of finite sets above. What are they? While I used Σ = {a,…,z, A,….,Z} or Σ = {0,….,9} as representing alphabets, I could also say L = {0,….,9}, for example, or L = “the single digits in common arithmetic, base 10”.
Intensional descriptions are also absolutely needed to finitely represent infinite languages. Consider L = {w | w is a binary representation of an integer greater than 3}, or perhaps “L is the set of binary representations of natural numbers greater than 3”, with Σ = {0, 1}. An interesting observation about intensional descriptions is that there are implicit (typically unstated) inferences that are required to interpret them. There are inferences required of extensional representations too, but I would say that the inferences in the latter case are much closer to the mechanical level that are unambiguous.
Sometimes the intensional description, upon further thought, is insufficient for unambiguous understanding. Fortunately, in the intensional description of the digits, most of us will know how to interpret the ‘…’ because of our prior training, but others will not know how to interpret that. Even for the best trained among us, the description of the binary representations of natural numbers greater than 3 may, upon reflection, seem incomplete. Perhaps we should add a phrase, as in L = {w | w is a binary representation of an integer greater than 3, where w is represented by the least number of binary digits necessary} = {100, 101, 110, 111, 1000, 1001, …}. Why? Even here we might want to say w has no leading 0s.
To my knowledge, no course on formal languages will seek to formalize the nature of implicit inferences in the general case, though this would be required to fully understand the complexity of language representation and comprehension, and is undoubtedly of importance to pure mathematics. We will touch on some of these issues, particularly when we talk about the relationship between AI and formal languages.
As already mentioned, an intensional description (+ the inference mechanism, which may be a computer program) is a finite representation of what is often an infinite language. In fact, much of this course is about two classes of finite representations of languages – grammars and automata (or machines) – that are both amenable to mechanical, unambiguous, automated inference. Roughly speaking, a grammar specifies how to generate strings in a language, and an automaton specifies how to recognize strings in a language. This distinction between generators and recognizers of languages is not as sharp as suggested here, but it’s a good starting point.
Proof
editA traditional goal in automata and formal language courses is to practice skills of deductive proof, which may influence and benefit thought for a lifetime. The word ‘proof’ is often misused, as when it is used to state that there is no proof of this or that natural or social phenomenon (e.g., no proof that climate change is human made). Such areas are not appropriate for deductive proof, but as we will see when discussing AI, the logical mechanisms for carrying out deductive reasoning can be augmented with probabilities and utilities, and be reinterpreted, to become the skeletal structure for other important forms of reasoning: possibilistic reasoning (i.e., what is possible?), probabilistic reasoning (i.e., what is probable?), evidential reasoning (i.e., what is probable given evidence/data/observations?), and rational reasoning (i.e., what are the expected consequences of actions, which includes probabilities but also utilities/costs of outcomes?).
Returning now to deductive proof, Hopcroft, Motwani, and Ullman (2007) state
- "
In the USA of the 1990's it became popular to teach proof as a matter of personal feelings about the statement.
" (p. 5, [1])
But they continue
- "
While it is good to feel the truth of a statement you need to use, important techniques of proof are no longer mastered in high school. Yet proof is something that every computer scientist needs to understand.
” (p. 5, [1])
Contrast this statement with breakout box on page 12 of the same text "How Formal Do Proofs Have to Be?
" where it acknowledges that "persuasion" of an audience member is very relevant, but this will depend on what knowledge that audience member has. It goes on the say, in effect, that proof should persuade an audience (that will include knowledgeable members too) and not just arbitrary individuals.
Another point is that proof (which I will characterize as "deductive" and Polya[2] calls "demonstrative" and "finished", and includes all forms of proof that we discuss) is the result of search for a proof through intelligent, knowledgeable guessing -- that is, mathematics "in the making
".
- "
Mathematics is regarded as a demonstrative science. Yet this is only one of its aspects. Finished mathematics presented in a finished form appears as purely demonstrative consisting of proofs only. Yet mathematics in the making resembles any other human knowledge in the making. You have to guess a mathematical theorem before you can prove it; you have to guess the ideas of a proof before you carry through the details.
" (p. vi, [2])
Polya makes clear that the "guessing" behind the search for a proof should not be arbitrary but should follow principles of "plausible" reasoning. Textbooks often present "finished" proofs, but embedded in the discussion sometimes there will be insights into the search and rules of thumb for easing and directing the search for the finished product.
As an example, consider Goldbach’s Conjecture – that every even number greater than 4 is the sum of two prime numbers. But how do I come up with this statement to begin with? How did Goldbach come up with Goldbach’s conjecture? I do so through what Polya also includes as part of plausible reasoning – I notice that every even number that I can think of is the sum of two primes. This is an empirical exercise of identifying a pattern based on data. After identifying a plausible pattern, I can attempt a proof of the statement (e.g., If N is even and greater than 4, then there exists ( ) N-k and k that are both prime). No one, by the way, has ever proven Goldbach’s Conjecture. It is so easily stated but as yet unproved as true or false. That’s one of the attractions of number theory – easily comprehended and difficult to prove hypotheses, at least in many cases. A classic program of early AI and machine learning is known as AM, standing for "A Mathematician", by Douglas Lenat[3]. Rather than proving theorems, which was an early focus of AI research predating AM, Lenat's system searches for empirically supported patterns in number theory which then are presented by the system as candidate theorems. AM illustrates the tasks of mathematics in the making, which are integral to computational theory as well as mathematics.
Once a candidate theorem is identified, a search for a proof of it (or not) can begin. Generally, finding a valid proof is a much more difficult problem than validating a given proof. An interesting hybrid problem of finding and validating arises when grading a proof, where ostensibly the task is to validate or invalidate the "proof" submitted by a researcher, students included, but this will sometimes require that a focused search evaluates "how far" the submission is from a valid proof (e.g., for the purpose of assigning a grade). This grading of a proof attempt is a good example of an AI task, rather than ALC territory per se.
Proof by Induction
editYou may have covered proof by induction in a course on discrete structures, and if so, our treatment here is review. If we want to prove a statement St of all natural numbers, then we explicitly show base cases -- that St(0) is true and perhaps other base cases -- and then we show the truth of an inductive step, that ‘if St(k) is true then St(k+1) is true’ for all k 0 and/or some other base cases. The inductive step suggests the importance of proofs by induction in formal languages, where strings have lengths and the truth of statements about strings of length k+1 are demonstrated by building on the assumption of truth of strings of length k (or less than and up to length k). Figures InductionProof1, InductionProof2, and InductionProof3 give sample proofs, two arithmetic and the third involving strings.
The deductive validity of proof by induction rests on the axioms of number theory -- one in particular. But to review, Peano_axioms of number theory are as follows.
1. 0 is a natural number.
2. For every natural number x, x = x. That is, equality is reflexive.
3. For all natural numbers x and y, if x = y, then y = x. That is, equality is symmetric.
4. For all natural numbers x, y and z, if x = y and y = z, then x = z. That is, equality is transitive.
5. For all a and b, if b is a natural number and a = b, then a is also a natural number. That is, the natural numbers are closed under equality.
6. For every natural number n, the successor of n, n+1, is a natural number. That is, the natural numbers are closed under the successor function.
7. For all natural numbers m and n, m = n if and only if the successor of m is the successor of n, m+1 = n+1.
8. For every natural number n, n+1 = 0 is false. That is, there is no natural number whose successor is 0.
9. If K is a set such that: 0 is in K, and for every natural number n, n being in K implies that n+1 is in K, then K contains every natural number.”
It is the 9th axiom that justifies proof by induction. It says precisely what we have been illustrating above, which is that if a statement is true of n (i.e., St(n) is true; n is in the set of natural natural numbers for which St is true), and St(n) St(n+1), then the statement, St, is true of every natural number.
The 9th axiom, instantiated for particular statements, St, is itself a finite representation of the countably infinite set of natural numbers.
Proof by Contradiction
editThere are other forms of proof too, like proof by contradiction. Suppose you want to prove proposition p, then the equivalent statement p false (or simply that p is false) may be more amenable to proof.
For example, prove “If x is greater than 2 (p1) and x is prime (p2), then x is odd (q)” by contradiction. Abbreviate this as proving (p1 ⋀ p2) q by contradiction, so negate the statement resulting in
((p1 ⋀ p2) q), which is equivalent to (represented as )
( (p1 ⋀ p2) ⋁ q)
( ( p1⋁ p2) ⋁ q)
( ( p1⋁ p2) ⋀ q)
(( p1⋀ p2) ⋀ q)
((p1⋀ p2) ⋀ q)
(p1⋀ p2 ⋀ q) // (x is odd) (x is even)
Does (x is greater than 2) and (x is prime) and (x is even) create a contradiction? If x is even then x = 2y, for some y > 1, then x is composite by definition, contradicting x is prime (p2).
Here is another example of proof by contradiction. This example is due to Hopcroft, Motwani, and Ullman (2007)[5]. Let S be a finite subset of an infinite set U. Let T be the complement of S with respect to U. Prove by contradiction that T is infinite. Note that U and S represent the cardinality or size of sets U and S.
Prove “IF S ⊂ U (p1) and
there is no integer n s.t. U = n (p2) and
there is an integer m s.t. S = m (p3) and
S ⋃ T = U (p4) and
S ∩ T = {} (p5)
THEN there is no integer p s.t. T = p (q)
by contradiction
Similar to prior example, negate the entire expression and simplify, getting p1 ⋀ p2 ⋀ p3 ⋀ p4 ⋀ p5 ⋀ q.
q there is an integer p s.t. |T| = p
p4 and p5 U = S + T = m + p contradicting p2 (that there is no integer n s.t. U = n)
Diagonalization
editSpecial cases of proof by contradiction include proof by counterexample and proof by diagonalization. This latter strategy is used to prove exclusion of items from an infinite set of like items. In diagonalization one defines a matrix in which the rows, of which there may be a countably infinite number, represents members of a set. Each row is defined by the variables given by each of the columns of the matrix. One defines the matrix and uses the diagonal of the matrix to construct a hypothetical candidate row, but it is a candidate row that cannot belong to the set of rows in the matrix by virtue of the way the candidate row was constructed.
Cantor’s proof that the real numbers are not countably infinite introduced the diagonal argument, and assumes a matrix where rows correspond to the (countably infinite) natural numbers (i.e., 0, 1, 2, 3, …) and the columns correspond to digits that are to the right of the decimal point, as in 0.x1x2x3… (e.g., 0.3271…). We can then take the diagonal of this matrix and create a candidate row by taking each digit xj,j in the diagonal and changing it, say to (xj,j+1) mod 10. This candidate sequence cannot match any row, by definition, since at least one digit in it, the one that was changed along the diagonal, is different from the corresponding digit in any row. So, there are syntactically valid sequences of digits to the right of the decimal point that cannot correspond to any row in the countably infinite set of rows. This is illustrated in Figure Diagonalization.
When this strategy was introduced in the late 19th century there was some controversy around it because of the novelty of creating an infinite-length counterexample, and Wittgenstein was reported to have called the diagonal argument “hocus pocus”, but it has now been long accepted as mathematically valid. We will use it later in proving that some languages are not members of certain language classes.
Proof with Construction
editThe most common form of "proof" or demonstration that you will see in this textbook involves construction. For example, if we want to show that two descriptions, X and Y, define the same language, we show that we can construct/translate Y from X: X Y, and vice versa, Y X. But we can't construct just any old thing and expect that to be a sufficient demonstration of equivalence. You have to be convinced that the translation or construction from one representation to the other is not buggy and is otherwise valid. If we want to be formally rigorous we might therefore follow construction by a proof, say by induction or contradiction, of the construction's validity. We will typically only allude to or sketch the proof that the construction is valid, if that, spending most of our time explaining the construction and its validity.
Automated Proof
editIn AI, automatic proof has a rich history, and we take time to talk about AI proof as a way of demonstrating important principles of thinking. An important insight that is rarely taught in lecture is that to prove something requires that one search for a valid proof. When was the last time that you saw a lecturer prove something live and previously unseen by the lecturer? This search can take a long time and it may not be successful at all. Rather, the search for a proof (or the search for a correct computer program for a given specification) typically happens beforehand, perhaps from scratch or by looking up the results of someone else’s search. Demonstrating the validity of an existing proof, which is what is typically done in lecture, is generally much easier than finding a valid proof through search! This distinction between validating and finding will relate to our study on P and NP problems, respectively. You may have seen P and NP elsewhere, perhaps an algorithms course, but if not we address it in Properties of Language Classes. Our talk of AI proof will make explicit the importance of search or what Polya includes as part of plausible reasoning for theory "in the making".
If you know a statement that you want to prove, then it often makes sense to reason backwards from that statement to a known set of axioms by inverting deductively valid steps (so that when read forward the steps are valid). Backwards reasoning is an important strategy in AI and in thinking generally. An early, if not the first AI paper on this as thought strategy was Shachter and Heckerman (1987)[6] but a search on Google for backwards reasoning or the like turns up numerous business articles and talks in the last decade. I am reminded of an old adage in motorcycle circles -- "when you are making a turn look to where you want to be", which I am sure is a metaphor for much that you will read in business and elsewhere.
A more technical example is that if I am planning a route to particular destination, I might start with the destination on a road map and work backwards along the roads leading to it, selecting those that are in the direction of my current location. As a matter of interest, automated route planners might work in both directions, from destination to start location, and vice versa, in search of intersections in the search "frontiers".
Automated proof (aka theorem proving) is grounded in formal logic. You saw propositions (e.g., 'p's and 'q's), as used in propositional logic, under Proof by Contradiction above. Two basic operations of inference in propositional logic are modus ponens and resolution. The modus ponens inference rule says that if you know 'p' (i.e., proposition p is true) and you know 'p q' (i.e., p implies q) then you can deductively conclude proposition 'q' (i.e., q is true). You now know both p and q, i.e., (p q). Modus ponens is the best known deductive inference rule.
The resolution rule relies on the equivalence between implication and disjunction, (p q) ( p q). If you know 'p' and you know '( p q)' then you know q of course. Its as if the p and the p cancel each other, and if you think of it logically you can see why: if p ( p q) (p p q) (p p) (p q) false (p q), which is again (p q).
Modus ponens and resolution are equivalent rules of inference at this simple level, but they offer different perspectives on deductive reasoning, and we will see them again when discussing computational theory, and again in the last chapter on applications, where we will also talk about automated theorem proving in both propositional and first-order logic.
Problems, Procedures, and Algorithms
editBefore getting to the various types of grammars and automata to finitely represent languages, lets briefly preview some theory of computation with concepts that we already know – procedures and algorithms. A procedure is a finite sequence of steps that can be carried out automatically, that is by a computer. An algorithm is a procedure that is guaranteed to halt on all inputs.
Decidability and Undecidability
editProcedures, be they algorithms or not, can be written to recognize languages. The question of whether a string w is a member of a language L is a yes/no question, and a procedure for answering the question is a finite representation of L if and only if (a) the procedure says 'yes' for every w in L (and halts), and (b) the procedure never says 'yes' for any w that is not in L. If in addition the procedure says 'no' for every w not in L (and halts) then the procedure is an algorithm. Notice that these conditions collectively allow for a procedure that is a finite representation of L but that is not an algorithm. Such a non-algorithm procedure says 'yes' and halts if and only if w in L, but there exists w that are not in L for which the procedure will not answer 'no' and will not halt. If for language L there is no algorithm for correctly answering the yes/no question of membership in the language, then membership in L is said to be undecidable. If an algorithm does exist for correctly answering the question of membership in L for all possible input strings, members and non-members, then membership in L is decidable.
Presumably most readers can write a procedure that recognizes whether an input integer is a prime number or not, outputting yes or no, respectively. If written correctly the procedure would certainly halt in all cases, so its an algorithm and it recognizes the language of prime numbers. An algorithm can also be written to determine whether an integer is a perfect number or not. A perfect number is a number that is the sum of all its divisors (excepting itself, but including 1). So, membership in the language of prime numbers is decidable, as is membership in the language of perfect numbers.
Decision Problems
editFormally, decision problems refer to yes/no questions. The question of membership in a language is one type of decision problem, for which we have given examples of testing for primality and testing for "perfection". Lets consider two other decision problems. You can write an algorithm for answering whether there is a prime number that is greater than a number that you supply as input. Since the prime numbers are known to be infinite then its a simple algorithm indeed -- just answer 'yes' regardless of input! But for the sake of illustration, assume that in addition to yes, you wanted to give the smallest prime that is greater than the number that is passed as input. You can write an algorithm for doing this that could use the algorithm for membership testing for primes as a subroutine.
You can also write a procedure for answering the question of whether there is a perfect number greater than a number that is passed as input, but it is not known whether the perfect numbers are infinite, and so its not known whether this procedure is an algorithm or not – its not known whether it will halt on all inputs. In the case of an input that is greater than the largest currently known perfect number, the procedure will halt with a yes if a larger perfect number is eventually found, but again, the procedure will not halt if there is no larger perfect number. This decision problem of next perfect number is not known to be decidable or undecidable, but as in all examples of undecidable decision problems thus far, the 'undecidability aspect' of a question lies in the handling of 'no' cases.
Notice that the knowledge that the primes are infinite and the uncertainty on whether the perfect numbers are (or are not) infinite lies outside the procedures for next-prime and next-perfect, respectively. We can write the procedures for each next-type problem in very much the same way, and the fact that one is an algorithm and one is not known to be an algorithm is not revealed in the procedures themselves.
As other examples of decision problems lets consider inspirations from Goldbach's conjecture. Consider the language of even numbers greater than 4. Call it LEven. Consider the language, LEvenSum, of even numbers greater than 4 that are the sum of two primes. Now, consider the language, LEvenNotSum, of even numbers greater than 4 that are not the sum of two primes. One decision problem is whether LEvenSum LEven? A second decision problem is whether LEvenNotSum , i.e., the empty language? Intuitively, these two questions seem similar, and perhaps equivalent. Its left as an exercise to show that the two questions are indeed equivalent.
Given that Goldbach's Conjecture is still a conjecture, it is not known whether LEvenSum LEven and LEvenNotSum are decidable or undecidable. Not coincidentally, it is not apparent on how the write a procedure for answering the questions! Such a procedure would constitute the main part in a proof with construction on the decidability or undecidablity of these decision problems.
LEvenSum LEven is just one example of a decision problem that tests the equality of two languages. Likewise, LEvenNotSum is just one example of a decision problem testing for the emptiness of a language. In the chapter on Properties and Language Classes we will study decision problems in the context of classes (aka sets) of languages rather than specific languages only. For example, is the question of L decidable or undecidable for all languages L in a given class of languages?
Interestingly, decision problems themselves define languages -- languages of "languages"! Let Lmeta be the set (aka language) of recognition procedures for languages that are empty. Convince yourself that this makes sense. A recognition procedure for a language is after all a finite sequence of steps, which is a finite string in a some programming language. Is Lmeta ? Is membership in Lmeta decidable? Does a recognition procedure for Lmeta even exist that can correctly answer in all of the yes cases and no cases?
Languages that cannot be Finitely Represented
editIf no recognition procedure for Lmeta exists (i.e., no procedure exists that can recognize all, and only, members of Lmeta ) , then recognition for Lmeta would be undecidable in an even stronger sense than other undecidable decision problems that we've discussed thus far, in which the 'undecidability aspect' of the problem rested with the 'no' cases only.
Intuitively, examples of languages that cannot be finitely represented at all will be those for which there is no basis by which a recognizer can be constructed. A language in which the members are selected randomly would be an intuitive illustrative example: LR = {w | w in Σ* and random(w) == true} could be formed by enumerating the strings over alphabet Σ in some systematic order and randomly deciding in each case whether a string, w, is in the language or not. The reason that a recognizer can't be built for this language assumes that random(w) is applied in the recognizer with no memory of its application in the creation of the language. And we can't simply remember the members, since there are an infinite number. In contrast, if random(w) were a pseudorandom process then such procedures are repeatable and we could implement a recognizer in that case, but if we have a truly random process then no recognizer can be built to say yes correctly in all cases.
Random languages, as above, are posed to give you an example of languages that are not finitely representable that might be less mind numbing than Lmeta . We will return to these questions when we talk more on the theory of computation.
Generating Languages
editRecognition algorithms like those for prime and perfect numbers can be used as subroutines for generating languages as well. The language of prime numbers can be generated by initializing the set of "strings" to {2}, and embedding the recognition procedure in a loop that iterates over increasingly large odd integers, starting with 3, testing each for primeness, and adding only those numbers that are found to be prime. Note that because the primes are known to be infinite this generation procedure won’t terminate, so its not an algorithm but of course its not a yes/no question either. We could create a non-equivalent algorithm from the generator by using a counter that artificially stops the iterative process after a specified number of iterations. This would not be a generator of the infinite language any more, but only a generator up to strings in the language up to a prescribed length, which would be a finite language.
We can use the same strategy of systematically generating larger numbers, not odd only in this case, and using the perfect number recognition algorithm on each. This would be a generator of the perfect numbers. This too would run forever unless it was constrained by a maximum count on iterations.
A procedure that is designed to "run forever" is called an anytime algorithm because at any time it can be paused and the output to that point can be examined before continuing to play the algorithm forward. Anyone who has experienced a system crash knows that an anytime algorithm is best regarded as a theoretical construct, though even in the case of a crash, computer systems generally save their state and can be restarted from that saved state, so an anytime algorithm could proceed indefinitely, if not forever.
There are ways to convert a generation procedure for a language to one that generates one string, w, of a language, L, on each run of the algorithm. An exercise asks you to sketch an approach to do so, which will guarantee that each string in the language will be generated with non-zero probability.
Exercises, Projects, and Discussions
editInduction Exercise 1: Prove the equality given in Figure InductionExercise1 by induction.
Induction Exercise 2: (Due to Hopcroft and Ullman (1979)[7]). Briefly explain what is wrong with the following inductive “proof” that all elements in any set must be identical. The basis is that for sets with one element the statement is trivially true. Assume the statement is true for sets with n-1 elements, and consider a set S with n elements. Let a be an element of S. Let S = S1 S2, where S1 and S2 each have n-1 elements, and each contains a. By the inductive hypothesis (assumption) all elements in S1 are identical to a and all elements in S2 are identical to a. Thus all elements in S are identical to a.
Induction Exercise 3: Prove by induction that the number of leaves in a non-empty full binary tree is exactly one greater than the number of internal nodes.
Proof Discussion 1: Have a discussion with an AI large language model on the history and philosophy of "proof", potentially including the role of plausible reasoning, infinity, and/or other issues that you are particularly interested in. Include your prompts and the responses, and/or ask the LLM to summarize the discussion in X words or less (e.g., X = 500).
Decision Problem Exercise 1: Show that LEvenSum LEven and LEvenNotSum are equivalent decision problems.
Decision Problems Exercise 2: There are ways to convert a generation procedure for a language to one that generates one string, w, of a language, L, on each run of the algorithm. Sketch an approach to do so, which will gaurantee that each string in the language will be generated with non-zero probability. Your "one-shot" version of generate can use the recognition procedure for L as a subroutine.
Proof Project 1: Learn about AI proof assistants and use these throughout the semester to addresses problems that you are assigned, and unassigned. Report the results and experiences in a proof portfolio. https://www.nytimes.com/2023/07/02/science/ai-mathematics-machine-learning.html
References
edit- ↑ a b Hopcroft, John E., Motwani, Rajeev, Ullman, Jeffrey D. (2007). Introduction to Automata Theory, Languages, and Computation 3rd Edition. Pearson Education, Inc/Addison-Wesley Publishing Company.
- ↑ a b Polya, George (1954). Induction and Analogy in Mathematics. Princeton University Press. ISBN 0-691-08005-4.
- ↑ Lenat, Douglas. "Automated Theory Formation in Mathematics" (PDF). Proceedings of the 5th International Joint Conference in Artificial Intelligence: pp. 833-842.
{{cite journal}}
:|pages=
has extra text (help) - ↑ Hopcroft, John E; Ullman, Jeffrey D (1979). Introduction to Automata Theory, Languages, and Computation. Addison Wesley. p. 11. ISBN 0-201-02988-X.
- ↑ Hopcroft, John. E (2007). Introduction to Automata Theory, Languages, and Computation (3rd ed.). Pearson Education, Inc. p. 9. ISBN 0-321-45536-3.
- ↑ Shachter, Ross; Heckerman, David (Fall 1987). "Thinking Backwards for Knowledge Acquisition" (PDF). AI Magazine. 8 (3): 55–61.
- ↑ Hopcroft, John E; Ullman, Jeffrey D (1979). Introduction to Automata Theory, Languages, and Computation. Addison Wesley. p. 11. ISBN 0-201-02988-X.