Abstract

Zero pronominals challenge Type Logical Grammar (TLG) in two ways. First, TLG imposes a linear resource management regime on semantic composition, so pronominals need special treatment because they multiply semantic resources. Second, as a lexicalist grammar, TLG applies only to phonologically realized lexical entries, which rules out phonetically null items in syntactic derivation. Jäger extends the inventory of category-forming connectives of TLG with a third kind of implication that creates categories of anaphoric items, which solves the first problem. This article goes a step further and tackles the second. To formalize constructions with zero pronominals, we design a ternary category and add it to Jäger’s system. The proposed system is proof-theoretically well behaved: it is sound, complete, and decidable. More importantly, zero pronominals of various forms can be derived in the system.

1. Introduction

Natural languages are economical, arguably at all tangible levels, conveying more information with less effort [1, 2]. At the sentence level, various resource reuse strategies are used to avoid repetition. The reused resources are sometimes realized overtly as pronouns, reflexives, and auxiliaries, as in (1)–(3), and are sometimes covert, like the subject of the second conjunct in (4) and the PROs in the control constructions (5) and (6). On the semantic side, however, the picture is the opposite. Each reused resource, even in zero form, multiplies the meaning of its antecedent and reappears in the semantics. For instance, the two PROs in (5) and (6) pick up the meaning representations of their controllers in the logical interpretation. They are represented as the subject JOE′ and the object JEAN′, respectively, of the upstairs verbs, as shown in (7) and (8).
(1) Joe claims that he will win. (pronoun)
(2) Joe likes himself. (reflexive)
(3) Joe walks and Jean does too. (ellipsis)
(4) Joe walks and talks. (coordination)
(5) Joe_i promises Jean [PRO_i to stay]. (subject control)
(6) Joe asks Jean_i [PRO_i to stay]. (object control)
(7) PROMISE′(JOE′, JEAN′, STAY′(JOE′))
(8) ASK′(JOE′, JEAN′, STAY′(JEAN′))

Handling these resource reuse mechanisms is a challenge for TLG because TLG assumes a monostratal model of natural language, in which only neighboring items or constructions can combine. However, all the reused linguistic items in (1)–(6), including anaphora, non-constituent coordination, and control, form discontinuous constructions, in which the anaphor/PRO is located several words away from its antecedent/controller. Thus, the grammar has to cope with discontinuity in order to be a competent natural language grammar. There are mainly two ways to deal with anaphoric constructions in a type-logical setting. One is to treat pronominals as being triggered by certain lexical items whose semantic representations contain a λ-operator that binds more than one variable occurrence [3–7]. However, this strategy forces highly complex lexical entries and brings formidable mechanisms, such as secondary wrapping [5], into syntax. The other is to introduce into syntax a specifically designed operation that gives the pronominals a semantic interpretation as simple as an identity function. Hepple’s permutation operator Δ [8], Jacobson’s variable-free semantics [9, 10], and Jäger’s Lambek calculus with Limited Contraction (LLC) [11] are all attempts of this sort. Because of its theoretical simplicity, the last has attracted the most attention and has given rise to a series of logical extensions [12–15].

Jäger [11] includes a limited version of the structural rule of contraction, shown in Rule 1. Rule 1 does nothing more than allow antecedent formulae to be multiplied. The resulting system LLC features a vertical slash | that creates categories of anaphoric items. A sign has category A|B iff it needs an antecedent of category B and, if it finds one, behaves like an item of category A. Thus, both the pronoun in (1) and the reflexive in (2) are identity functions λx.x of category np|np. Using the | Elimination rule (Rule 2), the simple reflexive sentence (2) is derived as in Figure 1. Hereinafter we use natural deduction for rules and syntactic derivations for the reason stated in [11]: there is a tight correspondence between the structure of proofs and the syntactic structure of the Curry–Howard terms.
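The semantic effect of | Elimination on sentence (2) can be illustrated with a minimal sketch. It assumes only what is stated above (the reflexive is λx.x of category np|np, and the anaphor's meaning is applied to the antecedent's meaning); the tuple encoding of logical forms and the names `slash_elim`, `likes`, and `himself` are hypothetical and chosen for illustration.

```python
# Meanings are nested tuples; lexical meanings are plain Python functions.
himself = lambda x: x                                  # category np|np, meaning λx.x
likes = lambda obj: lambda subj: ("LIKE'", subj, obj)  # category (np\s)/np

def slash_elim(anaphor, antecedent):
    """Semantic side of |E: apply the anaphor's meaning to the antecedent's meaning.
    The antecedent's meaning stays available, so it is used twice overall."""
    return anaphor(antecedent)

joe = "JOE'"
obj = slash_elim(himself, joe)   # the reflexive resolves to its antecedent
sentence = likes(obj)(joe)       # Joe likes himself
print(sentence)
```

The antecedent meaning `joe` is consumed once by the verb and once by the anaphor, which is exactly the resource multiplication that limited contraction licenses.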

Rule 1. Sequent representation of limited contraction 1

Rule 2. | ELIMINATION
Despite all these successful treatments of discontinuity, we find that the current categorial machinery struggles to legitimately derive constructions involving zero pronominals, for example, the PROs in (5) and (6). PROs are different from elided constituents in coordination. The latter can be derived by the generalized coordination rule (Rule 3), as shown in Figure 2, whereas PROs occur in embedded clauses rather than in a coordination construction and are thus not eligible for predicate coordination.

Rule 3. GENERALIZED BOOLEAN COORDINATION SCHEME
Here is the dilemma for deriving zero pronominals in TLG. On the one hand, zero pronominals have to be made explicit to carry their meaning for appropriate semantic derivation. On the other hand, Lambek calculus does not allow zero pronominals to be made explicit, since doing so would introduce the structural rule of monotonicity and jeopardize the decidability of the system by allowing the addition of a formula that is similar to one in the antecedent.
As a result, an ideal system for the present purpose should extend LLC further so that covert anaphoric items can become overt without hurting the system’s decidability. This is our goal in the present study. It differs from earlier extensions of LLC in that it is a direct logical expansion, rather than a lexical enrichment that includes the anaphoric slash in the categories of control verbs [12], and it may offer a more fine-grained basis for the increasingly popular work connecting TLG to computational distributional semantics [16–20]. In Section 2, we flesh out our theoretical assumptions for zero pronominals. We then present the axiomatic presentation, the Gentzen-style sequent formulation, and the labeled natural deduction of the new system LLCM in Sections 3–5, respectively, proving that LLCM is sound, complete, and decidable. More linguistic phenomena concerning zero pronominals are also discussed in Section 5.

2. Anaphora Slot and LLCM

We follow Jäger’s approach to anaphors [11] and extend LLC so that it accommodates two desirable properties: first, the system should allow covert items to become overt; second, the item made explicit should be anaphoric. This means that the system should allow both pronominals and zero pronominals. In this way, the PROs in control constructions can take up an actual logical form as well as an interpretation during sentence derivation. Thus, (5) and (6) can ideally be derived as in Figures 3 and 4, where to stay takes the right subject, identical to its controller, instead of seizing the direct object of the upstairs predicate (its left neighbor) as its subject. (To simplify the derivation, we ignore the morphological distinction between finite and non-finite VPs and treat to stay as a single lexical entry.)
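The intended semantic outcome for (5) and (6) can be sketched as follows. This is an illustration under stated assumptions, not the derivations of Figures 3 and 4: the zero pronominal is taken to mean λx.x, like the overt anaphors of LLC, and the choice of controller is passed in explicitly through the hypothetical parameter `controller`, since the text does not say how it is selected.

```python
pro = lambda x: x                      # zero pronominal, assumed meaning λx.x
stay = lambda subj: ("STAY'", subj)    # "to stay", treated as one lexical entry

def control(verb, subj, obj, controller):
    """Build the logical form of a control sentence.
    The embedded subject is the overt-made PRO resolved to `controller`."""
    embedded = stay(pro(controller))
    return (verb, subj, obj, embedded)

# (5) subject control: PRO picks up the matrix subject
promise = control("PROMISE'", "JOE'", "JEAN'", controller="JOE'")
# (6) object control: PRO picks up the matrix object
ask = control("ASK'", "JOE'", "JEAN'", controller="JEAN'")
print(promise)
print(ask)
```

The two results have the shape of (7) and (8): the embedded predicate receives a copy of the controller's meaning rather than whatever happens to be its left neighbor.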

We name the location where the zero pronominal resides “anaphora slot.” Thus, an anaphora slot introduction rule in embryo should be like Rule 4.

Rule 4. ANAPHORA SLOT INTRODUCTION
From a type-logical perspective, introducing an anaphora slot in syntax means that the logic of grammatical composition allows a somewhat “redundant” (because covert in actual discourse) but not arbitrary category in a valid deduction. This amounts to assuming that the structural rule of monotonicity is part of the grammar in one way or another. Thus, we name our system LLCM, meaning LLC with limited monotonicity.
We put forward a ternary category [A〈B〉C], in which 〈B〉 is an anaphora slot operator (sometimes simplified as 〈〉). It helps to reveal how the category of a zero pronominal is introduced, deleted, or concatenated with the categories to its left or right. Our experience with control constructions and PROs suggests a preferred order of concatenation among the adjacent strings “A, B, C”: B concatenates with C first (if there is a C to B’s right), and the result then concatenates leftwards with its left neighbor A. Thus, zero pronominals, when made overt, should obey limited contraction 2 (Rule 5) in its sequent representation.
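The order of concatenation just described can be sketched as two rewriting steps on categories. The sketch assumes the two arrows that appear in the axiomatic version in Section 3, A • C ⟶ [A〈B〉C] and [A〈B〉C] ⟶ A • (B • C); the tuple encoding and the function names `insert_slot` and `realize` are hypothetical.

```python
def insert_slot(left, hidden, right):
    """A • C  ⟶  [A〈B〉C]: open a slot for the covert category B between A and C."""
    return ("slot", left, hidden, right)

def realize(slot):
    """[A〈B〉C]  ⟶  A • (B • C): B combines with C first, then the result with A."""
    tag, left, hidden, right = slot
    if tag != "slot":
        raise ValueError("expected a ternary slot category")
    return ("•", left, ("•", hidden, right))

# Illustrative categories only: an np, a covert anaphor np|np, and a verb phrase np\s.
slot = insert_slot("np", "np|np", "np\\s")
print(realize(slot))
```

The right-nested product returned by `realize` encodes the preference that the covert item groups with its right neighbor before the result combines with the material on the left.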

Rule 5. SEQUENT REPRESENTATION OF LIMITED CONTRACTION
In the next two sections, we define LLCM in its axiomatic version and its Gentzen-style sequent formulation, and prove that it is sound, complete, and decidable under this expansion.

3. Model Theory of LLCM

Now we extend the inventory of LLC category-forming connectives by the ternary operator 〈〉. The set of LLCM categories ℱ over a collection of atomic categories A is given below.

Definition 1. LLCM CATEGORIES
Every atomic category is a well-formed LLCM category. If F is a well-formed LLCM category, then F/F, F\F, F|F, F • F, and [F〈F〉F] are also well-formed LLCM categories.
All well-formed LLCM categories are recursively defined as in Definition 1. Next, a sound and complete model-theoretic interpretation for LLCM is presented.
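The recursive definition can be mirrored by a small data type. This is a minimal sketch for illustration: the class names `Atom`, `Binary`, and `Slot` and the printer `show` are hypothetical, and the example category at the end is arbitrary rather than taken from any derivation in the text.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Atom:
    name: str                    # an atomic category such as np or s

@dataclass(frozen=True)
class Binary:
    op: str                      # one of "/", "\\", "|", "•"
    left: "Cat"
    right: "Cat"

@dataclass(frozen=True)
class Slot:
    left: "Cat"                  # A in [A〈B〉C]
    hidden: "Cat"                # B, the category in the anaphora slot
    right: "Cat"                 # C

Cat = Union[Atom, Binary, Slot]

def show(c: Cat) -> str:
    """Print a category, bracketing every complex subcategory."""
    if isinstance(c, Atom):
        return c.name
    if isinstance(c, Binary):
        return f"({show(c.left)}{c.op}{show(c.right)})"
    return f"[{show(c.left)}〈{show(c.hidden)}〉{show(c.right)}]"

np, s = Atom("np"), Atom("s")
pronoun = Binary("|", np, np)    # anaphoric category np|np
vp = Binary("\\", np, s)         # np\s
slot = Slot(np, pronoun, vp)
print(show(slot))
```

Because every constructor takes categories as arguments, any value of type `Cat` is well formed by construction, which is the point of the recursive definition.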

Definition 2. MODEL of LLCM (This model is based on LLC in [11] and a preliminary version of this system is given in our earlier work [21].)
A model for LLCM is a tuple 〈W, R, S, T, ∼, f, g〉, where W is a non-empty set of linguistic signs; T ⊆ W⁴ is a quaternary relation on W; R, S ⊆ W³ are ternary relations on W; ∼ ⊆ W² is a binary relation on W; f is a function from atomic categories to subsets of W; and g is a function from LLCM categories to W. The verification relation between points in W and LLCM categories is defined as follows:
‖p‖ = f(p) ⊆ W
‖A • B‖ = {x | ∃yz[Rxyz & y ∈ ‖A‖ & z ∈ ‖B‖]}
‖A\B‖ = {x | ∀yz[Rzyx & y ∈ ‖A‖ ⇒ z ∈ ‖B‖]}
‖A/B‖ = {x | ∀yz[Rzxy & y ∈ ‖B‖ ⇒ z ∈ ‖A‖]}
‖A|B‖ = {x | ∃y[Sxyg(B) & y ∈ ‖A‖]}
‖[A〈B〉C]‖ = {x | ∃yzu[Txyzu & y ∈ ‖A‖ & z ∈ ‖B‖ & u ∈ ‖C‖]}
The ternary relation R can be taken as ordinary syntactic concatenation between linguistic signs. Rxyz means that if y and z occur adjacently in that order, their combination yields x. Relation S is the ternary relation of [11]. It is similar to R, but it is responsible for anaphora resolution. Sxyz means that x is changed into y if there is an element similar to z (similarity is written ∼ in the meaning postulates) available as an antecedent for anaphora resolution. Relation T models the anaphora slot operation, and Txyzu means that x is the result of inserting z between y and u.
The following meaning postulates hold:
MP1 ∀xyzwu[Rxyz & Szwu ⇒ ∃v[Sxvu & Rvyw]]
MP2 ∀xyzwu[Rxyz & Sywu ⇒ ∃v[Sxvu & Rvwz]]
MP3 ∀xyzwuv[Rxyz & Sywu & Szvu ⇒ ∃r[Sxru & Rrwv]]
MP4 ∀xyzwu[Rxyz & Szwu & y ∼ u ⇒ Rxyw]
MP5 ∀A∀x[x ∈ ‖A‖ ⇒ x ∼ g(A)]
MP6 ∀xyzu[Rxyz ⇒ Txyuz]
MP7 ∀xyzu[Txyzu ⇒ ∃v[Rxyv & Rvzu]]
MP8 ∀xyzuvw[Rxyz & Tzuvw ⇒ ∃t[Txtvw & Rtyu]]
MP9 ∀xyzuvw[Rxyz & Tyuvw ⇒ ∃t[Txtvw & Rtuz]]
MP10 ∀xyzuvw[Rxyz & Tzuvw ⇒ ∃t[Txuvt & Rtyw]]
MP11 ∀xyzuvw[Rxyz & Tyuvw ⇒ ∃t[Txuvt & Rtwz]]
MP12 ∀xyzuvwst[Rxyz & Tyuvw & Tzsvt ⇒ ∃ab[Txavb & Raus & Rbwt]]
MP13 ∀xy[Rxyy ⇒ y ⊆ x]
MP14 ∀B[‖B‖ ≠ ∅]
MP1–5 are the meaning postulates about relation S [11]. MP6–7 are about relation T and exhibit an important feature of LLCM. They amount to saying that if x contains an anaphora slot and x is composed of y and z, then x can also be composed of y and v, where v is the result of conjoining z and the covert u. In other words, a complex sign with an anaphora slot can be represented similarly either with or without the anaphor participating in its syntactic composition. MP8–12 show how categories built with the ternary operator compose with their neighboring categories. MP13 says that the extension over a set of linguistic signs is closed under R. The last postulate is a structural postulate complementary to MP6 and MP7.

Definition 3. AXIOMATIC VERSION OF LLCM
The axiomatic version of LLCM is the system obtained when the following 12 axioms and 4 rules are added to the axiomatic version of L:
A1 A • B|C ⟶ (A • B)|C
A2 A|B • C ⟶ (A • C)|B
A3 A|C • B|C ⟶ (A • B)|C
A4 A • B|A ⟶ A • B
A5 A • C ⟶ [A〈B〉C]
A6 [A〈B〉C] ⟶ A • (B • C)
A7 D • [A〈B〉C] ⟶ [(D • A)〈B〉C]
A8 [A〈B〉C] • D ⟶ [(A • D)〈B〉C]
A9 D • [A〈B〉C] ⟶ [A〈B〉(D • C)]
A10 [A〈B〉C] • D ⟶ [A〈B〉(C • D)]
A11 [A〈B〉C] • [D〈B〉E] ⟶ [(A • D)〈B〉(C • E)]
A12 A • A ⟶ A (monotonicity)
Deductive rules: D1, D2, D3, D4
We can then prove the soundness and completeness of the axiomatic version in a way that closely follows the proof for L in [22] and for LLC in [11]. We will start with axiom 6 and D2. The remaining cases are already proved in [11].
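The statements of the deductive rules are not reproduced in this text. Judging only from how D2, D3, and D4 are used in the proof of the truth lemma in Section 3.1 (each replaces one component of a ternary category along a derivable arrow), they plausibly have the following monotonicity form. This is a reconstruction under that assumption, and D1 is left out because nothing in the text determines it.

```latex
\[
\frac{A \longrightarrow B}
     {[A\langle C\rangle D] \longrightarrow [B\langle C\rangle D]}\ \mathrm{D2}
\qquad
\frac{A \longrightarrow B}
     {[C\langle A\rangle D] \longrightarrow [C\langle B\rangle D]}\ \mathrm{D3}
\qquad
\frac{A \longrightarrow B}
     {[C\langle D\rangle A] \longrightarrow [C\langle D\rangle B]}\ \mathrm{D4}
\]
```

Read this way, the three rules say that the ternary operator is monotone in its left, middle, and right argument, respectively.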

Theorem 1. SOUNDNESS
If LLCM├ A ⟶ B, then for all models M, ‖A‖ ⊆ ‖B‖.

Proof. First we prove A • C ⟶ [A〈B〉C].
Suppose x ∈ ‖A • C‖. Then there are y ∈ ‖A‖ and z ∈ ‖C‖ such that Rxyz. According to MP14 and MP6, there is a u ∈ ‖B‖ such that Txyuz. Hence, for x ∈ ‖A • C‖, there are y ∈ ‖A‖, u ∈ ‖B‖, and z ∈ ‖C‖ such that Txyuz, and thus x ∈ ‖[A〈B〉C]‖.
Now we prove [A〈B〉C] ⟶ A • (B • C).
Suppose x ∈ ‖[A〈B〉C]‖. Then there are y ∈ ‖A‖, z ∈ ‖B‖, and u ∈ ‖C‖ such that Txyzu. According to MP7, there is a v such that Rxyv and Rvzu. Hence v ∈ ‖B • C‖, and thus x ∈ ‖A • (B • C)‖.
Axioms 7–11 stipulate associativity of the ternary category. Here we will prove axiom 7 only.
Now we prove D • [A〈B〉C] ⟶ [(D • A)〈B〉C].
Suppose x ∈ ‖D • [A〈B〉C]‖. Then there are y ∈ ‖D‖ and z ∈ ‖[A〈B〉C]‖ such that Rxyz. Furthermore, there are u ∈ ‖A‖, v ∈ ‖B‖, and w ∈ ‖C‖ such that Tzuvw. MP8 entails that there is a t such that Txtvw and Rtyu. Hence t ∈ ‖D • A‖, and then x ∈ ‖[(D • A)〈B〉C]‖.
Last, we prove A • A ⟶ A.
Suppose x ∈ ‖A • A‖. Then there is a y ∈ ‖A‖ such that Rxyy. MP13 entails that y ⊆ x. Since ⊆ is closed under W, x ∈ ‖A‖.
The proofs for the deductive rules are similar. We leave them to the reader as an exercise.

Theorem 2. COMPLETENESS
If ‖A‖ ⊆ ‖B‖ holds in all LLCM-models M, then LLCM├ A ⟶ B. (We may use ├ to stand for LLCM├ when no misunderstanding arises.)

Proof. We start by constructing a canonical model CM = 〈W, R, S, T, ∼, f, g〉, where W is the set of all LLCM categories and, for all atomic categories p and all categories A, B, C, and D:
f(p) = {A | ├ A ⟶ p}
RABC iff ├ A ⟶ B • C
SABC iff ├ A ⟶ B|C
TABCD iff ├ A ⟶ [B〈C〉D]
A ∼ B iff ├ A ⟶ B
g(A) = A
A ⊆ B iff ├ B ⟶ A
Then we prove the truth lemma below.

3.1. Truth Lemma

In the canonical model CM, it holds for all LLCM categories A and B that A ∈ ‖B‖_CM iff ├ A ⟶ B.

Proof. We prove this via induction over the complexity of B. If B is atomic, it follows directly from the construction of f. If B is constructed by one of the three Lambek connectives or LLC’s anaphoric connective, the proof of the induction step follows that in [11, 21]. We will only show the induction steps for the ternary operator.
(⇒) Suppose that B = [C〈E〉D] and that A ∈ ‖[C〈E〉D]‖_CM. Then, by the definition of ‖[ ]‖, there are y ∈ ‖C‖_CM, z ∈ ‖E‖_CM, and u ∈ ‖D‖_CM such that TAyzu. This means that there are A_1 ∈ ‖C‖_CM, A_2 ∈ ‖E‖_CM, and A_3 ∈ ‖D‖_CM such that TAA_1A_2A_3. It follows that ├ A_1 ⟶ C, ├ A_2 ⟶ E, ├ A_3 ⟶ D, and ├ A ⟶ [A_1〈A_2〉A_3] by the way the canonical model is defined. By D2, D3, and D4, it follows that [A_1〈A_2〉A_3] ⟶ [C〈A_2〉A_3], [C〈A_2〉A_3] ⟶ [C〈E〉A_3], and [C〈E〉A_3] ⟶ [C〈E〉D], respectively. Hence ├ A ⟶ [C〈E〉D].
(⇐) Suppose that ├ A ⟶ [C〈E〉D]. By the way the model is constructed, TACED holds. By the induction hypothesis, C ∈ ‖C‖_CM, E ∈ ‖E‖_CM, and D ∈ ‖D‖_CM. Hence, by the definition of ‖[ ]‖, A ∈ ‖[C〈E〉D]‖_CM.
Then we have to prove that all postulates for LLCM in Definition 2 are fulfilled by the model. They follow directly from the model construction, monotonicity of the product, and the truth lemma above, thus will not be provided here. ■
Finally, we show that all LLCM-valid formulae are derivable in LLCM. Let ‖A‖_CM ⊆ ‖B‖_CM. Suppose that A ⟶ B is not derivable in LLCM. Then A ∉ ‖B‖_CM by the truth lemma. By the identity axiom, ├ A ⟶ A, so A ∈ ‖A‖_CM always holds. It is therefore not the case that ‖A‖_CM ⊆ ‖B‖_CM, which contradicts our assumption. Hence, A ⟶ B is derivable in LLCM.
Hence, the axiomatic system of LLCM is sound and complete.

4. Sequent Presentation of LLCM

In order to establish the decidability of LLCM, we need its Gentzen-style sequent presentation. The sequent presentation of LLCM extends that of LLC with the rules 〈〉R and 〈〉L for the anaphora slot operator 〈〉, and with monotonicity. For simplicity, we omit the labeled λ-terms from the sequent presentation in this section.

Definition 4. SEQUENT PRESENTATION OF LLCM
To prove that the sequent presentation is equivalent to the axiomatic version, Lemma 1 is needed. In addition, a function σ that maps all commas in a sequent to products • is used to ensure the correspondence between sequent antecedents and categories. Thus, σ(A) = A; σ(X, Y) = σ(X) • σ(Y).
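The two clauses defining σ translate directly into a fold over the antecedent. The following is a minimal sketch: the representation of an antecedent as a Python list of category strings and of a product as a tagged tuple is an assumption made for illustration, and the left-nested bracketing is an arbitrary choice that is harmless because the product of L is associative.

```python
def sigma(antecedent):
    """Map a comma-separated antecedent (a non-empty list of categories)
    to a single product category, following
        σ(A) = A   and   σ(X, Y) = σ(X) • σ(Y)."""
    if not antecedent:
        raise ValueError("σ is only defined for non-empty antecedents")
    if len(antecedent) == 1:
        return antecedent[0]                                   # σ(A) = A
    # Split off the last category: X is everything before it, Y the category itself.
    return ("•", sigma(antecedent[:-1]), antecedent[-1])       # σ(X) • σ(Y)

print(sigma(["np"]))
print(sigma(["np", "(np\\s)/np", "np"]))
```

With this encoding, a sequent X ⇒ A corresponds to the arrow from `sigma(X)` to A, which is the form in which Lemma 1 and Theorem 3 state their claims.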

Lemma 1. The arrow σ(Γ, Δ, [A_1〈B〉C_1], Y_1, Z_1, …, [A_n〈B〉C_n], Y_n, Z_n) ⟶ [σ(Γ, A_1, Y_1, …, A_n, Y_n)〈B〉σ(Δ, C_1, Z_1, …, C_n, Z_n)] is derivable in LLCM’s axiomatic version.

Proof. For n = 1, the arrow σ(Γ, Δ, [A_1〈B〉C_1], Y_1, Z_1) ⟶ [σ(Γ, A_1, Y_1)〈B〉σ(Δ, C_1, Z_1)] holds by axioms 7–10 and the way σ is defined.
Suppose that the arrow holds for n = k. By the induction hypothesis, the arrow σ(Γ, Δ, [A_1〈B〉C_1], Y_1, Z_1, …, [A_k〈B〉C_k], Y_k, Z_k) ⟶ [σ(Γ, A_1, Y_1, …, A_k, Y_k)〈B〉σ(Δ, C_1, Z_1, …, C_k, Z_k)] holds in the axiomatic version. Moreover, σ([A_{k+1}〈B〉C_{k+1}], Y_{k+1}, Z_{k+1}) ⟶ [σ(A_{k+1}, Y_{k+1})〈B〉σ(C_{k+1}, Z_{k+1})] holds by A8 and A10. By monotonicity of •, we obtain: σ(Γ, Δ, [A_1〈B〉C_1], Y_1, Z_1, …, [A_k〈B〉C_k], Y_k, Z_k) • σ([A_{k+1}〈B〉C_{k+1}], Y_{k+1}, Z_{k+1}) ⟶ [σ(Γ, A_1, Y_1, …, A_k, Y_k)〈B〉σ(Δ, C_1, Z_1, …, C_k, Z_k)] • [σ(A_{k+1}, Y_{k+1})〈B〉σ(C_{k+1}, Z_{k+1})]
By the definition of σ and A11, we get σ(Γ, Δ, [A_1〈B〉C_1], Y_1, Z_1, …, [A_k〈B〉C_k], Y_k, Z_k, [A_{k+1}〈B〉C_{k+1}], Y_{k+1}, Z_{k+1}) ⟶ [σ(Γ, A_1, Y_1, …, A_k, Y_k, A_{k+1}, Y_{k+1})〈B〉σ(Δ, C_1, Z_1, …, C_k, Z_k, C_{k+1}, Z_{k+1})]

Lemma 1 is thus proved.

Theorem 3. EQUIVALENCE OF AXIOMATIC AND GENTZEN PRESENTATIONS
LLCM├ X ⇒ A iff σ(X) ⟶ A is derivable in the axiomatic version.
The proof is omitted here. The equivalence matters for the same reason that the equivalence of L’s sequent version and its axiomatic counterpart matters to Lambek: decidability is established by Cut Elimination in the sequent presentation, and the result then carries over to the axiomatic version.

Theorem 4. CUT ELIMINATION
If LLCM├ X ⇒ A, then there is a Cut-free sequent proof of LLCM├ X ⇒ A.
To prove this theorem, we have to distinguish three cases: (1) at least one premise of the Cut is an identity axiom; (2) both premises are results of logical rules, and the Cut formula is the active formula in both premises; (3) both premises result from introducing logical rules, and the Cut formula is not the active formula in one premise. The proof is left as an exercise to the reader.

Theorem 5. DECIDABILITY
Derivability in LLCM is decidable.

Proof. For each rule of the Cut-free sequent calculus, the conclusion sequent contains more symbols than its premises, because each formula in a premise occurs as a subformula in the conclusion and each logical rule introduces one connective. In addition, there are only finitely many ways to match a given sequent with the conclusion of some sequent rule. As a result, a bottom-up proof search always has finitely many choices and every branch of the proof tree is finite. Derivability in LLCM is thus decidable.
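The counting argument can be made concrete with a small measure on sequents. The sketch below only illustrates the measure on one instance of Lambek's /L rule, which is standard; it does not cover the LLCM-specific rules, whose sequent formulation is not reproduced in this text. The string encoding of categories and the names `connectives` and `size` are assumptions for illustration.

```python
CONNECTIVES = set("/\\|•〈")   # the ternary slot [A〈B〉C] is counted once, via 〈

def connectives(category):
    """Count connective occurrences in a category written as a string."""
    return sum(1 for ch in category if ch in CONNECTIVES)

def size(antecedent, succedent):
    """Measure of a sequent X ⇒ A: the total number of connectives in it."""
    return sum(connectives(c) for c in antecedent) + connectives(succedent)

# Lambek's /L, read bottom-up:
#   conclusion:  np, (np\s)/np, np ⇒ s
#   premises:    np ⇒ np      and      np, np\s ⇒ s
conclusion = size(["np", "(np\\s)/np", "np"], "s")
premises = [size(["np"], "np"), size(["np", "np\\s"], "s")]

print(conclusion, premises)
print(all(p < conclusion for p in premises))
```

Each premise is strictly smaller than the conclusion under this measure, so a bottom-up search that applies such rules cannot continue indefinitely along any branch.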

5. Tests on More Linguistic Phenomena

5.1. LLCM’s Natural Deduction in Tree-format

Before testing more linguistic phenomena, we present LLCM’s labeled natural deduction in tree-format. As shown in Sections 1 and 2, labeled deduction in tree-format helps to visualize the type-logical deduction of a sentence. If Δ is an n-ary operator, ΔE and ΔI stand for its elimination rule and introduction rule, respectively. Here, we give 〈〉I only.

Definition 5. 〈〉 INTRODUCTION RULE IN TREE-FORMAT
〈〉I enables covert or elided pronominals to be inserted before a structure is constructed. 〈〉E, however, is not given, because 〈〉 is a temporary notational device that shows where the zero pronominal is. The notation disappears when the zero pronoun finds its antecedent and multiplies its interpretation by |E. We can now show the strengths of LLCM on more linguistic constructions that allow zero pronominals.

5.2. Deriving Pros

The anaphora and control constructions listed in (1)–(6) show only the tip of the iceberg of the zero pronominals used in natural languages. Generative grammar distinguishes two types of zero pronominals. In addition to the PROs that are limited to the subject position of a non-finite clause, as in the control constructions (5) and (6), there are zero pronouns that occur elsewhere in less restricted ways than PROs. Linguistic economy allows pronouns to be dropped in many languages, and such dropped pronouns are called little pro. For instance, a subject pronoun in Spanish may be dropped from a tensed clause, as in (9), and in Chinese, both subject and object pronouns may be dropped in similar circumstances, as in (10) and (11). A type-logical system that is to derive correct readings for sentences in these languages should be capable of inserting the elided pronouns in the anaphora slots and multiplying the semantic resource legitimately.
(9) José sabe [que él/pro ha sido visto por María]. [22]
José know that he/∅ has been see by María.
‘José knows that [he] has been seen by María.’
(10) Zhangsan_i shuo [Lisi hen xihuan pro_i/j]. [23]
Joe said Lee very liked [him].
‘Joe_i said that Lisi liked [him_i/j].’
(11) Zhangsan_i shuo [ta/pro hen xihuan pro_i/j].
Joe said he/∅ very like ∅
‘Joe said that [he] liked [him/it…].’

Thus, when the Lambek system is equipped with anaphoric slash and an anaphora slot operator, it becomes as powerful as we would expect it to be. For example, derivation of (9) is illustrated in Figure 5.

In Chinese, both subject and object pronominals in a tensed clause can be dropped. However, the dropped object cannot take the matrix subject as its antecedent; it refers instead to some other entity known in the discourse. For example, in the Chinese discourse (12) below, pro in speaker B’s answer can only refer to the object in A’s question. Ideally, if our system allows the inserted pronominal to search for its antecedent across the sentence border, it can derive (10) in the same way as (12) in Figure 6. The derivation for sentences like (11), which drop both subject and object, is similar.
(12) Speaker A: Zhangsan xihuan huaju ma?
Joe like stage-play Q?
‘Does Joe like stage play?’
Speaker B: Zhangsan shuo ta hen xihuan pro.
Joe said he very like [it].

6. Discussion

LLCM’s labeled natural deduction in tree-format shows that sentences with different kinds of zero pronominals are all derivable in LLCM, be it a PRO with restricted occurrence or a pro-drop with fewer syntactic restrictions. This is a very promising picture. Nevertheless, the situation of pros is more complicated than we have assumed, because they are subject to different restrictions in different pro-drop languages. How to tailor the system to the requirements of different languages remains an open question. It might be a good idea to set up a universal model and assign different parameters to different languages, as in multimodal CCG [24]. We leave this for future work [25].

From the viewpoint of substructural logics, LLCM with the ternary complex category is a specific substructural system. It not only shares structural rules such as the associative law and the commutative law, but also contains monotonicity and a variant of limited contraction, whose axiomatic counterpart is A • C ⟶ A • (A • C). This rule is capable of inserting the left-hand side category A and is exactly what is needed to characterize zero pronominals. However, linguistic facts also show that category B in the anaphora slot may relate to categories outside the slot. In other words, the elided or covert category may bear an anaphoric relation to a category outside the slot. Thus, we stipulate that the three categories in the anaphora slot are distinct from each other and propose the variant of contraction. Further research is needed on the theoretical significance of this variant from the perspective of substructural logics.

Data Availability

The data used to support the analysis of the study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was funded by Major Program of National Social Science Foundation of China (Grant no. 17ZDA027) and National Social Science Fund of China (Grant no. 21FZXB020).