Lexon is SYMBOLIC AI.

The Lexon compiler is an AI tool in the original sense. It helps solve what classic AI research found to be a hard problem: the seeming intractability of natural language. Likewise, currently mainstream analytical philosophy could not crack it. Machine learning isn’t there yet and in a way is ducking the test.

Importantly, Lexon addresses the challenge in a way that is deterministic and scalable. It tackles this problem first and proceeds from there, using techniques that were used in – and in part created for – strong AI.

But contrary to appearance, Lexon is fundamentally not word-centric: it does not operate on the preconceived meaning of words but on the way they are used, as the later Wittgenstein demanded. This leaves behind assumptions of analytical philosophy that informed early AI efforts.

The definition of AI has changed materially over the past decades, not least for marketing and grant-application reasons, and the current use of the label might have been called statistics by the first generation of AI researchers, who were attempting to create artificial general intelligence (AGI), focusing on algorithms rather than data. In the long run, Lexon might provide a key to achieving general AI in combination with machine learning heuristics. In itself, it is exactly the opposite of machine learning: Lexon texts are transparent, self-explanatory and provide full agency. They are performed deterministically rather than with a chance of error. This addresses the problematic shortfalls of machine learning that receive more attention today. That is, Lexon provides what is missing in the current approach to AI.

The potential for change that Lexon unlocks is arguably higher than what machine learning adds to the various industries that make use of it. This is mainly because it is about execution.

The advances in governance, trade and law that Lexon can provide through a new quality of robustness, speed and accessibility might lead to the merging of professions, the empowerment of some and the elimination of others, while increasing productivity and democratic participation in a more direct manner than the change that is driven by machine learning today.

Abstract Syntax Trees

At the heart of Lexon’s power lies its capability to create a special Abstract Syntax Tree (AST), an intermediate format that every compiler builds from its input (program code) and then uses to generate its output (executable programs). So does Lexon.

To quote Wikipedia, an AST is:

"a tree representation of the abstract syntactic structure of source code written in a programming language … ‘abstract’ in the sense that it does not represent every detail appearing in the real syntax, but rather just the structural, content-related details."

The tree reflects the program (or text) that it was created from, which will usually be speaking of things like files, data and algorithms. The novelty is that Lexon creates an AST directly from the prose of a contract, resulting in an AST that consists of subjects, objects and predicates, which in its very structure captures the structure of a document and of natural language, and therefore captures more high-level meaning than the ASTs of other languages. It is because the Lexon language is closer to human language that the resulting AST shapes up closer to human thought and natural language grammar.
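To make this concrete, here is a minimal, hypothetical sketch in Python – not the actual Lexon compiler’s internals; the Clause and Statement names and the example sentence are illustrative assumptions – of what an AST built from grammatical roles rather than machine operations might look like:

    # Hypothetical sketch of an AST whose nodes are grammatical roles.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Statement:
        subject: str        # e.g. "Arbiter"
        predicate: str      # e.g. "pay"
        objects: List[str]  # e.g. ["Fee", "Payee"]

    @dataclass
    class Clause:
        name: str
        statements: List[Statement]

    # "The Arbiter may pay the Fee to the Payee." kept as one high-level node:
    pay_out = Clause(
        name="Pay Out",
        statements=[Statement("Arbiter", "pay", ["Fee", "Payee"])],
    )
    print(pay_out)

The point is not the data structure itself but that the grammatical roles – who does what to whom – survive as first-class nodes of the tree.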

This allows for output that is likewise closer to how humans communicate, because with Lexon, relationships between entities are stored in the way that humans reason about them. And as is well known today, this metadata – the relationships – can be very powerful.

Lexon crosses a crucial threshold by being fully conformant to natural language grammar. This results in a quantum leap that does not happen as long as a language just edges close. It does not matter in this regard that Lexon's readability is achieved by defining a subset of natural language grammar, instead of being able to handle just any prose.

Beyond the grammar, the document structure also plays a key role for Lexon’s readability. It is likewise reflected in a Lexon AST.

Everything else follows from there. Lexon’s parsing and compilation process to create an AST is not special per se; every compiler does this. But because of its higher abstraction level, the Lexon AST captures something that ASTs normally do not: the abstract meaning of a text.

A Thought Experiment

The difference in abstraction levels can be illustrated by contrasting Lexon to the programming language of choice for Ethereum, Solidity. A frequent question is whether one could also (automatically) translate from Solidity to Lexon – i.e., the other way around from what Lexon is made for. This is an understandable request, as many projects have invested in Solidity programs and now realize that it would be nice to have them as readable as Lexon code.

The question is interesting in many ways. However, the answer to what this question really means is no. Technically it would work. It is certainly possible to automatically translate Solidity into Lexon, because both are Turing-complete languages. There should not be any problem.

But the answer is “no” regarding the intent of people’s question, which is: would it be possible to get nicely readable Lexon code from such an inverse translation? That would not quite be the case.

The result would be more readable than the Solidity code, offering the same zero learning curve regarding both grammar and vocabulary. But it would still not be intelligible to non-programmers, because it would show the low-level structure typical of Solidity code, not the human thought behind the program, not the 'business logic'.

Smart contracts of realistic complexity would turn out logically convoluted – reflective of the Solidity code – and would lack the essential feature of Lexon code: being not only written in natural language words but also structured in a coherent way, e.g., such that it can pass as the prose of a legally enforceable contract.

Code translated from Solidity would not – as original Lexon code does – express in clear terms the meeting of the minds that a judge will be looking for, and without which there may not be a legally binding agreement in the first place.

So, while you would end up with a working Lexon program, it would be almost as unreadable for non-programmers as Solidity code always is. The logic presented would still force people to think like programmers. It is precisely the ‘meaning’ that the Lexon AST captures that cannot automatically be added when a Solidity program is the starting point.

Because the Solidity code simply lacks this higher level of abstraction. It is actually the first step of a Solidity coder’s work to leave this level behind and strip it out. This is true for practically every programming language.
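A small, purely illustrative contrast (in Python pseudo-data, with hypothetical names; neither snippet is real Solidity or real Lexon output) shows what gets lost:

    # Purely illustrative: what a Solidity-level AST of one action tends to record –
    # guards, assignments, transfers – versus the single sentence a Lexon-level
    # AST keeps intact. All names are hypothetical.
    low_level = [
        ("require", "msg.sender == arbiter"),
        ("assign", "amount", "fee"),
        ("call", "payee.transfer(amount)"),
    ]

    high_level = ("Arbiter", "may pay", "the Fee", "to the Payee")

    # Going from low_level back up to high_level cannot be automated faithfully:
    # the grouping of these operations into one legal statement was never recorded.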

A More Meaningful Level of Abstraction

In the end, what makes the difference is not just that Lexon’s vocabulary stays closer to plain English, and not just that Lexon’s grammar is more natural than that of other programming languages, but that Lexon’s abstractions operate on a higher level.

And this results in ASTs that express meaning that is not present in source code written in Solidity, and therefore also not in Solidity's ASTs.

This is the essential bit that would be found missing when translating Solidity back to Lexon.

Because until now, as a required first step when programming, this high-level meaning of a program has been shredded into finer-grained pieces by the programmer and rendered unrecognizable. Amazingly, it is the business logic of a program itself that routinely does not survive the impact.

C. Lopes et al. in their paper Toward Naturalistic Programming write to this effect:

"Researchers are constantly looking for ways to express the programs in a form that more closely follows the way programmers think before they are forced to break their thoughts in operational details imposed by the existing programming languages. We know that this is possible, because when programmers are asked to explain their code, they do so concisely, skipping operational details, sometimes using a thought flow that is quite different from the control flow in the code".

The building blocks that programs are commonly created from today are just too subtle to capture the higher level. The more so, the lower-level (early-generation) the language is.

As a metaphor, the difference could be described as molecules vs. atoms: Solidity loses (or never has) the information about how the atoms are interconnected, therefore does not have the notion of molecules, and therefore cannot reflect it for a reader to see. Solidity programs may not lack functionality, but the Solidity AST will only talk of O and H, and not of H2O, as the Lexon AST does.

The connection of O and H would exist implicitly in the Solidity program. But in the Lexon AST, in this metaphor, the H2O molecule would be spelled out explicitly.

Thus, because of the high level that Lexon has as a language, the meaning of text written in Lexon is captured in Lexon’s ASTs in a way not present in the ASTs of a lower-level language, such as any mainstream programming language or blockchain smart contract language today.

Artificial Intelligence Tooling

Machines will be capable, within twenty years, of doing any work a man can do. — Herbert A. Simon, 1965

While Lexon is no attempt at sentience, it owes its capabilities to using the models and tools developed for strong AI.

That’s not special; all modern programming languages do that. But Lexon uses them in an unabashed, back-to-the-roots style that brings compiler tech full circle. This circle started many decades ago with a new style of natural language linguistics, which entered computer science, and now is used for natural language again:

ASTs are generally created by programs (compilers) that implement grammars, which are defined using Backus-Naur Form (BNF), which was invented to describe the grammars of programming languages, and was itself based on Context Free Grammar (CFG) as popularized by the linguist Noam Chomsky:

CFG ⟶ BNF ⟶ grammar ⟶ compiler ⟶ AST

The linguistic research this came out of was in fact machine-oriented: Chomsky’s MIT work at the time was financed by the DoD in the hopes that it would produce natural-speech-guided weapons systems.

While CFG was intended to help understand natural language better, it was instead very successfully used to create a notation for programming languages, BNF, which became the standard for describing the grammars of computer languages, first among them ALGOL in 1960.

As for linguistics, CFG turned out to be not powerful enough to describe natural languages and the field moved on. Chomsky has long left this approach behind. No speech-controlled weapons systems were developed either. But in computer science, the model of CFG thrived. In the form of BNF it has been in use for 60 years now to express the grammars of languages of the ‘3rd generation’ – the likes of C, C++, Java – but also of the more logic-leaning languages like Lisp and Prolog that once had the hopes for strong AI riding on them.

The ‘full circle’ is that Lexon applies CFG, in the form of BNF, back to natural language, where the model came from: to create a programming language in the intersection of what is expressible in natural language and what is parseable by a machine.

So that a program can be expressed in a way that reads as easily as natural language but can also be conveniently processed by a computer.
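As an illustration of that circle – and only as an illustration, since Lexon’s real grammar is far larger – here is a toy grammar for one controlled-English sentence form, written as a hand-rolled Python parser that produces a small AST of grammatical roles:

    # Toy sketch of: sentence ::= 'the' SUBJECT PREDICATE 'the' OBJECT '.'
    # This is not Lexon's grammar; it only illustrates applying the
    # CFG -> BNF -> grammar -> compiler -> AST chain to controlled English.

    def parse_sentence(text):
        tokens = text.replace(".", " .").split()
        subjects = {"payer", "payee", "arbiter"}
        predicates = {"pays", "appoints", "returns"}

        def expect(condition, message):
            if not condition:
                raise SyntaxError(message)

        expect(len(tokens) == 6, "expected six tokens")
        expect(tokens[0] == "the" and tokens[3] == "the", "expected articles")
        expect(tokens[1] in subjects, "unknown subject: " + tokens[1])
        expect(tokens[2] in predicates, "unknown predicate: " + tokens[2])
        expect(tokens[5] == ".", "expected full stop")

        # The resulting AST keeps the grammatical roles intact.
        return {"subject": tokens[1], "predicate": tokens[2], "object": tokens[4]}

    print(parse_sentence("the arbiter pays the fee."))
    # -> {'subject': 'arbiter', 'predicate': 'pays', 'object': 'fee'}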

A More Elegant Stack

Regarding natural language processing, Lexon's approach cancels out the layer of the computer language itself. Originally,

(1st layer) BNF would be

(2nd layer) used to define a language like Lisp

(3rd layer) that would then be used to program AI, in Lisp

(4th layer) that would then process natural language.

The theory was that thought was something behind language, separable from it, and could be captured on the 3rd layer, the program. In other words, in the standard approach to AI in the 70s, the processing of natural language would have been the subject of a program written, e.g., in Lisp, and would reside on the 3rd layer in some data structure suitable to reflect human thought.

Lexon does not go for sentience and does not try to capture thought in analyzed form on the 3rd layer, but uses natural language more directly, one layer deeper. With Lexon, it is the grammar of Lexon itself where natural language comes into play, on layer 2:

(1st layer) BNF is used to

(2nd layer) define a controlled natural language, i.e., Lexon

(3rd layer) to write digital contracts in natural language.

In Lexon’s stack natural language grammar is pervasive, reigning across all three layers. AI and language parsing are not payload of the program on layer 3. They are the basis of all layers.

This works because BNF itself was modelled on CFG, which was invented to describe human languages. While it may not be the path to machine awareness, this is useful. What falls by the wayside is a separate programming-language grammar in the image of Frege's Begriffsschrift, as they all are. What’s crucial about that is that a translation step is cut out. And not just one translation step – the one that hoped to translate meaning from language to math structures – but all translation steps of meaning. Because there is no other. Lexon just keeps language intact as the store of meaning.

With Lexon, the place of natural language is directly adjacent to BNF, i.e., supported directly by the tool modeled on CFG, instead of using BNF to build a non-natural language that then is used to program a program to process natural language.

Whoever has the taste for it will admit that this is a more elegant and promising stack. With Lexon, ‘meaning’ is processed on the level comparable to the Lisp program code, instead of on the level of runtime data (in the Lisp stack). Lexon drops the idea of separating intelligence and language and of expressing thought in anything other than natural language. It doesn't try to conjure magic operating 'behind' the veil of language, expressing its meaning in math.

This touches on a deep and controversial question of linguistics: is there, for humans, a neutral representation of reason 'behind' language? One that can intuitively be imagined to be the common well of speech no matter in what language a polyglot expresses herself?

Leibniz thought so but could not find it. Humboldt felt that thought could only exist in language and Loglan was invented to find out if a better language would allow for better thought. Orwell had no doubt that language was needed for thinking to the degree that degrading language could make thinking impossible.

Chomsky subscribed to the hypothesis that an innate faculty of speech existed that would then give rise to language, but later moved away from this view.

Leibniz specifically proposed his Characteristica Universalis as a necessary symbolism that would have to be discovered first, to express pure reason in it, cleaned of the peculiarities of natural language, so that one could automate reasoning.

Suffice it to say that linguists cannot agree and the 20th century saw a back and forth between the theories.

Strong AI research in the 70s very much assumed that sentience should be achievable on a more mathematical level than language. Lexon is a late but timely complement to that – mirroring the more neglected half of linguistic research, which posits that thought might not be separable from language.

And this is one of the rare things Lexon shares with generative AI: that it does not try to unpack meaning. It stays with words.

Preservation instead of Decomposition

This may provide an alternative answer to the 70s quest to find a manageable way to have programs self-modify – something that inevitably made programs impossible to debug and was therefore abandoned (if with a heavy heart).

Self-modification looked promising because the thinking went: for it to be AI, something more than what the programmers put in would have to come out. Not just more numbers or words but more insight, more logic, a new structure. In that light, what would be more plausible than to suspect this 'more' to be found in newly, self-created code?

If the third layer (above), the Lisp program, could only reflect upon itself – modify itself even – it could perhaps produce emergent results on layer 4.

This was an attempt to break through the limitations of the standard von Neumann architecture of computers that separates code from data, i.e., the program from its subject.

Lexon follows a different path: it steps out of the way.

Lexon does not add anything, but instead preserves the structure of the input so well that the output bears a stronger semblance to human communication. The very translation step is cut out that went from input to meaning and back – i.e., from language to math and back to language.

There is still processing, a transformation from input to output. But no attempt to transform thought and logic, expressed in human language, into a condensed essence and back. No attempt, that is, to create 'intelligence.'

But in so far as a program takes human input to produce meaningful results, Lexon can transport more of the human-understandable mesh of meaning from input to output, intact. This has practical benefits ranging from improved communication about code when writing it; through the long-elusive prize of self-documentation; to a quantum leap in frontend generation; and to literally involving different parts of the brain in programming.

AI Safety and Data Protection

Lexon should be the language that lawmakers use to articulate the Robotic Laws that we need now. The advantage of using Lexon is that hardware producers can be obliged to build in the very code that lawmakers created. The law itself, verbatim, then is program code. The room for honest and dishonest mistakes that usually separates the patient letter of a law from its implementation is eliminated. By the same token, data protection algorithms can be made transparent and mandatory for social media and data processing organizations, public and private alike. This may be the most important application of Lexon.

From the LEXON BOOK.