This FAQ is not intended to explain in detail how Babble works; the best source of information for that is the white paper. But for those not inclined to read the white paper, yet still curious, we’ll try to give a simple explanation of what tridbit technology is, and answer other common questions.

If you have a question you think would make a good addition to this FAQ, please email it to us.


Questions


1) Can you summarize how tridbit technology allows Babble to understand natural language?

2) How can Babble know the grass is green when it can’t see?

3) Is there anything Babble can’t understand?

4) Is Babble alive?

5) How can Babble understand very complex and even contradictory ideas using simple structures like tridbits?

6) What is Babble written in?

7) How long did it take to write Babble?

8) How does tridbit technology compare to a conventional relational database?


Answers


1) Can you summarize how tridbit technology allows Babble to understand natural language?
   

Tridbit technology is a method for representing and working with information using structures called tridbits. A tridbit is a basic unit of information. It has three major elements, which is why it is called a tridbit. There are several types of tridbits.

The most fundamental type of tridbit is a referent tridbit. It represents a thing, event or property value. It can represent an individual, group, subset or entire category, depending on its scope and count characteristics.

An assert tridbit asserts relationships involving its three elements, which are usually referent tridbits. There are two types of assert tridbits. An assert attribute tridbit might represent the information that the color of apples is red or Matt is the subject of a drive event. An assert compare tridbit might represent the information that apples are sweeter than lemons or Matt is like Mario Andretti.
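The three-element layout described above can be sketched in a few lines of Python. This is purely illustrative: the class and field names, and the simplified scope values, are invented for this sketch and are not taken from Babble's actual (Delphi) implementation.

```python
from dataclasses import dataclass

# Illustrative sketch only: names and scope values are invented here,
# not taken from Babble's actual implementation.

@dataclass(frozen=True)
class Referent:
    concept: str              # e.g. "apple", "red", "drive"
    scope: str = "category"   # "individual", "group", "subset", or "category"

@dataclass(frozen=True)
class AssertAttribute:
    subject: Referent         # what the attribute is asserted about
    attribute: str            # e.g. "color"
    value: Referent           # the attribute's value

@dataclass(frozen=True)
class AssertCompare:
    left: Referent
    relation: str             # e.g. "sweeter-than", "like"
    right: Referent

# "The color of apples is red"
apples, red = Referent("apple"), Referent("red")
color_fact = AssertAttribute(apples, "color", red)

# "Apples are sweeter than lemons"
comparison = AssertCompare(apples, "sweeter-than", Referent("lemon"))
```

Note that each assert structure has exactly three major elements, mirroring the three-part layout that gives tridbits their name.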

Babble uses tridbits to represent the meaning expressed by natural language. The constraints within the tridbit structures allow only certain arrangements of three elements to make sense. This allows Babble to figure out the meaning of sentences even when the same syntactical pattern can result in multiple interpretations. For example "Red is a color" gives the information that red belongs to the category of colors. "Red is a rose" has the same syntactical pattern, but would not be interpreted as red belongs to the category of roses.

For Babble, understanding natural language involves figuring out how to go from the words of a sentence to tridbits that represent the meaning of that sentence. Babble uses its knowledge of words (a.k.a. a dictionary) and their order (a.k.a. syntax rules) to guide this mapping of natural language to tridbits.

But the ability to represent knowledge in some form and the desire to express that knowledge precipitated the need for words and syntax rules. We believed that if we developed a really capable model for representing the structures and processes of information, the mapping of natural language to meaning would fall into place. Tridbit technology is that model.

The actual structures and processes used in tridbits are subtle, though not extraordinarily complex. They define very specific methods for dealing with information that bring to mind aspects of set theory, logic, linguistics and programming. To really understand how this works one should read the white paper.


2) How can Babble know the grass is green when it can’t see?
    The same way we know that a battery has a certain charge even though we can’t sense it. We understand that things have different attributes such as color, shape, temperature or charge. We can process that information without observing it. Of course observation gives us a richer experience of the attribute, so Babble’s experience of grass is not very rich. This is discussed in more detail in the white paper.


3) Is there anything Babble can’t understand?
    There are many things the current version of Babble can not understand, and even more that it is capable of understanding but has not been taught. The best way to get a sense of what things Babble might have trouble understanding is to list the reasons that Babble might not understand a statement. These fall into three categories, listed below:
  1. Babble is not familiar with one or more words being used.
  2. Babble is not familiar with the syntax being used.
  3. Babble can not represent the underlying meaning of the sentence.

In addition to the reasons above, there are several more reasons why Babble may not give the desired response to a question or command:

  4. Babble does not possess sufficient information to answer the question.
  5. Babble has sufficient information, but does not know how to reason out the answer.
  6. Babble is unable to perform the requested command.

Let’s address each of these in turn.

  1. Babble is not familiar with one or more words being used.

This happens fairly often, but does not prevent Babble from understanding the statement, if one is willing to train it. When you use a word that is in Babble’s dictionary, but that Babble has not encountered before in actual use, Babble asks for information about the word. As Babble experiences more words, it will rarely need to ask.

Babble can add new words to its dictionary, although at the moment, words are restricted to a sequence of letters. That is why Babble can’t deal with contractions, hyphenated words, digits, etc. In addition, Babble can not currently understand multi-word concepts or idioms, where the meaning of the phrase does not derive from the individual words. These will also eventually be handled as dictionary entries. We hope to make these enhancements to Babble's dictionary-related functions in the next several months.

  2. Babble is not familiar with the syntax being used.

Syntax rules guide Babble in converting patterns of words into tridbits representing their meaning. Sometimes Babble does not find a set of syntax rules it can apply to a sentence so that it makes sense. Generally it responds to this situation by stating "I’m confused".

It’s quite possible that the sentence truly doesn’t make sense. But at this stage of development, it’s more likely that Babble has encountered a syntax pattern that it has not yet learned. Often one can come up with an alternative way of phrasing the sentence to express the same meaning. For example, Babble’s syntax rules do not currently tell it how to generate tridbits from a "with" clause. So Babble will be confused by the first sentence below, but could understand the second:

I dug with a shovel.
I dug using a shovel.

It is not difficult to teach Babble new syntax rules. It would take just a few minutes to train Babble to understand the "with" clause used above and may well be done by the time you read this. To teach Babble a new syntax rule, you must use its native interface. You highlight the trigger pattern that occurs during Babble’s processing of the sentence. You then show Babble how the information in the pattern is used to map it into one or more tridbits.
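As a rough analogy (this is a toy illustration, not Babble's native interface or its actual trigger-pattern format), a syntax rule can be pictured as a trigger pattern paired with a recipe for emitting tridbit-like triples:

```python
import re

# Toy illustration of a syntax rule: a trigger pattern plus a recipe that
# maps the matched words into (subject, attribute, value) triples.
# The rule format and triple layout are invented for this sketch.

RULES = [
    # "<X> dug using a <Y>" -> a dig event with X as subject, Y as instrument
    (re.compile(r"^(\w+) dug using a (\w+)\.?$", re.IGNORECASE),
     lambda m: [("dig-event", "subject", m.group(1)),
                ("dig-event", "instrument", m.group(2))]),
]

def parse(sentence):
    """Return tridbit-like triples for the first rule that matches."""
    for pattern, emit in RULES:
        m = pattern.match(sentence)
        if m:
            return emit(m)
    return None   # no applicable rule: "I'm confused"

# parse("I dug using a shovel.") -> two triples
# parse("I dug with a shovel.")  -> None, until a "with" rule is taught
```

Teaching a new rule then amounts to adding another pattern/recipe pair, much as Babble's native interface pairs a highlighted trigger pattern with its mapping to tridbits.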

In a way, this is kind of like how humans try to convey meaning to someone who doesn’t understand their language. They use an alternative method, i.e. gestures or illustrations, to convey the information as they enunciate the appropriate pattern of words. Eventually through pairings of words and information, the listener learns how and in what context the patterns of the new language are used to represent meaning.

Thus, Babble is able to learn the syntax rules it needs as it encounters new language experiences. The system for processing syntax rules has been enhanced several times and is quite good at detecting the patterns of interest within sequences of words. At this time, however, it can not backtrack all the way to reversing its choice of word use. Interestingly, this has not been much of a priority because Babble has understood what it’s been told thus far quite well without it. There are at least three reasons for this.

First, Babble can make extremely good guesses about word use from the context of the surrounding words. If it’s wrong, it can be corrected.

Second, Babble distinguishes word use into fewer categories than the traditional parts of speech. For example, the word "wood" has only one common word use for Babble: a noun representing the fibrous material obtained from trees. Even in the phrase "the wood desk," wood is treated as a noun by Babble. With fewer choices, Babble has a better chance of choosing correctly.

Third, Babble’s working vocabulary is still small. In other words, it’s only a matter of time before a golfer tells Babble, "Tiger used the wood for the shot," and Babble’s first choice for the word "wood" would not be the best. Of course, by the time Babble’s reasoning abilities are good enough to figure that out, reversing its choice of word use would be a simple matter by comparison. In any case, Babble’s syntax processing system will need to be enhanced to allow for word use reversal.

Eventually Babble may encounter expressions that require other syntax system enhancements. Nonetheless, there is little reason to suspect that it would not be possible to extend the system to process any syntactical pattern that is used in natural language. The real challenge and driving force has been the tridbit structures that result from the syntax rules being processed. Once those were worked out, the mapping to these structures fell into place using sophisticated, but not extraordinary parsing, pattern representation and interpretation.

These first two reasons are the most common reasons for Babble not understanding a sentence. But they are easily remedied, most often by having the person interacting with Babble teach it the new word or syntax pattern. As Babble’s exposure to language increases, the likelihood of encountering a word or syntax pattern it hasn’t learned will decrease, until it becomes quite unlikely.

  3. Babble can not represent the underlying meaning of the sentence.

Unlike the first two reasons, reason number three, if it could not be resolved, would be a show stopper. It would provide a counter example to the premise that tridbits can represent any information expressed via natural language.

The tridbit model has certainly evolved and expanded as broader types of speech examples were considered. The current model is pretty well worked out and has been able to represent the meaning of a broad range of speech examples. Babble will likely run into a few more speech behaviors that will test the ability of tridbits to represent meaning, and perhaps necessitate some further adjustments. However, it seems unlikely at this point that it will run into speech behavior whose meaning can not be represented at all.

This is a good point to ask the philosophical question of whether meaning is really equivalent to stored information. In other words, do we lose something by extracting information from the meaning of human speech and storing it as tridbits? I believe there are branches of mathematics that deal with the equivalency of different systems of representation. But here we are comparing a digital knowledge representation system with a human one. Though we are intimately familiar with expressing meaning using our human system, on another level we are quite ignorant of how it actually works.

Rather than get bogged down in this murky territory, I have used a very practical approach to determining whether a representation captures the meaning of a speech example. A knowledge representation is assumed to capture the meaning of a speech example if a system using that representation can provide the same answers to inquiries about the speech as a native speaker would.

This is the standard used in developing tridbit technology. By this definition, it is hard to conceive of a speech example that could not be represented using tridbits. It is probably necessary to read the white paper and understand the details and mechanisms entailed in tridbit technology to convince oneself of this.

It is also only fair to mention that the mechanisms for representing several important concepts are worked out in the white paper, but not yet implemented. The most fundamental of these are truth, certainty and comparisons. Thus Babble can’t currently understand assertions that are not true and certain, for example, "Susan will probably not run." Nor can Babble currently understand information that compares, rather than asserts attributes, such as "Nanda is louder than Merl," although it already knows that Nanda is loud and Merl is quiet. These abilities should be implemented in the very near future.

  4. Babble does not possess sufficient information to answer the question.

When you ask Babble a question such as "What color are cherries?" it is quite possible that Babble simply does not have the information in its knowledge base to answer. In this case it will respond with "I don’t know".

At the moment, Babble has a fairly limited knowledge base. It knows state capitals, a few colors, fruits, flowers and animals, but not much else. It adds to this knowledge base by being told things, such as "Cherries are red" and then given the go ahead to remember the information it was told. The last step is required because while Babble is being developed, it is not desirable for it to store all the odd test cases and not fully debugged results. If Babble is not told to remember something, it will know the information only during the current conversation. So once Babble is told "Cherries are red," it can answer the question "What color are cherries?"
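The tell/remember/ask cycle described above can be mimicked in a few lines. The function names and the flat triple store are invented for illustration; they are not Babble's internals:

```python
# Sketch of "tell, then remember, then ask": facts live in a session store
# until a remember step copies them to the permanent knowledge base.
# All names here are illustrative.

session, knowledge_base = [], []

def tell(subject, attribute, value):
    session.append((subject, attribute, value))   # known for this conversation

def remember():
    knowledge_base.extend(session)                # the explicit "go ahead"

def ask(attribute, subject):
    for s, a, v in session + knowledge_base:
        if s == subject and a == attribute:
            return v
    return "I don't know"

tell("cherries", "color", "red")
print(ask("color", "cherries"))   # -> red
print(ask("color", "plums"))      # -> I don't know
```

Without the `remember()` call, the cherry fact would vanish when the session store is cleared at the end of the conversation.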

There is a lot of world knowledge Babble will need to learn. We hope to train it both by having it interact with people, such as playing the "give me a clue" game, and also by inputting information from textual sources.

  5. Babble has sufficient information, but does not know how to reason out the answer.

If Babble knows, "Anne is the mother of Maurice," it will be able to answer, "Who is the child of Anne?" Babble did not have the information about Anne’s children stored directly, but was able to reason out the answer because it understands how the concepts of mother and child are related. If we similarly tell Babble, "Mitch is the father of Maurice," it will not be able to answer "Who is the child of Mitch?" because it does not know how father and child are related.
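A hedged sketch of that inference: the mother/child consequence rule is hand-coded below, just as Babble's father/child rule would currently have to be. The dictionary-based rule table is an illustration, not Babble's tridbit encoding:

```python
# Sketch of reasoning out "Who is the child of Anne?" from
# "Anne is the mother of Maurice". Names and layout are illustrative.

facts = [("Anne", "mother-of", "Maurice")]

# Consequence rules: a relation and the inverse relation it implies.
# Hand-coded here, as Babble's father/child rule must currently be.
inverses = {"mother-of": "child-of"}

def who_is(relation, person):
    # Direct lookup first
    for s, r, o in facts:
        if r == relation and o == person:
            return s
    # Otherwise reason it out through a known consequence rule
    for s, r, o in facts:
        if inverses.get(r) == relation and s == person:
            return o
    return "I don't know"

print(who_is("child-of", "Anne"))   # -> Maurice, inferred
print(who_is("child-of", "Mitch"))  # -> I don't know: no father rule yet
```

Adding `"father-of": "child-of"` to the rule table is the sketch's equivalent of teaching Babble how father and child are related.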

We can teach Babble the relationship between father and child. It is stored in an assert attribute tridbit that relates a condition, in this case X being the father of Y, to a result, Y being the child of X, using a consequence attribute. But since we haven’t yet tried teaching Babble the syntax to understand if-then relationships through language, the tridbits need to be hand coded into Babble’s knowledgebase – not very user friendly.

The if-then rules that drive Babble’s current reasoning abilities are tridbits, the same as any other tridbits generated as a result of natural language. Expanding Babble’s reasoning abilities will eventually be a simple matter of telling Babble how conditions and results are paired. For example, "If the sun is out it is daytime," "If a person smiles they are happy," etc. Further down the road it should be possible for Babble to infer its own if-then rules through observation.

It is appropriate for Babble to store if-then rules in tridbits since clearly the information can be expressed using natural language. But using these rules for reasoning seems a separate endeavor from purely speech oriented activities, crossing some line between understanding speech and thinking. Of course most people prefer speaking with someone capable of thinking.

Babble’s reasoning abilities were put in because it turned out that some rudimentary reasoning is needed to answer a sequence such as:

Jumbo is big
What size is Jumbo?

Since we weren’t told "The size of Jumbo is big," we need to invoke our knowledge that big is a size to properly answer the question. And who would want to talk to someone who wasn’t smart enough to answer:

Jumbo is an elephant.
Elephants are big.
What size is Jumbo?

And thus Babble was given a moderately capable reasoning system that fits very nicely within the rest of the tridbit technology, exactly where it belongs. Because it sits on top of a system of meaning representation, many nice things fall into place. For example, because of Babble’s ability to understand the scope of a referent, it is perfectly natural to give a specific individual a property, such as "Jumbo is small," which is at odds with the general property given to the individual’s category, i.e. "Elephants are big." All other things being equal, Babble will give the properties assigned to the individual priority, while still being able to observe the dissonance between the two.
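That priority rule can be sketched directly. The lookup tables and the second elephant are invented for illustration; only the Jumbo/elephant example comes from the text above:

```python
# Sketch (invented names) of how a property asserted about an individual
# takes priority over the property inherited from its category.

category_of = {"Jumbo": "elephant", "Clyde": "elephant"}
properties = {
    ("elephant", "size"): "big",   # "Elephants are big"
    ("Jumbo", "size"): "small",    # "Jumbo is small" - at odds with the above
}

def size_of(thing):
    # An assertion about the individual wins if one exists...
    if (thing, "size") in properties:
        return properties[(thing, "size")]
    # ...otherwise fall back to the individual's category.
    return properties.get((category_of.get(thing), "size"), "I don't know")

print(size_of("Jumbo"))   # -> small: the individual's property has priority
print(size_of("Clyde"))   # -> big: inherited from the elephant category
```

Both assertions remain stored, so the dissonance between them is still observable even though the individual's property wins the lookup.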

I describe Babble’s reasoning ability as moderately capable because there are many aspects of it that need to be enhanced before Babble could be considered a good thinker. Babble should be able to evaluate many conditions, each of which must be evaluated for truth, certainty and importance in order to generate a result with its own appropriate truth and certainty values. Babble also needs to be able to reason about properties in a way that is different from if-then reasoning. For example, if Babble knows Merl is big and Nanda is small and they are both cats, it should know Merl is bigger than Nanda. It should also know that if Merl is 10 lbs. and Nanda is 9 lbs., Merl is 1 lb. heavier than Nanda. Even with these capabilities in place, it remains to be seen how human-like such thinking would be.

  6. Babble is unable to perform the requested command.

Actually, at the moment, there is only one command Babble can perform: playing the "give me a clue" game. If you were to give Babble a command such as:

Drive me to the store.

It would respond with:

I can’t drive.

We have no immediate plans to teach Babble how to drive, however we have discussed the desire to have Babble respond to commands both in the white paper and under Smart Operating Systems in the Future Uses section of this website.


4) Is Babble alive?
   

I don’t think so, but I suppose it depends on one’s definition of being alive. Babble is not made of cells, nor does it reproduce. But Babble does consume resources, interact with its environment in an intelligent way, and it is self-aware, at least to the extent that it can answer the question "Who are you?"

In order to be able to use first and second person pronouns, Babble was given a form of self-awareness. As it turns out, it was much easier to program Babble to understand pronouns referring to itself than to the speaker with whom it is interacting. The latter required the user to log in so Babble would have a first and last name for the speaker. But before Babble could understand first and last names, general qualified attributes (terms defined in the white paper) had to be implemented. And just when it seemed Babble could finally process "The last name of the speaker is Blaedow", the dual nature of names became apparent. In Babble’s knowledgebase, a name like Blaedow represents a thing, in this case a specific person. But the sentence "The last name of the speaker is Blaedow", should assign the name Blaedow, not the person Blaedow as the last name of the speaker. Names are much trickier than they might appear!

By comparison, self-awareness was a piece of cake. Babble wakes up with an intrinsic tridbit that represents itself. When it processes information that refers to Babble, the information gets referenced to that tridbit. Any observation Babble is able to make about its internal world would also reference this tridbit. For example, if Babble could sense temperature, that information might appear in Babble’s stream of consciousness as a tridbit asserting the temperature of me is some value, where me is represented by the intrinsic self tridbit.
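The intrinsic self-referent can be sketched as follows. The dictionary layout and function names are illustrative, not Babble's actual structures:

```python
# Sketch of the intrinsic self-referent: every observation about Babble's
# internal state points back at the one "me" tridbit created at startup.
# All names here are illustrative.

SELF = {"kind": "referent", "concept": "Babble"}   # the intrinsic self tridbit

stream_of_consciousness = []

def observe(attribute, value):
    # Internal observations are asserted about the intrinsic self referent:
    # "the <attribute> of me is <value>"
    stream_of_consciousness.append({
        "kind": "assert-attribute",
        "subject": SELF,
        "attribute": attribute,
        "value": value,
    })

observe("temperature", "warm")    # hypothetical sensor reading
```

The key point is that every observation shares the identical `SELF` object, so all self-knowledge converges on a single referent.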

One final thought on whether Babble is alive. Given that Babble has some level of intelligence, meaningful interaction with its world and self awareness, it gets harder to define why it’s not alive. I believe that this technology will progress to the extent that computers will have the ability to interact very intelligently with humans using natural language and have very advanced reasoning abilities, even beyond ours in many ways. Despite this, I don’t believe any amount of program enhancement could give Babble the same quality that spiritual beings have. I find some characteristic of spiritual beings, a spark of intentionality or will, that I do not know how to instill in a mechanical system such as Babble. If this were possible, if some future version of Babble could be made into a live spiritual being, then I should be able to take this program (spiritual Babble), load it onto an ordinary, inanimate, deterministic computer and … bring it to life? Anyone claiming life is just a very complex mechanical process had better be prepared to explain the type and amount of enhancement required to turn Babble into a living being, and then a spiritual being (assuming one believes there's a difference), so we can make those discriminations when they happen.


5) How can Babble understand very complex and even contradictory ideas using simple structures like tridbits?
   

While a single tridbit represents only the simplest unit of information, it is the cumulative effect of hundreds or thousands of tridbits stored in a knowledge base that allows complex and even contradictory knowledge to be understood. Tridbit technology includes methods to keep this data organized and maintain its integrity. These methods include referent reduction, instantiation of concepts and general qualified concepts, which are described in the white paper.

An example might help. Let’s say Babble is told by Alice that milk is healthy because it prevents osteoporosis. Babble stores this as one tridbit assigning a property of healthy to milk, two more to define milk as the subject of a prevent event and osteoporosis as its object, and a last one to assign milk being healthy as the consequence of the prevent event. It would also be a good idea for Babble to add an assertion assigning Alice as the source of these tridbits, since the current information about milk and health is very complex and contradictory.

Later Babble reads an article by Walter stating that milk does not prevent osteoporosis. This generates a tridbit representing a completely contradictory prevent event. But the subject and object of both prevent events will be the same, the general concepts of milk and osteoporosis. So when Babble goes to search for information about milk or osteoporosis, both prevent events will be retrieved. Babble will need to evaluate in some fashion what it should conclude, if anything, from such contradictory sources. It might make this judgment using other attributes of the tridbits, such as their certainty values, how strongly reinforced each is or the source of the tridbits.
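A sketch of that situation: two contradictory prevent events, each tagged with its source, both retrieved by a search on the shared concepts. The field names are invented for illustration:

```python
# Sketch (invented layout) of two contradictory "prevent" events that share
# the same subject and object concepts, each tagged with its source.

tridbits = [
    {"subject": "milk", "event": "prevent", "object": "osteoporosis",
     "holds": True,  "source": "Alice"},    # Alice: milk prevents osteoporosis
    {"subject": "milk", "event": "prevent", "object": "osteoporosis",
     "holds": False, "source": "Walter"},   # Walter's article: it does not
]

def retrieve(concept):
    """Any search on a shared concept brings back both prevent events."""
    return [t for t in tridbits if concept in (t["subject"], t["object"])]

hits = retrieve("milk")
print(len(hits))                    # -> 2: both contradictory events retrieved
for t in hits:                      # Babble must weigh sources, certainty, etc.
    print(t["source"], t["holds"])
```

Because both events reference the same concepts, neither can be silently dropped; judging between them is a separate step using the attached source and certainty attributes.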

This is still a pretty simple example, but we could make it much more complex by including opinions from other sources, related studies, additional information about milk, osteoporosis and related topics. Add to that personal experience one might have with milk, which of course Babble doesn’t have much of, but a person growing up in the dairy state certainly would. It could take thousands of tridbits to represent all of this information, although each thread of information would in some way be linked back to the concept of milk, represented by just a single tridbit.

One of the goals of tridbit technology is to store each unique piece of information in just one place. Referent reduction is the process that enforces this, but it is a far more subtle and insidious issue than it might appear. For example, if Marge tells Babble that she agrees with Alice that milk is healthy, the representation of Marge’s statement should use the same referent tridbit for milk as Alice’s statement. In fact, Babble might choose not to create another tridbit assigning a property of healthy to milk, but to reinforce the one created when Alice made her statement. That would make it impossible for Babble to differentiate the two statements in the future, but that might be a fair trade-off for reduced complexity.

But what if Marge only drinks nonfat milk, which is actually the only type she finds healthy? Nonfat milk is a general concept like milk, but it does not represent all milk, just a subset, so it can not be represented by the same referent tridbit as milk. Nor can we just reinforce Alice’s statement without compromising the meaning of what Marge has told us. Nonfat milk is a general qualified concept in tridbit terminology, which is a unique type of scope with its own characteristics, discussed at length in the white paper.
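Referent reduction can be sketched as a lookup that hands every statement the one shared referent for a concept, while a qualified concept like nonfat milk gets its own. The function and key layout are invented for illustration:

```python
# Sketch of referent reduction: statements about "milk" reuse one shared
# referent, while "nonfat milk" (a general qualified concept) gets its own.
# Names and layout are illustrative.

referents = {}

def referent_for(concept, qualifier=None):
    """Return the single shared referent for a concept (plus qualifier)."""
    key = (concept, qualifier)
    if key not in referents:
        referents[key] = {"concept": concept, "qualifier": qualifier}
    return referents[key]

alice = referent_for("milk")             # Alice: milk is healthy
marge = referent_for("milk")             # Marge agrees about milk in general...
nonfat = referent_for("milk", "nonfat")  # ...but nonfat milk is a subset

print(alice is marge)    # -> True: one referent, so statements can reinforce
print(alice is nonfat)   # -> False: the subset cannot share milk's referent
```

The identity check is the point: because Alice's and Marge's statements hold the very same referent object, each unique concept is stored in just one place.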

Thus, the manner in which information is stored is also important to maintain a useable knowledgebase. As of this writing Babble’s knowledgebase contains just over 3,000 tridbits, many of which are lower level observations about the speech it has experienced. In truth, we really won’t know how well storing massive amounts of data using tridbit technology will work until we get there. But the technology was designed so that layers of complex and contradictory data could be stored.

The hard issue, which will require some trial and error work, is how to process this much data so that it can be reduced to a manageable set of tridbits from which speech responses can be produced. If Babble only had one data point that milk is healthy, it is a simple matter for it to answer the question, "Is milk healthy?" But if it has the three data points discussed above it will need some way to evaluate and summarize that information before it answers, or it will be a very long-winded program, taking after its creator. Interestingly, these tridbit summarizing functions could be invoked before or after storage. Since Babble has plenty of perfect recall memory, it probably doesn’t need to summarize before storage, but I suspect people do this quite a bit. In any case, this will be an interesting area for research in future Babbles.


6) What is Babble written in?
    Borland’s Delphi/Kylix development tools. In my opinion, the best development environments on their respective platforms.


7) How long did it take to write Babble?
    It took a little over two years at about half time to solve the various linguistic puzzles and implement the Babble program.


8) How does tridbit technology compare to a conventional relational database?
   

With Babble you don’t need to set up tables, relationships, indexes or do any programming. You just tell it whatever you want it to remember via natural language. You don’t need to learn a formal query language, you just ask Babble what you want to know. In addition, Babble can reason with the information you tell it. For example, if you tell Babble that Jane is the mother of David, it will infer that David is the child of Jane.

This could make tridbit technology very attractive for implementing certain types of data retrieval applications. However, Babble would not currently be suited for industrial strength database applications, which deal with massive amounts of precise and dense information.

An industrial strength database application needs to input data quickly, using input screens that take in many pieces of information at once. You don’t want to input personnel records by discussing the attributes of each person with Babble one by one. An industrial strength database application might need to print reports as fast as the printer can print them. You’d prefer a telephone bill to a conversation with Babble to figure out what you owe.

The data in an industrial strength database may be accessed at speeds of millions of rows per second. Since Babble needs to retrieve many tridbits of information to comprise an equivalent row in a conventional database, Babble will never match their speeds. So, at least for the time being, Babble is not a replacement for industrial strength databases.

Instead Babble is best suited to process data that is not precise, not dense or is not sufficiently valuable to justify spending a lot of time and money to set up a traditional database system to deal with it.

This may be less of a limitation than it seems, however, since the majority of data we work with on a day-to-day basis is neither precise nor dense. The majority of information we remember about things and events does not have precisely defined values that make it easy to store in a database column, like name, birthdate, address and social security number. Of course you remember the names of the people you know, and probably their addresses and phone numbers, although we often have to look these up in our address books. But most of the information we know about a person deals with information like what they look like, what kind of clothes they wear, things they’ve done and things they like to do, the last time you saw them, people they associate with and things they’ve said to you.

This information is neither precise nor dense. For example, the information that Dan enjoys bike riding is not very precise, since it does not differentiate him from thousands of others who enjoy bike riding, even Lance Armstrong. We do not quantify how much Dan enjoys bike riding, as we would quantify the monetary value of his work. (I hope you ask yourself why this is, but that is another discussion entirely.) Thus, we can store Dan’s income precisely, but not his level of bike riding enjoyment. Since most of us would not consider bike riding enjoyment a critical piece of information to know about a person, we would only have that data point for a small minority of the people stored in our memories. So bike riding enjoyment is a sparsely known data point, as opposed to dense information.

In a conventional database, this sort of sparse, imprecise data usually ends up in a text field labeled "notes" or "other information". Such non-normal data is very limited in how it can be manipulated, usually to slow text searches that can not differentiate between someone who enjoys bike riding and someone who hit a bike with their car.

Sparse, imprecise data is what underlies natural language. So it is exactly what Babble was designed to represent and process in an intelligent manner. We should be clear that Babble’s tridbit structures are perfectly capable of handling dense and precise information as well. We are simply pointing out that Babble will process dense and precise information less efficiently than conventional databases which are optimized for handling this type of information.