Friday, January 21, 2011

Project Proposal

Natural language is our most common form of communication, but because of its complexity it is rarely used for human-computer interfaces. Natural language interfaces nonetheless have many advantages: unlike code, everyone can already use them, and they improve the user experience by making the interaction feel like a conversation with a person rather than with a computer.
Furthermore, language is useful not only for communication but also for storing knowledge. When knowledge is stored in natural language, producing sentences in response to the user becomes straightforward.
My project is an attempt to combine these two concepts in a virtually embodied agent. A virtually embodied agent is a program that can manipulate and respond to changes in a virtual environment, such as a game or simulation. A virtually embodied conversational agent is an agent that can respond to natural language input from a user.

Abstract:

Intelligent Virtual Agent cognitive models often use a series of abstractions to split different tasks into manageable and solvable problems. For example, language is translated from a sentence to a parse tree, and then to a semantic representation. The semantic representation is then combined with a knowledge base to transform the semantics into a temporal logic, and the logic is then transformed into statements which can be evaluated. However, such a pipeline has limitations: each constituent part could aid the others in resolving pronoun references, ambiguity, preposition attachment, and pragmatics, yet the parts are kept separate in a pipeline model.
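The abstraction pipeline described above can be sketched roughly as a chain of functions. All of the stage names and representations below are hypothetical stand-ins for illustration, not part of any actual system:

```python
# Sketch of the sentence -> parse tree -> semantics -> logic -> evaluation
# pipeline. Every stage here is a deliberately crude placeholder.

def parse(sentence):
    """Sentence -> parse tree (a flat token list stands in for a real tree)."""
    return sentence.lower().rstrip(".!?").split()

def to_semantics(tree):
    """Parse tree -> crude predicate-argument semantics."""
    # Assume simple imperative "verb argument ..." sentences.
    return {"predicate": tree[0], "arguments": tree[1:]}

def to_logic(semantics, knowledge_base):
    """Semantics + knowledge base -> an evaluable statement."""
    action = knowledge_base.get(semantics["predicate"], lambda *a: None)
    return action, semantics["arguments"]

def evaluate(statement):
    """Run the statement against the (virtual) world."""
    action, args = statement
    return action(*args)

# Toy knowledge base mapping predicates to callbacks.
kb = {"jump": lambda *a: "agent jumps " + " ".join(a)}

result = evaluate(to_logic(to_semantics(parse("Jump five times.")), kb))
print(result)  # agent jumps five times
```

The limitation noted above is visible even in this toy: each function sees only its own input, so, for example, `to_semantics` cannot consult the knowledge base to resolve an ambiguous parse.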
I propose a cognitive model that is a cross between a semantic spreading activation network and a finite state machine, embodied in a virtual world by means of callback functions expressed as nodes in the network. Each node in this network represents a concept and is mapped to other nodes by relationships. This system allows the conceptual relationships found in a semantic network to coexist with, and fill in the information needed by, the functional callback nodes associated with particular actions. Gates are used to control shortest-path and spreading-activation calculations when nodes are queried. Learning can take place through the addition of connections, either from language input or through automatic learning such as long-term potentiation, which adds connections between nodes that activate together. The finite-state-machine aspect is used to model sequences of actions while maintaining conceptual information at each step of the process.
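A minimal sketch of the proposed network might look like the following, assuming simple weighted edges, a fixed decay factor, and a `potentiate` rule standing in for the long-term-potentiation-style learning. All names and parameters here are illustrative assumptions, not the actual design:

```python
# Sketch: spreading activation over a semantic network with callback
# nodes (embodiment hooks) and a simple LTP-style learning rule.
from collections import defaultdict

class Network:
    def __init__(self):
        self.edges = defaultdict(dict)   # node -> {neighbor: weight}
        self.callbacks = {}              # node -> function run on activation

    def connect(self, a, b, weight=1.0):
        self.edges[a][b] = weight

    def spread(self, source, energy=1.0, decay=0.5, threshold=0.1):
        """Spreading activation outward from a queried concept node."""
        activation = defaultdict(float)
        frontier = [(source, energy)]
        while frontier:
            node, e = frontier.pop()
            if e < threshold:
                continue                  # activation has died out
            activation[node] += e
            if node in self.callbacks:
                self.callbacks[node]()    # embodiment: act in the world
            for neighbor, weight in self.edges[node].items():
                frontier.append((neighbor, e * weight * decay))
        return dict(activation)

    def potentiate(self, a, b, delta=0.1):
        """LTP-style learning: strengthen a connection between
        nodes that activated together."""
        self.edges[a][b] = self.edges[a].get(b, 0.0) + delta

net = Network()
net.connect("dog", "animal", 0.9)
net.connect("animal", "pet", 0.8)
acts = net.spread("dog")   # activation decays with distance from "dog"
```

A callback node would be registered in `net.callbacks` under a concept name, so that activating, say, a "jump" concept also triggers the corresponding action in the virtual environment.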





FinalProjectProposal

2 comments:

  1. This might be a silly question, not knowing much about computational linguistics: when your proposal refers to "gates", do you mean physical hardware gates? Or is that a logical abstraction of some sort?

  2. They would be logical, abstract gates. For example, a precondition of "studying" would be "has textbook" OR "has lecture notes". Satisfying either would allow you to study, so those two states would feed into the input of the OR node. The output of the OR node wouldn't care how the gate was activated, just that it was somehow satisfied.

    For another example, an accumulator would only pass on the activation after a certain number of activations reached it. So, "Jump five times" would have an accumulator connected to the output of the "Jump" node, and would then activate after five jumps were completed.
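The two gate types in this reply could be sketched as small stateful objects. The names and interfaces below are illustrative only, not taken from the actual system:

```python
# Sketch of the two gates described above: an OR gate over
# precondition states, and an accumulator that fires after N activations.

class OrGate:
    """Fires if any input state is active, regardless of which."""
    def __init__(self, inputs):
        self.inputs = inputs             # names of precondition nodes

    def fires(self, active_states):
        return any(s in active_states for s in self.inputs)

class Accumulator:
    """Passes activation on only after `count` activations reach it."""
    def __init__(self, count):
        self.count = count
        self.seen = 0

    def activate(self):
        self.seen += 1
        return self.seen >= self.count

# "studying" precondition: has textbook OR has lecture notes
study_gate = OrGate(["has textbook", "has lecture notes"])
print(study_gate.fires({"has lecture notes"}))   # True

# "Jump five times": accumulator attached to the output of the Jump node
five_jumps = Accumulator(5)
results = [five_jumps.activate() for _ in range(5)]
# fires only on the fifth jump: [False, False, False, False, True]
```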
