NLP is an interdisciplinary field that uses computational methods to:
Investigate the properties of written human language and model the cognitive mechanisms underlying the understanding and production of written language.
Develop novel practical applications involving the intelligent processing of written human language by computer.
What is NLP? (cont.)
Machine learning techniques play a big part in NLP:
automating the construction and adaptation of machine dictionaries
modeling human agents' desires and beliefs
essential component of NLP
closer to AI
We will focus on two main types of NLP:
Human-Computer Dialogue Systems
Machine Translation
Human-Computer Dialogue Systems
Usually with the computer modelling a human dialogue participant
Will be able to:
Converse in a similar linguistic style
Discuss the topic
Hopefully, teach
Current Capabilities of Dialogue Systems
Simple voice communication with machines
Personal computers
Interactive answering machines
Voice dialing of mobile telephones
Vehicle systems
Can access online as well as stored information
Developers are currently working to improve these capabilities
The Future of H-C Dialogue Systems
The end goal of human-computer dialogue systems:
Seamless spoken interaction between a computer and a human
This would be a major component of making an AI that can pass the Turing Test
Having a computer function as a teacher
Human-Computer Dialogue in Fiction
Halo's Cortana AI
Made from models of a real human brain
Made to run the ship
Holds very human-like conversations
Ender's Game series: Jane
Made from "philotic connection"
Human conversation
Problems of Human-Computer Dialogue
At the moment, most common computer dialogue systems (call systems, chatter bots, etc.) cannot handle arbitrary input
In many cases, the computer can only respond to "expected" speech
Call systems often compensate with "Sorry, I didn't get that," when something unexpected is said.
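The "expected speech" limitation can be sketched in a few lines. This is a minimal illustration, not how any particular call system works: the keywords and canned replies below are hypothetical, and real systems sit behind a speech recognizer rather than reading text.

```python
import re

# Hypothetical keywords and replies for an imaginary call system.
EXPECTED = {
    "balance": "Your balance is available under Account Summary.",
    "hours": "We are open 9 to 5, Monday through Friday.",
    "agent": "Connecting you to an agent.",
}
FALLBACK = "Sorry, I didn't get that."


def respond(utterance: str) -> str:
    """Return a canned reply if an expected keyword is heard, else the fallback."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for keyword, reply in EXPECTED.items():
        if keyword in words:
            return reply
    # Anything outside the expected vocabulary lands here.
    return FALLBACK


print(respond("What are your hours?"))
print(respond("My cat ate my bill"))
```

Any input that does not contain one of the expected keywords falls through to the fallback line, which is exactly the behavior callers find frustrating.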
Problems of Human-Computer Dialogue (cont.)
Computers need to be able to learn and process colloquial speech
Needed to understand informal speakers:
Understanding varied responses for call systems
Accounting for variations in spoken numbers
Processing colloquialisms is also necessary for seamless dialogue, where the computer must avoid sounding too formal
John Connor: "No, no, no, no. You gotta listen to the way people talk. You don't say 'affirmative,' or [stuff] like that. You say 'no problemo.' "
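The point about variations in spoken numbers can be made concrete with a small sketch. The word table below is an assumption covering only a few informal variants ("oh" for zero, "double two" for 22); it is not drawn from any real call system.

```python
# Hypothetical table of spoken variants for single digits.
DIGITS = {
    "zero": "0", "oh": "0", "o": "0",
    "one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
    "six": "6", "seven": "7", "eight": "8", "nine": "9", "niner": "9",
}
# Words that repeat the digit that follows them.
REPEATS = {"double": 2, "triple": 3}


def spoken_to_digits(utterance: str) -> str:
    """Map a spoken digit sequence to a digit string, skipping unknown words."""
    out = []
    repeat = 1
    for word in utterance.lower().replace("-", " ").split():
        if word in REPEATS:
            repeat = REPEATS[word]
        elif word in DIGITS:
            out.append(DIGITS[word] * repeat)
            repeat = 1
    return "".join(out)


print(spoken_to_digits("five five five oh one double two"))
print(spoken_to_digits("five five five zero one two two"))
```

Both calls yield the same digit string, so two callers who say the same number differently are treated alike. Forms such as "twenty-two" or "a dozen" are not handled and would need further rules.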
Successes of Human-Computer Dialogue
So far, human-computer dialogue has been most successful in applications where information about a specific topic is sought from the computer.
Electronic calling systems: company-specific
Travel agents: specific to an airline or destination
However, more complex systems of human-computer dialogue have been produced which can interpret more varied input.
Physics tutoring system (ITSPOKE) which can analyze and explain errors in the response to a physics problem.
Allows for more complex input than "Yes," "No," or "Flight UA-93"
These still cannot compare to true human-human dialogue.
Machine Translation
Important for:
accessing information in a foreign language
communication with speakers of other languages
The majority of documents on the World Wide Web are in languages other than English
Statistical Translation
Data driven rather than rule based
Works relatively well with large sets of data
Uses probability to translate text
Natural translations
Google
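As a rough sketch of what "using probability to translate" means, the toy example below picks the English word that maximizes a translation probability multiplied by a context probability. All numbers are made up for illustration, and the Spanish word "banco" (bank or bench) is an assumed example, not one from the slides.

```python
# Made-up values standing in for P("banco" | english_word).
TRANSLATION_PROB = {"bank": 0.6, "bench": 0.4}

# Made-up values standing in for P(english_word | previous_word).
CONTEXT_PROB = {
    ("the", "bank"): 0.05, ("the", "bench"): 0.01,
    ("park", "bank"): 0.001, ("park", "bench"): 0.2,
}


def best_translation(prev_word: str) -> str:
    """Pick the candidate with the highest combined probability."""
    scores = {
        candidate: prob * CONTEXT_PROB.get((prev_word, candidate), 1e-6)
        for candidate, prob in TRANSLATION_PROB.items()
    }
    return max(scores, key=scores.get)


print(best_translation("the"))
print(best_translation("park"))
```

With these numbers, "bank" wins after "the" and "bench" wins after "park". In a real system both tables would be estimated from large collections of text, which is why the approach works better as the data grows.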
Example Based Translation
Converts "parallel" lines of text between languages
Only accurate for simple lines
Minimal pairs are easy
Analogy based
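A minimal sketch of the analogy idea, assuming a hypothetical store of one English-Spanish example and a two-word lexicon: a new sentence that differs from a stored example by a single known word is translated by swapping that word in the stored translation.

```python
from typing import Optional

# Hypothetical parallel example (English -> Spanish) and a tiny word lexicon.
EXAMPLES = {"i like red apples": "me gustan las manzanas rojas"}
LEXICON = {"red": "rojas", "green": "verdes"}


def translate_by_analogy(sentence: str) -> Optional[str]:
    """Translate by reusing a stored example that differs in at most one word."""
    words = sentence.lower().split()
    for source, target in EXAMPLES.items():
        src_words = source.split()
        if len(src_words) != len(words):
            continue
        diffs = [(a, b) for a, b in zip(src_words, words) if a != b]
        if not diffs:
            return target  # exact match with a stored line
        if len(diffs) == 1:
            old, new = diffs[0]
            tgt_words = target.split()
            if old in LEXICON and new in LEXICON and LEXICON[old] in tgt_words:
                # Swap the one differing word in the stored translation.
                return " ".join(
                    LEXICON[new] if w == LEXICON[old] else w for w in tgt_words
                )
    return None  # no sufficiently similar example


print(translate_by_analogy("I like green apples"))
print(translate_by_analogy("Where is the station"))
```

The second call returns `None` because nothing in the store is close enough, which mirrors the note above that this approach is only accurate for simple lines.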
Paraphrasing
Automatically reduces words to simpler forms
For example, in Spanish, an inflected form like usado may be changed to the infinitive usar
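The usado-to-usar step can be sketched with a suffix rule. This is a simplification under stated assumptions: it handles only regular -ar participles, and the irregular table is a hypothetical stand-in for a real morphological lexicon.

```python
# Hypothetical table of irregular participles and their infinitives.
IRREGULAR = {"hecho": "hacer", "dicho": "decir"}

# Regular participle endings for -ar verbs, longest first.
AR_SUFFIXES = ("ados", "adas", "ado", "ada")


def simplify(word: str) -> str:
    """Reduce a Spanish -ar participle to its infinitive; leave other words alone."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    for suffix in AR_SUFFIXES:
        # Require at least two stem letters so short words like "lado" are kept.
        if w.endswith(suffix) and len(w) > len(suffix) + 1:
            return w[: -len(suffix)] + "ar"
    return w


print(simplify("usado"))
print(simplify("lado"))
```

A rule this crude makes mistakes: the noun "pescado" (fish) would be turned into "pescar", so a practical system would also need a dictionary or part-of-speech information.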
Future of Machine Translation
Goal:
Translate flawlessly between languages
Link Human-Computer Dialogue and Machine Translation
Let someone speak to a computer in one language and have it translate for another person
Translated Video Chat
Machine Translation in Fiction
Star Wars: C-3PO
Interpreter
Could hear and translate alien languages
Final goal of machine translation
Star Trek: Universal Translator
Computer can seamlessly translate alien languages
Problems
Works well only with predictable texts.
Doesn't work well with domains where people want translation the most:
spontaneous conversations
in person
on the telephone
and on the Internet.
Problems (cont.)
Computers struggle with ambiguity, syntactic irregularity, multiple word meanings, and the influence of context.
Time flies like an arrow.
Fruit flies like a banana.
Accurate translation requires an understanding of the text, situation, and a lot of facts about the world in general.
The box is in the pen.
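To see why ambiguity is hard for a program, the sketch below counts the part-of-speech readings of "Time flies like an arrow." The lexicon is a hypothetical simplification (real dictionaries list more tags per word), and most of the readings it enumerates are ones a human would dismiss immediately.

```python
from itertools import product

# Hypothetical lexicon: each word with the parts of speech it may take.
LEXICON = {
    "time": {"NOUN", "VERB"},
    "flies": {"NOUN", "VERB"},
    "like": {"VERB", "PREP"},
    "an": {"DET"},
    "arrow": {"NOUN"},
}


def tag_sequences(sentence: str) -> list:
    """List every combination of part-of-speech tags the lexicon allows."""
    words = sentence.lower().rstrip(".").split()
    options = [sorted(LEXICON[w]) for w in words]
    return [tuple(zip(words, tags)) for tags in product(*options)]


readings = tag_sequences("Time flies like an arrow.")
print(len(readings))
```

Three two-way choices already give 2 x 2 x 2 = 8 candidate taggings for a five-word sentence. Picking the right one, and picking "enclosure" rather than "writing tool" for pen, takes knowledge about the world that the lexicon alone does not contain.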
Problems (cont.)
The sign is describing a restaurant (the Chinese text, 餐厅, means "dining hall").
While making the sign, the producers tried to translate the Chinese text into English with a machine translation system, but the software failed, and the error message it produced was printed on the sign in place of a translation.
Successes
Product knowledge bases need to be translated into multiple languages
Hiring a large multilingual support staff is expensive
Machine translation is cheaper and accurate with predictable texts.
Microsoft, Autodesk, Symantec, and Intel use it.
Makes customers happy
Still readable, though slightly clunkier than human translations