Natural Language Processing
linking:: AI-900
Text Analytics
- Determine the language of a document or text (for example, French or English).
- Perform sentiment analysis on text to determine a positive or negative sentiment.
- Extract key phrases from text that might indicate its main talking points.
- Identify and categorize entities in the text. Entities can be people, places, organizations, or even everyday items such as dates, times, quantities, and so on.
Resources
- Text Analytics
- Cognitive Services
Capabilities
- Language Detection (returns the language name, ISO 639-1 code, and a confidence score)
- Sentiment Analysis
- Key Phrase Extraction
- Entity Recognition
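As a rough illustration of the language detection capability, the sketch below builds a request body and reads the name, ISO code, and confidence out of a response. The field names (`documents`, `detectedLanguage`, `iso6391Name`, `confidenceScore`) are assumed to follow the v3 REST shape, and `sample_response` is hand-written for the example, not real service output.

```python
def build_request(texts):
    """Wrap raw strings in a {"documents": [...]} envelope, one id per text."""
    return {
        "documents": [
            {"id": str(i), "text": text} for i, text in enumerate(texts, start=1)
        ]
    }


def summarize_languages(response):
    """Map each document id to (name, ISO code, confidence)."""
    summary = {}
    for doc in response["documents"]:
        lang = doc["detectedLanguage"]  # assumed field name
        summary[doc["id"]] = (
            lang["name"],
            lang["iso6391Name"],
            lang["confidenceScore"],
        )
    return summary


# Hand-written sample in the assumed response shape (not real output).
sample_response = {
    "documents": [
        {"id": "1", "detectedLanguage": {
            "name": "English", "iso6391Name": "en", "confidenceScore": 1.0}},
        {"id": "2", "detectedLanguage": {
            "name": "French", "iso6391Name": "fr", "confidenceScore": 1.0}},
    ]
}

request_body = build_request(["Hello", "Bonjour"])
languages = summarize_languages(sample_response)
```

Sending `request_body` to a real endpoint would additionally need a resource key and endpoint URL, which are left out here.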
Speech
Models
- An acoustic model that converts the audio signal into phonemes (representations of specific sounds).
- A language model that maps phonemes to words, usually using a statistical algorithm that predicts the most probable sequence of words based on the phonemes.
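The two-model idea can be sketched with a toy decoder. Everything here is made up for illustration: the phoneme symbols, the `LEXICON` of homophones, and the `BIGRAMS` probabilities are hypothetical, and a real speech service uses far larger models. The sketch assumes the acoustic model has already produced phonemes grouped per word, and shows only the language-model step of picking the most probable word sequence.

```python
from itertools import product

# Hypothetical lexicon: phoneme group -> candidate words (homophones).
LEXICON = {
    ("ay",): ["I", "eye"],
    ("w", "aa", "n", "t"): ["want"],
    ("t", "uw"): ["two", "to", "too"],
    ("g", "ow"): ["go"],
}

# Hypothetical bigram probabilities P(word | previous word).
BIGRAMS = {
    ("<s>", "I"): 0.6, ("<s>", "eye"): 0.01,
    ("I", "want"): 0.3, ("eye", "want"): 0.001,
    ("want", "to"): 0.5, ("want", "two"): 0.05, ("want", "too"): 0.01,
    ("to", "go"): 0.2, ("two", "go"): 0.001, ("too", "go"): 0.001,
}
FLOOR = 1e-6  # probability assigned to word pairs not listed above


def decode(phoneme_groups):
    """Return the word sequence with the highest bigram probability."""
    candidates = [LEXICON[tuple(group)] for group in phoneme_groups]
    best_sequence, best_probability = None, 0.0
    for sequence in product(*candidates):  # try every homophone combination
        probability, previous = 1.0, "<s>"
        for word in sequence:
            probability *= BIGRAMS.get((previous, word), FLOOR)
            previous = word
        if probability > best_probability:
            best_sequence, best_probability = sequence, probability
    return list(best_sequence)


words = decode([["ay"], ["w", "aa", "n", "t"], ["t", "uw"], ["g", "ow"]])
```

With these toy numbers, `words` comes out as `["I", "want", "to", "go"]`: the phonemes for "to", "two", and "too" are identical, so only the language model can choose between them.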
Resources
- Speech
- Speech-To-Text API
- Text-To-Speech API
- Speech Translation
- Cognitive Services
Translation
Resources
- Translator Text
- Speech
- Cognitive Services
Translator Text
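Using ISO language codes, you can translate from one source language (for example, en) into several target languages in a single request (for example, es and fr-CA). The service also supports profanity filtering and selective translation, where tagged content is left untranslated.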
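A minimal sketch of how such a request might be assembled, assuming the public Translator v3 REST endpoint and its `from`, `to`, `profanityAction`, and `textType` query parameters. The helper name `build_translate_url` is hypothetical, and authentication headers are omitted.

```python
from urllib.parse import urlencode

# Assumed global Translator v3 endpoint.
ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_url(source, targets, profanity_action=None, text_type=None):
    """Build a translate URL with one 'to' parameter per target language."""
    params = [("api-version", "3.0"), ("from", source)]
    params += [("to", target) for target in targets]
    if profanity_action:
        params.append(("profanityAction", profanity_action))
    if text_type:
        params.append(("textType", text_type))
    return ENDPOINT + "?" + urlencode(params)


url = build_translate_url("en", ["es", "fr-CA"], profanity_action="Marked")

# Selective translation sketch: with textType=html, content tagged with
# class="notranslate" is assumed to be left as-is by the service.
selective_url = build_translate_url("en", ["es"], text_type="html")
selective_body = [{"Text": '<span class="notranslate">Contoso</span> is open.'}]
```

Repeating the `to` parameter is what allows one request to return several translations.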
Speech
The source language must be specified with its culture code (for example, en-US), whereas target languages are specified without the culture code (for example, fr).
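The code-format rule for speech translation can be sketched as a simple pre-flight check. This is a simplified assumption, not the service's own validation: the patterns and the helper name `check_speech_translation_codes` are hypothetical, and the service's supported-language list remains the authority (it may include codes these patterns reject).

```python
import re

# Simplified patterns: "en-US" style for the source, "fr" style for targets.
SOURCE_PATTERN = re.compile(r"^[a-z]{2,3}-[A-Z]{2}$")
TARGET_PATTERN = re.compile(r"^[a-z]{2,3}$")


def check_speech_translation_codes(source, targets):
    """Return a list of human-readable problems; empty means the codes look fine."""
    problems = []
    if not SOURCE_PATTERN.match(source):
        problems.append(f"source '{source}' should include a culture code, e.g. en-US")
    for target in targets:
        if not TARGET_PATTERN.match(target):
            problems.append(f"target '{target}' should omit the culture code, e.g. fr")
    return problems


ok = check_speech_translation_codes("en-US", ["fr", "de"])
bad = check_speech_translation_codes("en", ["fr-CA"])
```

Here `ok` is an empty list, while `bad` reports one problem for the source and one for the target.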
Language Understanding
Definitions
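- Utterances: Examples of something a user might say, which the application must interpret (for example, "switch the fan on").
- Entities: The items an utterance refers to (for example, the fan).
- Intents: The purpose or goal expressed in an utterance (for example, turning a device on).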
Resources
- Language Understanding (authoring, prediction, or both; choosing both creates two separate resources)
- Cognitive Services (prediction only)
Entities
- Machine-Learned: Entities that are learned by your model during training from context in the sample utterances you provide.
- List: Entities that are defined as a hierarchy of lists and sublists. For example, a device list might include sublists for light and fan. For each list entry, you can specify synonyms, such as lamp for light.
- RegEx: Entities that are defined as a regular expression that describes a pattern - for example, you might define a pattern like [0-9]{3}-[0-9]{3}-[0-9]{4} for telephone numbers of the form 555-123-4567.
- Pattern.any: Entities that are used with patterns to define complex entities that may be hard to extract from sample utterances.
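The List and RegEx entity types can be approximated locally to see how they behave. This is a sketch only, not how the Language Understanding service is implemented: `extract_entities` and the output dictionary shape are hypothetical, while the phone pattern and the light/lamp synonym come from the notes above.

```python
import re

# RegEx entity: telephone numbers of the form 555-123-4567.
PHONE = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")

# List entity: canonical value -> accepted synonyms.
DEVICE_LIST = {
    "light": ["light", "lamp"],
    "fan": ["fan"],
}


def extract_entities(utterance):
    """Return regex matches first, then list matches, as simple dicts."""
    entities = []
    for match in PHONE.finditer(utterance):
        entities.append({"type": "phone", "text": match.group()})
    words = re.findall(r"[a-z]+", utterance.lower())
    for canonical, synonyms in DEVICE_LIST.items():
        for word in words:
            if word in synonyms:
                entities.append({"type": "device", "text": word, "value": canonical})
    return entities


found = extract_entities("Turn on the lamp and call 555-123-4567")
```

In this example `found` holds the phone number as a RegEx entity and "lamp" as a device entity normalized to "light". Machine-Learned and Pattern.any entities depend on training and patterns inside the service, so they are not reproduced here.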