Bench Talk for Design Engineers | The Official Blog of Mouser Electronics


Natural Language Processing—Syntax Meets Sentiment

By Carolyn Mathas

(Source: Murrstock - stock.adobe.com)

A branch of artificial intelligence (AI), natural language processing (NLP) lets computers understand content syntax along with a creator's intent and sentiment. It combines linguistics, computer science, and AI, and involves programming computers to process and analyze copious amounts of natural language. Simply put, NLP allows us to talk to machines as if they were human. If you use Siri, Alexa, Google Assistant, or chatbots that filter requests, you're already using NLP.

How NLP Works

NLP uses deep learning algorithms to interpret and understand human language. Deep learning models convert voice and text (unstructured data) into structured, usable data, breaking language into words and deriving context from the relationships between them.

NLP segments data into specific groups with increasing accuracy. The process is broken into stages. In tokenization, each phrase is broken into tokens: the basic NLP processing units, such as words, numbers, or other sequences of characters.

Stop-word removal discards words such as prepositions and articles that add little value. Stemming and lemmatization reduce words to their root form while accounting for how they are used in context. Part-of-speech tagging labels each word according to its grammatical role.
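These stages can be sketched in a few lines of Python. This is a deliberately simplified, self-contained illustration: the stop-word list and the suffix-stripping stemmer are tiny stand-ins for what libraries such as NLTK or spaCy actually provide.

```python
import re

# A tiny, illustrative stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "are"}

def tokenize(text):
    """Tokenization: split text into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stop_words(tokens):
    """Stop-word removal: drop words that carry little meaning."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    """Naive suffix-stripping stemmer (a crude stand-in for Porter stemming)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

text = "The engineers are testing the tokenized sentences"
tokens = tokenize(text)          # ['the', 'engineers', 'are', ...]
content = remove_stop_words(tokens)
stems = [stem(t) for t in content]
print(stems)
```

Running the pipeline on the sample sentence reduces it to content-word stems, which is roughly the form downstream models consume.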

NLP Challenges

Although the technology has come far in record time, NLP still has challenges, including:

  • What defines a word? Chinese, Japanese, and Arabic, for example, require a unique approach.
  • Recognition mistakes happen if words are jammed together or there are missing spaces between an account number and name.
  • When stop words are removed, grammatical information is often lost with them.
  • Optical character recognition (OCR) is the basis of automatic text recognition. If the output of an OCR process contains typos, recognition mistakes, or non-text symbols, or includes 2D tables, headlines, numbers, and boxes, it is difficult for machines to read line by line. Computer vision and machine learning (ML) algorithms are helping to solve these difficulties.
  • Often the real challenge, however, is convincing stakeholders to use NLP given its cost and figuring out which corporate uses take priority.

Enter GPT-3

NLP doesn’t operate alone. Generative Pre-trained Transformer 3 (GPT-3), developed by OpenAI, is a language model that produces human-like text. Based on the transformer architecture, the model predicts the next token in a sequence, which lets it perform tasks it has or has not been explicitly trained on.

OpenAI created an API that encourages developers to build use cases by inputting text, or a prompt, to GPT-3 and conditioning it to perform a specific task. Instead of writing code, developers use "prompt programming," giving GPT-3 examples of the kind of output to generate. The process improves as the algorithm is given more examples or human-feedback data sets.
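The prompt-programming style can be sketched as plain string assembly. The sentiment-labeling task and field names below are hypothetical examples; in a real application, the assembled string would be sent to the GPT-3 API, which would complete the final line.

```python
def build_few_shot_prompt(examples, query):
    """Assemble a few-shot prompt: worked examples condition the model,
    and the trailing incomplete entry is what the model is asked to finish."""
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

examples = [
    ("The board worked on the first try.", "positive"),
    ("The connector snapped immediately.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Great documentation and support.")
print(prompt)
```

The model sees the pattern in the examples and, when asked to predict the next tokens after the final "Sentiment:", tends to continue it, which is what makes examples a substitute for explicit code.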

Traditionally, NLP algorithms were created using a rule-based approach: algorithms that follow manually crafted grammatical rules. Machine learning models, built on analytical and statistical methods and trained on data, provide a faster alternative; as training increases, predictions become more accurate and intuitive.

On some benchmarks, the GPT-3 ML model approaches or exceeds human performance in writing and reading comprehension.

NLP Use Cases

NLP is rapidly moving from word/sentence embedding to conversational capabilities across various industries. The following is a handful of applications where it’s making tremendous progress:

The Written Word

NLP learns a writer’s recurrent patterns, recognizes when the writer deviates from them, and makes suggestions to get back on track. It’s used in content translation, paraphrasing, editing, generation, and SEO advice.

Gaming

Within gaming, players want more realistic and lifelike experiences. NLP can analyze a player’s dialogue and gestures within the game and generate new quests or objectives based on player behavior.

Chatbots

Chatbots are widely used to provide improved service to customers. As NLP advances, scripted responses and assistance will give way to more human-like conversational capabilities.

Sentiment Analysis

Sentiment analysis uses big data to gauge consumer satisfaction. Modern solutions also allow comparing these indicators against competitors. Brand managers use it to improve performance and develop better branding techniques.
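In its simplest form, sentiment analysis is lexicon-based scoring. The word lists below are tiny, hypothetical examples; production systems rely on trained models rather than hand-built lists, but the sketch shows the basic idea of mapping text to a sentiment label.

```python
# Minimal lexicon-based sentiment scorer (illustrative only).
POSITIVE = {"great", "excellent", "love", "reliable", "fast"}
NEGATIVE = {"poor", "broken", "slow", "hate", "faulty"}

def sentiment_score(text):
    """Count positive vs. negative words and return an overall label."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment_score("Great product and fast shipping"))  # positive
```

A brand manager's dashboard would aggregate such labels over thousands of reviews or social posts rather than scoring one sentence at a time.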

NLP and Cloud Services

Running NLP in the cloud enables experiments on the large data volumes handled by big data techniques. Many NLP software tasks are integrated and delivered over the internet as Software as a Service (SaaS) in cloud computing.

Education Tech

NLP helps students improve their reading and writing. It renders actionable advice that fosters improvement. Grammarly is an example. NLP can accurately match students to suitable reading materials and grade reading scores. By analyzing teacher and student language, NLP can pinpoint mental states during class. It can also identify students struggling with lessons.

Advanced Tools and Tasks

Three prominent OpenAI models are GPT-3, Codex, and DALL·E 2.

GPT-3

In addition to writing, GPT-3 is also helpful in:

  • Summarization
  • Parsing Unstructured Text
  • Classification
  • Machine Translation

At its simplest, it is a massive prediction engine trained on hundreds of billions of words of text from the internet.

Codex

OpenAI Codex is a general-purpose programming model that translates natural language into code. It was trained on both natural language and billions of lines of publicly available source code. Proficient in more than a dozen programming languages, it is most capable in Python. OpenAI Codex produces working code, so commands can be issued in English to any software with an API, and it empowers computers to better understand intent.

DALL·E 2

DALL·E 2 is a neural network-based ML model that generates images from textual descriptions using NLP techniques. It learns image generation from a data set of images paired with corresponding text descriptions. The text description is tokenized and encoded into a vector representation, which the image-generation network then decodes into pixels, iteratively refining the output toward the description. It generates more realistic and accurate images with 4x the resolution of its predecessor.

Why It's Gaining Traction

There are three main reasons for the rapid growth of NLP use:

  1. Advances in machine learning. Using powerful AI chips, machines have greater computational power, enabling more human-like interactions.
  2. The growth of data availability and data labeling tools that annotate text or audio contribute to rapid NLP expansion.
  3. Businesses are implementing NLP models to match customer expectations.

For programmers, the reason is apparent. Future programmers will write down what they want a piece of software to do, and the computer will generate code. NLP will allow anyone with a decent command of their native language to program.





Carolyn Mathas is a freelance writer/site editor for United Business Media’s EDN and EE Times, IHS 360, and AspenCore, as well as individual companies. Mathas was Director of Marketing for Securealink and Micrium, Inc., and provided public relations, marketing and writing services to Philips, Altera, Boulder Creek Engineering and Lucent Technologies. She holds an MBA from New York Institute of Technology and a BS in Marketing from University of Phoenix.

