Natural Language Processing AI Technology: A Quick and Brief Intro with Examples

2022-02-08 / 2022-06-03 | Constantin Bosse

The company that I currently work for, Nautilus Cyberneering, has a five-year project for which so-called Natural Language Processing AI is key. We essentially want to create a virtual artificial intelligence assistant that you can run on your own local computer and that communicates with you through a command-line interface.

The assistant we envision will do all sorts of things that a private user may consider of value. The user will interact with the “machine” by indicating what they want to achieve or do, and the “machine” will respond to their input.

As you can imagine, such an application will require a good understanding of human language, and it could look like this:

Clearly not exactly like in the picture but you get the point. : )

Human communication and understanding are rather complex, as you well know. Hence, to achieve this we will employ “Natural Language Processing” artificial intelligence models, abbreviated as NLP.

Starting My Research

Given this, I wanted to start forming my own opinion, test a few models, and ask around whether anyone in my network had already used any Natural Language Processing AI. That happened to be the case.

Some good friends of mine were already using GPT-3. They told me that to them it was like another employee in their company. Knowing them, I knew it was no overstatement when they told me that they used it for code review and research, especially since their business happens to be consulting on machine learning, AI and automation solutions. Consequently, I became even more interested.

However, before you keep on reading, let me first say that I do not consider myself an expert in this field, so please forgive any mistakes I may make in this post. Still, you may find it interesting if you are also new to the topic.

In this post I will do the following:

Briefly explain what NLP is
Describe how NLP systems work
Outline NLP creation techniques
List some known NLP models
Share some links to the ones I found most interesting
Give you some examples of their replies to my input
Share some already usable tools

What is Natural Language Processing (NLP)

These are two definitions from different sources:

Wikipedia

“Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of “understanding” the contents of documents, including the contextual nuances of the language within them.”
wikipedia.org

IBM

“Natural language processing strives to build machines that understand and respond to text or voice data—and respond with text or speech of their own—in much the same way humans do.”
ibm.com

So to sum up:

NLP is an artificial intelligence technology that lets machines process human language input, written or spoken, understand it, and respond to it.

How do NLPs Work

The aforementioned summary sounds very simple, but it is not. An NLP system needs to:

Recognize speech, that is, convert voice data into text data, no matter how people speak, where they come from, or what accents or mistakes they make.
Tag words, be they nouns, verbs, articles, etc.
Decide on the intended meaning of a word given many possible meanings based on the context.
Differentiate between block elements such as sentences.
Identify relevant words, for example the name of a person, a state, etc.
Make contextual cross references, from pronouns or descriptive words, etc.
Infer the emotional load within a text, such as subjectivity, objectivity, sarcasm, etc.
Generate “human” responses from structured information.
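
To make a few of these steps more tangible, here is a minimal sketch of my own in Python. It only covers splitting text into sentences and tagging words, and it does so in a deliberately naive way: the regular expressions and the tiny `TOY_LEXICON` are assumptions made up for this illustration, not how real NLP systems work internally. Real systems learn these things from large amounts of data.

```python
import re

# Toy lexicon (hypothetical, for illustration only).
# Real taggers learn word categories from data instead of a fixed table.
TOY_LEXICON = {
    "the": "ARTICLE", "a": "ARTICLE",
    "dog": "NOUN", "cat": "NOUN",
    "chased": "VERB", "slept": "VERB",
}

def split_sentences(text):
    """Naively split text at whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Lowercase a sentence and extract runs of letters or apostrophes."""
    return re.findall(r"[a-z']+", sentence.lower())

def tag_words(tokens):
    """Look each token up in the toy lexicon; unknown words get 'UNKNOWN'."""
    return [(tok, TOY_LEXICON.get(tok, "UNKNOWN")) for tok in tokens]

text = "The dog chased a cat. The cat slept!"
sentences = split_sentences(text)
tagged = [tag_words(tokenize(s)) for s in sentences]

for sentence, tags in zip(sentences, tagged):
    print(sentence, "->", tags)
```

Even this toy version shows why the real task is hard: an abbreviation such as “etc.” in the middle of a sentence would already break the sentence splitter, and any word missing from the lexicon ends up as “UNKNOWN”.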

I do not know about you, but I think that this is difficult even for a human. Think, for instance, of learning a language: all the different accents, double meanings, different senses of humor, etc. Complex indeed.

If you are curious you can read more here.

Natural Language Processing AI Model Creation Techniques

Creating a single working NLP model is difficult and takes a lot of effort. Research in this field has been going on for over half a century, and over the years different approaches have emerged to test and optimize the process. You can get a brief overview of past models on Wikipedia.

Two machine learning methods are currently in use. Both require extensive computational power and can be used in combination.

One could write a book on each of them, but that is not my intent, so I will try to briefly describe how I have understood them and include a link to more information.

Feature or Representation Learning

A system is set up to automatically discover and learn from prepared sets of labeled or unlabeled data. It essentially learns to recognize features, that is, common patterns within a context, and to associate them with meaning. For more information, see here.
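
As a rough illustration of the idea, the following Python sketch derives a simple representation of each word from unlabeled sentences by counting which words appear next to it. Please note that this count-based approach is a simplified stand-in that I chose for readability, not the learned neural representations modern models use; the function names and the small corpus are made up for this example.

```python
from collections import Counter, defaultdict
import math

def cooccurrence_vectors(sentences, window=1):
    """Represent each word by counts of its neighbors within `window` words."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Unlabeled toy corpus (hypothetical).
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat eats fish",
    "the dog eats meat",
]
vecs = cooccurrence_vectors(corpus)

print("cat vs dog :", round(cosine(vecs["cat"], vecs["dog"]), 3))
print("cat vs milk:", round(cosine(vecs["cat"], vecs["milk"]), 3))
```

Nobody told the program that “cat” and “dog” are related, yet they end up with very similar vectors because they appear in the same contexts. That is the kind of pattern discovery described above, at a miniature scale.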

Deep Neural Network Learning

This is an approach built on layers of interconnected nodes. Each node performs a small computation whose parameters, the weights, get adjusted during the training phase. The data that you input into the system passes through this network layer by layer, with each layer transforming the output of the previous one until a final result comes out. For more information, see here.
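
To show what “passing data through layers of nodes” means, here is a minimal Python sketch of a forward pass through a tiny network with one hidden layer. The weights are hand-picked, hypothetical values; in a real network they would be adjusted during training, which this sketch leaves out entirely.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Each node computes a weighted sum of all inputs plus a bias, then applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Pass the inputs through every layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical, hand-picked weights; training would normally set these.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer: 2 nodes, 2 inputs each
    ([[1.0, -1.0]], [0.0]),                    # output layer: 1 node, 2 inputs
]

output = network_forward([1.0, 1.0], layers)
print(output)
```

Models like GPT-3 follow the same basic principle, only with billions of weights and far more elaborate layers.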

Known Natural Language Processing AI Models

Many NLP models currently exist. There seems to be a race to develop the most powerful one. You will find Wu Dao, GPT-3, GPT-J, models from Meta AI, BERT, etc.

One of the challenges researchers are facing with such models is whether the models have learned reasoning or simply memorize training examples.

Clearly, as you can imagine, some are open source and others are not. Through the use of and access to these available models, many solutions are being developed. I will briefly highlight some facts about the ones that I have looked at most closely and for which I found demo implementations, or solutions built on them, that you can try.

GPT Group

GPT stands for “Generative Pre-trained Transformer”. These are models trained to autonomously predict the next token in a sequence of tokens. When it comes to text, a token is a short sequence of characters, such as a word or part of a word.
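
The principle of “predicting the next token” can be sketched in a few lines of Python. The example below simply counts which word follows which in a training text and then predicts the most frequent follower. This is a toy bigram model under my own simplifying assumptions (whole words as tokens, a one-sentence training text), not the transformer architecture that the GPT models actually use.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for every token, which tokens follow it and how often."""
    tokens = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts

def predict_next(model, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = model.get(token.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical one-line training text.
model = train_bigram_model("the cat sat on the mat and the cat slept")
prediction = predict_next(model, "the")
print("the ->", prediction)
```

A GPT model does something similar in spirit, but it looks at a long stretch of preceding tokens instead of just one and learns its predictions from hundreds of billions of tokens.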

GPT-3

This is the model that has created a lot of buzz since it came out in 2020. At that time it was the largest language model ever trained. Different companies have already used it to build commercial solutions.

The model was developed by OpenAI. OpenAI started out publishing its work openly; however, GPT-3 itself was not released as open source, and the model has been licensed exclusively to Microsoft.

It can perform both generalist and niche tasks, such as writing code in programming languages like Python.

	GPT-2	GPT-3
Date	2019-02	2020-05
Parameters	1.5 Billion	125 Million – 175 Billion
Training Data	10 Billion tokens	499 Billion tokens

Model Progression OpenAI

Here are two interesting links:

An in-depth article by Lambda, an AI infrastructure company providing computation: https://lambdalabs.com/blog/demystifying-gpt-3/
A link to their API if you are interested: https://openai.com/api/

GPT-J, GPT-Neo & GPT-NeoX

These three models have been developed by EleutherAI, a grassroots collective of researchers working on open-source AI research. From what I have read, the models can be considered generalist models that are good for most purposes.

	GPT-Neo	GPT-J	GPT-NeoX
Date	2021-03	2021-06	2022-02
Parameters	1.3 to 2.7 Billion	6 Billion	20 Billion

Model Progression EleutherAI

Interesting Responses from GPT-J

Below you will find several screenshots of the responses that I got from their online test interface so that you can judge for yourself.

AI responding to “Who is the greatest musician of all times?”

AI responding to “which is the best beginner programming language in your opinion?”

AI responding to “what is more important to work or to live?”

Here is the link to the online test instance where I got the responses from, if you are interested: https://6b.eleuther.ai/

Alternatively, you can get paid access at goose.ai and test the different EleutherAI models at very reasonable prices.

Wu Dao 2.0 – China’s Monster Natural Language Processing AI

This Natural Language Processing AI model is considered the “monster” and the largest NLP model ever. It was developed by the Beijing Academy of Artificial Intelligence (BAAI) in June 2021. Its code base is open source and based on PyTorch, and it is “multi-modal”: it can process images and text at the same time and learn from both, something that the other models described here cannot do.

It was trained on:

1.2TB Chinese text data in Wu Dao Corpora.
2.5TB Chinese graphic data.
1.2TB English text data in the Pile dataset.

It is supposedly capable of all the standard tasks such as translation, but also of composing poetry, drawing, singing, etc.

	Wu Dao 2.0
Date	2021-06
Parameters	1.75 Trillion
Training Data	4.9 TB

Model Specs Wu Dao 2.0

Some Implemented Solutions

Here you will find some interesting implementations that you can start using today if you want.

Jasper

This is a tool that I think many digital copywriters will find handy to ease their work.

Jasper (Formerly Jarvis) – #1 AI Writing Assistant

Create content 5x faster with artificial intelligence. Jasper is the highest quality AI copywriting tool with over 3,000 5-star reviews. Best for writing blog posts, social media content, and marketing copy.

Thoughts

The same applies to this solution, which helps you write tweets faster in your own style.

Thoughts – Create intelligent Thoughts

Thoughts leverages state-of-the-art language model GPT-3 to create human-like Thoughts in your writing style

DeepGenX

This is a solution for developers to write code faster and easier.

DeepGenX

We are thrilled to announce CodeGenX! A Code Generation system powered by Artificial Intelligence!

These are just three of many more. Here is a more extensive list of such solutions.

Final Reflections

As with the examples above, technology never ceases to amaze me. Evidently, there is great potential in its use. Yet what are the resulting disadvantages?

OpenAI, for instance, decided when they developed their GPT-2 model not to make it fully available at first, due to its potential for creating fake news. Later, OpenAI went one step further and called for broad collaboration on AI safety in this post.

I agree with this line of thought. We have to weigh AI’s possibilities and dangers and check them against our values and beliefs. Technology in the end is nothing but a tool, albeit a powerful one. That is why this old adage from before Christ rings true again:

“With great power comes great responsibility.”
Not from Marvel Comics : )

AI has only just started, and we will see much more of it in the coming years. If you want to read another interesting example of Natural Language Processing AI at work, here is another post of mine.

Good read: AI – Summarizing Books with Human Feedback

2021-11-09 / 2022-01-13 | Constantin Bosse

“Summarizing Books with Human Feedback” was published on OpenAI.com. It is a quick read, very clear and well written. They share some examples of how they had AI summarize books. In the article they give the example of the classic “Alice’s Adventures in Wonderland” by Lewis Carroll, which they used to test it.

They briefly explain their recursive approach and how they include human feedback during the process to evaluate the AI’s summaries.

First edition Bookcover Image modified with AI tools from Corel Painter 2021

They also include a link to their official research paper written by Jeff Wu, Long Ouyang, Daniel M. Ziegler, Nisan Stiennon, Ryan Lowe, Jan Leike, Paul Christiano.

In my opinion, what has been achieved so far is impressive. From briefly looking at the summaries, there is still a lot to improve, but having AI summarize books is impressive nonetheless.

It is remarkable work that I think can have great application, especially for technical texts, though I think it will be less applicable to fiction. I wish I had had this tool when I was studying to double-check my summaries.

I believe that the summaries lack some human touch, but this is logical. This is, however, a very subjective view on my part.

Tackling Problems: Plastic – An Example from Guatemala

2018-07-19 / 2018-07-15 | Constantin Bosse

I saw this video yesterday, and it impressed me for two reasons:

It shows that modern society’s problems can be tackled with common sense and a joint will.
It shows the importance of clear priorities no matter the economic cost, in this case respecting nature and safeguarding it.

A Remarkable Example

These people have given it a shot: an entire community joining together and banning the use of plastic, so that future generations can enjoy their living place as they do.

In essence, they are going back to former consumption habits, valuing nature and the environment, and acting accordingly out of respect for its resources. It is an example of an entire community, including its leaders, wholeheartedly supporting this commitment.

Technologically innovative?

Not at all, since they are going back to the former ways of their people. In essence, by looking back to former ways they tackle a problem without sophisticated technologies.

The innovativeness here, in my opinion, stems from the fact that they have banned the use of plastic in an entire community. Radical, but see and judge for yourself.

Do you know of other examples? What do you think of this?

Worth reading: Stop worshipping unicorns. Your firm can be entrepreneurial | London Business School

2017-11-16 | Constantin Bosse

Excellent article on corporate entrepreneurship by Professor Gary Hamel and Anna Johnston.

They take on the fallacy that large companies are at a disadvantage when it comes to being entrepreneurial.

The article gives insight into Haier’s corporate structure and its approach to making its employees effectively act as if they were their own start-ups.

Source: Stop worshipping unicorns. Your firm can be entrepreneurial | London Business School

Online Teamwork Tools: Trello, Slack and Google Suite Apps

2017-01-05 | Constantin Bosse

About two months ago I formed a team with two others to work on some ideas. I did some research on web-based applications and finally decided to use Trello and Slack and to combine them with Google’s apps. I wanted tools that were free, easy to use, could be combined with each other, and offered mobile app support.

Trello

The developers define it as a visual project collaboration tool that lets the user organize everything in the form of boards filled with lists. I have used its free plan so far and have found it very useful and simple to use. It lets you easily create private boards as well as team boards to which you can add members. Once you are on a board, you create lists which you fill with so-called cards. All these can be arranged and moved around on the board from one list to another or to another board, copied, deleted, etc. It is pretty straightforward, and other options are available through the menu on the right. Configuration and set-up are easy, and you can download mobile apps for the different operating systems to collaborate on the go.

Here are some screenshots to give you an idea.

Trello keyboard shortcuts

Slack

It is a team collaboration tool focused on easing communication by letting you create different chat rooms per topic and easily add links, files, etc. It can be integrated with third-party applications like Trello, Google Drive, Box, Dropbox, etc. Its content is searchable and filters can be applied. Configuration and set-up are easy, and you can download mobile apps for the different operating systems to collaborate on the go. You can also make individual or group calls; the latter, however, are not free.

Here are some screenshots to give you an idea.

Slack keyboard shortcuts

Google apps

They are Google’s set of free online office applications. I had not used them much before, but after having used them for team projects I have come to appreciate them. You get the standard range of applications offered by the paid Microsoft Office suite. The whole range works well, though it takes a little while to get used to them. If you switch over from, let’s say, Word or Excel, you will probably have no problem if you are a basic user. If you are a power user, however, you will have to do some reading, for example on Sheets, and customize it to suit your workflow.

When it comes to simultaneous online collaboration, the apps are impressive, since all team members can be connected at the same time once they have been granted access to the file or folder. One note, though: all will need a Gmail account and a good internet connection.

Here is a screenshot to give you an idea of the range of cloud apps that can be handy to use.

Here is a link to the Google Suite learning center where you will find all the necessary information.

Also here are some reviews on some of the Google solutions:

Why use them?

Simply stated, they are efficient and free. If you are on a tight budget, or just do not feel like paying for the software you want to use, they are a good way to get going with teamwork.

www.trello.com

www.slack.com

www.google.com/intl/en/about/products/

Just give them a shot, they are free! You have got nothing to lose!

Let me know if you found this useful or whether you are using other tools for online collaboration.

Tips to be creative

2016-04-26 / 2016-04-27 | Constantin Bosse

Are you creative or can you be creative?

I have met people who say that they are not creative. They accept things as given and do not even try to be. It is a self-fulfilling prophecy so to say. They will never come up with a bright idea because they do not consider even having one.

In contrast, there are others who, when confronted with a problem, do not step back but step forward. They do not accept the status quo and give it a shot. They think of finding a solution, and the fear of ridicule and failure is secondary. Some of the greatest inventors had hundreds of ideas before they came up with one bright one. This group creates; the other stagnates.

What’s the difference between the two? — Their mindset.

“Think left and think right and think low and think high. Oh, the thinks you can think up if only you try” – Dr. Seuss

Creativity is just another skill

I have always been considered creative by others. In school it was because I was good at drawing and later at work, it was because I came up with solutions. Continue reading →

How to and why to learn to work smarter with EXCEL

2016-03-25 / 2016-04-24 | Constantin Bosse

How and why Excel

The other day I was asked about my EXCEL skills. I have been using this program since my time in college and I have come to appreciate the benefit of investing some more time in it.

I think that everybody should know a little more about its use and benefits and get some tips to start learning. However, the best way to start is with an example of its potential.

From 10 to 3 minutes

If you could reduce the time you spend doing a task from 10 minutes to 3, would you? I definitely would, and often have had to, to stay sane.

You may think that saving 7 minutes is not much of a deal. However, what if you had to do this 100 times in a row? You would save 700 minutes, almost 12 hours.

Continue reading →

Looking for solutions or ideas? A step by step approach

2016-03-11 / 2016-03-11 | Constantin Bosse

Ideas search

The other day I was talking to a friend who is currently trying to solve a major technical problem with a product. From what he mentioned, this problem had already been challenging others for some years. He did not know where to start due to the complexity of the task. I am no engineer, but at my work I have had to solve complex problems too, so I described my approach to him.

Until today I had never thought about the process I followed since it always appeared to me as an exercise of common sense and keeping an open mind.

An Open Mind

Whatever you try to solve, start out with an open mind. By this I mean starting out with the following assumptions:

Everything can be improved but first it must be understood
You are not the first one who has had this problem
There may already be a different approach that can work
You do not know everything
Sometimes a good solution is better than a perfect solution
There is no such thing as a dumb question

By keeping an open mind and not taking things for granted you are capable of three things: Continue reading →