In my previous article, “Natural Language Processing using NLTK package”, I gave
a detailed explanation of the NLTK package and how to use it. In this article, I
will give a brief introduction to spaCy, which is an advanced NLP package.
If you haven't read my previous article on the NLTK package, I recommend
going through it first and then coming back to this article for a better
understanding:
Natural
Language processing using NLTK package
Let’s
start…
What is spaCy?
spaCy is a free, open-source library for advanced NLP in Python. It is written
in Python and Cython and is designed for building information extraction
or natural language understanding systems. It is built for production use and
provides a concise and user-friendly API.
spaCy’s Statistical Models
spaCy ships with several statistical models, which are the engines that power
the package. These models enable spaCy to perform several NLP-related tasks,
such as part-of-speech tagging, named entity recognition, and dependency parsing.
Listed below are the English statistical models available in spaCy, along with
their specifications:
en_core_web_sm: English multi-task CNN trained on OntoNotes. Size – 11 MB
en_core_web_md: English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Size – 91 MB
en_core_web_lg: English multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Size – 789 MB
Let’s see how to install spaCy.
$ pip install spacy
import spacy
Type the first command in your command prompt. Once the spaCy package is
installed, import it in Python by typing the second command.
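Before moving on, it can be useful to confirm that the installation actually worked. The snippet below is a minimal sketch using only the Python standard library; the helper name spacy_version is hypothetical and chosen for illustration. It reports the installed spaCy version, or tells you that the package is missing.

```python
import importlib.util
from importlib import metadata


def spacy_version():
    """Return the installed spaCy version string, or None if spaCy is missing.

    Hypothetical helper for illustration; it is not part of spaCy.
    """
    # find_spec returns None when the package cannot be imported
    if importlib.util.find_spec("spacy") is None:
        return None
    try:
        return metadata.version("spacy")
    except metadata.PackageNotFoundError:
        return None


version = spacy_version()
if version is None:
    print("spaCy is not installed; run: pip install spacy")
else:
    print("spaCy", version, "is installed")
    # The statistical models are separate packages and are installed with,
    # for example: python -m spacy download en_core_web_sm
```

Note that pip install spacy installs only the library. A model such as en_core_web_sm has to be downloaded separately before it can be loaded.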
Let’s see how we can load a statistical model in spaCy. Don’t worry, it’s pretty simple.
import spacy
st_model = spacy.load("en_core_web_sm")
Similarly, you can load and use any other statistical model available in spaCy,
as long as it is installed.
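To make the loading step more concrete, here is a small sketch of what you might do with a loaded model. It assumes en_core_web_sm has been downloaded; if it has not, spacy.load raises an OSError, and the sketch falls back to a blank English pipeline that only tokenizes (so the tags and entities come back empty). The helper name load_english_model and the sample sentence are my own, chosen for illustration.

```python
import spacy


def load_english_model(name="en_core_web_sm"):
    """Load a packaged model, or a blank English pipeline if it is missing.

    Hypothetical helper for illustration; it is not part of spaCy.
    """
    try:
        return spacy.load(name)
    except OSError:
        # Raised when the model package has not been installed
        return spacy.blank("en")


nlp = load_english_model()
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

# Part-of-speech tag and dependency label for every token
# (empty strings when the blank fallback pipeline is used)
for token in doc:
    print(token.text, token.pos_, token.dep_)

# Named entities found in the text (none with the blank fallback)
for ent in doc.ents:
    print(ent.text, ent.label_)
```

The same pattern should work with en_core_web_md or en_core_web_lg: only the name passed to spacy.load changes.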
What are spaCy Processing Pipelines?
When working with spaCy, the first step for a text string is to pass it to an
NLP object. This object is essentially a pipeline of several text
pre-processing operations that the input text string has to go through.
Figure: S
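As a rough sketch of the idea, the code below inspects the components of an NLP object and then passes a string through it. It assumes en_core_web_sm is installed and otherwise falls back to a blank English pipeline, which has a tokenizer but no further components. The exact component names depend on the spaCy version and on the model, so treat the printed list as illustrative.

```python
import spacy

try:
    nlp = spacy.load("en_core_web_sm")
except OSError:
    # Model not installed: a blank pipeline has a tokenizer only
    nlp = spacy.blank("en")

# Names of the components applied, in order, after tokenization
print(nlp.pipe_names)

# nlp.pipeline holds (name, component) pairs for the same components
for name, component in nlp.pipeline:
    print(name, type(component).__name__)

# Calling the NLP object runs the tokenizer and then every component,
# returning a processed Doc
doc = nlp("spaCy processes text in stages.")
print([token.text for token in doc])
```

The tokenizer always runs first and is not listed in pipe_names; the listed components then add their annotations to the Doc one after another.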