In my previous article, “Natural Language Processing using NLTK package”, I gave a detailed explanation of the NLTK package and how to use it. In this article, I will give a brief introduction to the spaCy package, which is an advanced NLP library. If you haven’t read my previous articles, I recommend first going through the article on the NLTK package mentioned below and then coming back here for a better understanding:
Natural Language processing using NLTK package
spaCy is a free, open-source library for performing advanced NLP in Python. It is written in Cython and is designed to build information extraction and natural language understanding systems. It is built for production use and provides a concise, user-friendly API.
spaCy ships with several statistical models, which are the power engines of this package. These models enable spaCy to perform NLP tasks such as part-of-speech tagging, named entity recognition, and dependency parsing.
Listed below are the different statistical models present in spaCy, along with their specifications:
en_core_web_sm: a multi-task CNN trained on OntoNotes. Size – 11 MB
en_core_web_md: a multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Size – 91 MB
en_core_web_lg: a multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. Size – 789 MB
How to install spaCy
$ pip install spacy
import spacy
Type the first command in your cmd to install the package. Once spaCy is installed, import it by typing the second command.
Now let’s see how we can load a statistical model in spaCy. Don’t worry, it’s pretty simple.
st_model = spacy.load("en_core_web_sm")
Similarly, you can load any statistical model present in spaCy and use it.
What are spaCy Processing Pipelines?
When working with spaCy, the first step for a text string is to pass it to an NLP object. This object is essentially a pipeline of several text pre-processing operations through which the input text string has to pass.
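The pipeline idea can be sketched in plain Python. This is a simplified analogy of how an NLP object chains pre-processing steps, not spaCy’s actual implementation; the function names are illustrative:

```python
# A simplified analogy: chain text pre-processing steps like a pipeline.
# These functions are illustrative, not spaCy internals.

def lowercase(text):
    return text.lower()

def tokenize(text):
    # Naive whitespace tokenizer, for illustration only
    return text.split()

def remove_punctuation(tokens):
    return [t.strip(".,!?") for t in tokens]

def run_pipeline(text, steps):
    # Feed the output of each step into the next one
    result = text
    for step in steps:
        result = step(result)
    return result

tokens = run_pipeline("spaCy is built for production use.",
                      [lowercase, tokenize, remove_punctuation])
print(tokens)  # ['spacy', 'is', 'built', 'for', 'production', 'use']
```

In spaCy itself, the components of a loaded pipeline can be inspected via the `pipe_names` attribute of the NLP object.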