3.2 Visualizing Part of Speech Basics
This post explains the basics of the spaCy library used for NLP.
spaCy offers an outstanding visualizer called displaCy:
import spacy
nlp = spacy.load('en_core_web_sm')
# Import the displaCy library
from spacy import displacy
doc = nlp(u"The quick brown fox jumped over the lazy dog's back.")
displacy.render(doc, style='dep', jupyter=True, options={'distance': 110})
The dependency parse shows the coarse POS tag for each token, as well as the dependency tag if given:
for token in doc:
    print(f'{token.text:{10}} {token.pos_:{7}} {token.dep_:{7}} {spacy.explain(token.dep_)}')
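The nested braces in the f-string above, such as `{token.text:{10}}`, set a minimum column width. Here is a minimal sketch of the same formatting with plain strings standing in for the token attributes. The sample values and the `format_row` helper are illustrative only, not taken from the parse:

```python
# Plain strings stand in for token.text, token.pos_ and token.dep_
# (illustrative values, not real parser output).
rows = [("The", "DET", "det"), ("fox", "NOUN", "nsubj")]

def format_row(text, pos, dep):
    # {text:{10}} is equivalent to {text:10}: pad to a minimum width of 10.
    # Strings are left-aligned by default.
    return f'{text:{10}} {pos:{7}} {dep:{7}}'

lines = [format_row(*row) for row in rows]
for line in lines:
    print(line)
```

Strings shorter than the width are padded with spaces on the right. Longer strings are not truncated, so a long token simply pushes the later columns over.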
If you're using another Python IDE or writing a script, you can choose to have spaCy serve up HTML separately.
Instead of `displacy.render()`, use `displacy.serve()`:
displacy.serve(doc, style='dep', options={'distance': 110})
**After running the cell above, click the link below to view the dependency parse**:
http://127.0.0.1:5000
**To shut down the server and return to Jupyter**, interrupt the kernel through the **Kernel** menu above, by hitting the black square on the toolbar, or by typing the keyboard shortcut `Esc`, `I`, `I`.
**NOTE**: We'll use this method moving forward because, at this time, several of the customizations we want to show don't work well in Jupyter.
`displacy.serve()` accepts a single Doc or a list of Doc objects. Since large texts are difficult to view in one line, you may want to pass a list of spans instead. Each span will appear on its own line:
doc2 = nlp(u"This is a sentence. This is another, possibly longer sentence.")
# Create spans from Doc.sents:
spans = list(doc2.sents)
displacy.serve(spans, style='dep', options={'distance': 110})
Click this link to view the dependency parse: http://127.0.0.1:5000
Interrupt the kernel to return to Jupyter.
Besides setting the distance between tokens, you can pass other arguments to the `options` parameter:
NAME | TYPE | DESCRIPTION | DEFAULT |
---|---|---|---|
`compact` | bool | "Compact mode" with square arrows that takes up less space. | `False` |
`color` | unicode | Text color (HEX, RGB or color names). | `#000000` |
`bg` | unicode | Background color (HEX, RGB or color names). | `#ffffff` |
`font` | unicode | Font name or font family for all text. | `Arial` |
For a full list of options visit https://spacy.io/api/top-level#displacy_options
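Because `compact` is listed as a bool, it is safest to pass `True` or `False` rather than a quoted string. The sketch below is plain Python, independent of spaCy, and shows why: any non-empty string is truthy, including `'False'`. Whether displaCy itself checks the value by truthiness is an assumption here.

```python
# Plain-Python illustration; no spaCy calls are made here.
as_string = {'compact': 'False'}   # a quoted string, not a bool
as_bool = {'compact': False}       # an actual bool

# Any non-empty string is truthy, so a simple `if value:` check
# would treat the string 'False' as "compact mode on".
string_is_truthy = bool(as_string['compact'])
bool_is_truthy = bool(as_bool['compact'])

print(string_is_truthy)
print(bool_is_truthy)
```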
options = {'distance': 110, 'compact': True, 'color': 'yellow', 'bg': '#09a3d5', 'font': 'Times'}
displacy.serve(doc, style='dep', options=options)
Click this link to view the dependency parse: http://127.0.0.1:5000
Interrupt the kernel to return to Jupyter.
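As an alternative to running a server, `displacy.render()` returns the markup as a string when `jupyter=False`, so a script can write it to a file. The sketch below is a hedged example: it uses displaCy's `manual=True` mode with a small hand-written parse so that it runs without a trained model, and the file name `dependency_parse.svg` is an arbitrary choice. With a real `doc` from `nlp(...)`, drop `manual=True` and pass the `doc` instead.

```python
from pathlib import Path

from spacy import displacy

# Hand-written parse in displaCy's manual format (illustrative values,
# not output from a trained model).
parse = {
    "words": [
        {"text": "This", "tag": "PRON"},
        {"text": "is", "tag": "AUX"},
        {"text": "a", "tag": "DET"},
        {"text": "sentence", "tag": "NOUN"},
    ],
    "arcs": [
        {"start": 0, "end": 1, "label": "nsubj", "dir": "left"},
        {"start": 2, "end": 3, "label": "det", "dir": "left"},
        {"start": 1, "end": 3, "label": "attr", "dir": "right"},
    ],
}

options = {'distance': 110, 'compact': True, 'color': 'yellow', 'bg': '#09a3d5', 'font': 'Times'}

# jupyter=False makes render() return the SVG markup instead of displaying it.
svg = displacy.render(parse, style='dep', manual=True, jupyter=False, options=options)

output_path = Path("dependency_parse.svg")  # arbitrary file name
output_path.write_text(svg, encoding="utf-8")
```

The saved file can be opened directly in a browser, so no kernel interrupt is needed afterwards.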
For more info on displaCy visit https://spacy.io/usage/visualizers