# Speeding up “suggest” with python-Levenshtein (DEPRECATED)

sphobjinv uses `fuzzywuzzy`

for fuzzy-match searching of object
names/domains/roles as part of the
`Inventory.suggest()`

functionality,
also implemented as the CLI suggest subcommand.

`fuzzywuzzy`

uses `difflib.SequenceMatcher`

from the Python standard library for its fuzzy searching.
While earlier versions of sphobjinv were able to make use of
`fuzzywuzzy`

‘s optional link to `python-Levenshtein`

,
a Python C extension providing similar functionality,
due to a licensing conflict this is no longer possible.
sphobjinv now uses a vendored copy of `fuzzywuzzy`

from an
era when it was released under the MIT License.

Formally:

Removed in version 2.2: Acceleration of the sphobjinv “suggest” mode via `python-Levenshtein`

has been deprecated and is no longer available.

The discussion of performance benchmarks and variations in matching behavior is kept below for historical interest.

## Performance Benchmark

The chart below presents one dataset illustrating the performance enhancement
that can be obtained by installing `python-Levenshtein`

.
The timings plotted here are from execution of
`timeit.repeat()`

around a
`suggest()`

call,
searching for the term “function”, for a number of
objects.inv files from different projects (see
here).

The timings were collected using the following code:

```
import sphobjinv as soi
durations = {}
obj_counts = {}
for fn in os.listdir():
if fn.endswith('.inv'):
inv = soi.Inventory(fn)
# Execute the 'suggest' operation 20 times for each file
timings = timeit.repeat("inv.suggest('function')", repeat=20, number=1, globals=globals())
# Store the average timing for each file
durations.update({fn: sum(timings) / len(timings)})
# Store the number of objects
obj_counts.update({fn: inv.count})
```

As can be seen, the fifty-two objects.inv files in this dataset contain widely varying numbers of objects \(n\):

Unsurprisingly, larger inventories require more time to search.
Also relatively unsurprisingly, the time required appears to be
roughly \(O(n)\), since the fuzzy search must be performed once on the
`as_rst`

representation of each object.

For this specific search, using `python-Levenshtein`

instead of
`difflib`

decreases the time required from 0.90 seconds per thousand objects
down to 0.15 seconds per thousand objects,
representing a performance improvement of almost exactly six-fold.
Other searches will likely exhibit somewhat better or worse
improvement from the use of `python-Levenshtein`

,
depending on the average length of the reST-like representations
of the objects in an objects.inv
and the length of the search term.

## Variations in Matching Behavior

Note that the matching scores calculated by
`difflib`

and `python-Levenshtein`

can often
differ appreciably. (This is illustrated in
issue #128
of the `fuzzywuzzy`

GitHub repo.)
This difference in behavior doesn’t have much practical significance,
save for the potential of causing some confusion between users with/without
`python-Levenshtein`

installed.

As an example, the following shows an excerpt of the results of a representative
CLI suggest call **without**
`python-Levenshtein`

:

```
$ sphobjinv suggest objects_scipy.inv surface -asit 40
Name Score Index
------------------------------------------------------ ------- -------
:py:function:`scipy.misc.face` 64 1018
:py:function:`scipy.misc.source` 64 1032
:std:doc:`generated/scipy.misc.face` 64 4042
:std:doc:`generated/scipy.misc.source` 64 4056
:py:data:`scipy.stats.rice` 56 2688
:std:label:`continuous-rice` 56 2896
:py:method:`scipy.integrate.complex_ode.successful` 51 156
:py:method:`scipy.integrate.ode.successful` 51 171
:py:function:`scipy.linalg.lu_factor` 51 967
... more with score 51 ...
:py:attribute:`scipy.LowLevelCallable.signature` 50 5
:py:function:`scipy.constants.convert_temperature` 50 53
:py:function:`scipy.integrate.quadrature` 50 176
... more with score 50 and below ...
```

This is a similar excerpt **with** `python-Levenshtein`

:

```
Name Score Index
------------------------------------------------------ ------- -------
:py:function:`scipy.misc.face` 64 1018
:py:function:`scipy.misc.source` 64 1032
:std:doc:`generated/scipy.misc.face` 64 4042
:std:doc:`generated/scipy.misc.source` 64 4056
:py:method:`scipy.integrate.ode.successful` 51 171
:py:function:`scipy.linalg.lu_factor` 51 967
:py:function:`scipy.linalg.subspace_angles` 51 1003
... more with score 51 ...
:py:function:`scipy.cluster.hierarchy.fcluster` 49 23
:py:function:`scipy.cluster.hierarchy.fclusterdata` 49 24
:py:method:`scipy.integrate.complex_ode.successful` 49 156
... more with score 49 and below ...
```