pospell is a spellchecker for po files containing reStructuredText.
rtobar c4feb4d25f
Adjust raw text extraction from docutils documents (#33)
The previous version of this code relied on the Text.rawsource attribute
to obtain the raw, original version of the translated texts contained in
.po files. This attribute however was removed in docutils 0.18, and thus
a different way of obtaining this information was needed.

(Note that this attribute removal was planned, but not yet for this
release: it is currently not listed in 0.18's list of changes, but under
"Future changes". https://sourceforge.net/p/docutils/bugs/437/ has been
opened to get this clarified eventually.)

The commit that removed the Text.rawsource mentioned that the data fed
into the Text elements was already the raw source, hence there was no
need to keep a separate attribute. Text objects derive from str, so we
can directly add them to the list of strings where NodeToTextVisitor
builds the original text, with the caveat that it needs to have
backslashes restored (they are encoded as null bytes after parsing,
apparently).
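
The backslash restoration described above can be sketched as follows. This is a minimal stand-alone illustration, not pospell's actual code: `restore_backslashes` is a hypothetical helper name, and the null-byte encoding is taken from the description in this commit message.

```python
def restore_backslashes(text: str) -> str:
    """Turn the null bytes left behind by parsing back into backslashes."""
    return text.replace("\x00", "\\")


# A parsed Text node is a str subclass, so a plain str stands in for it here.
parsed = "a \x00*literal\x00* star"
print(restore_backslashes(parsed))
```

Because the helper only uses `str.replace`, it accepts any str subclass, which is what makes adding Text objects directly to a list of strings workable.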

The other side-effect of using the Text objects directly instead of the
Text.rawsource attribute is that now we get more of them. The document
resulting from docutils' parsing can contain system_message elements
with debugging information from the parsing process, such as warnings.
These are Text elements with no rawsource, but with actual text, so we
need to skip them. In the same spirit, citation_references and
substitution_references need to be ignored as well.
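
The skipping logic can be sketched as below. To stay self-contained it does not use docutils: `Node` is a hypothetical stand-in for a parsed document node, and `collect_text` is an illustrative name, not the NodeToTextVisitor mentioned above. Only the three element names come from this commit message.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Element names whose text should not reach the spellchecker,
# as listed in the commit message.
SKIPPED = {"system_message", "citation_reference", "substitution_reference"}


@dataclass
class Node:
    """Hypothetical stand-in for a parsed node: a tag, optional text, children."""

    tag: str
    text: str = ""
    children: list[Node] = field(default_factory=list)


def collect_text(node: Node) -> list[str]:
    """Gather text depth-first, pruning every subtree rooted at a skipped tag."""
    if node.tag in SKIPPED:
        return []
    parts = [node.text] if node.text else []
    for child in node.children:
        parts.extend(collect_text(child))
    return parts


doc = Node("document", children=[
    Node("paragraph", children=[Node("Text", "Hello "), Node("Text", "world")]),
    Node("system_message", children=[Node("Text", "Unknown role warning")]),
])
print("".join(collect_text(doc)))
```

Pruning the whole subtree, rather than filtering individual Text nodes, keeps the parser's own warnings out of the text that gets spellchecked.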

All these changes allow pospell to work against the latest docutils. On
the other hand, the lowest supported version is 0.16: 0.11 through 0.14
failed at rfc role parsing (used for example in the python docs), and
0.15 didn't have a method to restore backslashes (which again made the
python docs fail).

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
2021-11-30 17:57:04 +01:00
.github Rename branch. 2021-11-26 10:38:50 +01:00
tests Tox and github actions. (#24) 2020-11-23 14:26:34 +01:00
.gitignore Git ignore file 2018-07-27 14:57:43 +02:00
.pre-commit-hooks.yaml Add pre-commit hook (#14) 2020-05-22 17:48:57 +02:00
.pylintrc Refactor pospell to use multiprocessing (#32) 2021-11-26 10:26:35 +01:00
CHANGELOG.md Bump to v1.0.12. 2021-04-10 00:12:33 +02:00
pospell.py Adjust raw text extraction from docutils documents (#33) 2021-11-30 17:57:04 +01:00
pyproject.toml Tox and github actions. (#24) 2020-11-23 14:26:34 +01:00
README.md Bump requirements. 2021-10-27 19:12:29 +02:00
setup.cfg Adjust raw text extraction from docutils documents (#33) 2021-11-30 17:57:04 +01:00
setup.py Move from setup.py to setup.cfg. 2020-11-23 12:56:58 +01:00
tox.ini Refactor pospell to use multiprocessing (#32) 2021-11-26 10:26:35 +01:00

pospell

pospell is a spellchecker for po files containing reStructuredText.

Pospell is part of poutils!

Poutils (.po utils) is a metapackage to easily install useful Python tools to use with po files and pospell is a part of it! Go check out Poutils to discover the other tools!

Examples

By giving files to pospell:

$ pospell --language fr about.po
about.po:47:Jr.
about.po:55:reStructuredText
about.po:55:Docutils
about.po:63:Fredrik
about.po:63:Lundh
about.po:75:language
about.po:75:librarie
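
Each reported line above has the shape file:line:word. If you want to post-process the report, a minimal parsing sketch could look like this. `Finding` and `parse_finding` are hypothetical names, and the sketch assumes the file name itself contains no colon.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    """One reported word: where it was found and what it was."""

    path: str
    line: int
    word: str


def parse_finding(report_line: str) -> Finding:
    # Split on the first two colons only, so the word is kept whole.
    path, line, word = report_line.rstrip("\n").split(":", 2)
    return Finding(path, int(line), word)


print(parse_finding("about.po:47:Jr."))
```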

By using a bash expansion (note that we do not put quotes around *.po to let bash do its expansion):

$ pospell --language fr *.po
…

By using a glob pattern (note that we do put quotes around **/*.po to keep your shell from trying to expand it; we'll let Python do the expansion):

$ pospell --language fr --glob '**/*.po'
…
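
To illustrate what a recursive pattern such as `**/*.po` matches when Python expands it, here is a small sketch using `pathlib`. How pospell performs the expansion internally is not shown in this README, so treat the choice of `Path.glob` as an assumption; the directory layout is made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "library").mkdir()
    (root / "about.po").touch()
    (root / "library" / "os.po").touch()
    (root / "README.md").touch()

    # "**" matches the starting directory and every directory below it.
    found = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.po"))

print(found)
```

Both the top-level file and the one in the subdirectory are matched, while the non-po file is ignored.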

Usage

usage: pospell [-h] [-l LANGUAGE] [--glob GLOB] [--debug] [-p PERSONAL_DICT]
               [po_file [po_file ...]]

Check spelling in po files containing restructuredText.

positional arguments:
  po_file               Files to check, can optionally be mixed with --glob,
                        or not, use the one that fit your needs.

optional arguments:
  -h, --help            show this help message and exit
  -l LANGUAGE, --language LANGUAGE
                        Language to check, you'll have to install the
                        corresponding hunspell dictionary, on Debian see apt
                        list 'hunspell-*'.
  --glob GLOB           Provide a glob pattern, to be interpreted by pospell,
                        to find po files, like --glob '**/*.po'.
  --debug
  -p PERSONAL_DICT, --personal-dict PERSONAL_DICT

A personal dict (the -p option) is simply a text file with one word per line.
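
As a sketch of that format, the snippet below writes a personal dict with one word per line. The file name `personal.txt` and the word list are placeholders chosen for the example.

```python
import tempfile
from pathlib import Path

words = ["Docutils", "reStructuredText", "Lundh"]

with tempfile.TemporaryDirectory() as tmp:
    personal_dict = Path(tmp) / "personal.txt"  # hypothetical file name
    # One word per line, each terminated by a newline.
    personal_dict.write_text("".join(f"{w}\n" for w in words), encoding="utf-8")
    content = personal_dict.read_text(encoding="utf-8")

print(content, end="")
```

A file written this way could then be passed with the -p option, for example `pospell -p personal.txt --language fr about.po`.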

Contributing

You can work in a venv. To install the project locally:

python -m pip install .

And to test it locally:

python -m pip install tox
tox -p all