Compare commits

...

28 Commits

Author SHA1 Message Date
333540f9a8 Make explicit that French is the default language used
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2023-11-20 11:09:20 +00:00
87d1a3e26f
Missing hunspell in CI.
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2023-11-20 12:02:55 +01:00
f1a9ae321f
Hello woodpecker.
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2023-11-20 12:00:45 +01:00
c26878af0b
Bump min Python version to 3.7.
Because I no longer have a 3.6 on my machine to test it.
2023-11-20 11:58:57 +01:00
b164f089d6
Explicitly fail if dict is missing. 2023-11-20 11:58:43 +01:00
8b753bde26
FIX: Discrepancy between docutils rst and sphinx rst
The :rfc: role doesn't allow aliases in the docutils implementation.

See: https://sourceforge.net/p/docutils/feature-requests/75/
2023-07-21 09:08:12 +02:00
a626a2f3fb
Bump Python versions used in tests. 2023-07-19 10:46:15 +02:00
07d854dcec
docutils is migrating to argparse. 2023-07-19 10:46:04 +02:00
33eb8f7f7d
Move to pyproject.toml 2023-07-18 15:04:47 +02:00
cf6c1c8919
Don't run hunspell on obsolete values. 2023-04-10 16:43:11 +02:00
d8a2e20e7e Fix typo in README 2023-03-08 08:48:29 +01:00
rtobar
c4feb4d25f
Adjust raw text extraction from docutils documents (#33)
The previous version of this code relied on the Text.rawsource attribute
to obtain the raw, original version of the translated texts contained in
.po files. This attribute however was removed in docutils 0.18, and thus
a different way of obtaining this information was needed.

(Note that this attribute removal was planned, but not for this release
yet: it's not listed in 0.18's list of changes, but under
"Future changes". https://sourceforge.net/p/docutils/bugs/437/ has been
opened to get this eventually clarified.)

The commit that removed Text.rawsource mentioned that the data fed
into the Text elements was already the raw source, hence there was no
need to keep a separate attribute. Text objects derive from str, so we
can directly add them to the list of strings where NodeToTextVisitor
builds the original text, with the caveat that it needs to have
backslashes restored (they are encoded as null bytes after parsing,
apparently).

The other side effect of using the Text objects directly instead of the
Text.rawsource attribute is that now we get more of them. The document
resulting from docutils' parsing can contain system_message elements
with debugging information from the parsing process, such as warnings.
These are Text elements with no rawsource, but with actual text, so we
need to skip them. In the same spirit, citation_references and
substitution_references need to be ignored as well.

All these changes allow pospell to work against the latest docutils. On
the other hand, the lowest supported version is 0.16: 0.11 through 0.14
failed at rfc role parsing (used for example in the python docs), and
0.15 didn't have a method to restore backslashes (which again made the
python docs fail).

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
2021-11-30 17:57:04 +01:00
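The null-byte detail in this commit message can be sketched in isolation: after docutils parsing, an escaping backslash survives as a null byte, so a visitor that collects `Text` nodes directly (they derive from `str`) has to restore them. This is a minimal stand-in for what `docutils.nodes.unescape(..., restore_backslashes=True)` does, not the real docutils code:

```python
def restore_backslashes(text):
    # docutils encodes an escaping backslash as a null byte during
    # parsing; turning "\x00" back into "\\" recovers the source text.
    return text.replace("\x00", "\\")

def collect_text(nodes):
    # Text nodes derive from str, so they can be collected directly,
    # with backslashes restored, and joined like NodeToTextVisitor does.
    # Empty strings stand in for the rawsource-less texts we skip.
    return " ".join(restore_backslashes(node) for node in nodes if node)

print(collect_text(["escaped \x00* star", "", "plain text"]))
```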
2844284bb7
Rename branch. 2021-11-26 10:38:50 +01:00
204417d00a
Bump to v1.1. 2021-11-26 10:36:49 +01:00
rtobar
caf1412f49
Allow using only --glob without further po_files (#31)
At the moment pospell complains if invoked with a --glob pattern but
without any other po_files in the command line. This is a problem only
with the check, as the code is ready to handle the situation. To bypass
this problem, one *needs* to pass a po_file in the command-line as well,
even if the glob pattern contains it.

This commit adjusts the condition that checks that input files have been
somehow specified to consider --glob as a source of input files.

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
2021-11-26 10:27:05 +01:00
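The adjusted condition is easiest to see in a stripped-down stand-in for pospell's `parse_args` (only the three "sources of input files" relevant to this fix are kept; the real parser has many more options):

```python
import argparse

def parse_args(argv):
    parser = argparse.ArgumentParser(prog="pospell")
    parser.add_argument("--glob", type=str, default="")
    parser.add_argument("--modified", "-m", action="store_true")
    parser.add_argument("po_file", nargs="*")
    args = parser.parse_args(argv)
    # The fix: --glob now also counts as a way of specifying input
    # files, so `pospell --glob '**/*.po'` alone is accepted.
    if not args.po_file and not args.modified and not args.glob:
        parser.error("no input files given")
    return args
```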
rtobar
3553ecd726
Refactor pospell to use multiprocessing (#32)
One of the main drawbacks of pospell at the moment is that checking is
performed serially by a single hunspell process. In small projects this
is not noticeable, but in slightly bigger ones this can go up a bit
(e.g., in python-docs-es it takes ~2 minutes to check the whole set of
.po files).

The obvious solution to speed things up is to use multiprocessing,
parallelising the process at two different places: first, when reading
the input .po files and collecting the input strings to feed into
hunspell, and secondly when running hunspell itself.

This commit implements this support. It works as follows:

 * A new namedtuple called input_line has been added. It contains a
   filename, a line, and text, and thus it uniquely identifies an input
   line in a self-contained way.
 * When collecting input to feed into hunspell, the po_to_text routine
   collects input_lines instead of a simple string. This is done with a
   multiprocessing Pool to run in parallel across all input files.
 * The input_lines are split in N blocks, with N being the size of the
   pool. Note that during this process input_lines from different files
   might end up in the same block, and input_lines from the same file
   might end up in different blocks; however since input_lines are
   self-contained we are not losing information.
 * N hunspell instances are run over the N blocks of input_lines using
   the pool (only the text field from the input_lines is fed into
   hunspell).
 * When interpreting errors from hunspell we can match an input_line
   with its corresponding hunspell output lines, and thus can identify
   the original file:line that caused the error.

The multiprocessing pool is sized via a new -j/--jobs command line
option, which defaults to os.cpu_count() to run at maximum speed by
default.

These are the kind of differences I see with python-docs-es in my
machine, so YMMV depending on your setup/project:

$> time pospell -p dict2.txt -l es_ES */*.po -j 1
real    2m1.859s
user    2m6.680s
sys     0m3.829s

$> time pospell -p dict2.txt -l es_ES */*.po -j 2
real    1m10.322s
user    2m18.210s
sys     0m3.559s

Finally, these changes had some minor effects on the tooling around
testing. Pylint complained about there being too many arguments now in
check_spell, so pylint's max-args setting has been adjusted as
discussed. Separately, coverage information now needs to be collected
for sub-processes of the test main process; this is automatically done
by the pytest-cov plug-in, so I've switched tox to use that rather than
the more manual running of pytest under coverage (which would otherwise
require some extra setup to account for subprocesses).
2021-11-26 10:26:35 +01:00
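The block-splitting step from the list above is small enough to show on its own; this sketch mirrors the chunking in the patch (the `input_line` namedtuple and the ceiling division are taken from it):

```python
from collections import namedtuple

# Self-contained identifier for one checkable line, as described above.
input_line = namedtuple("input_line", "filename line text")

def chunk(input_lines, jobs):
    # Ceiling division sizes the blocks so at most `jobs` are produced;
    # lines from different files may share a block, which is fine since
    # each input_line carries its own filename and line number.
    lines_per_job = (len(input_lines) + jobs - 1) // jobs
    return [
        input_lines[i : i + lines_per_job]
        for i in range(0, len(input_lines), lines_per_job)
    ]
```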
8b0d6d8778
Bump requirements.
It's hard to get a frozen set of dependencies working in all tested
versions, so I unpin them from tox.
2021-10-27 19:12:29 +02:00
cafe8f8630
Bump dev requirements. 2021-10-27 17:33:34 +02:00
6c8779826a
Pleases pylint and mypy. 2021-10-27 17:24:27 +02:00
1525acad68
docutils dropped the 'rawsource' attribute in 0.18. 2021-10-27 17:16:02 +02:00
Álvaro Mondéjar
23eb401584
Ignore deleted files using '--modified' option (#28) 2021-04-13 18:33:44 +02:00
d7468aacb1 Bump to v1.0.12. 2021-04-10 00:12:33 +02:00
Álvaro Mondéjar
d81c49221e
Add 'line_length_limit' configuration field to 'docutils.frontend.Values'. (#26) 2021-04-10 00:07:12 +02:00
3e4bb50687
Tox and github actions. (#24) 2020-11-23 14:26:34 +01:00
f7b61e04d0 Move from setup.py to setup.cfg. 2020-11-23 12:56:58 +01:00
Jules Lasne (jlasne)
a42de31a88
Added poutils section to README (#23) 2020-10-14 18:24:46 +02:00
48c9a75b68 Bump version: 1.0.10 → 1.0.11 2020-10-14 00:57:39 +02:00
bdf4a08c5b Handle file opening errors. Closes #18.
Co-authored-by: Christophe Nanteuil <christophe.nanteuil@gmail.com>
2020-10-14 00:56:44 +02:00
12 changed files with 424 additions and 198 deletions

1
.github/FUNDING.yml vendored

@ -1 +0,0 @@
github: JulienPalard

2
.pylintrc Normal file

@ -0,0 +1,2 @@
[DESIGN]
max-args = 6

10
.woodpecker.yml Normal file

@ -0,0 +1,10 @@
---
pipeline:
test:
image: python
commands:
- apt-get update
- apt-get install -y hunspell
- python3 -m pip install tox
- tox run


@ -4,38 +4,60 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
## [1.0.12] - 2021-04-10
### Fixed
- Support for docutils 0.17 thanks to mondeja and xi.
## [Unreleased]
## [1.0.11] - 2020-10-14
### Fixed
- Better handling of FileNotFound, PermissionDenied, or IsADirectory errors.
## [1.0.10] - 2020-10-14
### Fixed
- Use `^` escape char on each line while invoking hunspell, preventing
it from treating some lines as comments.
## [1.0.9] - 2020-10-12
### Changed
- pospell now uses `hunspell -a` (was using `hunspell -l`), so
hunspell can tell on which line an error is, instead of having
pospell (wrongly) guess it.
## [1.0.8] - 2020-10-12
### Fixed
- Missing Sphinx option in hardcoded settings from 1.0.7.
## [1.0.7] - 2020-10-11
### Changed
- Hunspell is invoked a single time.
- Avoid calling docutils.frontend.OptionParser, hardcode settings, saving lots of time.
- pospell is now twice faster on python-docs-fr.
## [1.0.6] - 2020-10-11
### Fixed
- Hunspell compounding mishandling caused some errors to be hidden by pospell.
## [1.0.5] - 2020-07-01
### Fixed
- Some errors were not reported due to [Hunspell not reporting them in
Auto mode](https://github.com/hunspell/hunspell/issues/655).
## [1.0.4] - 2020-06-28
### Fixed
- Avoid glueing words together: "hello - world" was sent to hunspell as "helloworld".
- Don't pass placeholders like %s, %(foo)s, or {foo} to Hunspell.
- Don't pass Sphinx variables with underscores in them to Hunspell, like {days_since}.
## [1.0.3] - 2019-10-17
### Changed
- [Soft hyphens](https://en.wikipedia.org/wiki/Soft_hyphen) are now removed.
## [1.0.2] - 2019-10-16
### Fixed
- In POSIX.1, also drop the .1.
## [1.0.1] - 2019-10-16
### Fixed
- Drop prefixes while dropping acronyms, as in `non-HTTP`.
- Regression fixed while dropping plural form of acronyms like `PEPs`.
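The 1.0.10 entry above refers to hunspell's pipe interface: `hunspell -a` treats certain leading characters on a data line as commands, so every line gets an uparrow prefix. A sketch of that quoting, following the patch's `quote_for_hunspell`:

```python
def quote_for_hunspell(lines):
    # The hunspell manpage recommends prefixing every data line with
    # "^" so it is never misread as a pipe-interface command; empty
    # lines stay empty to preserve line numbering.
    return "\n".join("^" + line if line else "" for line in lines)

print(quote_for_hunspell(["bonjour", "", "monde"]))
```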


@ -1,6 +1,12 @@
# pospell
`pospell` is a spellcheckers for po files containing reStructuedText.
`pospell` is a spellcheckers for po files containing reStructuredText.
## Pospell is part of poutils!
[Poutils](https://pypi.org/project/poutils) (`.po` utils) is a metapackage to easily install useful Python tools to use with po files
and `pospell` is a part of it! Go check out [Poutils](https://pypi.org/project/poutils) to discover the other tools!
## Examples
@ -34,6 +40,7 @@ $ pospell --language fr --glob '**/*.po'
```
## Usage
```
@ -60,3 +67,19 @@ optional arguments:
A personal dict (the `-p` option) is simply a text file with one word
per line.
## Contributing
You can work in a venv, to install the project locally:
```bash
python -m pip install .
```
And to test it locally:
```bash
python -m pip install tox
tox -p all
```


@ -1,44 +1,65 @@
"""pospell is a spellcheckers for po files containing reStructuedText.
"""
"""pospell is a spellcheckers for po files containing reStructuedText."""
import collections
import functools
import io
from string import digits
from unicodedata import category
import logging
import multiprocessing
import os
import subprocess
import sys
from contextlib import redirect_stderr
from itertools import chain
from pathlib import Path
from shutil import which
from string import digits
from typing import List, Tuple
from unicodedata import category
import docutils.frontend
import docutils.nodes
import docutils.parsers.rst
import polib
import regex
from docutils.parsers.rst import roles
from docutils.utils import new_document
from sphinxlint import rst
import regex
__version__ = "1.0.10"
__version__ = "1.3"
DEFAULT_DROP_CAPITALIZED = {"fr": True, "fr_FR": True}
Error = Tuple[str, int, str]
input_line = collections.namedtuple("input_line", "filename line text")
class POSpellException(Exception):
"""All exceptions from this module inherit from this one."""
class Unreachable(POSpellException):
"""The code encontered a state that should be unreachable."""
try:
HUNSPELL_VERSION = subprocess.check_output(
["hunspell", "--version"], universal_newlines=True
).split("\n")[0]
).split("\n", maxsplit=1)[0]
except FileNotFoundError:
print("hunspell not found, please install hunspell.", file=sys.stderr)
sys.exit(1)
class DummyNodeClass(docutils.nodes.Inline, docutils.nodes.TextElement):
pass
"""Used to represent any unknown roles, so we can parse any rst blindly."""
def monkey_patch_role(role):
"""Patch docutils.parsers.rst.roles.role so it always match.
Giving a DummyNodeClass for unknown roles.
"""
def role_or_generic(role_name, language_module, lineno, reporter):
base_role, message = role(role_name, language_module, lineno, reporter)
if base_role is None:
@ -53,82 +74,78 @@ roles.role = monkey_patch_role(roles.role)
class NodeToTextVisitor(docutils.nodes.NodeVisitor):
"""Recursively convert a docutils node to a Python string.
Usage:
>>> visitor = NodeToTextVisitor(document)
>>> document.walk(visitor)
>>> print(str(visitor))
It ignores (see IGNORE_LIST) some nodes, which we don't want in
hunspell (enphasis typically contain proper names that are unknown
to dictionaires).
"""
IGNORE_LIST = (
"emphasis",
"superscript",
"title_reference",
"substitution_reference",
"citation_reference",
"strong",
"DummyNodeClass",
"reference",
"literal",
"Text",
"system_message",
)
def __init__(self, document):
"""Initialize visitor for the given node/document."""
self.output = []
self.depth = 0
super().__init__(document)
def dispatch_visit(self, node):
self.depth += 1
super().dispatch_visit(node)
def dispatch_departure(self, node):
self.depth -= 1
super().dispatch_departure(node)
def unknown_visit(self, node):
"""Mandatory implementation to visit unknwon nodes."""
# print(" " * self.depth * 4, node.__class__.__name__, ":", node)
def unknown_departure(self, node):
"""To help debugging tree."""
# print(node, repr(node), node.__class__.__name__)
@staticmethod
def ignore(node):
"""Just raise SkipChildren.
def visit_emphasis(self, node):
Used for all visit_* in the IGNORE_LIST.
See __getattr__.
"""
raise docutils.nodes.SkipChildren
def visit_superscript(self, node):
raise docutils.nodes.SkipChildren
def visit_title_reference(self, node):
raise docutils.nodes.SkipChildren
def visit_strong(self, node):
raise docutils.nodes.SkipChildren
def visit_DummyNodeClass(self, node):
raise docutils.nodes.SkipChildren
def visit_reference(self, node):
raise docutils.nodes.SkipChildren
def visit_literal(self, node):
raise docutils.nodes.SkipChildren
def __getattr__(self, name):
"""Skip childrens from the IGNORE_LIST."""
if name.startswith("visit_") and name[6:] in self.IGNORE_LIST:
return self.ignore
raise AttributeError(name)
def visit_Text(self, node):
self.output.append(node.rawsource)
"""Keep this node text, this is typically what we want to spell check."""
self.output.append(docutils.nodes.unescape(node, restore_backslashes=True))
def __str__(self):
"""Give the accumulated strings."""
return " ".join(self.output)
def strip_rst(line):
"""Transform reStructuredText to plain text."""
if line.endswith("::"):
# Drop :: at the end, it would cause Literal block expected
line = line[:-2]
line = rst.NORMAL_ROLE_RE.sub("", line)
settings = docutils.frontend.get_default_settings()
settings.pep_references = None
settings.rfc_references = None
settings.pep_base_url = "http://www.python.org/dev/peps/"
settings.pep_file_url_template = "pep-%04d"
parser = docutils.parsers.rst.Parser()
settings = docutils.frontend.Values(
{
"report_level": 2,
"halt_level": 4,
"exit_status_level": 5,
"debug": None,
"warning_stream": None,
"error_encoding": "utf-8",
"error_encoding_error_handler": "backslashreplace",
"language_code": "en",
"id_prefix": "",
"auto_id_prefix": "id",
"pep_references": None,
"pep_base_url": "http://www.python.org/dev/peps/",
"pep_file_url_template": "pep-%04d",
"rfc_references": None,
"rfc_base_url": "http://tools.ietf.org/html/",
"tab_width": 8,
"trim_footnote_reference_space": None,
"syntax_highlight": "long",
}
)
stderr_stringio = io.StringIO()
with redirect_stderr(stderr_stringio):
document = new_document("<rst-doc>", settings=settings)
@ -171,34 +188,48 @@ def clear(line, drop_capitalized=False, po_path=""):
def quote_for_hunspell(text):
"""
"""Quote a paragraph so hunspell don't misinterpret it.
Quoting the manpage:
It is recommended that programmatic interfaces prefix
every data line with an uparrow to protect themselves
against future changes in hunspell."""
against future changes in hunspell.
"""
out = []
for line in text.split("\n"):
for line in text:
out.append("^" + line if line else "")
return "\n".join(out)
def po_to_text(po_path, drop_capitalized=False):
"""Converts a po file to a text file, by stripping the msgids and all
po syntax, but by keeping the kept lines at their same position /
line number.
"""Convert a po file to a text file.
This strips the msgids and all po syntax while keeping lines at
their same position / line number.
"""
buffer = []
input_lines = []
lines = 0
entries = polib.pofile(po_path)
try:
entries = polib.pofile(Path(po_path).read_text(encoding="UTF-8"))
except Exception as err:
raise POSpellException(str(err)) from err
for entry in entries:
if entry.msgid == entry.msgstr:
continue
if entry.obsolete:
continue
while lines < entry.linenum:
buffer.append("")
lines += 1
buffer.append(clear(strip_rst(entry.msgstr), drop_capitalized, po_path=po_path))
input_lines.append(input_line(po_path, lines, ""))
lines += 1
return "\n".join(buffer)
input_lines.append(
input_line(
po_path,
lines,
clear(strip_rst(entry.msgstr), drop_capitalized, po_path=po_path),
)
)
return input_lines
def parse_args():
@ -214,7 +245,7 @@ def parse_args():
type=str,
default="fr",
help="Language to check, you'll have to install the corresponding "
"hunspell dictionary, on Debian see apt list 'hunspell-*'.",
"hunspell dictionary, on Debian see apt list 'hunspell-*' (defaults to 'fr').",
)
parser.add_argument(
"--glob",
@ -225,12 +256,14 @@ def parse_args():
parser.add_argument(
"--drop-capitalized",
action="store_true",
help="Always drop capitalized words in sentences (defaults according to the language).",
help="Always drop capitalized words in sentences"
" (defaults according to the language).",
)
parser.add_argument(
"--no-drop-capitalized",
action="store_true",
help="Never drop capitalized words in sentences (defaults according to the language).",
help="Never drop capitalized words in sentences"
" (defaults according to the language).",
)
parser.add_argument(
"po_file",
@ -252,23 +285,35 @@ def parse_args():
version="%(prog)s " + __version__ + " using hunspell: " + HUNSPELL_VERSION,
)
parser.add_argument("--debug", action="store_true")
parser.add_argument("-p", "--personal-dict", type=str)
parser.add_argument("-p", "--personal-dict", type=Path)
parser.add_argument(
"--modified", "-m", action="store_true", help="Use git to find modified files."
)
parser.add_argument(
"-j",
"--jobs",
type=int,
default=os.cpu_count(),
help="Number of files to check in paralel, defaults to all available CPUs",
)
args = parser.parse_args()
if args.personal_dict is not None and not args.personal_dict.exists():
print(f"Error: dictionary {str(args.personal_dict)!r} not found.")
sys.exit(1)
if args.drop_capitalized and args.no_drop_capitalized:
print("Error: don't provide both --drop-capitalized AND --no-drop-capitalized.")
parser.print_help()
sys.exit(1)
if not args.po_file and not args.modified:
if not args.po_file and not args.modified and not args.glob:
parser.print_help()
sys.exit(1)
return args
def look_like_a_word(word):
"""Used to filter out non-words like `---` or `-0700` so they don't
"""Return True if the given str looks like a word.
Used to filter out non-words like `---` or `-0700` so they don't
get reported. They typically are not errors.
"""
if not word:
@ -282,64 +327,99 @@ def look_like_a_word(word):
return True
def run_hunspell(language, personal_dict, input_lines) -> List[Error]:
"""Run hunspell over the given input lines."""
personal_dict_arg = ["-p", personal_dict] if personal_dict else []
try:
output = subprocess.check_output(
["hunspell", "-d", language, "-a"] + personal_dict_arg,
universal_newlines=True,
input=quote_for_hunspell(text for _, _, text in input_lines),
)
except subprocess.CalledProcessError:
return []
return parse_hunspell_output(input_lines, output.splitlines())
def flatten(list_of_lists):
"""[[a,b,c], [d,e,f]] -> [a,b,c,d,e,f]."""
return [element for a_list in list_of_lists for element in a_list]
def spell_check(
po_files,
personal_dict=None,
language="en_US",
drop_capitalized=False,
debug_only=False,
jobs=os.cpu_count(),
):
"""Check for spelling mistakes in the files po_files (po format,
containing restructuredtext), for the given language.
"""Check for spelling mistakes in the given po_files.
(po format, containing restructuredtext), for the given language.
personal_dict allow to pass a personal dict (-p) option, to hunspell.
Debug only will show what's passed to Hunspell instead of passing it.
"""
errors = []
personal_dict_arg = ["-p", personal_dict] if personal_dict else []
texts_for_hunspell = {}
for po_file in po_files:
if debug_only:
print(po_to_text(str(po_file), drop_capitalized))
continue
texts_for_hunspell[po_file] = po_to_text(str(po_file), drop_capitalized)
# Pool.__exit__ calls terminate() instead of close(), we need the latter,
# which ensures the processes' atexit handlers execute fully, which in
# turn lets coverage write the sub-processes' coverage information
pool = multiprocessing.Pool(jobs) # pylint: disable=consider-using-with
try:
output = subprocess.run(
["hunspell", "-d", language, "-a"] + personal_dict_arg,
universal_newlines=True,
input=quote_for_hunspell("\n".join(texts_for_hunspell.values())),
stdout=subprocess.PIPE,
input_lines = flatten(
pool.map(
functools.partial(po_to_text, drop_capitalized=drop_capitalized),
po_files,
)
)
except subprocess.CalledProcessError:
return -1
if debug_only:
for filename, line, text in input_lines:
print(filename, line, text, sep=":")
return 0
if not input_lines:
return 0
errors = 0
checked_files = iter(texts_for_hunspell.items())
checked_file_name, checked_text = next(checked_files)
checked_lines = iter(checked_text.split("\n"))
currently_checked_line = next(checked_lines)
current_line_number = 1
for line in output.stdout.split("\n")[1:]:
if not line:
# Distribute input lines across workers
lines_per_job = (len(input_lines) + jobs - 1) // jobs
chunked_inputs = [
input_lines[i : i + lines_per_job]
for i in range(0, len(input_lines), lines_per_job)
]
errors = flatten(
pool.map(
functools.partial(run_hunspell, language, personal_dict),
chunked_inputs,
)
)
finally:
pool.close()
pool.join()
for error in errors:
print(*error, sep=":")
return len(errors)
def parse_hunspell_output(inputs, outputs) -> List[Error]:
"""Parse `hunspell -a` output and collect all errors."""
# skip first line of hunspell output (it's the banner)
outputs = iter(outputs[1:])
errors = []
for po_input_line, output_line in zip(inputs, outputs):
if not po_input_line.text:
continue
while output_line:
if output_line.startswith("&"):
_, original, *_ = output_line.split()
if look_like_a_word(original):
errors.append(
(po_input_line.filename, po_input_line.line, original)
)
try:
currently_checked_line = next(checked_lines)
current_line_number += 1
output_line = next(outputs)
except StopIteration:
try:
checked_file_name, checked_text = next(checked_files)
checked_lines = iter(checked_text.split("\n"))
currently_checked_line = next(checked_lines)
current_line_number = 1
except StopIteration:
return errors
continue
if line == "*": # OK
continue
if line[0] == "&":
_, original, count, offset, *miss = line.split()
if look_like_a_word(original):
print(checked_file_name, current_line_number, original, sep=":")
errors += 1
break
return errors
def gracefull_handling_of_missing_dicts(language):
@ -360,24 +440,22 @@ def gracefull_handling_of_missing_dicts(language):
)
if which("apt"):
error("Maybe try something like:")
error(" sudo apt install hunspell-{}".format(language))
error(f" sudo apt install hunspell-{language}")
else:
error(
"""I don't know your environment, but I bet the package name looks like:
f"""I don't know your environment, but I bet the package name looks like:
hunspell-{language}
If you find it, please tell me (by opening an issue or a PR on
https://github.com/JulienPalard/pospell/) so I can enhance this error message.
""".format(
language=language
)
"""
)
sys.exit(1)
def main():
"""Module entry point."""
"""Entry point (for command-line)."""
args = parse_args()
logging.basicConfig(level=50 - 10 * args.verbose)
default_drop_capitalized = DEFAULT_DROP_CAPITALIZED.get(args.language, False)
@ -392,7 +470,7 @@ def main():
)
if args.modified:
git_status = subprocess.check_output(
["git", "status", "--porcelain"], encoding="utf-8"
["git", "status", "--porcelain", "--no-renames"], encoding="utf-8"
)
git_status_lines = [
line.split(maxsplit=2) for line in git_status.split("\n") if line
@ -400,11 +478,20 @@ def main():
args.po_file.extend(
Path(filename)
for status, filename in git_status_lines
if filename.endswith(".po")
if filename.endswith(".po") and status != "D"
)
errors = spell_check(
args.po_file, args.personal_dict, args.language, drop_capitalized, args.debug
)
try:
errors = spell_check(
args.po_file,
args.personal_dict,
args.language,
drop_capitalized,
args.debug,
args.jobs,
)
except POSpellException as err:
print(err, file=sys.stderr)
sys.exit(-1)
if errors == -1:
gracefull_handling_of_missing_dicts(args.language)
sys.exit(0 if errors == 0 else -1)

60
pyproject.toml Normal file

@ -0,0 +1,60 @@
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[project]
name = "pospell"
authors = [
{name = "Julien Palard", email = "julien@palard.fr"},
]
description = "Spellcheck .po files containing reStructuredText translations"
keywords = [
"po",
"spell",
"gettext",
"reStructuredText",
"check",
"sphinx",
"translation",
]
classifiers = [
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Natural Language :: English",
"Programming Language :: Python :: 3",
]
requires-python = ">= 3.7"
dependencies = [
"polib",
"docutils>=0.18",
"regex",
"sphinx-lint>=0.6.8",
]
dynamic = [
"version",
]
[project.license]
text = "MIT license"
[project.readme]
file = "README.md"
content-type = "text/markdown; charset=UTF-8"
[project.urls]
Homepage = "https://git.afpy.org/AFPy/pospell"
[project.scripts]
pospell = "pospell:main"
[tool.setuptools]
py-modules = [
"pospell",
]
include-package-data = false
[tool.setuptools.dynamic.version]
attr = "pospell.__version__"
[tool.black]


@ -1,15 +0,0 @@
[bumpversion]
current_version = 1.0.10
commit = True
tag = True
[bumpversion:file:setup.py]
search = version="{current_version}"
replace = version="{new_version}"
[bumpversion:file:pospell.py]
search = __version__ = "{current_version}"
replace = __version__ = "{new_version}"
[bdist_wheel]
universal = 1


@ -1,34 +0,0 @@
#!/usr/bin/env python3
import setuptools
with open("README.md") as readme:
long_description = readme.read()
setuptools.setup(
name="pospell",
version="1.0.10",
description="Spellcheck .po files containing reStructuredText translations",
long_description=long_description,
long_description_content_type="text/markdown", # This is important!
author="Julien Palard",
author_email="julien@palard.fr",
url="https://github.com/JulienPalard/pospell",
py_modules=["pospell"],
entry_points={"console_scripts": ["pospell=pospell:main"]},
extras_require={
"dev": ["bandit", "black", "detox", "flake8", "isort", "mypy", "pylint"]
},
install_requires=["polib", "docutils>=0.11", "regex"],
license="MIT license",
keywords="po spell gettext reStructuredText check sphinx translation",
classifiers=[
"Development Status :: 3 - Alpha",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Natural Language :: English",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.5",
"Programming Language :: Python :: 3.6",
],
)


@ -1,2 +1,2 @@
msgid "Rendez-vous à 10h chez Murex"
msgstr "See your at 10h at Murex"
msgid "Rendez-vous à 10h à la fête"
msgstr "See your at 10h at the party"


@ -1,5 +1,3 @@
import os
from types import SimpleNamespace
from pathlib import Path
import pytest

74
tox.ini Normal file

@ -0,0 +1,74 @@
[flake8]
;E203 for black (whitespace before : in slices), and F811 for @overload
ignore = E203, F811
max-line-length = 88
[coverage:run]
; branch = true: would need a lot of pragma: no branch on infinite loops.
parallel = true
concurrency = multiprocessing
omit =
.tox/*
[coverage:report]
skip_covered = True
show_missing = True
exclude_lines =
pragma: no cover
def __repr__
if self\.debug
raise AssertionError
raise NotImplementedError
if __name__ == .__main__.:
[tox]
envlist = py37, py38, py39, py310, py311, py312, flake8, mypy, black, pylint, pydocstyle, coverage
isolated_build = True
skip_missing_interpreters = True
[testenv]
deps =
pytest
coverage
commands = coverage run -m pytest
setenv =
COVERAGE_FILE={toxworkdir}/.coverage.{envname}
[testenv:coverage]
depends = py37, py38, py39, py310, py312
parallel_show_output = True
deps = coverage
skip_install = True
setenv = COVERAGE_FILE={toxworkdir}/.coverage
commands =
coverage combine
coverage report --fail-under 65
[testenv:flake8]
deps = flake8
skip_install = True
commands = flake8 tests/ pospell.py
[testenv:black]
deps = black
skip_install = True
commands = black --check --diff tests/ pospell.py
[testenv:mypy]
deps =
mypy
types-docutils
types-polib
skip_install = True
commands = mypy --ignore-missing-imports pospell.py
[testenv:pylint]
deps = pylint
commands = pylint --disable import-outside-toplevel,invalid-name pospell.py
[testenv:pydocstyle]
deps = pydocstyle
skip_install = True
commands = pydocstyle pospell.py