Compare commits

...

28 Commits

Author SHA1 Message Date
333540f9a8 Make explicit that French is the default language used
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2023-11-20 11:09:20 +00:00
87d1a3e26f
Missing hunspell in CI.
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
2023-11-20 12:02:55 +01:00
f1a9ae321f
Hello woodpecker.
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2023-11-20 12:00:45 +01:00
c26878af0b
Bump min Python version to 3.7.
Because I no longer have a 3.6 on my machine to test it.
2023-11-20 11:58:57 +01:00
b164f089d6
Explicitly fail if dict is missing. 2023-11-20 11:58:43 +01:00
8b753bde26
FIX: Discrepancy between docutils rst and sphinx rst
The :rfc: role doesn't allow aliases in the docutils implementation.

See: https://sourceforge.net/p/docutils/feature-requests/75/
2023-07-21 09:08:12 +02:00
a626a2f3fb
Bump Python versions used in tests. 2023-07-19 10:46:15 +02:00
07d854dcec
docutils is migrating to argparse. 2023-07-19 10:46:04 +02:00
33eb8f7f7d
Move to pyproject.toml 2023-07-18 15:04:47 +02:00
cf6c1c8919
Don't run hunspell on obsolete values. 2023-04-10 16:43:11 +02:00
d8a2e20e7e Fix typo in README 2023-03-08 08:48:29 +01:00
rtobar
c4feb4d25f
Adjust raw text extraction from docutils documents (#33)
The previous version of this code relied on the Text.rawsource attribute
to obtain the raw, original version of the translated texts contained in
.po files. This attribute however was removed in docutils 0.18, and thus
a different way of obtaining this information was needed.

(Note that this attribute removal was planned, but not for this release
yet: it's not listed in 0.18's list of changes, but under
"Future changes". https://sourceforge.net/p/docutils/bugs/437/ has been
opened to get this eventually clarified.)

The commit that removed Text.rawsource mentioned that the data fed
into the Text elements was already the raw source, hence there was no
need to keep a separate attribute. Text objects derive from str, so we
can directly add them to the list of strings where NodeToTextVisitor
builds the original text, with the caveat that it needs to have
backslashes restored (they are encoded as null bytes after parsing,
apparently).

The other side effect of using the Text objects directly instead of the
Text.rawsource attribute is that now we get more of them. The document
resulting from docutils' parsing can contain system_message elements
with debugging information from the parsing process, such as warnings.
These are Text elements with no rawsource, but with actual text, so we
need to skip them. In the same spirit, citation_references and
substitution_references need to be ignored as well.

All these changes allow pospell to work against the latest docutils. On
the other hand, the lowest supported version is 0.16: 0.11 through 0.14
failed at rfc role parsing (used for example in the python docs), and
0.15 didn't have a method to restore backslashes (which again made the
python docs fail).

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
2021-11-30 17:57:04 +01:00
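The null-byte detail in this commit message can be sketched in isolation: after docutils parsing, an escaping backslash survives as a null byte, so a visitor that collects `Text` nodes directly (they derive from `str`) has to restore them. This is a minimal stand-in for what `docutils.nodes.unescape(..., restore_backslashes=True)` does, not the real docutils code:

```python
def restore_backslashes(text):
    # docutils encodes an escaping backslash as a null byte during
    # parsing; turning "\x00" back into "\\" recovers the source text.
    return text.replace("\x00", "\\")

def collect_text(nodes):
    # Text nodes derive from str, so they can be collected directly,
    # with backslashes restored, and joined like NodeToTextVisitor does.
    # Empty strings stand in for the rawsource-less texts we skip.
    return " ".join(restore_backslashes(node) for node in nodes if node)

print(collect_text(["escaped \x00* star", "", "plain text"]))
```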
2844284bb7
Rename branch. 2021-11-26 10:38:50 +01:00
204417d00a
Bump to v1.1. 2021-11-26 10:36:49 +01:00
rtobar
caf1412f49
Allow using only --glob without further po_files (#31)
At the moment pospell complains if invoked with a --glob pattern but
without any other po_files in the command line. This is a problem only
with the check, as the code is ready to handle the situation. To bypass
this problem, one *needs* to pass a po_file in the command-line as well,
even if the glob pattern contains it.

This commit adjusts the condition that checks that input files have been
somehow specified to consider --glob as a source of input files.

Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
2021-11-26 10:27:05 +01:00
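The adjusted condition is easiest to see in a stripped-down stand-in for pospell's `parse_args` (only the three "sources of input files" relevant to this fix are kept; the real parser has many more options):

```python
import argparse

def parse_args(argv):
    parser = argparse.ArgumentParser(prog="pospell")
    parser.add_argument("--glob", type=str, default="")
    parser.add_argument("--modified", "-m", action="store_true")
    parser.add_argument("po_file", nargs="*")
    args = parser.parse_args(argv)
    # The fix: --glob now also counts as a way of specifying input
    # files, so `pospell --glob '**/*.po'` alone is accepted.
    if not args.po_file and not args.modified and not args.glob:
        parser.error("no input files given")
    return args
```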
rtobar
3553ecd726
Refactor pospell to use multiprocessing (#32)
One of the main drawbacks of pospell at the moment is that checking is
performed serially by a single hunspell process. In small projects this
is not noticeable, but in slightly bigger ones this can go up a bit
(e.g., in python-docs-es it takes ~2 minutes to check the whole set of
.po files).

The obvious solution to speed things up is to use multiprocessing,
parallelising the process at two different places: first, when reading
the input .po files and collecting the input strings to feed into
hunspell, and secondly when running hunspell itself.

This commit implements this support. It works as follows:

 * A new namedtuple called input_line has been added. It contains a
   filename, a line, and text, and thus it uniquely identifies an input
   line in a self-contained way.
 * When collecting input to feed into hunspell, the po_to_text routine
   collects input_lines instead of a simple string. This is done with a
   multiprocessing Pool to run in parallel across all input files.
 * The input_lines are split in N blocks, with N being the size of the
   pool. Note that during this process input_lines from different files
   might end up in the same block, and input_lines from the same file
   might end up in different blocks; however since input_lines are
   self-contained we are not losing information.
 * N hunspell instances are run over the N blocks of input_lines using
   the pool (only the text field from the input_lines is fed into
   hunspell).
 * When interpreting errors from hunspell we can match an input_line
   with its corresponding hunspell output lines, and thus can identify
   the original file:line that caused the error.

The multiprocessing pool is sized via a new -j/--jobs command line
option, which defaults to os.cpu_count() to run at maximum speed by
default.

These are the kind of differences I see with python-docs-es in my
machine, so YMMV depending on your setup/project:

$> time pospell -p dict2.txt -l es_ES */*.po -j 1
real    2m1.859s
user    2m6.680s
sys     0m3.829s

$> time pospell -p dict2.txt -l es_ES */*.po -j 2
real    1m10.322s
user    2m18.210s
sys     0m3.559s

Finally, these changes had some minor effects on the tooling around
testing. Pylint complained about there being too many arguments now in
check_spell, so pylint's max-args setting has been adjusted as
discussed. Separately, coverage information now needs to be collected
for sub-processes of the test main process; this is automatically done
by the pytest-cov plug-in, so I've switched tox to use that rather than
the more manual running of pytest under coverage (which would otherwise
require some extra setup to account for subprocesses).
2021-11-26 10:26:35 +01:00
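The block-splitting step from the list above is small enough to show on its own; this sketch mirrors the chunking in the patch (the `input_line` namedtuple and the ceiling division are taken from it):

```python
from collections import namedtuple

# Self-contained identifier for one checkable line, as described above.
input_line = namedtuple("input_line", "filename line text")

def chunk(input_lines, jobs):
    # Ceiling division sizes the blocks so at most `jobs` are produced;
    # lines from different files may share a block, which is fine since
    # each input_line carries its own filename and line number.
    lines_per_job = (len(input_lines) + jobs - 1) // jobs
    return [
        input_lines[i : i + lines_per_job]
        for i in range(0, len(input_lines), lines_per_job)
    ]
```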
8b0d6d8778
Bump requirements.
It's hard to get a frozen set of dependencies working in all tested
versions, so I unpin them from tox.
2021-10-27 19:12:29 +02:00
cafe8f8630
Bump dev requirements. 2021-10-27 17:33:34 +02:00
6c8779826a
Pleases pylint and mypy. 2021-10-27 17:24:27 +02:00
1525acad68
docutils dropped the 'rawsource' attribute in 0.18. 2021-10-27 17:16:02 +02:00
Álvaro Mondéjar
23eb401584
Ignore deleted files using '--modified' option (#28) 2021-04-13 18:33:44 +02:00
d7468aacb1 Bump to v1.0.12. 2021-04-10 00:12:33 +02:00
Álvaro Mondéjar
d81c49221e
Add 'line_length_limit' configuration field to 'docutils.frontend.Values'. (#26) 2021-04-10 00:07:12 +02:00
3e4bb50687
Tox and github actions. (#24) 2020-11-23 14:26:34 +01:00
f7b61e04d0 Move from setup.py to setup.cfg. 2020-11-23 12:56:58 +01:00
Jules Lasne (jlasne)
a42de31a88
Added poutils section to README (#23) 2020-10-14 18:24:46 +02:00
48c9a75b68 Bump version: 1.0.10 → 1.0.11 2020-10-14 00:57:39 +02:00
bdf4a08c5b Handle file opening errors. Closes #18.
Co-authored-by: Christophe Nanteuil <christophe.nanteuil@gmail.com>
2020-10-14 00:56:44 +02:00
12 changed files with 424 additions and 198 deletions

1
.github/FUNDING.yml vendored

@ -1 +0,0 @@
github: JulienPalard

2
.pylintrc Normal file

@ -0,0 +1,2 @@
[DESIGN]
max-args = 6

10
.woodpecker.yml Normal file

@ -0,0 +1,10 @@
---
pipeline:
test:
image: python
commands:
- apt-get update
- apt-get install -y hunspell
- python3 -m pip install tox
- tox run


@ -4,38 +4,60 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
## [1.0.12] - 2021-04-10
### Fixed
- Support for docutils 0.17 thanks to mondeja and xi.
## [Unreleased]
## [1.0.11] - 2020-10-14
### Fixed
- Better handling of FileNotFound, PermissionDenied, or IsADirectory errors.
## [1.0.10] - 2020-10-14
### Fixed
- Use `^` escape char on each line while invoking hunspell, preventing
it from treating some lines as comments.
## [1.0.9] - 2020-10-12
### Changed
- pospell now uses `hunspell -a` (was using `hunspell -l`), so
hunspell can tell on which line an error is, instead of having
pospell (wrongly) guess it.
## [1.0.8] - 2020-10-12
### Fixed
- Missing Sphinx option in hardcoded settings from 1.0.7.
## [1.0.7] - 2020-10-11
### Changed
- Hunspell is invoked a single time.
- Avoid calling docutils.frontend.OptionParser, hardcode settings, saving lots of time.
- pospell is now twice faster on python-docs-fr.
## [1.0.6] - 2020-10-11
### Fixed
- Hunspell compounding mishandling caused some errors to be hidden by pospell.
## [1.0.5] - 2020-07-01
### Fixed
- Some errors were not reported due to [Hunspell not reporting them in
Auto mode](https://github.com/hunspell/hunspell/issues/655).
## [1.0.4] - 2020-06-28
### Fixed
- Avoid glueing words together: "hello - world" was sent to hunspell as "helloworld".
- Don't pass placeholders like %s, %(foo)s, or {foo} to Hunspell.
- Don't pass Sphinx variables with underscores in them to Hunspell, like {days_since}.
## [1.0.3] - 2019-10-17
### Changed
- [Soft hyphens](https://en.wikipedia.org/wiki/Soft_hyphen) are now removed.
## [1.0.2] - 2019-10-16
### Fixed
- In POSIX.1, also drop the .1.
## [1.0.1] - 2019-10-16
### Fixed
- Drop prefixes while dropping acronyms, as in `non-HTTP`.
- Regression fixed while dropping plural form of acronyms like `PEPs`.
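The 1.0.10 entry above refers to hunspell's pipe interface: `hunspell -a` treats certain leading characters on a data line as commands, so every line gets an uparrow prefix. A sketch of that quoting, following the patch's `quote_for_hunspell`:

```python
def quote_for_hunspell(lines):
    # The hunspell manpage recommends prefixing every data line with
    # "^" so it is never misread as a pipe-interface command; empty
    # lines stay empty to preserve line numbering.
    return "\n".join("^" + line if line else "" for line in lines)

print(quote_for_hunspell(["bonjour", "", "monde"]))
```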


@ -1,6 +1,12 @@
# pospell
`pospell` is a spellcheckers for po files containing reStructuedText.
`pospell` is a spellcheckers for po files containing reStructuredText.
## Pospell is part of poutils!
[Poutils](https://pypi.org/project/poutils) (`.po` utils) is a metapackage to easily install useful Python tools to use with po files
and `pospell` is a part of it! Go check out [Poutils](https://pypi.org/project/poutils) to discover the other tools!
## Examples
@ -34,6 +40,7 @@ $ pospell --language fr --glob '**/*.po'
```
## Usage
```
@ -60,3 +67,19 @@ optional arguments:
A personal dict (the `-p` option) is simply a text file with one word
per line.
## Contributing
You can work in a venv, to install the project locally:
```bash
python -m pip install .
```
And to test it locally:
```bash
python -m pip install tox
tox -p all
```


@ -1,44 +1,65 @@
"""pospell is a spellcheckers for po files containing reStructuedText.
"""
"""pospell is a spellcheckers for po files containing reStructuedText."""
import collections
import functools
import io
from string import digits
from unicodedata import category
import logging
import multiprocessing
import os
import subprocess
import sys
from contextlib import redirect_stderr
from itertools import chain
from pathlib import Path
from shutil import which
from string import digits
from typing import List, Tuple
from unicodedata import category
import docutils.frontend
import docutils.nodes
import docutils.parsers.rst
import polib
import regex
from docutils.parsers.rst import roles
from docutils.utils import new_document
from sphinxlint import rst
import regex
__version__ = "1.0.10"
__version__ = "1.3"
DEFAULT_DROP_CAPITALIZED = {"fr": True, "fr_FR": True}
Error = Tuple[str, int, str]
input_line = collections.namedtuple("input_line", "filename line text")
class POSpellException(Exception):
"""All exceptions from this module inherit from this one."""
class Unreachable(POSpellException):
"""The code encontered a state that should be unreachable."""
try:
HUNSPELL_VERSION = subprocess.check_output(
["hunspell", "--version"], universal_newlines=True
).split("\n")[0]
).split("\n", maxsplit=1)[0]
except FileNotFoundError:
print("hunspell not found, please install hunspell.", file=sys.stderr)
sys.exit(1)
class DummyNodeClass(docutils.nodes.Inline, docutils.nodes.TextElement):
pass
"""Used to represent any unknown roles, so we can parse any rst blindly."""
def monkey_patch_role(role):
"""Patch docutils.parsers.rst.roles.role so it always match.
Giving a DummyNodeClass for unknown roles.
"""
def role_or_generic(role_name, language_module, lineno, reporter):
base_role, message = role(role_name, language_module, lineno, reporter)
if base_role is None:
@ -53,82 +74,78 @@ roles.role = monkey_patch_role(roles.role)
class NodeToTextVisitor(docutils.nodes.NodeVisitor):
"""Recursively convert a docutils node to a Python string.
Usage:
>>> visitor = NodeToTextVisitor(document)
>>> document.walk(visitor)
>>> print(str(visitor))
It ignores (see IGNORE_LIST) some nodes, which we don't want in
hunspell (enphasis typically contain proper names that are unknown
to dictionaires).
"""
IGNORE_LIST = (
"emphasis",
"superscript",
"title_reference",
"substitution_reference",
"citation_reference",
"strong",
"DummyNodeClass",
"reference",
"literal",
"Text",
"system_message",
)
def __init__(self, document):
"""Initialize visitor for the given node/document."""
self.output = []
self.depth = 0
super().__init__(document)
def dispatch_visit(self, node):
self.depth += 1
super().dispatch_visit(node)
def dispatch_departure(self, node):
self.depth -= 1
super().dispatch_departure(node)
def unknown_visit(self, node):
"""Mandatory implementation to visit unknwon nodes."""
# print(" " * self.depth * 4, node.__class__.__name__, ":", node)
def unknown_departure(self, node):
"""To help debugging tree."""
# print(node, repr(node), node.__class__.__name__)
@staticmethod
def ignore(node):
"""Just raise SkipChildren.
def visit_emphasis(self, node):
Used for all visit_* in the IGNORE_LIST.
See __getattr__.
"""
raise docutils.nodes.SkipChildren
def visit_superscript(self, node):
raise docutils.nodes.SkipChildren
def visit_title_reference(self, node):
raise docutils.nodes.SkipChildren
def visit_strong(self, node):
raise docutils.nodes.SkipChildren
def visit_DummyNodeClass(self, node):
raise docutils.nodes.SkipChildren
def visit_reference(self, node):
raise docutils.nodes.SkipChildren
def visit_literal(self, node):
raise docutils.nodes.SkipChildren
def __getattr__(self, name):
"""Skip childrens from the IGNORE_LIST."""
if name.startswith("visit_") and name[6:] in self.IGNORE_LIST:
return self.ignore
raise AttributeError(name)
def visit_Text(self, node):
self.output.append(node.rawsource)
"""Keep this node text, this is typically what we want to spell check."""
self.output.append(docutils.nodes.unescape(node, restore_backslashes=True))
def __str__(self):
"""Give the accumulated strings."""
return " ".join(self.output)
def strip_rst(line):
"""Transform reStructuredText to plain text."""
if line.endswith("::"):
# Drop :: at the end, it would cause Literal block expected
line = line[:-2]
line = rst.NORMAL_ROLE_RE.sub("", line)
settings = docutils.frontend.get_default_settings()
settings.pep_references = None
settings.rfc_references = None
settings.pep_base_url = "http://www.python.org/dev/peps/"
settings.pep_file_url_template = "pep-%04d"
parser = docutils.parsers.rst.Parser()
settings = docutils.frontend.Values(
{
"report_level": 2,
"halt_level": 4,
"exit_status_level": 5,
"debug": None,
"warning_stream": None,
"error_encoding": "utf-8",
"error_encoding_error_handler": "backslashreplace",
"language_code": "en",
"id_prefix": "",
"auto_id_prefix": "id",
"pep_references": None,
"pep_base_url": "http://www.python.org/dev/peps/",
"pep_file_url_template": "pep-%04d",
"rfc_references": None,
"rfc_base_url": "http://tools.ietf.org/html/",
"tab_width": 8,
"trim_footnote_reference_space": None,
"syntax_highlight": "long",
}
)
stderr_stringio = io.StringIO()
with redirect_stderr(stderr_stringio):
document = new_document("<rst-doc>", settings=settings)
@ -171,34 +188,48 @@ def clear(line, drop_capitalized=False, po_path=""):
def quote_for_hunspell(text):
"""
"""Quote a paragraph so hunspell don't misinterpret it.
Quoting the manpage:
It is recommended that programmatic interfaces prefix
every data line with an uparrow to protect themselves
against future changes in hunspell."""
against future changes in hunspell.
"""
out = []
for line in text.split("\n"):
for line in text:
out.append("^" + line if line else "")
return "\n".join(out)
def po_to_text(po_path, drop_capitalized=False):
"""Converts a po file to a text file, by stripping the msgids and all
po syntax, but by keeping the kept lines at their same position /
line number.
"""Convert a po file to a text file.
This strips the msgids and all po syntax while keeping lines at
their same position / line number.
"""
buffer = []
input_lines = []
lines = 0
entries = polib.pofile(po_path)
try:
entries = polib.pofile(Path(po_path).read_text(encoding="UTF-8"))
except Exception as err:
raise POSpellException(str(err)) from err
for entry in entries:
if entry.msgid == entry.msgstr:
continue
if entry.obsolete:
continue
while lines < entry.linenum:
buffer.append("")
lines += 1
buffer.append(clear(strip_rst(entry.msgstr), drop_capitalized, po_path=po_path))
input_lines.append(input_line(po_path, lines, ""))
lines += 1
return "\n".join(buffer)
input_lines.append(
input_line(
po_path,
lines,
clear(strip_rst(entry.msgstr), drop_capitalized, po_path=po_path),
)
)
return input_lines
def parse_args():
@ -214,7 +245,7 @@ def parse_args():
type=str,
default="fr",
help="Language to check, you'll have to install the corresponding "
"hunspell dictionary, on Debian see apt list 'hunspell-*'.",
"hunspell dictionary, on Debian see apt list 'hunspell-*' (defaults to 'fr').",
)
parser.add_argument(
"--glob",
@ -225,12 +256,14 @@ def parse_args():
parser.add_argument(
"--drop-capitalized",
action="store_true",
help="Always drop capitalized words in sentences (defaults according to the language).",
help="Always drop capitalized words in sentences"
" (defaults according to the language).",
)
parser.add_argument(
"--no-drop-capitalized",
action="store_true",
help="Never drop capitalized words in sentences (defaults according to the language).",
help="Never drop capitalized words in sentences"
" (defaults according to the language).",
)
parser.add_argument(
"po_file",
@ -252,23 +285,35 @@ def parse_args():
version="%(prog)s " + __version__ + " using hunspell: " + HUNSPELL_VERSION,
)
parser.add_argument("--debug", action="store_true")
parser.add_argument("-p", "--personal-dict", type=str)
parser.add_argument("-p", "--personal-dict", type=Path)
parser.add_argument(
"--modified", "-m", action="store_true", help="Use git to find modified files."
)
parser.add_argument(
"-j",
"--jobs",
type=int,
default=os.cpu_count(),
help="Number of files to check in paralel, defaults to all available CPUs",
)
args = parser.parse_args()
if args.personal_dict is not None and not args.personal_dict.exists():
print(f"Error: dictionary {str(args.personal_dict)!r} not found.")
sys.exit(1)
if args.drop_capitalized and args.no_drop_capitalized:
print("Error: don't provide both --drop-capitalized AND --no-drop-capitalized.")
parser.print_help()
sys.exit(1)
if not args.po_file and not args.modified:
if not args.po_file and not args.modified and not args.glob:
parser.print_help()
sys.exit(1)
return args
def look_like_a_word(word):
"""Used to filter out non-words like `---` or `-0700` so they don't
"""Return True if the given str looks like a word.
Used to filter out non-words like `---` or `-0700` so they don't
get reported. They typically are not errors.
"""
if not word:
@ -282,64 +327,99 @@ def look_like_a_word(word):
return True
def run_hunspell(language, personal_dict, input_lines) -> List[Error]:
"""Run hunspell over the given input lines."""
personal_dict_arg = ["-p", personal_dict] if personal_dict else []
try:
output = subprocess.check_output(
["hunspell", "-d", language, "-a"] + personal_dict_arg,
universal_newlines=True,
input=quote_for_hunspell(text for _, _, text in input_lines),
)
except subprocess.CalledProcessError:
return []
return parse_hunspell_output(input_lines, output.splitlines())
def flatten(list_of_lists):
"""[[a,b,c], [d,e,f]] -> [a,b,c,d,e,f]."""
return [element for a_list in list_of_lists for element in a_list]
def spell_check(
po_files,
personal_dict=None,
language="en_US",
drop_capitalized=False,
debug_only=False,
jobs=os.cpu_count(),
):
"""Check for spelling mistakes in the files po_files (po format,
containing restructuredtext), for the given language.
"""Check for spelling mistakes in the given po_files.
(po format, containing restructuredtext), for the given language.
personal_dict allow to pass a personal dict (-p) option, to hunspell.
Debug only will show what's passed to Hunspell instead of passing it.
"""
errors = []
personal_dict_arg = ["-p", personal_dict] if personal_dict else []
texts_for_hunspell = {}
for po_file in po_files:
if debug_only:
print(po_to_text(str(po_file), drop_capitalized))
continue
texts_for_hunspell[po_file] = po_to_text(str(po_file), drop_capitalized)
# Pool.__exit__ calls terminate() instead of close(), we need the latter,
# which ensures the processes' atexit handlers execute fully, which in
# turn lets coverage write the sub-processes' coverage information
pool = multiprocessing.Pool(jobs) # pylint: disable=consider-using-with
try:
output = subprocess.run(
["hunspell", "-d", language, "-a"] + personal_dict_arg,
universal_newlines=True,
input=quote_for_hunspell("\n".join(texts_for_hunspell.values())),
stdout=subprocess.PIPE,
input_lines = flatten(
pool.map(
functools.partial(po_to_text, drop_capitalized=drop_capitalized),
po_files,
)
)
except subprocess.CalledProcessError:
return -1
if debug_only:
for filename, line, text in input_lines:
print(filename, line, text, sep=":")
return 0
if not input_lines:
return 0
errors = 0
checked_files = iter(texts_for_hunspell.items())
checked_file_name, checked_text = next(checked_files)
checked_lines = iter(checked_text.split("\n"))
currently_checked_line = next(checked_lines)
current_line_number = 1
for line in output.stdout.split("\n")[1:]:
if not line:
# Distribute input lines across workers
lines_per_job = (len(input_lines) + jobs - 1) // jobs
chunked_inputs = [
input_lines[i : i + lines_per_job]
for i in range(0, len(input_lines), lines_per_job)
]
errors = flatten(
pool.map(
functools.partial(run_hunspell, language, personal_dict),
chunked_inputs,
)
)
finally:
pool.close()
pool.join()
for error in errors:
print(*error, sep=":")
return len(errors)
def parse_hunspell_output(inputs, outputs) -> List[Error]:
"""Parse `hunspell -a` output and collect all errors."""
# skip first line of hunspell output (it's the banner)
outputs = iter(outputs[1:])
errors = []
for po_input_line, output_line in zip(inputs, outputs):
if not po_input_line.text:
continue
while output_line:
if output_line.startswith("&"):
_, original, *_ = output_line.split()
if look_like_a_word(original):
errors.append(
(po_input_line.filename, po_input_line.line, original)
)
try:
currently_checked_line = next(checked_lines)
current_line_number += 1
output_line = next(outputs)
except StopIteration:
try:
checked_file_name, checked_text = next(checked_files)
checked_lines = iter(checked_text.split("\n"))
currently_checked_line = next(checked_lines)
current_line_number = 1
except StopIteration:
return errors
continue
if line == "*": # OK
continue
if line[0] == "&":
_, original, count, offset, *miss = line.split()
if look_like_a_word(original):
print(checked_file_name, current_line_number, original, sep=":")
errors += 1
break
return errors
def gracefull_handling_of_missing_dicts(language):
@ -360,24 +440,22 @@ def gracefull_handling_of_missing_dicts(language):
)
if which("apt"):
error("Maybe try something like:")
error(" sudo apt install hunspell-{}".format(language))
error(f" sudo apt install hunspell-{language}")
else:
error(
"""I don't know your environment, but I bet the package name looks like:
f"""I don't know your environment, but I bet the package name looks like:
hunspell-{language}
If you find it, please tell me (by opening an issue or a PR on
https://github.com/JulienPalard/pospell/) so I can enhance this error message.
""".format(
language=language
)
"""
)
sys.exit(1)
def main():
"""Module entry point."""
"""Entry point (for command-line)."""
args = parse_args()
logging.basicConfig(level=50 - 10 * args.verbose)
default_drop_capitalized = DEFAULT_DROP_CAPITALIZED.get(args.language, False)
@ -392,7 +470,7 @@ def main():
)
if args.modified:
git_status = subprocess.check_output(
["git", "status", "--porcelain"], encoding="utf-8"
["git", "status", "--porcelain", "--no-renames"], encoding="utf-8"
)
git_status_lines = [
line.split(maxsplit=2) for line in git_status.split("\n") if line
@ -400,11 +478,20 @@ def main():
args.po_file.extend(
Path(filename)
for status, filename in git_status_lines
if filename.endswith(".po")
if filename.endswith(".po") and status != "D"
)
errors = spell_check(
args.po_file, args.personal_dict, args.language, drop_capitalized, args.debug
)
try:
errors = spell_check(
args.po_file,
args.personal_dict,
args.language,
drop_capitalized,
args.debug,
args.jobs,
)
except POSpellException as err:
print(err, file=sys.stderr)
sys.exit(-1)
if errors == -1:
gracefull_handling_of_missing_dicts(args.language)
sys.exit(0 if errors == 0 else -1)

60
pyproject.toml Normal file

@ -0,0 +1,60 @@
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[project]
name = "pospell"
authors = [
{name = "Julien Palard", email = "julien@palard.fr"},
]
description = "Spellcheck .po files containing reStructuredText translations"
keywords = [
"po",
"spell",
"gettext",
"reStructuredText",
"check",
"sphinx",
"translation",
]
classifiers = [
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Natural Language :: English",
"Programming Language :: Python :: 3",
]
requires-python = ">= 3.7"
dependencies = [
"polib",
"docutils>=0.18",
"regex",
"sphinx-lint>=0.6.8",
]
dynamic = [
"version",
]
[project.license]
text = "MIT license"
[project.readme]
file = "README.md"
content-type = "text/markdown; charset=UTF-8"
[project.urls]
Homepage = "https://git.afpy.org/AFPy/pospell"
[project.scripts]
pospell = "pospell:main"
[tool.setuptools]
py-modules = [
"pospell",
]
include-package-data = false
[tool.setuptools.dynamic.version]
attr = "pospell.__version__"
[tool.black]


@ -1,15 +0,0 @@
[bumpversion]
current_version = 1.0.10
commit = True
tag = True
[bumpversion:file:setup.py]
search = version="{current_version}"
replace = version="{new_version}"
[bumpversion:file:pospell.py]
search = __version__ = "{current_version}"
replace = __version__ = "{new_version}"
[bdist_wheel]
universal = 1


@ -1,34 +0,0 @@
#!/usr/bin/env python3
import setuptools
with open("README.md") as readme:
long_description = readme.read()
setuptools.setup(
name="pospell",
version="1.0.10",
description="Spellcheck .po files containing reStructuredText translations",
long_description=long_description,
long_description_content_type="text/markdown", # This is important!
author="Julien Palard",
author_email="julien@palard.fr",
url="https://github.com/JulienPalard/pospell",
py_modules=["pospell"],
entry_points={"console_scripts": ["pospell=pospell:main"]},
extras_require={
"dev": ["bandit", "black", "detox", "flake8", "isort", "mypy", "pylint"]
},
install_requires=["polib", "docutils>=0.11", "regex"],
license="MIT license",
keywords="po spell gettext reStructuredText check sphinx translation",
classifiers=[
"Development Status :: 3 - Alpha",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Natural Language :: English",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.5",
"Programming Language :: Python :: 3.6",
],
)


@ -1,2 +1,2 @@
msgid "Rendez-vous à 10h chez Murex"
msgstr "See your at 10h at Murex"
msgid "Rendez-vous à 10h à la fête"
msgstr "See your at 10h at the party"


@ -1,5 +1,3 @@
import os
from types import SimpleNamespace
from pathlib import Path
import pytest

74
tox.ini Normal file

@ -0,0 +1,74 @@
[flake8]
;E203 for black (whitespace before : in slices), and F811 for @overload
ignore = E203, F811
max-line-length = 88
[coverage:run]
; branch = true: would need a lot of pragma: no branch on infinite loops.
parallel = true
concurrency = multiprocessing
omit =
.tox/*
[coverage:report]
skip_covered = True
show_missing = True
exclude_lines =
pragma: no cover
def __repr__
if self\.debug
raise AssertionError
raise NotImplementedError
if __name__ == .__main__.:
[tox]
envlist = py37, py38, py39, py310, py311, py312, flake8, mypy, black, pylint, pydocstyle, coverage
isolated_build = True
skip_missing_interpreters = True
[testenv]
deps =
pytest
coverage
commands = coverage run -m pytest
setenv =
COVERAGE_FILE={toxworkdir}/.coverage.{envname}
[testenv:coverage]
depends = py37, py38, py39, py310, py312
parallel_show_output = True
deps = coverage
skip_install = True
setenv = COVERAGE_FILE={toxworkdir}/.coverage
commands =
coverage combine
coverage report --fail-under 65
[testenv:flake8]
deps = flake8
skip_install = True
commands = flake8 tests/ pospell.py
[testenv:black]
deps = black
skip_install = True
commands = black --check --diff tests/ pospell.py
[testenv:mypy]
deps =
mypy
types-docutils
types-polib
skip_install = True
commands = mypy --ignore-missing-imports pospell.py
[testenv:pylint]
deps = pylint
commands = pylint --disable import-outside-toplevel,invalid-name pospell.py
[testenv:pydocstyle]
deps = pydocstyle
skip_install = True
commands = pydocstyle pospell.py