libreoffice/compilerplugins
Stephan Bergmann 50d73574b6 Related tdf#104597, tdf#151546: Introduce comphelper::string::reverseCodePoints
69e9925ded "sdext.pdfimport: resolves tdf#104597:
RTL script text runs are reversed" and f6004e1c45
"tdf#151546: RTL text is reversed (Writer pdfimport)" had introduced two calls
to comphelper::string::reverseString into sdext.  That function reverses on the
basis of individual UTF-16 code units, not on the basis of Unicode code points.
And while at least some pre-existing callers of that function want the former
semantics (see below), these two new callers in sdext apparently want the latter
semantics.  Therefore, introduce an additional function
comphelper::string::reverseCodePoints with the latter semantics.

I identified three other places that call comphelper::string::reverseString:
* SbRtl_StrReverse in basic/source/runtime/methods1.cxx apparently implements
  some StrReverse Basic function, where a (presumably non-existing) Basic spec
  would need to decide which of the two semantics is called for.  So leave it
  alone for now.
* SvtFileDialog::IsolateFilterFromPath_Impl in fpicker/source/office/iodlg.cxx
  reverses a string, operates on it, then reverses (parts of) it back.  Whether
  or not that is the most elegant code, using the latter semantics here would
  apparently be wrong, as applying comphelper::string::reverseCodePoints twice
  does not restore the original string when the input is a malformed sequence
  of UTF-16 code units containing a low surrogate followed by a high surrogate.
* AccessibleCell::getCellName in svx/source/table/accessiblecell.cxx apparently
  always operates on a string consisting only of Latin uppercase letters A--Z,
  for which both semantics are equivalent.  (So we can just as well stick with
  the simpler comphelper::string::reverseString here.)
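
The difference between the two semantics can be sketched with plain
std::u16string.  This is only an illustration under assumptions: the function
names below are hypothetical, the code is not the comphelper implementation,
and lone surrogates are assumed to be treated as single code points.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only; not the comphelper code.
inline bool isHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isLowSurrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Former semantics: reverse the individual UTF-16 code units.
inline std::u16string sketchReverseCodeUnits(const std::u16string& s) {
    return std::u16string(s.rbegin(), s.rend());
}

// Latter semantics: reverse code points, so a well-formed surrogate pair
// keeps its high-then-low order.  Assumption: a lone surrogate counts as
// one code point.
inline std::u16string sketchReverseCodePoints(const std::u16string& s) {
    std::u16string out;
    out.reserve(s.size());
    for (std::size_t i = s.size(); i != 0;) {
        if (i >= 2 && isLowSurrogate(s[i - 1]) && isHighSurrogate(s[i - 2])) {
            out.append(s, i - 2, 2); // copy the pair in its original order
            i -= 2;
        } else {
            out.push_back(s[i - 1]);
            i -= 1;
        }
    }
    return out;
}

inline void sketchReverseDemo() {
    const char16_t hi = 0xD83D, lo = 0xDE00; // surrogate pair for U+1F600
    const std::u16string text{u'a', hi, lo};
    const std::u16string byUnits{lo, hi, u'a'};  // pair broken apart
    const std::u16string byPoints{hi, lo, u'a'}; // pair preserved
    assert(sketchReverseCodeUnits(text) == byUnits);
    assert(sketchReverseCodePoints(text) == byPoints);

    // Malformed input: a low surrogate followed by a high surrogate.
    const std::u16string malformed{lo, hi};
    const std::u16string once = sketchReverseCodePoints(malformed);
    const std::u16string twice = sketchReverseCodePoints(once);
    assert(twice != malformed); // reversing twice does not round-trip
    assert(sketchReverseCodeUnits(sketchReverseCodeUnits(malformed)) == malformed);
}
```

In the malformed case the first code-point reversal produces a well-formed
pair, which the second reversal then keeps intact, so the original order is
lost.  Reversing by code units always round-trips.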

(Extending the tests in comphelper/qa/string/test_string.cxx ran into an issue
where loplugin:stringliteralvar warns about deliberate uses of sal_Unicode
arrays rather than UTF-16 string literals wrapped in OUStringLiteral, as those
arrays deliberately contain malformed UTF-16 code unit sequences and thus
converting them into UTF-16 string literals might be considered inappropriate,
see the newly added comment at
StringLiteralVar::isPotentiallyInitializedWithMalformedUtf16 in
compilerplugins/clang/stringliteralvar.cxx for details.  So that loplugin had to
be improved here, too.)

Change-Id: I641cc32c76b0c5f6339ae44d8aa85df0022ffb05
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/142949
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2022-11-18 19:50:28 +01:00

Compiler plugins

Overview

This directory contains code for compiler plugins. These are used to perform additional actions during compilation (such as additional warnings) and also to perform mass code refactoring.

Currently only the Clang compiler is supported; see http://wiki.documentfoundation.org/Development/Clang.

Usage

Compiler plugins are enabled automatically by --enable-dbgutil if Clang headers are found, or explicitly using --enable-compiler-plugins.

Functionality

There are two kinds of plugin actions:

  • compile checks - these are run during normal compilation
  • rewriters - these must be run manually and modify source files

Each source file has a comment stating whether it is a compile check or a rewriter, along with a description of its functionality.

Compile Checks

Used during normal compilation to perform additional checks. All warnings and errors are marked '[loplugin]' in the message.

Rewriters

Rewriters analyse and possibly modify given source files. Usage:

make COMPILER_PLUGIN_TOOL=<rewriter_name>

Additional optional make arguments:

  • FORCE_COMPILE=<targets> - triggers a rebuild of the selected source files, even those that are up to date (for example, FORCE_COMPILE=all). It takes a list of gbuild targets specifying where to run the rewriter: 'all' means everything, a prepended '-' means to not enable, and an appended '/' means everything in the directory. There is no ordering; more specific entries override more general ones, and disabling takes precedence. Example: FORCE_COMPILE="all -sw/ -Library_sc"

  • UPDATE_FILES=<scope> - limits which modified files will be actually written back with the changes

    • mainfile - only the main .cxx file will be modified (default)
    • all - all source files involved will be modified (possibly even header files from other LO modules), 3rd party header files are however never modified
    • <module> - only files in the given LO module (toplevel directory) will be modified (including headers)
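
The FORCE_COMPILE selection rules can be sketched as a small C++ function. This is not the gbuild implementation: the Target struct and the function names are hypothetical, and it assumes that "disabling takes precedence" applies when two matching entries are equally specific.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a gbuild target, for illustration only.
struct Target {
    std::string name;      // e.g. "Library_sc"
    std::string directory; // top-level directory, e.g. "sc"
};

// Returns true if the target is selected by the FORCE_COMPILE entries.
// Specificity: "all" < "<directory>/" < exact target name.
// Assumption: at equal specificity, a '-' entry wins.
inline bool sketchForceCompile(const std::vector<std::string>& entries,
                               const Target& target) {
    int bestRank = -1;     // rank of the most specific matching entry so far
    bool disabled = false; // whether that entry disables the target
    for (std::string entry : entries) {
        const bool negative = !entry.empty() && entry.front() == '-';
        if (negative) entry.erase(0, 1);
        int rank;
        if (entry == "all") rank = 0;
        else if (entry == target.directory + "/") rank = 1;
        else if (entry == target.name) rank = 2;
        else continue; // entry does not match this target
        if (rank > bestRank) {
            bestRank = rank;
            disabled = negative;
        } else if (rank == bestRank && negative) {
            disabled = true;
        }
    }
    return bestRank >= 0 && !disabled;
}

inline void sketchForceCompileDemo() {
    // The example from the text: FORCE_COMPILE="all -sw/ -Library_sc"
    const std::vector<std::string> spec{"all", "-sw/", "-Library_sc"};
    assert(sketchForceCompile(spec, Target{"Library_vcl", "vcl"}));
    assert(!sketchForceCompile(spec, Target{"Library_sw", "sw"}));
    assert(!sketchForceCompile(spec, Target{"Library_sc", "sc"}));
}
```

Because only the most specific matching entry decides, the result does not depend on the order of the entries.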

Modifications will be written directly to the source files.
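
The UPDATE_FILES scopes can be read as a per-file decision. The following C++ sketch illustrates that reading; it is not the plugin code. The TouchedFile struct and function names are hypothetical, and it assumes the caller already knows whether a file is the main file or a 3rd party header.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of a file the rewriter changed, for illustration only.
struct TouchedFile {
    std::string path;  // relative to the source root, e.g. "sw/inc/doc.hxx"
    bool isMainFile;   // the .cxx file currently being compiled
    bool isThirdParty; // assumption: determined by the caller
};

// Returns true if changes to the file would be written back for the scope.
inline bool sketchShouldWrite(const std::string& scope, const TouchedFile& file) {
    if (file.isThirdParty) return false; // never modified
    if (scope.empty() || scope == "mainfile") return file.isMainFile; // default
    if (scope == "all") return true;
    // Otherwise the scope names an LO module (toplevel directory).
    return file.path.compare(0, scope.size() + 1, scope + "/") == 0;
}

inline void sketchShouldWriteDemo() {
    const TouchedFile mainFile{"sw/source/core/doc.cxx", true, false};
    const TouchedFile otherHeader{"include/vcl/window.hxx", false, false};
    assert(sketchShouldWrite("mainfile", mainFile));
    assert(!sketchShouldWrite("mainfile", otherHeader));
    assert(sketchShouldWrite("all", otherHeader));
    assert(sketchShouldWrite("sw", mainFile));
    assert(!sketchShouldWrite("sw", otherHeader));
}
```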

Some rewriter plugins are dual-mode and can also be used in a non-rewriting mode in which they emit warnings for problematic code that they would otherwise automatically rewrite. When any rewriter is enabled explicitly via make COMPILER_PLUGIN_TOOL=<rewriter_name>, it works in rewriting mode (and all other plugins are disabled). When no rewriter is explicitly enabled (i.e., just make), all dual-mode rewriters are enabled in non-rewriting mode, along with all non-rewriter plugins, and all non-dual-mode rewriters are disabled. The typical process to use such a dual-mode rewriter X in rewriting mode is

make COMPILER_PLUGIN_WARNINGS_ONLY=X \
&& make COMPILER_PLUGIN_TOOL=X FORCE_COMPILE=all UPDATE_FILES=all

which first generates a full build without failing due to warnings from plugin X in non-rewriting mode (in case of --enable-werror) and then repeats the build in rewriting mode (during which no object files are generated).
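
The rules for which plugins run in which mode can be summarised in a small C++ sketch. It only restates the description above; the enum and function names are hypothetical and do not come from the plugin sources, and COMPILER_PLUGIN_WARNINGS_ONLY is not modelled.

```cpp
#include <cassert>
#include <string>

// Hypothetical classification of a plugin, for illustration only.
enum class Kind { Check, Rewriter, DualMode };
enum class Mode { Disabled, NonRewriting, Rewriting };

// tool is the value of COMPILER_PLUGIN_TOOL, empty for a plain make.
inline Mode sketchPluginMode(Kind kind, const std::string& name,
                             const std::string& tool) {
    if (!tool.empty()) {
        // An explicitly enabled rewriter runs alone, in rewriting mode.
        const bool isRewriter = kind != Kind::Check;
        return (isRewriter && name == tool) ? Mode::Rewriting : Mode::Disabled;
    }
    // Plain make: compile checks and dual-mode rewriters emit warnings,
    // while rewriters without a non-rewriting mode stay disabled.
    return kind == Kind::Rewriter ? Mode::Disabled : Mode::NonRewriting;
}

inline void sketchPluginModeDemo() {
    assert(sketchPluginMode(Kind::DualMode, "X", "X") == Mode::Rewriting);
    assert(sketchPluginMode(Kind::Check, "Y", "X") == Mode::Disabled);
    assert(sketchPluginMode(Kind::DualMode, "X", "") == Mode::NonRewriting);
    assert(sketchPluginMode(Kind::Rewriter, "Z", "") == Mode::Disabled);
    assert(sketchPluginMode(Kind::Check, "Y", "") == Mode::NonRewriting);
}
```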

Code Documentation / HowTos

https://wiki.documentfoundation.org/Clang_plugins