[PATCH V5] add a make check-local command to verify the i18n string formatting

From: ShaoHe Feng <shaohef@linux.vnet.ibm.com> V4 -> V5: improve the regular expression pattern. remove glob V3 -> V4: use python script to check the i18n string formatting For discuss with ZhengSheng, we should extend the check. We should check the string are obsolete or they are statement but not define in i18n.py V2 -> V3: improve the regular expression pattern. V1 -> V2: move the check to top makefile. check all i18n.py improve the checking information. ShaoHe Feng (1): add a make check-local command to verify the i18n string formatting Makefile.am | 5 +++++ contrib/check_i18n.py | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+) create mode 100755 contrib/check_i18n.py -- 1.8.5.3

From: ShaoHe Feng <shaohef@linux.vnet.ibm.com> When I do i18n translation, I find some string formattings are wrong. So I add a check-local command to help the developer to check their string formatting. Every developers please run this command before submit your patch. After you run this command, it may report some invalid string formattings as follow. $ make check-local Checking for invalid i18n string... bad i18n string formatting: KCHAPI0005E: Create is not allowed for %(resource). make: *** [check-local] Error 1 You should check %(resource) is what you want. Ref: https://docs.python.org/2/library/stdtypes.html#string-formatting-operations Signed-off-by: ShaoHe Feng <shaohef@linux.vnet.ibm.com> Signed-off-by: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com> Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com> --- Makefile.am | 5 +++++ contrib/check_i18n.py | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+) create mode 100755 contrib/check_i18n.py diff --git a/Makefile.am b/Makefile.am index c68f050..401ff37 100644 --- a/Makefile.am +++ b/Makefile.am @@ -73,7 +73,12 @@ PEP8_WHITELIST = \ SKIP_PYFLAKES_ERR = "\./src/kimchi/websocket\.py" +I18N_FILES = plugins/*/i18n.py \ + src/kimchi/i18n.py \ + $(NULL) + check-local: + contrib/check_i18n.py $(I18N_FILES) find . -path './.git' -prune -type f -o \ -name '*.py' -o -name '*.py.in' | xargs $(PYFLAKES) | \ grep -w -v $(SKIP_PYFLAKES_ERR) | \ diff --git a/contrib/check_i18n.py b/contrib/check_i18n.py new file mode 100755 index 0000000..c1c7160 --- /dev/null +++ b/contrib/check_i18n.py @@ -0,0 +1,56 @@ +#!/usr/bin/python +# +# Project Kimchi +# +# Copyright IBM, Corp. 2014 +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2.1 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +import imp +import os +import re +import sys + + +PATTERN = re.compile("%\([^\)]*\)[#0\-\+ ]?(\d+|\*)?(\.(\d+|\*))?" + "[lLh]?[cdeEfFgGioursxX%]") +BAD_PATTERN = re.compile("%\([^\)]*\)") + + +def load_i18n_module(i18nfile): + path = os.path.dirname(i18nfile) + mname = i18nfile.replace("/", "_").rstrip(".py") + mobj = imp.find_module("i18n", [path]) + return imp.load_module(mname, *mobj) + + +def check_string_formatting(messages): + for k, v in messages.iteritems(): + if BAD_PATTERN.findall(PATTERN.sub(" ", v)): + print "bad i18n string formatting:" + print " %s: %s" % (k, v) + exit(1) + + +def main(): + print "Checking for invalid i18n string..." + for f in sys.argv[1:]: + messages = load_i18n_module(f).messages + check_string_formatting(messages) + print "Checking for invalid i18n string successfully" + + +if __name__ == '__main__': + main() -- 1.8.5.3

on 2014/04/08 21:18, shaohef@linux.vnet.ibm.com wrote:
+PATTERN = re.compile("%\([^\)]*\)[#0\-\+ ]?(\d+|\*)?(\.(\d+|\*))?" + "[lLh]?[cdeEfFgGioursxX%]") +BAD_PATTERN = re.compile("%\([^\)]*\)")
There are some problems in the regular expressions. The first problem is that "\" is a special character in Python string. For example, we all know "\" gives a special meaning to "\n" and "\t". If we want to use it in a regular expression, we have to escape itself, for example re.compile("%\\(\\)"). This is tedious and error-prone. Usually we just use a prefix "r" before the string literal to stop Python from translating "\", for example, re.compile(r"%\(\)"). The second problem is that "+", "*", "(", ")" automatically lose their special meaning inside "[ ]", we can use them directly in "[ ]". The third problem is that we should comment each part of a complicated regular expression. As a result, I suggest the following the regular expression. # Match all conversion specifier with mapping key PATTERN = re.compile(r'''%\([^)]+\) # Mapping key [#0\-+]? # Conversion flags (optional) (\d+|\*)? # Minimum field width (optional) (\.(\d+|\*))? # Precision (optional) [lLh]? # Length modifier (optional) [cdeEfFgGioursxX%] # Conversion type''', re.VERBOSE) BAD_PATTERN = re.compile(r"%\([^)]*?\)") Notice I drop the space from the conversion flags, and leave only "#", "0", "-" and "+". It is fully legal for someone to write the following, "%(k) done" % {"k": 100} and the result is ' 100one' This is because Python interpreters the space between "%(k)" and "done" as the conversion flag, then Python eats "d" from "done" as the conversion type. Though it's legal, it's error prone. So I think it's better not to support this rare usage of space in the conversion flags, and report it as error. -- Thanks and best regards! Zhou Zheng Sheng / 周征晟 E-mail: zhshzhou@linux.vnet.ibm.com Telephone: 86-10-82454397

On 04/10/2014 03:47 PM, Zhou Zheng Sheng wrote:
on 2014/04/08 21:18, shaohef@linux.vnet.ibm.com wrote:
+PATTERN = re.compile("%\([^\)]*\)[#0\-\+ ]?(\d+|\*)?(\.(\d+|\*))?" + "[lLh]?[cdeEfFgGioursxX%]") +BAD_PATTERN = re.compile("%\([^\)]*\)") There are some problems in the regular expressions.
The first problem is that "\" is a special character in Python string. For example, we all know "\" gives a special meaning to "\n" and "\t". If we want to use it in a regular expression, we have to escape itself, for example re.compile("%\\(\\)"). This is tedious and error-prone. Usually we just use a prefix "r" before the string literal to stop Python from translating "\", for example, re.compile(r"%\(\)").
The second problem is that "+", "*", "(", ")" automatically lose their special meaning inside "[ ]", we can use them directly in "[ ]".
The third problem is that we should comment each part of a complicated regular expression.
As a result, I suggest the following the regular expression.
# Match all conversion specifier with mapping key PATTERN = re.compile(r'''%\([^)]+\) # Mapping key [#0\-+]? # Conversion flags (optional) (\d+|\*)? # Minimum field width (optional) (\.(\d+|\*))? # Precision (optional) [lLh]? # Length modifier (optional) [cdeEfFgGioursxX%] # Conversion type''', re.VERBOSE) BAD_PATTERN = re.compile(r"%\([^)]*?\)")
Notice I drop the space from the conversion flags, and leave only "#", "0", "-" and "+". It is fully legal for someone to write the following, "%(k) done" % {"k": 100} and the result is ' 100one'
This is because Python interpreters the space between "%(k)" and "done" as the conversion flag, then Python eats "d" from "done" as the conversion type. Though it's legal, it's error prone. So I think it's better not to support this rare usage of space in the conversion flags, and report it as error.
ACK. -- Thanks and best regards! Sheldon Feng(冯少合)<shaohef@linux.vnet.ibm.com> IBM Linux Technology Center
participants (3)
-
shaohef@linux.vnet.ibm.com
-
Sheldon
-
Zhou Zheng Sheng