[Concept,4/7] buildman: Fix kconfiglib string-value escaping

Message ID 20260329111037.1352652-5-sjg@u-boot.org
State New
Series qconfig: Use kconfiglib for database build and defconfig sync

Commit Message

Simon Glass March 29, 2026, 11:10 a.m. UTC
  From: Simon Glass <sjg@chromium.org>

The C Kconfig implementation (scripts/kconfig/confdata.c) stores string
values from .config files verbatim -- no unescaping on read, no
escaping on write. But kconfiglib's unescape() strips the backslash
before ANY character (\n -> n, \x -> x), and escape() re-adds one only
before \ and ". This means \n in a .config string silently becomes n
after a read/write round-trip.
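
As an illustration of the loss described above, the following minimal
sketch reproduces the pre-patch helpers (copied from the lines this patch
removes) under the hypothetical names old_unescape() and old_escape(). The
sample value is made up for the example and is not taken from any
defconfig.

```python
import re

# Pre-patch behaviour, copied from the removed lines of this diff.
# The old_ prefixes are hypothetical names used only in this sketch.
_old_unescape_sub = re.compile(r"\\(.)").sub


def old_unescape(s):
    # Drops the backslash before ANY character
    return _old_unescape_sub(r"\1", s)


def old_escape(s):
    # Re-adds a backslash only before \ and "
    return s.replace("\\", r"\\").replace('"', r'\"')


# Hypothetical string value as it might appear in a .config file
value = r"Autoboot in %d seconds\n"
round_tripped = old_escape(old_unescape(value))

# The backslash is gone, so the two-character sequence \n is now a plain n
print(repr(value), "->", repr(round_tripped))
```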

Fix unescape() to only handle \" (the only escape that matters for
.config parsing), and escape() to only handle " -> \". This makes
string values pass through unchanged, matching the C behaviour.

Also fix _expand_str() (the Kconfig file string parser) to only
consume \\, \", \', and \$ (meaningful Kconfig escapes), leaving \n
and similar sequences as-is.
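
The rule for _expand_str() can be sketched in isolation as follows.
strip_kconfig_escapes() is a hypothetical standalone helper, not part of
kconfiglib; it ignores macro expansion and quote termination and only
shows which backslashes are consumed.

```python
def strip_kconfig_escapes(s):
    """Drop the backslash only before \\, ", ' and $ (hypothetical helper)."""
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s) and s[i + 1] in "\\\"'$":
            # Meaningful Kconfig escape: keep the escaped character only
            out.append(s[i + 1])
            i += 2
        else:
            # Anything else, including \n, is kept as-is
            out.append(s[i])
            i += 1
    return "".join(out)


print(strip_kconfig_escapes(r"\$(foo)"))   # backslash before $ is consumed
print(strip_kconfig_escapes(r"line\n"))    # \n is left untouched
```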

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 doc/develop/qconfig.rst      | 17 +++++------------
 tools/buildman/kconfiglib.py | 32 ++++++++++++++++++++------------
 2 files changed, 25 insertions(+), 24 deletions(-)

-- 
2.43.0
  

Patch

diff --git a/doc/develop/qconfig.rst b/doc/develop/qconfig.rst
index 423fb9118d8..431de08cff3 100644
--- a/doc/develop/qconfig.rst
+++ b/doc/develop/qconfig.rst
@@ -41,18 +41,11 @@  since no ``make`` subprocesses or cross-compiler toolchains are needed.
 Defconfig files containing ``#include`` directives are preprocessed with the
 C preprocessor before loading, matching the behaviour of the build system.
 
-There are two known cosmetic differences compared with the old make-based
-approach:
-
-- ``CONFIG_GCC_VERSION`` reflects the host compiler rather than each board's
-  cross-compiler, since no cross-compiler is invoked.
-
-- Backslash escape sequences in string values (e.g. ``\n``, ``\x1b``) may
-  differ slightly due to kconfiglib's unescape/escape round-trip. This affects
-  a handful of string CONFIGs such as ``CONFIG_AUTOBOOT_PROMPT``.
-
-Neither difference affects the usefulness of the database for finding CONFIG
-combinations or computing imply relationships.
+There is one known cosmetic difference compared with the old make-based
+approach: ``CONFIG_GCC_VERSION`` reflects the host compiler rather than each
+board's cross-compiler, since no cross-compiler is invoked. This does not
+affect the usefulness of the database for finding CONFIG combinations or
+computing imply relationships.
 
 Resyncing defconfigs
 ~~~~~~~~~~~~~~~~~~~~
diff --git a/tools/buildman/kconfiglib.py b/tools/buildman/kconfiglib.py
index 27abbf9a7a1..cdaacf471d8 100644
--- a/tools/buildman/kconfiglib.py
+++ b/tools/buildman/kconfiglib.py
@@ -2727,10 +2727,16 @@  class Kconfig(object):
                 return (s, match.end())
 
             elif match.group() == "\\":
-                # Replace '\x' with 'x'. 'i' ends up pointing to the character
-                # after 'x', which allows macros to be canceled with '\$(foo)'.
+                # Replace '\x' with 'x' for characters that are meaningful
+                # escapes in Kconfig string literals: \\, \", \', and \$
+                # (to cancel macro expansion). Other \<char> sequences
+                # like \n are left as-is, so that the stored value
+                # round-trips correctly through escape().
                 i = match.end()
-                s = s[:match.start()] + s[i:]
+                if i < len(s) and s[i] in "\\\"'$":
+                    s = s[:match.start()] + s[i:]
+                else:
+                    i = match.end()
 
             elif match.group() == "$(":
                 # A macro call within the string
@@ -6176,23 +6182,25 @@  def split_expr(expr, op):
 
 def escape(s):
     r"""
-    Escapes the string 's' in the same fashion as is done for display in
-    Kconfig format and when writing strings to a .config file. " and \ are
-    replaced by \" and \\, respectively.
+    Escapes the string 's' for writing to a .config file. Only " is escaped
+    (to \"), matching the symmetric unescape() behaviour. Backslash sequences
+    like \\ and \n are left as-is, since unescape() preserves them and the C
+    Kconfig implementation does not process them.
     """
-    # \ must be escaped before " to avoid double escaping
-    return s.replace("\\", r"\\").replace('"', r'\"')
+    return s.replace('"', r'\"')
 
 
 def unescape(s):
     r"""
-    Unescapes the string 's'. \ followed by any character is replaced with just
-    that character. Used internally when reading .config files.
+    Unescapes the string 's'. \" is replaced with ". Other \<char> sequences,
+    including \\, are left as-is so that escape() can round-trip them
+    correctly. This matches the C Kconfig implementation which does not
+    unescape string values read from .config files.
     """
     return _unescape_sub(r"\1", s)
 
-# unescape() helper
-_unescape_sub = re.compile(r"\\(.)").sub
+# unescape() helper — only unescape \"
+_unescape_sub = re.compile(r'\\(")').sub
 
 
 def standard_kconfig(description=None):