From patchwork Tue Sep 9 15:18:05 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Simon Glass
X-Patchwork-Id: 281
From: Simon Glass
To: U-Boot Concept
Date: Tue, 9 Sep 2025 09:18:05 -0600
Message-ID: <20250909151824.2327219-9-sjg@u-boot.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250909151824.2327219-1-sjg@u-boot.org>
References: <20250909151824.2327219-1-sjg@u-boot.org>
X-MailFrom: sjg@u-boot.org
CC: Heinrich Schuchardt, Simon Glass, Claude
Subject: [Concept] [PATCH 08/18] ulib: scripts: Add a script to support symbol-renaming
List-Id: Discussion and patches related to U-Boot Concept

From: Simon Glass

When U-Boot is used as a library with other programs, some U-Boot
function names may conflict with the program, or with standard-library
functions. For example, printf() is defined by U-Boot but is typically
used by the program as well.

The easiest solution is to rename symbols in the object file, so that
they appear with a 'ub_' prefix when linked with the program.

Add a new build_api.py script which can:
- rename symbols based on a rename.syms file
- generate a header file (with the renamed symbols) for use by the
  program

This makes use of the 'objcopy --redefine-sym' feature.

The tool has 100% test coverage.

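As a rough illustration of the approach (not part of the patch), the
sketch below turns rename.syms-style text into a list of
'objcopy --redefine-sym' arguments. The helper names parse_renames() and
build_objcopy_cmd(), the sample input and the in.o/out.o file names are
all assumptions made for this example; the real logic is in build_api.py
in the diff that follows.

```python
# Illustrative sketch only: parse_renames() and build_objcopy_cmd() are
# hypothetical helpers, not functions taken from build_api.py
from typing import List, Tuple


def parse_renames(text: str) -> List[Tuple[str, str]]:
    """Return (orig, new) pairs from rename.syms-style text"""
    pairs = []
    for line in text.splitlines():
        line = line.rstrip()
        # Skip blank lines, comments and 'file:' directives
        if not line or line.startswith('#') or line.startswith('file:'):
            continue
        sym = line.strip()
        if '=' in sym:
            # Explicit mapping: orig=new
            orig, new = (part.strip() for part in sym.split('=', 1))
        else:
            # Default mapping: add the 'ub_' prefix
            orig, new = sym, f'ub_{sym}'
        pairs.append((orig, new))
    return pairs


def build_objcopy_cmd(pairs, infile: str, outfile: str) -> List[str]:
    """Build an objcopy command line that applies every rename"""
    cmd = ['objcopy']
    for orig, new in pairs:
        cmd += ['--redefine-sym', f'{orig}={new}']
    return cmd + [infile, outfile]


# Made-up sample input, in the format described by the parser docstring
SAMPLE = '''# example input
file: stdio.h
 printf
 sprintf=my_sprintf
'''

pairs = parse_renames(SAMPLE)
print(pairs)
print(' '.join(build_objcopy_cmd(pairs, 'in.o', 'out.o')))
```

The sketch only builds the command line; it does not run objcopy, so it
can be tried on a machine without binutils installed.
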
Co-developed-by: Claude
Signed-off-by: Simon Glass
---
 scripts/build_api.py           | 714 +++++++++++++++++++++++++++++++++
 test/scripts/test_build_api.py | 704 ++++++++++++++++++++++++++++++++
 2 files changed, 1418 insertions(+)
 create mode 100755 scripts/build_api.py
 create mode 100644 test/scripts/test_build_api.py

diff --git a/scripts/build_api.py b/scripts/build_api.py
new file mode 100755
index 00000000000..7adc6c978a3
--- /dev/null
+++ b/scripts/build_api.py
@@ -0,0 +1,714 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+"""Script to parse rename.syms files and generate API headers"""
+
+import argparse
+import filecmp
+import os
+import re
+import subprocess
+import sys
+import time
+from concurrent.futures import ThreadPoolExecutor, as_completed
+from dataclasses import dataclass
+from itertools import groupby
+from typing import List
+
+# Add the tools directory to the path for u_boot_pylib
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'tools'))
+
+# pylint: disable=wrong-import-position,import-error
+from u_boot_pylib import tools
+from u_boot_pylib import test_util
+
+# API header template parts
+API_HEADER = '''#ifndef __ULIB_API_H
+#define __ULIB_API_H
+
+#include
+#include
+
+/* Auto-generated header with renamed U-Boot library functions */
+
+'''
+
+API_FOOTER = '''#endif /* __ULIB_API_H */
+'''
+
+def rename_function(src, old_name, new_name):
+    """Rename a function in C source code
+
+    Args:
+        src (str): The source code containing the function
+        old_name (str): Current function name to rename
+        new_name (str): New function name
+
+    Returns:
+        str: Source code with the function renamed
+    """
+    # Pattern to match function declaration/definition
+    # Matches: return_type func(parameters)
+    pattern = r'\b' + re.escape(old_name) + r'\b(?=\s*\()'
+
+    # Replace all occurrences of the function name (in function comment too)
+    renamed_code = re.sub(pattern, new_name, src)
+    return renamed_code
+
+
+@dataclass
+class Symbol:
+    """Represents a symbol rename operation for library functions.
+
+    Used to track how functions from a header file should be renamed
+    to create a namespaced API (e.g., printf -> ub_printf).
+    """
+    hdr: str  # Header file containing the function
+    orig: str  # Original function name
+    new_name: str  # New function name after renaming
+
+
+class RenameSymsParser:
+    """Parser for rename.syms files.
+
+    Format:
+        file: header.h
+         symbol1
+         symbol2=renamed_symbol2
+         symbol3
+
+    Lines starting with 'file:' specify a header file.
+    Lines indented with space or tab specify symbols from that header.
+    Symbol lines can use '=' for explicit renaming, otherwise 'ub_' prefix is
+    added.
+    Comments start with '#' and must begin at start of line.
+    Empty lines are allowed.
+    Trailing spaces are stripped but no other whitespace is allowed except for
+    symbol indentation.
+    """
+    def __init__(self, fname: str):
+        """Initialize the parser with a rename.syms file path
+
+        Args:
+            fname (str): Path to the rename.syms file to parse
+        """
+        self.fname = fname
+        self.syms: List[Symbol] = []
+
+    def parse_line(self, line: str, hdr: str) -> Symbol:
+        """Parse a line and return a Symbol or None
+
+        Args:
+            line (str): The line to parse (already stripped)
+            hdr (str): Current header file name
+
+        Returns:
+            Symbol or None: Symbol if line contains a symbol definition,
+                None otherwise
+        """
+        if '=' in line:
+            # Explicit mapping: orig=new
+            orig, new = line.split('=', 1)
+            orig = orig.strip()
+            new = new.strip()
+        else:
+            # Default mapping: add 'ub_' prefix
+            orig = line
+            new = f'ub_{orig}'
+
+        return Symbol(hdr=hdr, orig=orig, new_name=new)
+
+    def parse(self) -> List[Symbol]:
+        """Parse the rename.syms file and return list of symbols
+
+        Returns:
+            List[Symbol]: List of symbol rename operations
+        """
+        hdr = None
+        content = tools.read_file(self.fname, binary=False)
+        for line_num, line in enumerate(content.splitlines(), 1):
+            line = line.rstrip()
+
+            # Skip empty lines and comments
+            if not line or line.startswith('#'):
+                continue
+
+            # Check for file directive
+            if line.startswith('file:'):
+                hdr = line.split(':', 1)[1].strip()
+                continue
+
+            # Check for symbol (indented line with space or tab)
+            if line[0] not in [' ', '\t']:
+                # Non-indented, non-file lines are invalid
+                raise ValueError(f'Line {line_num}: Invalid format - '
+                                 f"symbols must be indented: '{line}'")
+
+            if hdr is None:
+                raise ValueError(f"Line {line_num}: Symbol '{line.strip()}' "
+                                 f'found without a header file directive')
+
+            # Process valid symbol
+            symbol = self.parse_line(line.strip(), hdr)
+            if symbol:
+                self.syms.append(symbol)
+        return self.syms
+
+    def dump(self):
+        """Print the parsed symbols in a formatted way"""
+        print(f'Parsed {len(self.syms)} symbols from '
+              f'{self.fname}:')
+        print()
+        hdr = None
+        for sym in self.syms:
+            if sym.hdr != hdr:
+                hdr = sym.hdr
+                print(f'Header: {hdr}')
+            print(f' {sym.orig} -> {sym.new_name}')
+        print(f'\nTotal: {len(self.syms)} symbols')
+
+
+class DeclExtractor:
+    """Extracts function declarations from header files with comments
+
+    Expects functions to have an optional preceding comment block (either /**/
+    style or // single-line) followed immediately by the function declaration.
+    The declaration may span multiple lines until a semicolon or opening brace.
+
+    Properties:
+        lines (List[str]): List of lines from the header file, set by extract()
+    """
+
+    def __init__(self, fname: str):
+        """Initialize with header file path
+
+        Args:
+            fname (str): Path to the header file
+        """
+        self.fname = fname
+        self.lines = []
+
+    def find_function(self, func: str):
+        """Find the line index of a function declaration
+
+        Args:
+            func (str): Name of the function to find
+
+        Returns:
+            int or None: Line index of function declaration, or None if
+                not found
+        """
+        pattern = r'\b' + re.escape(func) + r'\s*\('
+
+        for i, full_line in enumerate(self.lines):
+            line = full_line.strip()
+            # Skip comment lines and find actual function declarations
+            if (not line.startswith('*') and not line.startswith('//') and
+                    re.search(pattern, full_line)):
+                return i
+
+        return None
+
+    def find_preceding_comment(self, func_idx: int):
+        """Find comment block preceding a function declaration
+
+        Args:
+            func_idx (int): Line index of the function declaration
+
+        Returns:
+            int or None: Start line index of comment block, or None if not found
+        """
+        # Search backwards from the line before the function declaration
+        for i in range(func_idx - 1, -1, -1):
+            line = self.lines[i].strip()
+            if not line:
+                continue  # Skip empty lines
+            if line.startswith('*/'):
+                # Find the start of this comment block
+                for j in range(i, -1, -1):
+                    if '/**' in self.lines[j]:
+                        return j
+                break
+            if line.startswith('//'):
+                # Found single-line comment, include it if it's the first
+                # non-empty line before function
+                return i
+            if not line.startswith('*'):
+                # Hit non-comment content, no preceding comment
+                break
+        return None
+
+    def extract_lines(self, start_idx: int, func_idx: int):
+        """Extract comment and function declaration lines
+
+        Args:
+            start_idx (int): Starting line index (comment or function)
+            func_idx (int): Function declaration line index
+
+        Returns:
+            str: Lines containing the complete declaration joined with newlines
+        """
+        lines = []
+
+        # Add comment lines if found
+        if start_idx < func_idx:
+            lines.extend(self.lines[start_idx:func_idx])
+
+        # Add function declaration lines
+        for line in self.lines[func_idx:func_idx + 10]:
+            lines.append(line)
+            if ';' in line or '{' in line:
+                break
+
+        return '\n'.join(lines)
+
+    def extract(self, func: str):
+        """Find a function declaration in a header file, including its comment
+
+        Args:
+            func (str): Name of the function to find
+
+        Returns:
+            str or None: The function declaration with its comment, or None
+                if not found
+        """
+        self.lines = tools.read_file(self.fname, binary=False).split('\n')
+
+        func_idx = self.find_function(func)
+        if func_idx is None:
+            return None
+
+        comment_idx = self.find_preceding_comment(func_idx)
+        start_idx = comment_idx if comment_idx is not None else func_idx
+
+        return self.extract_lines(start_idx, func_idx)
+
+    @staticmethod
+    def extract_decl(fname, func):
+        """Find a function declaration in a header file, including its comment
+
+        Args:
+            fname (str): Path to the header file
+            func (str): Name of the function to find
+
+        Returns:
+            str or None: The function declaration with its comment, or None
+                if not found
+        """
+        extractor = DeclExtractor(fname)
+        return extractor.extract(func)
+
+
+class SymbolRedefiner:
+    """Applies symbol redefinitions to object files using objcopy
+
+    Processes object files to rename symbols using objcopy --redefine-sym.
+    Always copies modified files to an output directory.
+
+    Properties:
+        redefine_args (List[str]): objcopy arguments for symbol redefinition
+        symbol_names (set): Set of original symbol names to look for
+    """
+
+    def __init__(self, syms: List[Symbol], outdir: str, max_workers,
+                 verbose=False):
+        """Initialize with symbols and output settings
+
+        Args:
+            syms (List[Symbol]): List of symbols to redefine
+            outdir (str): Directory to write modified object files
+            max_workers (int): Number of parallel workers
+            verbose (bool): Whether to show verbose output
+        """
+        self.syms = syms
+        self.outdir = outdir
+        self.verbose = verbose
+        self.max_workers = max_workers
+        self.redefine_args = []
+        self.symbol_names = set()
+
+        # Build objcopy command arguments and symbol set
+        for sym in syms:
+            self.redefine_args.extend(['--redefine-sym',
+                                       f'{sym.orig}={sym.new_name}'])
+            self.symbol_names.add(sym.orig)
+
+    def redefine_file(self, infile: str, outfile: str):
+        """Apply symbol redefinitions to a single object file
+
+        Args:
+            infile (str): Input object file path
+            outfile (str): Output object file path
+        """
+        cmd = ['objcopy'] + self.redefine_args + [infile, outfile]
+        subprocess.run(cmd, check=True, capture_output=True, text=True)
+        if self.verbose:
+            print(f'Copied and modified {infile} -> {outfile}')
+
+    def _process_single_file(self, path: str, outfile: str) -> bool:
+        """Process a single file (for parallel execution)
+
+        Args:
+            path (str): Input file path
+            outfile (str): Output file path
+
+        Returns:
+            bool: True if file was modified, False otherwise
+        """
+        # Always run objcopy to apply redefinitions
+        self.redefine_file(path, outfile)
+
+        # Check if the file was actually modified
+        return not filecmp.cmp(path, outfile, shallow=False)
+
+    def process(self, work_items: List[tuple[str, str]]) -> \
+            tuple[List[str], int]:
+        """Process object files and apply symbol redefinitions
+
+        Args:
+            work_items (List[tuple[str, str]]): List of
+                (input_path, output_path) tuples
+
+        Returns:
+            tuple[List[str], int]: List of output object file paths and
+                count of modified files
+        """
+        # Process files in parallel
+        outfiles = []
+        modified = 0
+
+        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
+            # Submit all jobs
+            future_to_item = {
+                executor.submit(self._process_single_file, path, outfile):
+                (path, outfile)
+                for path, outfile in work_items
+            }
+
+            # Collect results
+            for future in as_completed(future_to_item):
+                path, outfile = future_to_item[future]
+                was_modified = future.result()
+                if was_modified:
+                    modified += 1
+                outfiles.append(outfile)
+
+        # Sort outfiles to maintain consistent order
+        outfiles.sort()
+        return outfiles, modified
+
+    @staticmethod
+    def apply_renames(obj_files, syms, outdir: str, max_workers, verbose=False):
+        """Apply symbol redefinitions to object files using objcopy
+
+        Args:
+            obj_files (List[str]): List of object file paths
+            syms (List[Symbol]): List of symbols
+            outdir (str): Directory to write modified object files
+            max_workers (int): Number of parallel workers
+            verbose (bool): Whether to show verbose output
+
+        Returns:
+            tuple[List[str], int]: List of output object file paths and
+                count of modified files
+        """
+        if not syms:
+            return obj_files, 0
+
+        redefiner = SymbolRedefiner(syms, outdir, max_workers, verbose)
+
+        # Setup: create output directory and prepare work items
+        os.makedirs(outdir, exist_ok=True)
+
+        # Prepare work items - just input and output paths
+        work_items = []
+        for path in obj_files:
+            uniq = os.path.relpath(path).replace('/', '_')
+            outfile = os.path.join(outdir, uniq)
+            work_items.append((path, outfile))
+
+        return redefiner.process(work_items)
+
+
+class ApiGenerator:
+    """Generates API headers with renamed function declarations
+
+    Processes symbols and creates a unified header file with renamed function
+    declarations extracted from original header files.
+    """
+
+    def __init__(self, syms: List[Symbol], include_dir: str, verbose=False):
+        """Initialize with symbols and include directory
+
+        Args:
+            syms (List[Symbol]): List of symbols
+            include_dir (str): Directory to search for header files
+            verbose (bool): Whether to print status messages
+        """
+        self.syms = syms
+        self.include_dir = include_dir
+        self.verbose = verbose
+        self.missing_decls = []
+        self.missing_hdrs = []
+
+    def process_header(self, hdr: str, header_syms: List[Symbol]):
+        """Process a single header file and its symbols
+
+        Args:
+            hdr (str): Header file name
+            header_syms (List[Symbol]): Symbols from this header
+
+        Returns:
+            List[str]: Lines for this header section
+        """
+        lines = [f'/* Functions from {hdr} */']
+
+        path = os.path.join(self.include_dir, hdr)
+        if not os.path.exists(path):
+            self.missing_hdrs.append(hdr)
+        else:
+            # Extract and rename declarations from the actual header
+            for sym in header_syms:
+                orig = DeclExtractor.extract_decl(
+                    path, sym.orig)
+                if orig:
+                    # Rename the function in the declaration
+                    renamed_decl = rename_function(
+                        orig, sym.orig, sym.new_name)
+                    lines.append(renamed_decl)
+                else:
+                    self.missing_decls.append((sym.orig, hdr))
+                lines.append('')
+
+        lines.append('')
+        return lines
+
+    def check_errors(self):
+        """Check for missing headers or declarations and build error message
+
+        Returns:
+            str: Error messages, or '' if None
+        """
+        msgs = []
+        if self.missing_hdrs:
+            msgs.append('')
+            msgs.append('Missing header files:')
+            for header in self.missing_hdrs:
+                msgs.append(f' - {header}')
+
+        if self.missing_decls:
+            msgs.append('')
+            msgs.append('Missing function declarations:')
+            for func_name, hdr in self.missing_decls:
+                msgs.append(f' - {func_name} in {hdr}')
+
+        return '\n'.join(msgs)
+
+    def generate(self, outfile: str):
+        """Generate the API header file
+
+        Args:
+            outfile (str): Path where to write the new header file
+
+        Returns:
+            int: 0 on success, 1 on error
+        """
+        # Process each header file
+        out = []
+        sorted_syms = sorted(self.syms, key=lambda s: s.hdr)
+        by_header = {hdr: list(syms)
+                     for hdr, syms in groupby(sorted_syms, key=lambda s: s.hdr)}
+        for hdr, syms in by_header.items():
+            out.extend(self.process_header(hdr, syms))
+
+        # Check for errors and abort if any declarations are missing
+        error_msg = self.check_errors()
+        if error_msg:
+            print(error_msg, file=sys.stderr)
+            return 1
+
+        # Write the header file
+        content = API_HEADER + '\n'.join(out) + API_FOOTER
+        tools.write_file(outfile, content, binary=False)
+        if self.verbose:
+            print(f'Generated API header: {outfile}')
+
+        return 0
+
+    @staticmethod
+    def generate_hdr(syms, include_dir, outfile, verbose=False):
+        """Generate a new header file with renamed function declarations
+
+        Args:
+            syms (List[Symbol]): List of symbols
+            include_dir (str): Directory to search for header files
+            outfile (str): Path where to write the new header file
+            verbose (bool): Whether to print status messages
+
+        Returns:
+            int: 0 on success, 1 on error
+        """
+        if not syms:
+            print('Warning: No symbols found', file=sys.stderr)
+            return 0
+
+        generator = ApiGenerator(syms, include_dir, verbose)
+        return generator.generate(outfile)
+
+
+def run_tests(processes, test_name):  # pragma: no cover
+    """Run all the tests we have for build_api
+
+    Args:
+        processes (int): Number of processes to use to run tests
+        test_name (str): Name of specific test to run, or None to run all tests
+
+    Returns:
+        int: 0 if successful, 1 if not
+    """
+    # pylint: disable=import-outside-toplevel,import-error
+    # Import our test module
+    test_dir = os.path.join(os.path.dirname(__file__), '../test/scripts')
+    sys.path.insert(0, test_dir)
+
+    import test_build_api
+
+    sys.argv = [sys.argv[0]]
+
+    result = test_util.run_test_suites(
+        toolname='build_api', debug=True, verbosity=2, no_capture=False,
+        test_preserve_dirs=False, processes=processes, test_name=test_name,
+        toolpath=[],
+        class_and_module_list=[test_build_api.TestBuildApi])
+
+    return 0 if result.wasSuccessful() else 1
+
+
+def run_test_coverage():  # pragma: no cover
+    """Run the tests and check that we get 100% coverage"""
+    sys.argv = [sys.argv[0]]
+    test_util.run_test_coverage('scripts/build_api.py', None,
+                                ['tools/u_boot_pylib/*', '*/test*'], '.')
+
+
+def parse_args(argv):
+    """Parse and validate command line arguments
+
+    Args:
+        argv (List[str]): Arguments to parse
+
+    Returns:
+        tuple: (args, error_code) where args is argparse.Namespace or None,
+            and error_code is 0 for success or 1 for error
+    """
+    parser = argparse.ArgumentParser(
+        description='Parse rename.syms file and show symbols')
+    parser.add_argument('rename_syms', nargs='?',
+                        help='Path to rename.syms file')
+    parser.add_argument('-d', '--dump', action='store_true',
+                        help='Dump parsed symbols')
+    parser.add_argument('-r', '--redefine', nargs='*', metavar='OBJ_FILE',
+                        help='Apply symbol redefinitions to object files')
+    parser.add_argument('-a', '--api', metavar='HEADER_FILE',
+                        help='Generate API header with renamed functions')
+    parser.add_argument('-i', '--include-dir', metavar='DIR',
+                        help='Include directory containing header files')
+    parser.add_argument('-o', '--output-dir', metavar='DIR',
+                        help='Output directory for modified object files')
+    parser.add_argument('-v', '--verbose', action='store_true',
+                        help='Show verbose output')
+    parser.add_argument('-j', '--jobs', type=int, metavar='N',
+                        help='Number of parallel jobs for symbol processing')
+    parser.add_argument('-P', '--processes', type=int,
+                        help='set number of processes to use for running tests')
+    parser.add_argument('-t', '--test', action='store_true', dest='test',
+                        default=False, help='run tests')
+    parser.add_argument('-T', '--test-coverage', action='store_true',
+                        default=False,
+                        help='run tests and check for 100%% coverage')
+    args = parser.parse_args(argv)
+
+    # Check if running tests - if so, rename_syms is optional
+    running_tests = args.test or args.test_coverage
+
+    if not running_tests and not args.rename_syms:  # pragma: no cover
+        print('Error: rename_syms is required unless running tests',
+              # pragma: no cover
+              file=sys.stderr)  # pragma: no cover
+        return None, 1  # pragma: no cover
+
+    # Validate argument combinations
+    if args.redefine is not None and not args.redefine:
+        # args.redefine is [] when --redefine used with no object files
+        print('Error: --redefine requires at least one object file',
+              file=sys.stderr)
+        return None, 1
+
+    if args.redefine is not None and not args.output_dir:
+        print('Error: --output-dir is required with --redefine',
+              file=sys.stderr)
+        return None, 1
+
+    if args.api and not args.include_dir:
+        print('Error: --include-dir is required with --api',
+              file=sys.stderr)
+        return None, 1
+
+    return args, 0
+
+
+def main(argv=None):
+    """Main entry point for the script
+
+    Args:
+        argv (List[str], optional): Arguments to parse. Uses sys.argv[1:]
+            if None.
+
+    Returns:
+        int: Exit code (0 for success, 1 for error)
+    """
+    if argv is None:
+        argv = sys.argv[1:]
+    args, error_code = parse_args(argv)
+    if error_code:
+        return error_code
+
+    # Handle test options
+    if args.test:  # pragma: no cover
+        test_name = args.rename_syms  # pragma: no cover
+        return run_tests(args.processes, test_name)  # pragma: no cover
+
+    if args.test_coverage:  # pragma: no cover
+        run_test_coverage()  # pragma: no cover
+        return 0  # pragma: no cover
+
+    symbols_parser = RenameSymsParser(args.rename_syms)
+    syms = symbols_parser.parse()
+
+    if args.dump:
+        symbols_parser.dump()
+
+    if args.redefine is not None:
+        # Determine number of jobs
+        jobs = args.jobs if args.jobs else min(os.cpu_count() or 4, 8)
+        start_time = time.time()
+        outfiles, modified = SymbolRedefiner.apply_renames(
+            args.redefine, syms, args.output_dir, jobs, args.verbose)
+        # Print the list of output files for the build system to use
+        if args.output_dir:
+            print('\n'.join(outfiles))
+        elapsed = time.time() - start_time
+        if args.verbose:
+            print(f'Processed {len(args.redefine)} files ({modified} modified) '
+                  f'in {elapsed:.3f} seconds ({jobs} threads)', file=sys.stderr)
+
+    if args.api:
+        result = ApiGenerator.generate_hdr(syms, args.include_dir, args.api,
+                                           args.verbose)
+        if result:
+            return result
+
+    return 0
+
+
+if __name__ == '__main__':  # pragma: no cover
+    sys.exit(main())
diff --git a/test/scripts/test_build_api.py b/test/scripts/test_build_api.py
new file mode 100644
index 00000000000..8295c08715e
--- /dev/null
+++ b/test/scripts/test_build_api.py
@@ -0,0 +1,704 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# pylint: disable=cyclic-import
+"""Test suite for build_api.py script"""
+
+import contextlib
+from io import StringIO
+import os
+import subprocess
+import sys
+import tempfile
+import unittest
+
+# Add the scripts directory to the path
+script_dir = os.path.join(os.path.dirname(__file__), '..', '..', 'scripts')
+sys.path.insert(0, script_dir)
+
+# Add the tools directory to the path for u_boot_pylib
+tools_dir = os.path.join(os.path.dirname(__file__), '..', '..', 'tools')
+sys.path.insert(0, tools_dir)
+
+# pylint: disable=wrong-import-position,import-error
+from build_api import rename_function, RenameSymsParser, DeclExtractor
+from build_api import ApiGenerator, SymbolRedefiner, main
+from u_boot_pylib import tools
+
+
+class TestBuildApi(unittest.TestCase):
+    # pylint: disable=too-many-public-methods
+    """Test suite for build_api.py script"""
+
+    def setUp(self):
+        """Create temporary files for testing"""
+        # pylint: disable=R1732
+        self.tmpdir = tempfile.TemporaryDirectory()
+        # Create a temp file path for symbols.syms that tests can write to
+        self.sympath = os.path.join(self.tmpdir.name, 'symbols.syms')
+
+    def tearDown(self):
+        """Clean up temporary files"""
+        self.tmpdir.cleanup()
+
+    def write_tmp(self, content, filename):
+        """Create a temporary text file with given content"""
+        temp_path = os.path.join(self.tmpdir.name, filename)
+        tools.write_file(temp_path, content, binary=False)
+        return temp_path
+
+    def test_rename_function(self):
+        """Test basic function renaming"""
+        source_code = '''
+/**
+ * sprintf() - Format a string and place it in a buffer
+ *
+ * @buf: The buffer to place the result into
+ * @fmt: The format string to use
+ * @...: Arguments for the format string
+ *
+ * The function returns the number of characters written
+ * into @buf.
+ *
+ * See the vsprintf() documentation for format string extensions over C99.
+ */
+int sprintf(char *buf, const char *fmt, ...)
+__attribute__ ((format (__printf__, 2, 3)));
+'''
+        result = rename_function(source_code, 'sprintf', 'my_sprintf')
+
+        # Check that the function name was changed
+        assert 'int my_sprintf(char *buf' in result
+        assert 'int sprintf(char *buf' not in result
+
+    def test_rename_sym_parser(self):
+        """Test parsing symbol definition file format"""
+        content = '''# Test symbols.syms file
+file: stdio.h
+ printf
+ scanf
+
+file: string.h
+ strcpy
+ strlen=ub_str_length
+
+file: stdlib.h
+ malloc=custom_malloc
+'''
+        tools.write_file(self.sympath, content, binary=False)
+
+        parser = RenameSymsParser(self.sympath)
+        renames = parser.parse()
+
+        # Check we got the right number of renames
+        assert len(renames) == 5
+
+        # Check default prefix mapping
+        printf_rename = next(r for r in renames if r.orig == 'printf')
+        assert printf_rename.hdr == 'stdio.h'
+        assert printf_rename.new_name == 'ub_printf'
+
+        # Check explicit mapping
+        strlen_rename = next(r for r in renames if r.orig == 'strlen')
+        assert strlen_rename.hdr == 'string.h'
+        assert strlen_rename.new_name == 'ub_str_length'
+        malloc_rename = next(r for r in renames if r.orig == 'malloc')
+        assert malloc_rename.hdr == 'stdlib.h'
+        assert malloc_rename.new_name == 'custom_malloc'
+
+    def test_rename_sym_with_real_file(self):
+        """Test parsing with realistic symbols.syms file"""
+        symbols_content = '''# Symbols for U-Boot library
+file: stdio.h
+ printf
+ sprintf
+ snprintf
+ scanf
+ sscanf
+
+file: string.h
+ memcpy
+ memset
+ strlen
+ strcpy
+ strcmp
+
+file: stdlib.h
+ malloc
+ free
+ calloc
+'''
+        symbols_path = self.write_tmp(symbols_content, 'realistic_symbols.syms')
+
+        parser = RenameSymsParser(symbols_path)
+
+        # Should have some renames
+        renames = parser.parse()
+        assert renames
+
+        # Check that printf gets renamed to ub_printf
+        printf_rename = next((r for r in renames if r.orig == 'printf'), None)
+        assert printf_rename is not None
+        assert printf_rename.hdr == 'stdio.h'
+        assert printf_rename.new_name == 'ub_printf'
+
+    def test_rename_with_parser(self):
+        """Test integration between parser and renaming"""
+        content = '''file: stdio.h
+ sprintf
+ printf
+'''
+        tools.write_file(self.sympath, content, binary=False)
+        parser = RenameSymsParser(self.sympath)
+        renames = parser.parse()
+        # Use the parser results to rename functions in source code
+        source_code = '''
+int sprintf(char *buf, const char *fmt, ...);
+int printf(const char *fmt, ...);
+'''
+        result = source_code
+        for rename in renames:
+            result = rename_function(result, rename.orig, rename.new_name)
+
+        # Check that both functions were renamed
+        assert 'int ub_sprintf(char *buf' in result
+        assert 'int ub_printf(const char *fmt' in result
+        assert 'int sprintf(char *buf' not in result
+        assert 'int printf(const char *fmt' not in result
+
+    def test_redefine_option(self):
+        """Test symbol redefinition in object files"""
+        content = '''file: stdio.h
+ printf
+'''
+        rename_syms = self.write_tmp(content, 'redefine_symbols.syms')
+
+        # Create a simple C file with printf (use format string to prevent
+        # optimization to puts)
+        c_code = '''
+#include
+void test_function() {
+    printf("%s %d\\n", "Hello", 123);
+}
+'''
+        c_file_path = self.write_tmp(c_code, 'test.c')
+        obj_file_path = c_file_path.replace('.c', '.o')
+        # obj file will be cleaned up automatically with tmpdir
+
+        # Compile the C file to object file
+        compile_cmd = ['gcc', '-c', c_file_path, '-o', obj_file_path]
+        subprocess.run(compile_cmd, capture_output=True, text=True, check=True)
+
+        # Check that the object file contains printf symbol
+        nm_cmd = ['nm', obj_file_path]
+        result = subprocess.run(nm_cmd, capture_output=True, text=True,
+                                check=True)
+        assert 'printf' in result.stdout
+
+        # Test the parser
+        parser = RenameSymsParser(rename_syms)
+        renames = parser.parse()
+
+        # Verify we have the expected rename
+        assert len(renames) == 1
+        assert renames[0].orig == 'printf'
+        assert renames[0].new_name == 'ub_printf'
+        assert renames[0].hdr == 'stdio.h'
+
+        # Test the actual symbol redefinition
+        outfiles, modified = SymbolRedefiner.apply_renames(
+            [obj_file_path], renames, self.tmpdir.name, 1)
+        assert outfiles
+        assert modified == 1  # Should have modified 1 file
+        obj_file_path = outfiles[0]  # Use the output file for checking
+
+        # Check that the symbol was renamed
+        nm_cmd = ['nm', obj_file_path]
+        result = subprocess.run(nm_cmd, capture_output=True, text=True,
+                                check=True)
+
+        # Should now have ub_printf instead of printf
+        out = result.stdout.replace('ub_printf', '')
+        assert 'ub_printf' in result.stdout
+        assert 'printf' not in out
+
+    def test_extract_decl(self):
+        """Test extracting function declarations from headers"""
+        content = '''#ifndef TEST_H
+#define TEST_H
+
+/**
+ * sprintf() - Format a string and place it in a buffer
+ *
+ * @buf: The buffer to place the result into
+ * @fmt: The format string to use
+ * @...: Arguments for the format string
+ *
+ * The function returns the number of characters written
+ * into @buf.
+ */
+int sprintf(char *buf, const char *fmt, ...)
+\t\t__attribute__ ((format (__printf__, 2, 3))); + +// Another function without detailed comment + +int printf(const char *fmt, ...); + +/** + * strlen() - Calculate the length of a string + * @s: The string to measure + * + * Return: The length of the string + */ + +size_t strlen(const char *s); + +/* Broken comment block - ends without proper start */ +*/ +#define SOME_MACRO 1 +int broken_comment_func(void); + +/* Normal function preceded by non-comment content */ +int other_content_func(void); + +#endif +''' + hdr = self.write_tmp(content, 'test.h') + # Test finding sprintf with comment + decl = DeclExtractor.extract_decl(hdr, 'sprintf') + assert decl is not None + expected = '''/** + * sprintf() - Format a string and place it in a buffer + * + * @buf: The buffer to place the result into + * @fmt: The format string to use + * @...: Arguments for the format string + * + * The function returns the number of characters written + * into @buf. + */ +int sprintf(char *buf, const char *fmt, ...) 
+\t\t__attribute__ ((format (__printf__, 2, 3)));''' + assert decl == expected, ( + f'Expected:\n{expected}\n\nGot:\n{decl}') + + # Test finding printf without detailed comment + decl = DeclExtractor.extract_decl(hdr, 'printf') + assert decl is not None + expected = '''// Another function without detailed comment + +int printf(const char *fmt, ...);''' + assert decl == expected, ( + f'Expected:\n{expected}\n\nGot:\n{decl}') + + # Test finding strlen with comment + strlen_decl = DeclExtractor.extract_decl(hdr, 'strlen') + assert strlen_decl is not None + expected_strlen = '''/** + * strlen() - Calculate the length of a string + * @s: The string to measure + * + * Return: The length of the string + */ + +size_t strlen(const char *s);''' + assert strlen_decl == expected_strlen, ( + f'Expected:\n{expected_strlen}\n\nGot:\n{strlen_decl}') + + # Test function not found + assert not DeclExtractor.extract_decl(hdr, 'nonexistent') + + # Test function with broken comment block (declaration is still found) + broken_decl = DeclExtractor.extract_decl(hdr, 'broken_comment_func') + assert broken_decl is not None + assert 'int broken_comment_func(void);' in broken_decl + + # Test function preceded by non-comment content (no comment) + other_decl = DeclExtractor.extract_decl(hdr, 'other_content_func') + assert other_decl is not None + assert 'int other_content_func(void);' in other_decl + + def test_extract_decl_malformed_comment(self): + """Test extracting declaration with malformed comment block""" + # Create header where */ appears but no /** is found backwards + content = '''#ifndef TEST_H +#define TEST_H + +some code here +*/ +int malformed_func(void); + +#endif +''' + hdr = self.write_tmp(content, 'malformed.h') + + # This should find the function but no comment (malformed comment) + decl = DeclExtractor.extract_decl(hdr, 'malformed_func') + assert decl is not None + assert decl == 'int malformed_func(void);' + + def test_symbol_redefiner_coverage(self): + """Test SymbolRedefiner 
edge cases for better coverage""" + content = '''file: stdio.h + printf + custom_func +''' + rename_syms = self.write_tmp(content, 'coverage_symbols.syms') + + # Create C file with defined symbol (not just undefined reference) + c_code_defined = ''' +void printf(const char *fmt, ...) { + // Custom printf implementation +} +''' + c_file_defined = self.write_tmp(c_code_defined, 'defined_symbol.c') + obj_file_defined = c_file_defined.replace('.c', '.o') + + # Compile to create object with defined symbol + compile_cmd = ['gcc', '-c', c_file_defined, '-o', obj_file_defined] + subprocess.run(compile_cmd, capture_output=True, text=True, check=True) + + # Create C file with no target symbols at all + c_code_no_symbols = ''' +void other_func(void) { + int x = 42; +} +''' + c_file_no_symbols = self.write_tmp(c_code_no_symbols, 'no_symbols.c') + obj_file_no_symbols = c_file_no_symbols.replace('.c', '.o') + + compile_cmd = ['gcc', '-c', c_file_no_symbols, '-o', + obj_file_no_symbols] + subprocess.run(compile_cmd, capture_output=True, text=True, check=True) + + # Test with both files + parser = RenameSymsParser(rename_syms) + renames = parser.parse() + + # This should process both files - one with defined symbol, one without + # target symbols + # Test with verbose output + stdout = StringIO() + with contextlib.redirect_stdout(stdout): + outfiles, modified = SymbolRedefiner.apply_renames( + [obj_file_defined, obj_file_no_symbols], renames, + self.tmpdir.name, 1, verbose=True) + + assert outfiles + assert len(outfiles) == 2 + # Should have modified 1 file (the one with defined symbol) + assert modified == 1 + assert 'Copied and modified' in stdout.getvalue() + + def test_apply_renames_empty_symbols(self): + """Test SymbolRedefiner.apply_renames with empty symbol list""" + # Create a simple object file + c_code = ''' +void test_func(void) { + int x = 42; +} +''' + c_file = self.write_tmp(c_code, 'test_empty_syms.c') + obj_file = c_file.replace('.c', '.o') + + compile_cmd = 
['gcc', '-c', c_file, '-o', obj_file] + subprocess.run(compile_cmd, capture_output=True, text=True, check=True) + + # Call apply_renames with empty symbol list + empty_syms = [] + obj_files = [obj_file] + result_files, modified = SymbolRedefiner.apply_renames( + obj_files, empty_syms, self.tmpdir.name, 1) + + # Should return the original obj_files unchanged and 0 modified + assert result_files == obj_files + assert modified == 0 + + def test_api_generation_empty_symbols(self): + """Test API generation with empty symbol list""" + api_file = self.write_tmp('', 'empty_api.h') + + # Test generate_hdr with empty symbol list + stderr = StringIO() + with contextlib.redirect_stderr(stderr): + result = ApiGenerator.generate_hdr([], '/nonexistent', api_file) + + # Should return 0 and print warning + assert result == 0 + assert 'Warning: No symbols found' in stderr.getvalue() + + def test_parse_args_errors(self): + """Test main() with parse_args validation errors""" + + # Test 1: --redefine with no object files + test_args = ['test.syms', '--redefine', '--output-dir', '/tmp'] + + stderr = StringIO() + with contextlib.redirect_stderr(stderr): + result = main(test_args) + + assert result == 1 + assert 'Error: --redefine requires at least one object file' in \ + stderr.getvalue() + + # Test 2: --redefine without --output-dir + test_args = ['test.syms', '--redefine', 'test.o'] + + stderr = StringIO() + with contextlib.redirect_stderr(stderr): + result = main(test_args) + + assert result == 1 + assert 'Error: --output-dir is required with --redefine' in \ + stderr.getvalue() + + # Test 3: --api without --include-dir + test_args = ['test.syms', '--api', 'api.h'] + + stderr = StringIO() + with contextlib.redirect_stderr(stderr): + result = main(test_args) + + assert result == 1 + assert 'Error: --include-dir is required with --api' in stderr.getvalue() + + def test_main_function_paths(self): + """Test main function with different argument combinations""" + + # Create test files + 
content = '''file: stdio.h + printf +''' + rename_syms = self.write_tmp(content, 'rename.syms') + + c_code = ''' +#include <stdio.h> +void test_function() { + printf("%s\\n", "test"); +} +''' + c_file = self.write_tmp(c_code, 'main_test.c') + obj_file = c_file.replace('.c', '.o') + + compile_cmd = ['gcc', '-c', c_file, '-o', obj_file] + subprocess.run(compile_cmd, capture_output=True, text=True, check=True) + + # Test redefine path + test_args = [rename_syms, '--redefine', obj_file, '--output-dir', + self.tmpdir.name, '--verbose'] + stdout = StringIO() + stderr = StringIO() + with (contextlib.redirect_stdout(stdout), + contextlib.redirect_stderr(stderr)): + result = main(test_args) + assert result == 0 + + # Check that timing message was printed to stderr with verbose + stderr = stderr.getvalue() + assert 'Processed 1 files (0 modified) in' in stderr + + def test_main_function_with_jobs(self): + """Test main function with --jobs option to exercise max_workers path""" + + # Create test files + content = '''file: stdio.h + printf +''' + rename_syms = self.write_tmp(content, 'rename.syms') + + c_code = ''' +#include <stdio.h> +void test_function() { + printf("%s\\n", "test"); +} +''' + c_file = self.write_tmp(c_code, 'jobs_test.c') + obj_file = c_file.replace('.c', '.o') + + compile_cmd = ['gcc', '-c', c_file, '-o', obj_file] + subprocess.run(compile_cmd, capture_output=True, text=True, check=True) + + # Test redefine path with explicit --jobs option + test_args = [rename_syms, '--redefine', obj_file, '--output-dir', + self.tmpdir.name, '--jobs', '2', '--verbose'] + stdout = StringIO() + stderr = StringIO() + with (contextlib.redirect_stdout(stdout), + contextlib.redirect_stderr(stderr)): + result = main(test_args) + assert result == 0 + + # Check that timing message includes thread count + stderr = stderr.getvalue() + assert 'Processed 1 files (0 modified) in' in stderr + + # Test API generation path with verbose output + fake_stdio = '''#ifndef STDIO_H +#define STDIO_H +int 
printf(const char *fmt, ...); +#endif +''' + self.write_tmp(fake_stdio, 'stdio.h') + api_file = self.write_tmp('', 'main_api.h') + + test_args = [rename_syms, '--api', api_file, '--include-dir', + self.tmpdir.name, '--output-dir', self.tmpdir.name, + '--verbose'] + + stdout = StringIO() + with contextlib.redirect_stdout(stdout): + result = main(test_args) + + assert result == 0 + assert 'Generated API header:' in stdout.getvalue() + + def test_main_api_generation_failure(self): + """Test main() when API generation fails""" + + # Create test files that will cause API generation to fail + content = '''file: nonexistent.h + missing_function +''' + rename_syms = self.write_tmp(content, 'failing_api.syms') + api_file = self.write_tmp('', 'failing_api.h') + + # This will fail because nonexistent.h doesn't exist + test_args = [rename_syms, '--api', api_file, '--include-dir', + '/nonexistent_dir', '--output-dir', self.tmpdir.name] + + stderr = StringIO() + with contextlib.redirect_stderr(stderr): + result = main(test_args) + + # Should return 1 because API generation failed + assert result == 1 + assert 'Missing header files:' in stderr.getvalue() + + def test_api_generation(self): + """Test API header generation""" + content = '''file: stdio.h + printf +''' + tools.write_file(self.sympath, content, binary=False) + + api = self.write_tmp('', 'api.h') + parser = RenameSymsParser(self.sympath) + renames = parser.parse() + + # Generate the API header - this will fail since stdio.h is not found + captured = StringIO() + with contextlib.redirect_stderr(captured): + result = ApiGenerator.generate_hdr(renames, '/nonexistent', api) + + # This test expects failure since stdio.h header is not available + assert result == 1 + + def test_api_generation_missing_headers(self): + """Test API generation error handling for missing header files""" + content = '''file: nonexistent.h + missing_func +''' + tools.write_file(self.sympath, content, binary=False) + + api = self.write_tmp('', 
'api.h') + parser = RenameSymsParser(self.sympath) + renames = parser.parse() + + # This should exit with an error + captured = StringIO() + with contextlib.redirect_stderr(captured): + result = ApiGenerator.generate_hdr(renames, '/nonexistent', api) + assert result == 1, f'Expected return code 1, got {result}' + + assert 'Missing header files:' in captured.getvalue() + assert 'nonexistent.h' in captured.getvalue() + + def test_api_generation_missing_functions(self): + """Test API generation error handling for missing functions""" + # Create a fake stdio.h with a different function for testing + fake_stdio_content = '''#ifndef STDIO_H +#define STDIO_H +int existing_func(void); +#endif +''' + self.write_tmp(fake_stdio_content, 'stdio.h') + include_dir = self.tmpdir.name + + content = '''file: stdio.h + nonexistent_function +''' + tools.write_file(self.sympath, content, binary=False) + + api = self.write_tmp('', 'api.h') + parser = RenameSymsParser(self.sympath) + renames = parser.parse() + + # This should exit with an error for missing function declarations + captured = StringIO() + with contextlib.redirect_stderr(captured): + result = ApiGenerator.generate_hdr(renames, include_dir, api) + assert result == 1, f'Expected return code 1, got {result}' + + assert 'Missing function declarations:' in captured.getvalue() + assert 'nonexistent_function in stdio.h' in captured.getvalue() + + def test_parser_exceptions(self): + """Test parser error handling for invalid formats""" + + # Test 1: Symbol without header file + inval1 = '''# Test file with symbol before header + printf +file: stdio.h + scanf +''' + temp_path1 = self.write_tmp(inval1, 'test1.syms') + parser = RenameSymsParser(temp_path1) + with self.assertRaises(ValueError) as cm: + parser.parse() + self.assertIn("Symbol 'printf' found without a header file directive", + str(cm.exception)) + + # Test 2: Invalid format (non-indented, non-file line) + inval2 = '''file: stdio.h + printf +invalid_line_here + scanf +''' 
+ temp_path2 = self.write_tmp(inval2, 'test2.syms') + parser = RenameSymsParser(temp_path2) + with self.assertRaises(ValueError) as cm: + parser.parse() + self.assertIn("Invalid format - symbols must be indented", + str(cm.exception)) + + def test_main_dump_symbols(self): + """Test main function with dump option""" + content = '''file: stdio.h + printf + sprintf + +file: string.h + strlen +''' + rename_syms = self.write_tmp(content, 'test_symbols.syms') + + # Mock sys.argv to simulate command line arguments + original_argv = sys.argv + try: + sys.argv = ['build_api.py', rename_syms, '--dump'] + + # Capture stdout to check dump output + captured = StringIO() + with contextlib.redirect_stdout(captured): + result = main() + + assert result == 0 + output = captured.getvalue() + assert all(item in output for item in + ['printf', 'sprintf', 'strlen', 'stdio.h', 'string.h']) + + finally: + sys.argv = original_argv + + +if __name__ == "__main__": + unittest.main()