[Concept,9/9] codman: Add documentation

Message ID 20251124134932.1991031-10-sjg@u-boot.org
State New
Series codman: Add a new source-code analysis tool

Commit Message

Simon Glass Nov. 24, 2025, 1:49 p.m. UTC
  From: Simon Glass <simon.glass@canonical.com>

Provide a description of the purpose of codman and some examples of how
to use it.

Signed-off-by: Simon Glass <simon.glass@canonical.com>
---

 doc/develop/codman.rst  |   1 +
 doc/develop/index.rst   |   1 +
 tools/codman/codman.rst | 426 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 428 insertions(+)
 create mode 120000 doc/develop/codman.rst
 create mode 100644 tools/codman/codman.rst
  

Patch

diff --git a/doc/develop/codman.rst b/doc/develop/codman.rst
new file mode 120000
index 00000000000..a4f5c03d72d
--- /dev/null
+++ b/doc/develop/codman.rst
@@ -0,0 +1 @@ 
+../../tools/codman/codman.rst
\ No newline at end of file
diff --git a/doc/develop/index.rst b/doc/develop/index.rst
index 1a8e0168c67..d325ad23897 100644
--- a/doc/develop/index.rst
+++ b/doc/develop/index.rst
@@ -101,6 +101,7 @@  Refactoring
 
    checkpatch
    coccinelle
+   codman
    qconfig
 
 Code quality
diff --git a/tools/codman/codman.rst b/tools/codman/codman.rst
new file mode 100644
index 00000000000..d58bceb2101
--- /dev/null
+++ b/tools/codman/codman.rst
@@ -0,0 +1,426 @@ 
+.. SPDX-License-Identifier: GPL-2.0+
+
+===================
+Codman code manager
+===================
+
+The codman tool analyses U-Boot builds to determine which source files and lines
+of code are actually compiled and used.
+
+U-Boot is a massive project with thousands of files and nearly endless
+configuration possibilities. A single board configuration might only compile a
+small fraction of the total source tree. Codman can help answer questions like:
+
+* "I just enabled ``CONFIG_CMD_NET``, how much code did that actually add?"
+* "How much code would I remove by disabling ``CONFIG_CMDLINE``?"
+
+Simply searching for ``CONFIG_`` macros or header inclusions is tricky because
+the build logic takes many forms: Makefile rules, #ifdefs, IS_ENABLED(),
+CONFIG_IS_ENABLED() and static inlines. The end result is board-specific in any
+case.
+
+Codman cuts through this complexity by analysing the actual build artifacts
+generated by the compiler:
+
+#. Builds the specified board.
+#. Parses the ``.cmd`` files to find which source files were compiled.
+#. Analyses the source code (with unifdef) or the object files (dwarf tables)
+   to figure out which files and lines were compiled.
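+
+As a rough illustration of the second step, the sketch below pulls the source
+path out of Kbuild ``.cmd`` files. It is not codman's own code: the function
+names are hypothetical, and it assumes each ``.*.o.cmd`` file carries a
+``source_<target> := <path>`` line, as Kbuild normally writes.
+
```python
import re
from pathlib import Path

# Kbuild records the source of each object in its .cmd file, for example:
#   source_common/main.o := common/main.c
SOURCE_RE = re.compile(r'^source_(\S+)[ \t]*:=[ \t]*(\S+)[ \t]*$', re.MULTILINE)


def parse_cmd_source(text):
    """Return the source path recorded in the text of one .cmd file, or None."""
    match = SOURCE_RE.search(text)
    return match.group(2) if match else None


def find_used_sources(build_dir):
    """Collect the source paths named by every .*.o.cmd file under build_dir."""
    used = set()
    for cmd_file in Path(build_dir).rglob('.*.o.cmd'):
        src = parse_cmd_source(cmd_file.read_text(errors='replace'))
        if src:
            used.add(src)
    return sorted(used)
```
+
+Calling ``find_used_sources('/tmp/b/sandbox')`` after a build would then return
+the sorted list of recorded source paths, which may be absolute or relative
+depending on how the build was invoked.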
+
+Usage
+=====
+
+Basic usage, from within the U-Boot source tree::
+
+  ./tools/codman/codman.py -b <board> [flags] <command> [command-flags]
+
+Codman does out-of-tree builds, meaning that the object files end up in a
+separate directory for each board. Use ``--build-base`` to set the base
+directory. The default is ``/tmp/b``, meaning that a sandbox build would end up
+in ``/tmp/b/sandbox``, for example.
+
+Relationship to LSPs
+====================
+
+LSPs can allow you to see unused code in your IDE, which is very handy for
+interactive use. Codman is more about getting a broader picture, although it
+does allow individual files to be listed. Codman does include a ``--lsp`` option
+but this doesn't work particularly well.
+
+Commands
+========
+
+The basic functionality is accessed via these commands:
+
+* ``stats`` - Show statistics (default if no command given)
+* ``dirs`` - Show directory breakdown
+* ``unused`` - List unused files
+* ``used`` - List used files
+* ``summary`` - Show per-file summary
+* ``detail <file>...`` - Show line-by-line analysis of one or more files
+* ``copy-used <dir>`` - Copy used source files to a directory
+
+
+If no command is given, codman builds the board and shows statistics about
+source-file usage.
+
+Adjusting Configuration (-a)
+============================
+
+Sometimes you want to explore "what if" scenarios without manually editing
+``defconfig`` files or running menuconfig. The ``-a`` (or ``--adjust``) option
+allows you to modify the Kconfig configuration on the fly before the analysis
+build runs.
+
+This is particularly useful for **impact analysis**: seeing exactly how much
+code a specific feature adds to the build.
+
+Syntax
+------
+
+The ``CONFIG_`` prefix is optional.
+
+* ``-a CONFIG_OPTION``: Enable a boolean option (sets to 'y').
+* ``-a ~CONFIG_OPTION``: Disable an option.
+* ``-a OPTION=val``: Set an option (``CONFIG_OPTION``) to a specific value.
+* ``-a CONFIG_A,CONFIG_B``: Set multiple options (comma-separated).
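+
+One way to read these rules is as a small parser. The sketch below is an
+illustration only, not the code codman uses: ``parse_adjust`` is a hypothetical
+name, and representing a disabled option as ``None`` is an assumption.
+
```python
def parse_adjust(specs):
    """Turn a list of -a arguments into a {CONFIG_name: value} dict.

    A value of None means the option is to be disabled.
    """
    changes = {}
    for spec in specs:
        # Each -a argument may hold several comma-separated items
        for item in spec.split(','):
            item = item.strip()
            if not item:
                continue
            disable = item.startswith('~')
            if disable:
                item = item[1:]
            name, sep, value = item.partition('=')
            # The CONFIG_ prefix is optional, so add it when missing
            if not name.startswith('CONFIG_'):
                name = 'CONFIG_' + name
            if disable:
                changes[name] = None
            else:
                # A bare name enables a boolean option
                changes[name] = value if sep else 'y'
    return changes
```
+
+With this reading, ``-a ~NET,NO_NET`` disables ``CONFIG_NET`` and sets
+``CONFIG_NO_NET`` to 'y'.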
+
+Examples
+--------
+
+**Check the impact of USB:**
+
+Enable the USB subsystem on the sandbox board and see how the code stats change::
+
+  codman -b sandbox -a CMD_USB stats
+
+**Disable Networking:**
+
+See what code remains active when networking is explicitly disabled::
+
+  codman -b sandbox -a ~NET,NO_NET stats
+
+**Multiple Adjustments:**
+
+Enable USB and USB storage together::
+
+  codman -b sandbox -a CONFIG_CMD_USB -a CONFIG_USB_STORAGE stats
+
+Common Options
+==============
+
+Building:
+
+* ``-b, --board <board>`` - Board to build and analyse (default: sandbox, uses buildman)
+* ``-B, --build-dir <dir>`` - Use existing build directory instead of building
+* ``--build-base <dir>`` - Base directory for builds (default: /tmp/b)
+* ``-n, --no-build`` - Skip building, use existing build directory
+* ``-a, --adjust <config>`` - Adjust CONFIG options (see section above)
+
+Line-level analysis:
+
+* ``-w, --dwarf`` - Use DWARF debug info (most accurate, requires rebuild)
+* ``-i, --include-headers`` - Include header files in unifdef analysis
+
+Filtering:
+
+* ``-f, --filter <pattern>`` - Filter files by wildcard pattern (e.g.,
+  ``*acpi*``)
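+
+The effect of a wildcard filter can be pictured with Python's ``fnmatch``
+module. This is a sketch under assumptions: ``filter_files`` is a hypothetical
+helper, and matching against the full relative path (rather than just the file
+name) is a guess about how the pattern is applied.
+
```python
from fnmatch import fnmatchcase


def filter_files(paths, pattern):
    """Keep only the paths that match a shell-style wildcard pattern."""
    # fnmatchcase() is case-sensitive on every platform; '*' also matches '/'
    return [path for path in paths if fnmatchcase(path, pattern)]
```
+
+Under those assumptions, ``*acpi*`` keeps ``lib/acpi/acpi.c`` and
+``drivers/qfw/qfw_acpi.c`` but drops ``cmd/mem.c``.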
+
+Output control:
+
+* ``-v, --verbose`` - Show verbose output
+* ``-D, --debug`` - Enable debug mode
+* ``--top <N>`` - (for ``stats`` command) Show top N files with most inactive
+  code (default: 20)
+
+The ``dirs`` command has a few extra options:
+
+* ``-s, --subdirs`` - Show a breakdown by subdirectory
+* ``-f, --show-files`` - Show individual files within directories (with ``-s``)
+* ``-e, --show-empty`` - Show directories with 0 lines used
+
+Other:
+
+* ``-j, --jobs <N>`` - Number of parallel jobs for line analysis
+
+How to use commands
+===================
+
+The following commands show the different ways to use codman. Commands are
+specified as positional arguments after the global options.
+
+Basic Statistics (``stats``)
+-----------------------------
+
+Show overall statistics for a qemu-x86 build::
+
+    $ codman -b qemu-x86 stats
+    ======================================================================
+    FILE-LEVEL STATISTICS
+    ======================================================================
+    Total source files:    14114
+    Used source files:      1046 (7.4%)
+    Unused source files:   13083 (92.7%)
+
+    Total lines of code:  3646331
+    Used lines of code:   192543 (5.3%)
+    Unused lines of code: 3453788 (94.7%)
+    ======================================================================
+
+    ======================================================================
+    LINE-LEVEL STATISTICS (within compiled files)
+    ======================================================================
+    Files analysed:              504
+    Total lines in used files:209915
+    Active lines:             192543 (91.7%)
+    Inactive lines:            17372 (8.3%)
+    ======================================================================
+
+    TOP 20 FILES WITH MOST INACTIVE CODE:
+    ----------------------------------------------------------------------
+      2621 inactive lines (56.6%) - drivers/mtd/spi/spi-nor-core.c
+        669 inactive lines (46.7%) - cmd/mem.c
+        594 inactive lines (45.8%) - cmd/nvedit.c
+        579 inactive lines (89.5%) - drivers/mtd/spi/spi-nor-ids.c
+        488 inactive lines (27.4%) - net/net.c
+    ...
+
+
+Directory Breakdown (``dirs``)
+------------------------------
+
+See which top-level directories contribute code::
+
+    codman dirs
+
+Output shows breakdown by directory::
+
+    BREAKDOWN BY TOP-LEVEL DIRECTORY
+    =================================================================================
+    Directory                                  Files    Used  %Used  %Code    kLOC   Used
+    ---------------------------------------------------------------------------------
+    arch                                         234     156     67     72    12.3   8.9
+    board                                        123      45     37     25     5.6   1.4
+    cmd                                           89      67     75     81     3.4   2.8
+    common                                       156     134     86     88     8.9   7.8
+    ...
+
+For detailed subdirectory breakdown::
+
+    codman dirs --subdirs
+
+With ``--show-files``, also shows individual files within each directory::
+
+    codman dirs --subdirs --show-files
+
+You can also specify a file filter::
+
+    $ codman -b qemu-x86 -f "*acpi*" dirs -sf
+    =======================================================================================
+    BREAKDOWN BY TOP-LEVEL DIRECTORY
+    =======================================================================================
+    Directory                                  Files    Used  %Used  %Code     kLOC    Used
+    ---------------------------------------------------------------------------------------
+    arch/x86/include/asm                           5       2     40     36      0.6     0.2
+    arch/x86/lib                                   5       1     20      6      1.2     0.1
+    acpi.c                                      65      65  100.0       0
+    cmd                                            1       1    100    100      0.2     0.2
+    acpi.c                                     216     215   99.5       1
+    drivers/qfw                                    1       1    100     93      0.3     0.3
+    qfw_acpi.c                                 332     309   93.1      23
+    include/acpi                                   5       4     80     91      3.3     3.0
+    include/dm                                     1       1    100    100      0.4     0.4
+    include/power                                  1       1    100    100      0.2     0.2
+    lib/acpi                                      13       3     23     14      3.9     0.5
+    acpi_writer.c                              131      63   48.1      68
+    acpi_extra.c                               181     177   97.8       4
+    acpi.c                                     304     304  100.0       0
+    lib/efi_loader                                 1       1    100    100      0.1     0.1
+    efi_acpi.c                                  75      75  100.0       0
+    ---------------------------------------------------------------------------------------
+    TOTAL                                         78      15     19      7     17.5     1.2
+    =======================================================================================
+
+
+Detail View (``detail``)
+------------------------
+
+See exactly which lines are active/inactive in a specific file::
+
+    $ codman -b qemu-x86 detail common/main.c
+    ======================================================================
+    DETAIL FOR: common/main.c
+    ======================================================================
+    Total lines:       115
+    Active lines:       93 (80.9%)
+    Inactive lines:     22 (19.1%)
+
+        1 | // SPDX-License-Identifier: GPL-2.0+
+        2 | /*
+        3 |  * (C) Copyright 2000
+        4 |  * Wolfgang Denk, DENX Software Engineering, wd@denx.de.
+        5 |  */
+    ...
+        23 |
+        24 | static void run_preboot_environment_command(void)
+        25 | {
+        26 | 	char *p;
+        27 |
+        28 | 	p = env_get("preboot");
+        29 | 	if (p != NULL) {
+        30 | 		int prev = 0;
+        31 |
+    -   32 | 		if (IS_ENABLED(CONFIG_AUTOBOOT_KEYED))
+    -   33 | 			prev = disable_ctrlc(1); /* disable Ctrl-C checking */
+        34 |
+        35 | 		run_command_list(p, -1, 0);
+        36 |
+    -   37 | 		if (IS_ENABLED(CONFIG_AUTOBOOT_KEYED))
+    -   38 | 			disable_ctrlc(prev);	/* restore Ctrl-C checking */
+        39 | 	}
+        40 | }
+        41 |
+
+
+Lines with a ``-`` marker are not included in the build.
+
+Unused Files (``unused``)
+-------------------------
+
+Find all source files that weren't compiled::
+
+    $ codman -b qemu-x86 unused  |head -15
+    Finding all source files......
+    Found 1043 used source files...
+    Loading configuration......
+    Loaded 8913 Kconfig symbols...
+    Loaded 8913 config symbols...
+    Analysing preprocessor conditionals......
+    Excluding 539 header files (use -i to include them)...
+    Running unifdef on 504 files......
+    Unused source files (13083):
+    arch/arc/cpu/arcv1/ivt.S
+    arch/arc/cpu/arcv2/ivt.S
+    arch/arc/include/asm/arc-bcr.h
+
+Used Files (``used``)
+---------------------
+
+List all source files that were included in a build::
+
+    $ codman -b qemu-x86 used  |head -15
+    Finding all source files......
+    Found 1046 used source files...
+    Loading configuration......
+    Loaded 8913 Kconfig symbols...
+    Loaded 8913 config symbols...
+    Analysing preprocessor conditionals......
+    Excluding 542 header files (use -i to include them)...
+    Running unifdef on 504 files......
+    Used source files (1046):
+    arch/x86/cpu/call32.S
+    arch/x86/cpu/cpu.c
+    ...
+
+
+Per-File Summary (``summary``)
+------------------------------
+
+Shows detailed per-file statistics (requires ``-w`` or ``-l``)::
+
+    $ codman -b qemu-x86 summary
+    ==========================================================================================
+    PER-FILE SUMMARY
+    ==========================================================================================
+    File                                                  Total   Active Inactive  %Active
+    ------------------------------------------------------------------------------------------
+    arch/x86/cpu/call32.S                                    61       61        0   100.0%
+    arch/x86/cpu/cpu.c                                      399      353       46    88.5%
+    arch/x86/cpu/cpu_x86.c                                   99       99        0   100.0%
+    arch/x86/cpu/i386/call64.S                               92       92        0   100.0%
+    arch/x86/cpu/i386/cpu.c                                 649      630       19    97.1%
+    arch/x86/cpu/i386/interrupt.c                           630      622        8    98.7%
+    arch/x86/cpu/i386/setjmp.S                               65       65        0   100.0%
+    arch/x86/cpu/intel_common/cpu.c                         325      325        0   100.0%
+    ...
+
+Copy Used Files (``copy-used``)
+-------------------------------
+
+Extract only the source files used in a build::
+
+  codman copy-used /tmp/sandbox-sources
+
+This creates a directory tree with only the compiled files, useful for creating
+minimal source distributions.
+
+Analysis Methods
+================
+
+The script supports several analysis methods with different trade-offs.
+
+Firstly, files are detected by looking for .cmd files in the build. This
+requires a build to be present. Given the complexity of the Makefile rules, it
+seems like a reasonable trade-off. These directories are excluded:
+
+* tools/
+* test/
+* scripts/
+* doc/
+
+unifdef
+-------
+
+For discovering used/unused code, the unifdef mechanism produces reasonable
+results. This simulates the C preprocessor using the ``unifdef`` tool to
+determine which lines are active based on CONFIG_* settings.
+
+**Note:** This requires a patched version of unifdef that supports U-Boot's
+``IS_ENABLED()`` and ``CONFIG_IS_ENABLED()`` macros, which are commonly used
+throughout the codebase. It also supports faster operation, reducing run time
+by about 100x on the U-Boot code base.
+
+The tool:
+
+1. Reads .config to extract all CONFIG_* symbol definitions
+2. Generates a unifdef configuration file with -D/-U directives
+3. Runs ``unifdef -k -E`` on each source file to process conditionals, with
+   ``-E`` enabling the IS_ENABLED() support
+4. Compares original vs. processed output using line-number information
+5. Marks lines removed by unifdef as inactive
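+
+The first two and last two steps can be sketched in a few lines of Python. This
+is an approximation, not codman's implementation: both function names are
+hypothetical, the flags are the standard unifdef ``-Dsym=val`` and ``-Usym``
+forms, and a ``difflib`` comparison stands in for the line-number information
+that codman uses.
+
```python
import difflib
import re

SET_RE = re.compile(r'^(CONFIG_\w+)=(.*)$')
UNSET_RE = re.compile(r'^# (CONFIG_\w+) is not set$')


def config_to_flags(config_text):
    """Convert the text of a .config file into unifdef -D/-U arguments."""
    flags = []
    for line in config_text.splitlines():
        line = line.strip()
        match = SET_RE.match(line)
        if match:
            name, value = match.groups()
            # Kconfig 'y' becomes the value 1, as in the generated autoconf.h
            if value == 'y':
                value = '1'
            flags.append(f'-D{name}={value}')
            continue
        match = UNSET_RE.match(line)
        if match:
            flags.append(f'-U{match.group(1)}')
    return flags


def inactive_lines(original, processed):
    """Return 1-based numbers of original lines that are absent from processed."""
    matcher = difflib.SequenceMatcher(None, original, processed, autojunk=False)
    inactive = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # Lines deleted or replaced did not survive preprocessing
        if tag in ('delete', 'replace'):
            inactive.extend(range(i1 + 1, i2 + 1))
    return inactive
```
+
+A plain diff can mis-attribute repeated lines (blank lines, lone braces), which
+is one reason a real tool would prefer explicit line-number information.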
+
+This method uses multiprocessing for parallel analysis of source files, so it
+runs faster if you have plenty of CPU cores (e.g. 3s on a 22-thread
+Intel Ultra 7).
+
+The preprocessor-level view is quite helpful. It is also possible to see .h
+files using the ``-i`` flag.
+
+Since unifdef does fairly simplistic parsing, it can be fooled and may show
+wrong results.
+
+
+DWARF (``-w/--dwarf``)
+----------------------
+
+The DWARF analyser uses debug information embedded in compiled object files to
+determine exactly which source lines generated machine code. This is arguably
+more accurate than unifdef, but it won't count comments, declarations and
+various other features that don't actually generate code.
+
+The DWARF analyser:
+
+1. Rebuilds with ``CC_OPTIMIZE_FOR_DEBUG`` to prevent aggressive inlining
+2. For each .o file, runs ``readelf --debug-dump=decodedline`` to get line info
+3. Parses the DWARF line number table to map source lines to code addresses
+4. Aggregates results across all object files
+5. Marks any source line that doesn't appear in the line table as inactive
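+
+Steps 2 and 3 might look roughly like the sketch below. It is not the DWARF
+analyser itself: the function names are hypothetical, and the parser assumes
+that each data row of the ``readelf`` output has the form
+``<file> <line> <address>``, a layout that can differ between binutils versions.
+
```python
import re
import subprocess
from collections import defaultdict

# A data row: file name, decimal line number, then an address (0 or 0x...)
ROW_RE = re.compile(r'^(\S+)\s+(\d+)\s+(0x[0-9a-fA-F]+|0)\b')


def dump_lines(obj_path):
    """Run readelf on one object file and return its decoded line table."""
    result = subprocess.run(
        ['readelf', '--debug-dump=decodedline', obj_path],
        capture_output=True, text=True, check=True)
    return result.stdout


def parse_decodedline(text):
    """Map each file name to the set of source lines that have a code address."""
    lines_by_file = defaultdict(set)
    for row in text.splitlines():
        match = ROW_RE.match(row)
        if match:
            lines_by_file[match.group(1)].add(int(match.group(2)))
    return dict(lines_by_file)
```
+
+Any line of a source file that is missing from the resulting set would then be
+treated as inactive.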
+
+As with unifdef, this uses multiprocessing for parallel analysis of object
+files. It achieves similar performance.
+
+
+See Also
+========
+
+* :doc:`../build/buildman` - Tool for building multiple boards
+* :doc:`qconfig`
+* :doc:`checkpatch` - Code-style checking tool