# Automated Code Generation

```{caution}
The **sourcegen** utility is an experimental part of Cantera and may be changed without
notice.
```

Cantera's **sourcegen** utility is a Python-based source generator for creating Cantera
interfaces for other languages. The following output options are supported:

- `clib`: [](clib-extensions); used to generate the [CLib API](../clib/index).
- `csharp`: [](dotnet-extensions); used to implement the [.NET API](../dotnet/index).
- `yaml`: [](yaml-extensions); simple illustration that summarizes the CLib interface
  in YAML format.

## Usage

The sourcegen utility is a command-line tool for code generation. It can be invoked
without installation as

```bash
python -m interfaces.sourcegen.src.sourcegen
```

from the root folder of the Cantera source code. For frequent usage, it is recommended
to install sourcegen into the same virtual environment as used by SCons (for example
the Conda environment used to compile Cantera from source) via:

```bash
python -m pip install -e interfaces/sourcegen
```

where the `-e` (or `--editable`) option ensures that changes to the sourcegen utility
take effect without a need to re-install. Running:

```shell
% sourcegen --help
```

displays the following help text:

```shell
usage: sourcegen [-h] [-v] [--api {clib,csharp,yaml}] [--output OUTPUT] [--root ROOT]

Source generator for creating Cantera interface code.

options:
  -h, --help            show this help message and exit
  -v, --verbose         show additional logging output
  --api {clib,csharp,yaml}
                        language of generated Cantera API code
  --output OUTPUT       specifies the OUTPUT folder name
  --root ROOT           specifies the Cantera source ROOT folder (default is '.')
```

## Overview

The sourcegen utility parses the XML tree generated by
[Doxygen](https://www.doxygen.org), using YAML configuration files that provide
instructions for constructing CLib interface functions from underlying C++ functions
and methods. For more information, see [](sourcegen-config).
The utility is used to generate the [CLib API](clib-extensions) itself, as well as
language interfaces built on top of it, such as the .NET interface.

(sec-sourcegen-details)=
## Details

Automatic code generation involves initialization steps to resolve CLib interface
information using [](sourcegen-config). A subsequent scaffolding step delegates the
source generation to a language-specific sub-package.

1. **Parse Header File Specifications:**

   The command-line utility relies on a `HeaderFileParser` object that parses
   [](sec-sourcegen-specifications) and generates intermediate `HeaderFile` objects
   that represent individual CLib modules. The `HeaderFile` dataclass contains the
   following information:

   - `path`: Output folder.
   - `funcs`: List of functions to be scaffolded (initially empty).
   - `prefix`: Prefix used for CLib function names.
   - `base`: Base class of C++ methods (if applicable).
   - `parents`: List of C++ parent class(es).
   - `derived`: Dictionary of derived C++ class(es) and alternative prefixes.
   - `recipes`: List of header recipes read from YAML.
   - `docstring`: Lines representing docstring of YAML file.

   Each YAML specification file results in exactly one `HeaderFile` object.

1. **Resolve Recipes:**

   As a minimum, a [YAML Recipe](sec-sourcegen-recipes) specifies a `name` that
   corresponds either to a function within the `Cantera` namespace or to a method or
   variable of the implemented base class. The `CLibSourceGenerator.resolve_tags`
   method is used to cross-reference individual recipes with known Doxygen tags. The
   information is used to detect the [CLib Function Type](sec-sourcegen-function-types)
   of a recipe and to generate a corresponding `CFunc` object that holds relevant CLib
   interface information used for subsequent scaffolding:

   - `ret_type`: Return type of CLib function.
   - `name`: Name of CLib function.
   - `arglist`: CLib function argument list.
   - `brief`: Brief description.
   - `wraps`: Implemented C++ function/method (if applicable).
   - `returns`: Description of returned value.
   - `base`: Qualified scope of function/method (if applicable).
   - `uses`: List of auxiliary C++ methods (if applicable).

   The information is used to update `recipes` list entries and build the `funcs` list
   for each `HeaderFile` object.

1. **Language-Specific Source Generation:**

   Each language-specific sub-package is required to export a class that derives from
   `SourceGenerator` and implements a `generate_source` method that takes a list of
   `HeaderFile` objects with their resolved recipes as an argument. The
   `generate_source` method uses this information to generate syntactically correct
   source code in the destination language.

   Each sub-package can contain a YAML-based config file named `config.yaml`. The core
   script recognizes two special keys:

   - `ignore_files`: a list of YAML specification file names\
     These files will be excluded entirely from source generation, for example because
     they cannot be parsed directly or contain functionality that is not planned for
     implementation in the destination language.
   - `ignore_funcs`: a mapping of YAML specification file names to lists of recipe
     names\
     The listed recipes contained within those files will not be scaffolded, for
     example because they cannot be translated automatically and need to be written by
     hand in the destination language.

   The config file may contain additional values for use by the language-specific
   sub-package.

Further processing of generated code depends on the build process of the destination
language.

```{tip}
The [YAML Source Generator](yaml-extensions) serves as an example to illustrate code
generation based on `HeaderFile` contents.
```
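
The data flow described in the steps above can be illustrated with a minimal,
self-contained Python sketch. It is not the sourcegen implementation: the `HeaderFile`
and `CFunc` stand-ins only mirror the fields listed above, while the
`DeclarationGenerator` class, its `output` attribute, the `demo.yaml` and
`skipped.yaml` file names, and all function names are hypothetical. The sketch also
assumes that entries in `ignore_funcs` can be compared directly against `CFunc.name`,
which may differ from how the actual utility matches recipe names.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class CFunc:
    """Stand-in for a resolved CLib function (fields as listed above)."""
    ret_type: str
    name: str
    arglist: list
    brief: str = ""
    wraps: Optional[str] = None
    returns: str = ""
    base: str = ""
    uses: list = field(default_factory=list)


@dataclass
class HeaderFile:
    """Stand-in for one parsed YAML specification file."""
    path: Path
    funcs: list = field(default_factory=list)
    prefix: str = ""
    base: str = ""
    parents: list = field(default_factory=list)
    derived: dict = field(default_factory=dict)
    recipes: list = field(default_factory=list)
    docstring: list = field(default_factory=list)


class SourceGenerator:
    """Stand-in base class; sub-packages implement generate_source."""
    def generate_source(self, headers_files):
        raise NotImplementedError


class DeclarationGenerator(SourceGenerator):
    """Hypothetical generator that emits one C-style declaration per function."""
    def __init__(self, config):
        # The two special keys described above; both are optional.
        self.ignore_files = set(config.get("ignore_files", []))
        self.ignore_funcs = config.get("ignore_funcs", {})
        self.output = {}  # hypothetical: file name -> generated text

    def generate_source(self, headers_files):
        for header in headers_files:
            file_name = header.path.name
            if file_name in self.ignore_files:
                continue  # whole specification file is skipped
            skipped = set(self.ignore_funcs.get(file_name, []))
            lines = [
                f"{func.ret_type} {func.name}({', '.join(func.arglist)});"
                for func in header.funcs
                if func.name not in skipped  # assumption: names match directly
            ]
            self.output[file_name] = "\n".join(lines)


# Hypothetical inputs standing in for parsed and resolved specifications.
config = {
    "ignore_files": ["skipped.yaml"],
    "ignore_funcs": {"demo.yaml": ["demo_report"]},
}
headers = [
    HeaderFile(Path("demo.yaml"), prefix="demo", funcs=[
        CFunc("int", "demo_count", ["int handle"]),
        CFunc("int", "demo_report", ["int handle", "int bufLen", "char* buf"]),
    ]),
    HeaderFile(Path("skipped.yaml"), prefix="skip",
               funcs=[CFunc("int", "skip_it", [])]),
]

generator = DeclarationGenerator(config)
generator.generate_source(headers)
print(generator.output["demo.yaml"])  # prints: int demo_count(int handle);
```

In this sketch the generator keeps its results in memory so that the filtering by
`ignore_files` and `ignore_funcs` is easy to inspect; a real sub-package would
typically write files to the output folder instead.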