srcmlcpp: C++ code parsing#
litgen provides three separate Python packages; srcmlcpp is one of them:
codemanip: a Python package to perform textual manipulations on C++ and Python code. See code_utils.py.
srcmlcpp: a Python package that builds on top of srcML in order to interpret the XML tree produced by srcML as a tree of Python objects resembling a C++ AST.
litgen: a Python package that generates Python bindings from C++ code.
srcmlcpp will transform C++ source into a tree of Python objects (descendants of CppElement) that reflect the C++ AST.
This tree is used by litgen to generate the python bindings. It may also be used to perform automatic C++ code transformations.
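To make the idea of "interpreting an XML tree as a tree of Python objects" concrete, here is a minimal sketch that uses only the standard library. It is not srcmlcpp's implementation: the XML snippet is hand-written and only imitates the general shape of srcML output, and `ToyElement`, `xml_to_tree` and `describe` are hypothetical names introduced for this illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

# Hand-written XML that only imitates the general shape of srcML output.
# Real srcML output uses XML namespaces and many more tags.
TOY_XML = """
<unit>
  <struct>
    <name>Foo</name>
    <block>
      <function_decl><type>int</type><name>answer</name></function_decl>
    </block>
  </struct>
</unit>
"""


@dataclass
class ToyElement:
    """Hypothetical AST node; not srcmlcpp's CppElement."""
    tag: str
    text: str = ""
    children: List["ToyElement"] = field(default_factory=list)


def xml_to_tree(node: ET.Element) -> ToyElement:
    """Recursively wrap an XML node and its children into ToyElement objects."""
    return ToyElement(
        tag=node.tag,
        text=(node.text or "").strip(),
        children=[xml_to_tree(child) for child in node],
    )


def describe(element: ToyElement, depth: int = 0) -> List[str]:
    """Return one indented line per node, depth first."""
    label = f"{element.tag}: {element.text}" if element.text else element.tag
    lines = ["  " * depth + label]
    for child in element.children:
        lines.extend(describe(child, depth + 1))
    return lines


root = xml_to_tree(ET.fromstring(TOY_XML.strip()))
print("\n".join(describe(root)))
```

The sketch keeps one Python object per XML node. A library such as srcmlcpp presumably goes further and maps specific tags to dedicated classes (the page mentions CppUnit and CppStruct), but that mapping is not shown here.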
Transform C++ code into a CppElement tree#
Given the C++ code below:
code = """
// A Demo struct
struct Foo
{
const int answer(int *v=nullptr); // Returns the answer
};
"""
srcmlcpp can produce a tree of CppElement with this call:
import srcmlcpp
options = srcmlcpp.SrcmlcppOptions()
cpp_unit = srcmlcpp.code_to_cpp_unit(options, code)
cpp_unit is then a tree of Python objects (descendants of CppElement) that represents the source code.
Here is what it looks like under a debugger:

Transform a CppElement tree into C++ code#
Transformation to source code from a tree of CppElement#
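CppElement provides a method str_code() that outputs the C++ code it contains. The result is close to the original source code (comments included), but can differ slightly: in the example output below, the default access specifier is made explicit and the spacing around `*` and `=` is normalized.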
Note
Any modification applied to the AST by modifying the CppElement objects (CppUnit, CppStruct, etc.) will be visible using this method.
from litgen.demo import litgen_demo
litgen_demo.show_cpp_code(cpp_unit.str_code())
// A Demo struct
struct Foo
{
public: // <default_access_type/>
const int answer(int * v = nullptr); // Returns the answer
};
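To see exactly where the regenerated code departs from the original, the two texts can be compared with the standard `difflib` module. The sketch below does not call srcmlcpp: both strings are copied from this page (so their indentation is as shown here), and the variable names are chosen for this illustration.

```python
import difflib

# The original C++ source, as given at the top of this page.
original = """// A Demo struct
struct Foo
{
const int answer(int *v=nullptr); // Returns the answer
};
"""

# The str_code() output, as shown just above.
regenerated = """// A Demo struct
struct Foo
{
public: // <default_access_type/>
const int answer(int * v = nullptr); // Returns the answer
};
"""

diff_lines = list(
    difflib.unified_diff(
        original.splitlines(),
        regenerated.splitlines(),
        fromfile="original",
        tofile="str_code",
        lineterm="",
        n=0,  # no context lines: keep only what changed
    )
)

# Keep the added/removed lines, skipping the "---" / "+++" file headers.
changed = [
    line
    for line in diff_lines
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed))
```

With a real `cpp_unit`, the same comparison could presumably be run on `code` and `cpp_unit.str_code()` directly.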
“Verbatim” transformation from tree to code#
You can obtain the verbatim source code (i.e. the exact same source code that generated the tree), with a call to str_code_verbatim().
Note
This will call the srcML executable using the srcML XML tree stored inside cpp_unit.srcml_xml, which guarantees that the same source code is returned.
Any modification applied to the AST by modifying the CppElement Python objects (CppUnit, CppStruct, etc.) will not be visible using this method.
print(cpp_unit.str_code_verbatim())
// A Demo struct
struct Foo
{
const int answer(int *v=nullptr); // Returns the answer
};
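The difference between the two methods can be pictured with a toy model: one method regenerates code from the current state of the object, the other returns what was stored at parse time. This is only a sketch of the principle. `ToyStruct` is a hypothetical class, its `original_text` field merely stands in for the stored srcML XML, and the real classes are certainly more involved.

```python
from dataclasses import dataclass


@dataclass
class ToyStruct:
    """Hypothetical node keeping both a parsed name and the original text."""
    name: str
    original_text: str  # stands in for the XML stored at parse time

    def str_code(self) -> str:
        # Regenerated from the current state of the object.
        return f"struct {self.name} {{}};"

    def str_code_verbatim(self) -> str:
        # Returned unchanged from what was stored at parse time.
        return self.original_text


node = ToyStruct(name="Foo", original_text="struct Foo{};")
node.name = "Bar"  # modify the tree after parsing

print(node.str_code())           # reflects the rename
print(node.str_code_verbatim())  # still the original text
```

This is why a code transformation tool would rely on the regenerating method, while the verbatim method is useful to check what was originally parsed.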
CppElement types#
When C++ code is parsed, it is represented by many Python objects that represent different C++ tokens.
See the diagram below for more information:

litgen and srcmlcpp#
For information, when litgen transforms C++ code into Python bindings, it transforms the CppElement tree into a tree of AdaptedElement.
See diagram below:
