Writing files at compile time
Version | 2 |
Created | 2015-08-12 |
Status | Draft |
Last modified | -- |
Author | Jakob Demler |
Introduction
Generating code at compile time using compile time reflection is one of the most distinguishable features of the D programming language. It is used for serialization, unit testing libraries, template engines and many more. Nevertheless there are some reoccuring problems with the development process and usage of mixins:
- debugging
- The debugger cannot attach to mixedin code; one has to print the resulting code with pragma(msg, ) to inspect it
- compile-speed
- Even if the resulting code does not change, mixedin code has to be compiled anyway
- scalability
- If the generated code is needed in two seperated places of the code, it has to be mixedin twice or has to be mixedin into a special module which introduces overhead
- transparency
- As user of a library which relys on compile time code generation one has often no idea what code is generated
To deal with these problems this proposal will introduce the idea of writing the generated code into D sourcefiles which entails some solutions.
Syntax
Considering that the writing to files at compile time and at run time are two very different processes, a syntax must be found that makes them easily distinguishable. Also as reading files at compile time is already implemented using the import-keyword the following syntax is proposed:
export(filename, content);
Where filename is follows C standard library’s fopen semantics and content is the string written to the file. Both filename and content have to be known at compile time.
Even though the export keyword collides with the access specifier, it seems to be the best solution as it is intuitive, matches the import syntax for reading files at compile time, is easily distinguishable from run time file writing and does not break current code (as export is already a keyword).
Use Cases
vibe.d’s diet templates
How does this feature now solve the problems mentioned above? Consider for example vibe.d’s diet templates:
User provided template files are compiled with usage of compile time reflection and provided types to sourcecode. At the moment the resulting code is mixed in into a templated function which is called by the user:
https://github.com/rejectedsoftware/vibe.d/blob/master/source/vibe/templ/diet.d lines: 54ff
void compileDietFileIndent(string template_file, size_t indent, ALIASES...)(OutputStream stream__)
{
[...]
// Generate the D source code for the diet template
//pragma(msg, dietParser!template_file(indent));
static if (is(typeof(diet_translate__)))
mixin(dietParser!(template_file, diet_translate__)(indent));
else
mixin(dietParser!template_file(indent));
}
Using the write-to-file at compile time feature the code would change to something like the following:
void compileDietFileIndent(string template_file, size_t indent, ALIASES...)(OutputStream stream__)
{
[...]
// Generate the D source code for the diet template
static if (is(typeof(diet_translate__)))
enum contents = dietParser!(template_file, diet_translate__)(indent);
else
enum contents = dietParser!template_file(indent);
export(template_file ~ ".d", contents);
mixin("import vibe.templ.diet.compiled.templatename");
renderCompiledTemplate!(Aliases)(stream);
}
This example looks more verbose but it entails serveral improvements: The .dt.d file can not only be inspected (without commenting in the pragma(msg, ) line) but also debugged using a debugger. It does not have to be compiled every time the compileDietFileIdent function is compiled but only if the resulting code changes (either the .dt file or the used types change). A vibe.d user can inspect and understand what is going on under the hood easily. Resulting in less magic, surprise and in case of bugs inside the template engine or the template, in faster location of the bug.
SQL Library
Imagine a sql library that encapsulates the sql itself from the user:
CreateTableIfNotExists!UserDefinedType(DBConnection);
SQL would be generated using compile time reflection and TMP. The user of this library would be able to change the resulting sql with attributes.
The user does not need to inspect the resulting sql (as is the purpose of this imaginary library) but, nevertheless for debugging and understanding it would be helpfull if he could.
The library could generate .sql.d files which include the generated sql as constant strings. Implementing CreateTableIfNotExists could look like this:
void CreateTableIfNotExists(UDT)(DBConnection con)
{
mixin("import "~moduleName!UDT~".sql;");
con.exec(CreateTableIfNotExistsSql);
}
The code could be created in a central register function:
void RegisterUDT(UDT)()
{
export("sql/"~moduleName!UDT~".sql.d", GenerateSql!UDT);
}
Thinking further
Altough the primary use case seems to be code generation, all computations based on the type system that result in string would benefit from this feature: Think of libraries that generate configuration files, css or any other dsl from user defined types with the help of attributes, compile time reflection and TMP. Though this could also be achieved with the features at hand (generating the string at compile time and writing it to a file at run time) an ‘export’ functionality would make it much more convinient to do so.
As we only start to recognize the capabilities of TMP in combination with compile time reflection, it is important not to underestimate the implications of the extensions of its features.
Alternatives
Two step compilation
As mentioned in the original forum thread, it is already possible to do this using a workaround: If a that is generated exists it is imported otherwise it is created. However, this results in two distinct run times and there is no way to conveniently regenerate the sourcefile if the input has changed.
Expanding Mixins and Templates
Another idea, mentioned in the forum discussion, is to expand template instantiations and mixins during the compilation process and writing the expanded versions to .di files. Expanding template instantiations would be handled with template constraints. So for every instantiation a new template would be generated with the exact fitting constraint for it. These expanded source files would be debuggable. Nevertheless this idea has some flaws:
- The resulting interface files will become horribly huge. Imagine templated functions that are instantiated a thousand times or more.
- There would be no way to easily tell the compiler which mixins and templates to expand and which not.
- In case of users using a library the interesting part of the gernerated code would not be in a user source file but in some source file inside the library. To debug and/or inspect it the user needs to know in which source file the relevant code is generated.
All in all this idea seems to be able to serve the needs for small mixins and projects. But in combination with TMP and even a medium sized project this approach becomes impractical.
Implementation Considerations
The discussion of the first version of this DIP revealed some fundamental flaws in its original idea:
- A file system must be optional in the compilation process
- How can the compiler ensure that an imported file is already generated, especially in case of parallel compilation
These problems are solved by letting the export-function not write to the file system directly, but into a compiler-internal key-value storage. Imports would then look into this storage and block in case of a not yet generated file. If the compiler is blocked by such an import it would queue the current source file and move on with the next. The compiler would report an error in two cases:
- If a key is written more than once
- If an imported key is never written to
At the end of the compilation process the compiler can write those key-value pairs as files to the file system. This is optional and controlled by a compiler switch.
Problems and limitations
Even though the use cases appear intriguing, the create file at compile time feature does not seem to solve all problems:
- One is not able to manually edit the generated source between the generation and import process, as both happen inside one compiler call.
- It is not trivially possible to only generate a source file if it is not up-to-date anymore. Unnecesarry computations are not avoided.
Copyright
This document has been placed in the Public Domain.