Recipe conventions and best practices
Conventions to follow, pitfalls to avoid
To help you create reliable and scalable recipes, this document will describe an assortment of conventions and best practices to keep in mind when making new recipes.
To get the most use out of this document, it would be beneficial for you to:
One of the most important conventions to keep in mind when creating recipes is the concept of "Do no harm". If your recipe cannot determine that a change is safe, it should make no changes rather than making a potentially wrong change.
For example, in the RenameLocalVariableToCamelCase recipe, we prevent the recipe from renaming variables to a name that could be a namespace conflict. Likewise, we opt out of renaming class fields since they might be in use by some downstream API. We favor making fewer changes over making wrong changes.
Relatedly, when making changes, always strive to make the most minimal, least invasive version of that change. For instance, in the NoGuavaImmutableSetOf recipe, we restrict the recipe from changing any code that does not use at least Java 9.
By reducing a recipe's scope and minimizing the number of changes a recipe makes, you'll make it easier for others to review the changes. This, in turn, will result in more pull requests being accepted and more trust in your recipes.
On the other hand, a change that unnecessarily clobbers formatting or is otherwise overly broad in scope will burden code reviewers and drastically reduce the rate at which they accept changes.
When authoring your tests, always remember to test that changes aren't made when they shouldn't be. For instance, if your recipe only affects a certain package, make sure you include a test that shows that other packages are unaffected.
By following these conventions, you'll ensure that:
Before you begin writing a new recipe, take the time to examine existing recipes. You may find that what you want to do can be done by combining existing recipes without writing any code. You can achieve this by creating a declarative YAML file.
For instance, let's say you wanted to create a recipe that combines the functionality of numerous other static analysis recipes. You could imperatively code a new recipe that calls one recipe after the next. However, that approach is strongly discouraged. Instead, you should declaratively list out each recipe you want to run in a YAML file. In doing so, you will not only save time, but you will also substantially reduce the number of potential errors.
If your recipe can be declarative, it should be.
For instance, let's say you were creating a recipe that adds the
publicvisibility modifier to a ClassDeclaration. If you did not call
super.visitClassDeclaration()in your overridden
visitClassDeclarationmethod, then your recipe would not visit or check for inner classes. It's possible this is what you want, but it's also possible that this is a bug.
By not calling
super.visitSomeLstElement(), you will often improve the performance of your recipe. However, in many cases, that could be a confusing bug. It's your responsibility to take the time to think through these options and decide one way or another.
If you do decide to include
super.visitSomeLstElement(), it is often most convenient to do so at the beginning of each
visitmethod you override.
Most recipes are not universally applicable to every source file. For instance, a recipe that makes changes to a method introduced in Java 17 would not be useful to run on Java 8 code. Likewise, a recipe that works with ArrayLists would not apply to a file that does not contain an ArrayList.
Instead of running your recipe on every file, you can have your recipe provide some context on when it should be run. By doing so, you'll not only make it so your recipe can be run on more files more quickly, but you'll also enhance the readability of your recipe. That, in turn, simplifies debugging and maintenance and leads to better recipes.
For instance, in the MigrateCollectionsSingletonSet recipe, we add a check that ensures the Java version is 9 and that the file contains a
singletonmethod. We can be confident that this recipe won't make changes if those preconditions do not apply.
Preconditions are most worthwhile when they can cheaply prevent a visitor from running unnecessarily.
Recipes must have no mutable state. Likewise,
Recipe.getVisitor()must always return a brand-new visitor instance. This is because, during recipe execution, the same recipe may be invoked multiple times, possibly on sources it has seen before. Any mutable state could lead to strange and confusing bugs. Following this rule is essential for OpenRewrite to operate reliably and correctly.
During test execution, recipe execution is single-threaded for simplicity. However, outside of tests, recipe execution is parallelized.
OpenRewrite detects that a visitor/recipe has made a change based on referential equality. Mutating fields on an LST breaks this detection, and can potentially cause incorrect diffs to be created.
Since referential equality is so important for OpenRewrite to detect changes, typical collection creation and manipulation patterns like newing up your own collection or using Java streams are usually a mistake. Instead, we provide the
ListUtilsclass which provides referential-equality conscious data access and manipulation faculties that avoid unnecessary memory allocations.
If we used a typical Java stream, regardless of whether or not there were any changes, we'd end up with a newly allocated list. This would lead to OpenRewrite incorrectly saying that there were changes even if our recipe did not intend to make any.
Throughout our LSTs and other APIs, we have been careful to use Java's nullability annotations to indicate whether a field is nullable or not. If a field is nullable then there will be LSTs where that field is null. To operate correctly on the wildly diverse code which exists throughout the world, Recipe/Visitors must not make changes if a field they need is null.
Even very simple pieces of code have complex LST representations which are tedious and error-prone to construct by hand. Never attempt to build up LSTs by hand. Instead, you should use faculties like JavaTemplate to turn code snippets into LST elements. For data formats like XML or JSON, it is usually more convenient to use the format's parser to turn a snippet of text into usable LST elements.
OpenRewrite executes recipes in a loop. Each iteration of that loop through the full recipe list is called a cycle. This is so that if one recipe makes a change and another recipe would do something based on that change, it will have a chance to do so. This happens regardless of the order the recipes are executed in.
By default, only a single cycle is executed unless some recipe in the group overrides
true. For larger recipes, such as a framework migration, the performance impact of causing another cycle can be substantially detrimental. Whenever possible, a recipe should complete all of its work the first time (which avoids the need to override
When passing state between functions in the same visitor, there are two main options: cursor messaging and execution context messaging.
Cursor messaging is the preferred method. With it, you can utilize
TreeVisitor.getCursor()to access a map that arbitrary data can be read from or written to. For instance, in the AddDependency recipe, we pass data from the overridden visitTag method to the overridden visitDocument method in our
Cursorobject is a stack that keeps track of a visitor's current progress through an LST. Once everything has been visited in a particular cycle, all of this information is then thrown away. There is no risk of this state leaking into places that you don't want.
ExecutionContextobject, on the other hand, has a few key differences that make it less suitable for most situations. All recipes in a run will have the same
ExecutionContextobject. This object contains a map, similar to the
Cursor, that recipes can read from or write arbitrary data to. Unlike the
Cursormap, though, this
ExecutionContextmap sticks around between recipe run cycles. This poses some considerable danger.
Imagine one recipe author decided to write a
fooobject to the
ExecutionContextso that the
bashfunctions could both interact with it. Now, let's imagine another recipe author, unaware of the fact that the other recipe wrote a
fooobject to the
ExecutionContext, also decides that their recipe should read and write to a
fooobject. If those recipes get chained together, both of those recipes could break.
Because of that, whenever shared state is needed within a single visitor or recipe, the cursor stack should be used instead of adding state to the execution context.