Specifying behavior
September 27, 2011
In the paper “Six Learning Barriers in End-User Programming Systems”, Andrew J. Ko and his colleagues show that programmers make numerous assumptions when working with unfamiliar APIs, over three-quarters of them about API behavior. While programmers can directly examine type definitions and method signatures, they need to infer behavior from method and parameter names. It is not entirely surprising that many such assumptions turn out to be incorrect. Ko’s paper documents a total of 130 cases when programmers failed to complete the assigned task. In 36 of those cases, the programmers did not succeed in making the API call at all. In a further 38 cases, they were unable to understand why the call behaved differently than expected and what to do to correct it. In another 25 cases, they were unable to successfully combine two or more method calls to solve the problem.
Why self-documenting APIs are rare
Under-specified behavior causes serious usability issues in numerous APIs. Many developers honestly believe in self-documenting APIs, but as we will show, fully self-documenting APIs are an ideal towards which we should aim, rather than a result we can realistically expect to achieve. Despite our very best efforts, subtle and unintuitive behavior is present in most APIs.
Even in the seemingly clear-cut cases, figuring out the precise behavior without additional help can be unexpectedly daunting. Take the TeamsIdentifier class shown below as an example:

    // Uniquely identifies an entity.
    class TeamsIdentifier {
        // Constructs an identifier from a string.
        TeamsIdentifier(String id) {...}

        // Returns the id as a String.
        java.lang.String asString() {...}

        // Convenience method to return this id as an array.
        TeamsIdentifier[] asTeamsIdArray() {...}

        // Returns a copy of the object.
        java.lang.Object clone() {...}

        // Checks if two ids are equal.
        boolean equalsId(TeamsIdentifier id) {...}

        // Intended for hibernate use only.
        java.lang.String getTeamsId() {...}

        boolean equals(java.lang.Object o) {...}

        int hashCode() {...}

        void setTeamsId(java.lang.String id) {...}

        // Returns a string representation of the id.
        java.lang.String toString() {...}
    }
It looks straightforward enough, you say. Let’s see if you can answer, in total confidence, the following questions:
Expression | True or False? |
---|---|
TeamsIdentifier id1 = new TeamsIdentifier("name") | ? |
id1.equalsId(id2) | ? |
id1.toString().equals("name") | ? |
id1.getTeamsId().equals("name") | ? |
TeamsIdentifier id = new TeamsIdentifier("a.b.c") | ? |
TeamsIdentifier id = new TeamsIdentifier("a:b:c") | ? |
Knowing that AssetIdentifier and UserIdentifier both extend TeamsIdentifier, can you answer, again in total confidence, the questions below?
Expression | True or False? |
---|---|
AssetIdentifier assetId = new AssetIdentifier("Donald") | ? |
assetId.equalsId(userId) | ? |
assetId.toString().equals(userId.getTeamsId()) | ? |
Of course, we can make sensible assumptions about what the correct behavior should be, but we have to honestly admit that we don’t really know. To find out, we need to either test the API or look at the implementation. Looking at the implementation is rarely a practical option. Learning by trial and error is time-consuming, and it doesn’t tell us which observed behavior is by design as opposed to merely accidental. For example, if we get the same AssetIdentifier object back every time, we might incorrectly assume that we can write id1 == id2 instead of id1.equals(id2). Our program then works correctly only until the next version of the API comes out.
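The difference between designed and accidental behavior can be sketched with a hypothetical identifier class. The names CachedId, of and uncached are assumptions made up for this illustration; they are not part of the API discussed above. The sketch happens to cache instances, so reference comparison appears to work, yet only value equality is something the class could reasonably promise:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical identifier class, used only to illustrate the point.
final class CachedId {
    private static final Map<String, CachedId> CACHE = new HashMap<>();
    private final String value;

    private CachedId(String value) {
        this.value = value;
    }

    // Implementation detail: this factory happens to reuse instances.
    static CachedId of(String value) {
        return CACHE.computeIfAbsent(value, CachedId::new);
    }

    // Stands in for a later version that no longer caches.
    static CachedId uncached(String value) {
        return new CachedId(value);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CachedId && ((CachedId) o).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    public static void main(String[] args) {
        CachedId a = CachedId.of("Donald");
        CachedId b = CachedId.of("Donald");
        System.out.println(a == b);       // holds only because of the cache
        System.out.println(a.equals(b));  // holds by design

        CachedId c = CachedId.uncached("Donald");
        System.out.println(a == c);       // no longer holds without the cache
        System.out.println(a.equals(c));  // still holds
    }
}
```

A caller who relied on == would break as soon as the factory stopped caching, while a caller who relied on equals() would not notice the change.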
We provide a huge service to our users when we remove guesswork from API usage by properly documenting behavior.
Using code for specifying behavior
Code is more concise and precise than words. It is difficult to think of a good reason not to use code for specifying API behavior. We are documenting for developers, who should welcome code and have no problem understanding it. The above tables document the behavior of TeamsIdentifier and its derived classes when we enter the appropriate True or False values into the second column. You probably noticed that the code in the first column is similar to what we would write for unit tests. In the case of APIs, unit tests are twice as useful because they also document the expected behavior. Some developers call these code snippets assertions, while those familiar with the work of Professor Bertrand Meyer call this particular method of specifying behavior Design by Contract. Starting with version 4.0, the .NET Framework natively supports Design by Contract, while third-party tools exist for many other programming languages.
No matter what we call it or what tool we use, we should precisely specify API behavior using code.
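As a minimal sketch of what such a specification could look like, the example below documents a hypothetical SimpleId class with executable checks. SimpleId, SimpleIdSpec and the check helper are invented for this illustration, and the behavior they assert (case-sensitive comparison, toString returning the raw id) is an assumption, not the actual behavior of TeamsIdentifier:

```java
// Executable specification: each check documents one promised behavior.
class SimpleIdSpec {
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Specification violated: " + description);
        }
    }

    public static void main(String[] args) {
        SimpleId id1 = new SimpleId("name");
        SimpleId id2 = new SimpleId("name");
        SimpleId id3 = new SimpleId("Name");

        check(id1.equals(id2), "ids built from equal strings are equal");
        check(!id1.equals(id3), "comparison is case-sensitive");
        check(id1.hashCode() == id2.hashCode(), "equal ids share a hash code");
        check(id1.toString().equals("name"), "toString returns the raw id");
        check(id1.asString().equals("name"), "asString returns the raw id");
        System.out.println("All specified behavior holds");
    }
}

// Hypothetical stand-in for an identifier class; its behavior is assumed
// for illustration only.
final class SimpleId {
    private final String id;

    SimpleId(String id) {
        if (id == null) {
            throw new IllegalArgumentException("id must not be null");
        }
        this.id = id;
    }

    String asString() {
        return id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SimpleId && ((SimpleId) o).id.equals(id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }

    @Override
    public String toString() {
        return id;
    }
}
```

Each description string plays the role of the documentation sentence, and the condition next to it keeps that sentence honest whenever the checks are run.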
Indicating stateless, accessor and mutator methods
The existence of observable internal state is a primary cause of unintuitive behavior, since it allows a method call to modify the result of the next (seemingly unrelated) call. Consider the stateful algorithm that controls access rights in a multi-user system. Is it possible to discover, from studying the API alone, how moving a document into a different folder affects its access rights? Isn’t it true that this depends not only on the security settings assigned to the document itself and those of the destination folder, but also on the security settings of its parent folder and recursively up to the root folder? Doesn’t it also depend on the user’s assigned roles, group memberships and perhaps on the security model currently in use? All these settings may be accessible via the API, but they alone won’t tell us how the access control algorithm actually works.
Realizing that state prevents us from designing self-documenting APIs, we could be tempted to stick to stateless APIs. While this isn’t always possible, it is an excellent idea to isolate the impact of internal state to the smallest possible part of APIs. We should have as many stateless methods as possible, since their behavior only depends on the parameter values. In object-oriented environments we should also favor immutable objects, which have state that cannot be changed once the objects are created. Fixed state is obviously less predictable than no state, but more predictable than evolving state.
Where we cannot avoid modifiable state, we should group the affected methods into two distinct categories: accessors, which can only read the state, and mutators, which can also change it. Accessors are like gauges on a control panel, and mutators are like switches and buttons. The accessors produce the same result when called a second or third time in a row, while mutators may produce a different result every time. Inserting a call to an accessor into the middle of an existing program is safe, while inserting a mutator may change the behavior of the subsequent API calls, breaking the program’s logic.
We must explicitly tell callers if a method is stateless, an accessor, or a mutator to help them use it correctly. We cannot rely on them guessing correctly or on naming conventions alone. We won’t be able to start all accessor names with “get” or “is”: show() or print() are accessors, as are many other, less obviously named methods. Because mutators are the most challenging, it is a good idea to keep their number to an absolute minimum and pay careful attention to their design.
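One simple way to do this, sketched below, is to state the category in each method’s documentation comment. The Thermostat class and its methods are hypothetical and exist only to show one method of each kind:

```java
// Hypothetical class; the category of each method is stated in its comment.
final class Thermostat {
    private double targetCelsius = 20.0;

    /** Stateless: the result depends only on the argument. */
    static double toFahrenheit(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    /** Accessor: reads the state and never changes it. */
    double target() {
        return targetCelsius;
    }

    /** Mutator: changes the state and therefore the result of later calls. */
    void raiseTarget(double deltaCelsius) {
        targetCelsius += deltaCelsius;
    }

    public static void main(String[] args) {
        Thermostat t = new Thermostat();

        // Repeated accessor calls agree with each other.
        System.out.println(t.target() == t.target());

        // A mutator call in between changes what the accessor reports.
        double before = t.target();
        t.raiseTarget(1.5);
        System.out.println(t.target() - before);

        // The stateless method can be called at any point without side effects.
        System.out.println(Thermostat.toFahrenheit(100.0));
    }
}
```

A project could go further and mark the categories with its own annotations so that tools can check them, but even a one-word label in the comment removes the guesswork for the caller.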
Using strong invariants
Not all mutators are equally problematic. The stronger the invariant, the more predictable and intuitive the behavior becomes. The invariant is a set of statements (assertions) about behavior that always hold true, regardless of state. It is essentially guaranteed, predictable behavior. We will illustrate this with an API that helps us cover a geometrical shape with a triangular mesh, as shown in the figure below:
Depending on our design, some or all of the following statements may be true after each API call:
- The whole geometric area is fully covered with the mesh
- All triangles in the mesh are well formed (the triangle area is not zero, no two nodes overlap each other, the three nodes don’t lie on the same straight line, etc.)
- There are no unconnected nodes
- No two triangles overlap each other
- Every node lies either inside or on the boundary of the geometric shape
- Every edge lies either inside or on the boundary of the geometric shape
The simplest API we can imagine, one that requires us to insert and connect nodes directly, cannot guarantee any of this and would be rather difficult to use (remember, you cannot see the mesh when programming with an API!). We intuitively know that an API that could guarantee all of the above invariants would be much easier to use, but is such an API feasible? While they are not easy to figure out, such mutators exist, and they are known as the Delaunay mesh refinement operators. Here are four of them:
- Triangle split: splits a triangle into three smaller ones by adding a new node in the middle
- Edge split: replaces two adjacent triangles with four smaller ones by splitting the common edge into two halves
- Edge flip: changes the shape of two adjacent triangles by flipping the shared edge to the other diagonal of the quadrilateral formed by the two triangles
- Node nudge: changes the shape of the connected triangles by repositioning a node inside the polygon defined by the neighboring nodes
Notice how simple it is to describe what each method does? To see the big difference this design makes, try to describe how to correctly refine a mesh by inserting and (re)connecting nodes, and then do it again using the Delaunay operators. Which is easier?
Great APIs have strong invariants, but as we just saw, this doesn’t happen by itself; it requires careful design.
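To make the idea concrete, here is a minimal sketch of a mesh whose only mutator is the triangle split described above. The TriangleMesh class, its array-based triangle representation and its method names are assumptions made for this illustration; it is not a full Delaunay refinement implementation. Because the split places the new node at the centroid, the covered area stays the same and no zero-area triangle can appear, so callers can rely on both invariants after every call:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal mesh whose only mutator preserves the invariants.
final class TriangleMesh {
    // Each triangle is stored as {x1, y1, x2, y2, x3, y3}.
    private final List<double[]> triangles = new ArrayList<>();

    TriangleMesh(double[] initialTriangle) {
        if (area(initialTriangle) == 0.0) {
            throw new IllegalArgumentException("initial triangle must have a non-zero area");
        }
        triangles.add(initialTriangle.clone());
    }

    /** Stateless: triangle area computed with the shoelace formula. */
    static double area(double[] t) {
        return Math.abs(t[0] * (t[3] - t[5])
                      + t[2] * (t[5] - t[1])
                      + t[4] * (t[1] - t[3])) / 2.0;
    }

    /** Accessor: number of triangles in the mesh. */
    int size() {
        return triangles.size();
    }

    /** Accessor: total covered area (invariant: never changes). */
    double totalArea() {
        double sum = 0.0;
        for (double[] t : triangles) {
            sum += area(t);
        }
        return sum;
    }

    /** Accessor: true if any triangle has zero area (invariant: always false). */
    boolean hasDegenerateTriangle() {
        for (double[] t : triangles) {
            if (area(t) == 0.0) {
                return true;
            }
        }
        return false;
    }

    /** Mutator: replaces one triangle with three that meet at its centroid. */
    void splitTriangle(int index) {
        double[] t = triangles.remove(index);
        double cx = (t[0] + t[2] + t[4]) / 3.0;
        double cy = (t[1] + t[3] + t[5]) / 3.0;
        triangles.add(new double[] {t[0], t[1], t[2], t[3], cx, cy});
        triangles.add(new double[] {t[2], t[3], t[4], t[5], cx, cy});
        triangles.add(new double[] {t[4], t[5], t[0], t[1], cx, cy});
    }

    public static void main(String[] args) {
        TriangleMesh mesh = new TriangleMesh(new double[] {0, 0, 3, 0, 0, 3});
        double areaBefore = mesh.totalArea();
        mesh.splitTriangle(0);
        System.out.println(mesh.size());
        System.out.println(Math.abs(mesh.totalArea() - areaBefore) < 1e-9);
        System.out.println(mesh.hasDegenerateTriangle());
    }
}
```

A mesh that exposed raw node insertion instead could offer neither guarantee, and every caller would have to re-verify the mesh after each change.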
Using weak preconditions
Weak preconditions help callers just like strong invariants. If invariants are constraints on the API designer, preconditions are constraints on the caller: conditions which should be met for the call to succeed. From the caller’s perspective, the invariants should be strong, and the preconditions weak. In an ideal world, all API calls would succeed and produce correct results for all possible arguments. In the real world, this is either impossible or it conflicts with other design requirements. The trick is to stay as close to the ideal solution as possible.
For example, one of our APIs limits the length of string method parameters to less than 255 characters for efficient database storage and better performance. On the other hand, it would be easier to use without these limitations. Web Services APIs, in general, are infamous for taking complex data structures as arguments, yet they only work when these data structures are appropriately constructed. The documentation rarely states the preconditions explicitly, leading to backbreaking trial-and-error style programming.
To sum it up, weak preconditions (or no preconditions) are better than strong ones, and documented preconditions are far preferable to undocumented ones.
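When a precondition cannot be avoided, it can at least be stated and enforced in one visible place. The sketch below applies this to the string length limit mentioned above. The TitleField class, its method names and the exception type are hypothetical choices for this illustration:

```java
// Hypothetical field wrapper; the 255-character limit mirrors the example in the text.
final class TitleField {
    static final int MAX_LENGTH_EXCLUSIVE = 255;

    private String title = "";

    /** Stateless: lets callers test the precondition before calling setTitle. */
    static boolean isValidTitle(String candidate) {
        return candidate != null && candidate.length() < MAX_LENGTH_EXCLUSIVE;
    }

    /**
     * Mutator.
     * Precondition: isValidTitle(newTitle) is true.
     * @throws IllegalArgumentException if the precondition is violated
     */
    void setTitle(String newTitle) {
        if (!isValidTitle(newTitle)) {
            throw new IllegalArgumentException(
                "title must be non-null and shorter than " + MAX_LENGTH_EXCLUSIVE + " characters");
        }
        this.title = newTitle;
    }

    /** Accessor: returns the current title. */
    String title() {
        return title;
    }

    public static void main(String[] args) {
        TitleField field = new TitleField();
        field.setTitle("Quarterly report");
        System.out.println(field.title());
        try {
            field.setTitle(new String(new char[255])); // 255 characters: too long
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The precondition is written once as code, repeated in the documentation comment, and reported in the error message, so a caller never has to discover it by trial and error.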
Conclusion
Observable state is just one of the many reasons why self-documenting APIs are a largely unreachable ideal. Reentrancy, performance characteristics, extensibility via inheritance, the use of callbacks, caching, clustering and distributed state can all lead to complex, unintuitive behavior. While careful design using strong invariants and weak preconditions can make API behavior more predictable, behavior still needs to be explicitly specified. The recommended way of specifying behavior is with code in the form of unit tests, assertions or contracts.
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.