Java API Design Checklist

There are many different rules and tradeoffs to consider during Java API design. Like any complex task, it tests the limits of our attention and memory. Similar to the pilots’ pre-flight checklist, this list helps software designers remember obvious and not so obvious rules while designing Java APIs. It is a complement to and intended to be used together with the API Design Guidelines.

We also have some before-and-after code examples to show how this list can help you remember overlooked design requirements, spot mistakes, identify less-than-optimal design choices and opportunities for improvements.

Click the [explain] link next to a checklist item (where available) for details about the rationale, examples, design tradeoffs or other limitations of applicability.

This list uses the following conventions:

   (Do) verb...  - Indicates the required design
    Favor...     - Indicates the best of several design alternatives
    Consider...  - Indicates a possible design improvement
    Avoid...     - Indicates a design weakness
    Do not...    - Indicates a design mistake

1. Package Design Checklist

1.1. General

  • 1.1.1. Favor placing API and implementation into separate packages [explain]
  • 1.1.2. Favor placing APIs into high-level packages and implementation into lower-level packages [explain]
  • 1.1.3. Consider breaking up large APIs into several packages [explain]
  • 1.1.4. Consider putting API and implementation packages into separate Java archives [explain]
  • 1.1.5. Avoid (minimize) internal dependencies on implementation classes in APIs [explain]
  • 1.1.6. Avoid unnecessary API fragmentation [explain]
  • 1.1.7. Do not place public implementation classes in the API package [explain]
  • 1.1.8. Do not create dependencies between callers and implementation classes [explain]
  • 1.1.9. Do not place unrelated APIs into the same package [explain]
  • 1.1.10. Do not place API and SPI into the same package [explain]
  • 1.1.11. Do not move or rename the package of an already released public API [explain]

1.2. Naming

  • 1.2.1. Start package names with the company’s official root namespace [explain]
  • 1.2.2. Use a stable product or product family name at the second level of the package name [explain]
  • 1.2.3. Use the name of the API as the final part of the package name [explain]
  • 1.2.4. Consider marking implementation-only packages by including “internal” in the package name [explain]
  • 1.2.5. Avoid composite names [explain]
  • 1.2.6. Avoid using the same name for both package and class inside the package [explain]
  • 1.2.7. Avoid using “api” in package names [explain]
  • 1.2.8. Do not use marketing, project, organizational unit or geographic location names [explain]
  • 1.2.9. Do not use uppercase characters in package names [explain]

1.3. Documentation

  • 1.3.1. Provide a package overview (package.html) for each package [explain]
  • 1.3.2. Follow standard Javadoc conventions [explain]
  • 1.3.3. Begin with a short, one sentence summary of the API [explain]
  • 1.3.4. Provide enough details to help deciding if and how to use the API [explain]
  • 1.3.5. Indicate the entry points (main classes or methods) of the API [explain]
  • 1.3.6. Include sample code for the main, most fundamental use case [explain]
  • 1.3.7. Include a link to the Developer Guide [explain]
  • 1.3.8. Include a link to the Cookbook [explain]
  • 1.3.9. Indicate related APIs
  • 1.3.10. Include the API version number [explain]
  • 1.3.11. Indicate deprecated API versions with the @deprecated tag
  • 1.3.12. Consider including a copyright notice [explain]
  • 1.3.13. Avoid lengthy package overviews
  • 1.3.14. Do not include implementation packages into published Javadoc

2. Type Design Checklist

2.1. General

  • 2.1.1. Ensure each type has a single, well-defined purpose
  • 2.1.2. Ensure types represent domain concepts, not design abstractions
  • 2.1.3. Limit the number of types [explain]
  • 2.1.4. Limit the size of types
  • 2.1.5. Follow consistent design patterns when designing related types
  • 2.1.6. Favor multiple (private) implementations over multiple public types
  • 2.1.7. Favor interfaces over class inheritance for expressing simple commonality in behavior [explain]
  • 2.1.8. Favor abstract classes over interfaces for decoupling API from implementation [explain]
  • 2.1.9. Favor enumeration types over constants
  • 2.1.10. Consider generic types [explain]
  • 2.1.11. Consider placing constraints on the generic type parameter
  • 2.1.12. Consider using interfaces to achieve similar effect to multiple inheritance
  • 2.1.13. Avoid designing for client extension
  • 2.1.14. Avoid deep inheritance hierarchies
  • 2.1.15. Do not use public nested types
  • 2.1.16. Do not declare public or protected fields
  • 2.1.17. Do not expose implementation inheritance to the client

2.2. Naming

  • 2.2.1. Use a noun or a noun phrase
  • 2.2.2. Use PascalCasing
  • 2.2.3. Capitalize only the first letter of acronyms [explain]
  • 2.2.4. Use accurate names for purpose of the type [explain]
  • 2.2.5. Reserve the shortest, most memorable name for the most frequently used type
  • 2.2.6. End the name of all exceptions with the word “Exception” [explain]
  • 2.2.7. Use singular nouns (Color, not Colors) for naming enumerated types [explain]
  • 2.2.8. Consider longer names [explain]
  • 2.2.9. Consider ending the name of derived class with the name of the base class
  • 2.2.10. Consider starting the name of an abstract class with the word “Abstract” [explain]
  • 2.2.11. Avoid abbreviations
  • 2.2.12. Avoid generic nouns
  • 2.2.13. Avoid synonyms
  • 2.2.14. Avoid type names used in related APIs
  • 2.2.15. Do not use names which differ in case alone
  • 2.2.16. Do not use prefixes
  • 2.2.17. Do not prefix interface names with “I”
  • 2.2.18. Do not use types names used in Java core packages [explain]

2.3. Classes

  • 2.3.1. Minimize implementation dependencies
  • 2.3.2. List public methods first [explain]
  • 2.3.3. Declare implementation methods private
  • 2.3.4. Define at least one public concrete class which extends a public abstract class [explain]
  • 2.3.5. Provide adequate defaults for the basic usage scenarios
  • 2.3.6. Design classes with strong invariants
  • 2.3.7. Group stateless, accessor and mutator methods together
  • 2.3.8. Keep the number of mutator methods at a minimum
  • 2.3.9. Consider providing a default no-parameter constructor [explain]
  • 2.3.10. Consider overriding equals(), hashCode() and toString() [explain]
  • 2.3.11. Consider implementing Comparable [explain]
  • 2.3.12. Consider implementing Serializable [explain]
  • 2.3.13. Consider making classes re-entrant
  • 2.3.14. Consider declaring the class as final [explain]
  • 2.3.15. Consider preventing class instantiation by not providing a public constructor [explain]
  • 2.3.16. Consider using custom types to enforce strong preconditions as class invariants
  • 2.3.17. Consider designing immutable classes [explain]
  • 2.3.18. Avoid static classes
  • 2.3.19. Avoid using Cloneable
  • 2.3.20. Do not add instance members to static classes
  • 2.3.21. Do not define public constructors for public abstract classes clients should not extend [explain]
  • 2.3.22. Do not require extensive initialization

2.4. Interfaces

  • 2.4.1. Provide at least one implementing class for every public interface
  • 2.4.2. Provide at least one consuming method for every public interface
  • 2.4.3. Do not add methods to a released public Java interface
  • 2.4.4. Do not use marker interfaces
  • 2.4.5. Do not use public interfaces as a container for constant fields

2.5. Enumerations

  • 2.5.1. Consider specifying a zero-value (“None” or “Unspecified”, etc) for enumeration types
  • 2.5.2. Avoid enumeration types with only one value
  • 2.5.3. Do not use enumeration types for open-ended sets of values
  • 2.5.4. Do not reserve enumeration values for future use
  • 2.5.5. Do not add new values to a released enumeration

2.6. Exceptions

  • 2.6.1. Ensure that custom exceptions are serialized correctly
  • 2.6.2. Consider defining a different exception class for each error type
  • 2.6.3. Consider providing extra information for programmatic access
  • 2.6.4. Avoid deep exception hierarchies
  • 2.6.5. Do not derive custom exceptions from other than Exception and RuntimeException
  • 2.6.6. Do not derive custom exceptions directly from Throwable
  • 2.6.7. Do not include sensitive information in error messages

2.7. Documentation

  • 2.7.1. Provide type overview for each type
  • 2.7.2. Follow standard Javadoc conventions
  • 2.7.3. Begin with a short, one sentence summary of the type
  • 2.7.4. Provide enough details to help deciding if and how to use the type
  • 2.7.5. Explain how to instantiate the type
  • 2.7.6. Provide code sample to illustrate the main use case for the type
  • 2.7.7. Include links to relevant sections in the Developer Guide
  • 2.7.8. Include links to relevant sections in the Cookbook
  • 2.7.9. Indicate related types
  • 2.7.10. Indicate deprecated types using the @deprecated tag
  • 2.7.11. Document class invariants
  • 2.7.12. Avoid lengthy type overviews
  • 2.7.13. Do not generate Javadoc for private fields and methods

3. Method Design Checklist

3.1. General

  • 3.1.1. Make sure each method does only one thing
  • 3.1.2. Ensure related methods are at the same level of granularity
  • 3.1.3. Ensure no boilerplate code is needed to combine method calls
  • 3.1.4. Make all method calls atomic
  • 3.1.5. Design protected methods with the same care as public methods
  • 3.1.6. Limit the number of mutator methods
  • 3.1.7. Design mutators with strong invariants
  • 3.1.8. Favor generic methods over a set of overloaded methods
  • 3.1.9. Consider generic methods
  • 3.1.10. Consider method pairs, where the effect of one is reversed by the other
  • 3.1.11. Avoid “helper” methods
  • 3.1.12. Avoid long-running methods
  • 3.1.13. Avoid forcing callers to write loops for basic scenarios
  • 3.1.14. Avoid “option” parameters to modify behavior
  • 3.1.15. Avoid non-reentrant methods
  • 3.1.16. Do not remove a released method
  • 3.1.17. Do not deprecate a released method without providing a replacement
  • 3.1.18. Do not change the signature of a released method
  • 3.1.19. Do not change the observable behavior of a released method
  • 3.1.20. Do not strengthen the precondition of an already released API method
  • 3.1.21. Do not weaken the postcondition of an already released API method
  • 3.1.22. Do not add new methods to released interfaces
  • 3.1.23. Do not add a new overload to a released API

3.2. Naming

  • 3.2.1. Begin names with powerful, expressive verbs
  • 3.2.2. Use camelCasing
  • 3.2.3. Reserve “get”, “set” and “is” for JavaBeans methods accessing local fields
  • 3.2.4. Use words familiar to callers
  • 3.2.5. Stay close to spoken English
  • 3.2.6. Avoid abbreviations
  • 3.2.7. Avoid generic verbs
  • 3.2.8. Avoid synonyms
  • 3.2.9. Do not use underscores
  • 3.2.10. Do not rely on parameter names or types to clarify the meaning of the method

3.3. Parameters

  • 3.3.1. Choose the most precise type for parameters
  • 3.3.2. Keep the meaning of the null parameter value consistent across related method calls
  • 3.3.3. Use consistent parameter names, types and ordering in related methods
  • 3.3.4. Place output parameters after the input parameters in the parameter list
  • 3.3.5. Provide overloaded methods with shorter parameter lists for frequently used default parameter values
  • 3.3.6. Use overloaded methods for operations with the same semantics on unrelated types
  • 3.3.7. Favor interfaces over concrete classes as parameters
  • 3.3.8. Favor collections over arrays as parameters and return values
  • 3.3.9. Favor generic collections over raw (untyped) collections
  • 3.3.10. Favor enumeration types over Boolean or integer types
  • 3.3.11. Favor putting single object parameters ahead of collection or array parameters
  • 3.3.12. Favor putting custom type parameters ahead of standard Java type parameters
  • 3.3.13. Favor putting object parameters ahead of value parameters
  • 3.3.14. Favor interfaces over concrete classes as return types
  • 3.3.15. Favor empty collections to null return values
  • 3.3.16. Favor returning values which are valid input for related methods
  • 3.3.17. Consider making defensive copies of mutable parameters
  • 3.3.18. Consider storing weak object references internally
  • 3.3.19. Avoid variable length parameter lists
  • 3.3.20. Avoid long parameter lists (more than 3)
  • 3.3.21. Avoid putting parameters of the same type next to each other
  • 3.3.22. Avoid out or in-out method parameters
  • 3.3.23. Avoid method overloading
  • 3.3.24. Avoid parameter types exposing implementation details
  • 3.3.25. Avoid Boolean parameters
  • 3.3.26. Avoid returning null
  • 3.3.27. Avoid return types defined in unrelated APIs, except core Java APIs
  • 3.3.28. Avoid returning references to mutable internal objects
  • 3.3.29. Do not use integer parameters for passing predefined constant values
  • 3.3.30. Do not reserve parameters for future use
  • 3.3.31. Do not change the parameter naming or ordering in overloaded methods

3.4. Error handling

  • 3.4.1. Throw exception only for exceptional circumstances
  • 3.4.2. Throw checked exceptions only for recoverable errors
  • 3.4.3. Throw runtime exceptions to signal API usage mistakes
  • 3.4.4. Throw exceptions at the appropriate level of abstraction
  • 3.4.5. Perform runtime precondition checks
  • 3.4.6. Throw NullPointerException to indicate a prohibited null parameter value
  • 3.4.7. Throw IllegalArgumentException to indicate an incorrect parameter value other than null
  • 3.4.8. Throw IllegalStateException to indicate a method call made in the wrong context
  • 3.4.9. Indicate in the error message which parameter violated which precondition
  • 3.4.10. Ensure failed method calls have no side effects
  • 3.4.11. Provide runtime checks for prohibited API calls made inside callback methods
  • 3.4.12. Favor standard Java exceptions over custom exceptions
  • 3.4.13. Favor query methods over exceptions for predictable error conditions

3.5. Overriding

  • 3.5.1. Use the @Override annotation
  • 3.5.2. Preserve or weaken preconditions
  • 3.5.3. Preserve or strengthen postconditions
  • 3.5.4. Preserve or strengthen the invariant
  • 3.5.5. Do not throw new types of runtime exceptions
  • 3.5.6. Do not change the type (stateless, accessor or mutator) of the method

3.6. Constructors

  • 3.6.1. Minimize the work done in constructors
  • 3.6.2. Set the value of all properties to reasonable defaults
  • 3.6.3. Use constructor parameters only as a shortcut for setting properties
  • 3.6.4. Validate constructor parameters
  • 3.6.5. Name constructor parameters the same as corresponding properties
  • 3.6.6. Follow the guidelines for method overloading when providing multiple constructors
  • 3.6.7. Favor constructors over static factory methods
  • 3.6.8. Consider a no parameter default constructor
  • 3.6.9. Consider a static factory method if you don’t always need a new instance
  • 3.6.10. Consider a static factory method if you need to decide the precise type of object at runtime
  • 3.6.11. Consider a static factory method if you need to access external resources
  • 3.6.12. Consider a builder when faced with many constructor parameters
  • 3.6.13. Consider private constructors to prevent direct class instantiation
  • 3.6.14. Avoid creating unnecessary objects
  • 3.6.15. Avoid finalizers
  • 3.6.16. Do not throw exceptions from default (no-parameter) constructors
  • 3.6.17. Do not add a constructor with parameters to a class released without explicit constructors

3.7. Setters and getters

  • 3.7.1. Start the name of methods returning non-Boolean properties with “get”
  • 3.7.2. Start the name of methods returning Boolean properties with “is”, “can” or similar
  • 3.7.3. Start the name of methods updating local properties with “set”
  • 3.7.4. Validate the parameter of setter methods
  • 3.7.5. Minimize work done in getters and setters
  • 3.7.6. Consider returning immutable collections from a getter
  • 3.7.7. Consider implementing a collection interface instead of a public propertie of a collection type
  • 3.7.8. Consider read-only properties
  • 3.7.9. Consider making a defensive copy when setting properties of mutable types
  • 3.7.10. Consider making a defensive copy when returning properties of mutable type
  • 3.7.11. Avoid returning arrays from getters
  • 3.7.12. Avoid validations which cannot be done with local knowledge
  • 3.7.13. Do not throw exceptions from a getter
  • 3.7.14. Do not design set-only properties (with public setter no public getter)
  • 3.7.15. Do not rely on the order properties are set

3.8. Callbacks

  • 3.8.1. Design with the strongest possible precondition
  • 3.8.2. Design with the weakest possible postcondition
  • 3.8.3. Consider passing a reference to the object initiating the callback as the first parameter of the callback method
  • 3.8.4. Avoid callbacks with return values

3.9. Documentation

  • 3.9.1. Provide Javadoc comments for each method
  • 3.9.2. Follow standard Javadoc conventions
  • 3.9.3. Begin with a short, one sentence summary of the method
  • 3.9.4. Indicate related methods
  • 3.9.5. Indicate deprecated methods using the @deprecated tag
  • 3.9.6. Indicate a replacement for any deprecated methods
  • 3.9.7. Avoid lengthy comments
  • 3.9.8. Document common behavioral patterns
  • 3.9.9. Document the precise meaning of a null parameter value (if permitted)
  • 3.9.10. Document the type of the method (stateless, accessor or mutator)
  • 3.9.11. Document method preconditions
  • 3.9.12. Document the performance characteristics of the algorithm implemented
  • 3.9.13. Document remote method calls
  • 3.9.14. Document methods accessing out-of-process resources
  • 3.9.15. Document which API calls are permitted inside callback methods
  • 3.9.16. Consider unit tests for illustrating the behavior of the method

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.

API Design Webinar – Video and Slides

This is the recording of a webinar I presented to the OpenText R&D staff. It summarizes the material of the nine articles (totaling approximately 35 printed pages) published so far:

If you had no time to read the articles, perhaps you can spare 60 minutes to watch the video. If not, you can browse the slides to get an overall idea of the topics covered so far.  I apologize for the quality of the recording made over a regular phone line by our teleconferencing provider.  I do not apologize for my Hungarian accent, which is, of course, entirely  intentional…

Video Recording

Slides

 

 

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.

Considering the perspective of the caller

Regular software design produces mediocre APIs because focus on implementation hurts APIs in many different ways. We will illustrate how with an internal Java API to avoid singling out any well-known public APIs. We immediately notice the complexity. The class ContentInstance, for example, appears 5 levels deep in the class hierarchy, and implements 4 additional interfaces (not counting the standard Serializable interface):

com.company.service.javabean
Class ContentInstance

java.lang.Object
  extended by com.company.service.common.DataObject
      extended by com.company.service.javabean.ManagedObject
          extended by com.company.service.javabean.ExtensibleObject
              extended by com.company.service.javabean.ContentItem
			extended by com.company.service.javabean.ContentInstance

All Implemented Interfaces:
    IAttributedObject, IChannelAssociate, IPersistable, IRelatedAttribute, java.io.Serializable

ContentInstance itself defines (or overrides) 33 methods, which is reasonable, but it also inherits a whopping 79 methods from ManagedObject, 8 methods from ExtensibleObject and 7 methods from DataObject for a grand total of over one hundred methods. That’s a lot of methods in a single class! We agree that complex problems require complex solutions and we should have no problem seeing similar complexity in a (hidden) implementation. In APIs, however, we greatly value simplicity. We don’t like spending time browsing through dozens of classes and hundreds of methods. We like it when ContentInstance is always ContentInstance and not ManagedObject or IChannelAssociate depending on the context, which tends to happen a lot when using such complex inheritance hierarchies. We are glad that someone else did the implementation work for us, but we don’t feel the need to understand how they did it. We focus on what we are implementing when using APIs, and frankly, we don’t have much time for anything else. The better the API manages to hide implementation details from us, the more we appreciate it.

Excessive abstraction is another problem which arises from implementation-focused API design. DataObject, ManagedObject, ExtensibleObject, ContentItem and ContentInstance are all pure design abstractions with no corresponding real-world objects or concepts we could immediately relate to. What’s the difference between DataObject and ManagedObject or between ContentItem and ContentInstance? We need to understand the whole API before we can understand its parts, a daunting task with a large API. We are happy to acknowledge that no complex problem can be solved without powerful abstractions. On the other hand, we need to confess that we find it difficult to understand someone else’s abstract concepts, because for this we must think like that other person. We wish the other person thought more like us instead, in familiar concepts like Document, Folder, Project or User.

Complex and counter-intuitive usage patterns are a third annoyance of implementation-focused design.  Read a programmer’s recollection of trying to figure out how to create a new instance of the ContentInstance class: “At first, I didn’t expect that ContentInstance can be instantiated because I found no constructor and no factory method in the class definition. Only after further investigation did I discover the newInstance() factory method ContentInstance inherits from the abstract super class ManagedObject. I was confused by the abstract base class declaring a factory method while the concrete class did not. Eventually, I learned from the documentation that using the ContentInstance and ContentType classes is similar to using Object and Class in the Java reflection API. The correct way of instantiating a ContentInstance was by calling ContentType.newInstance(), the inherited static newInstance() method proving to be a bit of a red herring. While the analogy with the Java Reflection API certainly helped, I started wondering if writing a program using this API would be just as awkward as writing an entire Java program using the reflection API…”

There are many other signs of implementation-focused design in the API. An out-of-process call is evident when IManagedObjectRef.getManagedObject() method throws a java.rmi.RemoteException. The object-relational mapping layer (Castor) is revealed when the class AttributeData inherits from org.exolab.castor.jdo.TimeStampable. Internal caching is obvious from methods like ContentType.clearCache(), while the underlying database schema is visible and accessible through methods like AttributeDefinitionData.getColumn(). With so many implementation details spilling out, we are left to wonder which part of our code will break when we upgrade to the next version of the API.

Designed for use versus designed to implement

The above issues may seem hard to avoid, but in practice they are not. While doing API usability tests at Microsoft, Jeffrey Stylos and his colleagues discovered a surprisingly easy way to do it. It happened almost by accident: on a few occasions, they asked the developers to solve simple programming tasks without giving them any specific APIs to use, instead they allowed them to make up the APIs they wanted to use. They were surprised by what they saw: when asked to send a simple text message using an unspecified messaging interface, all the developers wrote:

TextMessage msg = new TextMessage();

and not a single one of them wrote

MessageFactory factory = (MessageFactory) DirContext.lookup(“MessageFactory”);
TextMessage msg = factory.createTextMessage();

Given the opportunity to design their own graphics API, none of the developers wanted to write

Image.draw(false);

instead, they wanted to use two distinct methods, similar to these:

Image.overlay();  //draw over previous image
Image.draw();       //erase previous image before drawing

Boolean method parameters hardly ever figured in any of the APIs which developers were asking for. None of the developers thought they would need to do complex initialization steps; instead they assumed that the API will work right out-of-the box. Few of them expected that they will be required to extend classes, to implement interfaces, or to catch and handle exceptions. None of them wrote more than a few lines of code for the basic scenarios they were asked to implement. They just assumed that the API will take care of the details and hide the complexity from them.

The APIs the developers designed for themselves to use and someone else to implement were thus very different from the APIs they would design for themselves to implement and someone else to use. The conclusion is that the APIs we design are heavily influenced by our point of view. Let’s call these the caller’s point of view and the implementer’s point of view. Since the API is implemented only once but used many times, it should be clear that the caller’s point of view is dominant in API design.

Write client code first

So how do we design APIs from the caller point of view? By doing the exact same thing the developers were asked to do in the above experiment: writing the client code first. Not just once, but separately for every core usage scenario we want the API to cover. As we do this, repeating API usage patterns will emerge, as well as the types and methods which we need to provide. It helps when the developer writing these usage scenarios is not the same as the one implementing the API, to prevent any accidental “implementation bias”. It also helps if more than one person contributes code scenarios, so that personal preferences and programming style won’t have an undue influence on the API.

It is very important to point out that we are advocating writing real code for solving real problems as use cases, not pretend or throw-away code. The written code should be constantly updated and maintained as the API evolves and it should work correctly when the API is finally implemented. We don’t consider this wasted time, as this code can be reused for samples in the API documentation and as part of the API test suite. We should be skeptical about any API for which code samples are not readily available.

If an application has a graphical user interface, it is very common practice to model the API after the GUI, with method calls corresponding roughly to user actions, method parameters to user input, and results to what is displayed on the screen. This correctly reflects the user’s perspective and has nothing to do with the implementation, right? Well, the caller (programmer) and the user are not the same; they have drastically different needs. Issues with such APIs include:  insistence to log in with user name and password even when writing code for  unsupervised batch processes, excessive dependence on exception handling (the result of reusing existing input validation and error reporting logic), over-abundance of basic data types in method signatures (especially the string data type), data structures that have the same fields as forms displayed on the user’s screen, and so on. Such APIs are perhaps useful for developing alternative GUIs, but are less suitable for other scenarios.

The true test of our commitment to the “consider the caller perspective” guideline comes when we need to provide programmatic access to existing functionality. The implementation already exists. What are we going to do? Start coding scenarios and design an API from scratch? Or do we succumb to temptation, document the implementation we already have and call it an API? After all, any other API will require bridging, will introduce new bugs and may cause some performance problems. Why waste time on it?

If so many embarrassing (for the designer) and irritating (for the user) API design mistakes can be easily avoided by considering the viewpoint of the caller and writing the client code first, why is this not a common practice? Honestly, we don’t know the answer. The necessary time and effort certainly plays a role. Ultimately, just like in the case of User Experience Design, API Design is either part of an organization’s culture or it isn’t. One thing is for sure: considering the viewpoint of the caller is an essential part of any disciplined API design process.

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.

Why developer-friendliness is central to API design

Today, APIs play a bigger role in software development than ever before. The evolution of computing has been dominated by ever-increasing levels of abstraction; the use of higher-level languages, of course, but also the development of platforms, libraries, and frameworks. Professor Douglass C. Smith claims the progression of this second category far outpaced the development of programming languages.  Developers are also noticing that difficulty has shifted from designing algorithms and data structures to choosing, learning, and using an ever-growing set of APIs. Yet, many developers may be surprised to learn just how much effort goes into designing these APIs.

Microsoft took API design very seriously since the beginning of Windows in the early 80s, understanding that the success of a platform depends on the number of developers it attracts (watch Steve Ballmer yelling “developers, developers, developers” on stage). A more recent example of Microsoft’s API design efforts is showcased in the extensive guidelines developed for the design of the huge .Net Framework API. And Microsoft is not alone in these efforts. Sun (now Oracle) has very similar guidelines for the APIs that form the Java SDK. In a 2002 interview, Joshua Bloch, former architect in the Core Java Platform Group (currently with Google), talks about Sun’s approach to API design. In yet another example, SAP AG hired a team of researchers from Carnegie Mellon University to help design a set of new web services interfaces.

Taking API design seriously isn’t restricted to the large software vendors. Volunteers working on the Eclipse open-source project are asked to follow the guidelines published to the Eclipse API Central. The Qt Application Development Framework has The Little Manual of API Design. And the examples could continue… But if it looks like we are going to use the “everybody else is doing it” fallacy here, there is much more to it. We believe you can derive very concrete and palpable benefits from taking a disciplined approach towards API design, and we will discuss these next.

Our claims

If books are judged by their cover, then software platforms, services, frameworks, and libraries are judged by their APIs. When developers evaluate the APIs, they are likely passing judgment on the entire service or component. Paying special attention to API design is a smart investment towards making a good impression, especially since publicly visible APIs are typically only a tiny fraction of the entire code base.

Developers and organizations are increasingly realizing that API quality directly impacts the quality of their own code. It might sound weird to hold API authors responsible for the quality of someone else’s code, but this happens more and more. Copy-and-paste style programming, boilerplate code, and similar questionable practices used to be blamed on the lack of programming skills until it was noticed that sometimes even the smartest, most careful developers can only avoid them by painfully wrapping APIs with their own classes and methods. It does not take a genius to figure out what these wrappers are: efforts to eliminate or isolate the effects of API deficiencies. Since APIs are written once, but used many times, calls to fix such APIs are getting louder. With thousands of public web services and free open source components to choose from, the days when developers silently put up with whatever APIs we gave them are long gone.

Any effort put into making APIs easier to learn, easier to use, and easier to troubleshoot directly translates into considerable programmer productivity gains. Let’s return to the main premise of the opening paragraph: today developers spend most of their time learning APIs, using APIs, and debugging code that is nothing more than moderately complex business logic wrapped around multiple complex API calls. It should come as no surprise that better APIs make developers more productive.

Concrete data supports the above claim. Studies conducted in 2007 by a team of researchers lead by Brad Myers and Jeff Stylos determined that even minor changes to APIs, for example replacing factory methods with regular constructors, can lead to significant programmer productivity gains. Moreover, the difference in productivity was measured regardless of the experience level of the developers. While the measured times varied little among participants using constructors, it showed differences up to a factor of 10 between participants using the factory method, a clear sign that they were struggling with the API. The same surprisingly large difference in productivity, up to a factor of 10 times, was measured when a method was moved to a different class within the API, or even after the same team redesigned the API of a big and complex SAP Business Rules Framework. Precise productivity measurements like these are difficult, and published results are still rare. While most claims about big increases in programmer productivity are only supported by anecdotal evidence, we have little reason to doubt their trustworthiness.

While numerous studies show that API quality is the main driving factor of software adoption, many developers and quite a few organizations are still not convinced. Perhaps they did not read Brian Foote’s The Selfish Class, which claims that “Software artifacts that cannot attract programmers are not reused, and fade into oblivion.” When adoption doesn’t happen, it is still common to blame the programmer’s monumental ego, FUD spread by the competition, or even the mystifying Not Invented Here Syndrome, rather than poor API quality.  But often you don’t even know where to start because many APIs lack useful documentation. In addition, some APIs are so closely tailored for a specific usage scenario that they are tedious or even impossible to use in a different context. Others come with so many dependencies that it is indeed easier to re-implement the functionality from scratch. As long as API quality remains low, software adoption rates remain low.

What are good APIs

But what are good APIs?  What appeals to programmers? What qualities should an API possess to attract them? Where should we direct our efforts?

We are repeatedly told to design intuitive APIs, although practical advice on how to do it remains in short supply. Wikipedia defines intuition as “understanding without apparent effort”, while Merriam Webster describes it as “the power or faculty of attaining to direct knowledge or cognition without evident rational thought and inference”. The truth is that the intuitive thought process is so quick and so closely associated with words like natural, straightforward, expected or obvious that nobody completely understands how intuition really works. Most contrast intuition with complex logical reasoning or learning by trial-and-error. Often, it is easier to identify counter-intuitive APIs from a sense of deep frustration we experience while using them. While it is hard to pinpoint precisely what makes APIs intuitive, some API design best practices, such as considering the perspective of the caller or striving for consistency are known to turn out intuitive APIs. We will discuss these practices in future installments.

We are also constantly reminded to keep APIs simple. We are even encouraged to leave functionality out to prevent them from getting complex. While the desire to satisfy 100% of all imaginable use cases unquestionably leads to overly-complex APIs, one cannot always simplify APIs by leaving out features. Obviously, there is a limit beyond which crippled APIs are no longer useful. We will discuss how to tell if APIs are getting complex and techniques for keeping it simple in a separate installment. Simplicity is an important requirement, but it needs to be carefully balanced against many other requirements.

Self-proclaimed experts are quick to point out that APIs should be easy to understand, remember, and use, but most of them fail to follow up with practical advice. Let’s start with the observation that understanding programs is quite different from understanding colloquial English. It is more like understanding a text from an exotic field of science.

“In this paper we study top quark production from string balls at LHC and compare with the parton fusion results at NNLO using pQCD. We find significant top quark production from string balls at LHC which is comparable to standard model pQCD results. We also find that ds/dpT of top quarks from string balls does not decrease significantly with increase in pT, whereas it decreases sharply in case of standard model pQCD scenario.”

A particle physicist familiar with the terms used can easily decipher the meaning of the above text. Good APIs strive to maintain programmer familiarity by minimizing the number of new terms and concepts and reusing existing programming vocabulary wherever possible. When naming new concepts they stay as close as possible to the spoken language and respect the established conventions of the programming environment. APIs that ignore this requirement are just as difficult to understand as the text above. Choosing memorable names is important enough to have an entire installment on this topic alone.

Speaking about ease of use… With so many APIs to work with, developers don’t have the time to thoroughly evaluate them. Instead, they just give them a quick test drive. If an API doesn’t allow for this kind of experimentation, or if the first impressions are not positive, it will not be given further consideration. We should encourage experimentation by making core scenarios very easy to implement and by providing code samples developers can play with. It helps with experimentation if APIs are safe to play with. Safety also plays a crucial role in the later phases of development, reducing the number of issues we need to debug and fix. API safety is often equated with good error detection and reporting but it is more than that: the API should make it impossible or hard for the developer to get it into a bad situation. Rico Marini calls this concept the “Pit of Success”. We will talk more about making APIs safe in a future installment.

Because developers like to experiment with APIs, they rarely read the documentation up-front. This raises the question of whether APIs need to be documented at all. To answer it, let’s remember that in everyday speech we can understand a message even if parts of it have been lost; for example we can read an article from a ragged, dirty newspaper picked up from the subway floor. The same is not true for short, terse messages like Tweets or SMS. A single missing word can make such messages incomprehensible. We say that such messages lack redundancy.  APIs also lack redundancy and the main purpose of documentation is to provide it. Redundancy should not be confused with repetition; it is the same information but conveyed in a different form, increasing the chances that it will be understood. Simply repeating the information found in method signatures in the documentation does nothing to help. We will discuss writing helpful documentation in a future installment. Because specifying behavior accurately is critically important and challenging at the same time, we will discuss this topic separately.

Purpose-made APIs

It is important to realize that the above-discussed API qualities do not appear spontaneously and that API design requires special attention. This is a crucial point. Many developers still doubt the need for a separate API design step because they see software interfaces spontaneously emerging at module boundaries during implementation. But experience shows that such spontaneous software interfaces have serious shortcomings when used as APIs, not because of developer negligence or incompetence, but because the design trade-offs we routinely make during implementation simply do not result in good APIs. To make matters worse, fixing such APIs after they are published is extremely difficult. As Joshua Bloch says, “Public APIs, like diamonds, are forever. You have one chance to get it right so give it your best”. This thought alone should be enough to question the practice of letting APIs emerge spontaneously. APIs won’t improve until API design is separated from the implementation effort and it is treated as a separate task. We will call such APIs purpose-made APIs.

Purpose-made APIs are much more likely to contain intuitive, meaningfully named methods and types, more likely to be carefully documented, and more likely to come with helpful samples, tutorials and unit tests. Such APIs are often simple, yet complete, and can evolve over time without breaking their clients; they have few dependencies and pack a lot of functionality. They are safer to use, encourage experimentation and support writing fast and efficient client code. The list isn’t complete. It just sketches the background against which Erich Gamma’s statement should be understood: “A key lesson here is that API is not just a documented class. And, APIs don’t just happen; they are a big investment.”

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.