Here's a story you might find relatable. We kicked off a new integration project that started 3 months later than everyone wanted. The APIs had just shipped after a bunch of delays. We were handed API docs and a test environment and started building out the new feature. The documentation was pretty good despite some inconsistencies. We tried to be good about giving feedback, which the API team was mostly happy to receive. Bugs were patched, and new versions of the API were deployed frequently. Things were relatively smooth.
As we proceeded with testing, we ran into unexpected issues with a few endpoints. It turned out that a new version of the APIs had been released without clear communication about the changes. This created a scramble to understand what had been modified and how it had happened. Eventually, we discovered that a redundant field had been removed from the responses. Since no one should have been using that field, removing it shouldn't have been a big deal. Unfortunately, it still broke things for us. In my opinion, the absence of a Schema caused these problems.
A Schema is essentially a metadata document that describes an API. It defines the inputs and outputs of a system while eliminating ambiguity. When used correctly, it creates the foundation for a contract between an API and its consumers. Here's an example of an OpenAPI schema describing a health check endpoint for a RESTful API.
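```yaml
# Version numbers and response codes below are illustrative.
openapi: 3.0.3
info:
  title: Example API
  version: 1.0.0
paths:
  /health:
    get:
      summary: Health Check
      description: Check the health of the API
      responses:
        "200":
          description: API is healthy
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/HealthStatus"
        default:
          description: API is experiencing issues
components:
  schemas:
    HealthStatus:
      type: string
      enum: [healthy, offline]
```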
Here, a `/health` route is defined with an accompanying summary and description. The schema also describes the expected body when the request is successful: a `HealthStatus` component, which is expected to be a string with one of two possible values, `healthy` or `offline`. While this is a contrived example, it does illustrate the level of detail that you can expect to have in your schemas.
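To make that level of detail concrete, here is a minimal Python sketch of checking a value against the `HealthStatus` component. The dictionary is a hand-copied stand-in for the schema, and `conforms` is a hypothetical helper that only understands the `type: string` and `enum` keywords. It is not a real OpenAPI validator, just an illustration of how little ambiguity the schema leaves.

```python
# Hand-copied stand-in for the HealthStatus component (an assumption for
# illustration; a real project would load it from the schema file).
HEALTH_STATUS_SCHEMA = {"type": "string", "enum": ["healthy", "offline"]}


def conforms(value, schema):
    """Return True if value satisfies the string/enum subset of the schema."""
    # Only the "string" type is understood by this toy checker.
    if schema.get("type") == "string" and not isinstance(value, str):
        return False
    allowed = schema.get("enum")
    # No enum means any string is acceptable; otherwise it must be listed.
    return allowed is None or value in allowed


print(conforms("healthy", HEALTH_STATUS_SCHEMA))   # True
print(conforms("degraded", HEALTH_STATUS_SCHEMA))  # False
```

A client that receives `degraded` knows immediately that either the API or the schema is wrong, and there is no room left for guessing.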
Documenting your API in this manner has numerous benefits beyond its inherent value. For instance:
- By doing so, you can generate server and client code stubs to ensure that they are compatible with the schema.
- You gain access to an ecosystem of tools, such as playgrounds that help you write and test your schemas, as well as compatibility testing tools to test backward compatibility between schemas.
- Since it is just code, you have the option of version-controlling changes to your APIs through Git or any other VCS of your preference.
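As a rough illustration of the compatibility testing mentioned above, the Python sketch below compares two versions of an object schema and reports the fields that disappeared. The `ORDER_V1` and `ORDER_V2` schemas and the `removed_properties` helper are hypothetical, made up for this example. Dedicated diff tools check far more than this (types, required fields, enums, parameters), so treat it as a toy.

```python
# Two hypothetical versions of the same response schema.
ORDER_V1 = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string"},
        "legacy_status": {"type": "string"},  # the "redundant" field
    },
}
ORDER_V2 = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string"},
    },
}


def removed_properties(old_schema, new_schema):
    """List property names present in old_schema but missing from new_schema."""
    old_props = set(old_schema.get("properties", {}))
    new_props = set(new_schema.get("properties", {}))
    return sorted(old_props - new_props)


print(removed_properties(ORDER_V1, ORDER_V2))  # ['legacy_status']
```

Run in CI against the previous schema version, a check like this would have flagged the removed field from my story before any client noticed it.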
Although my example uses an OpenAPI schema, this applies to any Interprocess Communication (IPC) that requires message serialization. Other tools like JSON Schema, GraphQL, Protocol Buffers, and Cap'n Proto may better suit your needs. Regardless of the tool or platform you use, the general principles remain the same.
With these benefits in mind, I believe the most significant advantage is the ability to transition to a Schema Driven (or Schema First) workflow. A good Schema Driven workflow places the schema at the center of the development process. It becomes the nexus point for communication between teams and creates a shared framework for understanding the various systems being implemented. Some key aspects of a Schema Driven workflow include:
- Single Source of Truth: All parties involved, including the API, client teams, and other stakeholders, should rely on the Schema as the true definition of input and output data structures between the systems.
- Easier Parallel Development: Teams can start working without waiting for APIs to be fully implemented. They can use the schema to generate code, stubs, or mock data to simulate API behavior when it's unavailable. If the contract changes, build errors will indicate necessary adjustments to support the new contract.
- Contract Testing: The schema serves as a basis for contract testing, where API providers and clients validate that their implementations conform to the agreed-upon schema.
- Versioning: You get all of the benefits that come with version control. You can use a git commit hash or tag to identify a specific version of the schema. When requirements change, the Schema can be updated using the same change control process used to manage code. This gives vested parties a chance to review and provide feedback on the Schema using the same code review process that they are familiar with.
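The parallel development point can be sketched in a few lines of Python. The `mock_from_schema` function below is a hypothetical helper that walks a small subset of schema keywords (`type`, `enum`, `properties`, `items`) and produces placeholder data a client team could code against before the API exists. The `HEALTH_RESPONSE` schema, including its `uptime_seconds` field, is invented for illustration and is not part of the example schema above.

```python
# Placeholder values for primitive types in this toy generator.
PRIMITIVE_DEFAULTS = {"string": "string", "integer": 0, "number": 0.0, "boolean": False}


def mock_from_schema(schema):
    """Build placeholder data from a small subset of schema keywords."""
    if "enum" in schema:
        # Any listed value is valid; pick the first for determinism.
        return schema["enum"][0]
    kind = schema.get("type")
    if kind == "object":
        properties = schema.get("properties", {})
        return {name: mock_from_schema(sub) for name, sub in properties.items()}
    if kind == "array":
        return [mock_from_schema(schema.get("items", {}))]
    # Unknown or missing types fall back to None.
    return PRIMITIVE_DEFAULTS.get(kind)


# Hypothetical response schema, invented for this example.
HEALTH_RESPONSE = {
    "type": "object",
    "properties": {
        "status": {"type": "string", "enum": ["healthy", "offline"]},
        "uptime_seconds": {"type": "integer"},
    },
}

print(mock_from_schema(HEALTH_RESPONSE))
# {'status': 'healthy', 'uptime_seconds': 0}
```

Because the mock is derived from the schema rather than written by hand, it changes whenever the contract changes, which is the property that makes the schema a usable single source of truth.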
If my teams had adopted these practices early on as part of our development processes, I think it would have improved communication massively. Both the API and client-side teams would have had a lot more confidence in the system as a whole, and there would have been far fewer surprises (maybe none) late in the development process. Anecdotally, newer teams that I've worked with that have taken up a Schema Driven approach have been better for it.
To sum up, there are numerous tools available to help you define a Schema for your APIs and implement Schema Driven workflows. These ecosystems have had ample time to develop, and the available tools are quite capable. I would even go as far as arguing that it costs you more not to have a Schema at this point.
It's 2023. Your API should have a Schema.
openapi-generator: OpenAPI Generator allows generation of API client libraries (SDK generation), server stubs, documentation and configuration automatically given an OpenAPI Spec (v2, v3) (no date). ↩︎
openapi-diff: Utility for comparing two OpenAPI specifications (no date). ↩︎
Sandoval, K. (2017) Using A schema-first design as your single source of truth, Nordic APIs. Available at: https://nordicapis.com/using-a-schema-first-design-as-your-single-source-of-truth/ (Accessed: July 4, 2023). ↩︎
Or perhaps later in the future? Hello from the past! ↩︎