Industry · Jan 10, 2023

What is JSON Schema and how is it used in APIs?

Many modern APIs use JSON to exchange information between the API vendor and the consumer. JSON is a structured data format that can be read and understood by programs, such as an API client or server.

JSON has become immensely popular for building APIs because it is simple, human-readable and interoperable with a wide variety of platforms.

This is a valid JSON document:

"I am a JSON document."

JSON can also describe more complex information:

{
  "name": {
    "first": "Jürgen",
    "last": "Wolf"
  },
  "address": {
    "street": "12 Orchard Street",
    "city": "Saint Paul",
    "state": "MN",
    "zip": "55104"
  },
  "phone": "952-907-6136"
}

Alternatively, the same information represented as an XML document may look like this:

<?xml version="1.0" encoding="UTF-8" ?>
<contact>
  <name>
    <first>Jürgen</first>
    <last>Wolf</last>
  </name>
  <address>
    <street>12 Orchard Street</street>
    <city>Saint Paul</city>
    <state>MN</state>
    <zip>55104</zip>
  </address>
  <phone>952-907-6136</phone>
</contact>

You can see that the XML representation conveys the same information but is more verbose. The JSON representation is arguably simpler to read and write.

That these JSON documents can be embedded verbatim in this article shows that JSON is both human- and machine-readable.

JSON is also widely compatible across different platforms. JSON stands for JavaScript Object Notation, so it is no surprise that JavaScript has native support for it. In fact, since ECMAScript 2019 every JSON document is also a valid JavaScript expression. Most other platforms and languages either support encoding and decoding JSON natively or have widely used third-party libraries for it.
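
As a brief illustration of that built-in support, here is a minimal Python sketch using the standard library's json module. The data is a shortened version of the contact example above, and the variable names are only illustrative.

```python
import json

# A JSON document as it might arrive over the network (plain text).
raw = '{"name": {"first": "Jürgen", "last": "Wolf"}, "phone": "952-907-6136"}'

# Decode: JSON text -> native Python dicts and strings.
contact = json.loads(raw)
first_name = contact["name"]["first"]

# Encode: native Python values -> JSON text.
# ensure_ascii=False keeps "ü" as-is instead of escaping it.
encoded = json.dumps(contact, ensure_ascii=False)
```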

What are the downsides of using JSON in APIs?

The decision to implement an API using JSON does come with some tradeoffs.

JSON is not as efficient at transmitting data as binary formats such as Protocol Buffers. This means that network requests will transfer more data, which can affect latency and cost. However, in the vast majority of cases the impact is not noticeable.

The simplicity of JSON also often comes at the cost of program correctness. It is not possible to tell that a JSON document is valid for a particular purpose by looking at the document alone. This is problematic for APIs in particular since APIs are fundamentally interfaces that two parties must agree on.

For example, if a client wants to send contact information for a user to a server, it may send:

{
  "first_name": "Lucas",
  "middle_name": "Casper",
  "last_name": "Sortland"
}

However, the server may expect this instead:

{
  "full_name": "Lucas Casper Sortland",
  "addressable_name": "Lucas"
}

In this example, the client is submitting the user's name from a form that contains separate input fields for first, middle and last name. The server, however, is following best practice and modelling the name as a single full name, with a shorter variation used to address the end user directly. Since the client is submitting the name in the wrong format, the submission will not succeed.

From just inspecting the client's payload, there doesn't seem to be anything incorrect about it. The two parties need a way to agree on and document what the expected format for the name should be.
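
The mismatch can be sketched in a few lines of Python. The handler below is hypothetical (the function name and status strings are assumptions, not part of any real API); it only shows that the client's payload parses as JSON without any problem and is rejected later, when the server looks for the property it expects.

```python
import json

client_payload = (
    '{"first_name": "Lucas", "middle_name": "Casper", "last_name": "Sortland"}'
)

def handle_contact(body: str) -> str:
    """Hypothetical server handler that expects a 'full_name' property."""
    data = json.loads(body)  # succeeds: the payload is well-formed JSON
    if "full_name" not in data:
        # The document is valid JSON, but not valid for this purpose.
        return "400 Bad Request: missing 'full_name'"
    return "200 OK"

result = handle_contact(client_payload)
```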

TIP: Unless you have a hard constraint, it is better to model names as a single full name to be as inclusive as possible. Many real names do not fit any single fixed structure. Review the W3C's guidance on personal names when designing user experiences.

JSON Schema describes whether a JSON document is valid for a particular purpose

JSON Schema is typically used in APIs to describe the kinds of data the server can accept (the request payload) and the data the client can expect to receive (the response payload).

A schema is itself a JSON document. It looks like this:

{
  "type": "string"
}

This schema describes a JSON document that consists of a single string of any value. It could be used to describe the response of an endpoint that returns a status value.
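
To make the idea concrete, here is a deliberately simplified Python sketch of what checking a document against this schema involves. It is not a real JSON Schema validator: it looks only at the type keyword, the JSON_TYPES table and the matches_type function are names invented for this example, and numeric types are left out.

```python
import json

schema = json.loads('{"type": "string"}')

# Maps a few JSON Schema type names to Python types. Simplified on purpose:
# numeric types are omitted because Python's bool is a subtype of int and
# would need special handling.
JSON_TYPES = {"string": str, "object": dict, "array": list, "boolean": bool}

def matches_type(document, schema) -> bool:
    """Check only the 'type' keyword; real validators handle many more."""
    expected = schema.get("type")
    if expected not in JSON_TYPES:
        return True  # no (supported) type constraint to check
    return isinstance(document, JSON_TYPES[expected])

is_valid = matches_type(json.loads('"active"'), schema)
is_invalid = matches_type(json.loads('{"status": "active"}'), schema)
```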

The following example shows an HTTP request to a /status endpoint and the corresponding response that returns a single string value:

GET /status HTTP/1.1
Host: api.example.com
Accept: application/json

HTTP/1.1 200 OK
Date: Tue, 10 Jan 2023 22:38:34 GMT
Content-Type: application/json
Content-Length: 8

"active"

A problem with returning a single string is that you can't extend the response with additional data. If the server wanted to return more data alongside the status, the response might look like this instead:

HTTP/1.1 200 OK
Date: Tue, 10 Jan 2023 22:38:34 GMT
Content-Type: application/json
Content-Length: 68

{
  "status": "active",
  "status_changed": "2020-01-10T08:00:03Z"
}

A client program that tried to parse this response payload while expecting a single string value would fail.
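
A minimal Python sketch of such a client shows the failure. The read_status function is hypothetical client code, written here only to illustrate the point.

```python
import json

old_body = '"active"'
new_body = '{"status": "active", "status_changed": "2020-01-10T08:00:03Z"}'

def read_status(body: str) -> str:
    """Hypothetical client code written against the original string response."""
    value = json.loads(body)
    if not isinstance(value, str):
        # The new response is an object, so the old assumption no longer holds.
        raise TypeError(f"expected a JSON string, got {type(value).__name__}")
    return value

status = read_status(old_body)  # works with the original response
```

Calling read_status(new_body) raises a TypeError, because the payload now decodes to an object rather than a string.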

To avoid this, the API can define the new response format using the following schema:

{
  "type": "object",
  "properties": {
    "status": {
      "type": "string"
    },
    "status_changed": {
      "type": "string",
      "format": "date-time"
    }
  }
}

The original JSON schema defining a single string is now embedded inside the new one under the status key. JSON Schema allows schemas to be composed of subschemas, allowing you to define complex structures efficiently.
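
Composition is also what makes validation naturally recursive. The Python sketch below is a simplified illustration under stated assumptions, not a complete JSON Schema implementation: it handles only the type and properties keywords, ignores format, and the validate function and JSON_TYPES table are names made up for this example.

```python
JSON_TYPES = {"string": str, "object": dict, "array": list, "boolean": bool}

def validate(document, schema, path="$"):
    """Return a list of error messages; handles only 'type' and 'properties'."""
    errors = []
    expected = schema.get("type")
    if expected in JSON_TYPES and not isinstance(document, JSON_TYPES[expected]):
        errors.append(f"{path}: expected {expected}")
        return errors
    if isinstance(document, dict):
        for key, subschema in schema.get("properties", {}).items():
            if key in document:  # properties are optional in this sketch
                # Each property value is validated against its own subschema.
                errors.extend(validate(document[key], subschema, f"{path}.{key}"))
    return errors

status_schema = {"type": "string"}
response_schema = {
    "type": "object",
    "properties": {
        "status": status_schema,  # the original schema reused as a subschema
        "status_changed": {"type": "string", "format": "date-time"},
    },
}

ok = validate({"status": "active"}, response_schema)
bad = validate({"status": ["active"]}, response_schema)
```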

JSON Schema powers developer documentation

You may be wondering why you should go to the trouble of using JSON Schema to describe your API contract when you are already writing documentation for your API. It's true that most developers will never consume JSON Schema directly. Instead, they will access developer documentation through a public website, a developer portal, or occasionally (but hopefully not) a PDF or Word document provided by the API vendor.

The challenge for all API vendors is keeping the accompanying developer documentation up to date with the API implementation. Like all software, APIs receive updates to fix bugs, remediate security vulnerabilities and deliver new functionality. Externally maintained documentation often falls behind the deployed version of the API, which leads to drift between the documented behavior and how the API actually behaves.

The advantage of JSON Schema is that it is machine-readable. This means it can act as the single source of truth, because tooling can parse it and transform it into any other target format, including developer documentation. In fact, many developer portals today are powered by JSON Schema underneath.

The way this works is that JSON Schema supports metadata annotations, such as property descriptions:

{
  "type": "object",
  "properties": {
    "status": {
      "type": "string",
      "description": "The user's status. All users start out as 'active' and can become 'suspended'."
    },
    "status_changed": {
      "type": "string",
      "description": "An ISO 8601 timestamp indicating when the user's status last changed."
    }
  }
}

Metadata annotations don’t affect the validity of a JSON document, but can be utilized by other tooling.
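
As a rough sketch of how documentation tooling might use these annotations, the Python snippet below walks the schema's properties and renders one line of reference text per property. The render_property_docs function and its output format are assumptions made for illustration; real documentation generators are far more elaborate.

```python
def render_property_docs(schema: dict) -> str:
    """Render one plain-text line per property from 'type' and 'description'."""
    lines = []
    for name, subschema in schema.get("properties", {}).items():
        type_name = subschema.get("type", "any")
        description = subschema.get("description", "(no description)")
        lines.append(f"{name} ({type_name}): {description}")
    return "\n".join(lines)

schema = {
    "type": "object",
    "properties": {
        "status": {
            "type": "string",
            "description": "The user's status. All users start out as "
                           "'active' and can become 'suspended'.",
        },
        "status_changed": {
            "type": "string",
            "description": "An ISO 8601 timestamp indicating when the "
                           "user's status last changed.",
        },
    },
}

docs = render_property_docs(schema)
```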

The benefits of JSON Schema don’t stop at documentation

Since JSON Schema is just JSON, it can be manipulated by tooling for various other purposes.

For developers providing an API, JSON Schema can be used by code generation tools to efficiently implement type declarations in the native language of the application, along with serialization, deserialization and validation logic. This allows developers to immediately begin working on the application’s business logic, without having to write tedious model definitions themselves.
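
To give a feel for what such a code generation tool does, here is a toy Python sketch that turns a schema's properties into the source code of a dataclass. The generate_dataclass function, the PYTHON_TYPES table and the UserStatus class name are all hypothetical; the sketch assumes property names are valid Python identifiers and treats every property as optional.

```python
PYTHON_TYPES = {"string": "str", "boolean": "bool", "object": "dict", "array": "list"}

def generate_dataclass(class_name: str, schema: dict) -> str:
    """Emit Python source for a dataclass with one optional field per property."""
    lines = [
        "from dataclasses import dataclass",
        "from typing import Optional",
        "",
        "@dataclass",
        f"class {class_name}:",
    ]
    properties = schema.get("properties", {})
    if not properties:
        lines.append("    pass")
    for name, subschema in properties.items():
        # Fall back to 'object' for types this sketch does not map.
        py_type = PYTHON_TYPES.get(subschema.get("type"), "object")
        lines.append(f"    {name}: Optional[{py_type}] = None")
    return "\n".join(lines)

schema = {
    "type": "object",
    "properties": {
        "status": {"type": "string"},
        "status_changed": {"type": "string", "format": "date-time"},
    },
}

source = generate_dataclass("UserStatus", schema)
```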

Generated models can be shipped as part of SDKs in each consumer's native language, providing a superior experience for developers consuming the API.

Quality engineers can use JSON Schema validation tools to assert that the API is responding with the correct data as defined by the contract specification.

Technical writers can maintain documentation in the same source of truth as model definitions, ensuring that published documentation is always in sync with the API.

Frontend developers can use JSON Schema validation tools to ensure that an end user has filled out a form correctly before submitting it to the server. This makes it possible to give the end user real-time feedback, instead of having them submit the form and get an error. Furthermore, JSON Schema enables the frontend and backend to use the same source of truth for what constitutes valid form data.

What are some examples of JSON Schema in the real world?

OpenAPI is a specification for describing entire HTTP APIs, including the endpoints and HTTP methods (e.g. GET or POST), parameters, headers, and of course, request and response payloads. Typically you will find JSON Schema embedded inside OpenAPI documents, where it is used to describe the request and response payloads. Many popular APIs publish open-source OpenAPI definitions, for example GitHub's REST API OpenAPI Description, Twilio's OpenAPI Specification and Stripe's OpenAPI Specification.

JSON Schema isn't just for APIs; it can be used anywhere JSON is used.

The web development toolchain, SWC, uses JSON Schema to define the format of its configuration file, .swcrc:

{
  "$schema": "https://json.schemastore.org/swcrc",
  "jsc": {
    "parser": {
      "syntax": "ecmascript",
      "jsx": false,
      "dynamicImport": false,
      "privateMethod": false,
      "functionBind": false,
      "exportDefaultFrom": false,
      "exportNamespaceFrom": false,
      "decorators": false,
      "decoratorsBeforeExport": false,
      "topLevelAwait": false,
      "importMeta": false
    },
    "transform": null,
    "target": "es5",
    "loose": false,
    "externalHelpers": false,
    // Requires v1.2.50 or upper and requires target to be es2016 or upper.
    "keepClassNames": false
  },
  "minify": false
}

The example configuration file uses the special property $schema to designate the schema that the configuration should be validated against.
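
From a tool's point of view, $schema is just another property that it can read to find out which schema applies. The Python sketch below uses a trimmed-down version of the configuration, because the full example contains a // comment that Python's standard json module does not accept. It only extracts the schema reference and does not fetch or apply the schema.

```python
import json

# Trimmed-down configuration for illustration (no comments, fewer settings).
config_text = '{"$schema": "https://json.schemastore.org/swcrc", "minify": false}'
config = json.loads(config_text)

# "$schema" is an ordinary property; tools read it to locate the schema
# that the rest of the document should be validated against.
schema_url = config.get("$schema")

# The remaining properties are the actual configuration values.
settings = {key: value for key, value in config.items() if key != "$schema"}
```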

By designating an actual schema, the CLI tool can validate the .swcrc file before running any commands, eliminating one potential source of bugs or unexpected behaviour in the build process.

So far we’ve seen real-world applications of JSON Schema primarily being consumed by developers. There is nothing stopping applications from providing a friendly interface in front of JSON Schema, allowing non-technical users to edit schemas.

In fact, Segment does just this. Segment uses JSON Schema in Tracking Plans to allow their customers to define exactly what kinds of events and attributes an account should accept and enforce consistency across all sources.

User interface for editing JSON Schema by Segment

Summary

While JSON comes with many advantages owing to its simplicity and ubiquity, its lack of support for structured type information and documentation means that its usage in APIs can lead to bugs and incorrect implementations.

JSON Schema is a way of defining whether a JSON document is valid for a particular purpose. It has become the de facto way to describe API request and response payloads, largely thanks to its adoption by the OpenAPI specification.

While the primary use case of JSON Schema is to validate JSON documents, because it is itself JSON it can be used in many more applications by developers, quality engineers, technical writers, product managers, and even end users.

About Criteria

Criteria is a collaborative API design platform that implements JSON Schema for you. You don't need to be technical to get started.