Cohost "API" Documentation

This is documentation of the cohost API as used by the cohost web frontend. At the time of writing there is no official API for developers, though it is planned. This is all reverse-engineered. Use at your own risk.

Overview

Most interaction with cohost is done with a protocol called tRPC. This is a simple protocol built on top of JSON and URL query parameters. Endpoints are described in the tRPC section.

Some interactions are instead done with non-tRPC requests. This includes parts of the initial log-in flow.

You will be making most requests to various paths under https://cohost.org/api/v1/, with tRPC requests under https://cohost.org/api/v1/trpc/.

Authentication

As this API is intended for the official cohost client rather than for developers, there are no API tokens. Instead, a cookie named connect.sid is set for the domain cohost.org during the first log-in. This cookie expires after 7 days, but it is refreshed with each request you make to the API, so as long as your code makes a request at least once every 7 days, you will not need to re-authenticate. For that reason I recommend that you do not store the user’s credentials in most interactive situations. Unattended bots may be better served by storing credentials, in case the account gets logged out for other reasons.

Before logging in, I recommend checking whether you are currently logged in using the login.loggedIn tRPC endpoint. You don’t need to log in again if you are.

At a high level, the log-in flow is as follows:

TODO: details

Posting

You don’t need to use the tRPC API for simple text posts. At least, that is what iliana has told me. I do not know if that is actually true, but there IS a tRPC endpoint for posting as well. Documenting the tRPC endpoint is probably better, because the non-tRPC posting method may eventually stop existing.

TODO how to post a message without tRPC, if it is possible?

tRPC API Overview

Cohost uses tRPC for most of its API. tRPC is a tool for writing typesafe APIs, with good ol’ JSON-over-HTTP. If you are familiar with GraphQL, it is somewhat similar. However, the main deal with tRPC is that instead of defining the schema in a language-agnostic schema language, you just write out a bunch of TypeScript code, and that is your schema. The tRPC library handles forming all the API requests/responses for the developer from there.

From the trpc.io docs, here is an example API schema:

type User = { id: string; name: string; };

userList: () => User[];
userById: (id: string) => User;
userCreate: (data: { name: string }) => User;

It’s kinda cool honestly! But it’s also heavily tied to TypeScript. There are no first-party ways to interact with a tRPC API in anything other than TypeScript, because that’s just not tRPC’s goal; someone who wanted to do that would probably have used GraphQL.

Of course, that does not mean that we cannot use the tRPC API in other languages. In fact, the request/response format is quite simple to understand. It would be fun to re-implement the tRPC in typescript honestly, but for now we are sticking to just documenting the API’s observed usage by the official client.

Additionally, to quote vogon on cohost:

our ultimate strategy for shipping the public API is to port all of our hand-coded REST APIs to tRPC, deprecate the old ones, and then re-export them as REST APIs with generated OpenAPI specs using trpc-openapi.

So anything we learn reverse-engineering this now is knowledge that will continue to be useful when the official API exists.

Project Handles, Post IDs, etc.

You’ll see these a lot in the API, so let’s get them out of the way early.

userId: Every cohost user has a single numeric userId. This is the id of the account.

projectId: Cohost pages are called projects in the API. projectId is the numeric ID of the project. These are sequentially globally allocated across the website.

projectHandle: The projectHandle is the name of a project, the part after the @. So for example, https://cohost.org/artemis has a project handle of artemis.

postId: All posts have a numeric ID. These IDs are sequentially globally allocated across the website. That means that a postId is all you need to reference a specific post on the website. In practice, the API sometimes requires that you specify the projectHandle of the post too. I don’t know why.

tRPC Call Structure

tRPC read-calls are made by sending an HTTP GET to

https://cohost.org/api/v1/trpc/<endpoint>[,<endpoint>...]?batch=1&input=<URL-encoded-JSON-string>

tRPC write-calls are similarly made by sending an HTTP POST to

https://cohost.org/api/v1/trpc/<endpoint>[,<endpoint>...]?batch=1

Except, with a write-call, the JSON input is sent as the body of the request instead of getting stuffed into the input query parameter.

As indicated above, you can call multiple endpoints in a single request by separating the endpoints with commas. You should think of every request as a “batch” query, because nothing is actually different when you are calling just a single endpoint.
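To make the URL shapes above concrete, here is a minimal Python sketch that builds them. The helper names (trpc_get_url, trpc_post_url) are my own invention for illustration and are not part of cohost or tRPC. It only builds strings and sends nothing over the network.

```python
import json
from urllib.parse import quote

# Base path for tRPC calls, as described above.
TRPC_BASE = "https://cohost.org/api/v1/trpc/"


def trpc_get_url(endpoints, batch_input):
    """Build the URL for a batched read-call (hypothetical helper)."""
    path = ",".join(endpoints)
    # Compact separators keep the encoded query string short.
    raw = json.dumps(batch_input, separators=(",", ":"))
    # safe="" percent-encodes every reserved character in the JSON.
    return f"{TRPC_BASE}{path}?batch=1&input={quote(raw, safe='')}"


def trpc_post_url(endpoints):
    """Build the URL for a batched write-call; the JSON goes in the body."""
    return f"{TRPC_BASE}{','.join(endpoints)}?batch=1"


print(trpc_get_url(["login.loggedIn"], {}))
# prints https://cohost.org/api/v1/trpc/login.loggedIn?batch=1&input=%7B%7D
```

Passing a list of endpoints even for a single call keeps the “always batch” model explicit in the code.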

The mysterious batch parameter.

The batch=1 query parameter is a boolean flag that enables the feature where multiple API calls can be made within a single request. The cohost website always sets this, even if it is only doing one API call in a request.

My recommendation is to always include it, and always operate within this “batch” mode, even if you are only making one request at a time. It may seem silly, but because this is how the entirety of the website works right now, it will be easier for our reverse engineering efforts if we do the same.

Thank you vogon for explaining this to me.

tRPC call inputs

The input is a JSON object containing any inputs to the tRPC calls, with the index of the call as the key in the object. For example, if you make a single call, you will have an object like this:

{
  "0": {
    "my_cool_argument": "meowmeowmeow"
  }
}

If you were to make two calls, you’d have

{
  "0": {
    "my_cool_argument": "meowmeowmeow"
  },
  "1": {
    "my_other_cool_argument": "lol, lmao"
  }
}

You might be wondering “why are the inputs an object instead of an array”? Good question, and it’s because if an endpoint doesn’t take any parameters, you can leave it out of the input altogether. So if you queried two endpoints, but the first one didn’t have any arguments, you could have an input like this:

{
  "1": {
    "my_other_cool_argument": "lol, lmao"
  }
}

Also notice that the keys are STRINGS, they are not numbers, they are STRINGS. You need to get that right.
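As a rough illustration of the rules above (string keys, and calls without arguments left out), here is a small Python sketch. The function name batch_input is hypothetical, and using None to mean “this call takes no arguments” is a convention I made up for the example.

```python
def batch_input(call_args):
    """Turn a list of per-call arguments into the index-keyed input object.

    call_args[i] holds the arguments for the i-th endpoint in the request,
    or None if that endpoint takes no arguments (an assumption of this sketch).
    """
    return {
        str(index): args  # keys must be strings, not numbers
        for index, args in enumerate(call_args)
        if args is not None
    }


print(batch_input([None, {"my_other_cool_argument": "lol, lmao"}]))
# prints {'1': {'my_other_cool_argument': 'lol, lmao'}}
```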

tRPC call outputs

While the input is an object, the output is actually an array, because every call has to have an output. You’ll get an array like this, in the same order as your calls.

[
  {
    "result": {
      "data": {
        "some_stuff_you_care_about": "sup"
      }
    }
  },
  {
    "error": {
      "message": "",
      "code": -32003,
      "data": {
        "code": "FORBIDDEN",
        "httpStatus": 403,
        "path": "posts.delete",
        "errorCode": "not-authorized"
      }
    }
  },
  {
    "result": {
      "data": {
        "some_other_stuff_you_care_about": "nice shoes"
      }
    }
  }
]

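As far as I can tell, successful calls that return data always give you this { "result": { "data": { } } } structure. Calls with nothing to return, such as posts.delete, give you an empty { "result": {} } instead.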
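Since one batch can mix successes and errors, it helps to split the output array by call index. Below is a minimal Python sketch of that, based only on the response shapes observed above. The name split_batch_response is hypothetical, and treating any entry without an "error" key as a success is my assumption.

```python
def split_batch_response(response):
    """Split a batched tRPC response into successes and errors by call index.

    Returns (results, errors): results maps index -> the "data" payload
    (None when the call returned nothing), errors maps index -> error object.
    """
    results, errors = {}, {}
    for index, entry in enumerate(response):
        if "error" in entry:
            errors[index] = entry["error"]
        else:
            # Some calls return an empty "result" with no "data" key.
            results[index] = entry.get("result", {}).get("data")
    return results, errors


results, errors = split_batch_response([
    {"result": {"data": {"count": 1}}},
    {"error": {"message": "", "code": -32003,
               "data": {"code": "FORBIDDEN", "httpStatus": 403}}},
])
print(results)                            # prints {0: {'count': 1}}
print(errors[1]["data"]["httpStatus"])    # prints 403
```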

GET Examples

For a simple example, we’ll use login.loggedIn. login.loggedIn takes no parameters, so we can omit it from the input. However, we still need to provide an input object, so we send an empty JSON object ({}) as the input. The request is:

GET https://cohost.org/api/v1/trpc/login.loggedIn?batch=1&input=%7B%7D

As a response, we might receive this JSON:

[
  {
    "result": {
      "data": {
        "loggedIn": true,
        "userId": 1,
        "email": "someone@example.com",
        "projectId": 1,
        "modMode": false,
        "activated": true,
        "readOnly": false,
        "emailVerifyCanceled": false,
        "emailVerified": true,
        "twoFactorActive": false
      }
    }
  }
]

Notice that the response is a single object inside an array. As I said before, there is nothing special about just making one call, or making several at once.

For the next example, let’s do two read calls at once. We’ll get my notifications count and my asks count.

We query notifications.count,asks.unreadCount with input:

{
  "0": {
    "projectHandle": "artemis"
  },
  "1": {
    "projectHandle": "artemis"
  }
}

Which results in this encoded HTTP request:

GET https://cohost.org/api/v1/trpc/notifications.count,asks.unreadCount?batch=1&input=%7B%220%22%3A%7B%22projectHandle%22%3A%22artemis%22%7D%2C%221%22%3A%7B%22projectHandle%22%3A%22artemis%22%7D%7D

And we get this response:

[
  {
    "result": {
      "data": {
        "count": 1
      }
    }
  },
  {
    "result": {
      "data": {
        "count": 0
      }
    }
  }
]
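When reverse-engineering, you often want to go the other way: take a GET URL captured in the browser dev tools and see which endpoints and inputs it contains. Here is a minimal Python sketch of that, using only the standard library. The function name decode_trpc_get is hypothetical, and it assumes the URL follows the batch format described above.

```python
import json
from urllib.parse import parse_qs, urlsplit


def decode_trpc_get(url):
    """Return a list of (endpoint, input) pairs from a captured read-call URL."""
    parts = urlsplit(url)
    # The endpoint list is the last path segment, separated by commas.
    endpoints = parts.path.rsplit("/", 1)[-1].split(",")
    # parse_qs percent-decodes the query values for us.
    inputs = json.loads(parse_qs(parts.query)["input"][0])
    # Calls without arguments may be missing from the input object.
    return [(name, inputs.get(str(i))) for i, name in enumerate(endpoints)]


captured = (
    "https://cohost.org/api/v1/trpc/notifications.count,asks.unreadCount"
    "?batch=1&input=%7B%220%22%3A%7B%22projectHandle%22%3A%22artemis%22%7D"
    "%2C%221%22%3A%7B%22projectHandle%22%3A%22artemis%22%7D%7D"
)
for endpoint, call_input in decode_trpc_get(captured):
    print(endpoint, call_input)
# prints notifications.count {'projectHandle': 'artemis'}
# prints asks.unreadCount {'projectHandle': 'artemis'}
```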

POST examples

Write-calls work basically the same, but we make POST requests, and the JSON goes into the request body instead of the query arguments.

So for example, to delete a post we’d make a call to posts.delete with arguments

{
  "0": {
    "projectHandle": "artemis",
    "postId": 2352150
  }
}

The raw HTTP request (filtering out some stuff) would be

POST /api/v1/trpc/posts.delete?batch=1 HTTP/1.1
Host: cohost.org
Accept: */*
content-type: application/json
Content-Length: 50
Cookie: connect.sid=[redacted]

{"0":{"projectHandle":"artemis","postId":2352150}}

And the response body would be

[
  {
    "result": {}
  }
]
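For comparison, here is a sketch of building the same write-call with Python’s standard library. The function name build_trpc_post and the placeholder cookie value are assumptions for illustration. The request is only constructed, not sent, since actually sending it would delete a post.

```python
import json
from urllib.request import Request


def build_trpc_post(endpoints, batch_input, session_cookie):
    """Construct (but do not send) a batched tRPC write-call."""
    # Compact JSON, matching the body the web client was observed to send.
    body = json.dumps(batch_input, separators=(",", ":")).encode("utf-8")
    url = "https://cohost.org/api/v1/trpc/" + ",".join(endpoints) + "?batch=1"
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Cookie": f"connect.sid={session_cookie}",
        },
    )


request = build_trpc_post(
    ["posts.delete"],
    {"0": {"projectHandle": "artemis", "postId": 2352150}},
    "PLACEHOLDER-SESSION-VALUE",  # not a real cookie
)
print(request.get_method(), request.full_url)
print(len(request.data))  # prints 50, matching Content-Length above
```

Passing the request to urllib.request.urlopen would send it. Response handling is left out here.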

Dehydrated tRPC Calls

I’ve been doing a lot of my reverse-engineering by looking at so-called “dehydrated” tRPC calls.

When you first load a page like your profile page or the timeline, part of the website is pre-rendered HTML so your browser can start showing you something immediately. If you quickly scroll down, unless your computer is super fast, you’ll start to see new posts pop into existence. Cohost isn’t actually loading those posts over the network, it already has the data, but it’s rendering those posts on demand. This splits the post rendering load between their servers and your computer, while keeping everything relatively smooth and efficient. Pretty neat honestly. But how is it rendering those posts?

This is where the dehydrated tRPC calls come in. If you view-source one of these pages, you’ll see this:

<script type="application/json" id="trpc-dehydrated-state"></script>

with a bunch of JSON in it. In here is a bunch of cached tRPC calls.

There is actually a caching layer in the client that sits between the main tRPC library that most of the client is using, and the part of the tRPC library that sends requests to cohost’s servers. Most of the client code does not need to care whether data is cached or not. It asks for data, and if the endpoint and input data match something in cache it uses that, otherwise it sends the request off to cohost’s servers. This lets cohost decide what to pre-load and what not to pre-load just by changing what’s in the dehydrated state.

But what this also means is that it’s an excellent way for us to reverse-engineer the API, because we have a whole static list of example requests and responses to stare at without having to poke around the website and convince the web client to do the requests for us to capture in the browser dev tools.
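If you want to pull that JSON out of a saved page programmatically, something like the following Python sketch should work. It uses only the standard library. The class and function names are hypothetical, and I make no assumptions about the shape of the JSON itself, so it just returns whatever was parsed. The sample HTML at the bottom is made up.

```python
import json
from html.parser import HTMLParser


class DehydratedStateParser(HTMLParser):
    """Collects the text inside <script id="trpc-dehydrated-state">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "trpc-dehydrated-state":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def state(self):
        text = "".join(self._chunks).strip()
        return json.loads(text) if text else None


def extract_dehydrated_state(html):
    """Return the parsed dehydrated state, or None if the tag is absent or empty."""
    parser = DehydratedStateParser()
    parser.feed(html)
    parser.close()
    return parser.state()


sample = (
    '<html><body><script type="application/json" id="trpc-dehydrated-state">'
    '{"example": [1, 2]}</script></body></html>'
)
print(extract_dehydrated_state(sample))  # prints {'example': [1, 2]}
```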

tRPC Endpoint Documentation

This documents individual endpoints. It is an extremely incomplete list right now, because I am tired and need a break before documenting more. Undocumented things include posting, uploading attachments, loading timelines/profiles, draft interaction, etc.

To keep things short, I am going to omit the outer object for inputs and the outer array for outputs. That is, rather than writing

{
  "0": {}
}

or

[
  {
    "result": {
      "data": {
        "cool_data": "yeah"
      }
    }
  }
]

Instead I will write

{}

and

{
  "result": {
    "data": {
      "cool_data": "yeah"
    }
  }
}

GET login.loggedIn

The main reasons you want to call this are to check whether your session is still logged in, and to map your session cookie to the actual account data. It doesn’t take any inputs and only requires your session cookie, which is what makes it so useful.

Input

None

Output

{
  "result": {
    "data": {
      // Is the user logged in
      "loggedIn": bool,

      // numeric user ID
      "userId": number,

      // user email
      "email": string,

      // currently active user project ID
      "projectId": number,

      // is the user a website moderator? At least that's what I think
      // this means
      "modMode": bool,

      // Is the user's account activated to allow posting
      "activated": bool,

      // I'm not sure what this is but I guess maybe if this is true
      // then the user can't change their profile data or boost
      // things? shrug.
      "readOnly": bool,

      // email verify canceled, I guess
      "emailVerifyCanceled": bool,

      // Is the user's email verified
      "emailVerified": bool,

      // Does the user have two-factor auth enabled?
      "twoFactorActive": bool
    }
  }
}
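As a sketch of how a client might consume this output, here is a small Python dataclass that picks out a few of the fields. The names LoginState and parse_login_state are hypothetical. I also assume that fields other than loggedIn may be missing when you are not logged in, which I have not verified, so they are read defensively.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoginState:
    logged_in: bool
    user_id: Optional[int]
    email: Optional[str]
    project_id: Optional[int]
    read_only: bool


def parse_login_state(entry):
    """Parse one login.loggedIn output entry (outer array already removed)."""
    data = entry["result"]["data"]
    return LoginState(
        logged_in=bool(data.get("loggedIn", False)),
        # Assumption: these may be absent for a logged-out session.
        user_id=data.get("userId"),
        email=data.get("email"),
        project_id=data.get("projectId"),
        read_only=bool(data.get("readOnly", False)),
    )


state = parse_login_state({
    "result": {"data": {
        "loggedIn": True, "userId": 1, "email": "someone@example.com",
        "projectId": 1, "readOnly": False,
    }}
})
print(state.logged_in, state.project_id)  # prints True 1
```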

GET notifications.count

Input

{
  "projectHandle": string
}

Output

{
  "result": {
    "data": {
      "count": number
    }
  }
}

GET asks.unreadCount

Input

{
  "projectHandle": string
}

Output

{
  "result": {
    "data": {
      "count": number
    }
  }
}

POST posts.delete

Delete a post. Will fail if you’re not logged in as a user that has permissions to delete the post.

Input

{
  "postId": number,
  "projectHandle": string
}

Output

{
  "result": {}
}

POST relationships.unsilencePost

Unsilence a post on the currently logged-in account. If the post isn’t currently silenced, the call succeeds anyway.

Input

{
  // numeric ID of post to unsilence
  "toPostId": number
}

Output

{
  "result": {}
}