1

I have a scenario where:

  • I have a REST API to manage resources, for instance workspaces. This REST API is closed in the sense that it cannot be altered.

  • I want to CRUD workspaces but save additional information about them, which the main REST API does not support.

  • So I created another REST API, call it the client API, which CRUDs workspaces in the main REST API and saves the additional information in a database.

  • Note that no other clients use the main REST API, so there are no sync problems.

My question is: what is the best way to keep workspaces in the main REST API consistent with the workspaces that live in the client REST API?

As an example, to create a workspace, the client API exposes POST /workspaces, which:

1 - Creates a workspace in the main REST API.
2 - Creates a workspace in the client database.

If step 2 fails, I have a workspace that exists in the main REST API but not in the client database.

What's the best approach to tackle this?

4
  • do step 2 first Commented Apr 10, 2018 at 14:00
  • I think a couple of clarifications are in order: 1. What exactly do you mean by "I want to CRUD workspaces"? 2. When you say workspaces API is closed, do you mean interface, implementation or both? 3. If there are no other clients using the main API, what makes it closed? 4. It's not clear why no other clients implies no synchronization problems. Are you saying there's only ever one client instance running? Commented Apr 10, 2018 at 14:18
  • 1 - By CRUD workspaces I mean that I want to manage workspace entities, which I manage via the main REST API (vendor software), but also keep additional data about them in my database (e.g. create a workspace in the vendor API and then save the workspace id and additional data in my database). 2 - I mean it's vendor software which I cannot modify. I cannot enhance the vendor API. 3 - Answer 2 covers that, if I understood correctly. 4 - Yes, only my API interacts with the vendor software. By clients I mean my API. Commented Apr 10, 2018 at 14:27
  • This question is too broad. It's a common problem in distributed computing, with no silver bullets. The solution mostly depends on the project's needs and constraints. I suggest you read about eventual consistency and compensating transactions. And overall, get familiar with the fallacies of distributed computing. Commented Apr 10, 2018 at 14:49

2 Answers

1

I would recommend that you create your own interface that your client uses and that wraps the vendor interface. Perhaps that's what you mean by 'CRUD' here, but be careful about mixing CRUD and REST, because the HTTP operations (GET, PUT, POST etc.) don't align exactly with CRUD (create, read, update, delete).

Your client should not care about the vendor API at all. Your wrapper API will be all it knows about. The work you wish to do should then be simple.

Your concern with the DB insert failing is valid. Without a two-phase commit interface, your best bet is to record failures in local storage and retry them. One thing you might also consider is doing a preliminary insert (and commit) that represents a pending transaction before calling the API. This makes it easier to detect a problem that needs remediation. If the preliminary insert fails, you can abort before creating the resource. That should reduce the number of failures on the second step but won't eliminate the possibility. If the preliminary insert succeeds and the resource creation succeeds, but the insert with the id fails, store the id in a backup store such as a file, and also write a log entry containing the id. A retry procedure can run periodically to attempt to resolve the issue. If you find rows in your database where the id has not been stored, you know which ones to troubleshoot.
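
The flow above could be sketched roughly as follows. This is only an illustration under assumptions: the table layout, the `pending`/`active` status values, the JSON-lines backup file, and the `vendor_create` callable (standing in for whatever call creates the workspace in the vendor API) are all hypothetical names, and SQLite is used only to keep the sketch self-contained.

```python
import json
import logging
import os
import sqlite3
import uuid

log = logging.getLogger("workspace_wrapper")  # hypothetical logger name


def init_db(conn):
    # Hypothetical schema: vendor_id stays NULL until the vendor call is recorded.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workspaces ("
        " local_id TEXT PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " extra TEXT NOT NULL,"
        " vendor_id TEXT,"
        " status TEXT NOT NULL)"
    )
    conn.commit()


def mark_active(conn, local_id, vendor_id):
    # "with conn" commits on success and rolls back on error.
    with conn:
        conn.execute(
            "UPDATE workspaces SET vendor_id = ?, status = 'active'"
            " WHERE local_id = ?",
            (vendor_id, local_id),
        )


def create_workspace(conn, vendor_create, name, extra, backup_path):
    local_id = str(uuid.uuid4())
    # Preliminary insert, committed before touching the vendor API.
    # If this raises, nothing has been created on the vendor side.
    with conn:
        conn.execute(
            "INSERT INTO workspaces (local_id, name, extra, vendor_id, status)"
            " VALUES (?, ?, ?, NULL, 'pending')",
            (local_id, name, json.dumps(extra)),
        )
    # If the vendor call raises, the row simply stays 'pending'.
    vendor_id = vendor_create(name)
    try:
        mark_active(conn, local_id, vendor_id)
    except sqlite3.Error:
        # Log the id and keep it in a backup file for the retry procedure.
        log.exception("could not store vendor id %s for %s", vendor_id, local_id)
        with open(backup_path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps({"local_id": local_id, "vendor_id": vendor_id}) + "\n")
    return local_id


def retry_from_backup(conn, backup_path):
    # Meant to run periodically; returns how many entries were resolved.
    if not os.path.exists(backup_path):
        return 0
    with open(backup_path, encoding="utf-8") as fh:
        entries = [json.loads(line) for line in fh if line.strip()]
    remaining = []
    for entry in entries:
        try:
            mark_active(conn, entry["local_id"], entry["vendor_id"])
        except sqlite3.Error:
            remaining.append(entry)
    # Keep only the entries that still could not be written.
    with open(backup_path, "w", encoding="utf-8") as fh:
        for entry in remaining:
            fh.write(json.dumps(entry) + "\n")
    return len(entries) - len(remaining)
```

Rows that stay `pending` with no matching backup entry are the ones where the vendor call itself failed or its outcome is unknown, so those are the cases left for manual troubleshooting.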

4
  • I have been thinking about compensating transactions as a possible solution. If the workspace creation succeeds but the attributes creation fails, we could adhere to the 409 HTTP status semantics, that is, allow clients to retry the request with a PUT and so make the new API tolerant of failures. I'm aware that 409 is described for responses to a PUT request; the RFC says that is "most likely", but not the only case. What do you think? Commented Apr 10, 2018 at 14:54
  • @Laiv It's an interesting idea but I think there are a couple of issues with using a 409 error. First of all, 400-class errors are for user issues and this is a server-side problem. Also, there isn't really a conflict here, it's just a failure to complete the transaction. The bigger issue I think is pushing the work of resolving an inconsistent state due to a server error to the client. It also borders on being a leaky abstraction. Commented Apr 10, 2018 at 16:08
  • @JimmyJames, the first part of your answer describes my current state. I just could not describe it well enough in the question. As for your approach, I think it is a good enough solution. An advantage I have is that an inconsistent state in the vendor is OK. I mean, I can and will log errors to check for failures, but it's not a big issue if a workspace exists on the vendor side but not in the client application. Of course other concerns will come up, but it's a good starting point. Commented Apr 10, 2018 at 17:02
  • @JimmyJames you are right, 40x would be misleading. It seems more appropriate for concurrency issues. Commented Apr 10, 2018 at 17:20
1

As I see it, you have the concept of workspaces for your application, but different clients are interested in more or less information? I think your current approach is unlikely to prove maintainable in the long run.

I would stick to using a single API endpoint for your workspaces domain, and use query string parameters to modify the data you get back. This way, you only have a single data source and don't need to worry about synchronizing across databases.

For example, if your basic workspace model contained a name and some text:

{
    "name": "Workspace Name",
    "text": "Text"
}

You could access this with your standard workspace call: example.com/workspaces/1. If you wanted to get a more detailed model: example.com/workspaces/1?detailed=true. The second call could populate extra fields that could simply be ignored by clients that didn't know how to process them.
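
As a rough sketch of that idea, assuming an in-memory store and two made-up extra fields (`owner` and `description` are hypothetical, as is the `get_workspace` helper), the handler could look something like this:

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical single data source holding both basic and extra fields.
WORKSPACES = {
    1: {
        "name": "Workspace Name",
        "text": "Text",
        "owner": "admin",            # hypothetical extra field
        "description": "Some notes"  # hypothetical extra field
    }
}

BASIC_FIELDS = ("name", "text")


def get_workspace(url):
    parsed = urlparse(url)
    # The last path segment is taken as the workspace id.
    workspace_id = int(parsed.path.rstrip("/").rsplit("/", 1)[-1])
    flag = parse_qs(parsed.query).get("detailed", ["false"])[0]
    record = WORKSPACES[workspace_id]
    if flag.lower() == "true":
        # Detailed model: every stored field.
        return dict(record)
    # Basic model: only the fields every client understands.
    return {key: record[key] for key in BASIC_FIELDS}
```

Calling `get_workspace("example.com/workspaces/1")` returns only `name` and `text`, while adding `?detailed=true` returns the extra fields as well.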

3
  • Smells a bit like versioning, or layering. Good approach. Commented Apr 10, 2018 at 12:42
  • Thanks for the answer! I think I did not explain it well enough. The thing is, the main REST API belongs to vendor software. Specifically, it's the geoserver REST API. I want to build admin software which manages geoserver workspaces but keeps additional information about those workspaces. Imagine you have an admin panel where you can create geoserver workspaces and keep, in the admin panel context, additional information about those workspaces. So I have to keep track of which workspaces were created and add additional information about them. Commented Apr 10, 2018 at 13:58
  • At a very high level, I want to save an entity in an external API and save a local association in my database. This introduces integrity problems. My question is: what's the best approach to deal with this, keeping in mind that only a single client interacts with the external API? Commented Apr 10, 2018 at 14:01
