M5.10 - API design plan

Comments

Completed

Authors: Neil Caithness, Milo Thurston

Revision: 1.0

This document presents the initial design of the API for external computing services. It is not the API specification, which is the topic of MS 5.13 API Baseline Documentation due for release at the end of March 2011.

Overview:

  1. A ScratchPad (at NHM) posts a job request to the Job Scheduler (at Oxford). Authentication is simple because there is essentially only one authentic user: the ScratchPad server at the NHM, or a small number of such servers if there is more than one.
  2. The Job Scheduler enters the job in its Job Table and retrieves data files based on information in the post. The job is then sent to the appropriate service (at Oxford or elsewhere). On completion of the job, the Job Scheduler updates the Job Table with a job-completed status flag and an output URL.
  3. The ScratchPad server polls the Job Table until the completed status flag is returned, and then fetches the output from the supplied URL.
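
The three steps above can be sketched in plain Ruby as follows. This is only an illustration of the intended flow, not the planned implementation: the JobTable class, its method names, the field names and the poll_until_complete helper are all assumptions made for the example, and an in-memory hash stands in for the real Job Table.

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for the Job Table.
class JobTable
  def initialize
    @jobs = {}
  end

  # Step 1: a ScratchPad posts a job request and the scheduler records it.
  # Returns the id the ScratchPad will use when polling.
  def submit(job_type:, user_id:, data_url:)
    id = SecureRandom.hex(8)
    @jobs[id] = { job_type: job_type, user_id: user_id, data_url: data_url,
                  completed: false, output_url: nil }
    id
  end

  # Step 2: on completion the scheduler sets the completed flag and output URL.
  def complete(id, output_url)
    job = @jobs.fetch(id)
    job[:completed] = true
    job[:output_url] = output_url
  end

  # What a single poll from the ScratchPad would see.
  def status(id)
    job = @jobs.fetch(id)
    { completed: job[:completed], output_url: job[:output_url] }
  end
end

# Step 3: poll until the completed flag is set or the attempts run out.
# Returns the output URL, or nil if the job never completed in time.
def poll_until_complete(table, id, max_attempts: 5, interval: 0)
  max_attempts.times do
    state = table.status(id)
    return state[:output_url] if state[:completed]
    sleep(interval)
  end
  nil
end
```

In a real deployment the poll interval would be non-zero and the calls would go over HTTPS rather than to a local object; both are left out here to keep the sketch self-contained.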

 

Implementation:

  1. Ruby on Rails is to be used, with MongoDB as the database. This choice of database is expected to allow more flexibility in handling job parameters than a standard relational database.
  2. The ScratchPad is to post data to the Job Scheduler, comprising the job type, user ID and a URL for the data files. The Job Scheduler will then retrieve the files with wget. The post is to be on port 443 using certificate authentication.
  3. Information to be stored in a "jobs" table using a "job" model. This will contain:
    1. Rails job ID
    2. NHM-supplied user ID
    3. Job type
    4. Location of data file(s) (URL)
    5. Job status (an integer code representing pending, sent, processing, finished, error, etc.)
    6. An array of any parameters required for the job
  4. While the job is running, the ScratchPad should poll a URL containing its job ID; the Job Scheduler would look the job up (find_by_id) and return its details rendered as JSON.
  5. Submission to the remote service would presumably be by means of a Rake task run from cron, which processes any pending jobs and attempts to submit them to the processing service.
  6. An implication for the ScratchPad server(s) is that forms for job submission would need to be added to the ScratchPad code, and certificates would be needed on the server(s) for authentication against one held at Oxford.
  7. Details of the data processing service are still undetermined, but we hope that it will be able to post queries back (see the separate data_service_post controller above).
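
As a rough illustration of items 3 to 5, the job record, the JSON status response and one pass of the cron-driven submission task might look like the sketch below. It uses plain Ruby rather than Rails and MongoDB so that it runs on its own. The integer values for the status codes, the Struct field names, the to_status_json and submit_pending names, and the decision to mark a job as "error" when submission fails are all assumptions for the example; the plan itself does not fix any of them.

```ruby
require "json"

# Assumed integer codes for the job status field; the plan names the states
# but does not assign numbers to them.
STATUS = { pending: 0, sent: 1, processing: 2, finished: 3, error: 4 }.freeze

# Plain-Ruby stand-in for the "job" model described in item 3.
Job = Struct.new(:id, :user_id, :job_type, :data_urls, :status, :params, :output_url) do
  # The body a status poll (item 4) might return, rendered as JSON.
  def to_status_json
    JSON.generate("id" => id,
                  "status" => status,
                  "status_name" => STATUS.key(status).to_s,
                  "output_url" => output_url)
  end
end

# One pass of the cron-driven task (item 5): attempt to submit each pending job.
# The block stands in for the call to the processing service, which is still
# undetermined; a job whose submission raises is marked as an error (assumption).
def submit_pending(jobs)
  jobs.select { |job| job.status == STATUS[:pending] }.each do |job|
    begin
      yield job
      job.status = STATUS[:sent]
    rescue StandardError
      job.status = STATUS[:error]
    end
  end
end
```

A Rake task could wrap a call such as `submit_pending(pending_jobs) { |job| ... }`, with the block performing the actual hand-off. An alternative design would leave a failed job in the pending state so that the next cron run retries it.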

 


The above scenario was informed by discussion around the following drawing:

Revisions anticipated.
Comments welcome.
NC/MT