PPUKM Drug Formulary

A comprehensive drug reference app developed with the needs of The Pharmacy Department of the National University Hospital of Malaysia's practicing staff in mind

Technical decisions

The choice to use PHP on traditional LAMP-stack shared hosting was primarily a cost-driven one; nevertheless, it proved to be more than adequate seeing as Formulary is a straightforward, read-heavy app with a read-write ratio in excess of 2000:1.

Initially, the delta-sync endpoint was implemented in code, approximately as follows;

$DELETED = -1;
$HASH_CHANGED = 1;
$NEW = 2;

/*[{
    "drug_id": 1,
    "hash": "4Aid+/cssFK2Dv0lPI3Qv8jH/uKYnM4zNbglWUvqh2o="
},{
            "drug_id": 2,
    "hash": "ew6IP+Zxy25GUCFR3vuGU168w2RI7gpAFGJxwLL9LeI="
}]*/

$output = [];
$hashes = json_decode($request->getContent(), true);
foreach ($hashes as $hash_entry) {
    // Check in DB whether the particular id - hash combo exists; get by ID first
    $drug = DrugCurrentVersion::whereDrugId($hash_entry['drug_id'])->take(1)->get();
    if ($drug->isEmpty()) {
        array_push($output, array('drug_id' => $hash_entry['drug_id'], 'status' => $DELETED));
    } else {
        $drug = $drug[0];
        $drug_output = [];
        if (base64_encode(hex2bin($drug->hash)) !== $hash_entry['hash']) {
            foreach (self::DRUG_FIELDS as $drug_field) {
                $drug_output[$drug_field] = $drug[$drug_field];
            }
            $drug_output['hash'] = base64_encode(hex2bin($drug['hash']));
            array_push($output, array('drug_id' => $hash_entry['drug_id'], 'status' => $HASH_CHANGED, 'drug' => $drug_output));
        }
    }
}

// get all the new drugs
$new_drugs = DrugCurrentVersion::query()->whereNotIn('drug_id', array_column($hashes, 'drug_id'))->get();
foreach ($new_drugs as $new_drug) {
    $drug_output = [];
    foreach (self::DRUG_FIELDS as $drug_field) {
        $drug_output[$drug_field] = $new_drug[$drug_field];
    }
    $drug_output['hash'] = base64_encode(hex2bin($new_drug['hash']));
    array_push($output, array('drug_id' => $new_drug['drug_id'], 'status' => $NEW, 'drug' => $drug_output));
}

return response()->json($output);

The main bottleneck was doing

foreach ($hashes as $hash_entry) {
  $drug = DrugCurrentVersion::whereDrugId($hash_entry['drug_id'])->take(1)->get();
  ...

followed by the cloning of the returned entities.

One straightforward optimization could have been to create a HASH index on the primary id key, or perhaps load the entire drugs table into memory at once. The ultimate solution was to offload this "diffing" process to MySQL for the following reasons:

  • The MySQL server was higher-specced and would likely handle concurrency better than php-fpm.
  • Our problem is easily expressed in terms of JOINs; the code is perfectly concise despite manually written SQL queries.
  • The entire drugs table wouldn't need to be loaded into PHP memory all at once.
public function drugSyncChangeState(Request $request)
{
  $DELETED = -1;
  $HASH_CHANGED = 1;
  $NEW = 2;

  $hashes = $request->input("hashes");

  $lastSync = $request->input("lastSync");
  $forceSyncAt = SysKV::query()->whereKey("force_sync_at")->first();
  $forceSync = false;

  if (!empty($lastSync) && !empty($forceSyncAt)) {
      $lastSync = Carbon::parse($lastSync);
      $forceSyncAt = Carbon::parse($forceSyncAt->value);
      $forceSync = $lastSync <= $forceSyncAt;
  }

  $changedState = DB::transaction(function () use ($hashes, $forceSync, $DELETED, $HASH_CHANGED, $NEW) {
      /**
        * Prepare temporary state table.
        */
      $tempTableName = str_replace('.', '', uniqid("drugsync_temp_", true));

      DB::statement("
          CREATE TEMPORARY TABLE $tempTableName (
              drug_id INT UNSIGNED,
              hash BINARY(32),
              PRIMARY KEY (drug_id)
          );
      ");

      $hashInsert = [];
      foreach ($hashes as $id => $hash) {
          $hashInsert[] = ['drug_id' => $id, 'hash' => $this->hex2bin($hash)]; //If hash is not valid hex, will sync it.
      }

      DB::table($tempTableName)->insert($hashInsert);

      /**
        * SQL for finding deleted drugs.
        * @noinspection SqlResolve
        */
      $deletedSQL = "
          SELECT $DELETED AS change_status, COALESCE(t.drug_id, d.drug_id) as drug_id, d.hash
          FROM $tempTableName t
                    LEFT JOIN (select * from drugs_current d where deleted_at IS NULL) d ON t.drug_id = d.drug_id
          WHERE d.drug_id IS NULL
      ";

      /**
        * SQL for finding new drugs.
        * @noinspection SqlResolve
        */
      $newSQL = "
          SELECT $NEW as change_status, COALESCE(t.drug_id, d.drug_id) as drug_id, d.hash
          FROM $tempTableName t
                    RIGHT JOIN (select * from drugs_current d where deleted_at IS NULL) d
                              ON t.drug_id = d.drug_id
          WHERE t.drug_id IS NULL
      ";

      /**
        * SQL for finding changed drugs.
        */
      $changedHashSQL = "
          SELECT $HASH_CHANGED AS change_status, d.drug_id, d.hash
          FROM (select * from drugs_current d where deleted_at IS NULL) d
          WHERE drug_id IN (SELECT DISTINCT drug_id FROM $tempTableName)
            AND NOT EXISTS(SELECT 1
                            FROM $tempTableName
                            WHERE $tempTableName.drug_id = d.drug_id
                              AND $tempTableName.hash = d.hash);            
      ";

      /**
        * SQL that just forces sync for existing drugs.
        */
      $skipHashCheckSQL = "
          SELECT $HASH_CHANGED AS change_status, d.drug_id, d.hash
          FROM (select * from drugs_current d where deleted_at IS NULL) d
          WHERE drug_id IN (SELECT DISTINCT drug_id FROM $tempTableName);      
      ";

      $step3 = $forceSync ? $skipHashCheckSQL : $changedHashSQL;

      $results = DB::select("
          $deletedSQL
          UNION
          $newSQL
          UNION
          $step3
      ");
      DB::statement("DROP TABLE $tempTableName;"); // Drop the temporary state table.
      return $results; //Return and end transaction.
  });

  $results = [];
  foreach ($changedState as $row) {
      $results[] = [
          'drug_id' => (int)$row->drug_id,
          'status' => (int)$row->change_status
      ];
  }

  return response()->json($results);
}

Metrics

I conducted a load test using Gatling with the following parameters; 20 concurrent users per second, over 30 seconds, with a 1 second pause between requests:
    ScenarioBuilder scn = scenario("RecordedSimulation")
      .exec(
        http("request_0")
          .post("/api/app/v1/drugs/syncState")
          .headers(headers_0)
          .body(RawFileBody("recordedsimulation/0000_request.json"))
      ).pause(1);

	  setUp(scn.injectOpen(constantUsersPerSec(20).during(30))).protocols(httpProtocol);

A mean response time of 562 ms was achieved, with each request body containing 1372 drug entity ID-hash pairs.

================================================================================
---- Global Information --------------------------------------------------------
> request count                                        600 (OK=600    KO=0     )
> min response time                                    374 (OK=374    KO=-     )
> max response time                                   1465 (OK=1465   KO=-     )
> mean response time                                   562 (OK=562    KO=-     )
> std deviation                                        255 (OK=255    KO=-     )
> response time 50th percentile                        432 (OK=432    KO=-     )
> response time 75th percentile                        522 (OK=522    KO=-     )
> response time 95th percentile                       1107 (OK=1107   KO=-     )
> response time 99th percentile                       1227 (OK=1227   KO=-     )
> mean requests/sec                                  18.75 (OK=18.75  KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                           473 ( 79%)
> 800 ms <= t < 1200 ms                                114 ( 19%)
> t >= 1200 ms                                          13 (  2%)
> failed                                                 0 (  0%)
================================================================================

The figure of 5.5K MAU is arrived at by:

  • ~2.5K users are from the Statistics section of the Google Play Console, Monthly Active Users.
  • ~1K of users, comprising 33% of the entire iOS userbase, opted in to share their usage information, which was taken from the Metrics section in Apple's App Store Connect, ACTIVE LAST 30 DAYS.

    It was known beforehand that iOS, in particular iPad users, outnumber Android users among the target demographic, hence extrapolating to 3K is a reasonable assumption.