Table of Contents

  1. Tools Required

  2. Roles and Responsibilities on Duty

  3. WhatsApp Template

...

  1. Grafana Dashboards:

    1. Monnify Dashboard

    2. Moniepoint Success Rate Dashboard

    3. NIP Liquidity Manager

  2. Metabase

  3. New Relic

  4. Monnify BackOffice - Switching Providers:

    1. Collections Back Office

    2. ATLAS UI - adding providers

  5. Skype Channels

    1. NALA <> Moniepoint

    2. MONNIFY <> SQUAD

    3. Monnify vs Fidelity Virtual account

    4. Payattitude / TeamApt

  6. Monnify Web Portal for Testing/Simulations

  7. PagerDuty (access granted by Simpa Saiki)

  8. Status page

  9. GCP access (Logs Explorer, Workloads)

  10. Jira Support Desk

  11. WhatsApp groups

    1. VGG

    2. Baxi

    3. Wema-Monnify

    4. Sterling Monnify

    5. Coral pay

    6. TSE Monnify

    7. Monnify operations group

  12. Slack channels

    1. apm-monitoring-alerts

    2. Grafana-monnify-alerts

...

| S/N | Panels | Implications | Issues | Threshold | Escalation |
| --- | --- | --- | --- | --- | --- |
| 1 | Disbursement (Rsp Time) | The average response time per transaction from the provider | Delayed response from provider; database lag (Kafka-Monnify) | > 4 seconds; NIP success rate < 94% | TSE |
| 2 | Pending Disbursements (Total) | The count of transactions currently pending | Destination not available/unresponsive; high response time | > 100 transaction count (above 10 mins) | TSE |
| 3 | Outflow (MPT, Sterling, Wema, Fidelity) | These providers are monitored because we are also integrated with them for "Collections". Hence, downtime on this panel means downtime on the corresponding "Collections" panels | | < 60% | TSE |
| 4 | Super Merchant Panels | Baxi, NALA, VGG, Abeg Tech, and Palmpay are super merchants that use Monnify's disbursement API | | Last transaction > 1 hour ago | TSE |
| 5 | Balances | These are the disbursement account balances | Balances < 300 million for the Habari Pay and e-Tranzact accounts | < 300 million | TSE |
| 6 | ATLAS Providers Success Rates | Transactions are failing. Resolution is to turn on other providers (e.g. ISW, Habari Pay, ETZ, Hydrogen Pay) | Success rate < 95-90% | < 94%; specific bank on the provider is < 50% | TSE |
| 7 | Disbursement Performance - By Banks (10m) | Transactions are failing on that specific bank | Bank is encountering technical issues | Success rate on RED, especially for major banks | Send communication to critical stakeholders (Monnify operations group, TSE) |
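The thresholds in this table can also be written down as data, which makes them easier to review and to test against sample readings. The following is a minimal Python sketch under stated assumptions: the metric keys (such as `disbursement_rsp_time_seconds`), the `PanelRule` class, and the `breaches` function are hypothetical names invented for illustration and are not part of Grafana or any Monnify service. The "above 10 mins" duration on pending disbursements and the RED status on the per-bank panel are not modelled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class PanelRule:
    """One threshold from the table. All names here are illustrative."""
    panel: str
    breached: Callable[[float], bool]
    escalate_to: str


# Hypothetical metric keys mapped to the thresholds listed in the table.
RULES: Dict[str, PanelRule] = {
    "disbursement_rsp_time_seconds": PanelRule(
        "Disbursement (Rsp Time)", lambda v: v > 4, "TSE"),
    "nip_success_rate_pct": PanelRule(
        "Disbursement (Rsp Time)", lambda v: v < 94, "TSE"),
    # The table also requires the count to persist "above 10 mins";
    # that duration condition is not modelled in this sketch.
    "pending_disbursements": PanelRule(
        "Pending Disbursements (Total)", lambda v: v > 100, "TSE"),
    "outflow_success_rate_pct": PanelRule(
        "Outflow", lambda v: v < 60, "TSE"),
    "super_merchant_minutes_since_last_txn": PanelRule(
        "Super Merchant Panels", lambda v: v > 60, "TSE"),
    "disbursement_balance": PanelRule(
        "Balances", lambda v: v < 300_000_000, "TSE"),
    "atlas_provider_success_rate_pct": PanelRule(
        "ATLAS Providers Success Rates", lambda v: v < 94, "TSE"),
}


def breaches(readings: Dict[str, float]) -> List[str]:
    """Return one message per reading that crosses its threshold."""
    alerts = []
    for key, value in readings.items():
        rule = RULES.get(key)
        if rule is not None and rule.breached(value):
            alerts.append(
                f"{rule.panel}: {key}={value} -> escalate to {rule.escalate_to}")
    return alerts


if __name__ == "__main__":
    sample = {"disbursement_rsp_time_seconds": 5.2,
              "outflow_success_rate_pct": 75}
    for line in breaches(sample):
        print(line)
```

Readings with unknown keys are ignored rather than raising an error, so a partial set of panel values can be checked without listing every metric.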

...

| S/N | Panels | Implications | Issues | Ideal Threshold | Escalation |
| --- | --- | --- | --- | --- | --- |
| 1 | Kafka Retry Queue & Kafka Queue Backlog | Shows the count of posting & settlement entries pending execution | Delayed job execution / blocked job service | > 1,000 (Red). This threshold does not apply during the posting and settlement run that starts at 10pm, when a high count is expected. If the job is not completed by 12 am, escalate | TSE |
| 2 | Unsettled OLAM Transactions | The volume and value of settlement transactions pending for a merchant (OLAM) | Will be executed when the Kafka Retry Queue has been processed | Will be processed after 10pm | TSE |
| 3 | MJS - In Progress, Being Processed, & New | These are panels for monnify-job-service | Job services are blocked | MJS - Being Processed > 1,000 | TSE |
| 4 | Monnify Metabase Replica Lag | The time gap between the Monnify-live database and the replica | N/A | > 60 seconds (monitor the spike before escalating) | TSE and critical stakeholders (DBA) |
| 5 | Unsent Webhook Notifications | Webhook notifications not sent to merchants | Webhook notifications not sent to merchants | > 400 count | TSE |
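The time-of-day rule on the Kafka backlog threshold is easy to misread, so it may help to see it as a small function. This is a sketch only, written in Python, and it assumes the processing window runs from 10pm until midnight, an interpretation based on the note to escalate if the job is not completed at 12 am. The names `BACKLOG_THRESHOLD`, `in_settlement_window`, and `should_escalate_backlog` are hypothetical.

```python
from datetime import time

BACKLOG_THRESHOLD = 1_000    # "> 1,000 (Red)" from the table
WINDOW_START = time(22, 0)   # posting and settlement start at 10pm


def in_settlement_window(now: time) -> bool:
    """True from 22:00 up to, but not including, midnight.

    Assumption: the window ends at 12 am, because the table says to
    escalate if the job is not completed by then.
    """
    return now >= WINDOW_START


def should_escalate_backlog(count: int, now: time) -> bool:
    """Apply the backlog threshold only outside the settlement window."""
    if count <= BACKLOG_THRESHOLD:
        return False
    return not in_settlement_window(now)


if __name__ == "__main__":
    print(should_escalate_backlog(1500, time(14, 0)))   # afternoon, over threshold
    print(should_escalate_backlog(1500, time(22, 30)))  # inside the window
    print(should_escalate_backlog(1500, time(0, 0)))    # midnight, still backed up
```

Under these assumptions a backlog above 1,000 is tolerated between 10pm and midnight and escalated at any other time, including at 12 am if the queue has still not drained.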

OTHERS

Transactions stuck on ATLAS MJS (Monnify-Job-Service)
At times, the job queue (atlas-monnify-job-service) on the atlas-service gets clogged, for example by pending transactions or errors. This affects disbursements sent from Monnify-disbursement-service to atlas-service. Monitor the panels below to catch these instances.

...