Table of Contents
Tools Required
Roles and Responsibilities on Duty
WhatsApp Template
Must View:
https://teamapt.atlassian.net/wiki/x/UoCgWw
https://teamapt.atlassian.net/wiki/x/AYDzWg
https://teamapt.atlassian.net/wiki/x/p4CTW
TOOLS REQUIRED
Grafana Dashboards:
Monnify BackOffice - Switching Providers:
ATLAS UI - adding providers
Skype Channels
NALA <> Moniepoint
MONNIFY<> SQUAD
Monnify vs Fidelity Virtual account
Payattitude /Teamapt
PagerDuty (access granted by Simpa Saiki)
Status page
GCP access (Logs Explorer, Workloads)
WhatsApp groups
VGG
Baxi
Wema-Monnify
Sterling Monnify
Coral pay
TSE Monnify
Monnify operations group
Slack channels
apm-monitoring-alerts
Grafana-monnify-alerts
ROLES AND RESPONSIBILITIES ON DUTY
Daily Report Format
Incident logging on Jira
Grafana Checks
DISBURSEMENT
At the start of and throughout every shift, monitor the performance of the transactions being processed and the services for withdrawals/transfers from merchants' wallets to destination bank accounts (external bank accounts and Moniepoint bank accounts). Use Grafana, New Relic, and the alerts from Slack and PagerDuty.
S/N | Panels | Implications | Issues | Threshold | Escalation |
---|---|---|---|---|---|
1 | Disbursement (Rsp Time) | The average response time per transaction from the provider | | > 4 seconds; NIP Success Rate < 94% | TSE |
2 | Pending Disbursements (Total) | The count of transactions currently pending, caused by the following: | | > 100 transaction count (above 10 mins) | TSE |
3 | Outflow (MPT, Sterling, Wema, Fidelity) | These are monitored because we are also integrated with them for “Collections”. Hence, when there is downtime on this panel, there will be downtime on the corresponding “Collections” panels. | | < 60% | TSE |
4 | Super Merchant Panels | Baxi, NALA, VGG, Abeg Tech, and Palmpay are super merchants that use Monnify’s disbursement API. | | Last Transaction > 1 hour | TSE |
5 | Balances | These are the disbursement account balances. | Balances < 300mil for | < 300 million | TSE |
6 | ATLAS Providers Success Rates | Transactions are failing. The resolution is to turn on other providers (e.g. ISW, Habari Pay, ETZ, Hydrogen Pay). | Success rate < 95-90% | < 94%; specific bank on the provider is < 50% | TSE |
7 | Disbursement Performance - By Banks (10m) | Transactions are failing on that specific bank | Bank is encountering technical issues | **Success rate on RED, especially for major banks | Send communication to critical stakeholders (Monnify operations group, TSE) |
View the failure reasons section for the cause of the failures.
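
The single-value thresholds in the disbursement table can be expressed as a small check. This is a minimal sketch, not the production alerting logic: the function name `disbursement_breaches` and every metric key are hypothetical, and the readings are assumed to be copied from the Grafana panels. The pending-disbursement rule is left out because it depends on both a count and a duration.

```python
def disbursement_breaches(metrics):
    """Return the disbursement panels whose readings breach the runbook thresholds.

    `metrics` is a hypothetical dict of panel readings; keys that are
    absent are skipped rather than treated as breaches.
    """
    # (panel name, hypothetical metric key, predicate that is True on a breach)
    rules = [
        ("Disbursement (Rsp Time)", "response_time_s", lambda v: v > 4),
        ("NIP Success Rate", "nip_success_rate_pct", lambda v: v < 94),
        ("Outflow", "outflow_pct", lambda v: v < 60),
        ("Super Merchant Last Transaction", "super_merchant_idle_min", lambda v: v > 60),
        ("Balances", "balance", lambda v: v < 300_000_000),
    ]
    return [
        name
        for name, key, breached in rules
        if key in metrics and breached(metrics[key])
    ]
```

Any panel returned by the check is escalated to the TSE, as the table states.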
COLLECTION
For each provider, monitor and review the transaction notifications received per bank provider and confirm that traffic is arriving as expected. Whenever performance drops, reach out to the provider promptly to address the issue.
S/N | Panels | Implications | Issues | Threshold | Escalation |
---|---|---|---|---|---|
1 | In-Flows (Wema, Sterling, Moniepoint) | These panels show successful inflow transaction counts. **Inflow from GTB and Fidelity is usually very small because of low transaction volumes. | | Transaction Count (Wema, Sterling, Moniepoint): < 150; Last Transaction Time (Wema, Sterling, MPT, Fidelity): > 60 seconds; (for GTB): > 30 minutes | TSE and Provider Bank |
2 | NE Success Rate (Wema & Sterling) | Shows the success rate of name enquiries done on the banks | Downtime from the bank API service | Success Rate: < 75% | TSE |
3 | Pending Push Transaction | The count of transactions currently pending. | | > 100 transaction count | TSE |
4 | No TSQ and No Trnx Record | Count of transactions that are not completely processed due to pending TSQ, or transactions not logged on the payment session due to errors. | | > 100 transaction count | TSE |
5 | Coral Pay Pending Transaction | Count of transactions currently pending and processing after successful allocation of a virtual account to the merchant. | Merchant settles Monnify after 10pm daily for all transactions processed. | Values should be noted at the start and end of day. The value does not decrease after 11 pm | TSE |
6 | Coral Pay’s Last Transaction | Last time a transaction was received for processing | No transaction routed through Monnify | > 60 minutes | TSE |
7 | Uncompleted Rejected Payments | Duplicated transactions that occurred because the job service was blocked. | Blocked Job Service | > 100 transaction count | TSE |
8 | Pending Expiration | Shows the number of pending transactions that should have been cleared | Blocked Job Service | > 0 | TSE |
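
The "last transaction time" thresholds for in-flows differ by provider, so a per-provider lookup is a natural way to check them. The sketch below is illustrative only: `MAX_GAP`, `stale_providers`, and the shape of the `last_seen` input are assumptions, and the timestamps are expected to come from the Grafana panels.

```python
from datetime import datetime, timedelta

# Longest acceptable gap since the last inflow transaction, per the table:
# 60 seconds for Wema, Sterling, Moniepoint and Fidelity; 30 minutes for GTB.
MAX_GAP = {
    "Wema": timedelta(seconds=60),
    "Sterling": timedelta(seconds=60),
    "Moniepoint": timedelta(seconds=60),
    "Fidelity": timedelta(seconds=60),
    "GTB": timedelta(minutes=30),
}


def stale_providers(last_seen, now):
    """Return, in alphabetical order, providers whose last transaction is too old.

    `last_seen` maps a provider name to the datetime of its last inflow
    transaction. Providers without a configured gap are ignored.
    """
    return sorted(
        provider
        for provider, seen in last_seen.items()
        if provider in MAX_GAP and now - seen > MAX_GAP[provider]
    )
```

Providers returned here are the ones to raise with the TSE and the provider bank.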
CARD, USSD & PHONE PAYMENTS
S/N | Panels | Implications | Issues | Ideal Threshold | Escalation |
---|---|---|---|---|---|
1 | Card and offline payment | Gives the success rate of ISW card, Habari Pay card, and USSD payments performed | Service Downtime | Last transaction on Habari Pay card > 30 min; Success Rate: Zero Data | TSE |
PERFORMANCE SUMMARY BY PROVIDERS
S/N | Panels | Implications | Issues | Ideal Threshold | Escalation |
---|---|---|---|---|---|
1 | Pool Accounts (Available) | Shows the currently available virtual accounts per bank (Wema, Sterling, or Moniepoint) | N/A | Acceptable: 1 million; bad threshold: < 800k; critical threshold: < 300k | TSE |
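
The pool account thresholds form three bands, which can be sketched as a simple classifier. The function name `pool_account_status` is hypothetical. The table does not say how to treat counts between 800k and 1 million, so this sketch assumes anything at or above 800k is acceptable.

```python
def pool_account_status(available):
    """Classify the available virtual accounts for one bank.

    Bands follow the runbook table: below 300k is critical, below 800k is
    bad. Treating 800k and above as acceptable is an assumption.
    """
    if available < 300_000:
        return "critical"
    if available < 800_000:
        return "bad"
    return "acceptable"
```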
TRANSACTION & SETTLEMENT PROCESSING (KAFKA)
S/N | Panels | Implications | Issues | Ideal Threshold | Escalation |
---|---|---|---|---|---|
1 | Kafka Retry Queue & Kafka Queue Backlog | Shows the count of posting & settlement entries pending execution | Delayed Job Execution / Blocked Job Service | > 1,000 (Red). *This threshold should only apply before and after 10pm. Reason: at 10pm, posting and settlement are being processed, so high counts are expected. | TSE |
2 | Unsettled OLAM Transactions | These are the volume and value of settlement transactions pending for a merchant (OLAM) | Will be executed when the Kafka Retry Queue has been processed | Will be processed after 10pm | TSE |
3 | MJS - In Progress, Being Processed, & New | These are panels for monnify-job-service | If Job-Services are blocked | MJS - Being Processed > 1,000 | TSE |
4 | Monnify Metabase Replica lag | This is the time gap between the Monnify-live database and the replica | N/A | > 60 seconds (monitor the spike before escalating) | TSE and critical stakeholders (DBA) |
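
The Kafka backlog threshold is suspended around the 10pm posting and settlement run, which can be sketched as a time-aware check. The runbook names only 10pm, so the window end of 23:00 used here is purely an assumption, as are the function and parameter names.

```python
from datetime import time


def kafka_backlog_alert(count, at, window_start=time(22, 0), window_end=time(23, 0)):
    """Return True when the backlog exceeds 1,000 outside the settlement window.

    `window_end` is an assumed value: the runbook states that the
    threshold does not apply at 10pm but gives no end time.
    """
    in_settlement_window = window_start <= at < window_end
    return count > 1000 and not in_settlement_window
```

Passing a different `window_end` adjusts the check once the real duration of the settlement run is confirmed.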
OTHERS
Transactions stuck on atlas MJS (Monnify-Job-Service)
At certain times, the job queueing system (atlas-monnify-job-service) on the atlas-service gets clogged by pending transactions, errors, or other causes. This affects disbursements sent from Monnify-disbursement-service to atlas-service. Monitor the panels below to catch these instances.
S/N | Panels | Implications | Issues | Ideal Threshold | Escalation |
---|---|---|---|---|---|
1 | Monnify App Replica Lag | Shows the time it takes for transaction records to appear on the merchant transaction dashboard | N/A | > 60 seconds (monitor the spike before escalating) | TSE |
2 | Pending MJS Requests, Atlas MJS New Job Requests | Pending MJS Requests = 6,000 | Job Service Block | Atlas MJS New Requests having continuous spikes without any drop | TSE |
Handover Report
To be generated
Please note that issues that occur during a shift, whether transactional or tied to the portal, should be communicated to the TSE when there is no resolution in sight or there is confusion about what action to take. When in doubt, always ask!
WhatsApp escalation templates (sample)
Communications to Partners (Monnify Operations Group, Baxi, VGG and NALA (Skype)) when there is downtime from specific banks -
1. Hi Team, Please note that we are receiving "Timeout waiting for response from destination" error from Ecobank. Disbursements to this bank will be failing at this time.
2. Hi Team, Please note that we are receiving "Timeout waiting for response from destination" error from Standard Chartered Bank Nigeria Ltd. Disbursements to this bank will be failing at this time.
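
The two bank-downtime samples above differ only in the bank name, so they can be produced from one template. This is a minimal sketch: `BANK_DOWNTIME_TEMPLATE` and `bank_downtime_message` are hypothetical names, and the default error text is the one quoted in the samples.

```python
# Wording copied from the sample messages; only the bank and error vary.
BANK_DOWNTIME_TEMPLATE = (
    'Hi Team, Please note that we are receiving "{error}" error from {bank}. '
    "Disbursements to this bank will be failing at this time."
)


def bank_downtime_message(bank, error="Timeout waiting for response from destination"):
    """Fill the bank-downtime template for one affected bank."""
    return BANK_DOWNTIME_TEMPLATE.format(error=error, bank=bank)
```

Calling `bank_downtime_message("Ecobank")` reproduces the first sample message.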
Communications to Partners (Monnify Operations Group, Baxi, VGG and NALA (Skype)) when there is downtime from NIBSS -
3. Dear Team,
We are encountering issues with the central transfer processor for banks (NIBSS), which is affecting our disbursement service. Consequently, merchants may experience delays (pending disbursements) and failures when transferring funds from their wallets to other bank accounts.
These challenges are originating from our connection with NIBSS. We are maintaining close contact with them to get updates on the resolution. We are mitigating this challenge by routing transactions via other providers.
We apologize for any inconvenience caused and will provide updates as we receive them.
Communications to Partners (Monnify Operations Group, Baxi, VGG and NALA (Skype)) when there is downtime from NIBSS (high response time) -
4. Dear Team,
We are encountering a high response time with the central transfer processor for banks (NIBSS), which is affecting our disbursement service. Consequently, you may experience delays (pending disbursements) when attempting to transfer funds from your wallet to other bank accounts.
These challenges are originating from NIBSS. We are maintaining close contact with them to get updates on the resolution.
We apologize for any inconvenience caused and will provide updates as we receive them.
Thank you for your patience and understanding.
Communications to Partners (Baxi, VGG and NALA (Skype)) when the last transaction time is high -
5. Hello Team,
We noticed that your last transaction was 1.45 hours ago.
Kindly let us know if there are any issues.
Jira Ticket Update Template
Panel | Monnify-Type | Response Message (if applicable) | Date
Example: Pending Disbursements | 20250203;
Wema | Disbursement | Timeout waiting for response from destination | 20250203
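
A ticket update line in the format above can be assembled from its fields. This is a sketch under assumptions: the function name `jira_update_line` is hypothetical, the date is written as YYYYMMDD to match the examples, and empty optional fields are dropped, as the first example appears to do.

```python
from datetime import date


def jira_update_line(panel, on, monnify_type="", response_message=""):
    """Build a pipe-separated Jira update line.

    Field order follows the template: panel, Monnify type, response
    message, date. Empty optional fields are left out.
    """
    fields = [panel, monnify_type, response_message, on.strftime("%Y%m%d")]
    return " | ".join(field for field in fields if field)
```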
ESCALATION MATRIX
Emmanuel Eke TBD
Depends on Jira service management setup
Shift Pattern
TeamApt currently runs a 24/7 monitoring schedule - See rota here
Monnify currently runs 06:00 - D+1 02:00
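
Because the Monnify window of 06:00 to D+1 02:00 wraps past midnight, a plain "start <= t < end" comparison does not work. The sketch below shows one way to test it. The function name is hypothetical, and treating 02:00 itself as outside the window is an assumption.

```python
from datetime import time


def monnify_shift_active(at, start=time(6, 0), end=time(2, 0)):
    """Return True when `at` falls inside the 06:00 to D+1 02:00 window.

    The window crosses midnight, so a time is inside it when it is at or
    after the start, or before the end on the following day.
    """
    return at >= start or at < end
```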