Using a monitoring tool such as System Center Operations Manager (SCOM) and its associated Management Packs, add monitoring rules for the following components in your environment:
Windows Servers
SQL Server
IIS Web Server (on the API machines)
API endpoints
Billing Agent Service
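Where a Management Pack does not cover the API endpoints, a small scripted probe can fill the gap. The following is a minimal sketch in Python, not a definitive implementation. It assumes the endpoint answers a plain HTTP GET with a 2xx status when healthy. The function name probe_endpoint, the opener parameter, and the 5-second timeout are illustrative choices, not part of the product.

```python
import urllib.error
import urllib.request


def probe_endpoint(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return (healthy, detail) for a single HTTP GET against url.

    `opener` is a hypothetical injection point so the probe can be
    exercised without a live endpoint; it defaults to urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, TLS error, and so on.
        return False, f"unreachable: {exc}"
    # Assumption: any 2xx status means the endpoint is healthy.
    return 200 <= status < 300, f"HTTP {status}"
```

Calling probe_endpoint with the URL of your API endpoint returns a pair such as (True, "HTTP 200"). A scheduled task could run it every few minutes and raise an alert whenever the first element is False.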
You may also add monitoring for:
Database growth and the disk(s) hosting the Billing Database. Monitor these for performance and, more importantly, for available free space.
Event logs, for Error-level entries. Beware of the many false positives here, and only add monitoring for specific events that you have seen help identify issues.
Down-level source systems and dependencies, such as:
Microsoft Azure Stack (MAS) usage endpoint
Windows Azure Pack (WAP) usage endpoint
Active Directory and/or ADFS, if your environment depends on it
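The free-space check on the disk(s) hosting the Billing Database, listed among the optional items above, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a recommended threshold. The function name free_space_alert and the 15% default are hypothetical, and the path you pass in must be the volume that actually holds your database files.

```python
import shutil


def free_space_alert(path, min_free_fraction=0.15):
    """Return an alert message if the volume holding `path` is low on
    free space, otherwise None.

    Assumption: 15% free is the alert threshold; tune it to your
    database growth rate.
    """
    usage = shutil.disk_usage(path)
    # Guard against a volume that reports a total size of zero.
    free_fraction = usage.free / usage.total if usage.total else 0.0
    if free_fraction < min_free_fraction:
        free_gib = usage.free // 2**30
        return f"{path}: only {free_fraction:.1%} free ({free_gib} GiB)"
    return None
```

A scheduled job could call free_space_alert on the database volume and forward any non-None result to the operations team.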
It is also essential to train the operations team on these monitors and to route automated alerts to them, both to prevent failures and to speed up recovery when failures do occur.