January 14, 2025 · Yuki Tanaka · open banking

Why Open Banking Transaction Categories Are Broken (And What It Costs SMB Banking Apps)

Transaction feeds from Plaid, Tink, and MX arrive with miscategorization rates of 20-35%. We break down why this happens structurally and what it means for cash-flow accuracy in SMB banking dashboards.

The feed delivers data. It does not deliver accuracy.

When you integrate a Plaid, MX, or Tink connection into your SMB banking app, you get a transaction stream. Each record arrives with a category field — something like FOOD_AND_DRINK, SERVICE, or TRANSFER. Product teams often treat this field as ground truth, piping it directly into dashboards, expense breakdowns, and cash-flow models. That assumption is the root of the problem.

Industry-observed miscategorization rates on open banking feeds for SMB transaction populations run between 20% and 35%. That range isn't conjecture: it's what you find when you cross-reference open banking category fields against merchant-confirmed data, MCCs carried in the ISO 8583 payment message, and accounting records from the same business. For a neobank serving 5,000 active SMB accounts, each with 100+ monthly transactions, that is at least 500,000 transactions a month, of which somewhere between 100,000 and 175,000 carry wrong category labels that your UI renders as facts.
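The arithmetic behind that range, as a quick sketch. The inputs are the figures quoted above, with 100 transactions per account taken as the floor of "100+"; the function name is ours, for illustration only.

```python
def wrong_labels_per_month(accounts: int, txns_per_account: int, error_rate: float) -> int:
    """Expected number of miscategorized transactions shown to users each month."""
    return round(accounts * txns_per_account * error_rate)


# 5,000 accounts x 100 transactions = 500,000 transactions per month.
low = wrong_labels_per_month(5_000, 100, 0.20)
high = wrong_labels_per_month(5_000, 100, 0.35)
print(f"{low:,} to {high:,} wrong labels per month")
```

Swap in your own account count and per-account volume to size the problem for your book.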

Why miscategorization happens structurally

Understanding the failure modes matters because they're not random noise — they're systematic, which means they produce systematic errors in downstream features.

The raw merchant name problem

Open banking connections typically retrieve transaction data from the bank's OFX or screen-scraped feed. That feed carries the raw merchant name as it appears on the bank statement: strings like SQ *PETES PLUMBING ARLINGTON, AMZN MKTP US*2B4KL9A3, or CHECKCARD 0312 SYSCO CORP. The aggregator's classification engine sees this string and attempts to assign a category.

For consumer transactions, this works reasonably well. The universe of merchants a consumer visits (grocery stores, restaurants, gas stations) maps to a known taxonomy. For SMB transactions, the merchant space is dramatically wider and the stakes are different. A purchase from a plumbing supply company classified as Home Improvement instead of Cost of Goods Sold doesn't matter in a consumer PFM view. For a plumber tracking business expenses, it misrepresents their entire cost structure.
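To make the raw-descriptor problem concrete, here is a minimal sketch of the kind of cleanup a classifier has to do before it can even look up a merchant. The patterns are assumptions derived only from the three example strings above (we assume "SQ *" marks a Square-processed payment and that the four digits after CHECKCARD are a date-like code); a real feed needs a much larger, maintained rule set.

```python
import re

# Hypothetical prefix patterns, inferred from the example descriptors only.
_PREFIXES = [
    re.compile(r"^SQ \*"),             # assumed: Square point-of-sale prefix
    re.compile(r"^CHECKCARD \d{4} "),  # assumed: debit-card prefix plus 4-digit code
]
# Amazon Marketplace descriptors end in a per-order token that carries no merchant info.
_AMAZON = re.compile(r"^AMZN MKTP US\*\w+$")


def clean_descriptor(raw: str) -> str:
    """Strip processor noise from a bank-statement descriptor."""
    s = raw.strip().upper()
    if _AMAZON.match(s):
        return "AMAZON MARKETPLACE"
    for pattern in _PREFIXES:
        s = pattern.sub("", s)
    return s


for raw in ("SQ *PETES PLUMBING ARLINGTON", "AMZN MKTP US*2B4KL9A3", "CHECKCARD 0312 SYSCO CORP"):
    print(repr(raw), "->", repr(clean_descriptor(raw)))
```

Even after cleanup, "PETES PLUMBING ARLINGTON" says nothing about whether the payer is a homeowner or a contractor buying subcontracted labor. That is the part the descriptor alone cannot answer.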

MCC code inheritance and the B2B gap

Merchant Category Codes, defined in ISO 18245 and carried in ISO 8583 payment messages, are maintained by the card networks and assigned at merchant onboarding. They reflect how a merchant registered, not necessarily what a given business customer is buying. An MCC of 5065 (Electrical Parts and Equipment) on a large-format purchase might be raw materials for a contractor or it might be office supplies for an electrician's back office. The aggregator receives the MCC and maps it to a category bucket. The context of who is buying and why is absent from the signal entirely.

Open banking APIs operating under PSD2 in the EU and FDX-aligned data models in the US face the same constraint: the category inference layer has no business context. It cannot distinguish between a $4,000 ACH credit that is a customer payment on an invoice and a $4,000 ACH credit that is an owner capital injection. Both might arrive tagged as TRANSFER.
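A minimal sketch of what "business context" could look like in code, assuming the app already knows the account holder's industry, customers, and owners. Every type, field, and category name here is hypothetical, and the MCC entry is only a default guess for the industry, not a resolution of the raw-materials versus office-supplies ambiguity described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Txn:
    amount: float               # positive = credit, negative = debit
    feed_category: str          # category as delivered by the aggregator
    counterparty: str
    mcc: Optional[str] = None   # present on card transactions only


@dataclass(frozen=True)
class BusinessContext:
    industry: str
    customers: frozenset = frozenset()  # known invoice payers (assumed available)
    owners: frozenset = frozenset()     # known owner names (assumed available)


# Hypothetical per-industry defaults for an MCC.
MCC_DEFAULTS = {("5065", "electrical_contractor"): "Cost of Goods Sold"}


def reclassify(txn: Txn, ctx: BusinessContext) -> str:
    """Return a business category, falling back to the feed's label."""
    if txn.mcc is not None:
        default = MCC_DEFAULTS.get((txn.mcc, ctx.industry))
        if default is not None:
            return default
    if txn.feed_category == "TRANSFER" and txn.amount > 0:
        if txn.counterparty in ctx.customers:
            return "Revenue"
        if txn.counterparty in ctx.owners:
            return "Owner Contribution"
    return txn.feed_category


ctx = BusinessContext("electrical_contractor",
                      customers=frozenset({"RIVERSIDE PROPERTY MGMT"}),
                      owners=frozenset({"J DOE"}))
invoice = Txn(4000.0, "TRANSFER", "RIVERSIDE PROPERTY MGMT")
capital = Txn(4000.0, "TRANSFER", "J DOE")
print(reclassify(invoice, ctx), "|", reclassify(capital, ctx))
```

The two $4,000 credits are identical in the feed. Only the counterparty lists, which the aggregator never sees, separate them.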

Recurring transaction detection failure

Subscription and recurring vendor payments are a significant portion of SMB expense structure — SaaS tools, insurance premiums, commercial lease ACH debits, payroll runs. When a recurring transaction changes amount slightly (annual renewal price increase, usage-based SaaS overage) or the descriptor changes (payment processor renaming), many aggregator engines lose the recurring match and reclassify the transaction from scratch. A payroll ACH that was correctly labeled Payroll gets relabeled TRANSFER_DEBIT on the month the processor description format changed. The category history for that line item now has a break that makes trend analysis useless.
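One way to keep a recurring series intact through small changes is to match on tolerance rather than equality. The sketch below uses Python's standard `difflib.SequenceMatcher` for descriptor similarity; the thresholds and the sample descriptors are illustrative assumptions, not tuned values, and this is not a description of how any aggregator works internally.

```python
from difflib import SequenceMatcher


def is_same_recurring(prev_desc: str, prev_amount: float,
                      desc: str, amount: float,
                      amount_tol: float = 0.15,
                      min_similarity: float = 0.6) -> bool:
    """True if a new transaction plausibly continues an existing recurring series."""
    if prev_amount == 0 or (prev_amount < 0) != (amount < 0):
        return False  # no baseline to compare against, or direction flipped
    drift = abs(amount - prev_amount) / abs(prev_amount)
    similarity = SequenceMatcher(None, prev_desc.upper(), desc.upper()).ratio()
    return drift <= amount_tol and similarity >= min_similarity


# Hypothetical payroll descriptors before and after a processor format change.
if is_same_recurring("ACME PAYROLL 0115", -12400.00, "ACME PAYROLL SVC 0215", -12710.00):
    category = "Payroll"          # carry the established label forward
else:
    category = "TRANSFER_DEBIT"   # fall back to the feed's fresh guess
print(category)
```

Carrying the prior label forward when the match holds is what keeps the category history continuous for trend analysis.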

What it costs: concrete failure modes for SMB banking features

Consider a growing HVAC contracting business using your SMB neobank. They run roughly 200 transactions per month: vendor payments to supply houses (MCC 5074, Plumbing and Heating Equipment and Supplies), fuel cards (MCC 5541, Service Stations), payroll ACH, equipment financing, and customer invoice payments. A 25% miscategorization rate on that volume means 50 transactions per month carrying wrong labels. Here is what that produces:

Cash-flow forecast error: The 30-day forecast model sees two large customer payments labeled as TRANSFER rather than Revenue. It excludes them from income projection. The dashboard shows the business as cash-flow negative when it is actually cash-flow positive. The owner calls support — or stops trusting your app.
Expense breakdown distortion: Supply house purchases misclassified as Shopping inflate a category that has no business meaning for this company. The expense pie chart is meaningless for any business decision.
Tax export errors: If your app exports expense categorization to accounting tools or generates Schedule C-adjacent summaries, wrong categories produce wrong tax inputs. This is a trust-destroying event when an accountant finds it.
Burn rate and runway calculation failure: Payroll runs misclassified as generic debits break the recurring expense stack. Runway calculations undercount fixed costs. For a business with 60-day receivable cycles, wrong runway numbers mean wrong financing decisions.
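The cash-flow case is easy to reproduce with toy numbers. In the sketch below, all amounts are made up for illustration, and the "model" is deliberately naive: it sums everything except transactions labeled TRANSFER, which is the behavior described in the first failure mode.

```python
def projected_net(txns, ignored_labels=frozenset({"TRANSFER"})):
    """Net cash flow as a naive model sees it: transfers are excluded."""
    return sum(amount for amount, label in txns if label not in ignored_labels)


# Hypothetical month for the business, with correct labels.
true_labels = [
    (4000, "Revenue"), (4000, "Revenue"), (3000, "Revenue"),
    (-5000, "Payroll"), (-4000, "Supplies"),
]
# Same month as delivered by the feed: two customer payments tagged TRANSFER.
feed_labels = [
    (4000, "TRANSFER"), (4000, "TRANSFER"), (3000, "Revenue"),
    (-5000, "Payroll"), (-4000, "Supplies"),
]

print("with correct labels:", projected_net(true_labels))
print("with feed labels:   ", projected_net(feed_labels))
```

Two wrong labels out of five are enough to flip the sign of the month.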

The aggregator is solving a different problem

This is not a criticism of Plaid, MX, or Tink as products. They built transaction connectivity infrastructure — the plumbing that gets data from 11,000+ financial institutions into a normalized format. That is genuinely hard, and they do it well. Their categorization layer was designed for consumer PFM use cases: helping individuals see where their personal spending goes. The taxonomy (Food, Travel, Entertainment, Shopping) maps to consumer behavior, not to business accounting categories.

We're not saying the aggregators are bad at categorization — we're saying they built the right categorization for the wrong context when you're serving SMBs. The problem isn't their execution; it's that consumer-PFM categorization logic applied to business transaction data produces systematically wrong outputs for business-specific features. A neobank building cash-flow forecasting for an electrician or a marketing agency or a staffing firm needs business-context-aware reclassification. That is a different product from what the aggregation layer provides.

The reclassification gap in the modern open banking stack

The modern SMB banking stack typically looks like this: bank account connectivity (Plaid/MX/Tink) feeds a normalized transaction stream into an internal data model, which powers dashboard components, expense analytics, and cash-flow features. Somewhere in that pipeline, the category field is treated as input rather than as a noisy signal that needs a correction layer.

What the stack is missing is a reclassification step between the aggregator output and the feature layer. That step needs to do several things that a general-purpose enrichment API does not do well: understand SMB-specific merchant relationships, detect and repair recurring transaction classification breaks, handle ACH transaction context (is this a payroll run, a vendor payment, a customer receipt?), and produce categories that map to business accounting concepts rather than consumer spending buckets.
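Structurally, that step can be as small as a rule chain sitting between the aggregator client and the feature layer. The sketch below is one possible shape under assumed field names (`description`, `amount`, `category`); the single rule is a placeholder, and a production layer would hold many more.

```python
from typing import Callable, Iterable, List, Optional

Txn = dict
Rule = Callable[[Txn], Optional[str]]  # returns a corrected category, or None to pass


def payroll_rule(txn: Txn) -> Optional[str]:
    """Placeholder rule: outgoing payment whose descriptor mentions payroll."""
    if txn["amount"] < 0 and "PAYROLL" in txn["description"].upper():
        return "Payroll"
    return None


def reclassify_stream(txns: Iterable[Txn], rules: List[Rule]) -> List[Txn]:
    """Apply the first matching rule; keep the feed's label for audit."""
    out = []
    for txn in txns:
        corrected = None
        for rule in rules:
            corrected = rule(txn)
            if corrected is not None:
                break
        out.append({
            **txn,
            "feed_category": txn["category"],  # original, noisy signal
            "category": corrected if corrected is not None else txn["category"],
        })
    return out


feed = [
    {"description": "ACME PAYROLL 0215", "amount": -12400.0, "category": "TRANSFER_DEBIT"},
    {"description": "SYSCO CORP", "amount": -310.0, "category": "FOOD_AND_DRINK"},
]
for row in reclassify_stream(feed, [payroll_rule]):
    print(row["description"], row["feed_category"], "->", row["category"])
```

Keeping `feed_category` alongside the corrected value means every downstream feature can be audited against what the aggregator originally sent.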

The cost of not having this layer is not abstract. It is the gap between a dashboard that a business owner trusts and one they stop opening after two months.

Measuring what you're actually showing users

Before investing in fixing the categorization layer, you need to understand your current accuracy baseline. A practical approach: take a 90-day transaction sample from 20-30 of your active SMB accounts. For each transaction, record the category your app assigned and the category a human reviewer would assign given the full business context (industry, known vendors, transaction pattern). Bucket the mismatches by type: wrong business category outright, correct general category but wrong business meaning, and uncategorized. Your effective accuracy rate is the share of sampled transactions where the two categories agree.
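A small sketch of how that audit could be tallied once the reviewer labels are in. The record shape is an assumption: in particular, `general_ok` is a hypothetical flag the reviewer sets when the app's label lands in the right broad area but misses the business meaning.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Reviewed:
    app_category: Optional[str]   # what the app showed (None if uncategorized)
    reviewer_category: str        # what a human assigned with full context
    general_ok: bool = False      # reviewer flag: broad area right, business meaning wrong


def bucket(r: Reviewed) -> str:
    if not r.app_category:
        return "uncategorized"
    if r.app_category == r.reviewer_category:
        return "correct"
    return "correct_general_category" if r.general_ok else "wrong_business_category"


def accuracy_report(sample: Iterable[Reviewed]) -> Tuple[float, Counter]:
    """Return (effective accuracy, counts per bucket) for a reviewed sample."""
    counts = Counter(bucket(r) for r in sample)
    total = sum(counts.values())
    accuracy = counts["correct"] / total if total else 0.0
    return accuracy, counts


sample = [
    Reviewed("Payroll", "Payroll"),
    Reviewed("Shopping", "Cost of Goods Sold"),
    Reviewed("Home Improvement", "Cost of Goods Sold", general_ok=True),
    Reviewed(None, "Revenue"),
]
accuracy, counts = accuracy_report(sample)
print(f"accuracy={accuracy:.0%}", dict(counts))
```

Running the same report per transaction type (card, ACH, B2B vendor) shows where the errors concentrate.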

Most banking product teams who run this exercise find accuracy in the 60-75% range for their SMB population — higher for consumer-like transactions (food, entertainment), considerably lower for B2B payments, ACH transfers, and industry-specific vendors. The gap between 70% accuracy and 94% accuracy is the gap between a feature that undermines trust and one that earns it.

The data quality problem in your transaction feed is solvable. But it has to be solved deliberately, at the right layer in the stack, with tooling built for business context rather than adapted from consumer finance.
