Multi-tenant architecture, integration engineering, scalability work, and proprietary algorithm development qualify under IRC Section 41. Vertical SaaS, horizontal platforms, marketplaces, workflow automation, and API-first companies qualify right now. Section 174 amortization makes documenting the credit more important than ever.
Most SaaS companies that qualify do not think of their engineering work as research. But if your team is engineering multi-tenant data isolation under performance constraints, building integration layers against legacy systems with unknown behavior, designing search or recommendation algorithms where the outcome is uncertain at the start, or scaling architecture under load conditions that have no obvious solution, the R&D credit likely applies right now.
Since 2022, Section 174 requires capitalization and amortization of research and experimental expenditures over 5 years for domestic R&D and 15 years for foreign R&D. This applies to most SaaS engineering costs and significantly increases current-year taxable income. The R&D credit partially offsets this impact by directly reducing tax liability on the same expenses. SaaS companies that previously skipped the credit because the documentation burden seemed disproportionate to the benefit now have a different calculus: the documentation is required for Section 174 anyway, and the credit captures real value from work that would otherwise just create a deferred deduction.
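To make the timing effect concrete, here is a minimal sketch of the amortization arithmetic as the paragraph above describes it. It assumes straight-line amortization that begins at the midpoint of the year the cost is incurred, so the first and last tax years each receive half of an annual share. The $1,000,000 figure and the function name `amortization_schedule` are hypothetical and for illustration only.

```python
def amortization_schedule(amount, years):
    """Straight-line schedule with a midpoint (half-year) convention.

    Returns one deduction per tax year. The list has years + 1 entries
    because the first and last tax years each get half of an annual share.
    """
    annual = amount / years
    return [annual / 2] + [annual] * (years - 1) + [annual / 2]


# Hypothetical $1,000,000 of research spending in a single year.
domestic = amortization_schedule(1_000_000, 5)   # 5-year domestic period
foreign = amortization_schedule(1_000_000, 15)   # 15-year foreign period

print("Domestic first-year deduction:", domestic[0])
print("Foreign first-year deduction:", round(foreign[0], 2))
```

Under these assumptions, only a small fraction of the domestic spend is deductible in the year it is incurred, which is why the current-year taxable income rises.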
To meet the permitted purpose test, the work must aim to develop or improve the functionality, performance, reliability, or quality of a process, technique, formula, or software component. SaaS companies meet this test through engineering faster query performance, more reliable multi-tenant data isolation, lower-latency real-time sync, more efficient search and recommendation systems, more robust integration architecture, or scalable concurrency under load. Experimental failure counts. Failed scaling approaches, rejected architecture patterns, and integration designs that did not meet performance thresholds all contribute qualifying research expenses.
A vertical SaaS team building a clinical workflow platform engineers a custom multi-tenant data isolation pattern to satisfy HIPAA segregation requirements without sacrificing query performance. Their first approach using row-level security in Postgres meets isolation requirements but creates unacceptable query latency at scale. A second approach using schema-per-tenant fails on cross-tenant analytics. A third approach combining schema isolation with a materialized analytics layer meets both targets after four iterations. All iterations qualify because the intent throughout was to improve technical performance under engineering uncertainty.
This prong is met by any SaaS team developing a technically better architecture, integration layer, algorithm, or platform component. Backend engineers, platform engineers, data engineers, and ML engineers all perform work that satisfies this test as part of their standard scope.
To be technological in nature, the work must rely on principles of computer science, mathematics, engineering, or physical science. SaaS engineering is inherently grounded in these disciplines: distributed systems theory, database systems, algorithm design, software architecture, and cryptography all satisfy this prong. Business decisions about pricing, packaging, customer segmentation, and go-to-market strategy do not qualify, but the engineering underlying the platform mechanics does.
A workflow automation platform engineers a novel DAG execution engine drawing on graph theory, distributed systems consensus, and queue scheduling research. A vertical SaaS team designs a custom search ranking algorithm using information retrieval theory and statistical learning principles. Both draw on recognized scientific and mathematical foundations and satisfy the technological prong directly.
The threshold is low for SaaS engineering because the scientific foundation is inherent to the discipline. Distributed systems, databases, algorithms, and software architecture all rest on established computer science principles.
To meet the technical uncertainty test, the work must aim to eliminate uncertainty about the capability or method of achieving a technical result. SaaS development is dense with this kind of uncertainty: whether a multi-tenant database design will scale to the target tenant count, whether a real-time sync architecture will hold consistency guarantees under partial network failures, whether a recommendation algorithm will achieve the target relevance metric at the target latency, or whether an integration layer can hold throughput against a legacy system with poorly documented behavior.
An API-first platform company develops a novel webhook delivery system and does not know at the start whether their queue architecture and retry logic can achieve the target delivery success rate under load conditions involving customer endpoint flapping. The engineering team runs systematic load tests, adjusts the queueing and retry behavior through multiple iterations, and validates throughput against the defined reliability requirement. The uncertainty about the technical method is eliminated through the experimental process.
SaaS engineering frequently involves uncertainty about both capability (can this scale at all) and method (which architecture will hold the SLA). Either form of uncertainty qualifies.
To meet the process of experimentation test, the work must involve a process of evaluating alternatives to eliminate technical uncertainty. This does not require a formal lab or a dedicated research team. In SaaS development, the experimental process is typically the engineering workflow itself: designing candidate architectures, running benchmarks and load tests, evaluating performance against defined criteria, and iterating on the data model, the algorithm, or the system architecture. Contemporaneous documentation of this process is the foundation of a defensible R&D credit study.
A two-sided marketplace team evaluates three different matching algorithm approaches before deploying their production matching engine. Each approach is benchmarked against historical transaction data with defined relevance and latency targets. Results are documented in architecture decision records and benchmark spreadsheets. The systematic evaluation of alternatives is the process of experimentation. The documentation of that process is what makes the credit defensible under examination.
Architecture decision records, benchmark logs, A/B test results against defined performance criteria, and load test comparisons all constitute experimental processes under IRC Section 41.
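As one illustration of what such a record can look like, the sketch below scores measured results for several candidate approaches against targets defined in advance and keeps the failures alongside the pass. The candidate names, metric values, thresholds, and the `Trial` and `evaluate` names are all hypothetical; a real team would substitute its own measurements and criteria.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class Trial:
    candidate: str          # name of the approach under test
    relevance: float        # measured relevance score (higher is better)
    p95_latency_ms: float   # measured 95th-percentile latency


def evaluate(trials, min_relevance, max_latency_ms):
    """Return one record per trial, keeping failures as well as passes."""
    records = []
    for t in trials:
        record = asdict(t)
        record["meets_targets"] = (
            t.relevance >= min_relevance
            and t.p95_latency_ms <= max_latency_ms
        )
        records.append(record)
    return records


# Hypothetical measurements for three candidate approaches.
trials = [
    Trial("approach_a", relevance=0.71, p95_latency_ms=40.0),
    Trial("approach_b", relevance=0.83, p95_latency_ms=310.0),
    Trial("approach_c", relevance=0.81, p95_latency_ms=95.0),
]
records = evaluate(trials, min_relevance=0.80, max_latency_ms=120.0)
print(json.dumps(records, indent=2))
```

Keeping the rejected candidates in the output matters here: the record of what was tried and why it failed is part of the evidence that alternatives were evaluated.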
For the full four-part test explanation with examples across industries, see the main R&D Tax Credit page.
Each sub-sector below includes the qualifying activities, the typical expense breakdown, and the primary exclusion.
An $8,000,000 ARR clinical workflow platform serving multi-specialty medical practices identified a structural problem: their existing row-level-security multi-tenant pattern was meeting HIPAA data isolation requirements but creating query latency that made the analytics layer unusable for their largest customers. Their backend platform team spent 11 months evaluating four alternative tenant isolation architectures (schema-per-tenant, hybrid schema with shared analytics, partitioned tables with tenant-aware routing, and a custom data routing layer) across a defined benchmark cohort of 14 representative workload patterns drawn from production traffic. The team built two working implementations and ran adversarial isolation tests on each before committing to the final design.
The work grew naturally from product engineering, not labeled research. Architecture decision records, the benchmark logs comparing the four candidate architectures across the 14 workload patterns, and the rejected implementation branches all formed contemporaneous proof of experimentation. The team described the project as "fixing the analytics layer." aecre's technical interview process identified the qualifying experimental structure within that description and built the proof of experimentation documentation around it.
A Series B contract intelligence platform serving in-house legal teams identified a gap during enterprise pilots: their existing pipeline using off-the-shelf NLP models was extracting standard contract clauses well but failed reliably on indemnification, limitation of liability, and IP assignment clauses where layout and language varied significantly across contract types. Their machine learning and platform engineering team spent 13 months developing a custom extraction pipeline, evaluating three alternative model architectures (a fine-tuned encoder approach, a retrieval-augmented generation pipeline, and a custom span-based extractor) and four data labeling strategies across a defined evaluation cohort of 1,200 enterprise contracts spanning eight contract types. They built working implementations of each and measured precision, recall, and processing cost.
The team had no internal R&D classification for the work. They considered it "fixing extraction quality." But the documented technical uncertainty about whether any of the candidate model architectures would meet the precision threshold required for legal review use, the systematic evaluation of alternative architectures and labeling strategies against the 1,200-contract evaluation cohort, and the outside legal annotation contractor engagement (contract research is generally includable at 65% of the amount paid) all met the criteria for qualifying research expenses under IRC Section 41.
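The 65% figure in this example reflects how contract research is generally counted: only that share of the amount paid to an outside contractor is included, while in-house wages and supplies are counted in full. A simplified sketch of the arithmetic follows. The dollar amounts and the function name are hypothetical, and which costs actually qualify depends on the facts of each project.

```python
CONTRACT_RESEARCH_PERCENT = 65  # includable share of contract research costs


def qualified_research_expenses(wages, supplies, contract_research):
    """Sum in-house costs plus the includable share of contract research."""
    includable_contract = contract_research * CONTRACT_RESEARCH_PERCENT / 100
    return wages + supplies + includable_contract


# Hypothetical figures, in dollars.
total = qualified_research_expenses(
    wages=900_000, supplies=50_000, contract_research=200_000
)
print(total)
```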
A 28-person workflow automation platform serving operations teams identified a structural problem: their existing DAG execution engine handled simple stateless workflows correctly but could not hold idempotency guarantees on long-running workflows that included external API calls with side effects. Customer-facing failures during partial network outages were eroding platform trust. Their platform engineering team spent 10 months designing a novel durable execution architecture, evaluating four alternative approaches (event-sourced state replay, Saga compensation patterns, custom checkpoint-and-resume engine, and a hybrid model) across a defined adversarial test cohort of 22 failure scenarios spanning network partitions, API timeouts, and worker crashes. They built working implementations of two of the four candidates and ran adversarial chaos tests on each.
The work was performed under contract pressure from enterprise customers requiring specific reliability SLAs. The team described the work as "rebuilding the execution layer" and never framed it as research. Architecture decision records, the rejected implementation branches, the chaos test logs, and the engineering memoranda comparing recovery semantics across approaches all formed contemporaneous proof of experimentation that aecre identified during the technical interview.
A 19-person API-first infrastructure platform identified a customer-facing reliability problem: their standard webhook delivery pattern was achieving acceptable success rates under normal conditions but degrading sharply when customer endpoints were unhealthy or unreachable, with retries either overloading customer infrastructure or dropping events silently. Their distributed systems team spent eight months designing a novel adaptive delivery architecture, evaluating five alternative approaches (exponential backoff with jitter, adaptive rate based on customer endpoint health signals, circuit breakers with shadow traffic, queue-per-customer with isolation, and a hybrid attestation model) across a defined benchmark cohort of 16 customer endpoint behavior patterns drawn from production telemetry. The team built three working implementations and ran sustained load tests against simulated unhealthy endpoints.
The team thought of the work as a reliability project, not research. But the documented technical uncertainty about whether any of the candidate delivery architectures would hold the customer reliability SLA without overloading downstream infrastructure, the systematic evaluation of alternatives across the 16-pattern benchmark, and the comparative load test results all met the criteria for qualifying research expenses under IRC Section 41.
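Of the five alternatives named in this example, exponential backoff with jitter is the simplest to sketch. The version below uses the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling. It is an illustrative baseline only, not the team's implementation: `backoff_delays`, `deliver_with_retries`, and every parameter value are assumptions.

```python
import random
import time


def backoff_delays(max_attempts, base=1.0, cap=60.0, rng=random.random):
    """Yield one 'full jitter' delay in seconds per attempt.

    The ceiling doubles each attempt up to `cap`. The delay is the ceiling
    scaled by rng(), so retries for many events spread out over time.
    """
    for attempt in range(max_attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling


def deliver_with_retries(send, max_attempts=5, sleep=time.sleep, **backoff):
    """Call send() until it returns True or attempts run out.

    `send` and `sleep` are injected so the loop can be exercised without a
    network or real waiting. Returns the attempt number that succeeded,
    or None if every attempt failed.
    """
    delays = backoff_delays(max_attempts, **backoff)
    for attempt, delay in enumerate(delays, start=1):
        if send():
            return attempt
        if attempt < max_attempts:
            sleep(delay)  # wait before the next attempt, not after the last
    return None
```

A fixed schedule like this ignores endpoint health entirely, which is the limitation that motivates the adaptive alternatives listed in the example.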
A two-sided marketplace serving a regional services category identified a structural problem during scaling: their default greedy matching algorithm was creating long wait times for buyers in dense urban areas with high request volume, while leaving suppliers idle in adjacent zones. The supply imbalance was eroding marketplace liquidity. Their platform engineering and data science team spent 12 months developing a novel multi-objective matching engine, evaluating three alternative algorithm families (optimization-based bipartite matching with constraint relaxation, reinforcement-learning-based dispatch, and a heuristic dispatch with predictive supply rebalancing) across a defined simulation cohort of 18 historical demand patterns spanning peak and off-peak conditions. The team built working implementations of each and ran adversarial simulations measuring match success rate, wait time, supplier utilization, and computational cost.
The team described the work in operations and matching terms (dispatch quality, supply utilization, wait time) and never connected it to research and development tax framing. aecre's technical interview process identified the qualifying experimental structure across the 12-month program and built the documentation file around the existing engineering artifacts: simulation logs, algorithm comparison memoranda, and the production A/B test results that the team had already produced.
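The weakness of a greedy baseline like the one in this example can be shown on a toy case. The sketch below compares a greedy assignment with the lowest-cost assignment found by exhaustive search, using a hypothetical 2x2 cost matrix in which rows are buyers and columns are suppliers. The brute-force search stands in for a real optimization-based matcher and is only workable at this tiny scale. All names and numbers are assumptions.

```python
from itertools import permutations


def total_cost(cost, assignment):
    """Sum the cost of giving buyer b the supplier assignment[b]."""
    return sum(cost[b][s] for b, s in enumerate(assignment))


def greedy_match(cost):
    """Give each buyer, in arrival order, the cheapest still-free supplier."""
    free = set(range(len(cost[0])))
    assignment = []
    for row in cost:
        best = min(sorted(free), key=lambda s: row[s])
        free.remove(best)
        assignment.append(best)
    return assignment


def optimal_match(cost):
    """Try every one-to-one assignment (square matrix) and keep the cheapest."""
    n = len(cost)
    best = min(permutations(range(n)), key=lambda p: total_cost(cost, p))
    return list(best)


# Hypothetical wait-time costs: cost[buyer][supplier].
cost = [
    [1, 2],
    [1, 10],
]
greedy = greedy_match(cost)
optimal = optimal_match(cost)
print("greedy:", greedy, total_cost(cost, greedy))
print("optimal:", optimal, total_cost(cost, optimal))
```

The first buyer's locally best choice leaves the second buyer with an expensive match, so the greedy total is worse than the optimal one.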
Most SaaS pass-through entities (S-Corps, partnerships, LLCs) see the full credit benefit at individual rates. Nearly 40 states stack additional credits on top of the federal credit. The federal number is the floor.
The feasibility conversation takes 30 minutes. We assess your qualifying activities, estimate credit value, and tell you plainly whether a study makes sense for your company. No commitment, no cost.
Book a Free Assessment. We respond within one business day. Partner-led from first conversation through filing.