How CodeClaim Works
A transparent explanation of our analysis process, the technology behind it, and its limitations.
1. What We Analyse
CodeClaim connects to your GitHub repository and reads commit diffs only. We do not download, store, or analyse your full source code. A commit diff shows what changed between two versions of a file, which is enough to understand what work was done without exposing your entire codebase.
We fetch commits within the date range you specify (typically your company's accounting period). For each commit, we extract the diff and the commit message.
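The diff-only approach can be illustrated with a minimal sketch. The function below is a hypothetical illustration (not CodeClaim's actual pipeline, which is not published): it parses a unified diff and summarises what changed per file — the kind of compact signal a commit diff carries without exposing the full codebase.

```python
import re

def summarise_diff(diff_text: str) -> dict:
    """Summarise a unified diff: lines added/removed per file.

    Hypothetical sketch of diff-level analysis; CodeClaim's real
    extraction logic is not published.
    """
    summary = {}
    current = None
    for line in diff_text.splitlines():
        m = re.match(r"^\+\+\+ b/(.+)", line)
        if m:  # start of a new file's diff
            current = m.group(1)
            summary[current] = {"added": 0, "removed": 0}
        elif current and line.startswith("+") and not line.startswith("+++"):
            summary[current]["added"] += 1
        elif current and line.startswith("-") and not line.startswith("---"):
            summary[current]["removed"] += 1
    return summary

# Example: a one-file diff yields a per-file change summary.
diff = (
    "--- a/auth.py\n"
    "+++ b/auth.py\n"
    "@@ -1,2 +1,3 @@\n"
    "-def login():\n"
    "+def login(user):\n"
    "+    check(user)\n"
)
summarise_diff(diff)  # → {'auth.py': {'added': 2, 'removed': 1}}
```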
2. How AI Classification Works
CodeClaim uses AI (Anthropic Claude) to analyse your code commits and classify R&D activities. Commit diffs (not full source code) are sent to the AI for analysis.
For each commit, the AI attempts to determine whether the work represents qualifying R&D activity under HMRC's definition. It considers whether the commit involved seeking an advance in science or technology, resolving scientific or technological uncertainty, or work that a competent professional in the field could not readily deduce.
Commits are grouped into features (logical units of work), and each feature receives a classification and a confidence score.
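The grouping step can be sketched as follows. This is a simplified, hypothetical heuristic — grouping commits that touch overlapping files, with per-feature confidence taken as the mean of per-commit scores. CodeClaim's actual grouping and scoring logic is not published.

```python
def group_into_features(commits):
    """Group commits that touch overlapping files into 'features'.

    `commits` is a list of dicts: {"sha", "files", "confidence"}.
    Hypothetical sketch: real feature grouping would be more nuanced
    (commit messages, timing, branch structure, etc.).
    """
    features = []  # each: {"files": set, "commits": [...]}
    for commit in commits:
        files = set(commit["files"])
        # Attach to the first feature sharing any file, else start a new one.
        match = next((f for f in features if f["files"] & files), None)
        if match:
            match["files"] |= files
            match["commits"].append(commit)
        else:
            features.append({"files": files, "commits": [commit]})
    # Feature confidence = mean of its commits' confidence scores.
    for f in features:
        scores = [c["confidence"] for c in f["commits"]]
        f["confidence"] = sum(scores) / len(scores)
    return features
```

Under this heuristic, two commits touching `auth.py` merge into one feature while an unrelated `ui.py` commit forms its own.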
The AI Model
We use Anthropic's Claude language models. The initial classification pass uses a smaller, faster model for efficiency. The report narrative generation uses a larger model for more detailed, nuanced writing. Neither model has been specifically fine-tuned on R&D tax credit claims; they are general-purpose language models prompted with HMRC's published guidance.
3. What the AI Cannot Do
Understanding these limitations is essential before relying on any output from CodeClaim.
- It cannot determine eligibility. Whether your activities qualify for R&D tax relief is a judgement that depends on your specific technical context, industry baseline, and the state of knowledge at the time the work was done. The AI does not have this context. Only a qualified tax adviser with knowledge of your business can make this determination.
- It cannot guarantee accuracy. AI classification is probabilistic. It will sometimes classify qualifying work as non-qualifying, and non-qualifying work as qualifying. Confidence scores reflect model certainty, not real accuracy. A 90% confidence score does not mean there is a 90% chance the classification is correct.
- It cannot understand business context. The AI only sees code diffs and commit messages. It does not know your business goals, your market, your prior R&D work, or what a competent professional in your field would already know. These are all critical factors in determining R&D eligibility.
- It cannot replace a competent professional. HMRC requires that R&D claims identify a "competent professional" who can attest to the technological uncertainties involved. CodeClaim is not a competent professional. Your CTO, lead engineer, or technical director should review and validate the technical claims in the report.
- It cannot account for HMRC's evolving interpretation. HMRC's guidance on what constitutes qualifying R&D, particularly in software, is subject to change. The AI's training data has a knowledge cutoff and may not reflect the most recent HMRC decisions, tribunal rulings, or policy changes.
- It cannot calculate costs precisely. Staff time allocation is estimated from commit timestamps using statistical heuristics, not actual payroll records. These estimates must be validated against your real time-tracking and payroll data by your accountant.
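One common family of timestamp heuristics — shown here purely as an illustrative assumption, since CodeClaim's actual method is not published — sums the gaps between consecutive commits, treating any gap longer than a threshold as a session break:

```python
from datetime import datetime

def estimate_hours(timestamps, max_gap_hours=2.0, session_start_hours=0.5):
    """Estimate working hours from commit timestamps.

    Sums gaps between consecutive commits; gaps longer than
    `max_gap_hours` are treated as session breaks and replaced by a
    flat `session_start_hours` allowance for the work leading up to
    the next commit. Illustrative heuristic only — estimates like
    this must be validated against real time-tracking and payroll
    data.
    """
    ts = sorted(timestamps)
    if not ts:
        return 0.0
    total = session_start_hours  # allowance before the first commit
    for prev, cur in zip(ts, ts[1:]):
        gap = (cur - prev).total_seconds() / 3600.0
        total += gap if gap <= max_gap_hours else session_start_hours
    return total

# Example: commits at 09:00, 10:00, then 16:00 the same day.
# Gaps: 1h (counted) and 6h (session break → 0.5h), plus the
# initial 0.5h allowance → 2.0 estimated hours.
commits = [
    datetime(2024, 1, 1, 9),
    datetime(2024, 1, 1, 10),
    datetime(2024, 1, 1, 16),
]
estimate_hours(commits)  # → 2.0
```

Note how sensitive the result is to the chosen threshold and allowance — which is precisely why these figures are estimates, not payroll facts.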
4. What the Reports Contain
Generated reports include a technical narrative structured around HMRC's Additional Information Form (AIF) questions, covering:
- Field of science or technology
- Baseline knowledge and existing capabilities
- Target advancement sought
- Scientific or technological uncertainties
- How uncertainties were overcome
Reports also include a project summary, cost allocation estimates, and supporting evidence derived from commit data.
Every report is a draft. It is a starting point for your professional tax adviser, not a finished document ready for HMRC. Your adviser should review, revise, and validate every section before any claim is submitted.
5. Data Handling
Raw code diffs are deleted immediately after analysis. Only the resulting classification and report data are retained.
Commit diffs are sent to Anthropic (our AI provider) for analysis. Anthropic does not use this data for training. The diffs are processed in memory and not retained by Anthropic after the API call completes.
Classification results (which features qualify, confidence scores, narratives) are stored in our database so you can access your reports. You can request deletion of all your data at any time.
All data is processed and stored in the EU (AWS London, eu-west-2). See our Privacy Policy for full details.
6. Why Professional Review Is Required
HMRC is actively scrutinising R&D tax credit claims, with approximately 1 in 5 claims facing an enquiry in 2024. Software claims are under particular focus, with tighter definitions of qualifying activities and mandatory digital submissions requiring more granular detail.
Submitting an inaccurate or inflated claim can result in the claim being rejected, penalties of up to 100% of the incorrectly claimed amount, and in serious cases, criminal investigation. A qualified tax adviser will:
- Validate that the technical narrative accurately reflects qualifying R&D under current HMRC guidance
- Verify cost allocation figures against actual payroll records
- Ensure the claim structure and documentation meet current compliance requirements
- Identify any areas where the AI may have over- or under-classified activities
- Provide professional sign-off that strengthens the claim in the event of an HMRC enquiry
7. Our Role
CodeClaim is a software tool operated by Jelifish Ltd. We are not a tax agent, claims agent, or tax adviser. We do not submit claims on your behalf, advise on eligibility, or represent you in dealings with HMRC.
Our role is to automate the time-consuming parts of the R&D claim preparation process: reading code, classifying commits, and drafting initial narratives. The judgement calls, the validation, and the submission remain your responsibility and that of your professional advisers.
For full legal terms, see our Terms of Service.