Test Community Network

AI detection and authenticity

Last updated: 12 May 2026 · Reviewed by Tim Burnett (Admin)

TLDR

AI detection can be a useful signal, but it does not by itself prove authorship or secure validity. In assessment, the deeper question is whether the task still gives trustworthy evidence of what the learner can do. The stronger trend in the evidence is towards assessment design, clearer expectations, and process evidence rather than detector-led enforcement. Detection failures and false accusations are not edge cases; they are part of the main risk picture.

Definition

AI detection in assessment is the use of software or review methods to flag possible generative AI use in learner work. Authenticity is the more important assessment concept: whether the work genuinely shows the learner’s own capability and whether the task still supports the decision being made. Detection may help identify risk, but it does not settle whether AI use was acceptable for the construct being assessed. In practice, the authenticity question is often about assessment conditions, permitted support, and whether the design still makes the intended capability visible.

Why It Matters

This matters because AI has made weak assumptions about authorship harder to defend. Coursework, portfolios, essays, projects, and take-home tasks all depend on what support is allowed and what evidence of individual capability is being collected. For regulated qualifications, authenticity is a validity and public-trust issue as well as a misconduct issue. Where non-exam assessment cannot be made secure and authentic, the format itself may come under pressure. In classroom and workplace-style assessment, the practical response increasingly points towards better task design, revision history, checkpoints, and clearer learner communication rather than relying on a detector alone.

Recent TCN reporting is especially relevant: colleges are already spending large sums on detectors that still produce inaccurate flags and raise privacy concerns, while a separate TCN item shows how pervasive student AI use has become. That combination suggests the issue is not simply tool adoption, but whether the assessment design and appeal process can stand up when AI use is already normal.

Key Concepts

- **Authenticity**: whether the work genuinely evidences the learner’s own capability.
- **Detection**: a tool-based attempt to flag possible AI involvement, usually probabilistically rather than conclusively.
- **Permitted AI use**: support that is allowed because it does not undermine the construct.
- **Construct**: the ability the assessment is meant to evidence.
- **Assessment conditions**: the task setting, tools, supervision, checkpoints, and evidence trail that shape what can be trusted.
- **Revision history**: process evidence showing how work changed over time, which can be more informative than a single final submission.
- **LLM-resistant design**: task design intended to make generic AI shortcuts less useful while preserving the intended construct.

What Experts Agree On

The strongest evidence points to a shared conclusion: generic AI detection is too blunt to carry the whole authenticity burden. It may help as one signal, but it does not answer the core assessment question of whether the learner’s work remains valid evidence of capability. There is also broad convergence that authenticity is not simply a policing problem. Better task design, clearer policy, and visible process evidence are more defensible foundations for decision-making than after-the-fact inference from a detector. The direction of travel in the literature and practitioner material is towards design and governance, not detection alone. A further point of agreement is that “authentic” does not always mean “AI-free”. Some assessments should resist generic AI use; others should explicitly include AI because that is closer to the construct being assessed.

What Is Contested

The main disagreement is how far authenticity should be protected through restriction versus redesign. Tight controls can help in narrow tasks, but they may also move assessment away from the kinds of writing, research, and digital work people now do in practice. There is also no settled answer on where AI use should be classed as prohibited assistance, permitted support, or part of the construct. That boundary depends on the purpose of the assessment, the stakes, and the quality of the evidence trail. Vendor messaging tends to frame detection as a solution, but that remains a market signal rather than independent validation for high-stakes use.

Risks

- False positives leading to unfair sanctions if detector output is treated as proof.
- False reassurance if a detector replaces stronger authenticity controls.
- Validity loss if anti-AI controls push the assessment away from the intended construct.
- Policy confusion when permitted, prohibited, and expected AI use are not clearly distinguished.
- Procurement risk if suppliers are chosen on claims rather than independent evidence.
- Appeals and reputation risk if decisions cannot be explained clearly and consistently.
- Inclusivity risk if performance varies by language group, writing style, or adversarial technique.
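The false-positive risk above is easier to see with some base-rate arithmetic. The sketch below is illustrative only: the function name `flag_breakdown` and every number passed to it are hypothetical assumptions, not measurements of any real detector or cohort. It estimates how many flags would land on unaided work, given an assumed share of AI-written submissions, an assumed detection rate, and an assumed false-positive rate.

```python
def flag_breakdown(n_submissions, ai_share, sensitivity, false_positive_rate):
    """Estimate expected detector flags for a cohort.

    All inputs are assumptions supplied by the caller:
      ai_share            - fraction of submissions that really are AI-written
      sensitivity         - fraction of AI-written work the detector flags
      false_positive_rate - fraction of unaided work the detector flags anyway
    """
    ai_written = n_submissions * ai_share
    unaided = n_submissions - ai_written

    true_flags = ai_written * sensitivity          # correctly flagged
    false_flags = unaided * false_positive_rate    # unaided work flagged in error
    total_flags = true_flags + false_flags

    # Share of all flags that point at unaided work (0.0 if nothing is flagged).
    share_false = false_flags / total_flags if total_flags else 0.0

    return {
        "true_flags": true_flags,
        "false_flags": false_flags,
        "share_of_flags_that_are_false": share_false,
    }


if __name__ == "__main__":
    # Hypothetical cohort: 1,000 submissions, 10% AI-written,
    # detector catches 80% of those and wrongly flags 2% of unaided work.
    result = flag_breakdown(1000, 0.10, 0.80, 0.02)
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

With these made-up inputs, 18 of the 98 expected flags (about 18%) would fall on learners who did not use AI. The point is not the specific figures but the shape of the calculation: even a low false-positive rate can produce a meaningful number of wrong flags when most work is unaided, which is why a flag is better treated as a prompt for review than as proof.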

Good Practice

1. **State the construct first.** Define what capability the assessment is meant to evidence.
2. **Classify AI use explicitly.** Decide whether AI is prohibited, permitted, expected, or part of the construct.
3. **Map the evidence trail.** Identify where authorship, judgement, and decision-making can be observed.
4. **Choose controls that fit the stakes.** For higher-stakes assessment, use

Options or Comparison

| Option | Best fit | Strengths | Trade-offs |
|---|---|---|---|
| **Prohibit AI** | High-stakes tasks where unaided performance is essential | Clear boundary, easier to police in some settings | Can encourage over-restriction and may not reflect real-world practice |
| **Permit AI with rules** | Tasks where support is allowed but judgement must remain learner-owned | More realistic, clearer learner expectations | Needs precise guidance and evidence of what was done independently |
| **Build AI into the construct** | Assessments meant to test AI literacy, critique, verification, or responsible use | Aligns with emerging practice and workplace relevance | Requires careful design so the construct is still visible |
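One way to make the three options concrete is to record the chosen option per task and let it determine what a detector flag means. The sketch below is a hypothetical illustration, not a recommended rule set: the names `AIUse` and `response_to_flag`, and the wording of each response, are assumptions introduced here. It reflects the page's position that a flag never settles the question by itself and that its relevance depends on the construct.

```python
from enum import Enum


class AIUse(Enum):
    """The three options from the comparison table (labels are illustrative)."""
    PROHIBITED = "prohibit AI"
    PERMITTED_WITH_RULES = "permit AI with rules"
    PART_OF_CONSTRUCT = "build AI into the construct"


def response_to_flag(policy: AIUse, detector_flagged: bool) -> str:
    """Suggest a next step when a detector does or does not flag a submission.

    Hypothetical mapping: in no branch does a flag lead straight to a sanction.
    """
    if not detector_flagged:
        # Absence of a flag is not evidence of authenticity either.
        return "no action from detector output; normal marking continues"
    if policy is AIUse.PART_OF_CONSTRUCT:
        # AI use is expected, so the flag carries no misconduct signal.
        return "flag not informative; assess how the learner used and verified AI"
    if policy is AIUse.PERMITTED_WITH_RULES:
        return "compare process evidence with the permitted-use rules"
    # AIUse.PROHIBITED
    return "human review of process evidence; the flag alone is not proof"


if __name__ == "__main__":
    for policy in AIUse:
        print(f"{policy.value}: {response_to_flag(policy, detector_flagged=True)}")
```

Writing the mapping down, in code or in a policy table, forces the decision about what a flag triggers to be made before any individual case arises, which supports consistent and explainable outcomes at appeal.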

Example in Practice

A university programme has used a detector to review coursework, but recent reporting shows the tool is both expensive and flawed. The programme team rethinks the policy: rather than relying on the detector as a verdict machine, it adds short checkpoints and a brief oral follow-up for higher-stakes assignments so assessors can see the student’s own judgement. That makes the process more defensible than a blanket reliance on software output.

Key Sources

- MIT resource explaining why AI detectors are unreliable.
- TCN summary of reporting on flawed and costly AI detectors in higher education.
- TCN summary of pervasive student AI use and its implications for detection and policy.

Vendor Landscape

Vendor claims in this area still tend to overstate certainty. Detection suppliers often present their tools as decisive or highly accurate, but the stronger evidence base says that output is probabilistic and context-dependent. The market lesson is not that detectors are useless; it is that they cannot carry the authenticity burden on their own.

FAQs

### Can AI detectors prove that a learner used ChatGPT?

No. They may raise a concern, but they do not prove authorship or misconduct on their own.

### What is the difference between AI detection and authenticity?

Detection is a tool-based flag; authenticity is the assessment judgement about whether the work genuinely evidences the learner’s own capability.

### Should AI be banned in assessment?

Not automatically. The right answer depends on the construct: some tasks need unaided work, some can permit rules-based support, and some should deliberately include AI use.

### What is the safest way to deal with AI misuse in coursework?

Use clear permitted-use rules, add process evidence, and avoid treating a detector as proof on its own.

### Are AI detectors fair across different learners?

Not reliably enough to assume so. Subgroup effects, language variation, and false positives remain key concerns.

Last Reviewed By

Tim Burnett (Admin)

Suggested Citation

Test Community Network. "AI detection and authenticity." TCN AI & Assessment Wiki. Last reviewed 2026-05-12. https://www.testcommunity.network/wiki/ai-detection-and-authenticity.html

