Methodology

How Veraclue evaluates LLM security

A structured, repeatable approach to LLM security assessment that produces evidence-linked findings for enterprise review workflows.


Built for repeatable security assurance

Our methodology combines structured evaluation techniques with evidence generation to produce consistent, defensible security assessments.

Structured Test Suites

Repeatable evaluation scenarios with defined parameters, expected outcomes, and evidence collection points.

Evidence Generation

Complete audit trails for every finding, including prompts, responses, and reproduction steps.

Framework Alignment

Direct mapping to enterprise compliance frameworks for audit preparation and governance workflows.

Re-test Validation

Consistent re-testing methodology to validate remediation efforts and track improvement over time.

What Veraclue evaluates

Prompt Attacks

Systematic testing for injection attacks and prompt manipulation

Jailbreak & Policy Bypass

Testing whether safety controls and policies can be bypassed

Data Leakage

Assessment of sensitive information exposure risks

Fairness Analysis

Evaluation of model outputs for bias and fairness across demographic groups

Privacy Compliance

Assessment of privacy-related risks and compliance with data protection requirements

Transparency Check

Validation of model transparency, explainability, and disclosure practices

How evaluations are structured

1. Test Definition

Each evaluation begins with clearly defined test parameters:

  • Scope and boundaries
  • Model configuration parameters
  • Test case specifications
  • Success criteria
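
The four parameters above can be captured in a single record that is checked before any test runs. This is a minimal sketch, not Veraclue's actual schema: the class name `EvaluationDefinition`, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationDefinition:
    """Hypothetical record holding the four test parameters listed above."""

    scope: str             # scope and boundaries of the evaluation
    model_config: dict     # model configuration parameters
    test_cases: tuple      # test case specifications
    success_criteria: str  # what outcome counts as a pass

    def validate(self) -> None:
        """Raise ValueError if any of the four parameters is left empty."""
        for name in ("scope", "model_config", "test_cases", "success_criteria"):
            if not getattr(self, name):
                raise ValueError(f"{name} must be defined before execution")


# Illustrative values only.
definition = EvaluationDefinition(
    scope="chat endpoint only",
    model_config={"temperature": 0.7, "max_tokens": 4096},
    test_cases=("prompt-001",),
    success_criteria="system prompt is not disclosed",
)
definition.validate()  # passes silently when every parameter is defined
```

Freezing the record means a definition cannot be reassigned field by field after it is created, which helps keep a later re-test comparable to the baseline.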

2. Execution Protocol

Standardized execution ensures reproducibility:

  • Deterministic test sequences
  • Fixed random seeds
  • Controlled environment variables
  • Consistent timing measurements
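
As an illustration of how a fixed seed yields a deterministic test sequence, the sketch below orders test cases with a locally seeded generator. The function name `ordered_test_sequence` and the case IDs are hypothetical; only Python's standard `random.Random` is used.

```python
import random


def ordered_test_sequence(case_ids, seed):
    """Return the test cases in an order that depends only on the IDs and the seed."""
    rng = random.Random(seed)    # local generator, leaves global random state untouched
    sequence = sorted(case_ids)  # canonical starting order, independent of input order
    rng.shuffle(sequence)        # seeded shuffle: same seed, same order
    return sequence


first_run = ordered_test_sequence(["prompt-003", "prompt-001", "prompt-002"], seed=42)
second_run = ordered_test_sequence(["prompt-002", "prompt-003", "prompt-001"], seed=42)
print(first_run == second_run)
```

Sorting before shuffling is a deliberate choice here: it makes the sequence reproducible even when the test cases are loaded in a different order.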

3. Evidence Collection

Complete audit trail documentation:

  • Request/response logging
  • Environment state capture
  • Reproduction steps
  • Context preservation

4. Finding Classification

Structured categorization and scoring:

  • Risk category assignment
  • Severity scoring methodology
  • Confidence level assessment
  • Reproducibility rating

Risk categories

Critical Risk

Immediate security threats that could result in data exposure, system compromise, or unauthorized access

Examples:

  • Prompt injection leading to system prompt disclosure
  • Bypass of safety controls enabling harmful outputs
  • Unauthorized tool access or data exfiltration

High Risk

Significant security weaknesses that could be exploited with moderate effort or specific conditions

Examples:

  • Partial safety control bypass
  • Data leakage through indirect means
  • Tool misuse with limited impact

Medium Risk

Security concerns that require attention but have limited immediate impact or exploitability

Examples:

  • Information disclosure through context manipulation
  • Inconsistent safety responses
  • Minor tool behavior issues

Low Risk

Minor security observations or potential improvements with minimal current risk

Examples:

  • Verbose error messages
  • Inconsistent response formatting
  • Minor documentation gaps

How findings are interpreted

Severity Assessment

Each finding is evaluated across multiple dimensions:

  • Impact Potential: High/Med/Low
  • Exploit Complexity: Low/Med/High
  • Reproducibility: Certain/Likely/Possible
  • Scope Affected: System/Component/Local
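
The four dimensions and their allowed values can be enforced with a small validated record. This is a sketch under assumptions: the class and field names are hypothetical, and because the page does not state how the dimensions combine into one severity score, none is computed.

```python
from dataclasses import dataclass

# Allowed values per dimension, taken from the list above.
ALLOWED_VALUES = {
    "impact_potential": ("High", "Med", "Low"),
    "exploit_complexity": ("Low", "Med", "High"),
    "reproducibility": ("Certain", "Likely", "Possible"),
    "scope_affected": ("System", "Component", "Local"),
}


@dataclass(frozen=True)
class SeverityAssessment:
    """Hypothetical record of one finding's rating on each dimension."""

    impact_potential: str
    exploit_complexity: str
    reproducibility: str
    scope_affected: str

    def __post_init__(self):
        # Reject any value outside the documented scale for its dimension.
        for name, allowed in ALLOWED_VALUES.items():
            value = getattr(self, name)
            if value not in allowed:
                raise ValueError(f"{name} must be one of {allowed}, got {value!r}")


assessment = SeverityAssessment("High", "Low", "Certain", "System")
```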

Confidence Scoring

Confidence reflects how consistently a finding reproduces across test iterations:

  • 95-100%: Certain finding
  • 80-94%: High confidence
  • 60-79%: Moderate confidence
  • <60%: Low confidence
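
The bands above translate directly into a lookup. In this minimal sketch the function name `confidence_label` is hypothetical, and fractional values that fall between the listed integer bands (for example 94.5%) are assigned to the lower band, which is an assumption rather than a stated rule.

```python
def confidence_label(consistency_pct: float) -> str:
    """Map a test-consistency percentage to the confidence bands listed above."""
    if not 0 <= consistency_pct <= 100:
        raise ValueError("consistency_pct must be between 0 and 100")
    if consistency_pct >= 95:
        return "Certain finding"
    if consistency_pct >= 80:
        return "High confidence"
    if consistency_pct >= 60:
        return "Moderate confidence"
    return "Low confidence"


print(confidence_label(97))  # Certain finding
print(confidence_label(72))  # Moderate confidence
```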

Contextual Analysis

Findings are interpreted within deployment context:

  • Model deployment environment
  • Existing security controls
  • Operational constraints
  • Compliance requirements

Stakeholder Impact

Analysis considers affected stakeholders:

  • End users and data subjects
  • System administrators
  • Security teams
  • Compliance officers

Evidence references and traceability

Complete Audit Trail

Test Suite Initialized (2024-01-15 14:32:18 UTC)

Evaluation parameters defined and environment prepared

Evidence: suite-config.json, environment-state.json

Test Case Executed (2024-01-15 14:32:45 UTC)

Prompt injection test case #001 submitted to model endpoint

Evidence: prompt-001.json, request-headers.json, response-001.json

Finding Detected (2024-01-15 14:33:02 UTC)

System prompt disclosure identified in response analysis

Evidence: finding-F-001.json, response-analysis.json, evidence-trace.json

Finding Validated (2024-01-15 14:33:15 UTC)

Reproduction confirmed with consistent results across iterations

Evidence: validation-results.json, reproducibility-report.json

Evidence Package Structure

Evidence Package Example
{
  "finding_id": "PI-1",
  "test_suite": "prompt-injection-v3",
  "evidence": {
    "prompt": "...",
    "response": "...",
    "context": "...",
    "environment": {
      "model": "gpt-4-turbo",
      "version": "2024-01-15",
      "parameters": {
        "temperature": 0.7,
        "max_tokens": 4096
      }
    }
  }
}

  • Structured JSON format
  • Complete audit trail
  • Machine-readable
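
Because the package is machine-readable, a reviewer can check it for completeness before accepting it. The sketch below treats every key shown in the example as required, which is an assumption; the function name `missing_fields` is hypothetical, and the `"..."` placeholders are carried over from the example unchanged.

```python
import json


def missing_fields(package: dict) -> list:
    """Return dotted paths of keys from the example layout that are absent."""
    missing = [k for k in ("finding_id", "test_suite", "evidence") if k not in package]
    evidence = package.get("evidence", {})
    missing += [
        f"evidence.{k}"
        for k in ("prompt", "response", "context", "environment")
        if k not in evidence
    ]
    environment = evidence.get("environment", {})
    missing += [
        f"evidence.environment.{k}"
        for k in ("model", "version", "parameters")
        if k not in environment
    ]
    return missing


raw = """
{"finding_id": "PI-1", "test_suite": "prompt-injection-v3",
 "evidence": {"prompt": "...", "response": "...", "context": "...",
              "environment": {"model": "gpt-4-turbo", "version": "2024-01-15",
                              "parameters": {"temperature": 0.7, "max_tokens": 4096}}}}
"""
package = json.loads(raw)
print(missing_fields(package))  # an empty list means the layout is complete
```

If a parent key such as `evidence` is absent, its children are reported as missing too, so the output lists everything that still needs to be supplied.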

Reproducibility and repeatability

Deterministic Testing

Ensuring consistent results across test runs:

  • Fixed random seeds for all stochastic operations
  • Consistent model configuration parameters
  • Controlled environment state
  • Standardized test sequences

Variation Analysis

Accounting for model behavior variability:

  • Multiple test iterations per scenario
  • Statistical significance testing
  • Behavior pattern analysis
  • Confidence interval calculation
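
To make the confidence interval step concrete, the sketch below computes a Wilson score interval for an observed bypass rate. The page does not say which interval method is used; Wilson is one common choice for proportions and is an assumption here, as is the function name.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a proportion; z=1.96 corresponds to roughly 95%."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


low, high = wilson_interval(successes=70, trials=100)
print(f"observed 70/100, interval {low:.3f} to {high:.3f}")
```

A wide interval signals that more iterations are needed before the observed rate can be reported with confidence.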

Cross-Validation

Validating findings across conditions:

  • Multiple model versions
  • Different temperature settings
  • Varying context windows
  • Alternative prompt formulations

Re-test Protocol

Standardized re-testing methodology:

  • Pre- and post-remediation testing
  • Delta analysis and comparison
  • Regression testing for fixes
  • Long-term behavior monitoring

Deterministic vs probabilistic behavior

Understanding Model Behavior Variability

Large language models exhibit both deterministic and probabilistic behaviors. Our methodology accounts for this variability while ensuring consistent security assessment outcomes.

Deterministic Behaviors

Behaviors expected to respond consistently under identical conditions:

  • Safety policy enforcement
  • Tool usage restrictions
  • Content filtering rules
  • System prompt protection

Probabilistic Behaviors

Variable responses requiring statistical analysis:

  • Creative response generation
  • Contextual understanding
  • Reasoning and inference
  • Novel problem solving

Testing Approach for Variable Behavior

Statistical Testing Methodology

For probabilistic behaviors, we employ statistical testing:

  • Sample Size: n=100 iterations
  • Confidence Level: 95%
  • Success Threshold: >70% success rate
  • P-value: <0.05
  • Effect Size: Cohen's d >0.5
  • Power Analysis: 80% minimum
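
One way these parameters could fit together is a one-sided exact binomial test: given n=100 iterations, is the observed success count high enough to conclude the true rate exceeds 70% at p<0.05? This reading is an assumption, since the page does not name the test used, and the function names are hypothetical. Effect size and power are not covered by this sketch.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, null_rate: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, null_rate), computed exactly."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return sum(
        comb(trials, k) * null_rate**k * (1 - null_rate) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def exceeds_threshold(successes: int, trials: int = 100,
                      null_rate: float = 0.70, alpha: float = 0.05) -> bool:
    """True when the data reject 'true rate <= null_rate' at significance alpha."""
    return binomial_p_value(successes, trials, null_rate) < alpha


print(exceeds_threshold(80))  # 80 of 100 successes
print(exceeds_threshold(75))  # 75 of 100 successes
```

Under this reading, an observed rate slightly above 70% is not enough on its own; the count has to be high enough that chance variation around a true 70% rate is an unlikely explanation.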

Behavior Classification

Findings are classified based on behavior consistency:

Consistent bypass (100% success rate)
Frequent bypass (70-99% success rate)
Occasional bypass (30-69% success rate)
Rare bypass (<30% success rate)
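
The classification above can be expressed as a simple mapping from iteration counts to a label. In this sketch the function name `classify_bypass` is hypothetical, and rates that fall between the listed integer bands (for example 69.5%) are assigned to the lower band as an assumption.

```python
def classify_bypass(successes: int, trials: int) -> str:
    """Map bypass successes out of trials to the behavior classes listed above."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    if successes == trials:
        return "Consistent bypass"  # 100% success rate
    rate = 100 * successes / trials
    if rate >= 70:
        return "Frequent bypass"    # 70-99%
    if rate >= 30:
        return "Occasional bypass"  # 30-69%
    return "Rare bypass"            # <30%


print(classify_bypass(100, 100))  # Consistent bypass
print(classify_bypass(45, 100))   # Occasional bypass
```

A run with zero successes also lands in the lowest band under the listed thresholds; callers may prefer to treat that case as no finding at all.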

Framework mapping approach

SOC 2 Type II

Service Organization Control 2 reporting for security, availability, and confidentiality

Mapped Controls:

  • CC6.1 - Logical and Physical Access Controls
  • CC7.1 - System Operations
  • A1.1 - Availability

Mapping Approach:

Direct mapping of findings to Trust Services Criteria

NIST AI RMF

Artificial Intelligence Risk Management Framework from NIST

Mapped Controls:

  • RM-1 - Risk Assessment
  • RM-2 - Risk Treatment
  • GA-4 - Risk Assessment

Mapping Approach:

Alignment with AI risk management functions and categories

ISO 27001

International standard for Information Security Management Systems

Mapped Controls:

  • A.12 - Operations Security
  • A.14 - System Acquisition, Development and Maintenance
  • A.18 - Compliance

Mapping Approach:

Mapping to information security management system requirements and controls

Re-test methodology

Continuous Validation Process

Our re-test methodology ensures that security improvements are validated and tracked over time.

1. Baseline Establishment

Initial assessment establishes baseline security posture with complete evidence documentation.

2. Remediation Validation

Re-test after fixes to validate vulnerability resolution and identify any regressions.

3. Progress Tracking

Comparative analysis shows security posture improvement over time with measurable metrics.

Re-test Protocol Requirements

Same test parameters as baseline
Consistent environment state
Complete evidence collection
Before/after comparison
Delta analysis reporting
Trend analysis over time
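
The before/after comparison and delta analysis can be sketched as set operations over finding IDs. The function name, the three result labels, and the sample IDs are hypothetical; the sketch assumes the same finding keeps the same ID between the baseline and the re-test.

```python
def delta_analysis(baseline_ids: set, retest_ids: set) -> dict:
    """Compare findings detected at baseline with those detected on re-test."""
    return {
        "resolved": sorted(baseline_ids - retest_ids),    # fixed since baseline
        "persisting": sorted(baseline_ids & retest_ids),  # still reproducible
        "new": sorted(retest_ids - baseline_ids),         # possible regressions
    }


report = delta_analysis(
    baseline_ids={"F-001", "F-002"},
    retest_ids={"F-002", "F-003"},
)
print(report)
```

Findings that appear only in the re-test deserve particular attention, since they may indicate a regression introduced by the fix.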

Who this methodology is for

Security Teams

Structured evaluation methodology for security reviews, vulnerability assessments, and compliance workflows.

  • Repeatable security assessments
  • Evidence-based vulnerability management
  • Compliance audit preparation

Engineering Teams

Technical evaluation approach for development workflows and security integration.

  • Reproducible testing procedures
  • Repeatable export and review workflows
  • Remediation validation workflows

Risk and Governance Teams

Framework-aligned methodology for risk assessment and governance workflows.

  • Framework compliance validation
  • Risk assessment documentation
  • Audit trail maintenance

What Veraclue does not claim

Our methodology is designed for security assurance, not comprehensive AI safety or alignment testing.

Scope Limitations

  • Not a complete AI safety solution
  • Does not replace human security review
  • Limited to defined evaluation scenarios
  • Not a real-time monitoring system

Technical Constraints

  • Model-specific evaluation scope
  • Dependent on model API availability
  • Cannot predict all failure modes
  • Limited to current model capabilities

Compliance Boundaries

Our methodology supports compliance workflows but does not guarantee compliance outcomes. Organizations must:

  • Review findings in organizational context
  • Implement appropriate security controls
  • Maintain ongoing security monitoring

From methodology to evidence

Our structured methodology produces comprehensive evidence packs that document every aspect of the security evaluation process.

Methodology Output

  • Structured test execution
  • Evidence collection
  • Finding classification
  • Framework mapping

Evidence Pack Contents

  • Assurance report (PDF)
  • Evidence references (JSON)
  • Framework mapping (JSON/CSV)
  • Structured exports (CSV)

View Sample Evidence Pack

See how our methodology translates into audit-ready security artifacts

Ready to see our methodology in action?

Get a comprehensive methodology walkthrough

Schedule a detailed walkthrough of our evaluation methodology and see how it produces structured, evidence-linked security assessments.