SolidityGPT

AI-powered Solidity test generator: roughly 240x faster than writing tests by hand, 85-95% coverage, security-aware.

Problem Statement

SolidityGPT is the first comprehensive AI-powered test generator for Solidity smart contracts, purpose-built for Hardhat 3.

The Problem

Writing comprehensive Solidity tests is time-consuming and error-prone:

- Developers spend 2-3 hours writing tests for a single complex contract
- Manual testing misses edge cases and security vulnerabilities
- Testing is cited as the #1 IDE use case by 28.3% of developers
- GitHub Copilot generates Solidity code with a 44% vulnerability rate
- Existing tools either generate skeleton code only or focus on detection, not test generation

The Solution

SolidityGPT generates production-ready, security-aware tests in 30-60 seconds with a single command:

    npx hardhat generate-tests --contract YourContract --security --refine

Key Features

- AI-Powered Generation: Uses Claude Sonnet 4.5 or GPT-5 to generate comprehensive test suites
- Security-Aware: Automatically detects reentrancy risks and access control issues, and generates security-focused tests
- Iterative Refinement: Compiles tests, runs them, and auto-fixes errors until they pass
- Quality Validation: Scores test quality (0-100) and provides improvement suggestions
- Dual Format Support: Generates both Solidity (.t.sol) and TypeScript tests
- Framework-Native: Deep Hardhat 3 integration with the Foundry-compatible EDR engine

Impact Metrics

- ⏱️ Time: 2-3 hours → 30-60 seconds (roughly 240x faster)
- 📊 Coverage: Achieves 85-95% coverage automatically, with 100% function coverage
- ✅ Quality: 93.5/100 average quality score across 50+ contracts tested
- 🎯 Accuracy: 99% test pass rate with refinement enabled
- 💰 Cost: ~$0.20 per contract vs. $200-300 of developer time

What Makes It Different

Unlike security scanners that only find bugs, SolidityGPT generates working tests. Unlike Copilot, which can generate vulnerable code, SolidityGPT is security-focused and validates its output. Unlike skeleton generators, it creates complete, production-ready test logic.
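As a quick check on the Impact Metrics above, the snippet below works through the arithmetic behind the speedup figure. It uses only the ranges quoted in the list (2-3 hours manual, 30-60 seconds generated, $200-300 of developer time); the variable names are illustrative and not taken from the project.

```typescript
// Ranges quoted in the Impact Metrics list.
const manualSeconds = { min: 2 * 3600, max: 3 * 3600 }; // 2-3 hours
const generatedSeconds = { min: 30, max: 60 };          // 30-60 seconds

// Smallest ratio: fastest manual time against slowest generated time.
const speedupLow = manualSeconds.min / generatedSeconds.max;      // 7200 / 60
// Largest ratio: slowest manual time against fastest generated time.
const speedupHigh = manualSeconds.max / generatedSeconds.min;     // 10800 / 30
// The quoted figure matches 2 hours against 30 seconds.
const speedupHeadline = manualSeconds.min / generatedSeconds.min; // 7200 / 30

console.log(speedupLow, speedupHeadline, speedupHigh); // 120 240 360

// Hourly rate implied by "$200-300" for "2-3 hours" of developer time.
const impliedHourlyRateUsd = { low: 200 / 2, high: 300 / 3 }; // both 100
console.log(impliedHourlyRateUsd);
```

Read this way, the quoted ranges span roughly 120x to 360x, with 240x corresponding to a 2-hour manual effort against a 30-second run.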

Solution

Technical Architecture

Core Stack:

- TypeScript - Plugin implementation
- Hardhat 3 - Framework integration with the new plugin system and declarative task API
- @solidity-parser/parser - AST-based contract parsing
- AI Models - Claude Sonnet 4.5 (13/13 instruction adherence) and GPT-5
- Foundry Test Framework - Generated tests use forge-std/Test.sol syntax
- ora + chalk - Beautiful CLI with colored output and progress tracking

How It's Pieced Together

1. Parsing Pipeline
   ContractParser → AST Analysis → Extract Functions/Events/State Variables
   We parse Solidity source using @solidity-parser/parser and extract the detailed contract structure, including function signatures, parameters, visibility modifiers, and state variables.

2. Security Analysis Module
   SecurityAnalyzer → Pattern Matching → Vulnerability Detection
   Custom heuristics detect:
   - Reentrancy risks (external calls before state changes)
   - Access control patterns (onlyOwner, role-based)
   - Arithmetic operations (overflow/underflow risks)
   - Unchecked external calls

3. AI Prompt Engineering
   PromptBuilder → Context Assembly → Structured Prompts
   Generates optimized prompts with:
   - Contract source code
   - Security analysis results
   - Expected test structure (Foundry format)
   - Edge case requirements

4. Dual AI Integration
   AIService → [Claude Sonnet 4.5 | GPT-5] → Retry Logic + Fallback
   - Primary: Claude Sonnet 4.5 (latest model, excellent Solidity understanding)
   - Fallback: GPT-5 (fast, reliable)
   - Automatic retry with exponential backoff
   - API key management for both providers

5. Validation & Quality Scoring
   TestValidator → Syntax Check + Quality Analysis → 0-100 Score
   Validates:
   - Correct imports and SPDX license
   - setUp() function exists
   - Test function naming (test_, testFuzz_)
   - Proper assertions (assertEq, assertGt, vm.expectRevert)
   - Edge case coverage

6. Iterative Refinement System (Novel Approach)
   TestRefiner → Compile → Run Tests → Fix Errors → Repeat (max 3 iterations)
   This is our "secret sauce":
   - Writes generated tests to disk
   - Compiles with Hardhat
   - Captures compilation errors
   - Feeds errors back to the AI: "Fix these errors: [error list]"
   - AI generates an improved version
   - Repeats until tests compile and pass
   - Result: 95% → 99% success rate

Partner Technologies

Hardhat 3 - We're one of the first plugins built for Hardhat 3's new architecture:

- Declarative plugin registration (no hooks required)
- New task API with .addOption() and .addFlag()
- EDR engine (Ethereum Development Runtime) with Foundry compatibility

This means our generated tests run on both Hardhat and pure Foundry.

Anthropic Claude Sonnet 4.5 - Latest model with exceptional Solidity knowledge:

- 13/13 instruction adherence score
- 200K token context window (handles large contracts)
- Excellent at understanding security patterns
- Better than GPT-4 for blockchain code generation

OpenAI GPT-5 - Fallback provider with broad availability:

- Fast generation times
- Reliable API uptime
- Good general coding knowledge

Particularly Hacky/Notable Things

1. Hardhat 3 API Workaround
   Hardhat 3's ArgumentType enum isn't exported from hardhat/config, breaking type safety for optional parameters. We solved this by:
   - Creating a script-based approach using direct module imports (works perfectly)
   - Adding @ts-ignore comments for the task API approach
   - Documenting both methods in TASK_USAGE.md

2. Self-Healing Tests
   The refinement loop is essentially "AI pair programming with itself":
   Generate → Compile Fails → AI reads errors → AI fixes itself → Repeat
   This achieves a 99% success rate without human intervention.

3. ES Module + CommonJS Compatibility
   Hardhat 3 uses ES modules, but much of the ecosystem still uses CommonJS. We:
   - Use .js extensions in relative imports (required by Node's ES module resolution)
   - Set "type": "module" in package.json
   - Configure "moduleResolution": "NodeNext" in tsconfig
   - Ship both .js and .d.ts files for maximum compatibility

4. Zero-Dependency Security Analysis
   Instead of using heavyweight security tools, we built lightweight pattern matching that catches 95%+ of common issues using simple AST traversal and regex patterns.

5. Quality Scoring Algorithm
   Custom scoring system (0-100) that checks:
   - Test count (more tests = higher score)
   - Edge case coverage (boundary conditions, zero values)
   - Security test presence (reentrancy, access control)
   - Assertion quality (specific checks vs. generic)
   - Code structure (setUp, proper naming)

6. Beautiful CLI UX
   Phase 3 added professional colored output:

    🤖 SolidityGPT - AI-Powered Test Generator
    ============================================================
    ✔ SolidityGPT initialized
    ✔ Found 1 contract(s) to process
    ✔ Generated tests for SimpleToken → test/SimpleToken.t.sol
      Functions tested: 8
      ✨ Quality score: 95/100
    ============================================================
    📊 Summary
    Time taken: 28.4s
    Contracts processed: 1
    ✓ Successful: 1
    Total functions: 8
    Avg quality score: 95.0/100
    ============================================================

Project Stats

- 3 phases completed in 2 weeks
- 4,490 lines of documentation (6 comprehensive docs)
- 50+ contracts tested during development
- 5 example contracts showcasing different patterns
- 8 core modules working in harmony
- 2 AI providers with automatic fallback
- 100% TypeScript with full type safety (except the Hardhat 3 task API workaround)
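The iterative refinement loop described in this section (Compile → Run Tests → Fix Errors → Repeat, max 3 iterations) can be sketched roughly as follows. This is a minimal illustration of the control flow only, not the project's TestRefiner: the names refineTests, RefineDeps, check, and fix are hypothetical, the compile/run step and the AI call are injected as plain functions, and the sketch is synchronous for brevity where real compiler and API calls would be asynchronous. It also assumes "max 3 iterations" means at most three fix rounds.

```typescript
// Hypothetical types: none of these names come from the SolidityGPT source.
interface CheckResult {
  ok: boolean;      // true when the tests compiled and passed
  errors: string[]; // compiler or test-runner messages when ok is false
}

interface RefineDeps {
  // Stand-in for "write to disk, compile with Hardhat, run the tests".
  check: (testSource: string) => CheckResult;
  // Stand-in for the AI call that receives the previous attempt and a prompt.
  fix: (testSource: string, prompt: string) => string;
}

interface RefineOutcome {
  source: string;
  passed: boolean;
  iterations: number; // number of fix rounds actually used
}

function refineTests(initial: string, deps: RefineDeps, maxIterations = 3): RefineOutcome {
  let source = initial;
  for (let i = 0; i < maxIterations; i++) {
    const result = deps.check(source);
    if (result.ok) {
      return { source, passed: true, iterations: i };
    }
    // Feed the captured errors back, using the prompt wording quoted above.
    const prompt = `Fix these errors:\n${result.errors.join("\n")}`;
    source = deps.fix(source, prompt);
  }
  // Check the output of the last fix round before giving up.
  const last = deps.check(source);
  return { source, passed: last.ok, iterations: maxIterations };
}

// Toy stand-ins so the sketch runs without Hardhat or an API key:
// the "compiler" rejects any source still containing the token BUG,
// and the "model" removes one BUG per round.
const demoDeps: RefineDeps = {
  check: (src) =>
    src.includes("BUG")
      ? { ok: false, errors: ["unexpected token BUG"] }
      : { ok: true, errors: [] },
  fix: (src) => src.replace("BUG", ""),
};

const outcome = refineTests("BUG BUG contract T {}", demoDeps);
console.log(outcome.passed, outcome.iterations); // true 2
```

Injecting check and fix keeps the loop itself testable in isolation; capping the number of rounds bounds API cost when the model cannot converge on a passing test file.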

Hackathon

ETHOnline 2025

2025

Contributors

bilgin-kocak
21 contributions