Veritas

Veritas: The AI verifier that uses ZK-SNARKs to prove code originality without revealing secrets.

Problem Statement

Veritas is an algorithmic verifier designed to resolve the deadlock at the heart of modern software patent litigation. In today's high-stakes legal disputes, the core evidence, the source code, is also a company's most valuable trade secret. A claimant cannot prove infringement without forcing the suspect to expose their entire proprietary codebase during legal discovery, a process that is slow, prohibitively expensive, and carries serious security risks for both parties. This stalemate often prevents valid patent claims from being pursued and allows IP theft to go unchecked, because proving the theft requires the victim to put their own secrets at risk. Veritas breaks this deadlock by providing a new standard for evidence, one based on cryptographic proof rather than forced disclosure.

Our solution is an impartial digital expert witness that can mathematically prove algorithmic similarity without either party ever revealing their source code. When a dispute arises, both the claimant and the suspect submit their respective codebases to the Veritas system in separate, secure environments. The system does not store or read the code in a human-readable format. Instead, an AI pipeline acts like a specialized compiler and analyzes the computational intent of the software. It isolates individual functions and algorithms and de-biases them by stripping away superficial syntax, variable names, and comments. This process reduces the code to its logical form, represented as a universal, language-agnostic structure.

From this logical structure, Veritas generates a compact, binary "semantic fingerprint" for each algorithm. This fingerprint is the key innovation: it is a mathematical representation of the patented process, not just of the text that describes it. This allows us to compare algorithms at a conceptual level.

Our scoring model then analyzes these fingerprints, going beyond simple one-to-one matching. It can detect whether a single patented algorithm has been fragmented and split across multiple functions in the suspect's code, a common technique to evade detection. The system also weighs similarity by the complexity of the code, on the basis that the theft of a core, intricate algorithm is far more significant than the theft of a simple helper function.

The final output of the Veritas system is a zero-knowledge proof (a ZK-SNARK). This is a small, tamper-evident cryptographic file that serves as a certificate of the analysis. The ZK-SNARK gives the court a simple, verifiable "yes" or "no" answer: it mathematically proves the statement "The claimant's patented algorithms are present in the suspect's codebase with a similarity score exceeding the legal threshold of X%." The proof is generated without revealing the source code, the semantic fingerprints, or even the exact final score. It is a piece of evidence that respects the confidentiality of both parties.

Ultimately, Veritas transforms software patent litigation from a subjective, risky, and expensive battle of human experts into an objective, secure, and efficient process of mathematical verification. It allows patent holders to defend their intellectual property without fear of exposing their own secrets, while protecting innocent companies from being forced to disclose their codebases. By replacing legal friction with cryptographic proof, Veritas sets a higher standard for truth and accountability in the digital age.
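The de-biasing step described above (stripping variable names and comments and reducing code to a language-agnostic structure) can be illustrated with a minimal sketch. It assumes Python input and uses the standard-library ast module. The names _Renamer, _to_jsonable, and normalize, the v0, v1, ... placeholder scheme, and the JSON layout are hypothetical choices made for illustration, not the project's actual normalizer.

```python
import ast
import json


class _Renamer(ast.NodeTransformer):
    """Replace locally bound identifiers with placeholders v0, v1, ..."""

    def __init__(self, bound):
        self.bound = bound    # names defined inside the analyzed source
        self.mapping = {}     # original name -> placeholder

    def _rename(self, name):
        if name not in self.bound:
            return name       # leave builtins and globals such as len() untouched
        return self.mapping.setdefault(name, f"v{len(self.mapping)}")

    def visit_FunctionDef(self, node):
        node.name = self._rename(node.name)
        return self.generic_visit(node)

    def visit_arg(self, node):
        node.arg = self._rename(node.arg)
        return self.generic_visit(node)

    def visit_Name(self, node):
        node.id = self._rename(node.id)
        return node


def _to_jsonable(node):
    """Convert an AST into nested dicts and lists, dropping line/column info."""
    if isinstance(node, ast.AST):
        out = {"type": type(node).__name__}
        for field, value in ast.iter_fields(node):
            out[field] = _to_jsonable(value)
        return out
    if isinstance(node, list):
        return [_to_jsonable(item) for item in node]
    return node  # plain values: str, int, None, ...


def normalize(source):
    """Return a deterministic JSON string describing the structure of source."""
    tree = ast.parse(source)  # comments are already discarded by the parser
    bound = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
    tree = _Renamer(bound).visit(tree)
    # default=repr covers constants that JSON cannot encode (bytes, complex, ...)
    return json.dumps(_to_jsonable(tree), sort_keys=True, default=repr)
```

Under these assumptions, two functions that differ only in identifier names and comments normalize to the same JSON string, while a change of operator produces a different one. Docstrings are kept by this sketch; a fuller normalizer would likely strip them as well.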

Solution

Veritas is an algorithmic verifier built as a multi-stage pipeline in Python. The process begins with our custom-built deterministic normalizer. Rather than relying on simple obfuscation, we parse raw Python functions into their native Abstract Syntax Tree (AST) using Python's built-in ast module. A custom visitor class then traverses this tree, de-biasing all identifiers and converting the language-specific structure into a universal, language-agnostic JSON representation of the code's logic. This structured JSON, not the raw code, is the input to the next stage, which gives our reasoning engine a consistent, purified logical input.

The core of our semantic analysis is powered by a locally hosted deepseek-coder-v2 model, run via Ollama on our own GPU. This was a crucial choice to ensure that no proprietary code ever leaves the machine, mirroring a trustless environment. We leveraged principles from the Artificial Superintelligence Alliance, focusing on creating a specialized reasoning system. The LLM, guided by a prompt with few-shot examples, acts as a "logic-to-pseudocode" compiler, translating the universal AST into a human-readable algorithm. The resulting pseudocode is then converted into a vector embedding using OllamaEmbeddings.

The final fingerprinting and scoring happen in Python using NumPy. We convert the high-dimensional embedding into a binary fingerprint using a simple sign-based hash. For the similarity comparison, we designed and implemented a custom scoring algorithm from scratch. It includes a Reconstruction Model to detect fragmented code plagiarism and a Weighted Average based on code length to reflect business impact.

The entire architecture is designed to feed into a zero-knowledge backend. We've mapped out the ZK circuit logic using the Noir language. The final output of the Veritas agent is not just a score, but a verifiable ZK-SNARK attesting to the algorithmic similarity, ready to be posted on-chain as a "Certificate of Originality."
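The fingerprinting and scoring stage can be sketched with NumPy as follows. The sign-based hash and the code-length weighting come from the description above. Everything else is an assumption made for illustration: the function names, the use of bit agreement as the similarity measure, and in particular the reconstruction step, which here simply sums the embeddings of up to max_fragments suspect functions before hashing. The source does not specify how the Reconstruction Model works, so this is a stand-in rather than the project's algorithm.

```python
from itertools import combinations

import numpy as np


def sign_fingerprint(embedding):
    """Sign-based hash: one bit per embedding dimension (1 if positive)."""
    return (np.asarray(embedding, dtype=float) > 0).astype(np.uint8)


def bit_similarity(fp_a, fp_b):
    """Fraction of positions where two equal-length fingerprints agree."""
    return float(np.mean(fp_a == fp_b))


def best_match(claim_emb, suspect_embs, max_fragments=2):
    """Best similarity of one claimant function against the suspect code.

    Hypothetical reconstruction: besides single suspect functions, also try
    sums of up to max_fragments suspect embeddings, so that an algorithm
    split across several functions can still be matched.
    """
    claim_fp = sign_fingerprint(claim_emb)
    best = 0.0
    for k in range(1, max_fragments + 1):
        for group in combinations(suspect_embs, k):
            merged = np.sum(group, axis=0)  # element-wise sum of the fragments
            best = max(best, bit_similarity(claim_fp, sign_fingerprint(merged)))
    return best


def weighted_score(claims, suspect_embs, max_fragments=2):
    """Length-weighted mean of best-match similarities.

    claims: list of (embedding, code_length) pairs for the claimant's
    functions; longer functions contribute more to the final score.
    """
    sims = [best_match(emb, suspect_embs, max_fragments) for emb, _ in claims]
    weights = [length for _, length in claims]
    return float(np.average(sims, weights=weights))
```

Because best_match takes a maximum over many candidate combinations, its result is biased upward as the suspect codebase grows, so any threshold applied to the final score would need to be calibrated with that in mind.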

Hackathon

ETHGlobal New Delhi

2025

Contributors

ShreeSinghi
11 contributions