Problem Statement
This project was inspired by the following paper: https://arxiv.org/abs/2307.09009. The paper details how ChatGPT's behavior and performance have been changing over time. The unpredictability of these changes makes it difficult for companies to integrate the models into their pipelines. The researchers therefore developed a set of benchmarks and ran them on two snapshots of OpenAI's models. This project makes that benchmarking process recurrent and stores the results on-chain for immutability and transparency.
Solution
The project is split into three modules:
- Front-end: one-page application made with Vue.js 3 with GPT-3.5 Turbo and GPT-4 benchmarks
- Smart Contract: benchmark-storing contract deployed on Gnosis
- LLMDrift Scripts: scripts meant for Bacalhau, running the LLMDrift benchmarks on the current gpt-3.5-turbo and gpt-4 models, and writing the result on the Gnosis chain. These scripts were based on the "lchen001/LLMDrift" repo, developed by the researchers of the aforementioned paper.
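To make the flow of the scripts module more concrete, here is a minimal Python sketch of one benchmark run: query a model, score the answers, and build a compact record that could be written on-chain. It is an illustration under stated assumptions, not the project's actual code. The names `BenchmarkResult`, `score`, `run_benchmark`, and `record_digest` are hypothetical, the exact-match scoring rule is a simplification, and the model call is passed in as an `ask` function so that no particular API client is assumed. The on-chain write itself is left out because the contract interface is not described here.

```python
import hashlib
import json
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """One benchmark run (hypothetical record layout, not the contract's schema)."""
    model: str         # e.g. "gpt-3.5-turbo" or "gpt-4"
    benchmark: str     # label of the benchmark that was run
    accuracy_bps: int  # accuracy in basis points; integers are simpler to store on-chain
    timestamp: int     # Unix time of the run


def score(predictions, references):
    """Return case-insensitive exact-match accuracy in basis points (0..10000)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("benchmark must contain at least one item")
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct * 10_000 // len(references)


def run_benchmark(model, benchmark, questions, references, ask, now=None):
    """Run one benchmark; `ask(model, question)` is a caller-supplied model wrapper."""
    predictions = [ask(model, q) for q in questions]
    timestamp = int(time.time()) if now is None else now
    return BenchmarkResult(model, benchmark, score(predictions, references), timestamp)


def record_digest(result):
    """Return a deterministic SHA-256 hex digest of a result's canonical JSON form."""
    payload = json.dumps(asdict(result), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Running `run_benchmark` on a schedule for each model would produce one record per run. Storing the accuracy as an integer, optionally together with the digest, keeps the stored data small while still letting anyone verify that published results match what was recorded.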
Hackathon
ETHGlobal Paris
2024
Contributors
- codethazine
46 contributions