Public web platform for evaluating large language models (LLMs) through anonymous pairwise comparisons on community-submitted prompts, with crowd-sourced voting, real-time model identity reveals, and aggregate performance tracking for both open-source and proprietary models.

Clean and fast interface, relevant results and a system for refining searches. Much better than Bing.