Apps tagged with 'ai-safety'

All apps in the 'ai-safety' category. Use the filters below to narrow down your search.

Llama Guard
Llama Guard is an LLM-based input-output safeguard model geared towards Human-AI conversation use cases.
Cost / License
Free
Open Source
Application type
Large Language Model (LLM) Tool
Origin
United States
Platforms
Self-Hosted
Best alternatives are WildGuard and ShieldGemma
3 alternatives
Wardstone
Wardstone is an LLM firewall and AI guardrail API that protects AI applications from prompt attacks, harmful content, data leakage, and suspicious links in a single inference call with ~30ms latency.
Cost / License
Freemium
Proprietary
Origin
United Kingdom
Platforms
Online
Software as a Service (SaaS)
WildGuard
WildGuard is an open, lightweight moderation tool for LLM safety that achieves three goals:
Cost / License
Free
Open Source
Application type
Large Language Model (LLM) Tool
Origin
United States
Platforms
Self-Hosted
Python
Best alternatives are Statewright and Llama Guard
3 alternatives
ShieldGemma
ShieldGemma is a set of instruction-tuned models for evaluating the safety of text and images against a set of defined safety policies. You can use this model as part of a larger implementation of a generative AI application to help evaluate and prevent generative AI...
Cost / License
Free
Proprietary
Application type
Large Language Model (LLM) Tool
Origin
United States
Platforms
Self-Hosted
Google Cloud Platform
Best alternatives are Llama Guard and WildGuard
3 alternatives
Toxic Prompt RoBERTa
A text classification model that can be used as a guardrail to protect against toxic prompts and responses in conversational AI systems.
Cost / License
Free
Open Source (MIT)
Application type
Large Language Model (LLM) Tool
Origin
United States
Platforms
Self-Hosted
Best alternatives are Llama Guard and ShieldGemma
3 alternatives
Petri
Petri is an alignment auditing agent for rapid, realistic hypothesis testing. It autonomously crafts environments, runs multi-turn audits against a target model using human-like messages and simulated tools, and then scores transcripts to surface concerning behavior.
Cost / License
Free
Open Source (MIT)
Origin
United States
Platforms
Mac
Windows
Linux
Self-Hosted
