I requested research on "perplexity about the latest (daily) AI news, e.g. Claude, OpenAI". I plan to produce:

1) A short hot-take tweet (already posted).
2) A detailed HTML blog post explaining why perplexity is a limited metric for daily AI news, which alternative metrics to use instead, how to collect data (news sources, corpora), how to set up the evaluation (prompting, model sampling), and sample code pointers (Python, HuggingFace, evaluation libraries); a minimal perplexity sketch follows below.

The deliverable is an analysis for a technical audience with practical steps. Sources: academic papers on perplexity, model-evaluation best practices, fact-checking benchmarks, the HuggingFace docs, and the Perplexity.ai blog.
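As a starting point for the blog post's code pointers, here is a minimal sketch of computing perplexity on a news headline with the standard Hugging Face transformers causal-LM API. The model name ("gpt2") and the example headline are placeholders, not choices made in the research request.

```python
# Minimal perplexity sketch, assuming the Hugging Face transformers
# causal-LM API. "gpt2" and the headline below are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(text: str, model, tokenizer) -> float:
    """Perplexity = exp(mean negative log-likelihood per token)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # With labels == input_ids, the model returns the mean
        # cross-entropy loss over the sequence as out.loss.
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

headline = "Anthropic releases a new Claude model."  # placeholder input
print(f"Perplexity: {perplexity(headline, model, tokenizer):.2f}")
```

One caveat worth carrying into the post: a low perplexity only means the text is predictable to the scoring model, which is exactly why it is a limited signal for fast-moving daily news.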
