
CrowdStrike & Meta unveil open benchmarks for AI in cyber defence
CrowdStrike has announced the launch of CyberSOCEval, a suite of benchmarks developed in partnership with Meta to evaluate artificial intelligence performance in real-world security operations centres (SOCs).
The new benchmarks utilise Meta's CyberSecEval framework and CrowdStrike's threat intelligence data to measure and compare the effectiveness of large language models (LLMs) within cyber defence scenarios. The intention is to provide clear and open standards for assessing AI capabilities in critical security workflows such as incident response, malware analysis, and threat comprehension.
Benchmarks for AI in the SOC
Security operations centres often confront a high volume of alerts and constantly evolving threats. While AI adoption is seen as a way to improve efficiency and response times, many organisations remain early in their implementation, particularly regarding LLMs. The absence of standard benchmarks has made it difficult to determine which AI models and use cases are effective against real-world attacks.
Through CyberSOCEval, CrowdStrike and Meta aim to provide a framework that enables both businesses and model developers to identify which AI systems deliver tangible improvements in SOC performance and defence capabilities. The benchmarks leverage observed adversarial tactics and expert-designed security scenarios to test AI under operational conditions, allowing for the validation of readiness against genuine threats.
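To make the idea of scenario-based benchmarking concrete, the sketch below shows one way an evaluation harness could score an LLM against an expert-designed answer key. This is a minimal, hypothetical illustration only: the scenario format, scoring rule, and all names here are assumptions for demonstration, not the actual CyberSOCEval API or data.

```python
# Hypothetical sketch of scoring an LLM on security scenarios.
# Scenario contents and the scoring rule are illustrative assumptions,
# not the real CyberSOCEval format.
from dataclasses import dataclass

@dataclass
class Scenario:
    prompt: str    # e.g. an incident-response question about an alert
    expected: str  # the correct finding, per an expert-designed answer key

def evaluate(model, scenarios):
    """Return the fraction of scenarios the model answers correctly."""
    correct = sum(1 for s in scenarios if model(s.prompt).strip() == s.expected)
    return correct / len(scenarios)

# Stub standing in for a real LLM call, for demonstration only.
def stub_model(prompt):
    return "T1059" if "PowerShell" in prompt else "unknown"

scenarios = [
    Scenario("Alert: encoded PowerShell spawned by winword.exe. MITRE technique?", "T1059"),
    Scenario("Alert: outbound DNS tunnelling detected. MITRE technique?", "T1071"),
]

print(evaluate(stub_model, scenarios))  # 0.5 with this stub
```

In practice a suite like this would swap the stub for a real model call and use graded or LLM-judged scoring rather than exact string matching, but the shape of the loop, fixed scenarios drawn from observed adversarial tactics scored against a reference answer, is what makes results comparable across models.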
The initiative is designed to facilitate an industry-wide shift toward transparency and evidence-driven adoption of AI within cyber security teams. By making the suite open source, the companies believe it will drive wider collaboration and continuous improvement across the sector.
Industry perspectives
Vincent Gonguet, Director of Product, GenAI at Superintelligence Labs at Meta, outlined the objectives and potential of the collaboration.
"At Meta, we're committed to advancing and maximizing the benefits of open source AI – especially as large language models become powerful tools for organizations of all sizes. Our collaboration with CrowdStrike introduces a new open source benchmark suite to evaluate the capabilities of LLMs in real world security scenarios. With these benchmarks in place, and open for the security and AI community to further improve, we can more quickly work as an industry to unlock the potential of AI in protecting against advanced attacks, including AI-based threats."
Daniel Bernard, Chief Business Officer at CrowdStrike, highlighted the strategic significance of the partnership for the cybersecurity community.
"When two leaders like CrowdStrike and Meta come together, it's larger than collaboration, it's about setting the direction of cybersecurity for the AI era. By combining CrowdStrike's adversary intelligence and leadership in AI-native cybersecurity, with Meta's AI research expertise and vast dataset, we're helping customers – and cybersecurity as a sector – adopt AI systems with confidence. This partnership sets a new bar for how AI in the SOC should be built and deployed, empowering defenders to stay ahead of the adversary."
Both companies emphasised collaboration and shared benchmarking as the way to meet new challenges in the field.
Availability and scope
The CyberSOCEval benchmark suite is now accessible for use by the global AI and security community. It is built to allow organisations to test LLMs in environments that reflect the pressures and complexities faced by modern SOCs. Model developers are expected to use the benchmarks as reference points for improving AI products, with the broader aim of enhancing return on investment and operational effectiveness in security teams.
CrowdStrike and Meta have stated that open access to the benchmarks is intended to encourage further development and shared refinement within the industry, as AI evolves into both a tool for defenders and a potential vector for advanced threats.