Decentralizing Federated Learning Systems for Active Cyber Defense

Project Overview

Omnipotent Analytics (OA) has been awarded a grant to research and develop active security measures built on advanced AI technologies. The project is a collaboration between a Swedish governmental body, OA, and a multinational manufacturing company. Beyond its initial scope, the project could also foster business relations between traditional companies and web3 protocols, with broad applications for future innovation within traditional markets.

Our proposal is for the development of a large language model (LLM) capable of functioning as both a simulated bad actor (for signal and data collection) and an active security system that can mitigate and report active hacks. The core of this project is based on federated learning principles, aiming to deploy small AI agents on edge devices that collect data while responding to security threats. Alongside this, a central AI system would be responsible for aggregating the information, which can then be used to continually upgrade and refine the edge models, enhancing their effectiveness and fully actualizing a federated learning system.

OA aim to document our journey, from initial conception all the way to a fully realised, decentralized and self-teaching AI security product. We invite you to join us as we explore the tools available for development, alongside the insights we glean from the ecosystems of decentralized data collection and AI training.


Key Components

In order to conceptualise the scope of this research, a number of core principles and components must be defined, alongside the goals for both research and development.

Federated Learning:

Federated learning is a sub-field of machine learning in which multiple entities collaboratively train a model. This decentralizes the training process while also enhancing overall privacy and security, because sensitive data remains on local devices.

In this scenario, the entities responsible for training are what we will refer to as Edge AI Agents. These agents are deployed on edge devices and relay the data they collect while responding to security threats back to a central AI hub, which processes that data and uses it to improve the effectiveness of the overall model.
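As a rough illustration of the aggregation step, the sketch below shows weighted federated averaging in plain Python. The project has not yet settled on an aggregation method, so this choice is an assumption, and the names `EdgeUpdate` and `federated_average` are hypothetical. Each Edge AI Agent is assumed to send locally trained weights and a sample count rather than raw data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EdgeUpdate:
    """Hypothetical payload an Edge AI Agent sends to the central hub."""
    agent_id: str
    weights: List[float]  # locally trained model parameters
    num_samples: int      # number of local examples used in training


def federated_average(updates: List[EdgeUpdate]) -> List[float]:
    """Combine edge updates into one global weight vector.

    Each agent's weights are scaled by its share of the total samples,
    so agents that trained on more data have more influence.
    """
    if not updates:
        raise ValueError("need at least one update")
    dim = len(updates[0].weights)
    if any(len(u.weights) != dim for u in updates):
        raise ValueError("all updates must have the same number of weights")
    total = sum(u.num_samples for u in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_weights = [0.0] * dim
    for u in updates:
        share = u.num_samples / total
        for i, w in enumerate(u.weights):
            global_weights[i] += share * w
    return global_weights


if __name__ == "__main__":
    updates = [
        EdgeUpdate("edge-a", [1.0, 2.0], num_samples=100),
        EdgeUpdate("edge-b", [3.0, 4.0], num_samples=300),
    ]
    print(federated_average(updates))  # [2.5, 3.5]
```

In this sketch the hub would send the averaged weights back to the agents for the next round of local training. A real system would also need secure transport and protection against poisoned updates, neither of which is shown here.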

Active Security Measures:

Our LLM will be designed with a set of abilities tailored towards improving security. It will be trained to simulate potential malicious actors, providing insights into the attack vectors available to them and highlighting vulnerabilities within the system. To facilitate this, the model will be integrated with a honeypot such as an NGINX server: a seemingly vulnerable portion of the system that baits potential attackers so that their methods can be recorded and analysed. These findings will then be relayed back to the central AI, which will use them to improve the Edge Agents ahead of future attacks.
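To make the honeypot data collection more concrete, here is a minimal sketch of how an edge agent might turn NGINX access-log lines into structured records. It assumes the honeypot writes logs in NGINX's standard "combined" format. The signature patterns and the `parse_line` helper are purely illustrative and are not part of the project's design.

```python
import re
from typing import Dict, Optional

# Fields of the NGINX "combined" access log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative signatures only (assumption); a real deployment would
# need a curated and regularly updated set.
SUSPICIOUS = {
    "path_traversal": re.compile(r"\.\./"),
    "sql_injection": re.compile(r"union\s+select", re.IGNORECASE),
    "admin_probe": re.compile(r"/(wp-admin|phpmyadmin|\.env)", re.IGNORECASE),
}


def parse_line(line: str) -> Optional[Dict[str, object]]:
    """Parse one access-log line and tag it with matching signatures.

    Returns None when the line does not match the combined format.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record: Dict[str, object] = dict(match.groupdict())
    request = match.group("request")
    record["tags"] = sorted(
        name for name, pattern in SUSPICIOUS.items() if pattern.search(request)
    )
    return record


if __name__ == "__main__":
    sample = (
        '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] '
        '"GET /../../etc/passwd HTTP/1.1" 404 153 "-" "curl/8.0.1"'
    )
    print(parse_line(sample))
```

Records like these could be summarised locally and used as training signals, which fits the goal of keeping raw data on the edge device.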

Decentralized Data Collection and AI Training tools:

Within the blockchain environment there are a number of decentralized protocols focused on AI training and data collection. For the purposes of this research we are currently looking at three that have positioned themselves at the forefront of this area: Autonolas (OLAS), Bittensor, and JAM.

OLAS is a platform focused on many individual agents that can interoperate across blockchains and technologies. This makes it a prime candidate for the privacy-preserving machine learning we are aiming for, in which sensitive data is protected by never leaving the local device.

Bittensor is a blockchain-based framework that allows models to be created and trained on a global scale using its network. It uses tokenized incentives to encourage participation and is one of the largest decentralized AI protocols, which makes it a strong candidate hub for AI data collection or training systems.

JAM, short for Join-Accumulate Machine, is described by its authors as a prospective design to succeed the Polkadot relay chain. For our purposes, we are interested in it as a secure and scalable way to gather and use data for AI training, and in its suitability for handling complex computation.

Research and Development Goals

The first stage of this project is a six-month period in which we will produce an external research paper detailing our findings and proposed methodologies using the tools that are available. This paper will then be submitted for peer review and publication. Alongside this, an internal paper targeting the intricacies of federated learning systems and decentralized data collection will provide insights for future projects, research and collaborations. Finally, we aim for this to be a significant step within the AI world, bridging the gap between traditional companies and web3 or blockchain protocols. Crowd-sourcing the training and enhancement of models while also ensuring security and privacy is an innovation we are eager to explore.

Potential Impact

We believe this project has the potential to change the way we approach cybersecurity and AI training. By leveraging federated learning and decentralized data collection, we can create more robust and secure AI systems that improve themselves: as attackers adapt, so too will these systems. Integrating technologies like OLAS, Bittensor, and JAM could further enhance our ability to provide a scalable and secure framework for future AI developments.

OA are committed to exploring and advancing the field of active security measures through innovative research and development. We seek to pave the way for a more secure and intelligent future, not only by improving security measures but also by fostering collaboration and knowledge sharing between the AI and cybersecurity communities.
