FIDOMAP is a novel multimedia content analysis platform designed to empower users in the fight against disinformation. At its core, FIDOMAP provides a user-friendly web interface that facilitates the detection and localization of manipulations and synthetic elements within visual content, using state-of-the-art algorithms produced by multimedia forensics researchers. The FIDOMAP backend consists of a plugin-based software architecture into which multiple forensic algorithms can be integrated. Each algorithm is packaged in a container and associated with a set of metadata that describes the supported file types, the type of analysis performed, and the format of the results it produces. Each piece of visual content submitted by the user is routed by an orchestrator to the appropriate plugins for processing. The results are then displayed through an interactive web interface that allows the user to view and compare the identified traces. We anticipate that FIDOMAP will be used both by experienced users through self-hosting and by organizations dedicated to countering online disinformation. In this way, these organizations can extend the service to users who may be less technically proficient.
From an implementation standpoint, the FIDOMAP platform is realized as a web-based client-server architecture. The frontend is a web application whose sole task is to handle user analysis requests, send the data to the server for processing, and present the analysis results in a user-friendly format. The backend is built around an orchestrator that acts as a central control unit and receives the multimedia content submitted for analysis. Upon reception, it distributes each file to the most appropriate algorithms. Once every algorithm has completed its processing, the orchestrator consolidates the individual results into a single response, which is delivered back to the frontend for presentation to the user.
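The dispatch-and-consolidate role of the orchestrator can be illustrated with a minimal sketch. It is not the FIDOMAP implementation: the `Plugin` type, the `orchestrate` function, and the shape of the consolidated response are all hypothetical, and each containerized algorithm is stood in for by a plain Python callable.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical plugin type: takes the raw file bytes, returns a result dictionary.
# In a real deployment this call would reach a containerized algorithm instead.
Plugin = Callable[[bytes], dict]


def orchestrate(content: bytes, plugins: Dict[str, Plugin]) -> dict:
    """Run every selected plugin on the content and merge the results."""
    results = {}
    with ThreadPoolExecutor() as pool:
        # Submit all plugins first so that they run concurrently.
        futures = {name: pool.submit(fn, content) for name, fn in plugins.items()}
        for name, future in futures.items():
            try:
                results[name] = {"status": "ok", "result": future.result()}
            except Exception as exc:
                # One failing plugin should not discard the other results.
                results[name] = {"status": "error", "message": str(exc)}
    # Single consolidated response (assumed shape) returned to the frontend.
    return {"results": results}
```

Recording failures per plugin, rather than aborting the whole request, is one possible design choice; it lets the frontend still display whatever the remaining algorithms produced.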
Each algorithm is packaged in a container to enable easy distribution along with all required dependencies and to facilitate its execution as a self-contained tool. To help the orchestrator select the algorithms to execute, each algorithm is associated with a set of metadata indicating, among other things, the type of multimedia content it can analyze (image, video), any restrictions on the specific format (e.g., JPEG, H.264), and the types of manipulation it can detect (e.g., splicing, copy-move, deepfake). The response provided by each algorithm can vary in complexity depending on the depth of the analysis performed by the respective method. For instance, an algorithm that simply identifies whether the content was generated using a specific generative model may return only a boolean value accompanied by the confidence level associated with the decision. On the other hand, an algorithm capable of locating the manipulated area in an image will need to return the corresponding bounding box or even a localization map at the patch or pixel level. Regardless of its complexity, the format of each response must comply with the specifications defined by the FIDOMAP platform. This is essential because, once the results are transmitted to the client, they need to be presented visually and must therefore be mapped to one of the available visualization methods.
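As an illustration of how such metadata could drive plugin selection and result presentation, consider the following sketch. All field names, result type labels, visualization names, and example plugins are assumptions made for this example; they are not taken from the FIDOMAP specification.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class PluginMetadata:
    """Hypothetical metadata record attached to one containerized algorithm."""
    name: str
    media_types: FrozenSet[str]    # e.g. {"image", "video"}
    formats: FrozenSet[str]        # an empty set means "no format restriction"
    manipulations: FrozenSet[str]  # e.g. {"splicing", "copy-move"}
    result_type: str               # "decision", "bounding_box" or "localization_map"


# Assumed mapping from result type to a frontend visualization method.
VISUALIZATIONS = {
    "decision": "score_badge",
    "bounding_box": "box_overlay",
    "localization_map": "heatmap_overlay",
}


def select_plugins(registry: List[PluginMetadata], media_type: str,
                   file_format: str) -> List[PluginMetadata]:
    """Keep the plugins whose metadata accepts the given media type and format."""
    return [
        p for p in registry
        if media_type in p.media_types
        and (not p.formats or file_format in p.formats)
    ]


def visualization_for(plugin: PluginMetadata) -> str:
    """Map a plugin's declared result type to a visualization, or reject it."""
    if plugin.result_type not in VISUALIZATIONS:
        raise ValueError(f"unsupported result type: {plugin.result_type}")
    return VISUALIZATIONS[plugin.result_type]


# Two made-up plugins, for illustration only.
REGISTRY = [
    PluginMetadata("jpeg-splicing-locator", frozenset({"image"}),
                   frozenset({"jpeg"}), frozenset({"splicing"}),
                   "localization_map"),
    PluginMetadata("synthetic-detector", frozenset({"image", "video"}),
                   frozenset(), frozenset({"deepfake"}), "decision"),
]
```

With this registry, a JPEG image would be routed to both plugins, while an H.264 video would reach only the second one. Rejecting undeclared result types at registration time is one way to guarantee that every response can be rendered by the client.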
Principal investigator: Dr. Daniele Baracchi
- Dr. Roberto Verdecchia