MULTI-STAKEHOLDER MEDIA PROVENANCE MANAGEMENT TO COUNTER SYNTHETIC MEDIA RISKS IN NEWS PUBLISHING

The rise of indirect content distribution via third-party social media platforms has introduced a new conduit for synthetic or manipulated content. That content purports to be legitimate news, or to come from legitimate news sources, and can present the consumer with apparent brand integrity markings, which convey authority. Three major global news organizations and a leading technology provider have come together to demonstrate a mechanism to tackle this problem that can operate at scale. The BBC, The New York Times Company, and CBC/Radio-Canada, in cooperation with Microsoft, have developed a proposed open-standards approach that can be used by large and small news organizations to protect the provenance of news stories in audio, visual and textual media.

INTRODUCTION

The rise of social media and video hosting platforms has created a significant problem for identifying content provenance on the internet. Re-hosting of media means that the origin of media content is increasingly obfuscated, undermining consumer trust and enabling the propagation of dis/misinformation, often using established and trusted brand imagery to amplify the deception.¹

¹ We use the term disinformation to cover the broadest definition of information disorder: disinformation, misinformation and malinformation.

To meet this societal challenge, it is important to consider both technical and media business perspectives. Consequently, the authors of this paper have come together to demonstrate a provenance verification system that can be implemented at massive scale. Our approach enables consumers to determine the publication source of media, independent of the site or server hosting it. This will foster trust in the provenance of the media and offer assurance that media is authentic and has not been altered since its original publication. We will present a prototype implementation of an open-standards media provenance architecture.
This architecture has been developed to enable content publishers to authenticate content as part of their publication workflow, and to enable consumers to verify the content as received. The paper details the components of this architecture: media provenance registration, binding of provenance data to the media, provenance data distribution and consumer verification. The system architecture has been developed to support many types of publishers and media, including streaming video. We envisage that this initial implementation will provide the stimulus for wider standardization of the common interoperable data structures and interfaces required, leading to a distributed ecosystem of content provenance system implementations and operators.

SCOPING THE DISINFORMATION THREAT

Disinformation – A multi-faceted problem

Disinformation can enter the news ecosystem in many forms. First Draft has defined seven types of mis- and disinformation [1]. This paper addresses the risks posed by Imposter Content, Manipulated Content, Mis-contextualization and Fabricated Content. Our aim is to authenticate the provenance and status of a piece of media by technically linking it to its published source and signaling any tampering in its distribution. We make no assessment of the relative truth or trustworthiness of the media itself, of the publishing organization or of the reporter.

Deep Fakes and Brand Hijacking – The next generation of threat

The problem of malicious actors assuming the trusted brand identities of well-known news publishers is a current reality. Media now often reaches its audience via indirect paths, independent of the publishers’/broadcasters’ own digital sites. The malicious use of established brand markings allows bad actors to add credibility to fictitious works. With the advent of AI-generated Deep Fakes, there is now a risk that powerful traditional symbols of authority, such as trusted news brand hosts and sets, can be used to amplify disinformation.
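The publish-and-verify workflow described above (registration, binding, distribution and consumer verification) can be illustrated with a minimal sketch. This is not the Origin specification: the record fields are illustrative assumptions, and an HMAC with a shared key stands in for the asymmetric publisher signature (e.g. Ed25519) a real deployment would use so that consumers can verify with a public key.

```python
import hashlib
import hmac
import json

# Hypothetical key; a real system would use an asymmetric key pair
# so the verifier never holds the publisher's signing secret.
PUBLISHER_KEY = b"publisher-secret-key"

def register_provenance(media_bytes, publisher, published_at):
    """Publisher side: hash the media and sign a provenance record."""
    record = {
        "publisher": publisher,
        "published_at": published_at,
        "content_hash": hashlib.sha256(media_bytes).hexdigest(),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(PUBLISHER_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify_provenance(media_bytes, record):
    """Consumer side: check the record signature, then re-hash the media."""
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(PUBLISHER_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, record["signature"]):
        return "tampered-record"
    if hashlib.sha256(media_bytes).hexdigest() != record["content_hash"]:
        return "tampered-media"
    return "verified"

media = b"example news video bytes"
rec = register_provenance(media, "Example News", "2020-05-01")
print(verify_provenance(media, rec))         # verified
print(verify_provenance(media + b"x", rec))  # tampered-media
```

Note that verification is independent of where the media is hosted: any consumer holding the media bytes and the provenance record can repeat the check, which is the property the paper's architecture relies on for third-party distribution.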
There are three approaches that can be deployed to counter the risk of Deep Fake synthetic content in news. The first is media education, which relies on training consumers to increase their level of skepticism. While effective, it runs counter to decades of effort to build audience trust in news brands. The second approach is to use AI-based Deep Fake detection algorithms. This may have only short-term efficacy: Generative Adversarial Network implementations can use these detection tools to recursively test and improve the sophistication of the fakes they are meant to detect. This leaves provenance as the third defensive strategy. To be effective, provenance strategies will require a coordinated approach across the news publishing, social media and technology ecosystems. This is the reason the Origin Alliance was formed.

Figure 1 Three Responses to Deep Fake News

PROVENANCE – THE AUTHENTICATION OF MEDIA

Separation of the signal from the noise

As the amount of disinformation continues to add noise to news ecosystems, it becomes important to have a consistent method to easily identify valid signals. Adding provenance information and binding it to the media amplifies information coming from publishers and broadcasters, making valid signals, and therefore trustworthy content, easier to identify.

Special attention will need to be paid during the design to allow for early deployment cases. Initially, very few legitimate news sources will have implemented the Origin system, and the player will need to reflect a neutral, rather than negative, opinion on provenance. There will also be cases where the provenance of media needs to be intentionally obfuscated for the security of the reporter. The system needs to allow for a recognized actor to attest to the source of provenance without providing detailed information.

The chain of provenance

News items are built using multiple inputs.
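The deployment cases discussed above, absence of provenance shown neutrally, tampering shown negatively, and a recognized actor attesting without detail, imply that a verification outcome needs more than a pass/fail result. A hypothetical sketch of such a result type (the state names and cues are our own, not Origin's):

```python
from enum import Enum

class ProvenanceStatus(Enum):
    """Hypothetical outcome states for a provenance-aware player."""
    VERIFIED = "verified"        # provenance present and all checks pass
    ATTESTED = "attested"        # a recognized actor vouches for the source
                                 # without exposing detail (reporter safety)
    NO_PROVENANCE = "unknown"    # publisher has not adopted the system yet
    TAMPERED = "tampered"        # provenance present but checks fail

def display_cue(status: ProvenanceStatus) -> str:
    """Map a verification outcome to a consumer-facing treatment."""
    return {
        ProvenanceStatus.VERIFIED: "show publisher badge",
        ProvenanceStatus.ATTESTED: "show attestation badge, no source detail",
        ProvenanceStatus.NO_PROVENANCE: "show nothing (neutral)",
        ProvenanceStatus.TAMPERED: "show tamper warning",
    }[status]

print(display_cue(ProvenanceStatus.NO_PROVENANCE))  # show nothing (neutral)
```

The key design point is that NO_PROVENANCE maps to a neutral treatment rather than a warning, so unadopted publishers are not penalized during early deployment.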
The intention of the Origin approach is to build a chain of provenance from the point of publishing to the point of presentation. The news publisher, via their news standards and practices, will attest to the provenance of all upstream sources. Other work is being done (by Adobe and others [2]) to capture the provenance chain from the lens to the editorial system. Conversations to create an end-to-end open standard are ongoing.

Figure 2 The Media Provenance Sequence

Diverse publishing environments and formats

News publishers vary in scale, technical capacity and the media types they employ in storytelling. An effective provenance solution must be accessible and affordable for all producers of news content. The larger global media brands involved in the creation of the Origin Alliance recognize that any system implemented will need to be simple to adopt for news organizations of all sizes. Access to a positive (authenticated) provenance system cannot become a barrier to entry for any news organization.

It is also important to emphasize that positively determining the provenance link between a media story and its publisher is in no way an editorial endorsement of the validity of the news content. The Origin consumer user interface is intended to show provenance; care must be taken to distinguish provenance from trustworthiness (or truth), which is a wider and more complex issue.

The techniques of authentication will vary across the different digital formats for text, audio, photo and video files. The embedding techniques will vary by format; however, the data structures for provenance should be common. The Origin approach defines a minimum common data requirement, with extensions for media type and for variations between publishers.
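The minimum-common-data idea above can be sketched as a record whose core fields are shared across all media types, with separate extension areas for media-type-specific and publisher-specific data. The field names are illustrative assumptions, not the Origin schema:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ProvenanceRecord:
    """Hypothetical minimum common fields shared by all media types."""
    publisher: str
    published_at: str
    media_type: str        # e.g. "text", "audio", "photo", "video"
    content_hash: str
    # Extensions vary by media type and by publisher, while the core
    # fields above remain interoperable across all implementations.
    media_extensions: dict[str, Any] = field(default_factory=dict)
    publisher_extensions: dict[str, Any] = field(default_factory=dict)

# A streaming-video publisher might extend the record with per-segment
# hashes, without changing the common core that all verifiers understand.
video_record = ProvenanceRecord(
    publisher="Example Broadcaster",
    published_at="2020-05-01T12:00:00Z",
    media_type="video",
    content_hash="sha256:0000",  # placeholder digest
    media_extensions={"segment_hashes": ["sha256:0001", "sha256:0002"]},
)
print(video_record.media_type)  # video
```

A verifier that understands only the common core can still authenticate the record, which is what keeps the barrier to entry low for small publishers while allowing richer data from larger ones.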
Figure 3 An Extensible Provenance Standard

Participation across the ecosystem

In addition to alignment between news publishers, a fully effective provenance solution will require cooperation across the complete technology stack. Cloud media services and editorial tool vendors will have to offer common feature implementations. Social media platforms will have to monitor for provenance signals and provide appropriate distribution treatments based on the validity of the response. This will be a larger industry conversation. Many of these discussions are underway via the Partnership on AI [3] and its Media Integrity group [4].

Figure 4 Cross Industry Alignment

ORIGIN – A PROPOSED MEDIA PROVENANCE SOLUTION