Felegrafen: analysing dirty newspapers

This week a sub-study from the project “Digitala lägg” will be presented at the digital humanities conference in Oslo, Digital Humanities in the Nordic Countries. The project presentation is titled “Att hantera felegrafen: Textanalys av smutsiga tidningar” (“Handling the error telegraph: Text analysis of dirty newspapers”), and the abstract gives an indication of its content:

In a notice from October 1847, Aftonbladet reported that new technology cannot always be trusted, after a thunderstorm had caused the electric telegraph to send unintelligible text messages. Similar noise arose when this very notice was digitised, but instead of thunder it is now the machine reading (OCR) and the automatic segmentation of the text into coherent paragraphs that distort the words. The pioneers of distant reading usually work with graphically cleaner, often literary, books that are easier to digitise. Working with newspaper texts means, to a greater extent, finding methods for handling the extensive noise that the digitisation process has added to the material.

The project presentation takes place within the conference’s “panel-poster” format, and our poster, produced by my colleagues Johan Jarlbrink and Roger Mähler, can be downloaded here: Nordic-DHC_poster_Telegrafen_Roger_Johan_Pelle.

First project presentation of Digitala Modeller

Tomorrow at Nordiska museet, Digisam is organising a seminar aimed at deepening the dialogue between memory institutions and universities, in this case taking as its starting point the five projects that were granted funding last year in the first call from Vitterhetsakademin and RJ on “Samlingarna och forskningen”. The programme is available here: Dagordning_ digisam_RJ-sem. The project “Digitala Modeller. Teknikhistoriens samlingar, digital humaniora & industrialismens berättelser” was awarded funding in that call, as a collaboration between Tekniska museet and HUMlab at Umeå University. I lead this project and will give my first presentation tomorrow (even though the project does not officially start until April). My slides can be downloaded here: snickars_digitala_modeller_presentation.

The understanding of what the internet does to society is too shallow

Today in DN Kultur I have published a three-part reply in the exchange of views that has been going on there for some time: Digitaliseringen. Förståelsen för vad internet gör med samhället är för grund (“Digitisation. The understanding of what the internet does to society is too shallow”). The article needs no special introduction; it also contains links to all the other texts in the debate. I do, however, very much like that it has been illustrated with Mr Jobs: “Apple’s Steve Jobs with a now very old computer”. Pas mal.

On public service in DN

Today I have published a longer text in DN Kultur on the idea of public service, then and in the future: Så ska public service blomstra i en digital era (“How public service will flourish in a digital era”). The article can be read as a contribution to the discussion about public service, and about whether function or institution should govern this idea in a digital age. More concrete proposals will follow from the public service commission of which I am a member, whose final report will be presented at MEG16 in Göteborg in early April.

New book: Business Innovation and Disruption in the Music Industry

I was really pleased to hear that the book Business Innovation and Disruption in the Music Industry, edited by Patrik Wikström and Robert DeFillippi, has now been published. I have written an article in the book, “More music is better music”, and information about the book, including a number of preview pages, can be found on the publisher’s web page. Looking forward to getting a copy of my own!

Media intelligence

The work I did as a guest professor at Södertörn university during autumn 2015, within a project run by my colleague Lars Degerstedt at the School of Natural Science, Technology and Environmental Studies, has come to an end, but is now slowly developing into a jointly written article. Below is the framework around the concept of “media intelligence” that we are trying to develop. The article ought to be finished within a week or so.

As arguably the leading business intelligence company in Scandinavia, Cision (with a history dating back to 1892 under the name Svenska telegrambyrån, a company that provided press clipping services in Sweden) boasts online of being a global enterprise within communication and media intelligence. “Cision serves the complete workflow of today’s communications, social media and content marketing professionals” (Cision, 2016). Yet what does the specific notion of media intelligence actually mean? Basically, it refers to various forms of updated media monitoring practices, both manual and automatic, foremost regarding print and broadcast media. Obviously, online media has also played an increasingly important role for the media intelligence business during the last two decades. Media intelligence can also be understood by comparing it to adjacent intelligence fields. There are, for example, analogies to the notion of intelligence operations as these are executed within the military domain. Military intelligence is, in short, a defense discipline that exploits a number of information collection practices and strategies in order to provide guidance for commanders. Besides military usages, computational intelligence also offers some insights. In automatised versions, crawling the social web to identify relevant conversations for example, media intelligence resembles artificial intelligence, thus linking the notion of intelligence to machines and systems as well as networks. Rather than sticking to a strict scholarly media and communication research perspective, media intelligence should hence be understood within a larger framework of military and artificial intelligence, as well as more traditional forms of business and market analytics, adjacent to fields such as strategic communication, public relations and communication management.

If military intelligence is sometimes divided into strategic, operational and tactical intelligence, the same basically goes for media intelligence. Within the commercial business sector media intelligence is often roughly divided into: (A.) business intelligence (on a particular company level), and (B.) competitive intelligence (between similar companies regarding for example shared markets). In general, the latter is different from the former since it uses and analyses data outside company firewalls. However, during the last decade, mainly due to profound technological changes brought about by digitisation, the specificities of and boundaries between business intelligence and competitive intelligence have been modified. When society is gradually turning into a market of different mediated “value networks”, as Sven Hamrefors argued in 2010, “communication functions can no longer stay in their restricted domains and only deal with traditional communication issues” (Hamrefors, 2010). On the one hand, intelligence on a strategic business-to-business level hence should not be separated from a more practical business-to-consumer level. On the other hand, an increasing number of companies (foremost within the tech domain) operate in different market segments, making it more or less impossible to gather intelligence on and monitor all relevant markets. The most obvious example is Google, which started with search, soon began making operating systems, ran different forms of content platforms, and now even produces cars. The same can be said of an international media group such as Schibsted, which does business in a number of different digital domains (publishing, online marketplaces and services).

In more concrete terms, media intelligence uses data and computer science methods to analyse both social media and editorial media content. In general, within business intelligence today, it is often argued that a brave new world of insight awaits intelligence companies and their customers, if they have the courage (or the financial means) to start working analytically with the exponentially growing volumes of unstructured and semi-structured data, especially from new data sources such as machines, sensors, logs, and (non-textual) social media and streaming data. Basic implementation for media intelligence involves curating data, keyword references and semantic analyses, as well as natural language processing via machine learning algorithms. In essence, most practices and operations are concerned with turning text into data for analysis. Text is hence still the dominant modality for most media intelligence operations, yet other modalities (images, sound and video) have during the last decade become increasingly important, especially in different social forms.
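To make “turning text into data” a little more tangible, here is a minimal sketch in Python of the simplest operation mentioned above, keyword references. It is our own illustration and not any vendor’s actual method; the function name keyword_counts, the crude tokeniser and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def keyword_counts(text, keywords):
    """Count how often each watched keyword occurs in a piece of text.

    Hypothetical helper for illustration only; real monitoring tools
    use far more elaborate linguistic processing.
    """
    # Lowercase the text and keep runs of letters: a deliberately crude tokeniser.
    tokens = re.findall(r"[a-zåäö]+", text.lower())
    counts = Counter(tokens)
    # A Counter returns 0 for tokens it has never seen.
    return {kw: counts[kw.lower()] for kw in keywords}


sample = "Cision monitors media. Media monitoring turns text into data."
result = keyword_counts(sample, ["media", "data", "video"])
print(result)
```

The sentence goes in as unstructured text and comes out as a small table of counts, which is the kind of structure that later statistical or machine learning steps would operate on.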

Most forms of media intelligence depart from the ways in which we are currently enmeshed in an “interconnected communications ecosystem wherein social and traditional media sources feed each other for stories and conversation, and those conversations are supercharged by social technology” (Nuccio, 2015). Media intelligence hence refers to computational solutions that try to synthesise innumerable online conversations into (more or less) appropriate insights that allow companies and organisations to manage, and sometimes even measure, content performance and trends, with the paramount purpose of better forecasting business strategies. As is to be expected, companies working within the intelligence business sector offer quite different suggestions as to what media intelligence actually means: “Media intelligence is the process of gathering all the data available through social media and news media outlets and analyzing the data to allow for better business decision making”, according to CustomScoop (2016); Volicon is said to be the “leading provider of enterprise media intelligence solutions serving the needs of broadcasters, networks, cable operators, and governments worldwide” (Volicon, 2016); and M-Brain states that its media intelligence solutions are designed to “monitor and measure your publicity and reputation” (M-Brain, 2016). What these companies have in common, however, is that they gather massive amounts of data points from user-generated content on social media sites, blogs and comment fields, combining these with traditional mass media output and other forms of publicly open data, all in order to provide (and ultimately sell) real-time insights and suggestions based on verifiable data. Media intelligence is hence always about selling trust.

Some scholars have argued that during recent years media intelligence has witnessed a social transition, from various forms of social media monitoring to computationally driven (social) media intelligence. The latter is said to be better equipped to handle noisy sociality, with an ability to uncover valuable insights hidden in the social media chatter (Moe & Schweidel, 2014). Research related to various forms of military intelligence has furthermore tried to identify and forecast civil unrest and radical mobilization by mining textual content in open-source social media (Agarwal & Sureka, 2015). In general, social media intelligence is based on a rudimentary data management model where social data is segmented, from automatically categorised subsets of social data to customised rules or filtering based on criteria like date, location, web page type, sentiment and gender. Segmentation can also be done on more specific data, for example regarding Twitter or Facebook statistics (retweets, likes, comments, media type, etcetera). In essence, data management within social media intelligence collects massive volumes of data and separates it into structured and manageable packages that can help answer particular questions via different forms of machine learning and data mining. The notions are similar; machine learning and data mining often overlap. But they also differ in that machine learning focuses on prediction, based on known properties within the collected data, whereas data mining is about the discovery of unknown properties in the same data.
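The segmentation described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the Post record, its field names and the segment helper are hypothetical and do not describe any particular vendor’s data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    """One social media post; all field names are illustrative assumptions."""
    platform: str
    posted: date
    location: str
    sentiment: str  # e.g. "positive", "negative", "neutral"
    likes: int


def segment(posts, **criteria):
    """Return the posts whose fields match every given criterion exactly."""
    return [
        post for post in posts
        if all(getattr(post, field) == value for field, value in criteria.items())
    ]


# Invented sample data, not real measurements.
posts = [
    Post("twitter", date(2016, 1, 12), "Stockholm", "positive", 14),
    Post("facebook", date(2016, 1, 12), "Oslo", "negative", 3),
    Post("twitter", date(2016, 1, 13), "Stockholm", "negative", 40),
]

stockholm_tweets = segment(posts, platform="twitter", location="Stockholm")
negative_posts = segment(posts, sentiment="negative")
print(len(stockholm_tweets), len(negative_posts))
```

Each call yields one of the “structured and manageable packages” mentioned above; a predictive model or a data mining routine would then be run on a segment rather than on the full stream.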

Then again, if sociality has recast media intelligence during the last decade, the changing modalities of media content are another alteration. Traditionally, business intelligence companies have relied on textual offerings, basically because machine learning algorithms need text-based documents (and databases) to be able to perform automatic analyses of large-scale data sets. The modality of text has, in short, been the default, as is not least apparent in the ways companies within the business intelligence sector have advertised themselves: “Keep track of what is written about you, your company or your competitors” (Cision, 2016); “Infomap:r is a system for predictive analytics and text mining” (Infomap:r 2016); “We keep your organization up to date on what is written and said about you and your business environment” (Newsmachine, 2016).

We argue, however, that if online media has been experiencing a shift towards the social, online interaction has at the same time been increasingly enriched with images, sound and video. These new media modalities have brought forth changes that are currently having profound effects on the media intelligence business. The aforementioned infographic from Domo serves as a vivid illustration of the challenges in both social and non-textual media form(at)s facing the intelligence business. If YouTube has been the epitome of an ever increasing non-textuality of the information landscape during the last decade (Snickars & Vonderau, 2009), the blended mix of Facebook posts in different modalities in many ways acts as its social counterpart. However, ‘social’ and ‘multimedia’ are also converging. In January 2014, for example, Facebook announced an increasing shift towards visual content, “especially with video … In just one year, the number of video posts per person has increased 75% globally and 94% in the US” (Facebook, 2015a). In fact, during 2014 Facebook had an average of more than one billion video views every day. Social video is thus an increasing trend, and the release of Facebook Instant Articles in May 2015 was consequently aimed at making it possible to watch audiovisual news material seamlessly. “Zoom in and explore high-resolution photos by tilting your phone. Watch auto-play videos come alive as you scroll through stories. Explore interactive maps, listen to audio captions” (Facebook, 2015b).

If business intelligence in automated forms has traditionally relied on text mining to monitor, detect and analyse plain online text sources, the transition to new social media modalities hence causes difficulties, both conceptually and technologically. If humans can perceive their surroundings naturally in visual form, “this undertaking is quite challenging for machines”, according to Damian Borth. Within the field of computer science, machine learning in non-textual forms is, in short, utterly complicated. “The lack of correspondence between the low-level features that machines can extract from videos (i.e., the raw pixel values) and the high-level conceptual interpretation a human associates with perceived visual content is referred to as the semantic gap” (Borth, 2014). Nevertheless, even if media intelligence is struggling with new social media modalities, companies within business intelligence are also increasingly trying to cope with this semantic gap. The media intelligence company Opoint, for example, is said to be “the only player in the market that monitors real-time radio and TV… [and analyses] all types of media.” Sound bites are, for example, delivered directly to customers; that is, Opoint does not use speech recognition software to transform sound into text (Opoint, 2016). Another similar company, Lissly, asserts that its “tool collects, sorts and visualizes data from different digital media.” Lissly offers its customers the ability to “listen to the conversations in your market” (Lissly, 2016). In fact, the metaphor of listening is often used today within the media intelligence business, a semantic indication that media modalities other than text are becoming steadily more important: “Notified takes social media listening and management to the next level” (Notified, 2016); “We believe that the world will be a little bit better if we listen more” (NewsMachine, 2016); “There is power in listening to what customers and stakeholders are saying about your business through social media platforms” (M-Brain, 2016).

The major challenge facing all business intelligence based on media modalities other than text is the somewhat paradoxical movement away from the content of communication towards the medium of communication. On the one hand, there needs to be a market demand for this kind of transition to occur, that is, a request to monitor other (or new forms of digital) media (as data streams). At present such demand is still by and large insufficient, mainly because media intelligence algorithms still cannot produce appropriate results from non-textual information. On the other hand, the transition (or perhaps dialectics) between content and medium also resonates in an interesting way with debates within classical media theory as to what constitutes the bias of communication. Content and medium have, in short, always been intertwined. In the 1950s Harold Innis held that the stability of cultures depended on the balance and proportion of each particular media form, from clay to papyrus, and claimed that each medium embodied a certain bias in terms of organisation and control of information (Innis, 1950). Marshall McLuhan’s 1960s media theory, where the medium itself constituted the message, followed Innis’s ideas closely, or as McLuhan famously stated: “I am pleased to think of my own book Gutenberg Galaxy as a footnote to the observations of Innis” (McLuhan, 1964). Then again, similar ideas have also been put forward within research fields associated with media intelligence. So-called “media richness theory”, for example, has been developed within organisation and management studies to describe a medium’s ability to reproduce information. Basically, communications that require a long time to enable understanding are lower in richness. “Rich media are personal and involve face-to-face contact between managers, while media of lower richness are impersonal and rely on rules, forms, procedures, or data bases”, according to Richard L. Daft and Robert H. Lengel (1986). Following media richness theory, face-to-face communication is thus perceived as the richest media form since it provides immediate feedback, and such ideas also resonate in the contemporary work of Sherry Turkle, as in her new book, Reclaiming Conversation. The Power of Talk in a Digital Age (2015). Daft and Lengel’s media richness theory was introduced in the 1980s to help organisations cope with various forms of communication challenges. In an equivalent way these ideas can today be helpful when trying to explain the conceptual and technical obstacles facing the media intelligence business, regarding for example social competitive intelligence and media analytics.

Post-analogue

My colleague Alexandra Borg has today written an excellent understreckare (essay) in SvD: Bilderboken uppfinner seendet på nytt (“The picture book reinvents seeing”). Borg is a literary scholar at Uppsala University and is gradually establishing herself as perhaps the country’s foremost expert on the digital convulsions of the book medium (together we have for two years run a network on precisely the transformation of the book medium, with a workshop coming up in Malmö in a few months). Above all, Borg’s reasoning about how digitisation revitalises the analogue book medium is interesting. Digitisation “has had a stimulating effect on the picture book as an object”, she writes. “When the book-as-medium is undergoing transformation, and its territory is threatened, it is as if authors explore the boundaries of the form of expression more than before.” It used to be said that this tendency was a kind of analogue nostalgia, which is probably still the case every now and then, for example regarding Quentin Tarantino’s new film, The Hateful Eight, shot on 70 mm vintage celluloid.

But I wonder whether this concept should not rather be shifted towards a reasoning about a kind of overarching post-analogicity that braces itself against “the digital”. The revival that vinyl records are experiencing belongs here, but the book is probably the media form in which a contemporary and future post-analogicity emerges most clearly. Towards the end of her text Borg writes that “the phenomenon can be related to two other postdigital, media-strategic tendencies: storytelling and so-called 360-degree narration.” That is true, but only in part. For one thing, I think one should abandon the post-digital as a concept and description of analogue media forms; for another, both of these tendencies are evident in entirely different media contexts as well, such as computer games and the vertical integration that has long characterised Hollywood and is now commonplace in both the games and film industries. The various ongoing renegotiations around the book (and its materiality) as a medium should rather be described as post-analogue, and here it is of course also interesting to consider which media forms do not express themselves in this way, television above all. Post-analogue TV is, after all, nowhere on the map (even though linear viewing is still extensive).

More Media, More People

During the autumn I have been working as a guest professor at Södertörn university, within a project run by my colleague Lars Degerstedt at the School of Natural Science, Technology and Environmental Studies. Together we are now writing an article which deals with new forms of media intelligence, and different challenges for the competitive intelligence business. The idea is to have a finished article in mid-January and to submit it to Nordicom Review. At present the article has the title: “More Media, More People: Conceptual Challenges for Social and Multimodal Data Driven Competitive Intelligence”. The introduction gives some hints of what we are trying to do, and the text starts like this:

Today, the amount of data produced in a single minute is mind-numbing. Streams, if not floods, of social and multimodal data consequently pose a pivotal challenge for companies within the competitive intelligence business. One of these, the computer software company Domo, has marketed itself as a service designed to provide direct and simplified real-time access to business data without IT involvement. According to Domo, the contemporary data deluge shows no sign of slowing down. “Data Never Sleeps” has hence been the appropriate title of a series of infographics the company has released. The latest version, 3.0, was presented in August 2015. Much of what we do every day happens in the digital realm, Domo states. These activities leave an ever increasing digital trail “that can be measured and analysed”. Correspondingly, the infographic “Data Never Sleeps 3.0” revealed that every minute users liked a staggering 4,166,667 posts on Facebook, 347,222 tweets were sent on Twitter, 77,160 hours of video were streamed on Netflix, and 300 hours of video were uploaded to YouTube. Furthermore, 284,722 images were shared on Snapchat, and at Apple 51,000 apps were downloaded. Notably, these social data transactions occurred every minute, around the clock (Domo, 2015).
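To get a feeling for what “every minute, around the clock” amounts to, the per-minute figures quoted above can be scaled to a full day. The short Python sketch below does only that arithmetic; the dictionary labels are our own shorthand, and the only assumption is that each rate holds constant over all 1,440 minutes of a day.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes

# Per-minute figures as quoted from the Domo infographic (Domo, 2015).
per_minute = {
    "Facebook likes": 4_166_667,
    "tweets": 347_222,
    "Netflix hours streamed": 77_160,
    "YouTube hours uploaded": 300,
    "Snapchat images": 284_722,
    "Apple app downloads": 51_000,
}

# Assumption: the rate is constant across the whole day.
per_day = {name: count * MINUTES_PER_DAY for name, count in per_minute.items()}

for name, total in per_day.items():
    print(f"{name}: {total:,} per day")
```

Scaled this way, the Facebook figure comes to roughly six billion likes per day and the Twitter figure to roughly 500 million tweets per day, which may suggest that the oddly precise per-minute numbers were themselves derived from rounder daily totals.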

Sleepless data hence seems to be the perfect description of today’s global information landscape. Crowd or community based social media, in short, produce data flows that are both a blessing and a curse for competitive intelligence businesses. Handling new forms of social and multimodal data requires new skills, conceptually as well as technologically. Yet no data is error-free. On the contrary, a number of myths flourish within the contemporary hype of Big Data. So-called data cleansing, for example, always has to be performed before, say, the data depicted in Domo’s infographic can be analysed. Moreover, the same data also has to be interpreted. All forms of information and media management within the competitive intelligence business basically follow the same pattern: data needs to be collected, entered, compiled, stored, processed, mined, and interpreted. And, importantly, the final term in this sequence, interpretation, “haunts its predecessors”, as Lisa Gitelman has stressed in the aptly titled book, Raw Data is an Oxymoron (Gitelman, 2013, 3).

With “each click, share and like, the world’s data pool is expanding faster than we comprehend”, the Domo infographic informs potential customers. At a Domo event prior to the launch of the infographic 3.0, the data artist (yes, that is the way he describes himself) Jer Thorp stated that “not only are we doing more with data, data is doing more with us”. For consumers and business users alike, “improving our lives” thus requires a better understanding of what contemporary “interactions with data” actually mean, according to Thorp (and Domo). And naturally this is exactly what is being marketed: only Domo can help a business make sense of the “endless stream of data”. The company even has a business intelligence tool with the enticing name “Magic”, which lets customers “cleanse, combine and transform” their data. Data combinatorics provides greater insights, Domo asserts, and thus enables customers to see the whole picture. “Magic provides several intuitive tools to help you prepare your data”, and especially so if Magic is combined with the company’s presentational tool kit that “quickly interprets the data for you, and suggests how to visualize it for maximum impact and clarity” (Domo, 2015). In other words, the infographic of Domo is aesthetically pleasing for a reason. Today within the competitive intelligence business, maximum impact simply requires Beautiful Data, which happens to be the title of a fascinating book by Orit Halpern. According to her, all data “must be crafted and mined in order to [become] valuable and beautiful” (Halpern, 2014, 5).

Domo is in many ways a successful American start-up, currently funded by venture capital, but also with a crystal clear business plan. In a video demo, Domo states that its core idea revolves around “the future of business management”. The demo gives viewers an “exclusive look at Domo”, ending with the invocation: “what you need is a platform that brings your people and all the data they rely on together in one place.” In short, Domo is all about business intelligence as social data. Via this video demo, the beautiful infographic and the sleepless data presented by Domo, the purpose of this article is to address similar challenges facing competitive intelligence in a gradually modified information landscape. When data structures information, what should be collected and analysed? If Domo promises its customers that its platform makes it “easy to see the information you care about”, how is data perceived and conceptualised? (Domo, 2015). In this article, we argue that data driven competitive intelligence, which is basically what companies like Domo do, particularly needs to pay attention to new forms of (A.) crowd orientated and (B.) media saturated information. If business intelligence traditionally has referred to a set of techniques and tools that transform textual data into useful information for business analysis, such techniques need to consider that the media landscape has been altered in both a social and a non-textual direction.

If more data is better data (as some would have it), then more people creating more media should be understood in a similar way. This article will consequently start with some introductory remarks around the broader concept of “media intelligence”, and the ways that competitive intelligence businesses have adapted to a transformed media environment turned datascape. In the subsequent sections, the notions of “social competitive intelligence” and “media analytics” are used as two further concepts that media intelligence revolves around. Firstly, social competitive intelligence tries to understand how a changing information environment will impact organizations and companies by monitoring events, actors and trends. Information today doesn’t only want to be free; information wants to be social. If general usage of technology was once described with terms like social engineering, the linchpin of today’s culture of connectivity is social software. By presenting some findings from the so-called CIBAS project, we thus describe how organisations and companies increasingly rely on (more or less) (in)formal social networking structures and individual decision making as a means to increase rapid response and agile creativity. Secondly, if business analytics focuses on developing insights primarily from textual data and statistical methods, media analytics basically does the same, yet gives priority to audiovisual media streams, often with a slant of sociality. So-called social video is, for example, perceived as increasingly trendy in the way businesses will use social media in the years ahead. In our article we use “fashion analytics” as an example, gleaned from a commercial sector where audiovisual big data is currently in vogue. Finally, some concluding remarks are presented.