
Managing Tweets as records part 2: how to capture

By Jesse Wilkins posted 07-12-2010 12:13

  

In my last post, I talked about Twitter and records management, with emphasis on whether a given Tweet might rise to the level of a record and need to be managed. In this post I will talk more about the technical aspects of capturing and managing Tweets. 

One way to manage Tweets more effectively is by essentially translating them into email messages (technically posts, at least in Outlook 2007). There is an Outlook plugin called TwInbox (formerly OutTwit) that does exactly this. Once the plugin is installed, it downloads all Twitter content from a particular account. This includes tweets sent from that account, mentions, direct messages, and the main Twitter stream of users that account follows - even those whose streams are not public. The posts are stored in the local Outlook message store and can then be moved into file shares, ECRM repositories, SharePoint libraries, etc. And since the Tweets are converted into Outlook items, they can be managed with rules, folders, and Search Folders in Outlook. 
 
Another way is to subscribe to a Tweet stream or query results using RSS. The end results will vary depending on the RSS client used to subscribe to the stream, but Outlook 2007 converts RSS posts into Outlook items, which can be managed as described above. Other clients can output RSS streams into Excel spreadsheets or flat files, XML documents, email messages, or even PDFs, all of which could then be managed like other records of the same type. 
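
To make the flat-file option a little more concrete, here is a minimal sketch in Python that turns the items in an RSS 2.0 feed into CSV rows, one Tweet per line. The sample feed, the example.com URLs, and the function names are made up for illustration; a real feed would be downloaded from whatever address the stream exposes, and its fields may differ.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for a downloaded RSS 2.0 document.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example stream</title>
    <item>
      <title>exampleuser: First tweet</title>
      <link>http://example.com/status/1</link>
      <pubDate>Mon, 12 Jul 2010 12:13:00 +0000</pubDate>
      <guid>http://example.com/status/1</guid>
    </item>
    <item>
      <title>exampleuser: Second tweet</title>
      <link>http://example.com/status/2</link>
      <pubDate>Mon, 12 Jul 2010 12:20:00 +0000</pubDate>
      <guid>http://example.com/status/2</guid>
    </item>
  </channel>
</rss>"""

# Standard RSS 2.0 item elements kept as columns (a choice made for this sketch).
FIELDS = ["guid", "pubDate", "title", "link"]


def rss_items(rss_text):
    """Return one dict per <item>, with empty strings for missing fields."""
    root = ET.fromstring(rss_text)
    return [
        {field: item.findtext(field, default="") for field in FIELDS}
        for item in root.iter("item")
    ]


def rss_to_csv(rss_text):
    """Render the feed's items as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rss_items(rss_text))
    return buffer.getvalue()


if __name__ == "__main__":
    print(rss_to_csv(SAMPLE_RSS))
```

Because each Tweet ends up on its own line with its own identifier and date, individual items can be filed, retained, or produced separately rather than as one block.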
 
A number of vendors have begun to target this issue with solutions designed to archive Twitter and other social media websites. As of this post, vendors in the market include but are probably not limited to Autonomy, Backupify, FaceTime, Iterasi, Smarsh, Socialware, Sonian, and ZL Technologies (using FaceTime's Unified Security Gateway). And there are any number of Twitter-specific websites that offer to back up and/or archive Twitter posts, including BackupMyTweets, Tweetake, TweetBackup, TweetScan, and TwitterBackup. 
 
Some of these solutions are cloud-based, which could result in the same issues with discovery and compliance that Twitter presents in terms of how to make Tweets available for review and production. In others, the Tweets are archived locally to an application server or an appliance. Either way, users still need to confirm the format used to store and manage Tweets and other archived content; whether Tweets can be kept selectively or whether only the entire stream is kept; and how that content could be exported and/or made available to other parties in the case of discovery, public records, audit, or some other reason. 
 
And the format will vary substantially between providers. Backupify, for example, allows users to export the stream as a single XML file. This is readable, but results in essentially an all-or-nothing approach requiring archiving and production of the entire stream within the backup rather than individual items. Tweetake, on the other hand, exports to Excel, one item per line. 
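
A single-file export does not have to stay all-or-nothing if someone is willing to post-process it. The sketch below, in Python, splits one XML export into a standalone record per Tweet so that specific items can be pulled out. The element names (tweets, tweet, created_at, text) and the sample content are assumptions made up for illustration; they are not Backupify's actual export schema, which would need to be checked against a real export.

```python
import xml.etree.ElementTree as ET

# Hypothetical export layout, invented for this sketch.
SAMPLE_EXPORT = """<tweets>
  <tweet id="101">
    <created_at>2010-07-01T09:00:00</created_at>
    <text>Quarterly results are posted.</text>
  </tweet>
  <tweet id="102">
    <created_at>2010-07-02T15:30:00</created_at>
    <text>Lunch was great today.</text>
  </tweet>
</tweets>"""


def split_export(xml_text):
    """Map each Tweet's id to its own standalone XML string."""
    root = ET.fromstring(xml_text)
    items = {}
    for tweet in root.iter("tweet"):
        # Drop whitespace trailing the element so each record ends cleanly.
        tweet.tail = None
        items[tweet.get("id")] = ET.tostring(tweet, encoding="unicode")
    return items


def select_items(items, wanted_ids):
    """Keep only the requested ids, silently skipping any not in the export."""
    return {i: items[i] for i in wanted_ids if i in items}


if __name__ == "__main__":
    records = split_export(SAMPLE_EXPORT)
    for tweet_id, xml_record in select_items(records, ["101"]).items():
        print(tweet_id)
        print(xml_record)
```

Whether a split-out copy like this is acceptable for production alongside the original export is a question for counsel and the records policy, not for the script.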
 
Finally, one of the things all organizations should be aware of is that the Library of Congress (LoC) has acquired Twitter's archive in its entirety. Every public Tweet ever sent will be stored and made available through the LoC's website at some point. The announcement can be found here. As of this post, the archive is not available yet; per the agreement, "Only after a six-month delay can the Tweets be used for internal library use, for non-commercial research, public display by the library itself, and preservation." I bring this up because regardless of what retention period your organization assigns to a given Tweet, once the LoC has it, it will be preserved and presumably could be made available. This isn't any different from any other communication that crosses the firewall, but it is something to be taken into consideration. 


#Records-Management #twitter #ElectronicRecordsManagement #ERM