Maybe sixth-grade English was more helpful than we thought. One of the dullest grammar exercises is being used to help find potential terrorists, and save companies a bundle.

Diagramming sentences - picking out subject, verb, object, adjective and other parts of speech - has been a staple of middle and high school grammar lessons for decades. Now, with financing from the Central Intelligence Agency, a California firm is using the technique to comb through e-mail messages and chat room talks, which can be a rich lode of corporate and government information, and a tough one to mine.

Figuring out the connections among people, places and things is something computer algorithms do pretty well, as long as that information is structured, or categorized and put into a database. Looking through a company's customer file for a person named Bonds, for example, is fairly simple. But if the data is unstructured - if the word "bonds" hasn't been classified as the name of a ballplayer or as an investment option - searching becomes much more difficult.

For people in business or in public service, only 20 percent or so of their information is kept in formal databases, noted Nick Patience, an analyst with the 451 Group, a technology research firm. The rest is unstructured, tucked away in e-mail messages, call logs, memos and instant messages.

Attensity, based in Palo Alto, Calif., and financed in part by In-Q-Tel, the C.I.A.'s investment arm, has developed a method to parse electronic documents almost instantly, and diagram all of the sentences inside. ("Moby-Dick," for instance, took all of nine and a half seconds.) By labeling subjects and verbs and other parts of speech, Attensity's software gives the documents a definable structure, a way to fit into a database. And that helps turn day-to-day chatter into information that is relevant and usable.

My article in today's New York Times had details.
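For readers who want to see the general idea of turning labeled sentences into database rows, here is a toy sketch. It is not Attensity's method, which the article does not describe in detail. The word list `LEXICON`, the functions `extract_triple` and `build_db`, and the `facts` table are all hypothetical names invented for this illustration, and the "parser" only handles simple noun-verb-noun sentences.

```python
import sqlite3

# Hypothetical, hand-written lexicon mapping words to parts of speech.
# A real system would use a full parser, not a lookup table.
LEXICON = {
    "investors": "NOUN", "bonds": "NOUN", "homers": "NOUN",
    "buy": "VERB", "hits": "VERB",
}


def extract_triple(sentence):
    """Return (subject, verb, object) for a simple noun-verb-noun sentence, else None."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    tagged = [(w, LEXICON.get(w)) for w in words if w]
    # Find the first verb; the subject is the nearest noun before it,
    # the object is the first noun after it.
    verb_idx = next((i for i, (_, tag) in enumerate(tagged) if tag == "VERB"), None)
    if verb_idx is None:
        return None
    subj = next((w for w, tag in reversed(tagged[:verb_idx]) if tag == "NOUN"), None)
    obj = next((w for w, tag in tagged[verb_idx + 1:] if tag == "NOUN"), None)
    if subj is None or obj is None:
        return None
    return (subj, tagged[verb_idx][0], obj)


def build_db(sentences):
    """Store one row per sentence that yields a triple, in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (subj TEXT, verb TEXT, obj TEXT)")
    rows = [t for t in (extract_triple(s) for s in sentences) if t is not None]
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = build_db(["Investors buy bonds.", "Bonds hits homers."])
    # "bonds" as the doer of an action versus "bonds" as the thing acted upon.
    print(conn.execute("SELECT verb, obj FROM facts WHERE subj = 'bonds'").fetchall())
    print(conn.execute("SELECT subj, verb FROM facts WHERE obj = 'bonds'").fetchall())
```

Once each sentence is a row, the grammatical role becomes a searchable column, so a query can separate Bonds the ballplayer (a subject who hits) from bonds the investment (an object that is bought), which a plain keyword search cannot do.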
© Copyright 2019 Military.com. All rights reserved. This material may not be published, broadcast, rewritten or redistributed.