A Primer on Foreign Language e-Discovery

By Ari Kaplan
November 26, 2007
While e-discovery may be Greek to many, it is documents written in Chinese, Japanese, Korean and Russian that cause much of the trouble. These "multi-byte" languages have far more characters than the 26 letters and handful of punctuation marks that Latin-alphabet languages like English, Spanish, French and German need. In fact, the Kangxi dictionary includes over 47,000 Chinese characters (though only 3,000-4,000 are reportedly necessary for full literacy). The impact on e-discovery is significant, considering the increased sophistication necessary for case evaluation.

At the most basic level, computers think in ones and zeros, with each one or zero being a bit. Eight bits make a byte. A single byte can hold 256 different combinations (2 to the eighth power, one factor of 2 per bit). For languages that are not based solely on letters, i.e., those where a symbol represents a concept or a syllable, you need to add a second byte (256 x 256, which equals 65,536). That is the essence of multi-byte vs. single-byte languages: single-byte languages have 256 possible combinations, while multi-byte languages have 65,536.
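
The arithmetic can be checked with a short sketch. It is illustrative only: the choice of Python and of the Shift_JIS encoding (a common legacy Japanese encoding) are assumptions made for this example, not details from the article, and the helper name `encoded_length` is hypothetical.

```python
# Count the distinct values that fit in one byte and in two bytes.
BITS_PER_BYTE = 8
one_byte_values = 2 ** BITS_PER_BYTE    # 2 to the eighth power
two_byte_values = one_byte_values ** 2  # 256 x 256


def encoded_length(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


print(one_byte_values)   # 256
print(two_byte_values)   # 65536

# Assumption: Shift_JIS stands in for a "multi-byte" encoding here.
print(encoded_length("A", "shift_jis"))   # a Latin letter fits in 1 byte
print(encoded_length("日", "shift_jis"))  # a kanji needs 2 bytes
```

Real-world encodings do not use every one of the 65,536 two-byte combinations, so the figure is best read as an upper bound on what two bytes can represent.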

Confused? Then let's address encodings. An encoding is a programmatic translation between what you input and what you get on the screen. The problem arises when multiple encodings are involved. For example, when an Outlook 2000 e-mail file (PST format) created under a Japanese operating system is moved to an English-language machine for review, there will be problems: the native Japanese data appears corrupted because the two machines interpret the same bytes under different encodings.
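
A minimal sketch of this kind of mismatch follows. It does not model the PST format or Outlook itself. It only shows what happens when bytes written under one encoding are read under another. The specific encodings are assumptions for illustration: Shift_JIS for the Japanese system and Windows-1252 (`cp1252`) for the English-language one.

```python
# Text as typed on the (assumed) Japanese system.
original = "日本語の電子メール"

# The bytes actually stored on disk, written with a Japanese encoding.
stored = original.encode("shift_jis")

# The English-language machine reads the same bytes with its own default
# encoding. errors="replace" substitutes a marker for any byte that the
# encoding does not define, instead of raising an exception.
misread = stored.decode("cp1252", errors="replace")

# Reading the bytes with the encoding they were written in restores the text.
recovered = stored.decode("shift_jis")

print(misread)                 # unreadable characters, not Japanese
print(recovered == original)   # True
```

The stored bytes are identical in both cases; only the interpretation differs. That is why the review platform needs to know, or correctly detect, the source encoding before the data is processed.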
