A Primer on Foreign Language e-Discovery

By Ari Kaplan
November 26, 2007
While e-discovery may be Greek to many, it is those documents written in Chinese, Japanese, Korean and Russian that cause much of the trouble. These "multi-byte" languages have far more characters than the 26 letters and handful of punctuation marks that Latin-alphabet languages like English, Spanish, French and German need. In fact, the number of Chinese characters included in the Kangxi dictionary is over 47,000 (though only 3,000-4,000 are reportedly necessary for full literacy). The impact on e-discovery is significant considering the increased sophistication necessary for case evaluation.

At the most basic level, computers think in ones and zeros, with each one or zero being a bit. Eight bits make a byte. A single byte can hold 256 different combinations (2 to the eighth power). For languages that are not based solely on letters, i.e., those where symbols represent a concept or a syllable, you need to add a second byte (256 x 256, which equals 65,536). That is the essence of multi-byte vs. single-byte languages: single-byte languages have 256 possible combinations, while multi-byte languages have 65,536.
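The arithmetic above can be checked with a short Python sketch. The `combinations` helper is a hypothetical name introduced only for illustration, and Shift_JIS is used as an assumed example of a legacy Japanese encoding; the article does not name a specific one.

```python
def combinations(num_bytes: int) -> int:
    """Number of distinct values that fit in num_bytes bytes (8 bits each)."""
    return 2 ** (8 * num_bytes)

one_byte = combinations(1)    # 2**8
two_bytes = combinations(2)   # 2**16, the same as 256 * 256

# Illustrative only: how many bytes one character occupies in Shift_JIS
# (an assumed example encoding, not one named in the article).
latin_len = len("A".encode("shift_jis"))
kanji_len = len("日".encode("shift_jis"))

print(one_byte, two_bytes, latin_len, kanji_len)  # prints: 256 65536 1 2
```

The Latin letter fits in one byte, while the Japanese character needs two, which is where the "multi-byte" label comes from.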

Confused? Then let's address encodings. An encoding is a programmatic translation between what you input and what you get on the screen. Problems arise when multiple encodings are involved. For example, when an Outlook 2000 e-mail file (PST format) created under a Japanese operating system is moved to an English-language machine for review, there will be problems: the native Japanese data appears corrupted because the two systems assume different encodings.
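A minimal sketch of that kind of mismatch, under stated assumptions: the sample text is invented, Shift_JIS stands in for the Japanese system's encoding, and Windows-1252 (`cp1252`) stands in for the English-language review machine's default. It does not model the PST format itself, only what happens when the same bytes are read under the wrong encoding.

```python
# Hypothetical sample text ("Japanese e-mail"); not taken from any real file.
original = "日本語のメール"

# Bytes as an assumed Japanese system might have stored them.
raw = original.encode("shift_jis")

# Reading the same bytes with an assumed Western default encoding.
# errors="replace" substitutes a placeholder for bytes cp1252 cannot map.
wrong = raw.decode("cp1252", errors="replace")

# Reading the bytes with the encoding they were actually written in.
right = raw.decode("shift_jis")

print(repr(wrong))  # unreadable characters, one per byte
print(repr(right))  # the original Japanese text
```

The bytes never change in this sketch; only the interpretation does. That suggests a practical first step in review: identify the encoding a document was written in before treating its text as damaged.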
